English Abstract |
This study investigated the influence of the testlet effect on the recovery of item and person parameters. Simulated data were used to compare parameter recovery under the traditional item response theory (IRT) model, the testlet response theory (TRT) model, and the bi-factor model. The Q3 index was computed for every data set as a tool for detecting local item dependence. Each test consisted of six testlets of five items each. Three testlet slopes (0.0, 0.6, and 1.2) and two sample sizes (500 and 1000 examinees) were manipulated in simulating the item response vectors. For each combination of conditions, item response data were simulated 100 times. Each simulated data set was calibrated separately with the three models described above, and Q3 was calculated for each data set. The main findings were as follows: (1) Item parameter recovery was more accurate for the testlet-based models than for the traditional item-based model (i.e., the 3PL model). Among the three models, the TRT model performed best, followed by the bi-factor model and then the IRT model. (2) The larger the sample size, the more accurate the item parameter recovery; the larger the testlet slope, the larger the estimation error. (3) The Q3 values for item pairs from the same testlet were larger than those for item pairs from different testlets, which indicates that the Q3 index performs well in detecting local dependence between items. |
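
For readers unfamiliar with the Q3 index mentioned above, the following is a minimal sketch of how it is commonly computed: the correlation, across examinees, of the residuals of two items after the fitted model's predicted probabilities are subtracted from the observed responses. The abstract does not give the study's own computation, so everything here is an assumption made for illustration: the function names `p_3pl`, `pearson`, and `q3`, the `(a, b, c)` tuple layout for item parameters, and the use of the logistic metric without the 1.7 scaling constant.

```python
import math

def p_3pl(theta, a, b, c):
    """3PL probability of a correct response.
    Assumption: logistic metric, no 1.7 scaling constant."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

def q3(responses, thetas, items, i, j):
    """Q3 for the item pair (i, j).

    responses: one list of 0/1 scores per examinee
    thetas:    estimated ability per examinee
    items:     hypothetical (a, b, c) parameter tuple per item
    """
    # Residual = observed score minus model-predicted probability.
    res_i = [u[i] - p_3pl(t, *items[i]) for u, t in zip(responses, thetas)]
    res_j = [u[j] - p_3pl(t, *items[j]) for u, t in zip(responses, thetas)]
    return pearson(res_i, res_j)
```

In a design like the one described, `q3` would be evaluated for every item pair, and the values for within-testlet pairs compared with those for between-testlet pairs.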