Chinese Abstract
When assessing differential item functioning (DIF), Type I error rates become inflated if a test contains many DIF items favoring the same group. Wang, Shih, and Sun (2012) proposed the DIF-free-then-DIF (DFTD) strategy to control Type I error in DIF assessment. Although the DFTD strategy can be applied with various DIF assessment methods, all current methods for locating anchor items may select anchors that in fact contain DIF, which in turn degrades DIF assessment. This study therefore had two aims: (1) to explore the factors that influence the performance of anchor-selection methods within the DFTD strategy; and (2) to explore the factors that influence the effectiveness of DIF assessment methods within the DFTD strategy, and thereby recommend conditions under which the strategy should be used. The standard multiple indicators, multiple causes method (M-ST), the MIMIC method with scale purification (M-SP), and the iterative MIMIC method (M-IT) were used to select four anchor items for the DFTD strategy, and the influence of the anchor items on the strategy was examined. Analysis of variance showed that M-IT identified anchor items more accurately than M-SP, and M-SP more accurately than M-ST; thus, M-IT or M-SP is recommended for anchor selection in the DFTD strategy. In addition, DIF percentage, sample size, DIF pattern, and the item response theory (IRT) model were key factors that significantly affected both the accuracy of anchor selection and the power of DIF detection, whereas Type I error was affected by DIF pattern, DIF percentage, and sample size. Because sample size can be controlled by the researcher, a sample size of R500/F500 and data fitting the two-parameter logistic model (2PLM) are recommended when combining the MIMIC method with the DFTD strategy.
English Abstract
Conventional differential item functioning (DIF) assessment methods tend to yield an inflated Type I error rate and a deflated power rate when a test contains many DIF items that favor the same group. To control Type I error rates in DIF assessment under such conditions, the DIF-free-then-DIF (DFTD) strategy has been proposed. The DFTD strategy consists of two steps: (1) selecting a set of items that are most likely to be DIF-free, and (2) assessing DIF in the remaining items using the designated items as anchors. To explore the variables that influence the performance of the DFTD strategy in assessing DIF, a series of simulation studies was conducted. Three multiple indicators, multiple causes (MIMIC) methods, namely the standard MIMIC method (M-ST), the MIMIC method with scale purification (M-SP), and the iterative MIMIC method (M-IT), were used to select four items as an anchor set before implementing the DFTD strategy. Analysis of variance showed significant differences among M-IT, M-SP, and M-ST in identifying DIF-free items, with M-IT outperforming M-SP, and M-SP outperforming M-ST. The analysis also found that the main effects of DIF pattern, DIF percentage, sample size, and item response theory (IRT) model, as well as their interactions, significantly affected accuracy in identifying DIF-free items. Based on these results, the M-SP and M-IT methods are recommended for identifying DIF-free items, especially when a test contains many DIF items. The same set of variables significantly determined the power rates of these methods in assessing DIF, whereas the Type I error rates in the DIF assessments were significantly influenced by DIF pattern, DIF percentage, and sample size.
Based on the results of this study, a sample size of R500/F500 and data fitting the two-parameter logistic model (2PLM) are recommended when applying the DFTD strategy with the MIMIC method to assess DIF.
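The two-step DFTD procedure described above can be sketched in code. Note that this is only an illustrative toy: the study fits MIMIC structural equation models, whereas the `dif_statistic` below is a deliberately simplified stratified group-difference index standing in for a MIMIC DIF-effect estimate, and the function names, the anchor-set size of four, and the flagging threshold are assumptions made for the sketch.

```python
import numpy as np

def dif_statistic(responses, group, item, anchor_items):
    """Toy DIF index: mean group difference in item scores within
    strata matched on the anchor-based rest score. This stands in
    for a MIMIC model's DIF-effect estimate (illustration only)."""
    others = [a for a in anchor_items if a != item]
    rest = responses[:, others].sum(axis=1)
    diffs = []
    for s in np.unique(rest):
        in_stratum = rest == s
        ref = responses[in_stratum & (group == 0), item]  # reference group
        foc = responses[in_stratum & (group == 1), item]  # focal group
        if len(ref) and len(foc):
            diffs.append(foc.mean() - ref.mean())
    return abs(np.mean(diffs)) if diffs else 0.0

def dftd(responses, group, n_anchor=4, threshold=0.1, max_iter=10):
    """DIF-free-then-DIF sketch:
    step 1 -- iteratively keep the n_anchor items with the smallest
              DIF statistics as the (presumably DIF-free) anchor set;
    step 2 -- flag any non-anchor item whose statistic, computed
              against the final anchors, exceeds the threshold."""
    n_items = responses.shape[1]
    anchors = list(range(n_items))  # start with every item as a candidate
    for _ in range(max_iter):
        stats = [dif_statistic(responses, group, i, anchors)
                 for i in range(n_items)]
        new_anchors = sorted(range(n_items), key=lambda i: stats[i])[:n_anchor]
        if set(new_anchors) == set(anchors):  # purification converged
            break
        anchors = new_anchors
    flagged = [i for i in range(n_items)
               if i not in anchors
               and dif_statistic(responses, group, i, anchors) > threshold]
    return anchors, flagged
```

With simulated dichotomous responses in which one item strongly favors the focal group, the purification loop excludes that item from the anchor set and the second step flags it; the actual study replaces this index with MIMIC model estimates (M-ST, M-SP, or M-IT) at both steps.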