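Chinese Abstract |
Many international large-scale assessments use the plausible values method to estimate population ability parameters. The plausible values data format also lets data analysts describe the statistical properties of the population. In addition, large-scale assessments typically cover a range of cognitive dimensions and difficulty levels that no single examinee can complete in a short period, so the test items are arranged under various equating designs to reduce examinee burden while still achieving the purpose of the assessment. This study uses vertical equating designs based on the non-equivalent groups with anchor test design (NEAT) and the balanced incomplete block design (BIB), and estimates the mean and standard deviation of population ability with four methods: the plausible values method, the expected a posteriori (EAP) method with background variables, the EAP method, and maximum likelihood estimation. The main purpose is to examine how well the plausible values method and the other estimation methods recover population parameters. The results show that, under the different equating designs, the estimation methods that incorporate background variables estimate both the mean and the standard deviation of population ability better. For the standard deviation of population ability in particular, the plausible values method far outperforms the other estimation methods. |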
English Abstract |
The purpose of this paper is to explore the performance of the plausible values method under BIB and NEAT designs, based on simulated data. The major focus of large-scale assessments is on population statistics, such as means and standard deviations, and the plausible values method is usually used to estimate these population parameters. In large-scale assessments the spectrum of subject matter is usually wide, but the testing time is short. Therefore, multiple booklets are used in order to cover the proficiency domain sufficiently. The balanced incomplete block design (BIB) and the non-equivalent groups with anchor test design (NEAT) are two popular test equating designs for this situation. The experimental results show that the plausible values method estimates the population parameters better than the other methods under both equating designs, and that the population parameters (means and standard deviations) are well estimated as the test length increases. In these experimental conditions, the estimates of the population parameters are not affected by sample size (16,128 and 10,920). Under both linking designs, BIB and NEAT, the plausible values method leads to more precise estimates. |
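The contrast between the estimators of the population standard deviation can be illustrated with a small simulation. The following is a minimal sketch under simplifying assumptions, not the study's procedure: it replaces the IRT model and the BIB/NEAT booklets with a toy normal-normal model (true ability drawn from N(0, 1), one noisy score per examinee), uses the noisy score itself as a stand-in for the maximum likelihood estimate, and omits background variables. The function name `simulate` and the parameters `error_sd` and `n_pv` are hypothetical.

```python
import random
import statistics


def simulate(n=20000, error_sd=0.6, n_pv=5, seed=0):
    """Compare the spread of three kinds of ability estimates (toy model)."""
    rng = random.Random(seed)

    # Normal-normal model: prior N(0, 1), score = ability + N(0, error_sd^2).
    # The posterior is N(reliability * score, reliability * error_sd^2).
    reliability = 1.0 / (1.0 + error_sd ** 2)
    post_sd = (reliability * error_sd ** 2) ** 0.5

    mle = []                          # noisy scores (stand-in for MLE)
    eap = []                          # posterior means
    pvs = [[] for _ in range(n_pv)]   # one list per set of plausible values

    for _ in range(n):
        theta = rng.gauss(0.0, 1.0)
        score = theta + rng.gauss(0.0, error_sd)
        post_mean = reliability * score
        mle.append(score)
        eap.append(post_mean)
        for draws in pvs:
            # A plausible value is a random draw from the posterior.
            draws.append(rng.gauss(post_mean, post_sd))

    # Compute the SD within each plausible-value set, then average the sets.
    pv_sd = statistics.mean([statistics.pstdev(d) for d in pvs])
    return {
        "mle_sd": statistics.pstdev(mle),
        "eap_sd": statistics.pstdev(eap),
        "pv_sd": pv_sd,
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

In this toy model the true standard deviation is 1. Analytically, the noisy scores have a standard deviation of about 1.17 (inflated by measurement error), the posterior means about 0.86 (shrunk toward the prior mean), and the plausible values 1.0, which matches the pattern reported for the standard deviation in the abstract.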