中華心理學刊 (Chinese Journal of Psychology)
Title
判斷試題是否具有良好鑑別力的客觀方法
Parallel title
AN OBJECTIVE PROCEDURE FOR DETERMINING IF ITEMS HAVE GOOD DISCRIMINATION
Authors 王文中; 洪來發
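Abstract (translated from the Chinese)
Item difficulty and item discrimination are the two principal forms of item analysis. Traditionally, item discrimination is expressed with the discrimination index, the point-biserial correlation, or the biserial correlation. Because these indices come with no objective statistical procedure for deciding whether an item discriminates well, practitioners usually fall back on subjective criteria: for example, a discrimination index or point-biserial correlation above 0.3 is taken to indicate good discrimination, and a value at or below it is not. This study proposes an objective statistical procedure for testing whether an item has good discrimination. We first define an item with good discrimination as one that discriminates equally effectively at every score point of the target population. We then use the linear relationship between an item's proportion correct and the total test score to give this definition concrete meaning, and from it derive the values that the regression coefficients of a logistic regression model should take to meet the requirement of good discrimination. We also explain how to obtain estimates of an item's logistic parameters and how to test whether the item has good discrimination. Computer simulations show that the linear and logistic models agree well, especially when total test scores follow a normal or chi-square distribution and the sample size is below 5000. As a worked example, we analyze data from 50 multiple-choice items of the English section of the Joint College Entrance Examination and 500 examinees. The results of the proposed method are broadly consistent with those of classical item analysis, but the proposed method has an objective statistical meaning that the classical methods lack.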
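The classical indices and the 0.3 rule of thumb mentioned in the abstract can be illustrated with a minimal Python sketch. It is not taken from the article: the function names, the toy data, and the use of upper and lower 27% groups for the discrimination index are assumptions made here for illustration (27% is a common convention, not something the abstract specifies).

```python
import math

def point_biserial(item, total):
    """Pearson correlation between a 0/1 item score and the total test score."""
    n = len(item)
    mean_x = sum(item) / n
    mean_y = sum(total) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(item, total))
    var_x = sum((x - mean_x) ** 2 for x in item)
    var_y = sum((y - mean_y) ** 2 for y in total)
    return cov / math.sqrt(var_x * var_y)

def discrimination_index(item, total, fraction=0.27):
    """Proportion correct in the top-scoring group minus that in the bottom group.

    The 27% group size is an assumed convention, not a value from the abstract.
    """
    k = max(1, round(len(item) * fraction))
    order = sorted(range(len(item)), key=lambda i: total[i])
    low = [item[i] for i in order[:k]]
    high = [item[i] for i in order[-k:]]
    return sum(high) / k - sum(low) / k

# Toy data, made up for illustration: 10 examinees, one item.
item = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]             # 1 = correct, 0 = incorrect
total = [12, 15, 18, 20, 22, 25, 28, 30, 33, 35]  # total test scores

r_pb = point_biserial(item, total)
d = discrimination_index(item, total)

# The subjective rule of thumb: flag the item as "good" above the 0.3 cut-point.
flag_good = r_pb > 0.3
print(f"point-biserial = {r_pb:.3f}, D = {d:.2f}, flagged good: {flag_good}")
```

The cut-point comparison on the last lines is exactly the step the study argues has no statistical justification: the threshold is a convention rather than the outcome of a test.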
English abstract
Item difficulty and item discrimination analyses are the two most important item analyses. Several indices, such as the index of discrimination, the point-biserial correlation, the biserial correlation, and Fleiss's odds ratio, have been proposed to describe item discrimination power. Although one can test whether these conventional indices differ significantly from zero, there is no objective criterion for how large they should be for an item to have good discrimination. In practice, test analysts usually use 0.3 or 0.4 as a cut-point: if the index of discrimination or the point-biserial correlation exceeds the cut-point, the item is flagged as exhibiting good discrimination power. Besides lacking an objective criterion, these indices depend on sample characteristics and item difficulty. For example, they yield higher values when item difficulty (i.e., passing rate) is close to 0.5 than when it is at either extreme. Although the cut-point may be a useful guideline, a statistical procedure is preferable. This study attempts to establish an objective statistical procedure for determining whether an item has good discrimination. To do so, good discrimination is first defined: an item is said to have good discrimination if it discriminates every score point equally well for the target population. We find this definition appropriate because every score point is considered equally important. We use a linear relationship between the probability of passing an item and the test score to depict what the regression should look like when an item has good discrimination. For binary outcome variables, the logistic distribution better depicts the relationship between the probability of passing an item and the test score. To satisfy the equal-discrimination assumption, we then derive the logistic regression curve that is closest to the ideal discrimination line.
Once this 'theoretical' logistic regression curve is derived, the observed logistic regression curve, estimated from test data, can be compared with it. If the observed curve is statistically different from the theoretical one, the item is said not to have good discrimination. A simulation study was conducted to compare the detection of item discrimination under the linear regression model and the logistic regression model when the underlying test score distribution is normal, uniform, or chi-square and the sample size is 40, 100, 500, 2000, or 5000. When the test scores follow the normal or chi-square distribution, the two models yield almost identical results; only when the sample size is extremely large, say up to 5000, do they yield different results. A real data set with 50 multiple-choice items and 500 examinees was analyzed to illustrate the similarities and differences between the proposed logistic regression method and the conventional item discrimination indices. Five items were arbitrarily chosen and analyzed. The item difficulties (proportions of correct responses) of these five items are between 0.50 and 0.81. Only one item is flagged as not exhibiting good discrimination, with a p value of 0.000. Overall, the three conventional discrimination indices lead to almost identical results. This is expected because all of these procedures were designed to depict item discrimination; however, only the proposed objective procedure is statistically sound.
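The comparison between an observed and a 'theoretical' logistic regression curve could be set up roughly as in the following Python sketch. This is only an illustration under stated assumptions, not the authors' procedure: the abstract does not give the theoretical coefficients, so `b0_ref` and `b1_ref` are hypothetical placeholders, the data are synthetic, and the likelihood ratio test with 2 degrees of freedom is one plausible way to test both coefficients jointly.

```python
import math

def log_likelihood(b0, b1, scores, responses):
    """Bernoulli log-likelihood of a logistic curve P(correct) = 1/(1+exp(-(b0+b1*x)))."""
    ll = 0.0
    for x, y in zip(scores, responses):
        z = b0 + b1 * x
        # y*z - log(1 + exp(z)), written in a numerically stable form
        ll += y * z - (max(z, 0.0) + math.log1p(math.exp(-abs(z))))
    return ll

def fit_logistic(scores, responses, iterations=50):
    """Maximum-likelihood fit of (b0, b1) by Newton-Raphson on centered scores."""
    mean_x = sum(scores) / len(scores)
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(scores, responses):
            xc = x - mean_x
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xc)))
            w = p * (1.0 - p)
            g0 += y - p            # gradient components
            g1 += (y - p) * xc
            h00 += w               # information matrix entries
            h01 += w * xc
            h11 += w * xc * xc
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < 1e-10:
            break
    return b0 - b1 * mean_x, b1    # undo the centering

def lr_test(scores, responses, b0_ref, b1_ref):
    """Likelihood ratio test of the fitted curve against fixed reference coefficients."""
    b0_hat, b1_hat = fit_logistic(scores, responses)
    stat = 2.0 * (log_likelihood(b0_hat, b1_hat, scores, responses)
                  - log_likelihood(b0_ref, b1_ref, scores, responses))
    stat = max(stat, 0.0)
    p_value = math.exp(-stat / 2.0)  # chi-square survival function for 2 df
    return stat, p_value

# Synthetic data (assumption): 20 examinees at each total score 0..20, with the
# number correct set to the rounded expectation under b0 = -3, b1 = 0.3.
scores, responses = [], []
for x in range(21):
    n_correct = round(20 / (1.0 + math.exp(-(-3.0 + 0.3 * x))))
    scores += [x] * 20
    responses += [1] * n_correct + [0] * (20 - n_correct)

b0_hat, b1_hat = fit_logistic(scores, responses)
stat_same, p_same = lr_test(scores, responses, b0_ref=-3.0, b1_ref=0.3)  # placeholder reference
stat_flat, p_flat = lr_test(scores, responses, b0_ref=0.0, b1_ref=0.0)   # flat, non-discriminating curve
print(f"fitted b0 = {b0_hat:.3f}, b1 = {b1_hat:.3f}")
print(f"vs. matching reference: LR = {stat_same:.3f}, p = {p_same:.3f}")
print(f"vs. flat reference:     LR = {stat_flat:.1f}, p = {p_flat:.3g}")
```

In this setup a small p value means the observed curve departs from the reference curve, which is the sense in which the abstract describes an item as not having good discrimination. Substituting the coefficients actually derived in the article for the placeholders would be required before drawing any substantive conclusion.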
Pages 253-262
Keywords Item discrimination; Logistic regression model; Likelihood ratio test; Pearson chi-squared test
Journal 中華心理學刊 (Chinese Journal of Psychology)
Issue December 2002 (Vol. 44, No. 2)
Publisher 台灣心理學會 (Taiwanese Psychological Association)
 
