Purpose
The study aims to develop a performance scale and performance level descriptors (PLDs) for Mandarin reading in the fourth learning stage, and to provide sample items to illustrate the PLDs.
Main Theories or Conceptual Frameworks
Each standard-setting method has its own strengths. Most studies adopt a content-based standard-setting method, but such procedures rely on panelists' judgments and can therefore be influenced by subjective factors. This study instead adopts the scale anchoring method, which is commonly employed in international large-scale educational assessments such as TIMSS and PISA, to help data users interpret what the scale scores represent.
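To make the anchoring logic concrete, a minimal sketch is given below; the 65% mastery threshold, the ±12.5-point score band around each anchor point, and the anchoring rule itself are illustrative assumptions, not the criteria actually used in the study.

```python
import numpy as np

def anchor_items(scores, responses, anchor_points, band=12.5, threshold=0.65):
    """Illustrative scale-anchoring sketch (assumed criteria, not the study's).

    scores:        (n_students,) array of scale scores
    responses:     (n_students, n_items) array of 0/1 scored responses
    anchor_points: e.g. [400, 475, 550, 625]
    Returns {anchor_point: [indices of items that 'anchor' at that point]}.
    """
    anchored = {}
    prev_p = None
    for point in anchor_points:
        # Students whose scale scores fall near this anchor point
        near = np.abs(scores - point) <= band
        p_correct = responses[near].mean(axis=0)  # per-item proportion correct
        # An item anchors here if students at this point mostly succeed on it
        # and students near the next lower point did not (illustrative rule).
        if prev_p is None:
            hits = p_correct >= threshold
        else:
            hits = (p_correct >= threshold) & (prev_p < threshold)
        anchored[point] = np.where(hits)[0].tolist()
        prev_p = p_correct
    return anchored
```

Operational anchoring procedures in programs such as TIMSS apply stricter, multi-part criteria; the sketch only conveys the general idea of tying item content to regions of the score scale.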
Research Design/Methods/Participants
A survey research method was employed to collect the empirical data required for standard setting. A longitudinal design was used to measure students' performance in Mandarin reading, and the scale anchoring method was adopted to set performance standards and generate performance level descriptors. The study population consisted of 7th-grade students in the 107 and 108 academic years; the 107 cohort served as Panel 1 and the 108 cohort as Panel 2, and each panel was followed up from 7th to 8th grade. A two-stage stratified cluster sampling design was implemented: in the first stage, schools were selected with probability proportional to size (PPS); in the second stage, classes within the selected schools were randomly sampled, and all students in the sampled classes were included in the study. In Panel 1, 2,803 students were assigned to the Mandarin reading assessment in 7th grade and 2,807 in 8th grade; in Panel 2, 2,565 students were assigned in 7th grade and 2,780 in 8th grade. The assessment instrument was a competency-based computerized online test of Mandarin reading consisting of 344 items, with test booklets assembled under a partially balanced incomplete block design. The assessment showed strong evidence of reliability and validity. Item difficulty was estimated with the partial credit model (PCM) of item response theory.
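For reference, and without reproducing the study's specific parameterization, the partial credit model in its standard formulation gives the probability that a student with ability \(\theta\) obtains score category \(k\) on item \(i\) as

\[
P_{ik}(\theta) = \frac{\exp\!\left[\sum_{j=0}^{k}\left(\theta - \delta_{ij}\right)\right]}{\sum_{h=0}^{m_i}\exp\!\left[\sum_{j=0}^{h}\left(\theta - \delta_{ij}\right)\right]}, \qquad k = 0, 1, \ldots, m_i,
\]

where \(\delta_{ij}\) is the \(j\)-th step difficulty of item \(i\), \(m_i\) is the item's maximum score, and \(\sum_{j=0}^{0}(\theta - \delta_{ij}) \equiv 0\) by convention. These symbols are generic notation, not the study's reported parameters.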
Research Findings or Conclusions
The four cut scores for the performance levels were set at 400, 475, 550, and 625, corresponding to levels M1, M2, M3, and M4. Descriptions were developed for each performance level to portray the progression of students' abilities. Reading tasks were categorized into two literacy types, general reading and digital reading, and the reading cognitive processes were classified into three dimensions: locating information, understanding, and evaluating and reflecting. The performance level descriptors (PLDs) detail the literacy skills demonstrated at each level for both types of reading. In general, item difficulty increased with cognitive complexity; however, the empirical data showed that some tasks of lower cognitive complexity were still difficult for students, whereas some tasks of higher cognitive complexity were comparatively easy.
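As a simple illustration of how these cut scores partition the scale, a score-to-level lookup might be written as follows; treating scores below 400 as "below M1" is an assumption made here for completeness, not a claim about the study's reporting.

```python
# Cut scores reported in the study, paired with their performance levels.
CUT_SCORES = [(625, "M4"), (550, "M3"), (475, "M2"), (400, "M1")]

def performance_level(scale_score: float) -> str:
    """Map a Mandarin reading scale score to its performance level.

    Cut scores 400/475/550/625 are taken from the study; labeling
    scores under 400 as "below M1" is an illustrative assumption.
    """
    for cut, level in CUT_SCORES:
        if scale_score >= cut:
            return level
    return "below M1"

# Example: a scale score of 560 falls at level M3.
print(performance_level(560))  # -> "M3"
```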
Theoretical or Practical Insights/Contributions/Recommendations
This study validates the effectiveness of the standard setting with large-scale empirical data and selects appropriate sample items on the basis of empirical evidence. The resulting performance level descriptors (PLDs) reveal how reading processes differ across performance levels, enabling teachers to accurately identify students' difficulties in reading comprehension. Future research is recommended to develop PLDs for Mandarin reading literacy in the third and fifth learning stages, thereby providing a longitudinal framework of PLDs across learning stages. In addition, developing digital reading items that span a range of difficulty levels is suggested in order to enrich the PLDs for the evaluating and reflecting processes at levels M1 through M3. Furthermore, the PLDs can serve as a "learning map" for designing instructional plans tailored to students at different performance levels, enhancing their practical value in teaching and learning contexts.