
Automatic Correction for Graphemic Chinese Misspelled Words
Authors: 張道行, 蘇守彥, 陳學志
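Misspelled words (錯別字) are an important issue in learning Chinese, whether as a native or a foreign language. Many studies offer suggestions for correcting the misspellings of students still in school, along with strategies for teachers on teaching correct characters. Although students do much to prevent and correct misspellings during their schooling, they sometimes still produce them unknowingly when writing documents. Beyond emphasizing the recognition of misspelled words in teaching, flagging misspellings as they occur during writing has therefore become an important problem. A database of character components and structural composition (部件組字與形構資料庫) gives the structure of each character and its constituent components, which makes it possible to examine confusions caused by graphemic similarity and so to identify the wrong character behind an error. However, having a program find misspelled words in a document automatically and correctly is not easy. Researchers in various fields are studying and applying graphemic misspelling detection, but accuracy still falls short of practical needs. If the types, probabilities, and contexts of misspelled words are analyzed carefully, it should be possible to detect them more precisely and quickly and to correct them effectively. This paper uses three features (the bi-gram word ratio, the bi-gram part-of-speech ratio, and candidate-word similarity) and tries three classification models, SVM, neural network, and linear regression, for detecting and correcting misspelled words.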
Misspelled words are an important issue for learners of Chinese, whether as a first or a second language. Many studies have offered suggestions for correcting the misspellings of students still in school, as well as strategies for teachers on teaching Chinese characters. Although students take many precautions against misspellings and correct them during their schooling, they sometimes still produce misspelled words unknowingly when writing. Therefore, in addition to emphasizing the recognition of misspelled words in teaching, flagging misspellings as they occur during writing has become a critical issue. Nevertheless, it is not easy for a program to find misspelled words in documents automatically and correctly. Researchers in various fields are currently studying and applying graphemic misspelled word detection, but accuracy still falls well short of practical needs. If the types, probabilities, and contexts of misspelled words can be analyzed in detail, it should be possible to detect them more quickly and precisely and to correct them effectively. We have already accumulated considerable research experience on graphemic misspelled words. This project will draw on resources provided by the mainline project to address the problem of graphemic misspelled words. If it achieves a breakthrough, it will not only offer an effective auxiliary tool for teaching Chinese misspelled words, but also help build a learning tool based on a corpus of Chinese character errors more quickly.
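The abstract lists a bi-gram word ratio among its features but does not define it here. The following is a minimal sketch of one plausible reading, not the authors' formulation: for each character that has visually similar alternatives, compare the corpus bi-gram counts obtained with the alternative against those obtained with the original character. The counts in `BIGRAM_COUNTS`, the confusion sets in `SIMILAR`, the add-one smoothing, and the fixed threshold are all hypothetical; in particular, the threshold merely stands in for the trained classifiers (SVM, neural network, linear regression) that the abstract mentions.

```python
from collections import Counter

# Toy bi-gram counts standing in for a real corpus (hypothetical numbers).
BIGRAM_COUNTS = Counter({
    ("已", "經"): 120,
    ("己", "經"): 1,
    ("自", "己"): 90,
    ("自", "已"): 2,
})

# Hypothetical confusion sets of visually similar characters.
SIMILAR = {"己": ["已"], "已": ["己"]}


def bigram_score(text, i, ch):
    """Add-one-smoothed sum of the bi-gram counts formed by placing ch at position i."""
    score = 0
    if i > 0:
        score += BIGRAM_COUNTS[(text[i - 1], ch)]
    if i + 1 < len(text):
        score += BIGRAM_COUNTS[(ch, text[i + 1])]
    return score + 1  # add-one smoothing avoids division by zero


def bigram_ratio(text, i, candidate):
    """Ratio of the candidate's bi-gram score to the original character's score."""
    return bigram_score(text, i, candidate) / bigram_score(text, i, text[i])


def suggest(text, threshold=10.0):
    """Return (position, original, candidate, ratio) for substitutions above the threshold."""
    suggestions = []
    for i, ch in enumerate(text):
        for cand in SIMILAR.get(ch, []):
            ratio = bigram_ratio(text, i, cand)
            if ratio >= threshold:
                suggestions.append((i, ch, cand, ratio))
    return suggestions


print(suggest("我己經"))  # flags 己 at position 1 and proposes 已
print(suggest("自己"))    # no suggestion: the original bi-gram is already common
```

In a fuller pipeline this ratio would be one input among several (alongside a part-of-speech ratio and a graphemic similarity score) to a trained model rather than a hand-set cutoff.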
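Pages: 125-139
Keywords: 別字偵測, 別字校正, 字形相似; Misspelled Words Detection, Misspelled Words Correction, Graphemic Similarity
Journal: ROCLING論文集 (ROCLING Proceedings)
Issue: 2012
Publisher: 中華民國計算語言學學會 (The Association for Computational Linguistics and Chinese Language Processing)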



