Reliable and Cost-Effective PoS-Tagging

Yu-Fang Tsai; Keh-Jiann Chen

Title	Reliable and Cost-Effective PoS-Tagging
Parallel title	Reliable and Cost-Effective PoS-Tagging
Authors	Yu-Fang Tsai, Keh-Jiann Chen
English abstract	To achieve fast, high-quality part-of-speech (PoS) tagging, algorithms should be highly accurate and require little manual proofreading. To evaluate a tagging system, we propose a new criterion, reliability, which is a cost-effectiveness criterion, in place of the conventional criterion of accuracy. The most cost-effective tagging algorithm is judged by the amount of manual editing required and the final accuracy achieved. The reliability of a tagging algorithm is defined as the estimated best accuracy of the tagging under a fixed amount of proofreading. We compare the tagging accuracies and reliabilities of different tagging algorithms, including a Markov bi-gram model, a Bayesian classifier, and a context-rule classifier. In our experiments, for the most cost-effective tagging algorithm, on average 20% of the samples of ambivalence words need to be rechecked to achieve an estimated final accuracy of 99%. The tradeoffs between the amount of proofreading and the final accuracy for the different algorithms are also compared. We conclude that the algorithm with the highest accuracy is not always the most reliable algorithm.
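Pages	1-14
Journal	ROCLING論文集 (ROCLING Proceedings)
Issue	2003
Publisher	Association for Computational Linguistics and Chinese Language Processing (中華民國計算語言學學會)
Previous article in this issue	Auto-Discovery of NVEF Word-Pairs in Chinese
Next article in this issue	Chinese Word Auto-Confirmation Agent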
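
The abstract defines reliability as the estimated best accuracy reachable under a fixed amount of proofreading. The sketch below illustrates that idea under assumptions that are not taken from the paper: each tag carries a confidence score, the least-confident tags are the ones rechecked, and proofreading corrects every rechecked tag. The function name `accuracy_after_proofreading` and the two toy taggers are hypothetical and serve only as illustration; they are not the authors' estimator or data.

```python
def accuracy_after_proofreading(confidences, is_correct, fraction):
    """Final accuracy if the least-confident `fraction` of tags is rechecked.

    Assumption (not from the paper): a rechecked tag always ends up correct.
    """
    if len(confidences) != len(is_correct):
        raise ValueError("confidences and is_correct must have the same length")
    if not confidences:
        raise ValueError("at least one tagged token is required")
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie in [0, 1]")

    n = len(confidences)
    # Token indices ordered from least to most confident.
    order = sorted(range(n), key=lambda i: confidences[i])
    # The proofreading budget, as a number of tokens.
    checked = set(order[:round(fraction * n)])
    # A token counts as correct if it was right already or was rechecked.
    n_correct = sum(1 for i in range(n) if i in checked or is_correct[i])
    return n_correct / n


# Hypothetical toy data: tagger A makes one error but is confident about it;
# tagger B makes two errors, both on its least-confident tokens.
conf_a = [0.99, 0.98, 0.97, 0.96, 0.95, 0.94, 0.93, 0.92, 0.91, 0.90]
ok_a = [False, True, True, True, True, True, True, True, True, True]
conf_b = [0.99, 0.98, 0.97, 0.96, 0.95, 0.94, 0.93, 0.92, 0.55, 0.51]
ok_b = [True, True, True, True, True, True, True, True, False, False]

for name, conf, ok in (("A", conf_a, ok_a), ("B", conf_b, ok_b)):
    raw = accuracy_after_proofreading(conf, ok, 0.0)
    final = accuracy_after_proofreading(conf, ok, 0.2)
    print(f"tagger {name}: raw accuracy {raw:.2f}, after 20% proofreading {final:.2f}")
```

In this toy setting tagger A starts with the higher raw accuracy, yet tagger B ends higher after the same 20% proofreading budget because its errors fall among its least-confident tags. That matches the abstract's conclusion that the most accurate algorithm is not always the most reliable one.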