
Some Issues on Applying SA-class Bigram Language Models
Authors: 張照煌, 陳正德
This paper investigates several issues in applying class-based Chinese language models, especially the SA-class bigram model, in which word classes are clustered automatically by simulated annealing. The issues studied are: (1) using test-set perplexity as a quality measure for evaluating the performance of language models across domains, subdomains, and character codings; (2) applying the SA-class bigram model to different tasks, namely OCR postprocessing, syllable-to-character conversion, and linguistic decoding for speech recognition; (3) comparing the model with other language models (least-word, word-frequency, inter-word character bigram, and word bigram); and (4) deciding the appropriate number of classes based on corpus size. The experimental results show that test-set perplexity is indeed a good measure for evaluating the performance of language models, and that the SA-class bigram language model is not only theoretically plausible but also practically feasible: it achieves high performance with lower resource requirements.
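Pages: 171-186
Publication: ROCLING論文集 (ROCLING Proceedings)
Issue: 1994
Publisher: 國立高雄師範大學輔導與諮商研究所 (Graduate Institute of Guidance and Counseling, National Kaohsiung Normal University)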
Previous article in this publication: An Estimation of the Entropy of Chinese - A New Approach to Constructing Class-based n-gram Models
Next article in this publication: A Text Conversion System Between Simplified and Complex Chinese Characters Based on OCR Approaches



