Using Duration Information in Cantonese Connected-Digit Recognition

Zhu, Yu; Lee, Tan

Title	Using Duration Information in Cantonese Connected-Digit Recognition
Authors	Zhu, Yu; Lee, Tan
Abstract	This paper investigates the use of explicit statistical duration models for Cantonese connected-digit recognition. Cantonese is a major Chinese dialect. The phonetic compositions of Cantonese digits are generally very simple, and some digits contain only a single vowel or nasal segment. This makes it difficult to attain high accuracy in the automatic recognition of Cantonese digit strings. Recognition errors are mainly due to the insertion or deletion of short digits. It is widely acknowledged that the hidden Markov model does not impose effective control on the duration of the speech segments being modeled. Our approach uses a set of statistical duration models that are built explicitly from automatically segmented training data. They parametrically describe the distributions of various absolute and relative duration features. The duration models are used to assess recognition hypotheses and produce probabilistic duration scores. The duration scores are added to the acoustic score with an empirically determined weight. In this way, a hypothesis that is competitive in acoustic likelihood but unfavorable in temporal organization will be pruned. The conventional Viterbi search algorithms for connected-word recognition are modified to incorporate both state-level and word-level duration features. Experimental results show that absolute state duration gives the most noticeable improvement in digit recognition accuracy. With the use of duration information, insertion errors are substantially reduced, while deletion errors increase slightly. It is also found that explicit duration models are more effective for slow speech than for fast speech.
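Pages	1-16
Keywords	Explicit duration modeling; Duration features; Connected-digit recognition; Cantonese; Hidden Markov models
Journal	International Journal of Computational Linguistics and Chinese Language Processing (中文計算語言學期刊)
Issue	March 2006 (Vol. 11, No. 1)
Publisher	Association for Computational Linguistics and Chinese Language Processing (中華民國計算語言學學會)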
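
The score combination described in the abstract (a duration score added to the acoustic score with an empirically determined weight) can be illustrated with a minimal sketch. This is not the authors' implementation: the paper modifies the Viterbi search itself, whereas the sketch below only rescores a list of finished hypotheses, which is a simpler stand-in. The Gaussian form of the duration model, the function names, the unit labels, and all numbers are assumptions made for illustration.

```python
import math


def gaussian_log_pdf(x, mean, std):
    """Log density of a normal distribution (the Gaussian form is an assumption)."""
    return -0.5 * math.log(2.0 * math.pi * std * std) - 0.5 * ((x - mean) / std) ** 2


def duration_score(durations, models):
    """Sum the log duration probabilities over the segments of one hypothesis.

    durations: list of (unit_label, length_in_frames) pairs
    models:    dict mapping unit_label -> (mean, std), both in frames
    """
    return sum(gaussian_log_pdf(d, *models[u]) for u, d in durations)


def combined_score(acoustic, durations, models, weight):
    """Acoustic log-likelihood plus the weighted duration score."""
    return acoustic + weight * duration_score(durations, models)


def rescore(hypotheses, models, weight):
    """Return the hypotheses sorted best-first by combined score."""
    return sorted(
        hypotheses,
        key=lambda h: combined_score(h["acoustic"], h["durations"], models, weight),
        reverse=True,
    )


# Hypothetical duration statistics (mean, std in frames) for two units.
models = {"u1": (20.0, 5.0), "u2": (18.0, 4.0)}

# Hypothesis B is slightly better acoustically but contains an implausibly
# short segment, the kind of insertion error the abstract describes.
hypotheses = [
    {"name": "A", "acoustic": -100.0, "durations": [("u1", 20)]},
    {"name": "B", "acoustic": -99.0, "durations": [("u1", 3), ("u2", 17)]},
]

best_with_duration = rescore(hypotheses, models, weight=1.0)[0]["name"]
best_acoustic_only = rescore(hypotheses, models, weight=0.0)[0]["name"]
```

With the weight set to zero the ranking depends on the acoustic score alone, so the hypothesis containing the very short segment wins; with a positive weight its low duration probability pushes it below the competing hypothesis. In practice the weight would be tuned on held-out data, as the abstract's phrase "empirically determined" suggests.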