Some Studies on Min-Nan Speech Processing

Wei-Chih Kuo; Chen-Chung Ho; Xiang-Rui Zhong; Zhen-Feng Liang; Hsiu-Min Yu; Yih-Ru Wang; Sin-Horng Chen

Title	Some Studies on Min-Nan Speech Processing
Authors	Wei-Chih Kuo; Chen-Chung Ho; Xiang-Rui Zhong; Zhen-Feng Liang; Hsiu-Min Yu; Yih-Ru Wang; Sin-Horng Chen
Abstract	In this paper, three studies of Min-Nan speech processing are presented. The first study concerns the implementation of a high-performance Min-Nan TTS system. Using the waveform templates of 877 base-syllables as basic synthesis units, together with an RNN-based prosody generation method and the PSOLA algorithm for prosody modification, this Min-Nan TTS system converts texts, represented in both the Han-Luo (漢羅) and Chinese logographic writing systems, into natural Min-Nan speech. An informal, subjective listening test confirms that the system performs well and that the synthetic speech sounds natural for well-tokenized Min-Nan texts and for automatically tokenized Chinese logographic texts. The second study concerns the realization of a Min-Nan speech recognizer. It adopts the initial-final-based HMM approach with a simple base-syllable bigram language model, and achieves a base-syllable recognition rate of 65.1%. Finally, a model-based tone labeling method is presented. For automatic tone labeling, this method uses a statistical model to eliminate the effects of all factors other than tone on the syllable pitch contour. Experimental results confirm that this method outperforms the conventional VQ-based approach.
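Pages	391-410
Keywords	Min-Nan Text-to-Speech System; Speech Recognition; Model-Based Tone Labeling
Journal	中文計算語言學期刊 (International Journal of Computational Linguistics and Chinese Language Processing)
Issue	Vol. 12, No. 4 (December 2007)
Publisher	中華民國計算語言學學會 (The Association for Computational Linguistics and Chinese Language Processing)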
Previous article in this issue	A System Framework for Integrated Synthesis of Mandarin, Min-Nan, and Hakka Speech
Next article in this issue	Construction and Automatization of a Minnan Child Speech Corpus with some Research Findings