A Mandarin Speech Synthesis Method Using Articulation-knowledge Based Spectral HMM Structure (基於發音知識以建構頻譜HMM之國語語音合成方法)

古鴻炎; 賴名彥; 洪尉翔; 陳彥樺

Title	基於發音知識以建構頻譜HMM之國語語音合成方法
Parallel title	A Mandarin Speech Synthesis Method Using Articulation-knowledge Based Spectral HMM Structure
Authors	古鴻炎、賴名彥、洪尉翔、陳彥樺
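Chinese abstract (translated)	Given a limited corpus, this paper proposes an HMM structure design that captures the context-dependent spectral characteristics of each speech unit in order to improve the fluency of synthetic speech. In addition, rather than relying on decision-tree context clustering, we cluster contexts according to the articulation knowledge of phonemes, which greatly reduces the number of context combinations. To evaluate the proposed HMM structure, we build three Mandarin speech synthesis systems, each using a different HMM structure, and compare them. In these systems the prosodic parameter values are identical, all generated by the ANN modules from our previous work, whereas the spectral coefficients are generated by each system's own HMMs. For waveform synthesis, all systems use the signal synthesis module based on the harmonic-plus-noise model (HNM) from our previous work. Listening tests show that speech synthesized with the proposed HMM structure is clearly more fluent than speech synthesized with the other HMM structures. Moreover, measurements of the average spectral distance between recorded and synthesized sentences show that the proposed HMM structure reduces spectral distance more than the other HMM structures.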
English abstract	In this paper, a new HMM structure is proposed that works with a limited training corpus to improve the fluency of synthetic speech. Spectral fluency is improved because this HMM structure can model the context-dependent spectral characteristics of a speech unit. In addition, instead of using a decision tree to cluster contexts, we cluster contexts based on knowledge of phoneme articulation, which reduces the enormous number of context combinations. To evaluate the proposed HMM structure, we construct three Mandarin speech synthesis systems, each using a different HMM structure, for comparison. In these systems, the prosodic parameters are all generated with the same previously studied ANN modules, but the spectral coefficients are generated with the HMMs adopted by each system. For signal waveform synthesis, all three systems adopt the previously studied harmonic plus noise model (HNM). According to the results of listening tests, the speech synthesized by the system using the proposed HMM structure is indeed more fluent than the speech synthesized by the other two systems. In addition, average spectral distances are measured between recorded sentences and synthetic sentences. The results show that the HMM structure proposed here also yields a smaller average spectral distance than the other two HMM structures.
Pages	78-88
Keywords	Speech Synthesis, HMM Structure, Articulation Knowledge, Spectral Fluency, Discrete Cepstral Coefficients
Journal	ROCLING Proceedings (ROCLING論文集)
Issue	2014
Publisher	Association for Computational Linguistics and Chinese Language Processing (中華民國計算語言學學會)