Linguistic Analysis for English/Mandarin Speech Synthesis System
Authors 洪翌翔, 黃奕欽, 鄧廣豐
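This paper implements an English/Mandarin mixed-language speech synthesis system based on contextual analysis. For acoustic modeling, we adopt the Hidden Markov Model, a statistical model, as the basis for both Mandarin and English. In the implementation, the input text is first preprocessed and segmented into its Mandarin and English parts. Each part is then synthesized with its pre-trained Mandarin or English acoustic model, and the separately synthesized segments are finally concatenated. Because Mandarin and English are different languages, the coherence of the whole utterance has to be maintained. When the sentence is mainly Mandarin, each embedded English word is analyzed for its part of speech (POS Analysis) and replaced with a Mandarin substitute word (SW) that has the same part of speech. Contextual analysis is then performed on the resulting all-Mandarin sentence to select suitable Mandarin acoustic models, which are used to synthesize the whole Mandarin sentence. The synthesized English segments are then substituted back into the sentence to complete the mixed English/Mandarin utterance. Experimental analysis shows that contextual analysis helps the synthesized sentences flow more smoothly, which makes the synthesized speech of mixed English/Mandarin sentences more natural.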
In this study, we analyze the effect of linguistic information on an English/Mandarin speech synthesis system. To construct the acoustic models for both languages, we adopt the Hidden Markov Model. In the system implementation, we first detect the language segments of the input bilingual sentence and then generate the feature sequences for each language independently. To generate fluent synthesized speech, however, linguistic information must be taken into account. If the bilingual sentence is written mainly in Mandarin with a few English words, we first analyze the Part-Of-Speech (POS) information of the English words. We then replace the English parts with Mandarin substitute words (SW) that have the same POS tags as the corresponding English words. The entire sentence then consists of only one language, so it can be analyzed linguistically while keeping its context information. As a result, the synthesized speech should be more fluent, since the contextual linguistic information is used to choose a suitable acoustic model sequence. To reconstruct the original bilingual utterance, the synthesized English segments are substituted back into the synthesized speech. Experimental results show that adding contextual linguistic information is indeed helpful for generating fluent speech for bilingual sentences.
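The text-side steps described in the abstract (language segmentation, POS lookup, substitute-word replacement) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the lookup tables `TOY_POS` and `SUBSTITUTE_WORDS`, the function names, and the rule that any run of ASCII letters counts as English are all assumptions made for illustration. A real system would use a proper POS tagger and feed the result to the HMM-based synthesizer.

```python
import re

# Hypothetical toy tables, for illustration only; the paper does not list them.
TOY_POS = {"meeting": "NOUN", "email": "NOUN", "check": "VERB"}
SUBSTITUTE_WORDS = {"NOUN": "東西", "VERB": "做"}

# Assumption: a run of ASCII letters (optionally joined by space, ' or -)
# is an English segment; everything else is treated as Mandarin.
_ENGLISH_RUN = re.compile(r"[A-Za-z]+(?:[ '-][A-Za-z]+)*")


def split_by_language(sentence):
    """Return (language, text) segments in their original order."""
    segments = []
    pos = 0
    for match in _ENGLISH_RUN.finditer(sentence):
        if match.start() > pos:
            segments.append(("zh", sentence[pos:match.start()]))
        segments.append(("en", match.group()))
        pos = match.end()
    if pos < len(sentence):
        segments.append(("zh", sentence[pos:]))
    return segments


def substitute_english(segments, default_pos="NOUN"):
    """Replace each English segment with a Mandarin substitute word (SW)
    of the same POS.

    Returns the all-Mandarin sentence and a list of
    (segment index, original English text) pairs, so the English
    segments can be substituted back after synthesis.
    """
    pieces = []
    restore = []
    for index, (lang, text) in enumerate(segments):
        if lang == "en":
            pos_tag = TOY_POS.get(text.lower(), default_pos)
            pieces.append(SUBSTITUTE_WORDS[pos_tag])
            restore.append((index, text))
        else:
            pieces.append(text)
    return "".join(pieces), restore


segments = split_by_language("我明天有一個meeting要開")
mandarin_only, restore = substitute_english(segments)
print(segments)
print(mandarin_only, restore)
```

The all-Mandarin string is what would be passed to contextual analysis, and the `restore` list records where the separately synthesized English segments would be put back.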
Pages 368-377
Keywords English/Mandarin bilingual sentence; Hidden Markov Model; Linguistic analysis; Speech concatenation; Speech synthesis
Journal ROCLING Proceedings
Issue 2019
Publisher 中華民國計算語言學學會 (The Association for Computational Linguistics and Chinese Language Processing)
