English Abstract |
In this study, we analyze the effect of linguistic information on an English/Mandarin bilingual speech synthesis system. The acoustic models for both languages are built with Hidden Markov Models (HMMs). In the baseline implementation, we first detect the language segments of the input bilingual sentence and then generate the feature sequences for each language independently. To generate fluent synthesized speech, however, the linguistic information should be taken into account. When the bilingual sentence is written mainly in Mandarin with a few English words, we first analyze the Part-Of-Speech (POS) information of the English words. We then replace the English parts with Mandarin substitute words (SW) that have the same POS tags as their corresponding English words. The entire sentence then consists of only one language, so it can be analyzed linguistically while keeping its contextual information. The synthesized speech should therefore be more fluent, since the contextual linguistic information is used to choose a suitable acoustic model sequence. To reconstruct the original bilingual utterance, the English segments are then substituted back into the synthesized speech. Experimental results show that adding contextual linguistic information is indeed helpful for generating fluent speech for bilingual sentences. |
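
The text-side steps described in the abstract (language segment detection, POS lookup, and substitute-word replacement) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `SUBSTITUTE_WORDS`, `TOY_POS`, `split_language_segments`, and `substitute_english` are hypothetical, the POS lookup is a toy dictionary standing in for a real English tagger, and English is detected simply as runs of Latin letters. The sketch only records where the English segments were, since the final step (putting the English back) happens on the synthesized audio, which is outside its scope.

```python
import re

# Hypothetical substitute-word table: POS tag -> a Mandarin word with that tag.
SUBSTITUTE_WORDS = {"NOUN": "東西", "VERB": "做", "ADJ": "好"}

# Hypothetical toy POS lookup; a real system would use an English POS tagger.
TOY_POS = {"email": "NOUN", "check": "VERB", "cool": "ADJ"}

# Assumption: an English segment is a run of Latin-letter words,
# optionally joined by a space, apostrophe, or hyphen.
ENGLISH_RUN = re.compile(r"[A-Za-z]+(?:[ '\-][A-Za-z]+)*")


def split_language_segments(sentence):
    """Split a mixed sentence into ordered (language, text) pairs."""
    segments = []
    pos = 0
    for match in ENGLISH_RUN.finditer(sentence):
        if match.start() > pos:
            segments.append(("zh", sentence[pos:match.start()]))
        segments.append(("en", match.group()))
        pos = match.end()
    if pos < len(sentence):
        segments.append(("zh", sentence[pos:]))
    return segments


def substitute_english(segments):
    """Replace each English segment with a same-POS Mandarin substitute word.

    Returns the monolingual sentence and a list of (segment_index, english_text)
    so the English parts can later be restored in the synthesized utterance.
    """
    pieces = []
    replaced = []
    for index, (lang, text) in enumerate(segments):
        if lang == "en":
            # Unknown words fall back to NOUN in this toy lookup.
            tag = TOY_POS.get(text.lower(), "NOUN")
            pieces.append(SUBSTITUTE_WORDS[tag])
            replaced.append((index, text))
        else:
            pieces.append(text)
    return "".join(pieces), replaced


segments = split_language_segments("我要check我的email")
mandarin_only, replaced = substitute_english(segments)
```

Here `mandarin_only` is a sentence in a single language that a Mandarin text analyzer could process with full context, while `replaced` keeps the positions of the English segments for the later substitution step.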