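Chinese abstract |
The pitch contour is essential for synthesizing highly natural speech signals. This paper therefore proposes a new pitch-contour generation method that combines an artificial neural network (ANN) prediction module with global-variance matching (GVM) and real contour selection (RCS) modules. We first analyze the pitch contour of each training syllable and then use the discrete cosine transform (DCT) to convert each contour into a corresponding vector of DCT coefficients. The sequence of DCT vectors from each training sentence, together with the corresponding contextual parameters, is then used to train the ANN weights and the GVM parameters. In the pitch-contour generation experiments, we measure the variance ratio (VR) as the basis for objective evaluation. The results show that the GVM and RCS modules help raise the VR values. In addition, subjective listening tests show that the pitch contours generated by ANN plus GVM are more natural than those generated by the ANN module alone, and that the contours generated by ANN plus GVM plus RCS are more natural still than those generated by ANN plus GVM. |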
English abstract |
Pitch contours are important for synthesizing highly natural speech signals. In this paper, we study a new pitch-contour generation method. The proposed method combines an artificial neural network (ANN) prediction module with global-variance matching (GVM) and real contour selection (RCS) modules. The pitch contour of each training syllable is first analyzed and then transformed via the discrete cosine transform (DCT) into a vector of DCT coefficients. The sequence of DCT vectors analyzed from each training sentence, together with the corresponding contextual parameters, is then used to train the ANN weights and the GVM parameters. In the pitch-contour generation experiments, we measure variance-ratio (VR) values for objective evaluation. The GVM and RCS modules are both shown to help raise the VR values. In addition, subjective listening tests show that pitch contours generated by ANN + GVM are more natural than those generated by the ANN alone, and that contours generated by ANN + GVM + RCS are more natural than those generated by ANN + GVM. |
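As a rough illustration of two ideas named in the abstract, the sketch below converts one syllable pitch contour into a short DCT-coefficient vector and computes a variance ratio between generated and natural values. It is not the authors' implementation. The function names, the orthonormal DCT-II scaling, the definition of VR as generated variance divided by natural variance, and the linear rescaling used for variance matching are all assumptions made for this example, and the pitch values are invented.

```python
import math

def dct2(contour, n_coef):
    """Orthonormal DCT-II of a contour, keeping the first n_coef coefficients."""
    n = len(contour)
    coefs = []
    for k in range(n_coef):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        total = sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                    for i, x in enumerate(contour))
        coefs.append(scale * total)
    return coefs

def idct2(coefs, n):
    """Rebuild n contour samples from (possibly truncated) DCT coefficients."""
    out = []
    for i in range(n):
        total = 0.0
        for k, c in enumerate(coefs):
            scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
            total += scale * c * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(total)
    return out

def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def variance_ratio(generated, natural):
    """Assumed VR definition: variance of generated over variance of natural."""
    return variance(generated) / variance(natural)

def match_variance(values, target_var):
    """Assumed matching rule: rescale about the mean to reach target_var.

    Requires that the input values are not all identical.
    """
    mean = sum(values) / len(values)
    gain = math.sqrt(target_var / variance(values))
    return [mean + gain * (v - mean) for v in values]

# Hypothetical pitch samples (Hz) for one syllable, reduced to 4 coefficients.
syllable = [182.0, 190.0, 201.0, 208.0, 204.0, 195.0, 184.0, 176.0]
vector = dct2(syllable, 4)
print("DCT vector:", [round(c, 2) for c in vector])
print("Reconstruction:", [round(v, 1) for v in idct2(vector, len(syllable))])

# Hypothetical per-syllable mean pitch: natural versus over-smoothed prediction.
natural = [180.0, 215.0, 160.0, 230.0, 175.0, 200.0]
predicted = [188.0, 200.0, 182.0, 206.0, 186.0, 196.0]
print("VR before matching:", round(variance_ratio(predicted, natural), 3))
matched = match_variance(predicted, variance(natural))
print("VR after matching:", round(variance_ratio(matched, natural), 3))
```

A VR well below 1 in this sketch indicates a generated sequence flatter than the natural one, which is the kind of over-smoothing that variance matching is meant to counter.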