A Study of Sub-band Modulation Spectrum Compensation for Robust Speech Recognition (強健性語音辨識中分頻段調變頻譜補償之研究)

黃勝源; 杜文祥; 洪志偉

Title	強健性語音辨識中分頻段調變頻譜補償之研究
Parallel title	A Study of Sub-band Modulation Spectrum Compensation for Robust Speech Recognition
Authors	黃勝源, 杜文祥, 洪志偉
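Chinese abstract	Although speech technology has advanced rapidly, automatic speech recognition remains a topic that merits further research and development. Most current speech recognition systems are used in quiet, interference-free environments, where they achieve quite satisfactory results; when they are applied in real environments, however, the speech signal is often affected by environmental noise and recognition performance degrades markedly. The robustness techniques developed over many years aim to remedy this shortcoming. Among them, one class of methods statistically normalizes the speech features. Traditionally, these methods normalize the full-band temporal sequence of the speech features, yet their effectiveness is usually judged by how well the modulation spectrum is normalized. Normalizing the modulation spectrum of the speech features directly should therefore also work well. In addition, modulation frequency components at different frequencies are not equally important, a property that conventional temporal-sequence normalization largely ignores. Based on these observations, this paper proposes a series of sub-band modulation spectrum statistics normalization methods, which normalize the statistics of different frequency bands separately and thereby improve the robustness of speech features in noisy environments. In recognition experiments on the internationally used Aurora-2 connected-digit database, the proposed methods achieve a relative error rate reduction of up to 65% over the baseline, and the new modulation spectrum normalization methods improve on temporal-sequence normalization by a further 7% to 32% in relative error rate reduction. These results confirm that the new methods improve the recognition performance of speech recognition systems in noisy environments more effectively.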
English abstract	In this paper, we propose a novel scheme for applying feature statistics normalization techniques to robust speech recognition. In the proposed approach, the temporal-domain feature sequence is first converted into the modulation spectral domain. The magnitude part of the modulation spectrum is decomposed into non-uniform sub-band segments, and each sub-band segment is then processed individually by a well-known normalization method, such as mean normalization (MN), mean and variance normalization (MVN), or histogram equalization (HEQ). Finally, we reconstruct the feature stream from all the modified sub-band magnitude spectral segments and the original phase spectrum using the inverse DFT. With this process, the components that correspond to more important modulation spectral bands in the feature sequence can be processed separately. For the Aurora-2 clean-condition training task, the newly proposed sub-band spectral MN, MVN and HEQ provide relative error rate reductions of 18.66% and 23.58% over the conventional temporal MVN and HEQ, respectively.
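Pages	39-52
Keywords	speech recognition, modulation spectrum, statistics normalization, robust speech features
Journal	ROCLING論文集 (ROCLING Proceedings)
Issue	2009
Publisher	中華民國計算語言學學會 (Association for Computational Linguistics and Chinese Language Processing)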
Previous article in this issue	Noise-Robust Speech Features Based on Cepstral Time Coefficients
Next article in this issue	Web Mining for Unsupervised Classification
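
The processing chain described in the English abstract (DFT of a feature sequence, sub-band normalization of the magnitude spectrum, reconstruction with the original phase by the inverse DFT) can be illustrated with a minimal sketch. This is not the authors' implementation. It covers only the MVN variant, and it assumes that each sub-band is mapped to a reference mean and standard deviation. The function name `subband_mvn`, the band edges, and the reference statistics are all hypothetical. A naive DFT keeps the sketch dependency-free; in practice an FFT library would normally be used.

```python
import cmath
import math
from statistics import fmean, pstdev


def dft(x):
    """Naive O(N^2) discrete Fourier transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    """Naive O(N^2) inverse discrete Fourier transform."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def subband_mvn(feature_seq, band_edges, ref_stats):
    """Sub-band mean and variance normalization of a modulation magnitude spectrum.

    feature_seq: one feature dimension over time (list of floats).
    band_edges:  hypothetical bin boundaries over bins 0..N//2, e.g. [0, 2, 4, 9];
                 the bands may be non-uniform in width.
    ref_stats:   hypothetical (mean, std) target for each band (assumption:
                 these would come from clean training data).
    """
    n = len(feature_seq)
    half = n // 2 + 1                      # non-redundant bins of a real signal
    spec = dft(feature_seq)
    mag = [abs(c) for c in spec[:half]]
    phase = [cmath.phase(c) for c in spec[:half]]

    bands = zip(band_edges, band_edges[1:])
    for (lo, hi), (ref_mean, ref_std) in zip(bands, ref_stats):
        band = mag[lo:hi]
        mu, sigma = fmean(band), pstdev(band)
        for i in range(lo, hi):
            z = (mag[i] - mu) / sigma if sigma > 0 else 0.0
            # Clip at zero because a magnitude cannot be negative.
            mag[i] = max(0.0, ref_mean + ref_std * z)

    # Recombine the new magnitudes with the original phase, then restore
    # conjugate symmetry so that the inverse DFT is real-valued.
    new_spec = [0j] * n
    for k in range(half):
        new_spec[k] = cmath.rect(mag[k], phase[k])
    for k in range(1, (n - 1) // 2 + 1):
        new_spec[n - k] = new_spec[k].conjugate()
    return [c.real for c in idft(new_spec)]


# Usage with made-up data: 16 frames, three non-uniform bands over bins 0..8.
frames = [math.sin(0.3 * t) + 0.1 * t for t in range(16)]
edges = [0, 2, 4, 9]
targets = [(1.0, 0.1), (0.5, 0.05), (0.2, 0.02)]
normalized = subband_mvn(frames, edges, targets)
```

Each band is normalized with its own statistics, so low modulation frequencies can be treated differently from high ones, which is the point the abstracts make about components of unequal importance. MN or HEQ could replace the per-band mapping inside the loop without changing the rest of the chain.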