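A Study of Sub-band Feature Statistics Compensation Techniques Based on a Discrete Wavelet Transform for Robust Speech Recognition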

范顥騰; 杜文祥; 洪志偉

Title	強健性語音辨識中基於小波轉換之分頻統計補償技術的研究
Parallel title	A Study of Sub-band Feature Statistics Compensation Techniques Based on a Discrete Wavelet Transform for Robust Speech Recognition
Authors	范顥騰, 杜文祥, 洪志偉
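Abstract (translated from the Chinese)	This paper develops speech feature robustness techniques to improve speech recognition performance in noisy environments. We refine the original full-band normalization of feature-sequence statistics by using the discrete wavelet transform (DWT) to process the temporal sequence of speech features in sub-bands, and from this develop two new feature statistics compensation methods: sub-band mean and variance normalization and sub-band histogram equalization. In both methods, the sub-band sequences obtained from the DWT are each processed by mean and variance normalization or by histogram equalization, and the processed sub-band feature sequences are then combined into a new feature sequence by the inverse DWT. The distinguishing property of this approach is that the feature sequence can be split into non-uniform modulation frequency bands, so that the low modulation frequency bands, which matter more for speech recognition, receive their own robustness processing. Experiments on the Aurora-2 connected-digit database confirm that the proposed sub-band methods outperform the conventional full-band methods in all noise environments. Relative to the baseline results, their relative error rate reductions are all above 50%, which shows that the proposed methods are highly effective at improving the robustness of speech features in noisy environments.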
English abstract	The environmental mismatch caused by additive noise and/or channel distortion often seriously degrades the performance of a speech recognition system. Various robustness techniques have been proposed to reduce this mismatch, and one category of them aims to normalize the statistics of speech features in both training and testing conditions. In general, these statistics normalization methods process the speech feature sequences in a full-band manner, which largely ignores the fact that different modulation frequency components have unequal importance for speech recognition. Based on this observation, we propose that the speech feature streams be processed in a sub-band manner. The temporal-domain feature sequence is first decomposed into non-uniform sub-bands using the discrete wavelet transform (DWT), and then each sub-band stream is individually processed by a well-known normalization method, such as mean and variance normalization (MVN) or histogram equalization (HEQ). Finally, we reconstruct the feature stream from all the modified sub-band streams using the inverse DWT. With this process, the components that correspond to the more important modulation spectral bands in the feature sequence can be processed separately. For the Aurora-2 clean-condition training task, the newly proposed sub-band MVN and HEQ provide relative error rate reductions of 20.32% and 16.39% over the conventional MVN and HEQ, respectively. These results reveal that the proposed methods significantly enhance the robustness of speech features in noise-corrupted environments.
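Pages	251-264
Keywords	discrete wavelet transform, speech recognition, robust speech features
Journal	ROCLING論文集 (ROCLING Proceedings)
Issue	2009
Publisher	中華民國計算語言學學會 (Association for Computational Linguistics and Chinese Language Processing)
Previous article in this issue	資源受限運算環境下華英混雜語音辨識系統 (A Mandarin-English Code-Mixed Speech Recognition System for Resource-Constrained Computing Environments)
Next article in this issue	併合式倒頻譜統計正規化技術於強健性語音辨識之研究 (A Study of Combined Cepstral Statistics Normalization Techniques for Robust Speech Recognition)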
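
The sub-band procedure described in the abstract (decompose the feature time sequence with a DWT, normalize each sub-band, reconstruct with the inverse DWT) can be sketched roughly as follows. This is a minimal illustration rather than the authors' implementation: the Haar wavelet, the two decomposition levels, the zero-mean/unit-variance target for every band, and all function names (`haar_dwt`, `haar_idwt`, `mvn`, `subband_mvn`) are assumptions made here for the example, since the record does not specify them.

```python
import math


def haar_dwt(x):
    """One-level Haar DWT of an even-length sequence -> (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleave the reconstructed sample pairs."""
    s = math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x.append((a + d) / s)
        x.append((a - d) / s)
    return x


def mvn(band, eps=1e-12):
    """Mean and variance normalization of one sub-band sequence.

    Assumption: the target is zero mean and unit variance. A band with
    (near-)zero variance only has its mean removed.
    """
    n = len(band)
    mean = sum(band) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in band) / n)
    if std < eps:
        return [v - mean for v in band]
    return [(v - mean) / std for v in band]


def subband_mvn(seq, levels=2):
    """Sub-band MVN of one feature time sequence (e.g. one cepstral coefficient).

    Repeatedly splitting the approximation band yields non-uniform (dyadic)
    modulation-frequency bands, with the finest resolution at low frequencies.
    """
    if len(seq) % (2 ** levels) != 0:
        raise ValueError("sequence length must be divisible by 2**levels")

    # Analysis: peel off one detail band per level.
    approx = list(seq)
    details = []
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        details.append(detail)

    # Normalize every sub-band independently.
    approx = mvn(approx)
    details = [mvn(d) for d in details]

    # Synthesis: rebuild from the coarsest level back to the finest.
    for detail in reversed(details):
        approx = haar_idwt(approx, detail)
    return approx
```

In this sketch, replacing `mvn` with a histogram equalization routine would give the corresponding sub-band HEQ variant. A real front end would apply the function to each feature dimension of an utterance separately and would need padding or truncation for sequence lengths that are not a multiple of `2**levels`.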
