A Comparative Study of Histogram Equalization (HEQ) for Robust Speech Recognition

Lin, Shih-hsiang; Yeh, Yao-ming; Chen, Berlin

Title	A Comparative Study of Histogram Equalization (HEQ) for Robust Speech Recognition
Authors	Lin, Shih-hsiang; Yeh, Yao-ming; Chen, Berlin
Abstract	The performance of current automatic speech recognition (ASR) systems often deteriorates radically when the input speech is corrupted by various kinds of noise sources. Quite a few techniques have been proposed to improve ASR robustness over the past several years. Histogram equalization (HEQ) is one of the most efficient techniques that have been used to reduce the mismatch between training and test acoustic conditions. This paper presents a comparative study of various HEQ approaches for robust ASR. Two representative HEQ approaches, namely table-based histogram equalization (THEQ) and quantile-based histogram equalization (QHEQ), were first investigated. Then, a polynomial-fit histogram equalization (PHEQ) approach was proposed, which uses a data fitting scheme to efficiently approximate the inverse of the cumulative density function of the training speech for HEQ. Moreover, a temporal average (TA) operation was performed on the feature vector components to alleviate the influence of sharp peaks and valleys caused by non-stationary noises. All the experiments were carried out on the Aurora 2 database and task. Very encouraging results were initially demonstrated. The best recognition performance was achieved by combining PHEQ with TA. Relative word error rate reductions of 68% and 40% over the MFCC-based baseline system were obtained for clean- and multi-condition training, respectively.
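The PHEQ and TA steps named in the abstract can be illustrated with a minimal sketch for a single feature dimension. The abstract gives no implementation details, so everything below is an assumption made for illustration: the function names, the polynomial order of 7, the rank-based CDF estimate, and the 3-frame averaging window.

```python
import numpy as np

def fit_inverse_cdf(train_values, order=7):
    """Fit a polynomial that maps CDF values in (0, 1) back to feature values.

    The polynomial order is an illustrative assumption, not a value from the paper.
    """
    x = np.sort(np.asarray(train_values, dtype=float))
    n = x.size
    cdf = (np.arange(1, n + 1) - 0.5) / n  # empirical CDF at each sorted sample
    return np.polyfit(cdf, x, order)

def pheq(test_values, coeffs):
    """Equalize one feature dimension of one utterance.

    Each frame's CDF value is estimated from its rank within the utterance,
    then mapped through the fitted inverse-CDF polynomial.
    """
    v = np.asarray(test_values, dtype=float)
    n = v.size
    ranks = np.argsort(np.argsort(v, kind="stable"), kind="stable") + 1  # 1..n
    cdf = (ranks - 0.5) / n
    return np.polyval(coeffs, cdf)

def temporal_average(values, span=1):
    """Moving average over 2*span+1 frames; the end frames are repeated as padding."""
    v = np.asarray(values, dtype=float)
    padded = np.pad(v, span, mode="edge")
    kernel = np.ones(2 * span + 1) / (2 * span + 1)
    return np.convolve(padded, kernel, mode="valid")

# Synthetic stand-ins for one cepstral coefficient (not Aurora 2 data).
rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, size=5000)           # "clean" training values
noisy = 2.0 * rng.normal(0.0, 1.0, size=200) + 3  # shifted and scaled test values

coeffs = fit_inverse_cdf(train)
equalized = pheq(noisy, coeffs)
smoothed = temporal_average(equalized, span=1)
```

Only the polynomial coefficients need to be stored per feature dimension in this sketch, which is one way to read the abstract's claim of efficiency compared with keeping a full lookup table.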
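Pages	217-238
Keywords	Automatic speech recognition, Robustness, Histogram equalization, Data fitting, Temporal average
Journal	中文計算語言學期刊 (International Journal of Computational Linguistics and Chinese Language Processing)
Issue	June 2007 (Vol. 12, No. 2)
Publisher	中華民國計算語言學學會 (Association for Computational Linguistics and Chinese Language Processing)
Previous article in this issue	Improve Parsing Performance by Self-Learning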