English Abstract |
This paper proposes a novel scheme that enhances the modulation spectrum of speech features for speech recognition in noise via non-negative matrix factorization (NMF). In the presented approach, we apply NMF to obtain a set of non-negative basis spectral vectors, derived from clean speech, that represent the components important for speech recognition. The approach differs from the conventional NMF-based scheme, which uses an iterative search to update the full-band modulation spectra, in two respects. First, we apply orthogonal projection to update the modulation spectra. Second, we process only the low half-band of the modulation spectrum rather than the full band. The new process improves computational efficiency without degrading recognition performance. On the Aurora-2 database and task, the new NMF-based approach achieves an average error reduction rate of over 58% relative to the MFCC baseline. |
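The two steps named in the abstract, learning non-negative basis vectors from clean speech and updating only the low half-band by orthogonal projection, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `nmf_basis` and `enhance_low_half_band`, the multiplicative-update NMF solver, the rank, the iteration count, and the final clipping to non-negative values are all assumptions made for illustration.

```python
import numpy as np


def nmf_basis(V, rank, n_iter=200, seed=0):
    """Learn non-negative basis vectors W such that V ~= W @ H.

    V holds clean-speech modulation magnitude spectra (low half-band only),
    one column per training sequence. Uses standard multiplicative updates
    for the Frobenius-norm objective (an assumed choice of solver).
    """
    rng = np.random.default_rng(seed)
    n_bins, n_seqs = V.shape
    W = rng.random((n_bins, rank)) + 1e-3
    H = rng.random((rank, n_seqs)) + 1e-3
    eps = 1e-12  # guards against division by zero
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W


def enhance_low_half_band(mag, W):
    """Orthogonally project the low half-band of `mag` onto span(W).

    `mag` is a full-band modulation magnitude spectrum (1-D array); only its
    first W.shape[0] bins are replaced, the remaining bins are left as-is.
    """
    n_low = W.shape[0]
    Q, _ = np.linalg.qr(W)               # orthonormal basis of span(W)
    projected = Q @ (Q.T @ mag[:n_low])  # closed-form, no iterative search
    out = mag.copy()
    # Assumption: clip so the result remains a valid magnitude spectrum.
    out[:n_low] = np.maximum(projected, 0.0)
    return out


# Toy usage with random stand-in data (not real speech features).
rng = np.random.default_rng(1)
clean_low_band = rng.random((8, 20))      # 8 low-band bins, 20 sequences
W = nmf_basis(clean_low_band, rank=3)
noisy_mag = rng.random(16)                # 16-bin full-band spectrum
enhanced = enhance_low_half_band(noisy_mag, W)
```

The projection is a single matrix product per utterance, which is where the efficiency gain over an iterative encoding search would come from in this sketch. In a full pipeline the enhanced magnitude would presumably be recombined with the original phase before inverting back to the feature time series; that step is omitted here.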