中文摘要 |
In this study, discriminative HMM training and its performance are investigated in both clean and noisy environments. Recognition error is defined at string, word, phone, and acoustic levels and treated in a unified framework in discriminative training. With an acoustic level, high-resolution error measurement, a discriminative criterion of minimum divergence (MD) is proposed. Using speaker-independent, continuous digit databases, Aurora2, the recognition performance of recognizers, which are trained in terms of different error measures and different training modes, is evaluated under various noise and SNR conditions. Experimental results show that discriminatively trained models perform better than the maximum likelihood baseline systems. Specifically, in MWE and MD training, relative error reductions of 13.71% and 17.62% are obtained with multi-training on Aurora2, respectively. Moreover, compared with ML training, MD training becomes more effective as the SNR increases. |