Abstract |
This paper presents a study of speaker-independent continuous Mandarin syllable recognition in telephone environments. It compares several cepstral bias removal techniques for compensating telephone channel effects, including cepstral mean subtraction (CMS), signal bias removal (SBR), and stochastic matching (SM). Modifications and combinations of these techniques are then investigated to further improve environmental robustness over the telephone. To better model contextual acoustics and co-articulation in spontaneous Mandarin telephone speech, between-syllable context-dependent phone-like units (such as triphones, biphones, and demiphones) are used to train the speech models. In addition, the discriminative capability of the speech models is further enhanced using the minimum classification error (MCE) algorithm. Experimental results show that the recognition rate for Mandarin base syllables reaches 59.53%, an improvement of 27.81% in the error rate. |