English Abstract |
A phonetic representation of a language is used to describe the pronunciation of its words and to synthesize the acoustic model of any vocabulary item. To obtain a better phonetic representation, context-dependent units are used to model co-articulation effects between phones, and they have been broadly adopted in speech recognition. However, this representation generally increases the number of recognition units. A phonetic representation with smaller phonetic units, such as SAMPA-C for Mandarin Chinese, can be applied to reduce the number of recognition units. Nevertheless, smaller phonetic units such as those of SAMPA-C exhibit confusion characteristics and generally degrade recognition performance. In this paper, a statistical method based on chi-square testing is used to investigate the confusion characteristics among phonetic units and to develop a more reliable phonetic set, named modified SAMPA-C. Finally, experiments on continuous Mandarin telephone speech recognition were conducted. Experimental results show that an encouraging improvement in recognition performance can be obtained. In addition, the proposed approaches represent a good compromise between the demands of accurate acoustic modeling. |
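
The abstract does not say how the chi-square test is applied to the phonetic units. As a minimal sketch of one plausible reading, the Python code below runs a Pearson chi-square test of homogeneity on the recognition-outcome counts of two phonetic units and treats the pair as confusable when the test cannot tell their distributions apart. The function names, the 0.05 significance level, and the merge criterion are all assumptions made for illustration and are not taken from the source.

```python
# Critical values of the chi-square distribution at alpha = 0.05,
# indexed by degrees of freedom (standard table values).
CHI2_CRIT_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def chi_square_statistic(table):
    """Pearson chi-square statistic for an R x C table of counts.

    Assumes every row total and column total is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


def are_confusable(counts_a, counts_b, crit=CHI2_CRIT_05):
    """Hypothetical criterion: two units are confusable when the test
    fails to reject homogeneity of their outcome distributions."""
    # Drop outcome categories that neither unit ever produced, so that
    # no expected count is zero.
    kept = [(a, b) for a, b in zip(counts_a, counts_b) if a + b > 0]
    table = [[a for a, _ in kept], [b for _, b in kept]]
    df = len(kept) - 1  # (rows - 1) * (cols - 1) with two rows
    return chi_square_statistic(table) < crit[df]
```

Under this sketch, each count list holds how often a unit was recognized as each of several candidate labels. Units whose count profiles are statistically indistinguishable would be candidates for merging into a single unit of a reduced phonetic set.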