English Abstract |
This paper presents Yanhui (宴會), a software-based, high-performance Mandarin text-to-speech system developed at the Apple-ISS Research Center in Singapore. The system uses diphone concatenative synthesis to produce Mandarin speech and applies extensive prosody control over pitch and duration to achieve natural-sounding output. It takes free-form Chinese text as input and generates the corresponding speech. Yanhui (宴會) runs in real time, in software only, on commercial personal computers. This paper describes the algorithms and techniques used in the front end and the back end. |