中文摘要 |
The speech fundamental frequency (henceforth F0) contour plays an important role in expressing the affective information of an utterance. The most popular F0 modeling approaches mainly use the concept of separating the F0 contour into a global trend and local variation. For Mandarin, the global trend of the F0 contour is caused by the speaker’s mood and emotion. In this paper, the authors address the problem of affective intonation. For modeling affective intonation, an affective corpus has been designed and established, and all intonations are extracted with an iterative algorithm. Then, the concept of eigen-intonation is proposed based on the technique of Principal Component Analysis on the affective corpus and all the intonations are transformed to the lower-dimensional eigen sub-space spanned by eigen-intonations. A model of affective intonations is established in the sub-space. As a result, the corresponding emotion (maybe a mixed emotion) can be expressed by speech whose intonation is modified according to the above model. The experiments are performed with the affective Mandarin corpus, and the experimental results show that the intonation modeling approach proposed in this paper is efficient for both intonation representation and speech synthesis. |