English Abstract |
Human perception of a singing voice depends on factors of both the voice itself and the listening subjects. On the one hand, each subject’s background knowledge influences how that subject understands a voice. On the other hand, differences among the voices presented to the subjects also affect perception. In this paper, we discuss two factors that affect the perceived similarity before and after singing voice conversion: prosodic features and the subjects’ familiarity with the singers. Three experiments were conducted. The first experiment tested the subjects’ ability to identify the singer. The second experiment synthesized singing voices with different singers’ prosodic features and asked the subjects to score the similarity. The third experiment presented timbre-converted singing voices with different combinations of prosodic features from two singers and asked the subjects to judge the similarity to the target singer. The results show, first, that the number of prosodic features contained in the synthesized voice is positively correlated with the identification and similarity scores. Second, subjects who are personally more familiar with the target singers achieve better identification scores on the timbre-converted singing voices than subjects unfamiliar with the target singers. |