Abstract |
This paper proposes a three-tier prosodic hierarchy for Mandarin, comprising prosodic word,
intermediate phrase and intonational phrase tiers, that emphasizes the
use of the prosodic word instead of the lexical word as the basic prosodic unit. Both
the surface difference and perceptual difference show that this is helpful for
achieving high naturalness in text-to-speech conversion. Three approaches, the basic
CART approach, the bottom-up hierarchical approach and the modified hierarchical
approach, are presented for locating the boundaries of the three prosodic constituents in
unrestricted Mandarin texts. Two sets of features are used in the basic CART method:
one contains syntactic phrasal information and the other does not. The one with
syntactic phrasal information results in about a 1% increase in accuracy and an 11%
decrease in error cost. The modified hierarchical method achieves the highest
accuracy, 83%, and the lowest error cost when no syntactic phrasal
information is provided. It shows advantages in detecting the boundaries of
intonational phrases at locations without breaking punctuation. A precision of 71.1% and
a recall of 52.4% are achieved. Experiments on acceptability reveal that only 26% of the
mis-assigned break indices are real infelicitous errors, and that the perceptual
difference between the automatically assigned break indices and the manually
annotated break indices is small. |