| English abstract |
Vietnamese is an isolating language with rich, productive compounding, but there is no morphosyntactic, phonotactic or phonological evidence for assuming a linguistic level between the syllable and the phrase (Schiering et al. 2010). Following Nguyen and Ingram (2007), we model an artificial listener with a Random Forest Classifier to study the phonetic distinguishability of compounds and phrases. This machine learning algorithm represents the maximal potential of a system to differentiate the two classes on the basis of phonetics alone, and it ranks the importance of each phonetic correlate for that differentiation. The ranking allows an interpretation that goes beyond whether a difference exists on a particular phonetic dimension to how important that difference is. The results confirm that the two classes can be phonetically separated only under conditions of maximal contrast, and that maximal contrast is realized through juncture marking. Furthermore, we show that the two classes cannot be perfectly separated even under conditions of maximal contrast, and that the phonetic data show an across-the-board preference for a compound interpretation, even when the Random Forest Classifier is trained on maximal-contrast data. |
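
The general idea of training a Random Forest Classifier on phonetic measurements and reading off a ranking of the correlates can be sketched as follows. This is only an illustration under stated assumptions: the data are synthetic, the feature names (`juncture_pause_ms` and so on) are hypothetical placeholders rather than the study's actual correlates, and the use of scikit-learn is an assumption, since the abstract does not name a toolkit.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Hypothetical phonetic correlates (placeholders, not the study's feature set).
FEATURES = ["juncture_pause_ms", "syll1_duration_ms", "syll2_duration_ms", "f0_range_st"]


def make_synthetic_tokens(n_per_class=300, seed=0):
    """Generate made-up tokens: label 0 = compound, label 1 = phrase."""
    rng = np.random.default_rng(seed)
    spread = [15.0, 30.0, 30.0, 1.5]
    # Assumption for illustration: only the juncture cue differs between classes.
    compounds = rng.normal([20.0, 180.0, 200.0, 4.0], spread, size=(n_per_class, 4))
    phrases = rng.normal([60.0, 180.0, 200.0, 4.0], spread, size=(n_per_class, 4))
    X = np.vstack([compounds, phrases])
    y = np.array([0] * n_per_class + [1] * n_per_class)
    return X, y


def rank_correlates(X, y, seed=0):
    """Fit a random forest; return held-out accuracy and features sorted by importance."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    forest = RandomForestClassifier(n_estimators=200, random_state=seed)
    forest.fit(X_train, y_train)
    accuracy = forest.score(X_test, y_test)
    # feature_importances_ holds impurity-based importances that sum to 1.
    ranking = sorted(
        zip(FEATURES, forest.feature_importances_),
        key=lambda pair: pair[1],
        reverse=True,
    )
    return accuracy, ranking


X, y = make_synthetic_tokens()
accuracy, ranking = rank_correlates(X, y)
print(f"held-out accuracy: {accuracy:.2f}")
for name, importance in ranking:
    print(f"{name}: {importance:.3f}")
```

In this toy setup, the held-out accuracy plays the role of the separability estimate and the sorted importances play the role of the correlate ranking. Impurity-based importances are one of several possible importance measures; whether the study used this one is not stated in the abstract.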