English Abstract |
This study explores the salient lexical features in the Chinese writing of learners from different L1 backgrounds. The method is corpus-based: it compares a learner corpus with a native-speaker corpus, and compares the sub-corpora for different L1s with one another. The learner corpus, which contains more than 1.14 million Chinese words from texts written by learners ranging from novice to advanced proficiency, is drawn mainly from the computer-based writing Test of Chinese as a Foreign Language (TOCFL). Sub-corpora for learners whose L1 is Japanese, English, Korean, Vietnamese, Indonesian, or Thai are examined. The Japanese sub-corpus is the largest, accounting for twenty-four percent of the total data, followed by English, Korean, and the rest. The native-speaker corpus is the Academia Sinica Balanced Corpus. Analysis of overused and underused linguistic forms, together with keyword (keyness) analysis, reveals several salient features. For example, compared with learners of Chinese from other L1 backgrounds, learners with an English background use pronouns unusually frequently and sentence-final particles unusually rarely in their Chinese writing. Learners with a Japanese or Korean background tend to overuse the postposed form ‘de hua’ instead of ‘ruguo’ when expressing ‘if’ sentences, and to overuse ‘suoyi’ instead of ‘yinwei’ when expressing cause-effect relations. The article also offers possible explanations for these results in terms of the typology of the learners’ native languages, linguistic structure, syntactic category, and culture. |