This article describes an approach to constructing a language resource by automatically sketching the grammatical relations of words in an untagged corpus on the basis of dependency parses. Compared with the Word Sketch Engine (Kilgarriff et al. 2004), which relies on handcrafted rules, this approach provides more detail about the syntagmatic usages of each word, such as the types of modification a given word can undergo and the other grammatical functions it can fulfill. To evaluate the approach, we assess the auto-generated result through its distributional thesaurus function and compare the output with entries in an existing thesaurus. The results are tailored to Chinese language learning and, to the best of our knowledge, the resulting resource is the first of its kind for Chinese. We believe it will benefit both Chinese corpus linguistics and Teaching Chinese as a Second Language (TCSL).
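Chinese abstract (translated): This paper describes a method for automatically building a language-learning resource. By analyzing text with a dependency parser, we sketch the grammatical relations among Chinese words. Compared with previous work, the resource offers more comprehensive coverage of word usage, such as the many kinds of modification relations, which can be applied in language teaching. Although resources for other languages also sketch word relations by parsing text, we have not yet seen a Chinese resource that sketches words from user-supplied text; we therefore propose this method and evaluate its ability to generate synonyms. We also provide an interface to the analysis results for language learners, which we believe will benefit Chinese linguistics and Chinese language teaching.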