語料庫為本的兩岸對應詞彙發掘 (A Corpus-Based Approach to the Discovery of Cross-Strait Lexical Contrasts)

洪嘉馡

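Title	語料庫為本的兩岸對應詞彙發掘
Parallel title	A Corpus-Based Approach to the Discovery of Cross-Strait Lexical Contrasts
Author	洪嘉馡
Abstract (translated from Chinese)	In recent years, linguistic research on Chinese vocabulary, whether phonological, semantic, or pragmatic, has found that lexical differences between the two sides of the Taiwan Strait in the use of Chinese are becoming increasingly pronounced. These differences undoubtedly create barriers to the exchange of knowledge and information. Yet both sides do use a writing system based on Chinese characters, differing only in character forms that correspond in predictable, regular ways. Building on this shared writing system, this paper uses the characteristics of cross-strait lexical contrasts to explore some basic questions in lexical semantics. Taking corpora as its starting point, the paper investigates cross-strait differences in the use of Chinese vocabulary, for example: differences in collocations, differences in which words tend to co-occur with Taiwan vocabulary or with Mainland vocabulary, differences in special usages within particular contexts, and differences in habits of language use. From these analyses it establishes a method for extracting cross-strait corresponding vocabulary from corpora.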
Abstract (English)	Studies of cross-strait lexical contrasts in the use of Mandarin Chinese reveal an increasingly evident divergence. This divergence is apparent in phonological, semantic, and pragmatic analyses and has become an obstacle to knowledge-sharing and information exchange. Given the wide range of divergences, Chinese character forms appear to offer the most reliable regular mapping between cross-strait usage contrasts. In this paper, we propose a new approach to the discovery of cross-strait contrasts, anchored on the regular correspondences of characters. Our approach is corpus-based and collocation-driven. We use known contrast pairs as seeds in a corpus containing data from both the PRC and Taiwan. The collocation patterns of these contrast pairs, in terms of both lexical lists and grammatical functions, are studied to discover additional contrast pairs semi-automatically. This approach offers both NLP applicability and linguistic felicity, since it yields the contrast pairs together with their usage contexts.
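Pages	221-238
Keywords	cross-strait lexical correspondence (兩岸詞彙對應), co-occurring words (共現詞彙), GigaWord Corpus, Chinese Word Sketch, cross-strait lexical contrasts, collocation
Journal	Language and Linguistics (語言暨語言學)
Issue	April 2008 (Vol. 9, No. 2)
Publisher	Institute of Linguistics, Academia Sinica (中央研究院語言學研究所)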
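
The abstract describes using known contrast pairs as seeds and comparing collocation patterns to find further pairs. A minimal sketch of that general idea follows. It is not the paper's procedure, which this record only summarizes and which the keywords associate with the GigaWord Corpus and Chinese Word Sketch. The sketch assumes two small, already word-segmented sub-corpora normalized to the same script; the sentences are invented toy data, and the function names (`collocates`, `jaccard`, `rank_candidates`), the window size, and the Jaccard overlap score are all illustrative choices rather than anything taken from the paper.

```python
from collections import Counter


def collocates(sentences, target, window=2):
    """Count tokens that occur within `window` positions of `target`."""
    counts = Counter()
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok != target:
                continue
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            # Collect the neighbours, skipping the target position itself.
            counts.update(t for j, t in enumerate(tokens[lo:hi], start=lo) if j != i)
    return counts


def jaccard(a, b):
    """Jaccard overlap between the key sets of two collocate profiles."""
    sa, sb = set(a), set(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def rank_candidates(seed, seed_corpus, other_corpus, window=2):
    """Rank words of `other_corpus` by collocate overlap with `seed`."""
    seed_profile = collocates(seed_corpus, seed, window)
    vocab = {tok for sent in other_corpus for tok in sent}
    scored = [
        (cand, jaccard(seed_profile, collocates(other_corpus, cand, window)))
        for cand in vocab
        if cand != seed
    ]
    # Highest overlap first; ties broken alphabetically for determinism.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Toy, hand-made sentences (assumption: pre-segmented, same script).
taiwan = [
    ["我", "搭", "計程車", "去", "機場"],
    ["他", "叫", "計程車", "回家"],
]
mainland = [
    ["我", "搭", "出租車", "去", "機場"],
    ["他", "叫", "出租車", "回家"],
    ["我", "去", "機場", "接", "朋友"],
]

ranked = rank_candidates("計程車", taiwan, mainland)
print(ranked[0])
```

On this toy data the best-scoring Mainland candidate for the Taiwan seed 計程車 ("taxi") is 出租車, because the two words share the same neighbours. A real corpus would need frequency thresholds and an association measure instead of raw set overlap, and the abstract indicates that grammatical functions are also compared, which this sketch omits.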