A Synchronous Chinese Language Corpus from Different Speech Communities: Construction and Applications

Benjamin K. T'sou; Hing-Lung Lin; Godfrey Liu; Terence Chan; Jerome Hu; Ching-hai Chew; John K.P. Tse

Title	A Synchronous Chinese Language Corpus from Different Speech Communities: Construction and Applications
Authors	Benjamin K. T'sou, Hing-Lung Lin, Godfrey Liu, Terence Chan, Jerome Hu, Ching-hai Chew, John K.P. Tse
Abstract	Like English, Spanish and Arabic, Chinese is used by a large number of speakers in distinct speech communities which, despite sharing a common language, vary in interesting ways. A systematic study of such linguistic variation is invaluable for appreciating the diversity and richness of the underlying cultures. This paper describes Project LIVAC (Linguistic Variation in Chinese Communities), which focuses on the development of a Chinese corpus based on data taken concurrently at regular intervals from multiple Chinese speech communities. The resulting database and computerized concordance, drawn from a corpus of approximately 20 million words with uniform time reference points extending across two years, enable linguists and social scientists to undertake meaningful qualitative and quantitative comparative analysis of the development of linguistic and cultural variation. To facilitate these studies, a framework for integrating the corpus with specific corpus analysis applications is proposed. Based on this framework, a prototype retrieval system is designed and implemented; it supports longitudinal studies of word and concept distribution, as well as of lexical and other linguistic variation.
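Pages	91-104
Journal	中文計算語言學期刊 (Computational Linguistics and Chinese Language Processing)
Issue	February 1997 (Vol. 2, No. 1)
Publisher	中華民國計算語言學學會 (Association for Computational Linguistics and Chinese Language Processing)
Previous article in this issue	MAT--A Project to Collect Mandarin Speech Data Through Telephone Networks in Taiwan
Next article in this issue	中央研究院古籍全文資料庫的發展概要 (roughly: An Overview of the Development of the Academia Sinica Full-Text Database of Ancient Texts)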