Japanese-Chinese Cross-Language Information Retrieval: An Interlingua Approach

Hasan, Maruf; Matsumoto, Yuji

Title	Japanese-Chinese Cross-Language Information Retrieval: An Interlingua Approach
Authors	Hasan, Maruf; Matsumoto, Yuji
Abstract	Electronically available multilingual information can be divided into two major categories: (1) alphabetic language information (English-like alphabetic languages) and (2) ideographic language information (Chinese-like ideographic languages). The information available in non-English alphabetic languages as well as in ideographic languages (especially Japanese and Chinese) has been growing at a very high rate in recent years. Because of the ideographic nature of Japanese and Chinese, compounded by the several encoding standards in use, efficient processing (representation, indexing, retrieval, etc.) of such information has become a tedious task. In this paper, we propose a Han character (Kanji) oriented Interlingua model for indexing and retrieving Japanese and Chinese information. We report the results of mono- and cross-language information retrieval on a Kanji space where documents and queries are represented as Kanji-oriented vectors. We also employ a dimensionality reduction technique to compute a Kanji Conceptual Space (KCS) from the initial Kanji space, which can facilitate conceptual retrieval of both mono- and cross-language information for these languages. Similar indexing approaches for multiple European languages through term association (e.g., latent semantic indexing) or through conceptual mapping (using a lexical ontology such as WordNet) are being intensively explored. The Interlingua approach investigated here for Japanese and Chinese and the term (or concept) association model investigated for the European languages are similar, and the two approaches can be easily integrated. Therefore, the proposed Interlingua model can pave the way for handling multilingual information access and retrieval efficiently and uniformly.
Pages	59-85
Keywords	Cross-language Information Retrieval, Multilingual Information Processing, Latent Semantic Indexing
Journal	中文計算語言學期刊 (Computational Linguistics and Chinese Language Processing)
Issue	August 2000 (Vol. 5, No. 2)
Publisher	中華民國計算語言學學會 (Association for Computational Linguistics and Chinese Language Processing)
Previous article in this issue	Design and Evaluation of Approaches to Automatic Chinese Text Categorization
Next article in this issue	Compiling Taiwanese Learner Corpus of English
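
The abstract describes representing documents and queries as Kanji-oriented vectors and then reducing that space to a lower-dimensional conceptual space. The sketch below illustrates the general idea only and is not the authors' implementation. It assumes raw character counts over the Unicode CJK Unified Ideographs block (U+4E00 to U+9FFF), cosine similarity for ranking, and a plain truncated SVD as a stand-in for the dimensionality reduction step, which the abstract does not name. All function names and the three sample documents are hypothetical.

```python
import numpy as np

def kanji_counts(text):
    """Count Han characters (assumed range U+4E00..U+9FFF) in a string."""
    counts = {}
    for ch in text:
        if "\u4e00" <= ch <= "\u9fff":
            counts[ch] = counts.get(ch, 0) + 1
    return counts

def build_kanji_space(docs):
    """Return (vocab, matrix); matrix[i, j] is the count of Kanji i in doc j."""
    per_doc = [kanji_counts(d) for d in docs]
    vocab = sorted({ch for c in per_doc for ch in c})
    index = {ch: i for i, ch in enumerate(vocab)}
    matrix = np.zeros((len(vocab), len(docs)))
    for j, c in enumerate(per_doc):
        for ch, n in c.items():
            matrix[index[ch], j] = n
    return vocab, matrix

def to_vector(text, vocab):
    """Map a query onto the same Kanji axes; unseen characters are ignored."""
    index = {ch: i for i, ch in enumerate(vocab)}
    vec = np.zeros(len(vocab))
    for ch, n in kanji_counts(text).items():
        if ch in index:
            vec[index[ch]] = n
    return vec

def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom else 0.0

def reduced_basis(matrix, k):
    """Left singular vectors for the k largest singular values (truncated SVD)."""
    u, _, _ = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k]

# Hypothetical mini-collection: Japanese, Simplified Chinese, unrelated Japanese.
docs = ["情報検索の研究", "信息检索研究", "今日は天気が良い"]
vocab, matrix = build_kanji_space(docs)

# A Japanese query scored against every document in the full Kanji space.
query = to_vector("検索研究", vocab)
scores = [cosine(query, matrix[:, j]) for j in range(matrix.shape[1])]
print(scores)

# The same comparison after projecting onto a 2-dimensional reduced space.
basis = reduced_basis(matrix, 2)
reduced_scores = [cosine(basis.T @ query, basis.T @ matrix[:, j])
                  for j in range(matrix.shape[1])]
print(reduced_scores)
```

In this toy collection the Japanese query still matches the Chinese document because the two share the code points 索, 研, and 究. Characters that differ between the scripts, such as 検 and 检, contribute nothing in the raw Kanji space.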