
Strategies of Processing Japanese Names and Character Variants in Traditional Chinese Text
Authors: Chuan-Jie Lin, Jia-Cheng Zhan, Yen-Heng Chen, Chien-Wei Pao
This paper proposes an approach to identify word candidates that are not Traditional Chinese, including Japanese names (written in Japanese Kanji or Traditional Chinese characters) and word variants, when doing word segmentation on Traditional Chinese text. When handling personal names, a probability model concerning formats of names is introduced. We also propose a method to map Japanese Kanji into the corresponding Traditional Chinese characters. The same method can also be used to detect words written in character variants. After integrating generation rules for various types of special words, as well as their probability models, the F-measure of our word segmentation system rises from 94.16% to 96.06%. Another experiment shows that 83.18% of the 862 Japanese names in a set of 109 human-annotated documents can be successfully detected.
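The abstract does not say how the Kanji-to-Traditional-Chinese mapping is built, so the following is only a minimal sketch of the general idea, assuming a plain character-level lookup table. The table `KANJI_TO_TRADITIONAL` and the helpers `to_traditional` and `contains_mapped_kanji` are hypothetical names introduced here for illustration. The handful of character pairs is hand-picked and is not the authors' data or method.

```python
# Hypothetical, hand-picked table mapping a few Japanese Kanji (shinjitai)
# to their Traditional Chinese counterparts. A real system would need a far
# larger table; this one exists only to illustrate the lookup.
KANJI_TO_TRADITIONAL = {
    "沢": "澤",
    "広": "廣",
    "辺": "邊",
    "桜": "櫻",
    "浜": "濱",
    "関": "關",
    "栄": "榮",
    "国": "國",
}


def to_traditional(text: str) -> str:
    """Replace every character found in the table; leave the rest unchanged."""
    return "".join(KANJI_TO_TRADITIONAL.get(ch, ch) for ch in text)


def contains_mapped_kanji(text: str) -> bool:
    """Return True if any character in the text has an entry in the table."""
    return any(ch in KANJI_TO_TRADITIONAL for ch in text)


if __name__ == "__main__":
    for name in ["渡辺", "桜井", "広沢"]:
        print(name, "->", to_traditional(name))
```

Under the same assumption, a table of character variants could be plugged into the same lookup, which is consistent with the abstract's remark that one method serves both purposes.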
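Pages: 87-108
Keywords: Semantic Chinese Word Segmentation, Japanese Name Identification, Character Variants
Journal: 中文計算語言學期刊 (International Journal of Computational Linguistics and Chinese Language Processing)
Issue: September 2012 (Vol. 17, No. 3)
Publisher: 中華民國計算語言學學會 (Association for Computational Linguistics and Chinese Language Processing)
Previous article in this issue: Enhancement of Feature Engineering for Conditional Random Field Learning in Chinese Word Segmentation Using Unlabeled Data
Next article in this issue: Evaluation of TTS Systems in Intelligibility and Comprehension Tasks: a Case Study of HTS-2008 and Multisyn Synthesizers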



