
Web Mining for Unsupervised Classification
Authors: Wei-Yen Day, Chun-Yi Chi, Ruey-Cheng Chen, Pu-Jen Cheng, Pei-Sen Liu
Data acquisition is a major concern in text classification. Conventional methods require substantial human effort to build a high-quality training collection, and that effort is not always available to researchers. In this paper, we investigate how to collect training data automatically by sampling the Web with a given set of class names. The basic idea is to generate appropriate keywords and submit them as queries to search engines to acquire training data. Two methods are presented: one samples the concepts common to the classes, and the other samples the discriminative concepts of each class. A series of experiments was carried out independently on two different datasets, and the results show that the proposed methods significantly improve classifier performance even without manually labeled training data. We find that our strategy for retrieving Web samples substantially helps conventional document classification in terms of both accuracy and efficiency.
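As a rough illustration of the basic idea only (class names used as queries, with the returned text treated as labeled training data), the sketch below may help. It is not the authors' method: the `FAKE_WEB` corpus, the `search` stub, and the nearest-centroid classifier are all assumptions made for this example, and neither of the paper's two sampling methods (common concepts, discriminative concepts) is reproduced here.

```python
from collections import Counter
import math

# Hypothetical stand-in for a search engine: maps a query to text snippets.
# A real system would call a Web search API here; this tiny corpus is made up.
FAKE_WEB = {
    "sports": ["the team won the football match",
               "players scored in the final game"],
    "finance": ["the bank raised interest rates",
                "stock market investors sold shares"],
}

def search(query):
    """Return snippets for a query (toy lookup, not a real search API)."""
    return FAKE_WEB.get(query, [])

def collect_training_data(class_names):
    """Use each class name as a query and label its results with that class."""
    return [(snippet, name) for name in class_names for snippet in search(name)]

def bag_of_words(text):
    """Lowercased whitespace tokens with their counts."""
    return Counter(text.lower().split())

def train_centroids(samples):
    """Sum the word counts of all snippets that share a label."""
    centroids = {}
    for text, label in samples:
        centroids.setdefault(label, Counter()).update(bag_of_words(text))
    return centroids

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def classify(text, centroids):
    """Assign the label whose centroid is most similar to the text."""
    vec = bag_of_words(text)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))

# No manually labeled documents are used: labels come from the queries.
samples = collect_training_data(["sports", "finance"])
centroids = train_centroids(samples)
```

Calling `classify("investors bought shares in the bank", centroids)` on this toy data picks the `finance` centroid. With a real search backend, the quality of the collected samples would depend heavily on how the queries are formed, which is the question the paper addresses.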
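Pages: 53-67
Keywords: Unsupervised classification, text classification, Web mining
Journal: ROCLING論文集 (ROCLING Proceedings)
Issue: 2009
Publisher: 中華民國計算語言學學會 (Association for Computational Linguistics and Chinese Language Processing)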



