Text Mining for Social Studies: Meaning-based Document Classification and Its Problems

陳世榮

Original title	社會科學研究中的文字探勘應用：以文意為基礎的文件分類及其問題
Parallel title	Text Mining for Social Studies: Meaning-based Document Classification and Its Problems
Author	陳世榮
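Abstract (translated from Chinese)	As digital archiving technology has advanced, text mining has drawn growing attention. Starting from the need of social science research to distinguish meaning, this article evaluates how well supervised machine learning classifies unstructured, complex texts, and offers analysis and recommendations on the problems observed. It begins with the differences and commonalities between text mining and content analysis in distinguishing meaning. It then takes news reports as data and, for a specific document orientation, follows a standard text mining procedure to evaluate document classification with support vector machine and naive Bayes classifiers. The results indicate that text mining reads complex meaning well. A co-word network analysis also shows, however, that the editorial style of the documents affects classification performance. Researchers are advised to assess repeatedly, in the early stage of data processing, how well the research purpose, the characteristics of the data, and the classifier model fit together.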
Abstract (English)	Along with the growing development of electronic information storage, text mining has increasingly gained attention from scholars and practitioners across various disciplines. In response to the need for meaning differentiation in social studies, the study aims to evaluate supervised machine learning classifiers in terms of the performance of document classification. Setting out from the comparison between traditional content analysis and text mining, the evaluation follows a normal procedure of text mining and applies Support Vector Machine and Naïve Bayes classifiers on non-structural, complex social texts extracted from news media. The outcomes of the analysis validate that text mining manages classification well for documents with complex meaning. However, a further co-word network analysis in the study finds that the editing style of data may affect classifiers’ performance. It is suggested that, in the early stage of data processing, greater care must be given to the fit between research problems, editing styles, and classifiers.
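Pages	683-718
Keywords	text mining, meaning differentiation, document classification, machine learning, co-word network analysis
Journal	人文及社會科學集刊 (Journal of Social Sciences and Philosophy)
Issue	December 2015 (vol. 27, no. 4)
Publisher	Research Center for Humanities and Social Sciences, Academia Sinica (formerly the Sun Yat-sen Institute for Social Sciences and Philosophy, Academia Sinica)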
