Predicting Cervical Cancer Survivability: A Comparison of Three Data Mining Methods (運用三種資料探勘方法預測子宮頸癌存活情形之比較)

何子銘; 盧瑜芬; 許家瑋; 白健佑; 白璐; 周雨青; 孫建安; Wetter, Thomas; 林金定; 楊燦; 朱基銘

Title (Chinese)	運用三種資料探勘方法預測子宮頸癌存活情形之比較
Title (English)	Predicting Cervical Cancer Survivability: A Comparison of Three Data Mining Methods
Authors	何子銘; 盧瑜芬; 許家瑋 (Chia-Wei Hsu); 白健佑; 白璐; 周雨青 (You-Ching Chou); 孫建安 (Chien-An Sun); Wetter, Thomas; 林金定 (Jin-Ding Lin); 楊燦 (Tsan Yang); 朱基銘
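Abstract (translated from Chinese)	This study examines the use of artificial intelligence methods and data mining techniques in prediction models for cervical cancer. Three algorithms were applied: artificial neural network, decision tree, and logistic regression, with prediction accuracy and the interpretability of the predictions serving as the evaluation criteria. The analysis used 433,272 records and 72 variables from the 1973-2000 cancer registry database (CIPUD, Cancer Incidence Public-Use Database) of the US SEER (Surveillance, Epidemiology, and End Results) program, and 10-fold cross-validation was applied to compare the accuracy of the three algorithms in predicting survival. Results: prediction accuracy was 0.8974 for the logistic regression model (sensitivity 0.9047, specificity 0.8830), 0.8732 for the decision tree model (C5) (sensitivity 0.8639, specificity 0.8966), and 0.7406 for the artificial neural network model (sensitivity 0.7394, specificity 0.7473). The logistic regression results included extreme accuracy values of 1.0 (100%) and 0.9942 (99.42%), well above the mean accuracy of 0.8981. For the decision tree model, prediction results were generally higher than those of logistic regression, but the difference was small. The artificial neural network model had a mean accuracy of 0.7776, clearly lower than logistic regression and the decision tree, and its accuracy across the 10 folds was unstable, with a standard deviation of 0.0786, the highest of the three models. In terms of mean prediction accuracy, logistic regression (0.8981) and the decision tree (0.8926) outperformed the artificial neural network (0.7776), and the neural network also had the largest standard deviation of accuracy over the 10 folds (0.0786), indicating that its predictive ability was poor relative to the logistic regression and decision tree models.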
Abstract (English)	Objective: This study investigated the use of artificial intelligence methods and data mining technology for predicting cervical cancer survivability. Three models (artificial neural network, decision tree, and logistic regression) were built and their accuracy in predicting cervical cancer survivability was evaluated. Methods and material: The Surveillance, Epidemiology, and End Results (SEER) dataset, a large cancer registry, was used to develop the three prediction models. Two of them, the artificial neural network and the decision tree, are popular data mining algorithms; the third, logistic regression, is a common statistical model. Ten-fold cross-validation was used to obtain an unbiased estimate of each model's performance for comparison. Results: Accuracy was 0.8981 for logistic regression, 0.8930 for the decision tree, and 0.7776 for the artificial neural network. Logistic regression reached accuracies as high as 1.0 and 0.9942. In the 10-fold cross-validation, the standard deviation of accuracy for the artificial neural network was 0.0786, the largest among the three models. Conclusions: In this study, the artificial neural network predicted cervical cancer survivability less well than logistic regression and the decision tree, with the lowest prediction accuracy and the largest variation in accuracy across the 10 folds.
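Pages	192-203
Keywords	cervical cancer survivability, data mining, k-fold cross-validation, SEER
Journal	台灣家庭醫學雜誌 (Taiwan Journal of Family Medicine)
Issue	September 2006 (Vol. 16, No. 3)
Publisher	台灣家庭醫學醫學會 (Taiwan Association of Family Medicine)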
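
The evaluation procedure described in the abstracts (10-fold cross-validation, with the mean and standard deviation of per-fold accuracy, plus sensitivity and specificity) can be illustrated with a minimal Python sketch. It does not reproduce the study's software, data, or models: the one-feature toy data, the `fit_threshold` / `predict_threshold` stand-in classifier, the fixed shuffle seed, the use of the sample standard deviation, and the choice of label 1 as the positive class are all assumptions made for illustration only.

```python
import random
import statistics


def confusion_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, sensitivity, specificity) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present")
    accuracy = (tp + tn) / len(pairs)
    return accuracy, tp / (tp + fn), tn / (tn + fp)


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and deal them into k disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(features, labels, fit, predict, k=10, seed=0):
    """Train on k-1 folds, test on the held-out fold; return per-fold accuracy."""
    folds = k_fold_indices(len(features), k, seed)
    accuracies = []
    for held_out in range(k):
        test_idx = folds[held_out]
        train_idx = [j for f, fold in enumerate(folds) if f != held_out for j in fold]
        model = fit([features[j] for j in train_idx], [labels[j] for j in train_idx])
        correct = sum(predict(model, features[j]) == labels[j] for j in test_idx)
        accuracies.append(correct / len(test_idx))
    return accuracies


def fit_threshold(xs, ys):
    """Hypothetical stand-in model: midpoint between the two class means."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return (statistics.mean(pos) + statistics.mean(neg)) / 2


def predict_threshold(threshold, x):
    """Predict class 1 when the feature is at or above the learned threshold."""
    return 1 if x >= threshold else 0


# Toy data (not SEER): one numeric feature, label 1 when the feature is >= 50.
xs = list(range(100))
ys = [1 if x >= 50 else 0 for x in xs]

fold_accuracies = cross_validate(xs, ys, fit_threshold, predict_threshold, k=10)
mean_accuracy = statistics.mean(fold_accuracies)
sd_accuracy = statistics.stdev(fold_accuracies)  # sample SD across the 10 folds

# Hand-made predictions: 3 true positives, 1 false negative,
# 5 true negatives, 1 false positive.
accuracy, sensitivity, specificity = confusion_metrics(
    [1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 0, 1, 0],
)

print(f"10-fold accuracy: mean={mean_accuracy:.4f}, sd={sd_accuracy:.4f}")
print(f"accuracy={accuracy:.4f}, sensitivity={sensitivity:.4f}, specificity={specificity:.4f}")
```

Comparing several models under this scheme would mean calling `cross_validate` once per model with the same `seed`, so that every model is scored on identical folds; a larger standard deviation across folds then points to less stable predictions, which is how the abstracts characterize the artificial neural network.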