Title
混合型資料集的K-means分群演算法 (A K-means Clustering Algorithm for Mixed-Type Data Sets)
Parallel title
A k-means Based Clustering Algorithm for Mixed-Attribute Data Sets
Authors: 黃宇翔, 王品鈞, 方志強
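Chinese abstract (English translation)
Cluster analysis is one of the clustering techniques used in data mining. With the rapid growth of networked environments, the variety and number of data attributes have increased greatly, which substantially degrades the performance of traditional clustering techniques; the conventional k-means method struggles to cope. Subsequent research has therefore focused on handling numerical, categorical, and ordinal attribute data. Building on the k-means distance measure defined by Ahmad and Dey (2007), this study performs cluster analysis on data sets in which all three attribute types coexist, using a separate distance definition for each type, and proposes a genetic algorithm to find the combination of cluster centers that gives the best value of the evaluation index. We hope the method can be applied widely to solve the practical difficulty of clustering data with three mixed attribute types.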
English abstract
Clustering is one of the most important analysis methods in data mining. With the rapid development of network technology, the growing variety of attribute types and the sheer number of data items make clustering substantially less efficient. Among clustering approaches, partitioning methods are relatively easy to implement and faster to run than the alternatives. Mixed attribute types, however, complicate clustering. Most of the literature handles either numerical and categorical attributes together or ordinal attributes alone, and the results are less than satisfactory in accuracy and execution time. The proposed approach, based on the k-means method of Ahmad and Dey (2007), handles three attribute types simultaneously: numerical, categorical, and ordinal. Euclidean distance defines numerical similarity, the frequency of each value's rank indicates categorical similarity, and a normalized distance measures ordinal similarity. Effectiveness is evaluated with a core clustering criterion: minimizing the ratio of within-cluster error to between-cluster error. A genetic algorithm is also developed to reduce execution time when clustering all three attribute types at once. We hope the proposed method provides a useful clustering technique for practical applications.
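The abstract describes a per-type distance and a within/between evaluation ratio without giving formulas, so the following is only a minimal sketch of the general idea, not the authors' method. Several things here are assumptions: records are stored as dicts, numerical values are already scaled to [0, 1], categorical attributes use a plain 0/1 mismatch as a stand-in for the frequency-based measure of Ahmad and Dey (2007), and cluster centers are represented by ordinary records. The names `mixed_distance` and `within_between_ratio` are hypothetical.

```python
from math import sqrt


def mixed_distance(x, y, numeric, categorical, ordinal):
    """Distance between two records (dicts) with three attribute types.

    numeric:     keys whose values are floats, assumed pre-scaled to [0, 1]
    categorical: keys compared by simple mismatch (assumption: a stand-in
                 for the frequency-based measure mentioned in the abstract)
    ordinal:     dict mapping key -> ordered list of levels, lowest first
    """
    total = 0.0
    for key in numeric:
        # Squared Euclidean contribution of each numerical attribute.
        total += (x[key] - y[key]) ** 2
    for key in categorical:
        # 0 if the categories match, 1 otherwise.
        total += 0.0 if x[key] == y[key] else 1.0
    for key, levels in ordinal.items():
        # Rank gap normalized by the number of steps between lowest and highest.
        span = max(len(levels) - 1, 1)
        gap = abs(levels.index(x[key]) - levels.index(y[key])) / span
        total += gap ** 2
    return sqrt(total)


def within_between_ratio(records, labels, centers, dist):
    """Ratio of within-cluster error to between-cluster error (lower is better).

    Assumption: errors are sums of squared distances, and each center is
    itself a record that `dist` can compare.
    """
    within = sum(dist(r, centers[c]) ** 2 for r, c in zip(records, labels))
    between = sum(
        dist(centers[i], centers[j]) ** 2
        for i in range(len(centers))
        for j in range(i + 1, len(centers))
    )
    return within / between


# Hypothetical toy data: one numerical, one categorical, one ordinal attribute.
SIZES = {"size": ["S", "M", "L"]}
a = {"income": 0.2, "color": "red", "size": "S"}
b = {"income": 0.2, "color": "red", "size": "L"}
c = {"income": 0.2, "color": "red", "size": "M"}


def dist(p, q):
    return mixed_distance(p, q, ["income"], ["color"], SIZES)


ratio = within_between_ratio([a, c, b], [0, 0, 1], [a, b], dist)
```

A search procedure such as the genetic algorithm mentioned in the abstract would then compare candidate sets of centers by this ratio; that search is not sketched here.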
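Pages: 1-28
Keywords: 叢集分析, k-means, 順序屬性, 距離量度; Clustering analysis, k-means, ordinal attribute, distance measure
Journal: 電子商務學報 (Electronic Commerce Studies)
Issue: June 2017 (Vol. 19, No. 1)
Publisher: 中華企業資源規劃學會 (Chinese Enterprise Resource Planning Society)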
 
