English Abstract |
With the rapid progress of information technology, ever larger amounts of data are produced and stored in databases. Data mining helps extract useful information from these data and is widely used in many areas; data clustering is among its most frequently used analytic methods and plays an important role in various fields. Data clustering is the process of grouping data into clusters such that the data in each cluster share a high degree of similarity while being very dissimilar to data in other clusters. Dissimilarity is evaluated from the attribute values that describe the objects, usually by means of a distance measure. Many data clustering algorithms have been developed in recent years. K-means is fast, easy to implement, and finds local optima in most cases. However, a crucial shortcoming of K-means is its difficulty in recognizing clusters of arbitrary shape. This paper presents a modified K-means algorithm, DK-means, based on the concept of distance; the proposed algorithm may improve the stability of clustering results. Simulation results show that the proposed DK-means yields accurate clustering results. |