English Abstract |
Among the many association rule algorithms, Apriori is the earliest and most representative. The Apriori algorithm, widely used for market basket analysis, finds the large (frequent) itemsets in a set of transactions and induces association rules from them. An item that appears in a transaction is recorded as 1; otherwise it is recorded as 0. An itemset whose support exceeds a preset threshold is regarded as a large itemset, and rules are induced from it. However, much real-world data, such as pressure, height, and age, takes continuous numerical values. To induce association rules from such continuous data, this study proposes a fuzzy Apriori algorithm. First, each numerical attribute is transformed into fuzzy sets, and a membership function is created for each fuzzy set. Then, the membership value of each numerical datum in a fuzzy set is derived and used in place of its frequency in the itemset. Finally, the Apriori algorithm is applied to these fuzzy values, and support and confidence equations are derived to induce the association rules. According to statistics from the Health Promotion Administration, diabetes is one of the ten leading causes of death in Taiwan, and many people in Taiwan die from it. There are around two million diabetes patients in the country, and the number is still increasing every year. Most of the attributes related to diabetes are numerical. Therefore, this study applies the proposed fuzzy Apriori algorithm to the Pima Indians Diabetes data set from the UCI database as an example, mining association rules for people who have diabetes. The results show that the proposed fuzzy Apriori algorithm does find association rules in the numerical diabetes data, and that these rules are useful for the diagnosis of diabetes. |
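
The three steps described in the abstract (fuzzification, membership values used in place of 0/1 counts, and Apriori with fuzzy support and confidence) can be illustrated with a minimal Python sketch. The abstract does not give the membership functions or the exact support and confidence equations, so everything specific here is an assumption made for illustration: the three-label partition (`low`, `medium`, `high`) with linear shoulders, the breakpoints, the use of the minimum to combine memberships within an itemset, and the helper names `fuzzify`, `fuzzy_support`, `fuzzy_confidence`, and `frequent_itemsets`. The sample records are made up and are not taken from the Pima data set.

```python
from itertools import combinations

def fuzzify(x, lo, mid, hi):
    """Memberships of x in three assumed fuzzy sets: low, medium, high."""
    if x <= lo:
        return {"low": 1.0, "medium": 0.0, "high": 0.0}
    if x >= hi:
        return {"low": 0.0, "medium": 0.0, "high": 1.0}
    if x <= mid:
        m = (x - lo) / (mid - lo)
        return {"low": 1.0 - m, "medium": m, "high": 0.0}
    m = (x - mid) / (hi - mid)
    return {"low": 0.0, "medium": 1.0 - m, "high": m}

def to_fuzzy_record(record, breakpoints, is_diabetic):
    """Map one numeric record to {(attribute, label): membership}."""
    fuzzy = {("diabetic", "yes" if is_diabetic else "no"): 1.0}  # crisp class item
    for attr, value in record.items():
        for label, mu in fuzzify(value, *breakpoints[attr]).items():
            if mu > 0.0:
                fuzzy[(attr, label)] = mu
    return fuzzy

def fuzzy_support(itemset, fuzzy_records):
    """Mean over records of the minimum membership among the itemset's items."""
    total = sum(min(rec.get(item, 0.0) for item in itemset) for rec in fuzzy_records)
    return total / len(fuzzy_records)

def fuzzy_confidence(antecedent, consequent, fuzzy_records):
    """support(antecedent and consequent) / support(antecedent)."""
    base = fuzzy_support(antecedent, fuzzy_records)
    if base == 0.0:
        return 0.0
    return fuzzy_support(antecedent | consequent, fuzzy_records) / base

def frequent_itemsets(fuzzy_records, min_support):
    """Level-wise Apriori search using fuzzy support in place of counts."""
    items = sorted({item for rec in fuzzy_records for item in rec})
    level = [frozenset([i]) for i in items
             if fuzzy_support(frozenset([i]), fuzzy_records) >= min_support]
    frequent = {}
    while level:
        for s in level:
            frequent[s] = fuzzy_support(s, fuzzy_records)
        size = len(level[0]) + 1
        candidates = {a | b for a, b in combinations(level, 2) if len(a | b) == size}
        level = [c for c in candidates
                 if len({attr for attr, _ in c}) == size  # one label per attribute
                 and all(frozenset(sub) in frequent for sub in combinations(c, size - 1))
                 and fuzzy_support(c, fuzzy_records) >= min_support]
    return frequent

# Illustrative, made-up records and hypothetical breakpoints (lo, mid, hi).
records = [
    {"glucose": 180, "bmi": 40},
    {"glucose": 140, "bmi": 35},
    {"glucose": 90, "bmi": 22},
    {"glucose": 100, "bmi": 25},
]
diabetic = [True, True, False, False]
breakpoints = {"glucose": (80, 120, 160), "bmi": (20, 30, 40)}

fuzzy_records = [to_fuzzy_record(r, breakpoints, d) for r, d in zip(records, diabetic)]
frequent = frequent_itemsets(fuzzy_records, min_support=0.35)

antecedent = frozenset([("glucose", "high")])
consequent = frozenset([("diabetic", "yes")])
confidence = fuzzy_confidence(antecedent, consequent, fuzzy_records)
print(f"glucose=high -> diabetic=yes: confidence {confidence:.2f}")
for itemset, support in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), round(support, 3))
```

Taking the minimum membership within an itemset keeps support non-increasing as items are added, so the usual Apriori pruning of candidates with an infrequent subset still applies. Other combination operators, such as the product, are equally plausible choices, and the study may use a different one.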