English Abstract |
Recently, data mining has been applied in business information and intelligence systems to discover interesting patterns and knowledge that support decision-making processes. One of the most basic and important tasks of data mining is the mining of frequent itemsets, which are sets of items frequently purchased together by customers. Many methods have been proposed for this problem. However, mining the complete set of frequent itemsets often leads to a huge solution space. Fortunately, this problem can be reduced to the mining of Frequent Closed Itemsets (FCIs), which yields a much smaller yet representative set of customer purchase patterns. Still, the databases contain redundancies that can be eliminated to improve both space and time efficiency. In this paper, we propose a novel data structure, the Transaction Pattern List (TPL), for eliminating data redundancies, and design the algorithm TPLFCI-Mining for mining FCIs efficiently with the TPL. Our algorithm is evaluated under more rigorous conditions than those used for previously proposed methods. Experimental results show that our method is efficient for both sparse and dense databases, and is scalable for large databases even at low support thresholds. |
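
To make the distinction between frequent itemsets and frequent closed itemsets concrete, the following is a minimal brute-force sketch in Python. It is not the TPL data structure or the TPLFCI-Mining algorithm, neither of which is specified in this abstract. The function names, the toy transactions, and the use of an absolute support count are all assumptions made for illustration. It uses the standard definition: an itemset is closed if no proper superset has the same support.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Enumerate every itemset and keep those contained in at least
    min_support transactions (exponential; for illustration only)."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            support = sum(1 for t in transactions if itemset <= t)
            if support >= min_support:
                result[itemset] = support
    return result


def closed_itemsets(frequent):
    """Keep only itemsets with no proper superset of equal support.
    Such a superset would itself be frequent, so checking within
    the frequent collection is sufficient."""
    return {
        itemset: support
        for itemset, support in frequent.items()
        if not any(
            itemset < other and support == other_support
            for other, other_support in frequent.items()
        )
    }


# Hypothetical toy database of four purchase transactions.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "milk", "eggs"}),
    frozenset({"bread", "milk", "eggs"}),
    frozenset({"milk"}),
]

frequent = frequent_itemsets(transactions, min_support=2)
closed = closed_itemsets(frequent)
```

In this toy database all seven non-empty itemsets are frequent at a support count of 2, but only three are closed: {milk}, {bread, milk}, and {bread, milk, eggs}. The support of every other frequent itemset can be recovered from these three, which is why the closed set is smaller yet still representative.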