English Abstract |
Recently, the idea of using data cubes stored in a data warehouse to facilitate association rule mining has attracted considerable attention. Researchers have proposed data cube based mining methods and shown that such cube-based approaches can significantly reduce mining time. However, these studies all assume that the data warehouse can store all possible data cubes, and they disregard the problem of selecting an appropriate subset of data cubes to materialize under limited storage so as to minimize the total execution time of association queries. Meanwhile, most research on the data cube selection problem has focused on SQL or OLAP queries; no existing work addresses data cube selection for association queries. The main purpose of this study is to investigate how, given limited storage and a set of users' association queries, an appropriate set of data cubes can be selected for materialization to reduce query execution time. To this end, we define a cost model for the data cube selection problem in online association mining and elaborate on cost estimation for association queries. We implement and compare various heuristic algorithms that select suitable data cubes subject to the space constraint. |