Abstract
We investigate the general model of mining associations in a temporal database, where the exhibition periods of items are allowed to be different from one to another. The database is divided into partitions according to the time granularity imposed. Such temporal association rules allow us to observe short-term but interesting patterns that are absent when the whole range of the database is evaluated altogether. Prior work may omit some temporal association rules and thus have limited practicability. To remedy this and to give more precise frequent exhibition periods of frequent temporal itemsets, we devise an efficient algorithm Twain (standing for TWo end AssocIation miNer.) Twain not only generates frequent patterns with more precise frequent exhibition periods, but also discovers more interesting frequent patterns. Twain employs Start time and End time of each item to provide precise frequent exhibition period while progressively handling itemsets from one partition to another. Along with one scan of the database, Twain can generate frequent 2-itemsets directly according to the cumulative filtering threshold. Then, Twain adopts the scan reduction technique to generate all frequent k-itemsets (k > 2) from the generated frequent 2-itemsets. Theoretical properties of Twain are derived as well in this article. The experimental results show that Twain outperforms the prior works in the quality of frequent patterns, execution time, I/O cost, CPU overhead and scalability.
- Agarwal, R., Aggarwal, C., and Prasad, V. 2000. A tree projection algorithm for generation of frequent itemsets. J. Para. Distrib. Comput. (Special Issue on High Performance Data Mining). Google ScholarDigital Library
- Agrawal, R., Imielinski, T., and Swami, A. 1993. Mining association rules between sets of items in large databases. In Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, ACM, New York, 207--216. Google ScholarDigital Library
- Agrawal, R. and Srikant, R. 1994. Fast algorithms for mining association rules in large databases. In Proceedings of the 20th International Conference on Very Large Data Bases, 478--499. Google ScholarDigital Library
- Ale, J. and Rossi, G. 2000. An approach to discovering temporal association rules. In Proceedings of the ACM Symposium on Applied Computing 1, ACM, New York, 294--300. Google ScholarDigital Library
- Ayad, A. M., El-Makky, N. M., and Taha, Y. 2001. Incremental mining of constrained association rules. In Proceedings of the 1st ACM-SIAM Conference on Data Mining. ACM, New York.Google Scholar
- Besemann, C. and Denton, A. 2005. Integration of profile hidden Markov model output into association rule mining. In Proceedings of the 11th ACM SIGKDD International Conference on Knowledge Discovery in Data Mining, ACM, New York, 538--543. Google ScholarDigital Library
- Bettini, C., Wang, X., and Jajodia, S. 1998. Mining temporal relationships with multiple granularities in time sequences. Bulle. IEEE Comput. Soc. Tech. Comm. Data Eng.Google Scholar
- Blanchard, J., Guillet, F., Gras, R., and Briand, H. 2005. Using information-theoretic measures to assess association rule interestingness. In Proceedings of the 5th IEEE International Conference on Data Mining. IEEE Computer Society Press, Los Alamitos, CA. Google ScholarDigital Library
- Chang, C.-Y., Chen, M.-S., and Lee, C.-H. 2002. Mining general temporal association rules for items with different exhibition periods. In Proceedings of the 2nd IEEE International Conference on Data Mining. IEEE Computer Society Press, Los Alamitos, CA. Google ScholarDigital Library
- Chen, J., He, H., Williams, G., and Jin, H. 2004. Temporal sequence associations for rare events. In Proceedings of the 8th Pacific Asia Conference on Knowledge Discovery and Data Mining.Google Scholar
- Chen, X. and Petr, I. 2000. Discovering temporal association rules: algorithms, language and system. In Proceedings of the 16th IEEE International Conference on Data Engineering. IEEE Computer Society Press, Los Alamitos, CA. Google ScholarDigital Library
- Chen, X., Petrounias, I., and Heathfield, H. 1998. Discovery of association rules in temporal databases. In Proceedings of the Issues and Applications of Database Technology.Google Scholar
- Cohen, E., Datary, M., Fujiwaraz, S., Gionisx, A., Indyk, P., Motwanik, R., Ullman, J. D., and Yangyy, C. 2001. Finding interesting associations without support pruning. IEEE Trans. Knowl. Data Eng., 64--78. Google ScholarDigital Library
- Han, J. and Fu, Y. 1995. Discovery of multiple-level association rules from large databases. In Proceedings of the 21th International Conference on Very Large Data Bases, 420--431. Google ScholarDigital Library
- Han, J. and Kamber, M. 2000. Data Mining: Concepts and Techniques. Morgan-Kaufmann. San Francisco, CA. Google ScholarDigital Library
- Han, J. and Pei, J. 2000. Mining frequent patterns by pattern-growth: Methodology and implications. ACM SIGKDD Explorations (Special Issue on Scaleble Data Mining Algorithms). ACM, New York. Google ScholarDigital Library
- Han, J., Pei, J., Mortazavi-Asl, B., Chen, Q., Dayal, U., and Hsu, M.-C. 2000a. FreeSpan: Frequent pattern-projected sequential pattern mining. In Proceedings of the 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, New York. 355--359. Google ScholarDigital Library
- Han, J., Pei, J., and Yin, Y. 2000b. Mining frequent patterns without candidate generation. In Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, ACM, New York. 486--493. Google ScholarDigital Library
- Harms, S. K. and Deogun, J. S. 2004. Sequential association rule mining with time lags. Journal of Intelligent Informatics Systems. Google ScholarDigital Library
- Jiang, N. and Gruenwald, L. 2006. An efficient algorithm to mine online data streams. In Proceedings of the 2006 KDD TDM Workshop.Google Scholar
- Ke, Y., Cheng, J., and Ng, W. 2006. MIC framework: An information-theoretic approach to quantitative association rule mining. In Proceedings of the 22nd IEEE International Conference on Data Engineering. IEEE Computer Society Press, Los Alamitos, CA. Google ScholarDigital Library
- Kifer, D., Bucila, C., Gehrke, J., and White, W. 2002. DualMiner: A dual-pruning algorithm for itemsets with constraints. In Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, New York. Google ScholarDigital Library
- Lakshmanan, L., Ng, R., Han, J., and Pang, A. 1998. Exploratory mining and pruning optimization of constrained associations rules. In Proceedings of the 1998 ACM SIGMOD International Conference on Management of Data. ACM, New York. Google ScholarDigital Library
- Lakshmanan, L. V. S., Ng, R., Han, J., and Pang, A. 1999. Optimization of constrained frequent set queries with 2-variable constraints. In Proceedings of the 1999 ACM SIGMOD International Conference on Management of Data, ACM, New York. 157--168. Google ScholarDigital Library
- Lee, C.-H., Chen, M.-S., and Lin, C.-R. 2003. Progressive partition miner: An efficient algorithm for mining general temporal association rules. IEEE Trans. Knowl. Data Eng. 15, 4 (Aug.), 1004--1017. Google ScholarDigital Library
- Lee, C.-H., Lin, C.-R., and Chen, M.-S. 2001a. On mining general temporal association rules in a publication database. In Proceedings of the 1st IEEE International Conference on Data Mining. (Nov.). IEEE Computer Society Press, Los Alamitos, CA. Google ScholarDigital Library
- Lee, C.-H., Lin, C.-R., and Chen, M.-S. 2001b. Sliding-window filtering: An efficient algorithm for incremental mining. In Proceedings of the 10th ACM International Conference on Information and Knowledge Management. ACM, New York. Google ScholarDigital Library
- Lin, J.-L. and Dunham, M. 1998. Mining association rules: Anti-skew algorithms. In Proceedings of the 14th IEEE International Conference on Data Engineering, IEEE Computer Society Press, Los Alamitos, CA. 486--493. Google ScholarDigital Library
- Liu, B., Hsu, W., and Ma, Y. 1999. Mining association rules with multiple minimum supports. In Proceedings of the 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, New York. Google ScholarDigital Library
- Mueller, A. 1995. Fast sequential and parallel algorithms for association rule mining: A comparison. Tech. Rep. CS-TR-3515, Dept. of Computer Science, Univ. of Maryland, College Park, MD. Google ScholarDigital Library
- Muhonen, B. G. J. and Toivonen, H. 2005. Mining non-derivable association rules. In Proceedings of the 5th ACM SIAM Conference on Data Mining. ACM, New York.Google Scholar
- Ng, R. and Han, J. 1994. Efficient and effective clustering methods for spatial data mining. In Proceedings of the 20th International Conference on Very Large Data Bases, 144--155. Google ScholarDigital Library
- Park, J.-S., Chen, M.-S., and Yu, P. S. 1997. Using a hash-based method with transaction trimming for mining association rules. IEEE Trans. Knowl. Data Eng. 9, 5 (Oct.), 813--825. Google ScholarDigital Library
- Pei, J. and Han, J. 2000. Can we push more constraints into frequent pattern mining? In Proceedings of the 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, New York. Google ScholarDigital Library
- Pei, J., Han, J., Mortazavi-Asl, B., Pinto, H., Chen, Q., Dayal, U., and Hsu, M.-C. 2001. PrefixSpan: Mining sequential patterns efficiently by prefix-projected pattern growth. In Proceedings of the 17th IEEE International Conference on Data Engineering. IEEE Computer Society Press, Los Alamitos, CA. Google ScholarDigital Library
- Savasere, A., Omiecinski, E., and Navathe, S. 1995. An efficient algorithm for mining association rules in large databases. In Proceedings of the 21th International Conference on Very Large Data Bases, 432--444. Google ScholarDigital Library
- Srikant, R. and Agrawal, R. 1995. Mining generalized association rules. In Proceedings of the 21th International Conference on Very Large Data Bases. 407--419. Google ScholarDigital Library
- Srikant, R. and Agrawal, R. 1996. Mining quantitative association rules in large relational tables. In Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data. ACM, New York. Google ScholarDigital Library
- Steinbach, M., Tan, P.-N., and Kumar, V. 2005. Support envelopes: A technique for exploring the structure of association patterns. In Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery in Data Mining. ACM, New York. Google ScholarDigital Library
- Szymon and Jaroszewicz. 2006. Polynomial association rules with applications to logistic regression. In Proceedings of the 11st ACM SIGKDD International Conference on Knowledge Discovery in Data Mining. ACM, New York. Google ScholarDigital Library
- Tansel, A. and Ayan, N. 1998. Discovery of association rules in temporal databases. In Proceedings of the AAAI on Knowledge Discovery in Databases.Google Scholar
- Toivonen, H. 1996. Sampling large databases for association rules. In Proceedings of the 22nd International Conference on Very Large Data Bases, 134--145. Google ScholarDigital Library
- Wang, K., He, Y., and Han, J. 2000. Mining frequent itemsets using support constraints. In Proceedings of the 26th International Conference on Very Large Data Bases. Google ScholarDigital Library
- Xuan-Hieu, P., Le-Minh, N., Tu-Bao, H., and Susumu, H. 2005. Improving discriminative sequential learning with rare-but-important associations. In Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery in Data Mining. ACM, New York. Google ScholarDigital Library
- Yang, C., Fayyad, U., and Bradley, P. 2001. Efficient discovery of error-tolerant frequent itemsets in high dimensions. In Proceedings of the 7th ACM SIGKDD International Conference on Knowledge Discovery in Data Mining. ACM, New York. Google ScholarDigital Library
- Zheng, Z., Kohavi, R., and Mason, L. 2001. Real world performance of association rule algorithms. In Proceedings of the 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, F. Provost and R. Srikant, Eds. ACM, New York. 401--406. Google ScholarDigital Library
Index Terms
- Twain: Two-end association miner with precise frequent exhibition periods
Recommendations
Developing Novel and Effective Approach for Association Rule Mining Using Progressive Sampling
ICCEE '09: Proceedings of the 2009 Second International Conference on Computer and Electrical Engineering - Volume 01A challenging task in data mining is the process of discovering association rules from a large database. Most of the existing association rule mining algorithms make repeated passes over the entire database to determine the frequent itemsets, which is ...
SRIHASS - a similarity measure for discovery of hidden time profiled temporal associations
Mining and visualization of time profiled temporal associations is an important research problem that is not addressed in a wider perspective and is understudied. Visual analysis of time profiled temporal associations helps to better understand hidden ...
Knowledge Discovery from Series of Interval Events
Data warehousing and knowledge discoveryKnowledge discovery from data sets can be extensively automated by using data mining software tools. Techniques for mining series of interval events, however, have not been considered. Such time series are common in many applications. In this paper, we ...
Comments