article

Twain: Two-end association miner with precise frequent exhibition periods

Authors:
Jen-Wei Huang

National Taiwan University, Taipei, Taiwan

National Taiwan University, Taipei, Taiwan
View Profile

,
Bi-Ru Dai

National Taiwan University of Science and Technology

National Taiwan University of Science and Technology
View Profile

,
Ming-Syan Chen

National Taiwan University, Taipei, Taiwan

National Taiwan University, Taipei, Taiwan
View Profile

Authors Info & Claims

ACM Transactions on Knowledge Discovery from Data Volume 1 Issue 2pp 8–eshttps://doi.org/10.1145/1267066.1267069

Published:01 August 2007Publication History

ACM Transactions on Knowledge Discovery from Data

Abstract

We investigate the general model of mining associations in a temporal database, where the exhibition periods of items are allowed to be different from one to another. The database is divided into partitions according to the time granularity imposed. Such temporal association rules allow us to observe short-term but interesting patterns that are absent when the whole range of the database is evaluated altogether. Prior work may omit some temporal association rules and thus have limited practicability. To remedy this and to give more precise frequent exhibition periods of frequent temporal itemsets, we devise an efficient algorithm Twain (standing for TWo end AssocIation miNer.) Twain not only generates frequent patterns with more precise frequent exhibition periods, but also discovers more interesting frequent patterns. Twain employs Start time and End time of each item to provide precise frequent exhibition period while progressively handling itemsets from one partition to another. Along with one scan of the database, Twain can generate frequent 2-itemsets directly according to the cumulative filtering threshold. Then, Twain adopts the scan reduction technique to generate all frequent k-itemsets (k > 2) from the generated frequent 2-itemsets. Theoretical properties of Twain are derived as well in this article. The experimental results show that Twain outperforms the prior works in the quality of frequent patterns, execution time, I/O cost, CPU overhead and scalability.

References

Agarwal, R., Aggarwal, C., and Prasad, V. 2000. A tree projection algorithm for generation of frequent itemsets. J. Para. Distrib. Comput. (Special Issue on High Performance Data Mining). Google ScholarDigital Library
Agrawal, R., Imielinski, T., and Swami, A. 1993. Mining association rules between sets of items in large databases. In Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, ACM, New York, 207--216. Google ScholarDigital Library
Agrawal, R. and Srikant, R. 1994. Fast algorithms for mining association rules in large databases. In Proceedings of the 20th International Conference on Very Large Data Bases, 478--499. Google ScholarDigital Library
Ale, J. and Rossi, G. 2000. An approach to discovering temporal association rules. In Proceedings of the ACM Symposium on Applied Computing 1, ACM, New York, 294--300. Google ScholarDigital Library
Ayad, A. M., El-Makky, N. M., and Taha, Y. 2001. Incremental mining of constrained association rules. In Proceedings of the 1st ACM-SIAM Conference on Data Mining. ACM, New York.Google Scholar
Besemann, C. and Denton, A. 2005. Integration of profile hidden Markov model output into association rule mining. In Proceedings of the 11th ACM SIGKDD International Conference on Knowledge Discovery in Data Mining, ACM, New York, 538--543. Google ScholarDigital Library
Bettini, C., Wang, X., and Jajodia, S. 1998. Mining temporal relationships with multiple granularities in time sequences. Bulle. IEEE Comput. Soc. Tech. Comm. Data Eng.Google Scholar
Blanchard, J., Guillet, F., Gras, R., and Briand, H. 2005. Using information-theoretic measures to assess association rule interestingness. In Proceedings of the 5th IEEE International Conference on Data Mining. IEEE Computer Society Press, Los Alamitos, CA. Google ScholarDigital Library
Chang, C.-Y., Chen, M.-S., and Lee, C.-H. 2002. Mining general temporal association rules for items with different exhibition periods. In Proceedings of the 2nd IEEE International Conference on Data Mining. IEEE Computer Society Press, Los Alamitos, CA. Google ScholarDigital Library
Chen, J., He, H., Williams, G., and Jin, H. 2004. Temporal sequence associations for rare events. In Proceedings of the 8th Pacific Asia Conference on Knowledge Discovery and Data Mining.Google Scholar
Chen, X. and Petr, I. 2000. Discovering temporal association rules: algorithms, language and system. In Proceedings of the 16th IEEE International Conference on Data Engineering. IEEE Computer Society Press, Los Alamitos, CA. Google ScholarDigital Library
Chen, X., Petrounias, I., and Heathfield, H. 1998. Discovery of association rules in temporal databases. In Proceedings of the Issues and Applications of Database Technology.Google Scholar
Cohen, E., Datary, M., Fujiwaraz, S., Gionisx, A., Indyk, P., Motwanik, R., Ullman, J. D., and Yangyy, C. 2001. Finding interesting associations without support pruning. IEEE Trans. Knowl. Data Eng., 64--78. Google ScholarDigital Library
Han, J. and Fu, Y. 1995. Discovery of multiple-level association rules from large databases. In Proceedings of the 21th International Conference on Very Large Data Bases, 420--431. Google ScholarDigital Library
Han, J. and Kamber, M. 2000. Data Mining: Concepts and Techniques. Morgan-Kaufmann. San Francisco, CA. Google ScholarDigital Library
Han, J. and Pei, J. 2000. Mining frequent patterns by pattern-growth: Methodology and implications. ACM SIGKDD Explorations (Special Issue on Scaleble Data Mining Algorithms). ACM, New York. Google ScholarDigital Library
Han, J., Pei, J., Mortazavi-Asl, B., Chen, Q., Dayal, U., and Hsu, M.-C. 2000a. FreeSpan: Frequent pattern-projected sequential pattern mining. In Proceedings of the 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, New York. 355--359. Google ScholarDigital Library
Han, J., Pei, J., and Yin, Y. 2000b. Mining frequent patterns without candidate generation. In Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, ACM, New York. 486--493. Google ScholarDigital Library
Harms, S. K. and Deogun, J. S. 2004. Sequential association rule mining with time lags. Journal of Intelligent Informatics Systems. Google ScholarDigital Library
Jiang, N. and Gruenwald, L. 2006. An efficient algorithm to mine online data streams. In Proceedings of the 2006 KDD TDM Workshop.Google Scholar
Ke, Y., Cheng, J., and Ng, W. 2006. MIC framework: An information-theoretic approach to quantitative association rule mining. In Proceedings of the 22nd IEEE International Conference on Data Engineering. IEEE Computer Society Press, Los Alamitos, CA. Google ScholarDigital Library
Kifer, D., Bucila, C., Gehrke, J., and White, W. 2002. DualMiner: A dual-pruning algorithm for itemsets with constraints. In Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, New York. Google ScholarDigital Library
Lakshmanan, L., Ng, R., Han, J., and Pang, A. 1998. Exploratory mining and pruning optimization of constrained associations rules. In Proceedings of the 1998 ACM SIGMOD International Conference on Management of Data. ACM, New York. Google ScholarDigital Library
Lakshmanan, L. V. S., Ng, R., Han, J., and Pang, A. 1999. Optimization of constrained frequent set queries with 2-variable constraints. In Proceedings of the 1999 ACM SIGMOD International Conference on Management of Data, ACM, New York. 157--168. Google ScholarDigital Library
Lee, C.-H., Chen, M.-S., and Lin, C.-R. 2003. Progressive partition miner: An efficient algorithm for mining general temporal association rules. IEEE Trans. Knowl. Data Eng. 15, 4 (Aug.), 1004--1017. Google ScholarDigital Library
Lee, C.-H., Lin, C.-R., and Chen, M.-S. 2001a. On mining general temporal association rules in a publication database. In Proceedings of the 1st IEEE International Conference on Data Mining. (Nov.). IEEE Computer Society Press, Los Alamitos, CA. Google ScholarDigital Library
Lee, C.-H., Lin, C.-R., and Chen, M.-S. 2001b. Sliding-window filtering: An efficient algorithm for incremental mining. In Proceedings of the 10th ACM International Conference on Information and Knowledge Management. ACM, New York. Google ScholarDigital Library
Lin, J.-L. and Dunham, M. 1998. Mining association rules: Anti-skew algorithms. In Proceedings of the 14th IEEE International Conference on Data Engineering, IEEE Computer Society Press, Los Alamitos, CA. 486--493. Google ScholarDigital Library
Liu, B., Hsu, W., and Ma, Y. 1999. Mining association rules with multiple minimum supports. In Proceedings of the 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, New York. Google ScholarDigital Library
Mueller, A. 1995. Fast sequential and parallel algorithms for association rule mining: A comparison. Tech. Rep. CS-TR-3515, Dept. of Computer Science, Univ. of Maryland, College Park, MD. Google ScholarDigital Library
Muhonen, B. G. J. and Toivonen, H. 2005. Mining non-derivable association rules. In Proceedings of the 5th ACM SIAM Conference on Data Mining. ACM, New York.Google Scholar
Ng, R. and Han, J. 1994. Efficient and effective clustering methods for spatial data mining. In Proceedings of the 20th International Conference on Very Large Data Bases, 144--155. Google ScholarDigital Library
Park, J.-S., Chen, M.-S., and Yu, P. S. 1997. Using a hash-based method with transaction trimming for mining association rules. IEEE Trans. Knowl. Data Eng. 9, 5 (Oct.), 813--825. Google ScholarDigital Library
Pei, J. and Han, J. 2000. Can we push more constraints into frequent pattern mining? In Proceedings of the 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, New York. Google ScholarDigital Library
Pei, J., Han, J., Mortazavi-Asl, B., Pinto, H., Chen, Q., Dayal, U., and Hsu, M.-C. 2001. PrefixSpan: Mining sequential patterns efficiently by prefix-projected pattern growth. In Proceedings of the 17th IEEE International Conference on Data Engineering. IEEE Computer Society Press, Los Alamitos, CA. Google ScholarDigital Library
Savasere, A., Omiecinski, E., and Navathe, S. 1995. An efficient algorithm for mining association rules in large databases. In Proceedings of the 21th International Conference on Very Large Data Bases, 432--444. Google ScholarDigital Library
Srikant, R. and Agrawal, R. 1995. Mining generalized association rules. In Proceedings of the 21th International Conference on Very Large Data Bases. 407--419. Google ScholarDigital Library
Srikant, R. and Agrawal, R. 1996. Mining quantitative association rules in large relational tables. In Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data. ACM, New York. Google ScholarDigital Library
Steinbach, M., Tan, P.-N., and Kumar, V. 2005. Support envelopes: A technique for exploring the structure of association patterns. In Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery in Data Mining. ACM, New York. Google ScholarDigital Library
Szymon and Jaroszewicz. 2006. Polynomial association rules with applications to logistic regression. In Proceedings of the 11st ACM SIGKDD International Conference on Knowledge Discovery in Data Mining. ACM, New York. Google ScholarDigital Library
Tansel, A. and Ayan, N. 1998. Discovery of association rules in temporal databases. In Proceedings of the AAAI on Knowledge Discovery in Databases.Google Scholar
Toivonen, H. 1996. Sampling large databases for association rules. In Proceedings of the 22nd International Conference on Very Large Data Bases, 134--145. Google ScholarDigital Library
Wang, K., He, Y., and Han, J. 2000. Mining frequent itemsets using support constraints. In Proceedings of the 26th International Conference on Very Large Data Bases. Google ScholarDigital Library
Xuan-Hieu, P., Le-Minh, N., Tu-Bao, H., and Susumu, H. 2005. Improving discriminative sequential learning with rare-but-important associations. In Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery in Data Mining. ACM, New York. Google ScholarDigital Library
Yang, C., Fayyad, U., and Bradley, P. 2001. Efficient discovery of error-tolerant frequent itemsets in high dimensions. In Proceedings of the 7th ACM SIGKDD International Conference on Knowledge Discovery in Data Mining. ACM, New York. Google ScholarDigital Library
Zheng, Z., Kohavi, R., and Mason, L. 2001. Real world performance of association rule algorithms. In Proceedings of the 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, F. Provost and R. Srikant, Eds. ACM, New York. 401--406. Google ScholarDigital Library

Index Terms

Twain: Two-end association miner with precise frequent exhibition periods
1. Information systems
  1. Information systems applications
    1. Data mining

Recommendations

Developing Novel and Effective Approach for Association Rule Mining Using Progressive Sampling
ICCEE '09: Proceedings of the 2009 Second International Conference on Computer and Electrical Engineering - Volume 01

A challenging task in data mining is the process of discovering association rules from a large database. Most of the existing association rule mining algorithms make repeated passes over the entire database to determine the frequent itemsets, which is ...
Read More
SRIHASS - a similarity measure for discovery of hidden time profiled temporal associations

Mining and visualization of time profiled temporal associations is an important research problem that is not addressed in a wider perspective and is understudied. Visual analysis of time profiled temporal associations helps to better understand hidden ...
Read More
Knowledge Discovery from Series of Interval Events
Data warehousing and knowledge discovery

Knowledge discovery from data sets can be extensively automated by using data mining software tools. Techniques for mining series of interval events, however, have not been considered. Such time series are common in many applications. In this paper, we ...
Read More

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Published in

ACM Transactions on Knowledge Discovery from Data Volume 1, Issue 2
August 2007
89 pages
ISSN:1556-4681
EISSN:1556-472X
DOI:10.1145/1267066
Issue’s Table of Contents

Copyright © 2007 ACM
Sponsors
In-Cooperation
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 1 August 2007
Published in tkdd Volume 1, Issue 2

Permissions
Request permissions about this article.
Request Permissions

Check for updates
Author Tags
Association
temporal
Qualifiers
- article
Conference
Funding Sources
Other Metrics
View Article Metrics

Article Metrics
- 19
  Total Citations
  View Citations
- 741
  Total Downloads
- Downloads (Last 12 months)4
- Downloads (Last 6 weeks)0
Other Metrics
View Author Metrics
Cited By
View all

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Twain: Two-end association miner with precise frequent exhibition periods

ACM Transactions on Knowledge Discovery from Data

Abstract

References

Cited By

Index Terms

Recommendations

Developing Novel and Effective Approach for Association Rule Mining Using Progressive Sampling

SRIHASS - a similarity measure for discovery of hidden time profiled temporal associations

Knowledge Discovery from Series of Interval Events

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Funding Sources

Other Metrics

Article Metrics

Other Metrics

Cited By

PDF Format

eReader

Digital Edition

Caption

Twain: Two-end association miner with precise frequent exhibition periods

ACM Transactions on Knowledge Discovery from Data

Abstract

References

Cited By

Index Terms

Recommendations

Developing Novel and Effective Approach for Association Rule Mining Using Progressive Sampling

SRIHASS - a similarity measure for discovery of hidden time profiled temporal associations

Knowledge Discovery from Series of Interval Events

Comments

Login options

Full Access

Published in

Sponsors

In-Cooperation

Publisher

Publication History

Permissions

Check for updates

Author Tags

Qualifiers

Conference

Funding Sources

Article Metrics

Other Metrics

PDF Format

eReader

Digital Edition

Share this Publication link

Share on Social Media