skip to main content
article

Twain: Two-end association miner with precise frequent exhibition periods

Published:01 August 2007Publication History
Skip Abstract Section

Abstract

We investigate the general model of mining associations in a temporal database, where the exhibition periods of items are allowed to be different from one to another. The database is divided into partitions according to the time granularity imposed. Such temporal association rules allow us to observe short-term but interesting patterns that are absent when the whole range of the database is evaluated altogether. Prior work may omit some temporal association rules and thus have limited practicability. To remedy this and to give more precise frequent exhibition periods of frequent temporal itemsets, we devise an efficient algorithm Twain (standing for TWo end AssocIation miNer.) Twain not only generates frequent patterns with more precise frequent exhibition periods, but also discovers more interesting frequent patterns. Twain employs Start time and End time of each item to provide precise frequent exhibition period while progressively handling itemsets from one partition to another. Along with one scan of the database, Twain can generate frequent 2-itemsets directly according to the cumulative filtering threshold. Then, Twain adopts the scan reduction technique to generate all frequent k-itemsets (k > 2) from the generated frequent 2-itemsets. Theoretical properties of Twain are derived as well in this article. The experimental results show that Twain outperforms the prior works in the quality of frequent patterns, execution time, I/O cost, CPU overhead and scalability.

References

  1. Agarwal, R., Aggarwal, C., and Prasad, V. 2000. A tree projection algorithm for generation of frequent itemsets. J. Para. Distrib. Comput. (Special Issue on High Performance Data Mining). Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Agrawal, R., Imielinski, T., and Swami, A. 1993. Mining association rules between sets of items in large databases. In Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, ACM, New York, 207--216. Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. Agrawal, R. and Srikant, R. 1994. Fast algorithms for mining association rules in large databases. In Proceedings of the 20th International Conference on Very Large Data Bases, 478--499. Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. Ale, J. and Rossi, G. 2000. An approach to discovering temporal association rules. In Proceedings of the ACM Symposium on Applied Computing 1, ACM, New York, 294--300. Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Ayad, A. M., El-Makky, N. M., and Taha, Y. 2001. Incremental mining of constrained association rules. In Proceedings of the 1st ACM-SIAM Conference on Data Mining. ACM, New York.Google ScholarGoogle Scholar
  6. Besemann, C. and Denton, A. 2005. Integration of profile hidden Markov model output into association rule mining. In Proceedings of the 11th ACM SIGKDD International Conference on Knowledge Discovery in Data Mining, ACM, New York, 538--543. Google ScholarGoogle ScholarDigital LibraryDigital Library
  7. Bettini, C., Wang, X., and Jajodia, S. 1998. Mining temporal relationships with multiple granularities in time sequences. Bulle. IEEE Comput. Soc. Tech. Comm. Data Eng.Google ScholarGoogle Scholar
  8. Blanchard, J., Guillet, F., Gras, R., and Briand, H. 2005. Using information-theoretic measures to assess association rule interestingness. In Proceedings of the 5th IEEE International Conference on Data Mining. IEEE Computer Society Press, Los Alamitos, CA. Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Chang, C.-Y., Chen, M.-S., and Lee, C.-H. 2002. Mining general temporal association rules for items with different exhibition periods. In Proceedings of the 2nd IEEE International Conference on Data Mining. IEEE Computer Society Press, Los Alamitos, CA. Google ScholarGoogle ScholarDigital LibraryDigital Library
  10. Chen, J., He, H., Williams, G., and Jin, H. 2004. Temporal sequence associations for rare events. In Proceedings of the 8th Pacific Asia Conference on Knowledge Discovery and Data Mining.Google ScholarGoogle Scholar
  11. Chen, X. and Petr, I. 2000. Discovering temporal association rules: algorithms, language and system. In Proceedings of the 16th IEEE International Conference on Data Engineering. IEEE Computer Society Press, Los Alamitos, CA. Google ScholarGoogle ScholarDigital LibraryDigital Library
  12. Chen, X., Petrounias, I., and Heathfield, H. 1998. Discovery of association rules in temporal databases. In Proceedings of the Issues and Applications of Database Technology.Google ScholarGoogle Scholar
  13. Cohen, E., Datary, M., Fujiwaraz, S., Gionisx, A., Indyk, P., Motwanik, R., Ullman, J. D., and Yangyy, C. 2001. Finding interesting associations without support pruning. IEEE Trans. Knowl. Data Eng., 64--78. Google ScholarGoogle ScholarDigital LibraryDigital Library
  14. Han, J. and Fu, Y. 1995. Discovery of multiple-level association rules from large databases. In Proceedings of the 21th International Conference on Very Large Data Bases, 420--431. Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. Han, J. and Kamber, M. 2000. Data Mining: Concepts and Techniques. Morgan-Kaufmann. San Francisco, CA. Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. Han, J. and Pei, J. 2000. Mining frequent patterns by pattern-growth: Methodology and implications. ACM SIGKDD Explorations (Special Issue on Scaleble Data Mining Algorithms). ACM, New York. Google ScholarGoogle ScholarDigital LibraryDigital Library
  17. Han, J., Pei, J., Mortazavi-Asl, B., Chen, Q., Dayal, U., and Hsu, M.-C. 2000a. FreeSpan: Frequent pattern-projected sequential pattern mining. In Proceedings of the 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, New York. 355--359. Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. Han, J., Pei, J., and Yin, Y. 2000b. Mining frequent patterns without candidate generation. In Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, ACM, New York. 486--493. Google ScholarGoogle ScholarDigital LibraryDigital Library
  19. Harms, S. K. and Deogun, J. S. 2004. Sequential association rule mining with time lags. Journal of Intelligent Informatics Systems. Google ScholarGoogle ScholarDigital LibraryDigital Library
  20. Jiang, N. and Gruenwald, L. 2006. An efficient algorithm to mine online data streams. In Proceedings of the 2006 KDD TDM Workshop.Google ScholarGoogle Scholar
  21. Ke, Y., Cheng, J., and Ng, W. 2006. MIC framework: An information-theoretic approach to quantitative association rule mining. In Proceedings of the 22nd IEEE International Conference on Data Engineering. IEEE Computer Society Press, Los Alamitos, CA. Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. Kifer, D., Bucila, C., Gehrke, J., and White, W. 2002. DualMiner: A dual-pruning algorithm for itemsets with constraints. In Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, New York. Google ScholarGoogle ScholarDigital LibraryDigital Library
  23. Lakshmanan, L., Ng, R., Han, J., and Pang, A. 1998. Exploratory mining and pruning optimization of constrained associations rules. In Proceedings of the 1998 ACM SIGMOD International Conference on Management of Data. ACM, New York. Google ScholarGoogle ScholarDigital LibraryDigital Library
  24. Lakshmanan, L. V. S., Ng, R., Han, J., and Pang, A. 1999. Optimization of constrained frequent set queries with 2-variable constraints. In Proceedings of the 1999 ACM SIGMOD International Conference on Management of Data, ACM, New York. 157--168. Google ScholarGoogle ScholarDigital LibraryDigital Library
  25. Lee, C.-H., Chen, M.-S., and Lin, C.-R. 2003. Progressive partition miner: An efficient algorithm for mining general temporal association rules. IEEE Trans. Knowl. Data Eng. 15, 4 (Aug.), 1004--1017. Google ScholarGoogle ScholarDigital LibraryDigital Library
  26. Lee, C.-H., Lin, C.-R., and Chen, M.-S. 2001a. On mining general temporal association rules in a publication database. In Proceedings of the 1st IEEE International Conference on Data Mining. (Nov.). IEEE Computer Society Press, Los Alamitos, CA. Google ScholarGoogle ScholarDigital LibraryDigital Library
  27. Lee, C.-H., Lin, C.-R., and Chen, M.-S. 2001b. Sliding-window filtering: An efficient algorithm for incremental mining. In Proceedings of the 10th ACM International Conference on Information and Knowledge Management. ACM, New York. Google ScholarGoogle ScholarDigital LibraryDigital Library
  28. Lin, J.-L. and Dunham, M. 1998. Mining association rules: Anti-skew algorithms. In Proceedings of the 14th IEEE International Conference on Data Engineering, IEEE Computer Society Press, Los Alamitos, CA. 486--493. Google ScholarGoogle ScholarDigital LibraryDigital Library
  29. Liu, B., Hsu, W., and Ma, Y. 1999. Mining association rules with multiple minimum supports. In Proceedings of the 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, New York. Google ScholarGoogle ScholarDigital LibraryDigital Library
  30. Mueller, A. 1995. Fast sequential and parallel algorithms for association rule mining: A comparison. Tech. Rep. CS-TR-3515, Dept. of Computer Science, Univ. of Maryland, College Park, MD. Google ScholarGoogle ScholarDigital LibraryDigital Library
  31. Muhonen, B. G. J. and Toivonen, H. 2005. Mining non-derivable association rules. In Proceedings of the 5th ACM SIAM Conference on Data Mining. ACM, New York.Google ScholarGoogle Scholar
  32. Ng, R. and Han, J. 1994. Efficient and effective clustering methods for spatial data mining. In Proceedings of the 20th International Conference on Very Large Data Bases, 144--155. Google ScholarGoogle ScholarDigital LibraryDigital Library
  33. Park, J.-S., Chen, M.-S., and Yu, P. S. 1997. Using a hash-based method with transaction trimming for mining association rules. IEEE Trans. Knowl. Data Eng. 9, 5 (Oct.), 813--825. Google ScholarGoogle ScholarDigital LibraryDigital Library
  34. Pei, J. and Han, J. 2000. Can we push more constraints into frequent pattern mining? In Proceedings of the 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, New York. Google ScholarGoogle ScholarDigital LibraryDigital Library
  35. Pei, J., Han, J., Mortazavi-Asl, B., Pinto, H., Chen, Q., Dayal, U., and Hsu, M.-C. 2001. PrefixSpan: Mining sequential patterns efficiently by prefix-projected pattern growth. In Proceedings of the 17th IEEE International Conference on Data Engineering. IEEE Computer Society Press, Los Alamitos, CA. Google ScholarGoogle ScholarDigital LibraryDigital Library
  36. Savasere, A., Omiecinski, E., and Navathe, S. 1995. An efficient algorithm for mining association rules in large databases. In Proceedings of the 21th International Conference on Very Large Data Bases, 432--444. Google ScholarGoogle ScholarDigital LibraryDigital Library
  37. Srikant, R. and Agrawal, R. 1995. Mining generalized association rules. In Proceedings of the 21th International Conference on Very Large Data Bases. 407--419. Google ScholarGoogle ScholarDigital LibraryDigital Library
  38. Srikant, R. and Agrawal, R. 1996. Mining quantitative association rules in large relational tables. In Proceedings of the 1996 ACM SIGMOD International Conference on Management of Data. ACM, New York. Google ScholarGoogle ScholarDigital LibraryDigital Library
  39. Steinbach, M., Tan, P.-N., and Kumar, V. 2005. Support envelopes: A technique for exploring the structure of association patterns. In Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery in Data Mining. ACM, New York. Google ScholarGoogle ScholarDigital LibraryDigital Library
  40. Szymon and Jaroszewicz. 2006. Polynomial association rules with applications to logistic regression. In Proceedings of the 11st ACM SIGKDD International Conference on Knowledge Discovery in Data Mining. ACM, New York. Google ScholarGoogle ScholarDigital LibraryDigital Library
  41. Tansel, A. and Ayan, N. 1998. Discovery of association rules in temporal databases. In Proceedings of the AAAI on Knowledge Discovery in Databases.Google ScholarGoogle Scholar
  42. Toivonen, H. 1996. Sampling large databases for association rules. In Proceedings of the 22nd International Conference on Very Large Data Bases, 134--145. Google ScholarGoogle ScholarDigital LibraryDigital Library
  43. Wang, K., He, Y., and Han, J. 2000. Mining frequent itemsets using support constraints. In Proceedings of the 26th International Conference on Very Large Data Bases. Google ScholarGoogle ScholarDigital LibraryDigital Library
  44. Xuan-Hieu, P., Le-Minh, N., Tu-Bao, H., and Susumu, H. 2005. Improving discriminative sequential learning with rare-but-important associations. In Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery in Data Mining. ACM, New York. Google ScholarGoogle ScholarDigital LibraryDigital Library
  45. Yang, C., Fayyad, U., and Bradley, P. 2001. Efficient discovery of error-tolerant frequent itemsets in high dimensions. In Proceedings of the 7th ACM SIGKDD International Conference on Knowledge Discovery in Data Mining. ACM, New York. Google ScholarGoogle ScholarDigital LibraryDigital Library
  46. Zheng, Z., Kohavi, R., and Mason, L. 2001. Real world performance of association rule algorithms. In Proceedings of the 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, F. Provost and R. Srikant, Eds. ACM, New York. 401--406. Google ScholarGoogle ScholarDigital LibraryDigital Library

Index Terms

  1. Twain: Two-end association miner with precise frequent exhibition periods

    Recommendations

    Comments

    Login options

    Check if you have access through your login credentials or your institution to get full access on this article.

    Sign in

    Full Access

    • Published in

      cover image ACM Transactions on Knowledge Discovery from Data
      ACM Transactions on Knowledge Discovery from Data  Volume 1, Issue 2
      August 2007
      89 pages
      ISSN:1556-4681
      EISSN:1556-472X
      DOI:10.1145/1267066
      Issue’s Table of Contents

      Copyright © 2007 ACM

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      • Published: 1 August 2007
      Published in tkdd Volume 1, Issue 2

      Permissions

      Request permissions about this article.

      Request Permissions

      Check for updates

      Qualifiers

      • article

    PDF Format

    View or Download as a PDF file.

    PDF

    eReader

    View online with eReader.

    eReader