Skip to main content
Log in

Mining high utility itemsets by dynamically pruning the tree structure

  • Published:
Applied Intelligence Aims and scope Submit manuscript

Abstract

Mining high utility itemsets is one of the most important research issues in data mining owing to its ability to consider nonbinary frequency values of items in transactions and different profit values for each item. Mining such itemsets from a transaction database involves finding those itemsets with utility above a user-specified threshold. In this paper, we propose an efficient concurrent algorithm, called CHUI-Mine (Concurrent High Utility Itemsets Mine), for mining high utility itemsets by dynamically pruning the tree structure. A tree structure, called the CHUI-Tree, is introduced to capture the important utility information of the candidate itemsets. By recording changes in support counts of candidate high utility items during the tree construction process, we implement dynamic CHUI-Tree pruning, and discuss the rationality thereof. The CHUI-Mine algorithm makes use of a concurrent strategy, enabling the simultaneous construction of a CHUI-Tree and the discovery of high utility itemsets. Our algorithm reduces the problem of huge memory usage for tree construction and traversal in tree-based algorithms for mining high utility itemsets. Extensive experimental results show that the CHUI-Mine algorithm is both efficient and scalable.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11
Fig. 12
Fig. 13
Fig. 14
Fig. 15
Fig. 16
Fig. 17
Fig. 18
Fig. 19
Fig. 20
Fig. 21
Fig. 22
Fig. 23
Fig. 24

Similar content being viewed by others

References

  1. Adnan M, Alhajj R (2009) DRFP-tree: disk-resident frequent pattern tree. Appl Intell 30(2):84–97

    Article  Google Scholar 

  2. Agrawal R, Srikant R (1994) Fast algorithms for mining association rules. In: Proceedings 20th international conference very large data bases (VLDB’94), pp 487–499

    Google Scholar 

  3. Ahmed CF, Tanbeer SK, Jeong B-S, Lee Y-K (2009) Efficient tree structures for high utility pattern. IEEE Trans Knowl Data Eng 21(12):1708–1721

    Article  Google Scholar 

  4. Ahmed CF, Tanbeer SK, Jeong B-S, Lee Y-K (2011) HUC-Prune: an efficient candidate pruning technique to mine high utility patterns. Appl Intell 34(2):181–198

    Article  Google Scholar 

  5. Barber B, Hamilton HJ (2003) Extracting share frequent itemsets with infrequent subsets. Data Min Knowl Discov 7(2):153–185

    Article  MathSciNet  Google Scholar 

  6. Chan R, Yang Q, Shen Y-D (2003) Mining high utility itemsets. In: Proceedings of the 3rd IEEE international conference on data mining (ICDM’03), pp 19–26

    Chapter  Google Scholar 

  7. Erwin A, Gopalan RP, Achuthan NR (2007) CTU-Mine: an efficient high utility itemset mining algorithm using the pattern growth approach. In: Proceedings of the 7th IEEE international conference on computer and information technology (CIT’07), pp 71–76

    Chapter  Google Scholar 

  8. Erwin A, Gopalan RP, Achuthan NR (2007) A bottom-up projection based algorithm for mining high utility itemsets. In: Proceedings of the 2nd international workshop on integrating artificial intelligence and data mining (AIDM’07), pp 3–10

    Google Scholar 

  9. Erwin A, Gopalan RP, Achuthan NR (2008) Efficient mining of high utility itemsets from large datasets. In: Proceedings of the 12th Pacific-Asia conference on advances in knowledge discovery and data mining (PAKDD’08), pp 554–561

    Chapter  Google Scholar 

  10. Grahne G, Zhu J (2005) Fast algorithms for frequent itemset mining using FP-trees. IEEE Trans Knowl Data Eng 17(10):1347–1362

    Article  Google Scholar 

  11. Han J, Cheng H, Xin D, Yan X (2007) Frequent pattern mining: current status and future directions. Data Min Knowl Discov 15(1):55–86

    Article  MathSciNet  Google Scholar 

  12. Han J, Kamber M (2006) Data mining: concepts and techniques, 2nd edn. Morgan Kaufmann, San Francisco

    Google Scholar 

  13. Han J, Pei J, Yin Y, Mao R (2004) Mining frequent patterns without candidate generation: a frequent-pattern tree approach. Data Min Knowl Discov 8(1):53–87

    Article  MathSciNet  Google Scholar 

  14. Hu J, Mojsilovic A (2007) High-utility pattern mining: a method for discovery of high-utility item sets. Pattern Recognit 40(11):3317–3324

    Article  MATH  Google Scholar 

  15. Kaya M, Alhajj R (2008) Online mining of fuzzy multidimensional weighted association rules. Appl Intell 29(1):13–34

    Article  Google Scholar 

  16. Lee C-H (2007) IMSP: an information theoretic approach for multi-dimensional sequential pattern mining. Appl Intell 26(3):231–242

    Article  MATH  Google Scholar 

  17. Lan G-C, Hong T-P, Tseng VS (2012) A projection-based approach for discovering high average-utility itemsets. J Inf Sci Eng 28(1):193–209

    Google Scholar 

  18. Li H-F, Huang H-Y, Lee S-Y (2011) Fast and memory efficient mining of high-utility itemsets from data streams: with and without negative item profits. Knowl Inf Syst 28(3):495–522

    Article  Google Scholar 

  19. Li H-F (2011) MHUI-max: an efficient algorithm for discovering high-utility itemsets from data streams. J Inf Sci 37(5):532–545

    Article  Google Scholar 

  20. Li Y-C, Yeh J-S, Chang C-C (2008) Isolated items discarding strategy for discovering high utility itemsets. Data Knowl Eng 64(1):198–217

    Article  Google Scholar 

  21. Lin C-W, Hong T-P, Lu W-H (2011) An effective tree structure for mining high utility itemsets. Expert Syst Appl 38(6):7419–7424

    Article  Google Scholar 

  22. Liu Y, Liao W-K, Choudhary AN (2005) A two phase algorithm for fast discovery of high utility of itemsets. In: Proceedings of the 9th Pacific-Asia conference on knowledge discovery and data mining (PAKDD’05), pp 689–695

    Google Scholar 

  23. Maunz A, Helma C, Kramer S (2011) Efficient mining for structurally diverse subgraph patterns in large molecular databases. Mach Learn 83(2):193–218

    Article  MATH  MathSciNet  Google Scholar 

  24. Piatetsky-Shapiro G (2007) Data mining and knowledge discovery 1996 to 2005: overcoming the hype and moving from “university” to “business” and “analytics”. Data Min Knowl Discov 15(1):99–105

    Article  MathSciNet  Google Scholar 

  25. Rauch J (2005) Logic of association rules. Appl Intell 22(1):9–28

    Article  MATH  Google Scholar 

  26. Roscoe AW (2010) Understanding concurrent systems. Springer, London

    Book  MATH  Google Scholar 

  27. Shie B-E, Yu PS, Tseng VS (2013) Mining interesting user behavior patterns in mobile commerce environments. Appl Intell 38(3):418–435

    Article  Google Scholar 

  28. Song W, Yang BR, Xu ZY (2008) Index-BitTableFI: an improved algorithm for mining frequent itemsets. Knowl-Based Syst 21(6):507–513

    Article  Google Scholar 

  29. Tatti N, Cule B (2012) Mining closed strict episodes. Data Min Knowl Discov 25(1):34–66

    Article  MATH  MathSciNet  Google Scholar 

  30. Tseng M-C, Lin Y-Y, Jeng R (2008) Updating generalized association rules with evolving taxonomies. Appl Intell 29(3):306–320

    Article  Google Scholar 

  31. Tseng VS, Wu C-W, Shie B-E, Yu PS (2010) UP-Growth: an efficient algorithm for high utility itemset mining. In: Proceedings of the 16th ACM SIGKDD international conference on knowledge discovery and data mining (KDD’10), pp 253–262

    Chapter  Google Scholar 

  32. IBM data generator. http://www.cs.loyola.edu/~cgiannel/assoc_gen.html

  33. Frequent Itemset mining implementations repository. http://fimi.ua.ac.be/

  34. Wang S-L, Patel D, Jafari A, Hong T-P (2007) Hiding collaborative recommendation association rules. Appl Intell 27(1):67–77

    Article  MATH  Google Scholar 

  35. Wang Y-T, Cheng J-T (2011) Mining periodic movement patterns of mobile phone users based on an efficient sampling approach. Appl Intell 35(1):32–40

    Article  Google Scholar 

  36. Yao H, Hamilton HJ, Butz CJ (2004) A foundational approach to mining itemset utilities from databases. In: Proceedings of the 4th SIAM international conference on data mining (SDM’04), pp 482–486

    Chapter  Google Scholar 

  37. Yao H, Hamilton HJ (2006) Mining itemset utilities from transaction databases. Data Knowl Eng 59(3):603–626

    Article  Google Scholar 

  38. Yu G, Shao S, Luo B, Zeng X (2009) A hybrid method for high-utility itemsets mining in large high-dimensional data. Int J Data Warehous Min 5(1):57–73

    Article  Google Scholar 

  39. Zaki MJ, Gouda K (2003) Fast vertical mining using diffsets. In: Proceedings of the 9th ACM SIGKDD international conference on knowledge discovery and data mining (KDD’03), pp 326–335

    Google Scholar 

  40. Zhang S, Chen F, Wu X, Zhang C, Wang R (2012) Mining bridging rules between conceptual clusters. Appl Intell 36(1):108–118

    Article  Google Scholar 

Download references

Acknowledgements

We would like to express our deep gratitude to the anonymous reviewers of this paper. The work is partly supported by the National Natural Science Foundation of China (61105045), Funding Project for Academic Human Resources Development in Institutions of Higher Learning Under the Jurisdiction of Beijing Municipality (PHR201108057), and North China University of Technology (CCXZ201303).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Wei Song.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Song, W., Liu, Y. & Li, J. Mining high utility itemsets by dynamically pruning the tree structure. Appl Intell 40, 29–43 (2014). https://doi.org/10.1007/s10489-013-0443-7

Download citation

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s10489-013-0443-7

Keywords

Navigation