Published in: Peer-to-Peer Networking and Applications 4/2021

22-08-2020

High utility itemset mining using path encoding and constrained subset generation

Authors: Vamsinath Javangula, Suvarna Vani Koneru, Haritha Dasari


Abstract

This paper proposes a two-phase approach to high utility itemset mining. In the first phase, potential high utility itemsets are generated from potential high utility maximal supersets, and the transaction weighted utility measure is used to identify the potential high utility itemsets. The maximal supersets are obtained from high utility paths ending in the items of the transaction database, and they are constructed without using any tree structures. The prefix information of an item in a transaction, that is, the path leading to it, is encoded as binary codes and stored in the node containing the item information. The potential high utility itemsets are generated from the maximal supersets using a modified set enumeration tree. The high utility itemsets are then obtained from the set enumeration tree by scanning the transaction database to calculate the actual utility of each candidate. The experiments highlight the superior performance of the system compared to other similar systems in the literature.
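The general two-phase idea can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the path encoding, the maximal supersets and the modified set enumeration tree are not detailed in the abstract, so phase 1 here simply enumerates every itemset by brute force. The toy transactions, the profit table and the names `utility`, `transaction_utility`, `twu` and `two_phase` are all assumptions made for illustration. The sketch only shows why the transaction weighted utility (TWU) yields "potential" itemsets that a second database scan must still verify.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> purchased quantity.
TRANSACTIONS = [
    {"a": 1, "b": 2},
    {"a": 2, "c": 1},
    {"b": 1, "c": 3},
    {"a": 1, "b": 1, "c": 1},
]
# Hypothetical unit profit of each item.
PROFIT = {"a": 5, "b": 2, "c": 1}


def utility(itemset, transaction):
    """Utility of an itemset in one transaction (0 if not fully contained)."""
    if not all(item in transaction for item in itemset):
        return 0
    return sum(transaction[item] * PROFIT[item] for item in itemset)


def transaction_utility(transaction):
    """Total utility of all items in a transaction."""
    return sum(qty * PROFIT[item] for item, qty in transaction.items())


def twu(itemset):
    """Transaction weighted utility: sum of the utilities of all
    transactions containing the itemset (an upper bound on its utility)."""
    return sum(
        transaction_utility(t)
        for t in TRANSACTIONS
        if all(item in t for item in itemset)
    )


def two_phase(min_util):
    items = sorted(PROFIT)
    # Phase 1 (brute force, for illustration only): keep every itemset
    # whose TWU reaches the threshold as a potential high utility itemset.
    candidates = [
        frozenset(combo)
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
        if twu(combo) >= min_util
    ]
    # Phase 2: scan the database again to compute the actual utility
    # of each candidate and keep only the true high utility itemsets.
    high_utility = {}
    for candidate in candidates:
        actual = sum(utility(candidate, t) for t in TRANSACTIONS)
        if actual >= min_util:
            high_utility[candidate] = actual
    return candidates, high_utility


if __name__ == "__main__":
    candidates, high_utility = two_phase(min_util=15)
    print("potential:", sorted(sorted(c) for c in candidates))
    print("actual:", {tuple(sorted(k)): v for k, v in high_utility.items()})
```

With a threshold of 15 on this toy data, the itemsets {b} and {c} pass the TWU test in phase 1 but are rejected in phase 2, because their actual utilities (8 and 5) fall below the threshold. This overestimation is what makes the second database scan necessary.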


Metadata
Title: High utility itemset mining using path encoding and constrained subset generation
Authors: Vamsinath Javangula, Suvarna Vani Koneru, Haritha Dasari
Publication date: 22-08-2020
Publisher: Springer US
Published in: Peer-to-Peer Networking and Applications, Issue 4/2021
Print ISSN: 1936-6442
Electronic ISSN: 1936-6450
DOI: https://doi.org/10.1007/s12083-020-00980-9
