Published in: Arabian Journal for Science and Engineering 2/2022

09-09-2021 | Research Article-Computer Engineering and Computer Science

High Occupancy Itemset Mining with Consideration of Transaction Occupancy

Authors: Subrata Datta, Kalyani Mali, Udit Ghosh


Abstract

Discovering high occupancy itemsets is an interesting area of research in data mining. Traditional approaches compute occupancy only from the portion of each supporting transaction that the itemset occupies. As a result, they cannot distinguish between the occupancies of the same itemset in different supporting transactions of equal length. Occupancy also reaches its maximum when the itemset size equals the transaction length, which promotes the generation of undesirable itemsets, especially isolated ones. Furthermore, itemsets of equal size receive equal average occupancies even though they appear in different transactions of equal length. To address these issues, this paper introduces the concept of transaction occupancy (TO) and then presents a new computational model of itemset occupancy (IO) that takes transaction occupancy into account. Transaction occupancy refers to the portion of the database occupied by a transaction. The paper proposes an efficient list-structure-based algorithm called HOIMTO (high occupancy itemset mining with transaction occupancy) to discover high occupancy itemsets (HOIs) in transactional databases. It also introduces a new itemset occupancy upper bound (IOUB) to reduce the candidate search space. Experimental studies show the effectiveness of the proposed approach in terms of itemset generation, runtime, memory usage and scalability.
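
The limitations described in the abstract can be illustrated with a small sketch. It assumes the commonly used traditional definition, in which the occupancy of an itemset X in a supporting transaction T is |X|/|T| and the itemset's occupancy is the average of that ratio over all supporting transactions. The `transaction_occupancy` function is only one possible reading of "the occupied portion in the database by the transactions" (the transaction's length divided by the total number of items in the database). The function names and the toy database are hypothetical, and the paper's actual IO model, IOUB and HOIMTO algorithm are not reproduced here.

```python
from fractions import Fraction


def supporting_transactions(itemset, database):
    """Return the transactions that contain every item of the itemset."""
    items = frozenset(itemset)
    return [t for t in database if items <= t]


def traditional_occupancy(itemset, database):
    """Average of |X| / |T| over the supporting transactions (assumed definition)."""
    items = frozenset(itemset)
    supporters = supporting_transactions(items, database)
    if not supporters:
        return Fraction(0)
    ratios = [Fraction(len(items), len(t)) for t in supporters]
    return sum(ratios) / len(ratios)


def transaction_occupancy(transaction, database):
    """Share of all item occurrences in the database held by one transaction.

    This is an assumed interpretation, not the formula from the paper.
    """
    total_items = sum(len(t) for t in database)
    return Fraction(len(transaction), total_items)


# Hypothetical toy database: each transaction is a set of distinct items.
db = [
    frozenset({"a", "b", "c"}),
    frozenset({"d", "e", "f"}),
    frozenset({"g", "h"}),
]

# An itemset as long as its only supporting transaction gets the maximum value.
print(traditional_occupancy({"g", "h"}, db))   # 1

# Equal-size itemsets in different transactions of equal length are tied.
print(traditional_occupancy({"a", "b"}, db))   # 2/3
print(traditional_occupancy({"d", "e"}, db))   # 2/3

# Under the assumed reading, the short isolated transaction holds a smaller share.
print(transaction_occupancy(db[2], db))        # 1/4
print(transaction_occupancy(db[0], db))        # 3/8
```

In this toy example the isolated itemset {g, h} scores highest under the traditional measure even though its transaction holds the smallest share of the database, which is the kind of case the abstract says the proposed model is meant to handle.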


Metadata
Title: High Occupancy Itemset Mining with Consideration of Transaction Occupancy
Authors: Subrata Datta, Kalyani Mali, Udit Ghosh
Publication date: 09-09-2021
Publisher: Springer Berlin Heidelberg
Published in: Arabian Journal for Science and Engineering, Issue 2/2022
Print ISSN: 2193-567X
Electronic ISSN: 2191-4281
DOI: https://doi.org/10.1007/s13369-021-06075-8
