Skip to main content
Top

2019 | OriginalPaper | Chapter

Adaptive Cluster Based Discovery of High Utility Itemsets

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

Utility Itemset Mining (UIM) is a key analysis technique for data which is modeled by the Transactional data model. While improving the computational time and space efficiency of the mining of itemsets is important, it is also critically important to predict future itemsets accurately. In today’s time, when both scientific and business competitive edge is commonly derived from first access to knowledge via advanced predictive ability, this problem becomes increasingly relevant. We established in our most recent work that having prior knowledge of approximate cluster structure of the dataset and using it implicitly in the mining process, can lend itself to accurate prediction of future itemsets. We evaluate the individual strength of each transaction while focusing on itemset prediction, and reshape the transaction utilities based on that. We extend our work by identifying that such reshaping of transaction utilities should be adaptive to the anticipated cluster structure, if there is a specific intended prediction window. We define novel concepts for making such an anticipation and integrate Time Series Forecasting into the evaluation. We perform additional illustrative experiments to demonstrate the application of our improved technique and also discuss future direction for this work.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Appendix
Available only for authorised users
Literature
1.
go back to reference Agrawal, R., Shafer, J.C.: Parallel mining of association rules. IEEE Trans. Knowl. Data Eng. 6, 962–969 (1996)CrossRef Agrawal, R., Shafer, J.C.: Parallel mining of association rules. IEEE Trans. Knowl. Data Eng. 6, 962–969 (1996)CrossRef
2.
go back to reference Agrawal, R., Srikant, R., et al.: Fast algorithms for mining association rules. In: Proceedings of 20th International Conference on Very Large Data Bases, VLDB 1994, vol. 1215, pp. 487–499 (1994) Agrawal, R., Srikant, R., et al.: Fast algorithms for mining association rules. In: Proceedings of 20th International Conference on Very Large Data Bases, VLDB 1994, vol. 1215, pp. 487–499 (1994)
3.
go back to reference Ahmed, C.F., Tanbeer, S.K., Jeong, B.-S., Lee, Y.-K.: Efficient tree structures for high utility pattern mining in incremental databases. IEEE Trans. Knowl. Data Eng. 21(12), 1708–1721 (2009)CrossRef Ahmed, C.F., Tanbeer, S.K., Jeong, B.-S., Lee, Y.-K.: Efficient tree structures for high utility pattern mining in incremental databases. IEEE Trans. Knowl. Data Eng. 21(12), 1708–1721 (2009)CrossRef
4.
go back to reference Alves, R., Rodriguez-Baena, D.S., Aguilar-Ruiz, J.S.: Gene association analysis: a survey of frequent pattern mining from gene expression data. Brief. Bioinform. 11(2), 210–224 (2009)CrossRef Alves, R., Rodriguez-Baena, D.S., Aguilar-Ruiz, J.S.: Gene association analysis: a survey of frequent pattern mining from gene expression data. Brief. Bioinform. 11(2), 210–224 (2009)CrossRef
5.
go back to reference Andreopoulos, B., An, A., Wang, X., Schroeder, M.: A roadmap of clustering algorithms: finding a match for a biomedical application. Brief. Bioinform. 10(3), 297–314 (2009)CrossRef Andreopoulos, B., An, A., Wang, X., Schroeder, M.: A roadmap of clustering algorithms: finding a match for a biomedical application. Brief. Bioinform. 10(3), 297–314 (2009)CrossRef
7.
go back to reference Brijs, T., Swinnen, G., Vanhoof, K., Wets, G.: Using association rules for product assortment decisions: a case study. In: Knowledge Discovery and Data Mining, pp. 254–260 (1999) Brijs, T., Swinnen, G., Vanhoof, K., Wets, G.: Using association rules for product assortment decisions: a case study. In: Knowledge Discovery and Data Mining, pp. 254–260 (1999)
8.
go back to reference Brin, S., Motwani, R., Ullman, J.D., Tsur, S.: Dynamic itemset counting and implication rules for market basket data. In: ACM SIGMOD Record, vol. 26, pp. 255–264. ACM (1997) Brin, S., Motwani, R., Ullman, J.D., Tsur, S.: Dynamic itemset counting and implication rules for market basket data. In: ACM SIGMOD Record, vol. 26, pp. 255–264. ACM (1997)
9.
go back to reference Chan, R.C., Yang, Q., Shen, Y.-D.: Mining high utility itemsets. In: Third IEEE International Conference on Data Mining, ICDM 2003, pp. 19–26. IEEE (2003) Chan, R.C., Yang, Q., Shen, Y.-D.: Mining high utility itemsets. In: Third IEEE International Conference on Data Mining, ICDM 2003, pp. 19–26. IEEE (2003)
10.
go back to reference Chen, K., Liu, L.: The “Best k” for entropy-based categorical data clustering (2005) Chen, K., Liu, L.: The “Best k” for entropy-based categorical data clustering (2005)
11.
go back to reference Guha, S., Rastogi, R., Shim, K.: ROCK: a robust clustering algorithm for categorical attributes. In: Proceedings of 15th International Conference on Data Engineering, pp. 512–521. IEEE (1999) Guha, S., Rastogi, R., Shim, K.: ROCK: a robust clustering algorithm for categorical attributes. In: Proceedings of 15th International Conference on Data Engineering, pp. 512–521. IEEE (1999)
12.
go back to reference Han, J., Pei, J., Yin, Y.: Mining frequent patterns without candidate generation. In: ACM SIGMOD Record, vol. 29, pp. 1–12. ACM (2000) Han, J., Pei, J., Yin, Y.: Mining frequent patterns without candidate generation. In: ACM SIGMOD Record, vol. 29, pp. 1–12. ACM (2000)
13.
go back to reference Huang, Z.: Extensions to the k-means algorithm for clustering large data sets with categorical values. Data Min. Knowl. Discov. 2(3), 283–304 (1998)CrossRef Huang, Z.: Extensions to the k-means algorithm for clustering large data sets with categorical values. Data Min. Knowl. Discov. 2(3), 283–304 (1998)CrossRef
14.
go back to reference Lakhawat, P., Mishra, M., Somani, A.: A clustering based prediction scheme for high utility itemsets. In: Proceedings of the 9th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, pp. 123–134. INSTICC, SciTePress (2017) Lakhawat, P., Mishra, M., Somani, A.: A clustering based prediction scheme for high utility itemsets. In: Proceedings of the 9th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR, pp. 123–134. INSTICC, SciTePress (2017)
15.
go back to reference Lakhawat, P., Mishra, M., Somani, A.K.: A novel clustering algorithm to capture utility information in transactional data. In: KDIR, pp. 456–462 (2016) Lakhawat, P., Mishra, M., Somani, A.K.: A novel clustering algorithm to capture utility information in transactional data. In: KDIR, pp. 456–462 (2016)
16.
go back to reference Li, H.-F., Huang, H.-Y., Chen, Y.-C., Liu, Y.-J., Lee, S.-Y.: Fast and memory efficient mining of high utility itemsets in data streams. In: Eighth IEEE International Conference on Data Mining, ICDM 2008, pp. 881–886. IEEE (2008) Li, H.-F., Huang, H.-Y., Chen, Y.-C., Liu, Y.-J., Lee, S.-Y.: Fast and memory efficient mining of high utility itemsets in data streams. In: Eighth IEEE International Conference on Data Mining, ICDM 2008, pp. 881–886. IEEE (2008)
17.
go back to reference Liao, S.-H., Chu, P.-H., Hsiao, P.-Y.: Data mining techniques and applications-a decade review from 2000 to 2011. Expert. Syst. Appl. 39(12), 11303–11311 (2012)CrossRef Liao, S.-H., Chu, P.-H., Hsiao, P.-Y.: Data mining techniques and applications-a decade review from 2000 to 2011. Expert. Syst. Appl. 39(12), 11303–11311 (2012)CrossRef
18.
go back to reference Liu, Y., Liao, W.-K., Choudhary, A.: A fast high utility itemsets mining algorithm. In: Proceedings of the 1st International Workshop on Utility-based Data Mining, pp. 90–99. ACM (2005) Liu, Y., Liao, W.-K., Choudhary, A.: A fast high utility itemsets mining algorithm. In: Proceedings of the 1st International Workshop on Utility-based Data Mining, pp. 90–99. ACM (2005)
19.
go back to reference Naulaerts, S., et al.: A primer to frequent itemset mining for bioinformatics. Brief. Bioinform. 16(2), 216–231 (2015)CrossRef Naulaerts, S., et al.: A primer to frequent itemset mining for bioinformatics. Brief. Bioinform. 16(2), 216–231 (2015)CrossRef
20.
go back to reference Ngai, E.W., Xiu, L., Chau, D.C.: Application of data mining techniques in customer relationship management: a literature review and classification. Expert. Syst. Appl. 36(2), 2592–2602 (2009)CrossRef Ngai, E.W., Xiu, L., Chau, D.C.: Application of data mining techniques in customer relationship management: a literature review and classification. Expert. Syst. Appl. 36(2), 2592–2602 (2009)CrossRef
22.
go back to reference Seabold, S., Perktold, J.: StatsModels: econometric and statistical modeling with python. In: 9th Python in Science Conference (2010) Seabold, S., Perktold, J.: StatsModels: econometric and statistical modeling with python. In: 9th Python in Science Conference (2010)
23.
go back to reference Toivonen, H., et al.: Sampling large databases for association rules. VLDB 96, 134–145 (1996) Toivonen, H., et al.: Sampling large databases for association rules. VLDB 96, 134–145 (1996)
24.
go back to reference Tseng, V.S., Wu, C.-W., Fournier-Viger, P., Yu, P.S.: Efficient algorithms for mining the concise and lossless representation of high utility itemsets. IEEE Trans. Knowl. Data Eng. 27(3), 726–739 (2015)CrossRef Tseng, V.S., Wu, C.-W., Fournier-Viger, P., Yu, P.S.: Efficient algorithms for mining the concise and lossless representation of high utility itemsets. IEEE Trans. Knowl. Data Eng. 27(3), 726–739 (2015)CrossRef
25.
go back to reference Tseng, V.S., Wu, C.-W., Shie, B.-E., Yu, P.S.: Up-growth: an efficient algorithm for high utility itemset mining. In: Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 253–262. ACM (2010) Tseng, V.S., Wu, C.-W., Shie, B.-E., Yu, P.S.: Up-growth: an efficient algorithm for high utility itemset mining. In: Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 253–262. ACM (2010)
26.
go back to reference Yan, H., Chen, K., Liu, L., Yi, Z.: Scale: a scalable framework for efficiently clustering transactional data. Data Min. Knowl. Discov. 20(1), 1–27 (2010)MathSciNetCrossRef Yan, H., Chen, K., Liu, L., Yi, Z.: Scale: a scalable framework for efficiently clustering transactional data. Data Min. Knowl. Discov. 20(1), 1–27 (2010)MathSciNetCrossRef
27.
go back to reference Zaki, M.J.: Scalable algorithms for association mining. IEEE Trans. Knowl. Data Eng. 12(3), 372–390 (2000)CrossRef Zaki, M.J.: Scalable algorithms for association mining. IEEE Trans. Knowl. Data Eng. 12(3), 372–390 (2000)CrossRef
Metadata
Title
Adaptive Cluster Based Discovery of High Utility Itemsets
Authors
Piyush Lakhawat
Arun Somani
Copyright Year
2019
DOI
https://doi.org/10.1007/978-3-030-15640-4_8