2019 | OriginalPaper | Chapter

Clustering-Based Aggregation of High-Utility Patterns from Unknown Multi-database

Authors: Abhinav Muley, Manish Gudadhe

Published in: Transactions on Computational Science XXXIV

Publisher: Springer Berlin Heidelberg

Abstract

High-utility patterns mined from multiple unknown and heterogeneous databases can be clustered to identify the most valid patterns. Such sources include the internet, journals, and enterprise data. Here, a grid-based clustering method (CLIQUE) is used to aggregate patterns mined from multiple databases. The proposed model forms clusters from all of the utility values reported for a pattern in order to determine its interestingness and the correct interval of its utility measure. The set of all patterns is collected by first mining each database individually, at the local level. A problem arises when the same pattern is identified by all of the databases but with different utility values. In that case, it is difficult to decide whether the pattern should be considered valid, because several utility values are present. Hence, an aggregation model is applied to test whether a pattern satisfies the utility threshold set by a domain expert. We found that the proposed aggregation model effectively clusters all of the interesting patterns and discards those that do not satisfy the threshold condition. The proposed model accurately optimizes the utility interval of the valid patterns.
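
The aggregation idea in the abstract can be illustrated with a minimal one-dimensional, grid-based sketch. It is not the authors' implementation and not the full CLIQUE algorithm: it assumes utilities normalized to [0, 1], and the function names and the parameters `n_bins`, `density`, and `min_utility` are hypothetical choices made only for this illustration.

```python
from typing import List, Optional, Tuple


def unit_index(u: float, n_bins: int) -> int:
    """Map a utility in [0, 1] to its equal-width grid unit (clamped to the grid)."""
    return max(0, min(int(u * n_bins), n_bins - 1))


def aggregate_utilities(
    utilities: List[float],
    n_bins: int = 10,        # hypothetical grid resolution
    density: float = 0.3,    # hypothetical density threshold per grid unit
    min_utility: float = 0.4,  # hypothetical expert-set utility threshold
) -> Optional[Tuple[float, float]]:
    """Return a (low, high) utility interval for one pattern, or None if it is discarded."""
    if not utilities:
        return None
    # Count how many local utility values fall into each grid unit.
    counts = [0] * n_bins
    for u in utilities:
        counts[unit_index(u, n_bins)] += 1
    # A unit is dense when it holds at least `density` of all reported values.
    dense = [c / len(utilities) >= density for c in counts]
    # Merge adjacent dense units into clusters; keep the one covering most values.
    best = None  # (values covered, first unit, last unit)
    i = 0
    while i < n_bins:
        if not dense[i]:
            i += 1
            continue
        j = i
        while j + 1 < n_bins and dense[j + 1]:
            j += 1
        covered = sum(counts[i:j + 1])
        if best is None or covered > best[0]:
            best = (covered, i, j)
        i = j + 1
    if best is None:
        return None  # no dense region: the databases do not agree on a utility
    _, first, last = best
    members = [u for u in utilities if first <= unit_index(u, n_bins) <= last]
    low, high = min(members), max(members)
    # Keep the pattern only if the whole agreed interval meets the threshold.
    return (low, high) if low >= min_utility else None


# One pattern reported by five databases; the outlier 0.10 falls outside the dense region.
print(aggregate_utilities([0.52, 0.55, 0.58, 0.10, 0.57]))
```

In this sketch a pattern is kept only when its utility values concentrate in one dense region of the grid and the lower end of that region meets the threshold; comparing the lower bound rather than, say, the cluster mean is one possible design choice among several.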

Metadata
Title
Clustering-Based Aggregation of High-Utility Patterns from Unknown Multi-database
Authors
Abhinav Muley
Manish Gudadhe
Copyright Year
2019
Publisher
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/978-3-662-59958-7_2