2019 | OriginalPaper | Chapter

Clustering-Based Aggregation of High-Utility Patterns from Unknown Multi-database

Authors: Abhinav Muley, Manish Gudadhe

Published in: Transactions on Computational Science XXXIV

Publisher: Springer Berlin Heidelberg

Abstract

High-utility patterns mined from multiple unknown, heterogeneous databases can be clustered to identify the most valid patterns. Sources include the internet, journals, and enterprise data. Here, a grid-based clustering method (CLIQUE) is used to aggregate patterns mined from multiple databases. The proposed model forms clusters from all the utility values of a pattern to determine its interestingness and the correct interval of its utility measure. The set of all patterns is collected by first mining each database individually, at the local level. A problem arises when the same pattern is identified in all of the databases but with different utility values. In this case, the presence of multiple utility values makes it difficult to decide whether the pattern should be considered valid. Hence, an aggregation model is applied to test whether a pattern satisfies the utility threshold set by a domain expert. We found that the proposed aggregation model effectively clusters all of the interesting patterns and discards those that do not satisfy the threshold condition. The proposed model accurately optimizes the utility interval of the valid patterns.
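The grid-based idea in the abstract can be illustrated with a minimal one-dimensional sketch. This is not the authors' implementation: the function names, the normalization of utilities to [0, 1], the equal-width grid, the density cutoff, and the rule "a pattern is valid if the lower bound of its dense interval meets the expert threshold" are all assumptions made here for illustration only.

```python
from typing import List, Optional, Tuple


def dense_utility_interval(
    utilities: List[float], n_bins: int = 10, density: float = 0.3
) -> Optional[Tuple[float, float]]:
    """Return the longest run of adjacent dense grid cells as (low, high).

    `utilities` holds one pattern's utility values, one per local database,
    assumed to be normalized to [0, 1] (an assumption of this sketch).
    """
    if not utilities:
        return None

    # Count how many utility values fall into each equal-width cell.
    counts = [0] * n_bins
    for u in utilities:
        # Clamp the index so that values at or beyond the bounds stay in range.
        idx = max(0, min(int(u * n_bins), n_bins - 1))
        counts[idx] += 1

    # A cell is "dense" when it holds at least `density` of all values.
    dense = [c / len(utilities) >= density for c in counts]

    best = None   # (start, end) cell indices of the longest dense run, inclusive
    start = None  # start index of the run currently being scanned
    for i, is_dense in enumerate(dense + [False]):  # sentinel closes the last run
        if is_dense and start is None:
            start = i
        elif not is_dense and start is not None:
            if best is None or (i - start) > (best[1] - best[0] + 1):
                best = (start, i - 1)
            start = None

    if best is None:
        return None
    return best[0] / n_bins, (best[1] + 1) / n_bins


def is_valid(
    utilities: List[float], min_utility: float,
    n_bins: int = 10, density: float = 0.3,
) -> bool:
    """Hypothetical validity rule: the dense interval must start at or above
    the expert-set threshold `min_utility`."""
    interval = dense_utility_interval(utilities, n_bins, density)
    return interval is not None and interval[0] >= min_utility


# One pattern reported by six local databases; 0.10 is an outlier.
reported = [0.42, 0.45, 0.48, 0.51, 0.55, 0.10]
print(dense_utility_interval(reported))    # (0.4, 0.6)
print(is_valid(reported, min_utility=0.3))  # True
```

In this sketch the outlying value falls in a sparse cell and is ignored, so the pattern is judged on the interval where most databases agree. The choice of grid width and density cutoff strongly affects the result and would need tuning for real data.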


Title: Clustering-Based Aggregation of High-Utility Patterns from Unknown Multi-database
Authors: Abhinav Muley, Manish Gudadhe
Publisher: Springer Berlin Heidelberg
Book: Transactions on Computational Science XXXIV
Print ISBN: 978-3-662-59957-0

Electronic ISBN: 978-3-662-59958-7

Copyright Year: 2019
DOI: https://doi.org/10.1007/978-3-662-59958-7_2
