Skip to main content
Erschienen in: Pattern Analysis and Applications 2/2023

27.11.2022 | Short Paper

Consensus similarity graph construction for clustering

verfasst von: Tülin İnkaya

Erschienen in: Pattern Analysis and Applications | Ausgabe 2/2023

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

A similarity graph represents the local characteristics of a data set, and it is used as input to various clustering methods including spectral, graph-based, and hierarchical clustering. Several similarity graphs exist in the literature; however, there is not a single similarity graph that can handle all kinds of cluster shapes and structures. In this study, motivated by the successful applications of ensemble approaches to clustering, a generic method for consensus similarity graph construction is proposed. The proposed approach first constructs multiple similarity graphs using bootstrap aggregating (bagging). Then, these graphs are fused into a consensus similarity graph using the normalized co-association matrix. We use k-nearest neighbor, \(\varepsilon\)-neighborhood, fully connected graph, and proximity graphs as the base similarity graphs. Moreover, the proposed approach is coupled with various clustering algorithms including spectral, graph-based, and hierarchical clustering. The experimental results with various spatial and real data sets demonstrate the effectiveness of the consensus similarity graphs in clustering. The proposed approach is also robust to local noise.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Anhänge
Nur mit Berechtigung zugänglich
Literatur
1.
Zurück zum Zitat Aggarwal CC, Reddy CK (2014) Data clustering: algorithms and applications. Chapman &Hall/CRC, USAMATH Aggarwal CC, Reddy CK (2014) Data clustering: algorithms and applications. Chapman &Hall/CRC, USAMATH
2.
Zurück zum Zitat Von Luxburg U (2007) A tutorial on spectral clustering. Stat Comput 17(4):395–416MathSciNet Von Luxburg U (2007) A tutorial on spectral clustering. Stat Comput 17(4):395–416MathSciNet
3.
Zurück zum Zitat Tan P-N, Steinbach M, Kumar V (2013) Data mining cluster analysis: basic concepts and algorithms. Introduction to data mining, 487–533 Tan P-N, Steinbach M, Kumar V (2013) Data mining cluster analysis: basic concepts and algorithms. Introduction to data mining, 487–533
4.
Zurück zum Zitat İnkaya T (2015) A parameter-free similarity graph for spectral clustering. Expert Syst Appl 42(24):9489–9498 İnkaya T (2015) A parameter-free similarity graph for spectral clustering. Expert Syst Appl 42(24):9489–9498
5.
Zurück zum Zitat Nadler B, Galun M (2007) Fundamental limitations of spectral clustering. In: Advances in neural information processing systems, pp. 1017–1024 Nadler B, Galun M (2007) Fundamental limitations of spectral clustering. In: Advances in neural information processing systems, pp. 1017–1024
6.
Zurück zum Zitat Ng AY, Jordan MI, Weiss Y (2002) On spectral clustering: analysis and an algorithm. In: Advances in neural information processing systems pp. 849–856 Ng AY, Jordan MI, Weiss Y (2002) On spectral clustering: analysis and an algorithm. In: Advances in neural information processing systems pp. 849–856
7.
Zurück zum Zitat Zelnik-Manor L, Perona P (2004) Self-tuning spectral clustering. Adv Neural Inf Process Syst 17:1601–1608 Zelnik-Manor L, Perona P (2004) Self-tuning spectral clustering. Adv Neural Inf Process Syst 17:1601–1608
8.
Zurück zum Zitat Zhang X, Li J, Yu H (2011) Local density adaptive similarity measurement for spectral clustering. Pattern Recogn Lett 32(2):352–358 Zhang X, Li J, Yu H (2011) Local density adaptive similarity measurement for spectral clustering. Pattern Recogn Lett 32(2):352–358
9.
Zurück zum Zitat Correa CD, Lindstrom P (2012) Locally-scaled spectral clustering using empty region graphs. In: Proceedings of the 18th ACM SIGKDD international conference on knowledge discovery and data mining, pp. 1330–1338 Correa CD, Lindstrom P (2012) Locally-scaled spectral clustering using empty region graphs. In: Proceedings of the 18th ACM SIGKDD international conference on knowledge discovery and data mining, pp. 1330–1338
10.
Zurück zum Zitat Mishra G, Mohanty SK (2020) Efficient construction of an approximate similarity graph for minimum spanning tree based clustering. Appl Soft Comput 97:106676 Mishra G, Mohanty SK (2020) Efficient construction of an approximate similarity graph for minimum spanning tree based clustering. Appl Soft Comput 97:106676
11.
Zurück zum Zitat Chrysouli C, Tefas A (2015) Spectral clustering and semi-supervised learning using evolving similarity graphs. Appl Soft Comput 34:625–637 Chrysouli C, Tefas A (2015) Spectral clustering and semi-supervised learning using evolving similarity graphs. Appl Soft Comput 34:625–637
12.
Zurück zum Zitat Zang W, Jiang Z, Ren L (2017) Improved spectral clustering based on density combining dna genetic algorithm. Int J Pattern Recognit Artif Intell 31(04):1750010 Zang W, Jiang Z, Ren L (2017) Improved spectral clustering based on density combining dna genetic algorithm. Int J Pattern Recognit Artif Intell 31(04):1750010
13.
Zurück zum Zitat Tan M, Zhang S, Wu L (2020) Mutual knn based spectral clustering. Neural Comput Appl 32(11):6435–6442 Tan M, Zhang S, Wu L (2020) Mutual knn based spectral clustering. Neural Comput Appl 32(11):6435–6442
14.
Zurück zum Zitat Zhou Z-H (2019) Ensemble methods: foundations and algorithms. Chapman and Hall/CRC, USA Zhou Z-H (2019) Ensemble methods: foundations and algorithms. Chapman and Hall/CRC, USA
15.
Zurück zum Zitat Vega-Pons S, Ruiz-Shulcloper J (2011) A survey of clustering ensemble algorithms. Int J Pattern Recognit Artif Intell 25(03):337–372MathSciNet Vega-Pons S, Ruiz-Shulcloper J (2011) A survey of clustering ensemble algorithms. Int J Pattern Recognit Artif Intell 25(03):337–372MathSciNet
16.
Zurück zum Zitat Zhu X, Change Loy C, Gong S (2014) Constructing robust affinity graphs for spectral clustering. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1450–1457 Zhu X, Change Loy C, Gong S (2014) Constructing robust affinity graphs for spectral clustering. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1450–1457
17.
Zurück zum Zitat Beauchemin M (2015) A density-based similarity matrix construction for spectral clustering. Neurocomputing 151:835–844 Beauchemin M (2015) A density-based similarity matrix construction for spectral clustering. Neurocomputing 151:835–844
18.
Zurück zum Zitat Carreira-Perpinán MA, Zemel RS (2005) Proximity graphs for clustering and manifold learning. Adv Neural Inf Process Syst 17:225–232 Carreira-Perpinán MA, Zemel RS (2005) Proximity graphs for clustering and manifold learning. Adv Neural Inf Process Syst 17:225–232
19.
Zurück zum Zitat Premachandran V, Kakarala R (2013) Consensus of k-nns for robust neighborhood selection on graph-based manifolds. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1594–1601 Premachandran V, Kakarala R (2013) Consensus of k-nns for robust neighborhood selection on graph-based manifolds. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1594–1601
20.
Zurück zum Zitat Rokach L (2009) Taxonomy for characterizing ensemble methods in classification tasks: a review and annotated bibliography. Comput Stat Data Anal 53(12):4046–4072MathSciNetMATH Rokach L (2009) Taxonomy for characterizing ensemble methods in classification tasks: a review and annotated bibliography. Comput Stat Data Anal 53(12):4046–4072MathSciNetMATH
21.
Zurück zum Zitat Woźniak M, Grana M, Corchado E (2014) A survey of multiple classifier systems as hybrid systems. Inf Fusion 16:3–17 Woźniak M, Grana M, Corchado E (2014) A survey of multiple classifier systems as hybrid systems. Inf Fusion 16:3–17
22.
Zurück zum Zitat Boongoen T, Iam-On N (2018) Cluster ensembles: a survey of approaches with recent extensions and applications. Comput Sci Rev 28:1–25MathSciNetMATH Boongoen T, Iam-On N (2018) Cluster ensembles: a survey of approaches with recent extensions and applications. Comput Sci Rev 28:1–25MathSciNetMATH
23.
Zurück zum Zitat Ghaemi R, Sulaiman MN, Ibrahim H, Mustapha N et al (2009) A survey: clustering ensembles techniques. World Acad Sci Eng Technol 50:636–645 Ghaemi R, Sulaiman MN, Ibrahim H, Mustapha N et al (2009) A survey: clustering ensembles techniques. World Acad Sci Eng Technol 50:636–645
24.
Zurück zum Zitat Strehl A, Ghosh J (2002) Cluster ensembles: a knowledge reuse framework for combining multiple partitions. J Mach Learn Res 3:583–617MathSciNetMATH Strehl A, Ghosh J (2002) Cluster ensembles: a knowledge reuse framework for combining multiple partitions. J Mach Learn Res 3:583–617MathSciNetMATH
25.
Zurück zum Zitat Fred AL, Jain AK (2002) Data clustering using evidence accumulation. Object Recognit Support User Interact Service Robots 4:276–280 Fred AL, Jain AK (2002) Data clustering using evidence accumulation. Object Recognit Support User Interact Service Robots 4:276–280
26.
Zurück zum Zitat Fred AL, Jain AK (2005) Combining multiple clusterings using evidence accumulation. IEEE Trans Pattern Anal Mach Intell 27(6):835–850 Fred AL, Jain AK (2005) Combining multiple clusterings using evidence accumulation. IEEE Trans Pattern Anal Mach Intell 27(6):835–850
27.
Zurück zum Zitat Dudoit S, Fridlyand J (2003) Bagging to improve the accuracy of a clustering procedure. Bioinformatics 19(9):1090–1099 Dudoit S, Fridlyand J (2003) Bagging to improve the accuracy of a clustering procedure. Bioinformatics 19(9):1090–1099
28.
Zurück zum Zitat Fischer B, Buhmann JM (2003) Path-based clustering for grouping of smooth curves and texture segmentation. IEEE Trans Pattern Anal Mach Intell 25(4):513–518 Fischer B, Buhmann JM (2003) Path-based clustering for grouping of smooth curves and texture segmentation. IEEE Trans Pattern Anal Mach Intell 25(4):513–518
29.
Zurück zum Zitat Wang X, Yang C, Zhou J (2009) Clustering aggregation by probability accumulation. Pattern Recogn 42(5):668–675MATH Wang X, Yang C, Zhou J (2009) Clustering aggregation by probability accumulation. Pattern Recogn 42(5):668–675MATH
30.
Zurück zum Zitat Lourenço A, Bulo SR, Rebagliati N, Fred AL, Figueiredo MA, Pelillo M (2015) Probabilistic consensus clustering using evidence accumulation. Mach Learn 98(1):331–357MathSciNetMATH Lourenço A, Bulo SR, Rebagliati N, Fred AL, Figueiredo MA, Pelillo M (2015) Probabilistic consensus clustering using evidence accumulation. Mach Learn 98(1):331–357MathSciNetMATH
31.
Zurück zum Zitat Huang D, Wang C-D, Lai J-H (2017) Locally weighted ensemble clustering. IEEE Trans Cybern 48(5):1460–1473 Huang D, Wang C-D, Lai J-H (2017) Locally weighted ensemble clustering. IEEE Trans Cybern 48(5):1460–1473
32.
Zurück zum Zitat Zhong C, Hu L, Yue X, Luo T, Fu Q, Xu H (2019) Ensemble clustering based on evidence extracted from the co-association matrix. Pattern Recogn 92:93–106 Zhong C, Hu L, Yue X, Luo T, Fu Q, Xu H (2019) Ensemble clustering based on evidence extracted from the co-association matrix. Pattern Recogn 92:93–106
33.
Zurück zum Zitat Zhong C, Luo T, Yue X (2018) Cluster ensemble based on iteratively refined co-association matrix. IEEE Access 6:69210–69223 Zhong C, Luo T, Yue X (2018) Cluster ensemble based on iteratively refined co-association matrix. IEEE Access 6:69210–69223
34.
Zurück zum Zitat Zhong C, Yue X, Zhang Z, Lei J (2015) A clustering ensemble: Two-level-refined co-association matrix with path-based transformation. Pattern Recogn 48(8):2699–2709MATH Zhong C, Yue X, Zhang Z, Lei J (2015) A clustering ensemble: Two-level-refined co-association matrix with path-based transformation. Pattern Recogn 48(8):2699–2709MATH
35.
Zurück zum Zitat Ayad HG, Kamel MS (2010) On voting-based consensus of cluster ensembles. Pattern Recogn 43(5):1943–1953MATH Ayad HG, Kamel MS (2010) On voting-based consensus of cluster ensembles. Pattern Recogn 43(5):1943–1953MATH
36.
Zurück zum Zitat Fern XZ, Brodley CE (2004) Solving cluster ensemble problems by bipartite graph partitioning. In: Proceedings of the twenty-first international conference on machine learning, p. 36 Fern XZ, Brodley CE (2004) Solving cluster ensemble problems by bipartite graph partitioning. In: Proceedings of the twenty-first international conference on machine learning, p. 36
37.
Zurück zum Zitat Karypis G, Kumar V (1998) A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM J Sci Comput 20(1):359–392MathSciNetMATH Karypis G, Kumar V (1998) A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM J Sci Comput 20(1):359–392MathSciNetMATH
38.
Zurück zum Zitat Topchy A, Jain AK, Punch W (2005) Clustering ensembles: models of consensus and weak partitions. IEEE Trans Pattern Anal Mach Intell 27(12):1866–1881 Topchy A, Jain AK, Punch W (2005) Clustering ensembles: models of consensus and weak partitions. IEEE Trans Pattern Anal Mach Intell 27(12):1866–1881
39.
Zurück zum Zitat Topchy A, Jain AK, Punch W (2004) A mixture model for clustering ensembles. In: Proceedings of the 2004 SIAM international conference on data mining, pp. 379–390 Topchy A, Jain AK, Punch W (2004) A mixture model for clustering ensembles. In: Proceedings of the 2004 SIAM international conference on data mining, pp. 379–390
40.
Zurück zum Zitat Mimaroglu S, Erdil E (2011) Combining multiple clusterings using similarity graph. Pattern Recogn 44(3):694–703MATH Mimaroglu S, Erdil E (2011) Combining multiple clusterings using similarity graph. Pattern Recogn 44(3):694–703MATH
41.
Zurück zum Zitat Hamidi SS, Akbari E, Motameni H (2019) Consensus clustering algorithm based on the automatic partitioning similarity graph. Data Knowl Eng 124:101754 Hamidi SS, Akbari E, Motameni H (2019) Consensus clustering algorithm based on the automatic partitioning similarity graph. Data Knowl Eng 124:101754
42.
Zurück zum Zitat Lu Y, Wan Y (2013) Pha: a fast potential-based hierarchical agglomerative clustering method. Pattern Recogn 46(5):1227–1239 Lu Y, Wan Y (2013) Pha: a fast potential-based hierarchical agglomerative clustering method. Pattern Recogn 46(5):1227–1239
43.
Zurück zum Zitat Belhadj S, Attia A, Adnane BA, Ahmed-Foitih Z, Ahmed AT (2016) A novel epileptic seizure detection using fast potential-based hierarchical agglomerative clustering based on emd. Int J Comput Sci Netw Secur 16(5):7–12 Belhadj S, Attia A, Adnane BA, Ahmed-Foitih Z, Ahmed AT (2016) A novel epileptic seizure detection using fast potential-based hierarchical agglomerative clustering based on emd. Int J Comput Sci Netw Secur 16(5):7–12
44.
Zurück zum Zitat Attia A, Frahta N, Moussaoui A, Belhadj S (2016) An efficient fmri data clustering method using pha algorithm and dynamic time warping. Int J Comput Sci Inf Secur 14(5):222–230 Attia A, Frahta N, Moussaoui A, Belhadj S (2016) An efficient fmri data clustering method using pha algorithm and dynamic time warping. Int J Comput Sci Inf Secur 14(5):222–230
45.
Zurück zum Zitat Cai Z, Yang X, Huang T, Zhu W (2020) A new similarity combining reconstruction coefficient with pairwise distance for agglomerative clustering. Inf Sci 508:173–182MathSciNet Cai Z, Yang X, Huang T, Zhu W (2020) A new similarity combining reconstruction coefficient with pairwise distance for agglomerative clustering. Inf Sci 508:173–182MathSciNet
46.
Zurück zum Zitat Lu Y, Hou X, Chen X (2016) A novel travel-time based similarity measure for hierarchical clustering. Neurocomputing 173:3–8 Lu Y, Hou X, Chen X (2016) A novel travel-time based similarity measure for hierarchical clustering. Neurocomputing 173:3–8
47.
Zurück zum Zitat Brito MR, Chávez EL, Quiroz AJ, Yukich JE (1997) Connectivity of the mutual k-nearest-neighbor graph in clustering and outlier detection. Stat Probab Lett 35(1):33–42MathSciNetMATH Brito MR, Chávez EL, Quiroz AJ, Yukich JE (1997) Connectivity of the mutual k-nearest-neighbor graph in clustering and outlier detection. Stat Probab Lett 35(1):33–42MathSciNetMATH
48.
Zurück zum Zitat Duda R, Hart P, Stork D (2012) Pattern classification. Wiley, New YorkMATH Duda R, Hart P, Stork D (2012) Pattern classification. Wiley, New YorkMATH
49.
Zurück zum Zitat Yao AC-C (1982) On constructing minimum spanning trees in k-dimensional spaces and related problems. SIAM J Comput 11(4):721–736MathSciNetMATH Yao AC-C (1982) On constructing minimum spanning trees in k-dimensional spaces and related problems. SIAM J Comput 11(4):721–736MathSciNetMATH
52.
Zurück zum Zitat Topchy A, Minaei-Bidgoli B, Jain AK, Punch WF (2004) Adaptive clustering ensembles. In: Proceedings of the 17th international conference on pattern recognition. ICPR 2004, pp. 272–275 Topchy A, Minaei-Bidgoli B, Jain AK, Punch WF (2004) Adaptive clustering ensembles. In: Proceedings of the 17th international conference on pattern recognition. ICPR 2004, pp. 272–275
53.
Zurück zum Zitat Casa A, Scrucca L, Menardi G (2021) Better than the best? answers via model ensemble in density-based clustering. Adv Data Anal Classif 15(3):599–623MathSciNetMATH Casa A, Scrucca L, Menardi G (2021) Better than the best? answers via model ensemble in density-based clustering. Adv Data Anal Classif 15(3):599–623MathSciNetMATH
54.
Zurück zum Zitat Jain AK, Law MH (2005) Data clustering: A user’s dilemma. In: International conference on pattern recognition and machine intelligence, pp. 1–10. Springer Jain AK, Law MH (2005) Data clustering: A user’s dilemma. In: International conference on pattern recognition and machine intelligence, pp. 1–10. Springer
55.
Zurück zum Zitat Liu D, Nosovskiy GV, Sourina O (2008) Effective clustering and boundary detection algorithm based on delaunay triangulation. Pattern Recogn Lett 29(9):1261–1273MATH Liu D, Nosovskiy GV, Sourina O (2008) Effective clustering and boundary detection algorithm based on delaunay triangulation. Pattern Recogn Lett 29(9):1261–1273MATH
56.
Zurück zum Zitat Ultsch A (2005) Clustering wih som: U*c. In: Proceedings of the workshop on self-organizing maps, pp. 75–82 Ultsch A (2005) Clustering wih som: U*c. In: Proceedings of the workshop on self-organizing maps, pp. 75–82
58.
Zurück zum Zitat Shi J, Malik J (2000) Normalized cuts and image segmentation. IEEE Trans Pattern Anal Mach Intell 22(8):888–905 Shi J, Malik J (2000) Normalized cuts and image segmentation. IEEE Trans Pattern Anal Mach Intell 22(8):888–905
59.
Zurück zum Zitat Zahn CT (1971) Graph-theoretical methods for detecting and describing gestalt clusters. IEEE Trans Comput 100(1):68–86MATH Zahn CT (1971) Graph-theoretical methods for detecting and describing gestalt clusters. IEEE Trans Comput 100(1):68–86MATH
60.
Zurück zum Zitat Han J, Pei J, Kamber M (2011) Data mining: concepts and techniques. Morgan Kaufmann, BurlingtonMATH Han J, Pei J, Kamber M (2011) Data mining: concepts and techniques. Morgan Kaufmann, BurlingtonMATH
61.
Zurück zum Zitat Rand WM (1971) Objective criteria for the evaluation of clustering methods. J Am Stat Assoc 66(336):846–850 Rand WM (1971) Objective criteria for the evaluation of clustering methods. J Am Stat Assoc 66(336):846–850
62.
Zurück zum Zitat Fred A, Jain AK (2003) Robust data clustering. In: Proceedings of the 2003 IEEE Computer society conference on computer vision and pattern recognition, pp. II–II Fred A, Jain AK (2003) Robust data clustering. In: Proceedings of the 2003 IEEE Computer society conference on computer vision and pattern recognition, pp. II–II
63.
Zurück zum Zitat Wilcoxon F (1992) Individual comparisons by ranking methods. In: Johnson NL (ed) Breakthroughs in statistics. Springer, New York, pp 196–202 Wilcoxon F (1992) Individual comparisons by ranking methods. In: Johnson NL (ed) Breakthroughs in statistics. Springer, New York, pp 196–202
64.
Zurück zum Zitat Bauer E, Kohavi R (1999) An empirical comparison of voting classification algorithms: bagging, boosting, and variants. Mach Learn 36(1):105–139 Bauer E, Kohavi R (1999) An empirical comparison of voting classification algorithms: bagging, boosting, and variants. Mach Learn 36(1):105–139
Metadaten
Titel
Consensus similarity graph construction for clustering
verfasst von
Tülin İnkaya
Publikationsdatum
27.11.2022
Verlag
Springer London
Erschienen in
Pattern Analysis and Applications / Ausgabe 2/2023
Print ISSN: 1433-7541
Elektronische ISSN: 1433-755X
DOI
https://doi.org/10.1007/s10044-022-01116-w

Weitere Artikel der Ausgabe 2/2023

Pattern Analysis and Applications 2/2023 Zur Ausgabe

Premium Partner