Published in: International Journal of Machine Learning and Cybernetics 7/2018

17-02-2017 | Original Article

Combined constraint-based with metric-based in semi-supervised clustering ensemble

Authors: Siting Wei, Zhixin Li, Canlong Zhang


Abstract

Semi-supervised clustering and cluster ensembles have both received considerable attention recently because of their accurate and reliable performance. Existing semi-supervised clustering algorithms fall mainly into two kinds: constraint-based and metric-based. In this paper, we present a semi-supervised clustering ensemble approach that takes both pairwise constraints and a metric measure into account. First, with the help of supervised information, including pairwise constraints and labeled data, the approach generates different base clustering partitions using constraint-based and metric-based semi-supervised clustering respectively; for the latter, we develop a new metric function. Given the spatial particularity of image pixels, the metric considers the spatial distribution of surrounding pixels, in addition to the inherent features of each pixel, during image feature extraction. The target clustering is then obtained by integrating the base clustering partitions through an ensemble function. Finally, we verify the approach experimentally on general data sets and image data sets and compare its clustering performance with that of other approaches. Both theoretical analysis and experimental results demonstrate that the proposed method considerably improves clustering accuracy and yields better clustering results than a number of representative clustering methods.
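
To make the integration step concrete, the following is a minimal sketch of one common way to combine base partitions: co-association voting followed by linking pairs that most partitions place together. It is not the ensemble function of the paper, which the abstract does not specify. The function names, the example label lists, and the 0.5 threshold are all assumptions chosen for illustration.

```python
from itertools import combinations


def co_association(partitions):
    """Fraction of base partitions that put each pair of points in the same cluster."""
    n = len(partitions[0])
    m = len(partitions)
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / m for c in row] for row in counts]


def consensus(partitions, threshold=0.5):
    """Link pairs whose co-association exceeds `threshold` (an assumed value)
    and return the connected components as consensus cluster labels."""
    n = len(partitions[0])
    co = co_association(partitions)
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if co[i][j] > threshold:
            parent[find(i)] = find(j)

    # Relabel component roots as 0, 1, 2, ... in order of first appearance.
    ids = {}
    return [ids.setdefault(find(i), len(ids)) for i in range(n)]


# Hypothetical base partitions of five points, e.g. produced by a
# constraint-based and a metric-based semi-supervised clusterer.
base_partitions = [
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
    [1, 1, 0, 0, 0],
]
print(consensus(base_partitions))  # [0, 0, 1, 1, 1]
```

Because co-association depends only on whether two points share a label, the third partition agrees with the first even though its label names are swapped. Point 2 is grouped with points 3 and 4 because two of the three partitions place it there.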

Metadata
Title
Combined constraint-based with metric-based in semi-supervised clustering ensemble
Authors
Siting Wei
Zhixin Li
Canlong Zhang
Publication date
17-02-2017
Publisher
Springer Berlin Heidelberg
Published in
International Journal of Machine Learning and Cybernetics / Issue 7/2018
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-016-0628-6
