Skip to main content
Top
Published in: Knowledge and Information Systems 2/2013

01-08-2013 | Regular Paper

Non-unique cluster numbers determination methods based on stability in spectral clustering

Authors: Sumuya Borjigin, Chonghui Guo

Published in: Knowledge and Information Systems | Issue 2/2013

Log in

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

Recently, a large amount of work has been devoted to the study of spectral clustering—a simple yet powerful method for finding structure in a data set using spectral properties of an associated pairwise similarity matrix. Most of the existing spectral clustering algorithms estimate only one cluster number or estimate non-unique cluster numbers based on eigengap criterion. However, the number of clusters not always exists one, and eigengap criterion lacks theoretical justification. In this paper, we propose non-unique cluster numbers determination methods based on stability in spectral clustering (NCNDBS). We first utilize the multiway normalized cut spectral clustering algorithm to cluster data set for a candidate cluster number \(k\). Then the ratio value of the multiway normalized cut criterion of the obtained clusters and the sum of the leading eigenvalues (descending sort) of the stochastic transition matrix is chosen as a standard to decide whether the \(k\) is a reasonable cluster number. At last, by varying the scaling parameter in the Gaussian function, we judge whether the reasonable cluster number \(k\) is also a stability one. By three stages, we can determine non-unique cluster numbers of a data set. The Lumpability theorem concluded by Meil\(\breve{a}\) and Xu provides a theoretical base for our methods. NCNDBS can estimate non-unique cluster numbers of the data set successfully by illustrative experiments.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Literature
1.
go back to reference Azran A, Ghahramani Z (2006) Spectral methods for automatic multiscale data clustering. In: Proceedings of the 2006 IEEE computer society conference on computer vision and pattern recognition. IEEE Computer Society, Washington Azran A, Ghahramani Z (2006) Spectral methods for automatic multiscale data clustering. In: Proceedings of the 2006 IEEE computer society conference on computer vision and pattern recognition. IEEE Computer Society, Washington
2.
go back to reference Chen Y, Rege M, Dong M, Hua J (2008) Non-negative matrix factorization for semi-supervised data clustering. Knowl Inf Syst 17:355–379CrossRef Chen Y, Rege M, Dong M, Hua J (2008) Non-negative matrix factorization for semi-supervised data clustering. Knowl Inf Syst 17:355–379CrossRef
3.
go back to reference Climescu-Haulica A (2007) How to choose the number of clusters: the Cramer multiplicity solution. Studies in classification, data analysis, and knowledge organization. Springer, Berlin, pp 15–22 Climescu-Haulica A (2007) How to choose the number of clusters: the Cramer multiplicity solution. Studies in classification, data analysis, and knowledge organization. Springer, Berlin, pp 15–22
4.
go back to reference Li T (2008) Clustering based on matrix spproximation: a unifying view. Knowl Inf Syst 17:1–15MATHCrossRef Li T (2008) Clustering based on matrix spproximation: a unifying view. Knowl Inf Syst 17:1–15MATHCrossRef
6.
go back to reference Meil\(\breve{a}\) M, Shi J, (2000) Learning segmentation by random walks. In: Todd L, Thomas D, Volker T (eds) Neural information processing systems, Denver, USA, December 2000. Advances in neural information processing systems. MIT Press, Cambridge, pp 873–879 Meil\(\breve{a}\) M, Shi J, (2000) Learning segmentation by random walks. In: Todd L, Thomas D, Volker T (eds) Neural information processing systems, Denver, USA, December 2000. Advances in neural information processing systems. MIT Press, Cambridge, pp 873–879
7.
go back to reference Meil\(\breve{a}\) M, Xu L (2003) Multiway cuts and spectral clustering. Technical Report 442: University of Washington. Meil\(\breve{a}\) M, Xu L (2003) Multiway cuts and spectral clustering. Technical Report 442: University of Washington.
8.
go back to reference Milligan G, Cooper M (1985) An examination of procedures for determining number of clusters in a data set. Psychometrika 50:159–179CrossRef Milligan G, Cooper M (1985) An examination of procedures for determining number of clusters in a data set. Psychometrika 50:159–179CrossRef
9.
go back to reference Nagai A (2007) Inappropriateness of the criterion of K-way normalized cuts for deciding the number of clusters. Pattern Recognit Lett 28:1981–1986CrossRef Nagai A (2007) Inappropriateness of the criterion of K-way normalized cuts for deciding the number of clusters. Pattern Recognit Lett 28:1981–1986CrossRef
10.
go back to reference Nascimento M, Carvalho A (2011) Spectral methods for graph clustering: a survey. Eur J Oper Res 211(2):221–231MATHCrossRef Nascimento M, Carvalho A (2011) Spectral methods for graph clustering: a survey. Eur J Oper Res 211(2):221–231MATHCrossRef
11.
go back to reference Ng A, Jordan M, Weiss Y (2001) On spectral clustering: analysis and an algorithm. In: Dietterich TG, Becker S, Ghahramani Z (eds) Neural information processing systems, Denver, USA, (December 2001). Advances in neural information processing systems. MIT Press, Cambridge, pp 849–856 Ng A, Jordan M, Weiss Y (2001) On spectral clustering: analysis and an algorithm. In: Dietterich TG, Becker S, Ghahramani Z (eds) Neural information processing systems, Denver, USA, (December 2001). Advances in neural information processing systems. MIT Press, Cambridge, pp 849–856
12.
go back to reference Sanguinetti G, Laidler J, Lawrence N (2005) A probabilistic approach to spectral clustering: using KL divergence to find good clusters. Statistics and Optimization of Clustering Workshop, London Sanguinetti G, Laidler J, Lawrence N (2005) A probabilistic approach to spectral clustering: using KL divergence to find good clusters. Statistics and Optimization of Clustering Workshop, London
13.
go back to reference Sanguinetti G, Laidler J, Lawrence N (2005) Automatic determination of the number of clusters using spectral algorithms. Mystic, Proceedings of machine learning for signal processing Sanguinetti G, Laidler J, Lawrence N (2005) Automatic determination of the number of clusters using spectral algorithms. Mystic, Proceedings of machine learning for signal processing
14.
go back to reference Shi J, Malik J (2000) Normalized cuts and image segmentation. IEEE Trans Pattern Anal Mach Intell 22(8):888–905CrossRef Shi J, Malik J (2000) Normalized cuts and image segmentation. IEEE Trans Pattern Anal Mach Intell 22(8):888–905CrossRef
15.
go back to reference Stewart G, Sun J (2001) Matrix perturbation theory, 2nd edn. Academic, New York Stewart G, Sun J (2001) Matrix perturbation theory, 2nd edn. Academic, New York
16.
go back to reference Sumuya Guo C, Zang Y (2011) Cluster number estimation based on normalized cut criterion in spectral clustering. ICIC Express Lett 5(1):155–161 Sumuya Guo C, Zang Y (2011) Cluster number estimation based on normalized cut criterion in spectral clustering. ICIC Express Lett 5(1):155–161
17.
go back to reference Takacs B, Demiris Y (2010) Spectral clustering in multi-agent systems. Knowl Inf Syst 25:607–622CrossRef Takacs B, Demiris Y (2010) Spectral clustering in multi-agent systems. Knowl Inf Syst 25:607–622CrossRef
18.
go back to reference Tepper M, Musé P, Almansa A, Mejail M (2011) Automatically finding clusters in normalized cuts. Pattern Recognit 44:1372–1386MATHCrossRef Tepper M, Musé P, Almansa A, Mejail M (2011) Automatically finding clusters in normalized cuts. Pattern Recognit 44:1372–1386MATHCrossRef
19.
20.
go back to reference Wang C, Li W, Ding L, Tian J, Chen S (2005) Image segmentation using spectral clustering. Proceedings of the 17th IEEE international conference on tools with artificial intelligence. IEEE Computer Society, Washington, pp 677–678 Wang C, Li W, Ding L, Tian J, Chen S (2005) Image segmentation using spectral clustering. Proceedings of the 17th IEEE international conference on tools with artificial intelligence. IEEE Computer Society, Washington, pp 677–678
21.
go back to reference Xiang T, Gong S (2008) Spectral clustering with eigenvector selection. Pattern Recognit 41(3):1012–1029MATHCrossRef Xiang T, Gong S (2008) Spectral clustering with eigenvector selection. Pattern Recognit 41(3):1012–1029MATHCrossRef
22.
go back to reference Xu R, Donald W (2009) Clustering. Wiley, New Jersey Xu R, Donald W (2009) Clustering. Wiley, New Jersey
23.
go back to reference Zelnik-Manor L, Perona P (2004) Self-tuning spectral clustering. In: Lawrence S, Yair W, Léon B (eds) Neural information processing systems, Vancouver, Canada, December 2004. Advances in neural information processing systems. MIT Press, Cambridge, pp 1601–1608 Zelnik-Manor L, Perona P (2004) Self-tuning spectral clustering. In: Lawrence S, Yair W, Léon B (eds) Neural information processing systems, Vancouver, Canada, December 2004. Advances in neural information processing systems. MIT Press, Cambridge, pp 1601–1608
24.
25.
go back to reference Zheng X, Lin X (2004) Automatic determination of intrinsic cluster number family in spectral clustering using random walk on graph. International conference on image processing, Singapore, October 2004. IEEE Computer Society, Singapore, pp 3471–3474 Zheng X, Lin X (2004) Automatic determination of intrinsic cluster number family in spectral clustering using random walk on graph. International conference on image processing, Singapore, October 2004. IEEE Computer Society, Singapore, pp 3471–3474
Metadata
Title
Non-unique cluster numbers determination methods based on stability in spectral clustering
Authors
Sumuya Borjigin
Chonghui Guo
Publication date
01-08-2013
Publisher
Springer London
Published in
Knowledge and Information Systems / Issue 2/2013
Print ISSN: 0219-1377
Electronic ISSN: 0219-3116
DOI
https://doi.org/10.1007/s10115-012-0547-0

Other articles of this Issue 2/2013

Knowledge and Information Systems 2/2013 Go to the issue

Premium Partner