2016 | OriginalPaper | Book chapter

9. Clustering

Written by: Thomas A. Runkler

Published in: Data Analytics

Publisher: Springer Fachmedien Wiesbaden

Abstract

Clustering is an unsupervised learning task that assigns labels to objects in unlabeled data. When clustering is performed on data that do have physical classes, the clusters may or may not correspond to the physical classes. Cluster partitions may be mathematically represented by sets, partition matrices, and/or cluster prototypes. Sequential clustering (single linkage, complete linkage, average linkage, Ward’s method, etc.) is easily implemented but computationally expensive. Partitional clustering can be based on hard, fuzzy, possibilistic, or noise clustering models. Cluster prototypes can take many forms, such as hyperspheres, ellipsoids, lines, circles, or more complex shapes. Relational clustering models find clusters in relational data. Complex relational clusters can be found by kernelization. Cluster tendency assessment determines whether the data contain clusters at all, and cluster validity measures help identify an appropriate number of clusters. Clustering can also be done by heuristic methods such as the self-organizing map.
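The three representations named in the abstract (sets, partition matrices, and cluster prototypes) can be illustrated with a small sketch of a hard c-means style procedure. This is not the chapter's own code: the function names `hard_c_means`, `partition_matrix`, and `cluster_sets`, the toy data, and the choice of initial prototypes are all assumptions made for illustration, and Euclidean distance with point prototypes is only one of the many variants the abstract lists.

```python
import math


def hard_c_means(points, prototypes, max_iter=100):
    """Alternate nearest-prototype assignment and prototype update.

    Illustrative sketch: point prototypes, Euclidean distance,
    caller-supplied initial prototypes.
    """
    prototypes = [tuple(p) for p in prototypes]
    labels = []
    for _ in range(max_iter):
        # Assign each point to the index of its nearest prototype.
        labels = [
            min(range(len(prototypes)), key=lambda i: math.dist(x, prototypes[i]))
            for x in points
        ]
        # Move each prototype to the mean of the points assigned to it.
        updated = []
        for i, old in enumerate(prototypes):
            members = [x for x, lab in zip(points, labels) if lab == i]
            if members:
                updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                updated.append(old)  # an empty cluster keeps its prototype
        if updated == prototypes:  # stop once the prototypes no longer move
            break
        prototypes = updated
    return labels, prototypes


def partition_matrix(labels, c):
    """Return a c x n matrix U with U[i][k] = 1 if point k is in cluster i."""
    return [[1 if lab == i else 0 for lab in labels] for i in range(c)]


def cluster_sets(labels, c):
    """Return the same partition as a list of index sets, one per cluster."""
    return [{k for k, lab in enumerate(labels) if lab == i} for i in range(c)]


# Toy data (assumed): two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
labels, prototypes = hard_c_means(points, prototypes=[points[0], points[3]])
U = partition_matrix(labels, 2)
sets = cluster_sets(labels, 2)
```

In a hard partition every column of `U` contains exactly one 1. Fuzzy and possibilistic models, which the abstract also mentions, would instead store membership values between 0 and 1 in the same matrix.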

Metadata
Title
Clustering
Written by
Thomas A. Runkler
Copyright year
2016
DOI
https://doi.org/10.1007/978-3-658-14075-5_9