Skip to main content
Top

2018 | OriginalPaper | Chapter

Improvement of the Simplified Silhouette Validity Index

Authors : Artur Starczewski, Krzysztof Przybyszewski

Published in: Artificial Intelligence and Soft Computing

Publisher: Springer International Publishing

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

The fundamental issue of data clustering is an evaluation of results of clustering algorithms. Lots of methods have been proposed for cluster validation. The most popular approach is based on internal cluster validity indices. Among this kind of indices, the Silhouette index and its computationally simpled version, i.e. the Simplified Silhouette, are frequently used. In this paper modification of the Simplified Silhouette index is proposed. The suggested approach is based on using an additional component, which improves clusters validity assessment. The performance of the new cluster validity indices has been demonstrated for artificial and real datasets, where the PAM clustering algorithm has been applied as the underlying clustering technique.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literature
1.
go back to reference Arbelaitz, O., Gurrutxaga, I., Muguerza, J., Prez, J.M., Perona, I.: An extensive comparative study of cluster validity indices. Pattern Recogn. 46, 243–256 (2013)CrossRef Arbelaitz, O., Gurrutxaga, I., Muguerza, J., Prez, J.M., Perona, I.: An extensive comparative study of cluster validity indices. Pattern Recogn. 46, 243–256 (2013)CrossRef
2.
go back to reference Bilski, J., Smoląg, J.: Parallel architectures for learning the RTRN and Elman dynamic neural networks. IEEE Trans. Parallel Distrib. Syst. 26(9), 2561–2570 (2015)CrossRef Bilski, J., Smoląg, J.: Parallel architectures for learning the RTRN and Elman dynamic neural networks. IEEE Trans. Parallel Distrib. Syst. 26(9), 2561–2570 (2015)CrossRef
3.
go back to reference Bilski, J., Wilamowski, B.M.: Parallel learning of feedforward neural networks without error backpropagation. In: Rutkowski, L., Korytkowski, M., Scherer, R., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds.) ICAISC 2016. LNCS (LNAI), vol. 9692, pp. 57–69. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-39378-0_6CrossRef Bilski, J., Wilamowski, B.M.: Parallel learning of feedforward neural networks without error backpropagation. In: Rutkowski, L., Korytkowski, M., Scherer, R., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds.) ICAISC 2016. LNCS (LNAI), vol. 9692, pp. 57–69. Springer, Cham (2016). https://​doi.​org/​10.​1007/​978-3-319-39378-0_​6CrossRef
5.
go back to reference Bradley, P., Fayyad, U.: Refining initial points for k-means clustering. In: Proceedings of the Fifteenth International Conference on Knowledge Discovery and Data Mining, pp. 9–15. AAAI Press, New York (1998) Bradley, P., Fayyad, U.: Refining initial points for k-means clustering. In: Proceedings of the Fifteenth International Conference on Knowledge Discovery and Data Mining, pp. 9–15. AAAI Press, New York (1998)
7.
go back to reference Cpałka, K., Rebrova, O., Nowicki, R., Rutkowski, L.: On design of flexible neuro-fuzzy systems for nonlinear modelling. Int. J. Gen. Syst. 42(6), 706–720 (2013)MATHCrossRef Cpałka, K., Rebrova, O., Nowicki, R., Rutkowski, L.: On design of flexible neuro-fuzzy systems for nonlinear modelling. Int. J. Gen. Syst. 42(6), 706–720 (2013)MATHCrossRef
8.
go back to reference Cpałka, K., Rutkowski, L.: Flexible Takagi-Sugeno fuzzy systems. In: Proceedings of the 2005 IEEE International Joint Conference on Neural Networks, IJCNN (2005) Cpałka, K., Rutkowski, L.: Flexible Takagi-Sugeno fuzzy systems. In: Proceedings of the 2005 IEEE International Joint Conference on Neural Networks, IJCNN (2005)
10.
go back to reference Fränti, P., Rezaei, M., Zhao, Q.: Centroid index: cluster level similarity measure. Pattern Recogn. 47(9), 3034–3045 (2014)CrossRef Fränti, P., Rezaei, M., Zhao, Q.: Centroid index: cluster level similarity measure. Pattern Recogn. 47(9), 3034–3045 (2014)CrossRef
11.
go back to reference Gabryel, M.: A bag-of-features algorithm for applications using a NoSQL database. Inf. Softw. Technol. 639, 332–343 (2016)CrossRef Gabryel, M.: A bag-of-features algorithm for applications using a NoSQL database. Inf. Softw. Technol. 639, 332–343 (2016)CrossRef
12.
15.
go back to reference Hruschka, E.R., de Castro, L.N., Campello, R.J.: Evolutionary algorithms for clustering gene-expression data. In: Fourth IEEE International Conference on Data Mining, ICDM 2004, pp. 403–406. IEEE (2004) Hruschka, E.R., de Castro, L.N., Campello, R.J.: Evolutionary algorithms for clustering gene-expression data. In: Fourth IEEE International Conference on Data Mining, ICDM 2004, pp. 403–406. IEEE (2004)
16.
go back to reference Jain, A., Dubes, R.: Algorithms for Clustering Data. Prentice-Hall, Englewood Cliffs (1988)MATH Jain, A., Dubes, R.: Algorithms for Clustering Data. Prentice-Hall, Englewood Cliffs (1988)MATH
18.
go back to reference Lago-Fernández, L.F., Corbacho, F.: Normality-based validation for crisp clustering. Pattern Recogn. 43(3), 782–795 (2010)MATHCrossRef Lago-Fernández, L.F., Corbacho, F.: Normality-based validation for crisp clustering. Pattern Recogn. 43(3), 782–795 (2010)MATHCrossRef
21.
go back to reference Meng, X., van Dyk, D.: The EM algorithm - an old folk-song sung to a fast new tune. J. Roy. Stat. Soc. Ser. B (Methodol.) 59(3), 511–567 (1997)MathSciNetMATHCrossRef Meng, X., van Dyk, D.: The EM algorithm - an old folk-song sung to a fast new tune. J. Roy. Stat. Soc. Ser. B (Methodol.) 59(3), 511–567 (1997)MathSciNetMATHCrossRef
22.
go back to reference Murtagh, F.: A survey of recent advances in hierarchical clustering algorithms. Comput. J. 26(4), 354–359 (1983)MATHCrossRef Murtagh, F.: A survey of recent advances in hierarchical clustering algorithms. Comput. J. 26(4), 354–359 (1983)MATHCrossRef
23.
go back to reference Pal, N.R., Bezdek, J.C.: On cluster validity for the fuzzy c-means model. IEEE Trans. Fuzzy Syst. 3(3), 370–379 (1995)CrossRef Pal, N.R., Bezdek, J.C.: On cluster validity for the fuzzy c-means model. IEEE Trans. Fuzzy Syst. 3(3), 370–379 (1995)CrossRef
24.
go back to reference Park, H.S., Jun, C.H.: A simple and fast algorithm for K-medoids clustering. Expert Syst. Appl. 36(2), 3336–3341 (2009)CrossRef Park, H.S., Jun, C.H.: A simple and fast algorithm for K-medoids clustering. Expert Syst. Appl. 36(2), 3336–3341 (2009)CrossRef
25.
go back to reference Rohlf, F.: Single-link clustering algorithms. In: Krishnaiah, P.R, Kanal, L.N. (eds.) Handbook of Statistics, vol. 2, pp. 267–284 (1982) Rohlf, F.: Single-link clustering algorithms. In: Krishnaiah, P.R, Kanal, L.N. (eds.) Handbook of Statistics, vol. 2, pp. 267–284 (1982)
26.
go back to reference Rousseeuw, P.J.: Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. 20, 53–65 (1987)MATHCrossRef Rousseeuw, P.J.: Silhouettes: a graphical aid to the interpretation and validation of cluster analysis. J. Comput. Appl. Math. 20, 53–65 (1987)MATHCrossRef
27.
go back to reference Rutkowski L, Cpałka K.: Compromise approach to neuro-fuzzy systems. In: Sincak, P., Vascak, J., Kvasnicka, V., Pospichal, J. (eds.) Intelligent Technologies - Theory and Applications. New Trends in Intelligent Technologies. Frontiers in Artificial Intelligence and Applications, vol. 76, pp. 85–90 (2002) Rutkowski L, Cpałka K.: Compromise approach to neuro-fuzzy systems. In: Sincak, P., Vascak, J., Kvasnicka, V., Pospichal, J. (eds.) Intelligent Technologies - Theory and Applications. New Trends in Intelligent Technologies. Frontiers in Artificial Intelligence and Applications, vol. 76, pp. 85–90 (2002)
28.
go back to reference Rutkowski, L., Cpałka, K.: A neuro-fuzzy controller with a compromise fuzzy reasoning. Control Cybern. 31(2), 297–308 (2002)MATH Rutkowski, L., Cpałka, K.: A neuro-fuzzy controller with a compromise fuzzy reasoning. Control Cybern. 31(2), 297–308 (2002)MATH
29.
go back to reference Saha, S., Bandyopadhyay, S.: Some connectivity based cluster validity indices. Appl. Soft Comput. 12(5), 1555–1565 (2012)CrossRef Saha, S., Bandyopadhyay, S.: Some connectivity based cluster validity indices. Appl. Soft Comput. 12(5), 1555–1565 (2012)CrossRef
30.
go back to reference Sameh, A.S., Asoke, K.N.: Development of assessment criteria for clustering algorithms. Pattern Anal. Appl. 12(1), 79–98 (2009)MathSciNetCrossRef Sameh, A.S., Asoke, K.N.: Development of assessment criteria for clustering algorithms. Pattern Anal. Appl. 12(1), 79–98 (2009)MathSciNetCrossRef
32.
go back to reference Shieh, H.-L.: Robust validity index for a modified subtractive clustering algorithm. Appl. Soft Comput. 22, 47–59 (2014)CrossRef Shieh, H.-L.: Robust validity index for a modified subtractive clustering algorithm. Appl. Soft Comput. 22, 47–59 (2014)CrossRef
34.
go back to reference Starczewski, A., Krzyżak, A.: A modification of the silhouette index for the improvement of cluster validity assessment. In: Rutkowski, L., Korytkowski, M., Scherer, R., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds.) ICAISC 2016. LNCS (LNAI), vol. 9693, pp. 114–124. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-39384-1_10CrossRef Starczewski, A., Krzyżak, A.: A modification of the silhouette index for the improvement of cluster validity assessment. In: Rutkowski, L., Korytkowski, M., Scherer, R., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds.) ICAISC 2016. LNCS (LNAI), vol. 9693, pp. 114–124. Springer, Cham (2016). https://​doi.​org/​10.​1007/​978-3-319-39384-1_​10CrossRef
35.
go back to reference Starczewski, A., Krzyżak, A.: Improvement of the validity index for determination of an appropriate data partitioning. In: Rutkowski, L., Korytkowski, M., Scherer, R., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds.) ICAISC 2017. LNCS (LNAI), vol. 10246, pp. 159–170. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-59060-8_16CrossRef Starczewski, A., Krzyżak, A.: Improvement of the validity index for determination of an appropriate data partitioning. In: Rutkowski, L., Korytkowski, M., Scherer, R., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds.) ICAISC 2017. LNCS (LNAI), vol. 10246, pp. 159–170. Springer, Cham (2017). https://​doi.​org/​10.​1007/​978-3-319-59060-8_​16CrossRef
36.
go back to reference Wu, K.L., Yang, M.S., Hsieh, J.N.: Robust cluster validity indexes. Pattern Recogn. 42, 2541–2550 (2009)MATHCrossRef Wu, K.L., Yang, M.S., Hsieh, J.N.: Robust cluster validity indexes. Pattern Recogn. 42, 2541–2550 (2009)MATHCrossRef
37.
go back to reference Vendramin, L., Campello, R.J., Hruschka, E.R.: Relative clustering validity criteria: a comparative overview. Stat. Anal. Data Min. 3(4), 209–235 (2010)MathSciNet Vendramin, L., Campello, R.J., Hruschka, E.R.: Relative clustering validity criteria: a comparative overview. Stat. Anal. Data Min. 3(4), 209–235 (2010)MathSciNet
38.
go back to reference Zhao, Q., Fränti, P.: WB-index: a sum-of-squares based index for cluster validity. Data Knowl. Eng. 92, 77–89 (2014)CrossRef Zhao, Q., Fränti, P.: WB-index: a sum-of-squares based index for cluster validity. Data Knowl. Eng. 92, 77–89 (2014)CrossRef
Metadata
Title
Improvement of the Simplified Silhouette Validity Index
Authors
Artur Starczewski
Krzysztof Przybyszewski
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-91262-2_39

Premium Partner