2015 | Original Paper | Book Chapter

An Efficient Kernelized Fuzzy Possibilistic C-Means for High-Dimensional Data Clustering

Authors: B. Shanmugapriya, M. Punithavalli

Published in: Computational Vision and Robotics

Publisher: Springer India

Abstract

Clustering high-dimensional data is difficult because the data points are intrinsically sparse. Several recent results indicate that, for high-dimensional data, even the notions of proximity and clustering may not be meaningful. Fuzzy c-means (FCM) and possibilistic c-means (PCM) can both handle high-dimensional data, but FCM is sensitive to noise and PCM requires appropriate initialization to converge to a nearly global minimum. To overcome these issues, a fuzzy possibilistic c-means (FPCM) with a symmetry-based distance measure has been proposed, which can determine the number of clusters present in a dataset. In addition, an efficient kernelized fuzzy possibilistic c-means (KFPCM) algorithm is proposed to obtain more effective clustering results. The proposed KFPCM uses a kernel-induced distance measure. FPCM combines the advantages of FCM and PCM, and the kernel-induced distance measure helps to obtain better clustering results on high-dimensional data. The proposed KFPCM is evaluated on the Iris, Wine, Lymphography, Lung Cancer, and Diabetes datasets in terms of clustering accuracy, number of iterations, and execution time. The results demonstrate the effectiveness of the proposed KFPCM.
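
As a rough illustration of what a kernel-induced distance measure looks like, the sketch below computes the squared feature-space distance K(x, x) + K(v, v) - 2K(x, v) and plugs it into the standard FCM membership formula. The abstract does not say which kernel or update rules the chapter uses, so the Gaussian kernel, the bandwidth `sigma`, the fuzzifier `m`, and all function names here are assumptions made for illustration only; the possibilistic (typicality) part of FPCM is not shown.

```python
import math


def gaussian_kernel(x, v, sigma=1.0):
    """Gaussian (RBF) kernel K(x, v) = exp(-||x - v||^2 / (2 * sigma^2)).

    Assumption: the source does not name a kernel; RBF is a common choice.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, v))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def kernel_distance_sq(x, v, sigma=1.0):
    """Squared kernel-induced distance K(x,x) + K(v,v) - 2*K(x,v).

    For the Gaussian kernel K(x,x) = K(v,v) = 1, so this is 2 * (1 - K(x,v)).
    """
    return 2.0 * (1.0 - gaussian_kernel(x, v, sigma))


def fuzzy_memberships(x, centers, m=2.0, sigma=1.0):
    """Standard FCM memberships of point x, using kernel-induced distances.

    u_i is proportional to (1 / d_i^2)^(1 / (m - 1)), normalized to sum to 1.
    """
    eps = 1e-12  # guards against division by zero when x sits on a center
    dists = [max(kernel_distance_sq(x, v, sigma), eps) for v in centers]
    weights = [(1.0 / d) ** (1.0 / (m - 1.0)) for d in dists]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    centers = [[0.0, 0.0], [3.0, 3.0]]  # hypothetical cluster centers
    print(fuzzy_memberships([0.1, 0.0], centers))
```

With the Gaussian kernel the squared distance is bounded between 0 and 2, so very distant points all look roughly equally far away; in this sketch `sigma` controls how quickly that saturation sets in.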

Metadata
Title
An Efficient Kernelized Fuzzy Possibilistic C-Means for High-Dimensional Data Clustering
Authors
B. Shanmugapriya
M. Punithavalli
Copyright year
2015
Publisher
Springer India
DOI
https://doi.org/10.1007/978-81-322-2196-8_26