Published in: Journal of Classification, Issue 3/2020

Published: 23 August 2019

A General Framework for Dimensionality Reduction of K-Means Clustering

Authors: Tong Wu, Yanni Xiao, Muhan Guo, Feiping Nie

Abstract

Dimensionality reduction plays an important role in many machine learning and pattern recognition applications. Linear discriminant analysis (LDA) is the most popular supervised dimensionality reduction technique; it searches for a projection matrix that places data points of different classes far apart while keeping data points of the same class close together. In this paper, trace ratio LDA is combined with K-means clustering in a unified framework, in which K-means clustering generates class labels for unlabeled data and LDA learns a low-dimensional representation of the data. Performing subspace clustering and dimensionality reduction jointly in this way yields the optimal subspace. Unlike existing dimensionality reduction methods, the proposed framework covers three scenarios: supervised, semi-supervised, and unsupervised dimensionality reduction. Experimental results on benchmark datasets validate the effectiveness of the algorithm and its superiority over other relevant techniques.
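
To make the idea in the abstract concrete, the following is a minimal sketch of the unsupervised case only, and it is not the authors' algorithm. It assumes a simple alternation: K-means labels are used to build within-class and between-class scatter matrices, a projection is fitted by the standard iterative trace ratio update, and K-means is rerun in the projected space. All function names, the farthest-first initialisation, and the stopping rules are illustrative assumptions.

```python
import numpy as np

def farthest_first_centers(X, k):
    """Deterministic initial centers (an assumption made for this sketch)."""
    centers = [X[0]]
    d2 = ((X - X[0]) ** 2).sum(axis=1)
    for _ in range(1, k):
        idx = int(d2.argmax())
        centers.append(X[idx])
        d2 = np.minimum(d2, ((X - X[idx]) ** 2).sum(axis=1))
    return np.array(centers)

def kmeans_labels(Z, centers, n_iter=20):
    """Plain Lloyd iterations starting from the given centers."""
    centers = centers.copy()
    for _ in range(n_iter):
        d2 = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for c in range(len(centers)):
            if np.any(labels == c):
                centers[c] = Z[labels == c].mean(axis=0)
    return labels

def scatter_matrices(X, labels, k):
    """Within-class (Sw) and between-class (Sb) scatter of the rows of X."""
    d = X.shape[1]
    mean = X.mean(axis=0)
    Sw, Sb = np.zeros((d, d)), np.zeros((d, d))
    for c in range(k):
        Xc = X[labels == c]
        if len(Xc) == 0:
            continue  # an empty cluster contributes nothing
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        Sb += len(Xc) * np.outer(mc - mean, mc - mean)
    return Sw, Sb

def trace_ratio_projection(Sw, Sb, m, n_iter=50, tol=1e-8):
    """Iteratively maximise tr(W'SbW) / tr(W'SwW) over orthonormal W (d x m)."""
    W = np.eye(Sw.shape[0])[:, :m]
    lam = 0.0
    for _ in range(n_iter):
        new_lam = np.trace(W.T @ Sb @ W) / max(np.trace(W.T @ Sw @ W), 1e-12)
        # The top-m eigenvectors of Sb - lam * Sw give the next projection.
        _, vecs = np.linalg.eigh(Sb - new_lam * Sw)
        W = vecs[:, -m:]
        if abs(new_lam - lam) < tol:
            break
        lam = new_lam
    return W

def kmeans_lda_sketch(X, k, m, n_outer=10):
    """Alternate K-means labelling and trace ratio LDA (unsupervised case)."""
    X = X - X.mean(axis=0)
    labels = kmeans_labels(X, farthest_first_centers(X, k))
    W = np.eye(X.shape[1])[:, :m]
    for _ in range(n_outer):
        Sw, Sb = scatter_matrices(X, labels, k)
        W = trace_ratio_projection(Sw, Sb, m)
        Z = X @ W
        # Restart K-means in the subspace from the current cluster means;
        # an empty cluster falls back to the overall mean of Z.
        centers = np.array([
            Z[labels == c].mean(axis=0) if np.any(labels == c) else Z.mean(axis=0)
            for c in range(k)
        ])
        new_labels = kmeans_labels(Z, centers)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return W, labels
```

Calling `kmeans_lda_sketch(X, k, m)` on an `n x d` array returns a `d x m` orthonormal projection and one cluster label per row. A semi-supervised variant could, under the same assumptions, keep the labels of the annotated points fixed and let K-means update only the remaining ones.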

Metadata
Publisher
Springer US
Print ISSN: 0176-4268
Electronic ISSN: 1432-1343
DOI
https://doi.org/10.1007/s00357-019-09342-4
