Published in: International Journal of Machine Learning and Cybernetics 4/2019

18-12-2017 | Original Article

Unsupervised feature selection based on self-representation sparse regression and local similarity preserving

Authors: Ronghua Shang, Jiangwei Chang, Licheng Jiao, Yu Xue


Abstract

Feature selection, an indispensable data preprocessing method, has attracted considerable research attention. In this paper, we propose a new feature selection model, unsupervised feature selection based on self-representation sparse regression and local similarity preserving (UFSRL). Specifically, UFSRL sparsely reconstructs the original data from itself rather than fitting a low-dimensional embedding, and manifold learning is applied to the model to preserve the local similarity of the data. Moreover, the l2,1/2-matrix norm is imposed on the coefficient matrix, which makes the proposed model sparse and robust to noise. To solve the proposed model, we design an effective iterative algorithm and present an analysis of its convergence. Extensive experiments are conducted on eight synthetic and real-world data sets, and the results of UFSRL are compared with those of six feature selection algorithms. The experimental results show that UFSRL can effectively identify a discriminative feature subset while reconstructing the data sparsely, and that it is superior to some unsupervised feature selection algorithms in clustering performance.
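
The three ingredients named in the abstract (self-representation, local similarity preserving, and a row-sparse l2,1/2 penalty) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the abstract does not state the objective or the update rules, so the form below is an assumption borrowed from the usual self-representation setup, in which the data matrix X (samples by features) is approximated by XW and features are ranked by the row norms of W. The function names, the weights `alpha` and `beta`, and the graph Laplacian `L` are all hypothetical.

```python
import numpy as np

def l2p_penalty(W, p=0.5):
    """Sum of (Euclidean row norm) ** p over the rows of W.

    With p = 1/2 this is the l2,1/2 penalty (assumed form: the p-th power
    of the l2,p quasi-norm). Rows driven to zero mark discarded features.
    """
    row_norms = np.linalg.norm(W, axis=1)
    return float(np.sum(row_norms ** p))

def objective(X, W, L, alpha=1.0, beta=1.0, p=0.5):
    """Hypothetical objective: reconstruction + graph smoothness + row sparsity."""
    R = X @ W                                   # data reconstructed from its own features
    recon = np.linalg.norm(X - R, "fro") ** 2   # self-representation error
    local = np.trace(R.T @ L @ R)               # small when neighbouring samples stay close
    return float(recon + alpha * local + beta * l2p_penalty(W, p))

def rank_features(W):
    """Return feature indices ordered by descending row norm of W."""
    return np.argsort(-np.linalg.norm(W, axis=1))
```

Under these assumptions, `rank_features(W)[:k]` would give the k selected features once some solver has produced W; the solver itself (the iterative algorithm mentioned in the abstract) is not reproduced here.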

Metadata
Title
Unsupervised feature selection based on self-representation sparse regression and local similarity preserving
Authors
Ronghua Shang
Jiangwei Chang
Licheng Jiao
Yu Xue
Publication date
18-12-2017
Publisher
Springer Berlin Heidelberg
Published in
International Journal of Machine Learning and Cybernetics / Issue 4/2019
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-017-0760-y
