Published in: Neural Processing Letters 1/2021

07.01.2021

Deep Neural Networks Regularization Using a Combination of Sparsity Inducing Feature Selection Methods

Authors: Fatemeh Farokhmanesh, Mohammad Taghi Sadeghi


Abstract

Deep learning is an important subcategory of machine learning in which hand-crafted features are expected to be replaced by fully automatically extracted ones. However, deep learning generally operates in a very high-dimensional feature space, which can lead to overfitting; regularization techniques are applied to prevent this. In this framework, sparse-representation-based feature selection and regularization methods are very attractive, because sparse methods represent data with as few non-zero coefficients as possible. In this paper, we utilize a variety of sparse-representation-based methods for regularizing deep neural networks. First, the effects of three basic sparsity-inducing methods are studied: Least Squares Regression, Sparse Group Lasso (SGL), and Correntropy-induced Robust Feature Selection (CRFS). Then, in order to improve the regularization process, three combinations of the basic methods are proposed. The study is carried out on a simple fully connected deep neural network and on a VGG-like network. Our experimental results show that, overall, the combined methods outperform the basic ones. Considering the two important factors of the amount of induced sparsity and classification accuracy, the combination of the CRFS and SGL methods yields very successful results for deep neural networks.
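For concreteness, the following is a minimal NumPy sketch of the kinds of penalties the abstract refers to: the row-wise ℓ2,1 norm used in CRFS-style regularization and the Sparse Group Lasso term, combined into a single regularizer. The grouping of weights by rows (i.e., per neuron), the function names, and the λ values are illustrative assumptions, not the paper's exact formulation; in particular, full CRFS also involves a correntropy-based loss term, which this sketch omits.

```python
import numpy as np

def l21_norm(W):
    # ell_{2,1} norm as used in CRFS-style regularizers:
    # the sum over rows of each row's ell_2 norm, which
    # pushes whole rows (features/neurons) to zero.
    return np.sum(np.linalg.norm(W, axis=1))

def sparse_group_lasso(W, lam1=1e-4, lam2=1e-4):
    # Sparse Group Lasso penalty: an ell_1 term for element-wise
    # sparsity plus a group term (row-wise ell_2 norms, each group
    # weighted by sqrt of its size) for structured sparsity.
    sqrt_group_size = np.sqrt(W.shape[1])  # all row groups are equal-sized
    l1_term = np.sum(np.abs(W))
    group_term = sqrt_group_size * np.sum(np.linalg.norm(W, axis=1))
    return lam1 * l1_term + lam2 * group_term

def combined_penalty(W, lam_crfs=1e-4, lam1=1e-4, lam2=1e-4):
    # Hypothetical combination in the spirit of the paper's
    # CRFS + SGL variant: both penalties added together.
    return lam_crfs * l21_norm(W) + sparse_group_lasso(W, lam1, lam2)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.normal(size=(256, 128))  # weights of one dense layer
    print(combined_penalty(W))
```

In training, such a penalty would be evaluated on each layer's weight matrix and added to the task loss, so that the optimizer trades classification accuracy against induced sparsity.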


Metadata
Title
Deep Neural Networks Regularization Using a Combination of Sparsity Inducing Feature Selection Methods
Authors
Fatemeh Farokhmanesh
Mohammad Taghi Sadeghi
Publication date
07.01.2021
Publisher
Springer US
Published in
Neural Processing Letters / Issue 1/2021
Print ISSN: 1370-4621
Electronic ISSN: 1573-773X
DOI
https://doi.org/10.1007/s11063-020-10389-3
