
29.01.2021 | Original Article

Deep neural networks architecture driven by problem-specific information

Authors: Daniel Urda, Francisco J. Veredas, Javier González-Enrique, Juan J. Ruiz-Aguilar, Jose M. Jerez, Ignacio J. Turias

Published in: Neural Computing and Applications | Issue 15/2021

Abstract

Deep learning provides a variety of neural network-based models, known as deep neural networks (DNNs), which are being successfully used in several domains to build highly accurate predictors. A key factor that usually makes DNNs outperform traditional machine learning models is the amount of data that is nowadays accessible and available. Nevertheless, other factors linked to DNN topologies may also influence the predictive performance of DNN models. In particular, fully connected deep neural networks (fc-DNNs) typically struggle to achieve good performance rates when applied to small datasets. This is due to the high number of parameters that must be learned when training this kind of model, which makes fc-DNNs prone to over-fitting. In this paper, the authors propose the use of problem-specific information to impose constraints on the network architecture, so that an fc-DNN is transformed into a partially connected DNN (pc-DNN) whose topology is driven by prior knowledge. This work compares two baseline models, the elastic net and fc-DNNs, to pc-DNNs on three synthetic datasets with different numbers of samples. The synthetic data were generated to estimate the benefit of using problem-specific information to drive network architectures. Furthermore, a similar analysis is performed on a real-world dataset to show the benefits of pc-DNN models in terms of predictive performance. The results of the analysis show that pc-DNNs with built-in problem-specific information clearly outperformed the elastic net and fc-DNNs on most of the datasets used, both synthetic and real-world. The pc-DNN turned out to be a useful model, especially when applied to small- or medium-size datasets, on which it significantly outperformed the baseline models considered in this study. Specifically, the pc-DNNs achieved AUC and MSE improvement rates of (\(8.21\%\), \(19.79\%\)) and (\(6.65\%\), \(20.54\%\)) on small- and medium-size datasets for the two case studies analyzed, the synthetic and the real-world problem, respectively.
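As a rough illustration of the architectural idea described in the abstract (a sketch of the general technique, not the authors' implementation), the snippet below builds a partially connected layer in TensorFlow/Keras by gating a dense weight matrix with a fixed binary mask. The mask used here is random and purely illustrative; in the pc-DNN setting it would instead encode problem-specific prior knowledge about which inputs are allowed to connect to which hidden units.

import numpy as np
import tensorflow as tf

# Hypothetical prior-knowledge matrix: mask[i, j] = 1 only if input feature i
# is allowed (by domain knowledge) to connect to hidden unit j.
n_features, n_hidden = 20, 4
rng = np.random.default_rng(42)
mask = (rng.random((n_features, n_hidden)) < 0.25).astype("float32")

class PartiallyConnectedLayer(tf.keras.layers.Layer):
    """Dense layer whose weights are element-wise gated by a fixed binary
    mask, so only connections permitted by prior knowledge are trained."""

    def __init__(self, mask, **kwargs):
        super().__init__(**kwargs)
        self.mask = tf.constant(mask)

    def build(self, input_shape):
        n_in, n_out = self.mask.shape
        self.w = self.add_weight(shape=(n_in, n_out),
                                 initializer="glorot_uniform", trainable=True)
        self.b = self.add_weight(shape=(n_out,),
                                 initializer="zeros", trainable=True)

    def call(self, inputs):
        # Masked-out weights contribute nothing to the output and therefore
        # receive zero gradient: those connections effectively do not exist.
        return tf.nn.relu(tf.matmul(inputs, self.w * self.mask) + self.b)

model = tf.keras.Sequential([
    tf.keras.Input(shape=(n_features,)),
    PartiallyConnectedLayer(mask),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=[tf.keras.metrics.AUC()])

Because masked weights never influence the output, the number of effective parameters shrinks to the number of ones in the mask, which is the general mechanism by which a partially connected topology can mitigate over-fitting on small datasets.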

Metadata
Title
Deep neural networks architecture driven by problem-specific information
Authors
Daniel Urda
Francisco J. Veredas
Javier González-Enrique
Juan J. Ruiz-Aguilar
Jose M. Jerez
Ignacio J. Turias
Publication date
29.01.2021
Publisher
Springer London
Published in
Neural Computing and Applications / Issue 15/2021
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-021-05702-7
