Published in: Neural Computing and Applications 6/2019

11.12.2018 | EANN 2017

Customised ensemble methodologies for deep learning: Boosted Residual Networks and related approaches

Authors: Alan Mosca, George D. Magoulas


Abstract

This paper introduces a family of new customised ensemble methodologies, called Boosted Residual Networks (BRN), which build a boosted ensemble of residual networks by growing the member network at each round of boosting. The proposed approach combines recent developments in residual networks (a method for creating very deep networks by including a shortcut connection between different groups of layers) with Deep Incremental Boosting (DIB), a methodology for quickly training ensembles of networks of increasing depth via boosting. Additionally, we explore a simpler variant of Boosted Residual Networks based on bagging, called Bagged Residual Networks. We then analyse how recent developments in ensemble distillation can improve our results. We demonstrate that the synergy of residual networks and Deep Incremental Boosting has more potential than either boosting a residual network of fixed structure or using the equivalent Deep Incremental Boosting without the shortcut connections: it permits the creation of models with better generalisation in significantly less time.
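To make the procedure concrete, here is a minimal sketch of the BRN training loop as a SAMME-style multiclass AdaBoost round; the distinctive step is that each new ensemble member starts from a deepened copy of the previous one instead of being trained from scratch. This is an illustrative reconstruction, not the authors' code: `build_net`, `grow_net`, `train_net` and `predict` are hypothetical callbacks standing in for the underlying deep learning framework.

```python
import numpy as np

def boosted_residual_networks(X, y, n_rounds, n_classes,
                              build_net, grow_net, train_net, predict):
    """Sketch of the BRN loop: SAMME multiclass AdaBoost whose members
    are progressively deepened residual networks (hypothetical helpers)."""
    n = len(y)
    w = np.full(n, 1.0 / n)                # uniform example weights at round 0
    members, alphas = [], []
    net = build_net()                      # initial, relatively shallow ResNet
    for _ in range(n_rounds):
        train_net(net, X, y, sample_weights=w)
        miss = predict(net, X) != y        # boolean misclassification mask
        err = w[miss].sum()                # weighted error (w sums to 1)
        if err >= 1.0 - 1.0 / n_classes:   # no better than chance: stop early
            break
        alpha = np.log((1.0 - err) / max(err, 1e-12)) + np.log(n_classes - 1.0)
        w = w * np.exp(alpha * miss)       # up-weight misclassified examples
        w = w / w.sum()
        members.append(net)
        alphas.append(alpha)
        # The step that distinguishes BRN from boosting a fixed network:
        # the next member is a copy of the current one with an extra residual
        # block injected, so depth grows each round and weights carry over.
        net = grow_net(net)
    return members, alphas
```

At test time the members cast votes weighted by their alpha values; with n_classes = 2 the log(n_classes - 1) term vanishes and the update reduces to classical AdaBoost.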

Footnotes
1
In a few cases BRN is actually faster than DIB, but we believe this to be noise caused by external factors, such as system load and the fact that some of the resulting computational graphs map to the hardware better than others.
 
Metadata
Title
Customised ensemble methodologies for deep learning: Boosted Residual Networks and related approaches
Authors
Alan Mosca
George D. Magoulas
Publication date
11.12.2018
Publisher
Springer London
Published in
Neural Computing and Applications / Issue 6/2019
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-018-3922-2
