2019 | Original Paper | Book Chapter

Deep Learning Architecture Search by Neuro-Cell-Based Evolution with Function-Preserving Mutations

Written by: Martin Wistuba

Published in: Machine Learning and Knowledge Discovery in Databases

Publisher: Springer International Publishing

Abstract

The design of convolutional neural network architectures for a new image data set is a laborious and computationally expensive task that requires expert knowledge. We propose a novel neuro-evolutionary technique to solve this problem without human intervention. Our method assumes that a convolutional neural network architecture is a sequence of neuro-cells and keeps mutating them using function-preserving operations. This novel combination of approaches has several advantages. We define the network architecture by a sequence of repeating neuro-cells, which reduces the search space complexity. Furthermore, these cells are possibly transferable and can be used to arbitrarily extend the complexity of the network. Mutations based on function-preserving operations guarantee a better parameter initialization than random initialization, so that less training time is required per network architecture. Within 12 GPU hours, our proposed method finds neural network architectures that achieve a classification error of about 4% and 24% with only 5.5 and 6.5 million parameters on CIFAR-10 and CIFAR-100, respectively. In comparison to competitor approaches, our method provides similarly competitive results but requires orders of magnitude less search time and in many cases fewer network parameters.
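To make the idea of a function-preserving mutation concrete, the following is a minimal sketch and not the chapter's implementation. The abstract does not say which operations are used, so the sketch assumes two common ones, inserting an identity-initialized layer and duplicating a hidden unit while halving its outgoing weights, and it applies them to small fully connected ReLU layers rather than to convolutional neuro-cells. All names (`deepen`, `widen`, `random_layer`, and so on) are hypothetical.

```python
import random

def relu(v):
    return [max(0.0, x) for x in v]

def dense(weights, bias, v):
    # weights holds one row per output unit
    return [sum(w * x for w, x in zip(row, v)) + b
            for row, b in zip(weights, bias)]

def forward(layers, v):
    # each layer is a (weights, bias) pair; ReLU follows every layer but the last
    for i, (weights, bias) in enumerate(layers):
        v = dense(weights, bias, v)
        if i < len(layers) - 1:
            v = relu(v)
    return v

def random_layer(n_in, n_out, rng):
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    bias = [rng.uniform(-1, 1) for _ in range(n_out)]
    return weights, bias

def deepen(layers, position):
    """Insert an identity layer at index `position` (assumed mutation).

    Its input is a ReLU output, hence non-negative, so identity weights
    followed by another ReLU leave the values unchanged.
    """
    if not 1 <= position <= len(layers) - 1:
        raise ValueError("the new layer must follow a hidden layer")
    width = len(layers[position - 1][0])  # output units of the previous layer
    identity = [[1.0 if r == c else 0.0 for c in range(width)]
                for r in range(width)]
    return layers[:position] + [(identity, [0.0] * width)] + layers[position:]

def widen(layers, index, unit):
    """Duplicate hidden unit `unit` of layer `index` (assumed mutation).

    The next layer splits the original outgoing weight evenly between the
    two copies, so their summed contribution is unchanged.
    """
    weights, bias = layers[index]
    next_weights, next_bias = layers[index + 1]
    wider = (weights + [list(weights[unit])], bias + [bias[unit]])
    split = [row[:unit] + [row[unit] / 2] + row[unit + 1:] + [row[unit] / 2]
             for row in next_weights]
    return layers[:index] + [wider, (split, list(next_bias))] + layers[index + 2:]

rng = random.Random(0)
net = [random_layer(3, 4, rng), random_layer(4, 4, rng), random_layer(4, 2, rng)]
x = [0.5, -1.0, 2.0]

mutated = widen(deepen(net, 1), 0, 2)  # one layer deeper, one unit wider
before, after = forward(net, x), forward(mutated, x)
print("largest output difference:", max(abs(a - b) for a, b in zip(before, after)))
```

Because the mutated network starts from the same input-output mapping as its parent, further training continues from the parent's accuracy instead of restarting from a random initialization, which is the advantage the abstract attributes to such mutations.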



Title: Deep Learning Architecture Search by Neuro-Cell-Based Evolution with Function-Preserving Mutations
Written by: Martin Wistuba
Publisher: Springer International Publishing
Book: Machine Learning and Knowledge Discovery in Databases
Print ISBN: 978-3-030-10927-1

Electronic ISBN: 978-3-030-10928-8

Copyright year: 2019
DOI: https://doi.org/10.1007/978-3-030-10928-8_15
