
2018 | Original Paper | Book Chapter

Progressive Neural Architecture Search

Authors: Chenxi Liu, Barret Zoph, Maxim Neumann, Jonathon Shlens, Wei Hua, Li-Jia Li, Li Fei-Fei, Alan Yuille, Jonathan Huang, Kevin Murphy

Published in: Computer Vision – ECCV 2018

Publisher: Springer International Publishing


Abstract

We propose a new method for learning the structure of convolutional neural networks (CNNs) that is more efficient than recent state-of-the-art methods based on reinforcement learning and evolutionary algorithms. Our approach uses a sequential model-based optimization (SMBO) strategy, in which we search for structures in order of increasing complexity, while simultaneously learning a surrogate model to guide the search through structure space. Direct comparison under the same search space shows that our method is up to 5 times more efficient than the RL method of Zoph et al. (2018) in terms of number of models evaluated, and 8 times faster in terms of total compute. The structures we discover in this way achieve state-of-the-art classification accuracies on CIFAR-10 and ImageNet.
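The progressive SMBO strategy described above can be illustrated with a toy sketch: architectures grow one block at a time, a surrogate trained on already-evaluated architectures ranks the expanded candidates, and only the top-K survivors are actually "trained" at each level. The operation set, beam width, and scoring function below are illustrative assumptions, not the paper's actual search space or predictor.

```python
# Toy sketch of SMBO-style progressive architecture search.
# An architecture is a tuple of operation names, grown block by block.

OPS = ["sep3x3", "sep5x5", "maxpool", "identity"]  # toy operation set
K = 4           # beam width: candidates kept (and trained) per level
MAX_BLOCKS = 3  # search architectures of up to 3 blocks

def true_accuracy(arch):
    """Stand-in for actually training a model; a deterministic toy score."""
    return sum(len(op) for op in arch) / (10.0 * len(arch))

class Surrogate:
    """Trivial surrogate: predicts from per-op average accuracies seen so far."""
    def __init__(self):
        self.scores = {}

    def fit(self, archs, accs):
        for arch, acc in zip(archs, accs):
            for op in arch:
                self.scores.setdefault(op, []).append(acc)

    def predict(self, arch):
        vals = [sum(self.scores[op]) / len(self.scores[op])
                for op in arch if op in self.scores]
        return sum(vals) / len(vals) if vals else 0.0

def progressive_search():
    # Level 1: evaluate all single-block architectures exhaustively.
    beam = [(op,) for op in OPS]
    accs = [true_accuracy(a) for a in beam]
    surrogate = Surrogate()
    surrogate.fit(beam, accs)
    for _ in range(2, MAX_BLOCKS + 1):
        # Expand every survivor by one block, rank with the surrogate...
        candidates = [a + (op,) for a in beam for op in OPS]
        candidates.sort(key=surrogate.predict, reverse=True)
        beam = candidates[:K]
        # ...then "train" only the top-K and refit the surrogate.
        accs = [true_accuracy(a) for a in beam]
        surrogate.fit(beam, accs)
    return max(zip(accs, beam))

best_acc, best_arch = progressive_search()
```

The key efficiency lever, as in the paper, is that the surrogate cheaply filters the exponentially growing candidate set so that only K architectures per level pay the (here simulated) training cost.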


Appendices
Footnotes
1
The code and checkpoint for the PNAS model trained on ImageNet can be downloaded from the TensorFlow models repository at http://github.com/tensorflow/models/. Also see https://github.com/chenxi116/PNASNet.TF and https://github.com/chenxi116/PNASNet.pytorch for the authors' reimplementations.
 
2
The depthwise-separable convolutions are in fact two repetitions of ReLU-SepConv-BatchNorm; 1x1 convolutions are also inserted when tensor sizes do not match.
 
3
5 symbols per block, times 5 blocks, times 2 for the Normal and Reduction cells: 5 × 5 × 2 = 50 symbols in total.
 
4
The number of examples is equal to the number of SGD steps times the batch size. Alternatively, it can be measured in terms of number of epochs (passes through the data), but since different papers use different sized training sets, we avoid this measure. In either case, we assume the number of examples is the same for every model, since none of the methods we evaluate use early stopping.
 
5
This additional stage is quite important for NAS, as the NASNet-A cell was originally ranked 70th among the top 250.
 
References
1. Baisero, A., Pokorny, F.T., Ek, C.H.: On a family of decomposable kernels on sequences. CoRR abs/1501.06284 (2015)
2. Baker, B., Gupta, O., Naik, N., Raskar, R.: Designing neural network architectures using reinforcement learning. In: ICLR (2017)
3. Baker, B., Gupta, O., Raskar, R., Naik, N.: Accelerating neural architecture search using performance prediction. CoRR abs/1705.10823 (2017)
4. Brock, A., Lim, T., Ritchie, J.M., Weston, N.: SMASH: one-shot model architecture search through hypernetworks. In: ICLR (2018)
5. Cai, H., Chen, T., Zhang, W., Yu, Y., Wang, J.: Efficient architecture search by network transformation. In: AAAI (2018)
6. Chen, Y., Li, J., Xiao, H., Jin, X., Yan, S., Feng, J.: Dual path networks. In: NIPS (2017)
7. Cortes, C., Gonzalvo, X., Kuznetsov, V., Mohri, M., Yang, S.: AdaNet: adaptive structural learning of artificial neural networks. In: ICML (2017)
8. Deng, J., Dong, W., Socher, R., Li, L., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: CVPR (2009)
9. DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. CoRR abs/1708.04552 (2017)
10. Domhan, T., Springenberg, J.T., Hutter, F.: Speeding up automatic hyperparameter optimization of deep neural networks by extrapolation of learning curves. In: IJCAI (2015)
11. Dong, J.D., Cheng, A.C., Juan, D.C., Wei, W., Sun, M.: PPP-Net: platform-aware progressive search for pareto-optimal neural architectures. In: ICLR Workshop (2018)
12. Elsken, T., Metzen, J.H., Hutter, F.: Simple and efficient architecture search for convolutional neural networks. CoRR abs/1711.04528 (2017)
13. Grosse, R.B., Salakhutdinov, R., Freeman, W.T., Tenenbaum, J.B.: Exploiting compositionality to explore a large space of model structures. In: UAI (2012)
14. Howard, A.G., et al.: MobileNets: efficient convolutional neural networks for mobile vision applications. CoRR abs/1704.04861 (2017)
15. Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. CoRR abs/1709.01507 (2017)
16. Huang, F., Ash, J.T., Langford, J., Schapire, R.E.: Learning deep resnet blocks sequentially using boosting theory. CoRR abs/1706.04964 (2017)
18. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: ICLR (2015)
19. Krizhevsky, A., Hinton, G.: Learning multiple layers of features from tiny images. Technical report, University of Toronto (2009)
21. Liu, H., Simonyan, K., Vinyals, O., Fernando, C., Kavukcuoglu, K.: Hierarchical representations for efficient architecture search. In: ICLR (2018)
22. Loshchilov, I., Hutter, F.: SGDR: stochastic gradient descent with restarts. In: ICLR (2017)
23. Mendoza, H., Klein, A., Feurer, M., Springenberg, J.T., Hutter, F.: Towards automatically-tuned neural networks. In: ICML Workshop on AutoML, pp. 58–65, December 2016
24. Miikkulainen, R., et al.: Evolving deep neural networks. CoRR abs/1703.00548 (2017)
25. Negrinho, R., Gordon, G.J.: DeepArchitect: automatically designing and training deep architectures. CoRR abs/1704.08792 (2017)
26. Pham, H., Guan, M.Y., Zoph, B., Le, Q.V., Dean, J.: Efficient neural architecture search via parameter sharing. CoRR abs/1802.03268 (2018)
27. Real, E., Aggarwal, A., Huang, Y., Le, Q.V.: Regularized evolution for image classifier architecture search. CoRR abs/1802.01548 (2018)
28. Real, E., et al.: Large-scale evolution of image classifiers. In: ICML (2017)
29. Schulman, J., Wolski, F., Dhariwal, P., Radford, A., Klimov, O.: Proximal policy optimization algorithms. CoRR abs/1707.06347 (2017)
30. Shahriari, B., Swersky, K., Wang, Z., Adams, R.P., de Freitas, N.: Taking the human out of the loop: a review of Bayesian optimization. Proc. IEEE 104(1), 148–175 (2016)
31. Snoek, J., Larochelle, H., Adams, R.P.: Practical Bayesian optimization of machine learning algorithms. In: NIPS (2012)
32. Stanley, K.O.: Neuroevolution: a different kind of deep learning, July 2017
33. Stanley, K.O., Miikkulainen, R.: Evolving neural networks through augmenting topologies. Evol. Comput. 10(2), 99–127 (2002)
34. Williams, R.: Simple statistical gradient-following algorithms for connectionist reinforcement learning. Mach. Learn. 8, 229–256 (1992)
35. Xie, L., Yuille, A.L.: Genetic CNN. In: ICCV (2017)
36. Xie, S., Girshick, R.B., Dollár, P., Tu, Z., He, K.: Aggregated residual transformations for deep neural networks. In: CVPR (2017)
37. Zhang, X., Zhou, X., Lin, M., Sun, J.: ShuffleNet: an extremely efficient convolutional neural network for mobile devices. CoRR abs/1707.01083 (2017)
38. Zhang, X., Li, Z., Loy, C.C., Lin, D.: PolyNet: a pursuit of structural diversity in very deep networks. In: CVPR (2017)
39. Zhong, Z., Yan, J., Liu, C.L.: Practical network blocks design with Q-learning. In: AAAI (2018)
40. Zoph, B., Le, Q.V.: Neural architecture search with reinforcement learning. In: ICLR (2017)
41. Zoph, B., Vasudevan, V., Shlens, J., Le, Q.V.: Learning transferable architectures for scalable image recognition. In: CVPR (2018)
Metadata
Title
Progressive Neural Architecture Search
Authors
Chenxi Liu
Barret Zoph
Maxim Neumann
Jonathon Shlens
Wei Hua
Li-Jia Li
Li Fei-Fei
Alan Yuille
Jonathan Huang
Kevin Murphy
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-030-01246-5_2