
2018 | Original Paper | Book Chapter

Progressive Neural Architecture Search

Authors: Chenxi Liu, Barret Zoph, Maxim Neumann, Jonathon Shlens, Wei Hua, Li-Jia Li, Li Fei-Fei, Alan Yuille, Jonathan Huang, Kevin Murphy

Published in: Computer Vision – ECCV 2018

Publisher: Springer International Publishing


Abstract

We propose a new method for learning the structure of convolutional neural networks (CNNs) that is more efficient than recent state-of-the-art methods based on reinforcement learning and evolutionary algorithms. Our approach uses a sequential model-based optimization (SMBO) strategy, in which we search for structures in order of increasing complexity, while simultaneously learning a surrogate model to guide the search through structure space. Direct comparison under the same search space shows that our method is up to 5 times more efficient than the RL method of Zoph et al. (2018) in terms of number of models evaluated, and 8 times faster in terms of total compute. The structures we discover in this way achieve state-of-the-art classification accuracies on CIFAR-10 and ImageNet.
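The progressive SMBO strategy described above can be illustrated with a toy sketch: architectures grow one block at a time, a surrogate trained on already-evaluated architectures ranks the expanded candidates, and only the top-K survivors are actually "trained" at each level. The operation set, beam width, and scoring function below are illustrative assumptions, not the paper's actual search space or predictor.

```python
# Toy sketch of SMBO-style progressive architecture search.
# An architecture is a tuple of operation names, grown block by block.

OPS = ["sep3x3", "sep5x5", "maxpool", "identity"]  # toy operation set
K = 4           # beam width: candidates kept (and trained) per level
MAX_BLOCKS = 3  # search architectures of up to 3 blocks

def true_accuracy(arch):
    """Stand-in for actually training a model; a deterministic toy score."""
    return sum(len(op) for op in arch) / (10.0 * len(arch))

class Surrogate:
    """Trivial surrogate: predicts from per-op average accuracies seen so far."""
    def __init__(self):
        self.scores = {}

    def fit(self, archs, accs):
        for arch, acc in zip(archs, accs):
            for op in arch:
                self.scores.setdefault(op, []).append(acc)

    def predict(self, arch):
        vals = [sum(self.scores[op]) / len(self.scores[op])
                for op in arch if op in self.scores]
        return sum(vals) / len(vals) if vals else 0.0

def progressive_search():
    # Level 1: evaluate all single-block architectures exhaustively.
    beam = [(op,) for op in OPS]
    accs = [true_accuracy(a) for a in beam]
    surrogate = Surrogate()
    surrogate.fit(beam, accs)
    for _ in range(2, MAX_BLOCKS + 1):
        # Expand every survivor by one block, rank with the surrogate...
        candidates = [a + (op,) for a in beam for op in OPS]
        candidates.sort(key=surrogate.predict, reverse=True)
        beam = candidates[:K]
        # ...then "train" only the top-K and refit the surrogate.
        accs = [true_accuracy(a) for a in beam]
        surrogate.fit(beam, accs)
    return max(zip(accs, beam))

best_acc, best_arch = progressive_search()
```

The key efficiency lever, as in the paper, is that the surrogate cheaply filters the exponentially growing candidate set so that only K architectures per level pay the (here simulated) training cost.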


Appendices
Footnotes
1
The code and checkpoint for the PNAS model trained on ImageNet can be downloaded from the TensorFlow models repository at http://github.com/tensorflow/models/. Also see https://github.com/chenxi116/PNASNet.TF and https://github.com/chenxi116/PNASNet.pytorch for the authors' reimplementations.
 
2
The depthwise-separable convolutions are in fact two repetitions of ReLU-SepConv-BatchNorm; 1x1 convolutions are also inserted when tensor sizes do not match.
 
3
5 symbols per block, times 5 blocks, times 2 for the Normal and Reduction cells: 5 × 5 × 2 = 50 symbols in total.
 
4
The number of examples is equal to the number of SGD steps times the batch size. Alternatively, it can be measured in terms of number of epochs (passes through the data), but since different papers use different sized training sets, we avoid this measure. In either case, we assume the number of examples is the same for every model, since none of the methods we evaluate use early stopping.
 
5
This additional stage is quite important for NAS, as the NASNet-A cell was originally ranked 70th among the top 250.
 
References
1. Baisero, A., Pokorny, F.T., Ek, C.H.: On a family of decomposable kernels on sequences. CoRR abs/1501.06284 (2015)
2. Baker, B., Gupta, O., Naik, N., Raskar, R.: Designing neural network architectures using reinforcement learning. In: ICLR (2017)
3. Baker, B., Gupta, O., Raskar, R., Naik, N.: Accelerating neural architecture search using performance prediction. CoRR abs/1705.10823 (2017)
4. Brock, A., Lim, T., Ritchie, J.M., Weston, N.: SMASH: one-shot model architecture search through hypernetworks. In: ICLR (2018)
5. Cai, H., Chen, T., Zhang, W., Yu, Y., Wang, J.: Efficient architecture search by network transformation. In: AAAI (2018)
6. Chen, Y., Li, J., Xiao, H., Jin, X., Yan, S., Feng, J.: Dual path networks. In: NIPS (2017)
7. Cortes, C., Gonzalvo, X., Kuznetsov, V., Mohri, M., Yang, S.: AdaNet: adaptive structural learning of artificial neural networks. In: ICML (2017)
8. Deng, J., Dong, W., Socher, R., Li, L., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: CVPR (2009)
9. DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. CoRR abs/1708.04552 (2017)
10. Domhan, T., Springenberg, J.T., Hutter, F.: Speeding up automatic hyperparameter optimization of deep neural networks by extrapolation of learning curves. In: IJCAI (2015)
11. Dong, J.D., Cheng, A.C., Juan, D.C., Wei, W., Sun, M.: PPP-Net: platform-aware progressive search for pareto-optimal neural architectures. In: ICLR Workshop (2018)
12. Elsken, T., Metzen, J.H., Hutter, F.: Simple and efficient architecture search for convolutional neural networks. CoRR abs/1711.04528 (2017)
13. Grosse, R.B., Salakhutdinov, R., Freeman, W.T., Tenenbaum, J.B.: Exploiting compositionality to explore a large space of model structures. In: UAI (2012)
14. Howard, A.G., et al.: MobileNets: efficient convolutional neural networks for mobile vision applications. CoRR abs/1704.04861 (2017)
15. Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. CoRR abs/1709.01507 (2017)
16. Huang, F., Ash, J.T., Langford, J., Schapire, R.E.: Learning deep resnet blocks sequentially using boosting theory. CoRR abs/1706.04964 (2017)
18. Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: ICLR (2015)
19. Krizhevsky, A., Hinton, G.: Learning multiple layers of features from tiny images. Technical report, University of Toronto (2009)
21. Liu, H., Simonyan, K., Vinyals, O., Fernando, C., Kavukcuoglu, K.: Hierarchical representations for efficient architecture search. In: ICLR (2018)
22. Loshchilov, I., Hutter, F.: SGDR: stochastic gradient descent with restarts. In: ICLR (2017)
23. Mendoza, H., Klein, A., Feurer, M., Springenberg, J.T., Hutter, F.: Towards automatically-tuned neural networks. In: ICML Workshop on AutoML, pp. 58–65, December 2016
24. Miikkulainen, R., et al.: Evolving deep neural networks. CoRR abs/1703.00548 (2017)
25. Negrinho, R., Gordon, G.J.: DeepArchitect: automatically designing and training deep architectures. CoRR abs/1704.08792 (2017)
26. Pham, H., Guan, M.Y., Zoph, B., Le, Q.V., Dean, J.: Efficient neural architecture search via parameter sharing. CoRR abs/1802.03268 (2018)
27. Real, E., Aggarwal, A., Huang, Y., Le, Q.V.: Regularized evolution for image classifier architecture search. CoRR abs/1802.01548 (2018)
28. Real, E., et al.: Large-scale evolution of image classifiers. In: ICML (2017)
29. Schulman, J., Wolski, F., Dhariwal, P., Radford, A., Klimov, O.: Proximal policy optimization algorithms. CoRR abs/1707.06347 (2017)
30. Shahriari, B., Swersky, K., Wang, Z., Adams, R.P., de Freitas, N.: Taking the human out of the loop: a review of Bayesian optimization. Proc. IEEE 104(1), 148–175 (2016)
31. Snoek, J., Larochelle, H., Adams, R.P.: Practical Bayesian optimization of machine learning algorithms. In: NIPS (2012)
32. Stanley, K.O.: Neuroevolution: a different kind of deep learning, July 2017
33. Stanley, K.O., Miikkulainen, R.: Evolving neural networks through augmenting topologies. Evol. Comput. 10(2), 99–127 (2002)
34. Williams, R.: Simple statistical gradient-following algorithms for connectionist reinforcement learning. Mach. Learn. 8, 229–256 (1992)
35. Xie, L., Yuille, A.L.: Genetic CNN. In: ICCV (2017)
36. Xie, S., Girshick, R.B., Dollár, P., Tu, Z., He, K.: Aggregated residual transformations for deep neural networks. In: CVPR (2017)
37. Zhang, X., Zhou, X., Lin, M., Sun, J.: ShuffleNet: an extremely efficient convolutional neural network for mobile devices. CoRR abs/1707.01083 (2017)
38. Zhang, X., Li, Z., Loy, C.C., Lin, D.: PolyNet: a pursuit of structural diversity in very deep networks. In: CVPR (2017)
39. Zhong, Z., Yan, J., Liu, C.L.: Practical network blocks design with Q-learning. In: AAAI (2018)
40. Zoph, B., Le, Q.V.: Neural architecture search with reinforcement learning. In: ICLR (2017)
41. Zoph, B., Vasudevan, V., Shlens, J., Le, Q.V.: Learning transferable architectures for scalable image recognition. In: CVPR (2018)
Metadata
Title
Progressive Neural Architecture Search
Authors
Chenxi Liu
Barret Zoph
Maxim Neumann
Jonathon Shlens
Wei Hua
Li-Jia Li
Li Fei-Fei
Alan Yuille
Jonathan Huang
Kevin Murphy
Copyright year
2018
DOI
https://doi.org/10.1007/978-3-030-01246-5_2