Published in: Pattern Analysis and Applications 2/2024

01.06.2024 | Original Article

The limitations of differentiable architecture search

Authors: Guillaume Lacharme, Hubert Cardot, Christophe Lenté, Nicolas Monmarché

Abstract

In this paper, we provide a detailed explanation of the limitations of differentiable architecture search (DARTS). Algorithms based on the DARTS paradigm tend to converge towards degenerate solutions, that is, architectures with a shallow graph consisting mainly of skip connections. We identify six sources of error that could explain this phenomenon, some of which can only be partially eliminated. We therefore propose a novel solution that removes degenerate solutions from the search space, and we demonstrate the validity of our approach through experiments on the CIFAR10 and CIFAR100 datasets. Our code is available at the following link: https://scm.univ-tours.fr/projetspublics/lifat/darts_ibpria_sparcity
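The DARTS paradigm discussed in the abstract relaxes the discrete choice of an operation on each edge of a cell into a softmax-weighted mixture of all candidate operations; this is what makes the search differentiable, and also what lets skip connections come to dominate the learned weights. The following minimal PyTorch sketch illustrates this mixed operation and the final discretization step. The reduced operation set and class names here are illustrative assumptions, not the authors' released code:

```python
# Minimal sketch of a DARTS-style mixed operation (illustrative only;
# the operation set and names are simplified assumptions, not the paper's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MixedOp(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # Reduced candidate set; DARTS uses a larger one (separable convs, dilated convs, ...).
        self.ops = nn.ModuleList([
            nn.Identity(),                                # skip_connect
            nn.Conv2d(channels, channels, 3, padding=1),  # conv 3x3
            nn.AvgPool2d(3, stride=1, padding=1),         # avg_pool 3x3
        ])
        # One continuous architecture parameter (alpha) per candidate operation.
        self.alpha = nn.Parameter(1e-3 * torch.randn(len(self.ops)))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Continuous relaxation: softmax-weighted sum of all candidate outputs.
        weights = F.softmax(self.alpha, dim=0)
        return sum(w * op(x) for w, op in zip(weights, self.ops))

    def discretize(self) -> nn.Module:
        # After the search, only the highest-weighted operation is kept per edge.
        # A degenerate cell is one where skip_connect wins on most edges.
        return self.ops[int(self.alpha.argmax())]

# Usage: mix = MixedOp(16); y = mix(torch.randn(2, 16, 32, 32))
```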


Metadata
Title
The limitations of differentiable architecture search
Authors
Guillaume Lacharme
Hubert Cardot
Christophe Lenté
Nicolas Monmarché
Publication date
01.06.2024
Publisher
Springer London
Published in
Pattern Analysis and Applications / Issue 2/2024
Print ISSN: 1433-7541
Electronic ISSN: 1433-755X
DOI
https://doi.org/10.1007/s10044-024-01260-5
