DOI: 10.1145/3292500.3330648

Public Access
Auto-Keras: An Efficient Neural Architecture Search System

Published: 25 July 2019

ABSTRACT

Neural architecture search (NAS) has been proposed to automatically tune deep neural networks, but existing search algorithms, e.g., NASNet and PNAS, usually suffer from expensive computational cost. Network morphism, which keeps the functionality of a neural network while changing its architecture, could be helpful for NAS by enabling more efficient training during the search. In this paper, we propose a novel framework that enables Bayesian optimization to guide the network morphism for efficient neural architecture search. The framework develops a neural network kernel and a tree-structured acquisition function optimization algorithm to explore the search space efficiently. Extensive experiments on real-world benchmark datasets demonstrate the superior performance of the developed framework over state-of-the-art methods. Moreover, we build an open-source AutoML system based on our method, namely Auto-Keras. The code and documentation are available at https://autokeras.com. The system runs in parallel on CPU and GPU, with an adaptive search strategy for different GPU memory limits.
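The search procedure the abstract describes — Bayesian optimization with an architecture-level kernel and an acquisition function guiding network morphism — can be sketched in miniature. The snippet below is an illustrative toy, not the Auto-Keras implementation: architectures are hypothetical tuples of layer widths, the kernel is a distance-substitution kernel over a toy edit distance, the surrogate is a simple kernel-weighted mean rather than a Gaussian process, and the morphism set and UCB acquisition are simplified stand-ins.

```python
import math

def edit_distance(a, b):
    """Toy architecture distance; architectures are tuples of layer widths."""
    return abs(len(a) - len(b)) + sum(abs(x - y) for x, y in zip(a, b))

def kernel(a, b, rho=0.1):
    """Distance-substitution kernel: similar architectures score near 1."""
    return math.exp(-rho * edit_distance(a, b))

def predict(arch, history):
    """Cheap surrogate: kernel-weighted mean of observed accuracies, with a
    variance proxy that shrinks near already-evaluated architectures."""
    weights = [kernel(arch, h) for h, _ in history]
    mean = sum(w * acc for w, (_, acc) in zip(weights, history)) / sum(weights)
    var = 1.0 - max(weights)  # far from all observations -> high uncertainty
    return mean, var

def morphisms(arch):
    """Candidate network morphisms: deepen (duplicate a layer) or widen one."""
    deepen = [arch[:i] + (arch[i],) + arch[i:] for i in range(len(arch))]
    widen = [arch[:i] + (arch[i] + 16,) + arch[i + 1:] for i in range(len(arch))]
    return deepen + widen

def search(evaluate, init=(32, 32), steps=5, beta=2.0):
    """Greedy BO loop: morph the best architecture so far, pick the candidate
    with the highest upper confidence bound, then evaluate it."""
    history = [(init, evaluate(init))]
    for _ in range(steps):
        best, _ = max(history, key=lambda h: h[1])

        def ucb(a):
            mean, var = predict(a, history)
            return mean + beta * math.sqrt(var)

        cand = max(morphisms(best), key=ucb)
        history.append((cand, evaluate(cand)))
    return max(history, key=lambda h: h[1])

# Toy objective: accuracy peaks at a hypothetical "best" architecture.
def evaluate(arch):
    return 1.0 / (1.0 + edit_distance(arch, (32, 48, 32)))

best_arch, best_acc = search(evaluate)
```

In the actual system, `evaluate` would train the morphed network (cheaply, since network morphism preserves the parent's function), the kernel is the paper's neural network kernel, and the acquisition function is optimized with the tree-structured algorithm mentioned in the abstract rather than by enumerating morphisms exhaustively.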

Supplemental Material

p1946-jin.mp4 (mp4, 1.1 GB)


Published in

KDD '19: Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining
July 2019, 3305 pages
ISBN: 9781450362016
DOI: 10.1145/3292500

          Copyright © 2019 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States


Qualifiers

• research-article

Acceptance Rates

KDD '19 paper acceptance rate: 110 of 1,200 submissions, 9%. Overall acceptance rate: 1,133 of 8,635 submissions, 13%.

