ABSTRACT
The purpose of this study is to introduce new design criteria for next-generation hyperparameter optimization software. The criteria we propose are (1) a define-by-run API that allows users to construct the parameter search space dynamically, (2) efficient implementation of both searching and pruning strategies, and (3) an easy-to-set-up, versatile architecture that can be deployed for various purposes, ranging from scalable distributed computing to lightweight experiments conducted via an interactive interface. To demonstrate our point, we introduce Optuna, an optimization software that is the culmination of our effort to develop next-generation optimization software. As an optimization software designed on the define-by-run principle, Optuna is, to our knowledge, the first of its kind. We present the design techniques that became necessary to develop software meeting the above criteria, and demonstrate the power of our new design through experimental results and real-world applications. Our software is available under the MIT license (https://github.com/pfnet/optuna/).
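The define-by-run principle mentioned above means the search space is not declared up front; it is constructed while the objective function executes, so later suggestions can depend on earlier ones. The following is a minimal pure-Python sketch of that idea (random search only; the `Trial`/`Study` classes here are toy stand-ins for illustration, not Optuna's actual implementation):

```python
import random

class Trial:
    """Toy trial: parameters are sampled on demand, as the objective asks for them."""
    def __init__(self):
        self.params = {}

    def suggest_int(self, name, low, high):
        # Sample (here: uniformly at random) the first time a name is seen.
        self.params.setdefault(name, random.randint(low, high))
        return self.params[name]

    def suggest_float(self, name, low, high):
        self.params.setdefault(name, random.uniform(low, high))
        return self.params[name]

class Study:
    """Toy study: runs trials and tracks the best (minimal) objective value."""
    def __init__(self):
        self.best_value = float("inf")
        self.best_params = None

    def optimize(self, objective, n_trials):
        for _ in range(n_trials):
            trial = Trial()
            value = objective(trial)
            if value < self.best_value:
                self.best_value = value
                self.best_params = dict(trial.params)

# Define-by-run: the number of parameters itself depends on an earlier
# suggestion, so the search space only takes shape at run time.
def objective(trial):
    n_layers = trial.suggest_int("n_layers", 1, 3)
    penalty = 0.0
    for i in range(n_layers):
        units = trial.suggest_int(f"units_{i}", 4, 128)
        penalty += (units - 32) ** 2
    return penalty

study = Study()
study.optimize(objective, n_trials=50)
```

Contrast this with a define-and-run API, where the full search space (every parameter name, type, and range) must be enumerated before optimization starts and cannot branch on sampled values.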