Published in: KI - Künstliche Intelligenz 4/2015

01.11.2015 | Technical Contribution

Beyond Manual Tuning of Hyperparameters

Authors: Frank Hutter, Jörg Lücke, Lars Schmidt-Thieme


Abstract

The success of hand-crafted machine learning systems in many applications raises the question of making machine learning algorithms more autonomous, i.e., reducing the required expert input to a minimum. We discuss two strategies towards this goal: (1) automated optimization of hyperparameters (including mechanisms for feature selection, preprocessing, model selection, etc.) and (2) the development of algorithms with reduced sets of hyperparameters. Since many research directions (e.g., deep learning) show a tendency towards increasingly complex algorithms with more and more hyperparameters, the demand for both of these strategies continuously increases. We review recent hyperparameter optimization methods and discuss data-driven approaches that use unsupervised learning to avoid introducing hyperparameters. We conclude by discussing how these complementary strategies can work hand-in-hand, representing a very promising approach towards autonomous machine learning.
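To make strategy (1) concrete, below is a minimal sketch of one of the simplest automated approaches, plain random search over a hyperparameter space. It is illustrative only: the scikit-learn dependency, the random-forest model, the search space, and the budget of 20 evaluations are assumptions for the example, not the paper's experimental setup.

```python
# Minimal random-search sketch for hyperparameter optimization (strategy 1).
# Assumptions: scikit-learn is available; model, search space, and budget
# are illustrative choices, not taken from the paper.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.RandomState(0)
X, y = load_digits(return_X_y=True)

best_score, best_config = -np.inf, None
for _ in range(20):  # evaluation budget: 20 sampled configurations
    # Sample one hyperparameter configuration uniformly from the space.
    config = {
        "n_estimators": int(rng.randint(10, 200)),
        "max_depth": int(rng.randint(2, 20)),
        "max_features": float(rng.uniform(0.1, 1.0)),
    }
    # Score the configuration by cross-validated accuracy.
    model = RandomForestClassifier(random_state=0, **config)
    score = cross_val_score(model, X, y, cv=3).mean()
    if score > best_score:
        best_score, best_config = score, config

print(f"best CV accuracy: {best_score:.3f} with {best_config}")
```

More sophisticated methods of the kind the paper reviews, such as Bayesian optimization, replace the blind sampling loop with a model of the validation-score surface that is used to select promising configurations to evaluate next.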

Metadata
Title
Beyond Manual Tuning of Hyperparameters
Authors
Frank Hutter
Jörg Lücke
Lars Schmidt-Thieme
Publication date
01.11.2015
Publisher
Springer Berlin Heidelberg
Published in
KI - Künstliche Intelligenz / Issue 4/2015
Print ISSN: 0933-1875
Electronic ISSN: 1610-1987
DOI
https://doi.org/10.1007/s13218-015-0381-0
