Published in: Natural Computing 4/2021

Published: 30 October 2020

A study of model and hyper-parameter selection strategies for classifier ensembles: a robust analysis on different optimization algorithms and extended results

Authors: Antonino A. Feitosa-Neto, João C. Xavier-Júnior, Anne M. P. Canuto, Alexandre C. M. Oliveira


Abstract

Machine learning (ML) techniques play an important role in many real-world applications. One of the main challenges, however, is selecting the most accurate technique for a specific application. In the classification context, two main approaches can be applied: model selection and hyper-parameter selection. In the first approach, the best classification algorithm is selected for a given input dataset by a heuristic search over a large space of candidate classification algorithms and their corresponding hyper-parameter settings. Because this approach focuses on selecting the classification algorithm, it is referred to as model selection; methods of this kind are also called automated machine learning (Auto-ML). The second approach fixes one classification system and performs an extensive search to select the best hyper-parameters for that model. In this paper, we perform a wide and robust comparative analysis of both approaches for classifier ensembles. Two methods of the first approach (Auto-WEKA and H\(_{2}\)O) are compared with four methods of the second approach (Genetic Algorithm, Particle Swarm Optimization, Tabu Search and GRASP). The main aim is to determine which of these techniques generates more accurate classifier ensembles under a time constraint. To evaluate the performance of these techniques, an empirical analysis is conducted on 21 classification datasets. Our findings indicate that hyper-parameter selection methods provide the most accurate classifier ensembles, although the statistical test did not detect this improvement.
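As a rough illustration of the second approach (a fixed ensemble structure whose hyper-parameters are searched under a time constraint), the following minimal sketch may help. It is not the authors' method: it substitutes plain random search for the four metaheuristics named in the abstract, uses a synthetic two-class dataset, and builds a toy majority-vote ensemble of k-nearest-neighbour members. All names (`make_data`, `knn_predict`, `ensemble_accuracy`, `random_search`), the candidate values and the 1-second budget are assumptions made for this example.

```python
import random
import time
from collections import Counter


def make_data(n, rng):
    """Synthetic toy data (assumption): two noisy 2-D Gaussian blobs, labels 0 and 1."""
    data = []
    for i in range(n):
        label = i % 2
        center = 0.0 if label == 0 else 2.0
        data.append(((rng.gauss(center, 1.0), rng.gauss(center, 1.0)), label))
    return data


def knn_predict(train, x, k):
    """Plain k-nearest-neighbour vote using squared Euclidean distance."""
    nearest = sorted(
        train,
        key=lambda p: (p[0][0] - x[0]) ** 2 + (p[0][1] - x[1]) ** 2,
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def ensemble_accuracy(config, train, valid):
    """config is a tuple of k values, one per member; members vote by majority."""
    correct = 0
    for x, y in valid:
        votes = Counter(knn_predict(train, x, k) for k in config)
        correct += votes.most_common(1)[0][0] == y
    return correct / len(valid)


def random_search(train, valid, budget_seconds, rng, max_evals=50):
    """Random search over ensemble size and each member's k, bounded by a
    wall-clock budget and an evaluation cap; at least one candidate is tried."""
    deadline = time.monotonic() + budget_seconds
    best_config, best_acc = None, -1.0
    evals = 0
    while evals < max_evals and (best_config is None or time.monotonic() < deadline):
        size = rng.choice([1, 3, 5])  # odd sizes avoid tied votes with two classes
        config = tuple(rng.choice([1, 3, 5, 7, 9]) for _ in range(size))
        acc = ensemble_accuracy(config, train, valid)
        if acc > best_acc:
            best_config, best_acc = config, acc
        evals += 1
    return best_config, best_acc


rng = random.Random(0)
train, valid = make_data(60, rng), make_data(40, rng)
best_config, best_acc = random_search(train, valid, budget_seconds=1.0, rng=rng)
print("best ensemble (k per member):", best_config, "validation accuracy:", best_acc)
```

A metaheuristic such as a Genetic Algorithm or Tabu Search would replace the independent random draws with moves that depend on previously evaluated configurations. The time-bounded loop and the accuracy-based objective would stay the same in spirit.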


Metadata
Publisher: Springer Netherlands
Print ISSN: 1567-7818
Electronic ISSN: 1572-9796
DOI: https://doi.org/10.1007/s11047-020-09816-0