Published in: Pattern Analysis and Applications 1/2014

1 February 2014 | Theoretical Advances

Automatic classifier selection for non-experts

Written by: Matthias Reif, Faisal Shafait, Markus Goldstein, Thomas Breuel, Andreas Dengel

Abstract

Choosing a suitable classifier for a given dataset is an important part of developing a pattern recognition system. Since a large variety of classification algorithms have been proposed in the literature, non-experts do not know which method to use to obtain good classification results on their data. Meta-learning tries to address this problem by recommending promising classifiers based on meta-features computed from a given dataset. In this paper, we empirically evaluate five different categories of state-of-the-art meta-features for their suitability in predicting the classification accuracies of several widely used classifiers (including Support Vector Machines, Neural Networks, Random Forests, Decision Trees, and Logistic Regression). Based on the evaluation results, we have developed the first open source meta-learning system that is capable of accurately predicting the accuracies of target classifiers. The user provides a dataset as input and, in a few simple steps, gets an automatically created, high-performance, ready-to-use pattern recognition system. A user study of the system with non-experts showed that the users were able to develop more accurate pattern recognition systems in significantly less development time when using our system than when using state-of-the-art data mining software.
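The idea of predicting classifier accuracy from meta-features can be illustrated with a minimal sketch. This is not the authors' system: the four meta-features below (log sample count, log feature count, number of classes, class entropy) are chosen only for illustration and are not necessarily among those evaluated in the paper, and the nearest-neighbour meta-regressor is a swapped-in stand-in for whatever model the system uses. The function names `meta_features`, `predict_accuracy`, and `recommend`, as well as the structure of the knowledge base, are hypothetical.

```python
import math
from collections import Counter


def meta_features(X, y):
    """Describe a dataset by a few illustrative meta-features (an assumption,
    not the feature set from the paper)."""
    n_samples = len(X)
    n_features = len(X[0])
    class_counts = Counter(y)
    probs = [count / n_samples for count in class_counts.values()]
    class_entropy = -sum(p * math.log2(p) for p in probs)
    return [
        math.log10(n_samples),     # dataset size, log-scaled
        math.log10(n_features),    # dimensionality, log-scaled
        float(len(class_counts)),  # number of classes
        class_entropy,             # class balance in bits
    ]


def predict_accuracy(query, knowledge_base, k=2):
    """Average the recorded accuracies of the k known datasets whose
    meta-feature vectors are closest (Euclidean distance) to the query."""
    ranked = sorted(knowledge_base, key=lambda entry: math.dist(query, entry[0]))
    nearest = ranked[:k]
    return sum(accuracy for _, accuracy in nearest) / len(nearest)


def recommend(query, knowledge_bases, k=2):
    """Return the (classifier name, predicted accuracy) pair with the
    highest predicted accuracy for the query dataset."""
    predictions = {
        name: predict_accuracy(query, kb, k)
        for name, kb in knowledge_bases.items()
    }
    best = max(predictions, key=predictions.get)
    return best, predictions[best]
```

Here `knowledge_bases` maps each candidate classifier to a list of `(meta_feature_vector, measured_accuracy)` pairs collected from datasets on which that classifier has already been evaluated. For a new dataset, `recommend(meta_features(X, y), knowledge_bases)` returns the classifier with the highest predicted accuracy without training any candidate on the new data.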

Metadata
Title
Automatic classifier selection for non-experts
Written by
Matthias Reif
Faisal Shafait
Markus Goldstein
Thomas Breuel
Andreas Dengel
Publication date
1 February 2014
Publisher
Springer London
Published in
Pattern Analysis and Applications / Issue 1/2014
Print ISSN: 1433-7541
Electronic ISSN: 1433-755X
DOI
https://doi.org/10.1007/s10044-012-0280-z