Published in: International Journal of Machine Learning and Cybernetics 6/2017

27-07-2016 | Original Article

Predicting run time of classification algorithms using meta-learning

Authors: Tri Doan, Jugal Kalita


Abstract

Selecting the right classification algorithm is an important step in the success of any data mining project. Run time can be used to assess the efficiency of a classification algorithm of interest, but experimenting with several algorithms increases the cost of a data mining project. Using the idea of meta-learning, we present an approach to estimate the run time of a particular classification algorithm on an arbitrary dataset. Our work provides a way to evaluate the choice of an algorithm without executing it experimentally. We demonstrate that multivariate adaptive regression splines significantly outperform other regression models, including a 'state-of-the-art' algorithm such as support vector regression.
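The approach described in the abstract follows a standard meta-learning pipeline: compute meta-features characterizing each training dataset, record the measured run times of a target classifier on those datasets, and fit a regression model that maps meta-features to run time for an unseen dataset. The sketch below is a minimal illustration of that pipeline, not the paper's actual method: it uses scikit-learn's support vector regression (the baseline the paper compares against) because no MARS implementation ships with scikit-learn, and the three meta-features and the synthetic run times are placeholders of our own invention.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Synthetic meta-features for 200 hypothetical "datasets":
# [n_instances, n_attributes, n_classes] stand in for the richer
# dataset characteristics a real meta-learner would compute.
X_meta = np.column_stack([
    rng.integers(100, 100_000, 200),  # number of instances
    rng.integers(2, 500, 200),        # number of attributes
    rng.integers(2, 20, 200),         # number of classes
]).astype(float)

# Assumed ground truth: run time grows roughly with
# instances * attributes, plus measurement noise.
y_runtime = (X_meta[:, 0] * X_meta[:, 1]) / 1e5 + rng.normal(0.0, 1.0, 200)

# Fit a regressor from meta-features to run time. The paper reports
# that multivariate adaptive regression splines (MARS) outperforms
# SVR; SVR is used here only because it is readily available.
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=100.0))
model.fit(X_meta, y_runtime)

# Estimate run time on a new, unseen dataset's meta-features
# without ever executing the classifier on it.
new_dataset = np.array([[50_000, 120, 5]], dtype=float)
predicted_seconds = model.predict(new_dataset)
print(f"predicted run time: {predicted_seconds[0]:.2f} s")
```

Once trained on enough (meta-features, run time) pairs, such a model lets a practitioner rank candidate algorithms by estimated cost before committing compute to any of them, which is the cost-saving the abstract claims.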

Metadata
Title
Predicting run time of classification algorithms using meta-learning
Authors
Tri Doan
Jugal Kalita
Publication date
27-07-2016
Publisher
Springer Berlin Heidelberg
Published in
International Journal of Machine Learning and Cybernetics / Issue 6/2017
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-016-0571-6
