Published in: International Journal of Machine Learning and Cybernetics 6/2017

27-07-2016 | Original Article

Predicting run time of classification algorithms using meta-learning

Authors: Tri Doan, Jugal Kalita


Abstract

Selecting the right classification algorithm is an important step in the success of any data mining project. Run time can be used to assess the efficiency of a classification algorithm of interest, but experimenting with several algorithms increases the cost of a data mining project. Using the idea of meta-learning, we present an approach to estimate the run time of a particular classification algorithm on an arbitrary dataset. Our work provides a way to evaluate the choice of an algorithm without executing it experimentally. We demonstrate that multivariate adaptive regression splines significantly outperform other regression models, including a 'state-of-the-art' algorithm such as support vector regression.
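The approach described in the abstract follows a standard meta-learning pipeline: compute meta-features characterizing each training dataset, record the measured run times of a target classifier on those datasets, and fit a regression model that maps meta-features to run time for an unseen dataset. The sketch below is a minimal illustration of that pipeline, not the paper's actual method: it uses scikit-learn's support vector regression (the baseline the paper compares against) because no MARS implementation ships with scikit-learn, and the three meta-features and the synthetic run times are placeholders of our own invention.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Synthetic meta-features for 200 hypothetical "datasets":
# [n_instances, n_attributes, n_classes] stand in for the richer
# dataset characteristics a real meta-learner would compute.
X_meta = np.column_stack([
    rng.integers(100, 100_000, 200),  # number of instances
    rng.integers(2, 500, 200),        # number of attributes
    rng.integers(2, 20, 200),         # number of classes
]).astype(float)

# Assumed ground truth: run time grows roughly with
# instances * attributes, plus measurement noise.
y_runtime = (X_meta[:, 0] * X_meta[:, 1]) / 1e5 + rng.normal(0.0, 1.0, 200)

# Fit a regressor from meta-features to run time. The paper reports
# that multivariate adaptive regression splines (MARS) outperforms
# SVR; SVR is used here only because it is readily available.
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=100.0))
model.fit(X_meta, y_runtime)

# Estimate run time on a new, unseen dataset's meta-features
# without ever executing the classifier on it.
new_dataset = np.array([[50_000, 120, 5]], dtype=float)
predicted_seconds = model.predict(new_dataset)
print(f"predicted run time: {predicted_seconds[0]:.2f} s")
```

Once trained on enough (meta-features, run time) pairs, such a model lets a practitioner rank candidate algorithms by estimated cost before committing compute to any of them, which is the cost-saving the abstract claims.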

Metadata
Title
Predicting run time of classification algorithms using meta-learning
Authors
Tri Doan
Jugal Kalita
Publication date
27-07-2016
Publisher
Springer Berlin Heidelberg
Published in
International Journal of Machine Learning and Cybernetics / Issue 6/2017
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-016-0571-6
