Published in: Advances in Data Analysis and Classification 1/2016

01.03.2016 | Regular Article

Extreme logistic regression

Authors: Che Ngufor, Janusz Wojtusiak


Abstract

Kernel logistic regression (KLR) is a powerful algorithm that has been shown to be competitive with many state-of-the-art machine learning algorithms such as support vector machines (SVM). Unlike SVM, KLR can be easily extended to multi-class problems and produces class posterior probability estimates, making it very useful for many real-world applications. However, training KLR using gradient-based methods or iterative re-weighted least squares can be unbearably slow for large datasets. Coupled with poor conditioning of the design matrix and the burden of parameter tuning, training KLR can quickly become infeasible for some real datasets. The goal of this paper is to present simple, fast, scalable, and efficient algorithms for learning KLR. First, based on a simple approximation of the logistic function, a least squares algorithm for KLR is derived that avoids the iterative updates of gradient-based methods. Second, inspired by extreme learning machine (ELM) theory, an explicit feature space is constructed through a generalized single hidden layer feedforward network and used for training iterative re-weighted least squares KLR (IRLS-KLR) and the newly proposed least squares KLR (LS-KLR). Finally, for large-scale and/or poorly conditioned problems, a robust and efficient preconditioned learning technique is proposed for training the algorithms presented in the paper. Numerical results on a series of artificial datasets and 12 real benchmark datasets show, first, that LS-KLR compares favorably with SVM and traditional IRLS-KLR in terms of accuracy and learning speed. Second, the extension of ELM to KLR yields simple, scalable and very fast algorithms whose generalization performance is comparable to that of their original versions. Finally, the introduced preconditioned learning method can significantly increase the learning speed of IRLS-KLR.
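
For context, the IRLS procedure that the proposed least squares formulation is designed to avoid repeatedly solves a weighted linear system. In standard binary KLR with kernel matrix K, dual coefficients α, labels y_i ∈ {0, 1}, fitted probabilities p_i = σ((Kα)_i) where σ(t) = 1/(1 + e^{−t}), and ridge parameter λ, each Newton step takes the textbook form below (the generic update, not necessarily the exact variant implemented in the paper):

$$
\alpha^{(t+1)} = (WK + \lambda I)^{-1} W z, \qquad
z = K\alpha^{(t)} + W^{-1}(y - p), \qquad
W = \operatorname{diag}\big(p_i(1 - p_i)\big).
$$

Every iteration is thus a dense n × n solve with a system matrix that changes as W is re-weighted, which is where both the slow training and the conditioning issues mentioned above originate; replacing this loop with a single least squares solve is the motivation for LS-KLR.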
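
To make the ELM construction concrete, below is a minimal sketch in Python, assuming sigmoid hidden units, Gaussian random weights, {0, 1} targets and a plain ridge solve. The function names (elm_features, fit_least_squares) and all hyperparameter values are illustrative only; the paper's LS-KLR additionally derives its least squares system from an approximation of the logistic function so that probability estimates are retained, a step not reproduced here.

    import numpy as np

    def elm_features(X, n_hidden=200, seed=0):
        """ELM-style random feature map: a single hidden layer whose
        input weights and biases are drawn at random and never trained."""
        rng = np.random.default_rng(seed)
        W = rng.normal(size=(X.shape[1], n_hidden))  # random input weights
        b = rng.normal(size=n_hidden)                # random biases
        return 1.0 / (1.0 + np.exp(-(X @ W + b)))    # sigmoid activations

    def fit_least_squares(H, y, lam=1e-2):
        """Ridge-regularized least squares fit of the output weights:
        one linear solve replaces the IRLS loop."""
        A = H.T @ H + lam * np.eye(H.shape[1])
        return np.linalg.solve(A, H.T @ y)

    # Toy usage: binary labels in {0, 1}, class predicted by
    # thresholding the linear score at 0.5.
    X = np.random.default_rng(1).normal(size=(500, 10))
    y = (X[:, 0] + X[:, 1] > 0).astype(float)
    H = elm_features(X)
    beta = fit_least_squares(H, y)
    accuracy = ((H @ beta > 0.5) == y.astype(bool)).mean()

The design point that makes this fast is that only the output weights are fitted: because the hidden layer is random and fixed, training reduces to a single m-by-m solve, where m is the number of hidden nodes, instead of a dense n-by-n kernel solve.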


Metadata
Title
Extreme logistic regression
Authors
Che Ngufor
Janusz Wojtusiak
Publication date
01.03.2016
Publisher
Springer Berlin Heidelberg
Published in
Advances in Data Analysis and Classification / Issue 1/2016
Print ISSN: 1862-5347
Electronic ISSN: 1862-5355
DOI
https://doi.org/10.1007/s11634-014-0194-2
