
2013 | OriginalPaper | Chapter

13. Nonlinear Classification Models

Authors: Max Kuhn, Kjell Johnson

Published in: Applied Predictive Modeling

Publisher: Springer New York


Abstract

Chapter 12 discussed classification models that define linear classification boundaries. In this chapter we present models that generate nonlinear boundaries. We begin by explaining several generalizations of the linear discriminant analysis framework, such as quadratic discriminant analysis, regularized discriminant analysis, and mixture discriminant analysis (Section 13.1). Other nonlinear classification models include neural networks (Section 13.2), flexible discriminant analysis (Section 13.3), support vector machines (Section 13.4), K-nearest neighbors (Section 13.5), and naive Bayes (Section 13.6). In the Computing Section (13.7) we demonstrate how to train each of these models in R. Finally, exercises are provided at the end of the chapter to solidify the concepts.
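
As a brief, hedged preview of the Computing Section (this is an illustrative sketch, not the chapter's own code), the snippet below simulates a two-class problem with a curved boundary and fits two of the models discussed here: quadratic discriminant analysis via MASS::qda and a radial basis function SVM via kernlab::ksvm. The simulated data and tuning values (e.g., C = 1) are assumptions made for illustration.

    library(MASS)     # qda()
    library(kernlab)  # ksvm()

    ## Simulate a two-class problem whose true boundary is a circle,
    ## so no linear classifier can separate the classes well.
    set.seed(975)
    x <- matrix(rnorm(400), ncol = 2)
    y <- factor(ifelse(x[, 1]^2 + x[, 2]^2 > 1.5, "outer", "inner"))
    dat <- data.frame(x1 = x[, 1], x2 = x[, 2], class = y)

    ## Quadratic discriminant analysis (Section 13.1): class-specific
    ## covariance matrices yield quadratic decision boundaries.
    qdaFit <- qda(class ~ x1 + x2, data = dat)

    ## Radial basis function SVM (Section 13.4); prob.model = TRUE
    ## additionally fits a sigmoid model for class probabilities.
    svmFit <- ksvm(class ~ x1 + x2, data = dat,
                   kernel = "rbfdot", C = 1, prob.model = TRUE)

    predict(qdaFit, dat[1:3, ])$class   # QDA class predictions
    predict(svmFit, dat[1:3, ])         # SVM class predictions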


Footnotes
1
However, MARS and FDA models tend to be more stable than tree-based models since they use linear regression to estimate the model parameters.
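
A minimal sketch of this point (assuming the mda package and simulated data, not the chapter's own example): fda() with method = mars expands the predictors into MARS hinge functions, but the discriminant coefficients are still estimated by a linear least squares fit via optimal scoring, which is the source of the stability noted above.

    library(mda)   # fda() and mars()

    ## Simulated two-class data with an interaction-driven boundary.
    set.seed(101)
    dat <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
    dat$class <- factor(ifelse(dat$x1 * dat$x2 > 0, "A", "B"))

    ## FDA with a MARS basis: the hinge features are nonlinear, but
    ## the model parameters come from a linear regression.
    fdaFit <- fda(class ~ x1 + x2, data = dat, method = mars)
    confusion(fdaFit, dat)   # apparent (training set) confusion matrix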
 
2
Recall a similar situation with support vector regression models where the prediction function was determined by the samples with the largest residuals.
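
A small sketch of this analogy (simulated data; the accessor functions are kernlab's, not the text's): after an SVM fit, only the support vectors enter the prediction function, and kernlab exposes which training samples those are.

    library(kernlab)

    ## Simulated two-class data.
    set.seed(202)
    dat <- data.frame(x1 = rnorm(150), x2 = rnorm(150))
    dat$class <- factor(ifelse(dat$x1^2 + dat$x2 > 0, "A", "B"))

    fit <- ksvm(class ~ x1 + x2, data = dat, kernel = "rbfdot", C = 1)

    ## The prediction equation uses only these samples; points well
    ## outside the margin get zero weight, mirroring how SVR is
    ## driven by the samples with the largest residuals.
    nSV(fit)             # number of support vectors
    head(SVindex(fit))   # row indices of the support vectors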
 
Metadata
Title
Nonlinear Classification Models
Authors
Max Kuhn
Kjell Johnson
Copyright Year
2013
Publisher
Springer New York
DOI
https://doi.org/10.1007/978-1-4614-6849-3_13
