2013 | Original Paper | Book Chapter

14. Classification Trees and Rule-Based Models

Authors: Max Kuhn, Kjell Johnson

Published in: Applied Predictive Modeling

Publisher: Springer New York


Abstract

Classification trees fall within the family of tree-based models and, similar to regression trees (Chapter 8), consist of nested if-then statements. Classification trees and rules are basic partitioning models and are covered in Sections 14.1 and 14.2, respectively. Ensemble methods combine many trees (or rules) into one model and tend to have much better predictive performance than a single tree- or rule-based model. Popular ensemble techniques are bagging (Section 14.3), random forests (Section 14.4), boosting (Section 14.5), and C5.0 (Section 14.6). In Section 14.7 we compare the model results from two different encodings for the categorical predictors. Then in Section 14.8, we demonstrate how to train each of these models in R. Finally, exercises are provided at the end of the chapter to solidify the concepts.
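
Since the chapter closes with R code, a minimal sketch of fitting one single-tree model and one ensemble may help fix ideas. This is illustrative only: the training and testing data frames and the two-class factor outcome Class are hypothetical placeholders, and Section 14.8 covers the tuned, resampled versions of these fits.

    ## Hypothetical data: 'training'/'testing' data frames with a
    ## two-class factor outcome 'Class'.
    library(rpart)           # CART-style classification trees
    library(randomForest)    # bagged trees with random predictor selection

    ## A single classification tree, pruned along the default
    ## cost-complexity path
    rpart_fit <- rpart(Class ~ ., data = training)

    ## A random forest; ntree is typically set large (1000 here) and the
    ## number of randomly selected predictors (mtry) is tuned in practice
    rf_fit <- randomForest(Class ~ ., data = training, ntree = 1000)

    ## Hold-out class predictions from the ensemble
    rf_pred <- predict(rf_fit, newdata = testing)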


Footnotes
1. See Breiman (1996c) for a discussion of the technical nuances of splitting algorithms.
 
2
An alternate way to think of this is in terms of entropy, a measure of uncertainty. When the classes are balanced 50/50, we have no real ability to guess the outcome: it is as uncertain as possible. However, if ten samples were in class 1, we would have less uncertainty since it is more likely that a random data point would be in class 1.
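For a two-class problem the entropy is H = -[p log2(p) + (1 - p) log2(1 - p)] bits, where p is the class 1 proportion. A short illustrative R snippet follows; the twelve-sample total is an assumed figure for the example, not one from the text:

    ## Entropy (in bits) of a two-class split, where p is the
    ## proportion of samples in class 1; 0 * log2(0) is taken as 0.
    entropy <- function(p) {
      p <- c(p, 1 - p)
      p <- p[p > 0]
      -sum(p * log2(p))
    }
    entropy(0.5)      # balanced 50/50: 1 bit, maximal uncertainty
    entropy(10 / 12)  # ten of an assumed twelve in class 1: ~0.65 bits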
 
3. Also known as the mutual information statistic. This statistic is discussed again in Chap. 18.
 
4. By default, C4.5 uses a simple binary split of continuous predictors. However, Quinlan (1993b) also describes a technique called soft thresholding that treats values near the split point differently. For brevity, this is not discussed further here.
 
5. Because a weak classifier is used, the stage values are often close to zero.
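As a rough illustration, assuming an AdaBoost-style stage value of ln((1 - err)/err) (some formulations include a factor of 1/2):

    ## Stage value as a function of a stage's error rate; a weak
    ## classifier with error just under 0.5 contributes nearly zero.
    stage_value <- function(err) log((1 - err) / err)
    stage_value(0.45)  # barely better than chance: ~0.20
    stage_value(0.10)  # a strong classifier: ~2.20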
 
6. An example of this type of argument is shown in Sect. 16.9, where rpart is fit using differential costs for different types of errors.
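A minimal sketch of how such costs can be passed to rpart through its parms argument; the data frame, outcome name, and loss-matrix values here are illustrative, not those used in Sect. 16.9:

    ## Hypothetical two-class data frame 'training' with outcome 'Class'.
    ## In the loss matrix, rows index the true class and columns the
    ## predicted class; misclassifying a first-level sample costs five
    ## times the reverse error. Diagonal entries must be zero.
    library(rpart)
    cost_fit <- rpart(Class ~ ., data = training,
                      parms = list(loss = matrix(c(0, 5,
                                                   1, 0),
                                                 nrow = 2, byrow = TRUE)))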
 
References
Bauer E, Kohavi R (1999). "An Empirical Comparison of Voting Classification Algorithms: Bagging, Boosting, and Variants." Machine Learning, 36, 105–142.
Breiman L (1996c). "Technical Note: Some Properties of Splitting Criteria." Machine Learning, 24(1), 41–47.
Breiman L (2000). "Randomizing Outputs to Increase Prediction Accuracy." Machine Learning, 40, 229–242.
Breiman L, Friedman J, Olshen R, Stone C (1984). Classification and Regression Trees. Chapman and Hall, New York.
Chan K, Loh W (2004). "LOTUS: An Algorithm for Building Accurate and Comprehensible Logistic Regression Trees." Journal of Computational and Graphical Statistics, 13(4), 826–852.
Cover TM, Thomas JA (2006). Elements of Information Theory. Wiley–Interscience.
Frank E, Wang Y, Inglis S, Holmes G (1998). "Using Model Trees for Classification." Machine Learning.
Frank E, Witten I (1998). "Generating Accurate Rule Sets Without Global Optimization." Proceedings of the Fifteenth International Conference on Machine Learning, pp. 144–151.
Freund Y, Schapire R (1996). "Experiments with a New Boosting Algorithm." Machine Learning: Proceedings of the Thirteenth International Conference, pp. 148–156.
Friedman J, Hastie T, Tibshirani R (2000). "Additive Logistic Regression: A Statistical View of Boosting." Annals of Statistics, 38, 337–374.
Hastie T, Tibshirani R, Friedman J (2008). The Elements of Statistical Learning: Data Mining, Inference and Prediction. 2nd edition. Springer.
Hothorn T, Hornik K, Zeileis A (2006). "Unbiased Recursive Partitioning: A Conditional Inference Framework." Journal of Computational and Graphical Statistics, 15(3), 651–674.
Johnson K, Rayens W (2007). "Modern Classification Methods for Drug Discovery." In A Dmitrienko, C Chuang-Stein, R D'Agostino (eds.), Pharmaceutical Statistics Using SAS: A Practical Guide, pp. 7–43. SAS Institute Inc., Cary, NC.
Kearns M, Valiant L (1989). "Cryptographic Limitations on Learning Boolean Formulae and Finite Automata." In Proceedings of the Twenty-First Annual ACM Symposium on Theory of Computing.
Loh WY (2002). "Regression Trees With Unbiased Variable Selection and Interaction Detection." Statistica Sinica, 12, 361–386.
Menze B, Kelm B, Splitthoff D, Koethe U, Hamprecht F (2011). "On Oblique Random Forests." Machine Learning and Knowledge Discovery in Databases, pp. 453–469.
Molinaro A, Lostritto K, Van Der Laan M (2010). "partDSA: Deletion/Substitution/Addition Algorithm for Partitioning the Covariate Space in Prediction." Bioinformatics, 26(10), 1357–1363.
Ozuysal M, Calonder M, Lepetit V, Fua P (2010). "Fast Keypoint Recognition Using Random Ferns." IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(3), 448–461.
Quinlan R (1987). "Simplifying Decision Trees." International Journal of Man–Machine Studies, 27(3), 221–234.
Quinlan R (1993b). C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers.
Quinlan R (1996a). "Bagging, Boosting, and C4.5." In Proceedings of the Thirteenth National Conference on Artificial Intelligence.
Quinlan R (1996b). "Improved Use of Continuous Attributes in C4.5." Journal of Artificial Intelligence Research, 4, 77–90.
Quinlan R, Rivest R (1989). "Inferring Decision Trees Using the Minimum Description Length Principle." Information and Computation, 80(3), 227–248.
Ruczinski I, Kooperberg C, Leblanc M (2003). "Logic Regression." Journal of Computational and Graphical Statistics, 12(3), 475–511.
Schapire R (1990). "The Strength of Weak Learnability." Machine Learning, 45, 197–227.
Valiant L (1984). "A Theory of the Learnable." Communications of the ACM, 27, 1134–1142.
Wallace C (2005). Statistical and Inductive Inference by Minimum Message Length. Springer–Verlag.
Zeileis A, Hothorn T, Hornik K (2008). "Model-Based Recursive Partitioning." Journal of Computational and Graphical Statistics, 17(2), 492–514.
Metadata
Title
Classification Trees and Rule-Based Models
Authors
Max Kuhn
Kjell Johnson
Copyright Year
2013
Publisher
Springer New York
DOI
https://doi.org/10.1007/978-1-4614-6849-3_14
