2013 | Original Paper | Book Chapter

14. Classification Trees and Rule-Based Models

Authors: Max Kuhn, Kjell Johnson

Published in: Applied Predictive Modeling

Publisher: Springer New York


Abstract

Classification trees fall within the family of tree-based models and, similar to regression trees (Chapter 8), consist of nested if-then statements. Classification trees and rules are basic partitioning models and are covered in Sections 14.1 and 14.2, respectively. Ensemble methods combine many trees (or rules) into one model and tend to have much better predictive performance than a single tree- or rule-based model. Popular ensemble techniques are bagging (Section 14.3), random forests (Section 14.4), boosting (Section 14.5), and C5.0 (Section 14.6). In Section 14.7 we compare the model results from two different encodings for the categorical predictors. Then in Section 14.8, we demonstrate how to train each of these models in R. Finally, exercises are provided at the end of the chapter to solidify the concepts.
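
Since the chapter closes with R code, a minimal sketch of fitting one single-tree model and one ensemble may help fix ideas. This is illustrative only: the training and testing data frames and the two-class factor outcome Class are hypothetical placeholders, and Section 14.8 covers the tuned, resampled versions of these fits.

    ## Hypothetical data: 'training'/'testing' data frames with a
    ## two-class factor outcome 'Class'.
    library(rpart)           # CART-style classification trees
    library(randomForest)    # bagged trees with random predictor selection

    ## A single classification tree, pruned along the default
    ## cost-complexity path
    rpart_fit <- rpart(Class ~ ., data = training)

    ## A random forest; ntree is typically set large (1000 here) and the
    ## number of randomly selected predictors (mtry) is tuned in practice
    rf_fit <- randomForest(Class ~ ., data = training, ntree = 1000)

    ## Hold-out class predictions from the ensemble
    rf_pred <- predict(rf_fit, newdata = testing)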


Footnotes
1. See Breiman (1996c) for a discussion of the technical nuances of splitting algorithms.
 
2
An alternate way to think of this is in terms of entropy, a measure of uncertainty. When the classes are balanced 50/50, we have no real ability to guess the outcome: it is as uncertain as possible. However, if ten samples were in class 1, we would have less uncertainty since it is more likely that a random data point would be in class 1.
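For a two-class problem the entropy is H = -[p log2(p) + (1 - p) log2(1 - p)] bits, where p is the class 1 proportion. A short illustrative R snippet follows; the twelve-sample total is an assumed figure for the example, not one from the text:

    ## Entropy (in bits) of a two-class split, where p is the
    ## proportion of samples in class 1; 0 * log2(0) is taken as 0.
    entropy <- function(p) {
      p <- c(p, 1 - p)
      p <- p[p > 0]
      -sum(p * log2(p))
    }
    entropy(0.5)      # balanced 50/50: 1 bit, maximal uncertainty
    entropy(10 / 12)  # ten of an assumed twelve in class 1: ~0.65 bits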
 
3. Also known as the mutual information statistic. This statistic is discussed again in Chap. 18.
 
4. By default, C4.5 uses a simple binary split of continuous predictors. However, Quinlan (1993b) also describes a technique called soft thresholding that treats values near the split point differently. For brevity, this is not discussed further here.
 
5. Because a weak classifier is used, the stage values are often close to zero.
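As a rough illustration, assuming an AdaBoost-style stage value of ln((1 - err)/err) (some formulations include a factor of 1/2):

    ## Stage value as a function of a stage's error rate; a weak
    ## classifier with error just under 0.5 contributes nearly zero.
    stage_value <- function(err) log((1 - err) / err)
    stage_value(0.45)  # barely better than chance: ~0.20
    stage_value(0.10)  # a strong classifier: ~2.20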
 
6. An example of this type of argument is shown in Sect. 16.9, where rpart is fit using differential costs for different types of errors.
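A minimal sketch of how such costs can be passed to rpart through its parms argument; the data frame, outcome name, and loss-matrix values here are illustrative, not those used in Sect. 16.9:

    ## Hypothetical two-class data frame 'training' with outcome 'Class'.
    ## In the loss matrix, rows index the true class and columns the
    ## predicted class; misclassifying a first-level sample costs five
    ## times the reverse error. Diagonal entries must be zero.
    library(rpart)
    cost_fit <- rpart(Class ~ ., data = training,
                      parms = list(loss = matrix(c(0, 5,
                                                   1, 0),
                                                 nrow = 2, byrow = TRUE)))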
 
References
Bauer E, Kohavi R (1999). "An Empirical Comparison of Voting Classification Algorithms: Bagging, Boosting, and Variants." Machine Learning, 36, 105–142.
Breiman L (1996c). "Technical Note: Some Properties of Splitting Criteria." Machine Learning, 24(1), 41–47.
Breiman L (2000). "Randomizing Outputs to Increase Prediction Accuracy." Machine Learning, 40, 229–242.
Breiman L, Friedman J, Olshen R, Stone C (1984). Classification and Regression Trees. Chapman and Hall, New York.
Chan K, Loh W (2004). "LOTUS: An Algorithm for Building Accurate and Comprehensible Logistic Regression Trees." Journal of Computational and Graphical Statistics, 13(4), 826–852.
Cover TM, Thomas JA (2006). Elements of Information Theory. Wiley–Interscience.
Frank E, Wang Y, Inglis S, Holmes G (1998). "Using Model Trees for Classification." Machine Learning.
Frank E, Witten I (1998). "Generating Accurate Rule Sets Without Global Optimization." Proceedings of the Fifteenth International Conference on Machine Learning, pp. 144–151.
Freund Y, Schapire R (1996). "Experiments with a New Boosting Algorithm." Machine Learning: Proceedings of the Thirteenth International Conference, pp. 148–156.
Friedman J, Hastie T, Tibshirani R (2000). "Additive Logistic Regression: A Statistical View of Boosting." Annals of Statistics, 38, 337–374.
Hastie T, Tibshirani R, Friedman J (2008). The Elements of Statistical Learning: Data Mining, Inference and Prediction. 2nd edition. Springer.
Hothorn T, Hornik K, Zeileis A (2006). "Unbiased Recursive Partitioning: A Conditional Inference Framework." Journal of Computational and Graphical Statistics, 15(3), 651–674.
Johnson K, Rayens W (2007). "Modern Classification Methods for Drug Discovery." In A Dmitrienko, C Chuang-Stein, R D'Agostino (eds.), Pharmaceutical Statistics Using SAS: A Practical Guide, pp. 7–43. SAS Institute Inc., Cary, NC.
Kearns M, Valiant L (1989). "Cryptographic Limitations on Learning Boolean Formulae and Finite Automata." In Proceedings of the Twenty-First Annual ACM Symposium on Theory of Computing.
Loh WY (2002). "Regression Trees With Unbiased Variable Selection and Interaction Detection." Statistica Sinica, 12, 361–386.
Menze B, Kelm B, Splitthoff D, Koethe U, Hamprecht F (2011). "On Oblique Random Forests." Machine Learning and Knowledge Discovery in Databases, pp. 453–469.
Molinaro A, Lostritto K, Van Der Laan M (2010). "partDSA: Deletion/Substitution/Addition Algorithm for Partitioning the Covariate Space in Prediction." Bioinformatics, 26(10), 1357–1363.
Ozuysal M, Calonder M, Lepetit V, Fua P (2010). "Fast Keypoint Recognition Using Random Ferns." IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(3), 448–461.
Quinlan R (1987). "Simplifying Decision Trees." International Journal of Man–Machine Studies, 27(3), 221–234.
Quinlan R (1993b). C4.5: Programs for Machine Learning. Morgan Kaufmann Publishers.
Quinlan R (1996a). "Bagging, Boosting, and C4.5." In Proceedings of the Thirteenth National Conference on Artificial Intelligence.
Quinlan R (1996b). "Improved Use of Continuous Attributes in C4.5." Journal of Artificial Intelligence Research, 4, 77–90.
Quinlan R, Rivest R (1989). "Inferring Decision Trees Using the Minimum Description Length Principle." Information and Computation, 80(3), 227–248.
Ruczinski I, Kooperberg C, Leblanc M (2003). "Logic Regression." Journal of Computational and Graphical Statistics, 12(3), 475–511.
Schapire R (1990). "The Strength of Weak Learnability." Machine Learning, 45, 197–227.
Valiant L (1984). "A Theory of the Learnable." Communications of the ACM, 27, 1134–1142.
Wallace C (2005). Statistical and Inductive Inference by Minimum Message Length. Springer–Verlag.
Zeileis A, Hothorn T, Hornik K (2008). "Model-Based Recursive Partitioning." Journal of Computational and Graphical Statistics, 17(2), 492–514.
Metadata
Title
Classification Trees and Rule-Based Models
Authors
Max Kuhn
Kjell Johnson
Copyright Year
2013
Publisher
Springer New York
DOI
https://doi.org/10.1007/978-1-4614-6849-3_14
