
2013 | OriginalPaper | Chapter

5. Tree-Based Methods

Authors: Chris Aldrich, Lidia Auret

Published in: Unsupervised Process Monitoring and Fault Diagnosis with Machine Learning Methods

Publisher: Springer London


Abstract

In this chapter, tree-based methods are discussed as another of the three major machine learning paradigms considered in the book. This includes the basic information-theoretic approach used to construct classification and regression trees, together with a few simple examples illustrating the characteristics of decision tree models. A short introduction to ensemble theory and ensembles of decision trees follows, leading to random forest models, which are discussed in detail. Unsupervised learning with random forests in particular is reviewed, as these characteristics are potentially important in unsupervised fault diagnostic systems. The interpretation of random forest models includes a discussion of the assessment of variable importance, as well as partial dependence analysis to examine the relationship between predictor variables and the response variable. A brief review of boosted trees follows that of random forests, including concepts such as gradient boosting and the AdaBoost algorithm. The use of tree-based ensemble models is illustrated by examples on rotogravure printing and on the identification of defects in hot-rolled steel plate.
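Purely as an illustration of the workflow the abstract describes (fit an ensemble, rank variables, inspect a partial dependence), the following is a minimal scikit-learn sketch on synthetic data; it is not the chapter's own code, data, or case studies.

# Illustrative sketch only: random forest fitting, variable importance and
# partial dependence on synthetic data (not the chapter's case studies).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import partial_dependence

X, y = make_classification(n_samples=500, n_features=8,
                           n_informative=3, random_state=0)

# Out-of-bag predictions give an internal estimate of generalisation error.
rf = RandomForestClassifier(n_estimators=200, oob_score=True, random_state=0)
rf.fit(X, y)
print("OOB score:", rf.oob_score_)

# Mean decrease in impurity: one common variable importance measure.
order = np.argsort(rf.feature_importances_)[::-1]
for i in order:
    print(f"x{i}: importance = {rf.feature_importances_[i]:.3f}")

# Partial dependence of the prediction on the top-ranked variable.
pdp = partial_dependence(rf, X, features=[int(order[0])], kind="average")
print("partial dependence curve:", pdp["average"][0][:5])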


Footnotes
1. Binary splitting is considered here; the extension to multiple splits is trivial.
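In the notation of Breiman et al. (1984), a binary split s of node t into children t_L and t_R is chosen to maximise the impurity decrease

\Delta i(s, t) = i(t) - p_L \, i(t_L) - p_R \, i(t_R),

where p_L and p_R are the proportions of the samples in t sent to each child and i(\cdot) is the impurity function (e.g. cross-entropy or the Gini index for classification).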
 
2. The C4.5 algorithm (Quinlan 1993) scales the decrease in impurity for categorical input variables, since the cross-entropy impurity function is biased in favour of variables with many levels. The corrected impurity decrease is known as the gain ratio.
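In Quinlan's formulation, for an attribute A that splits the sample S into subsets S_1, ..., S_k, the gain ratio divides the raw information gain by the entropy of the split itself:

\mathrm{GainRatio}(A) = \frac{\mathrm{Gain}(A)}{\mathrm{SplitInfo}(A)},
\qquad
\mathrm{SplitInfo}(A) = -\sum_{i=1}^{k} \frac{|S_i|}{|S|} \log_2 \frac{|S_i|}{|S|},

so an attribute with many levels incurs a large SplitInfo denominator, counteracting the bias of the raw gain.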
 
3. See "The Elements of Statistical Learning" (Hastie et al. 2009) for details.
 
4. Shi and Horvath (2006) focused on the clustering utility of random forest proximities, a subtle difference from general feature extraction applications. Here, clustering refers to the ability of a feature extraction method to generate projections in which known clusters are separated, without using cluster information during training.
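A minimal sketch, in the spirit of Shi and Horvath (2006) but not the book's implementation, of unsupervised random forest proximities: a synthetic contrast class is built by sampling each variable independently from its marginal, a forest is trained to separate real from synthetic data, and the proximity of two observations is the fraction of trees in which they share a terminal node.

# Sketch (assumed workflow, not the book's code): unsupervised random
# forest proximities followed by an MDS embedding of 1 - proximity.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.manifold import MDS

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))          # observed (unlabelled) data

# Contrast class: permute each column independently, which preserves the
# marginals but destroys the dependence structure between variables.
X_synth = np.column_stack([rng.permutation(X[:, j]) for j in range(X.shape[1])])
X_all = np.vstack([X, X_synth])
y_all = np.concatenate([np.ones(len(X)), np.zeros(len(X_synth))])

rf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_all, y_all)

# Proximity = fraction of trees in which two observations land in the
# same leaf; computed here for the real observations only.
leaves = rf.apply(X)                   # shape (n_samples, n_trees)
prox = np.mean(leaves[:, None, :] == leaves[None, :, :], axis=2)

# Embed the dissimilarities 1 - proximity for visual cluster inspection.
emb = MDS(n_components=2, dissimilarity="precomputed",
          random_state=0).fit_transform(1.0 - prox)
print(emb.shape)                       # (200, 2)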
 
Literature
Amit, Y., & Geman, D. (1997). Shape quantization and recognition with randomized trees. Neural Computation, 9(7), 1545–1588.
Archer, K. J., & Kimes, R. V. (2008). Empirical characterization of random forest variable importance measures. Computational Statistics & Data Analysis, 52(4), 2249–2260.
Auret, L., & Aldrich, C. (2012). Interpretation of nonlinear relationships between process variables by use of random forests. Minerals Engineering, 35, 27–42.
Belson, W. A. (1959). Matching and prediction on the principle of biological classification. Journal of the Royal Statistical Society Series C (Applied Statistics), 8(2), 65–75.
Breiman, L., Friedman, J. H., Olshen, R., & Stone, C. J. (1984). Classification and regression trees. Belmont: Wadsworth.
Cox, T. F., & Cox, M. A. A. (2001). Multidimensional scaling. Boca Raton: Chapman & Hall.
Cutler, A., & Stevens, J. R. (2006). Random forests for microarrays. In Methods in enzymology: DNA microarrays, Part B: Databases and statistics (pp. 422–432). San Diego: Academic Press.
Dietterich, T. G. (2000a). An experimental comparison of three methods for constructing ensembles of decision trees: Bagging, boosting, and randomization. Machine Learning, 40(2), 139–157.
Evans, B., & Fisher, D. (1994). Overcoming process delays with decision tree induction. IEEE Expert, 9(1), 60–66.
Freund, Y., & Schapire, R. E. (1996). Experiments with a new boosting algorithm. In Proceedings of the Thirteenth International Conference on Machine Learning (ICML'96) (pp. 148–156).
Freund, Y., & Schapire, R. E. (1997). A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55(1), 119–139.
Friedman, J., Hastie, T., & Tibshirani, R. (2000). Additive logistic regression: A statistical view of boosting. The Annals of Statistics, 28(2), 337–374.
Gillo, M. W., & Shelly, M. W. (1974). Predictive modeling of multivariable and multivariate data. Journal of the American Statistical Association, 69(347), 646–653.
Hansen, L., & Salamon, P. (1990). Neural network ensembles. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(10), 993–1001.
Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning: Data mining, inference and prediction. New York: Springer.
Ho, T. K. (1995). Random decision forests. In Proceedings of the Third International Conference on Document Analysis and Recognition (ICDAR 1995) (pp. 278–282). Montreal: IEEE Computer Society.
Izenman, A. (2008). Modern multivariate statistical techniques: Regression, classification, and manifold learning. New York/London: Springer.
Kass, G. V. (1980). An exploratory technique for investigating large quantities of categorical data. Journal of the Royal Statistical Society Series C (Applied Statistics), 29(2), 119–127.
Messenger, R., & Mandell, L. (1972). A modal search technique for predictive nominal scale multivariate analysis. Journal of the American Statistical Association, 67(340), 768–772.
Morgan, J. N., & Sonquist, J. A. (1963). Problems in the analysis of survey data, and a proposal. Journal of the American Statistical Association, 58(302), 415–434.
Nicodemus, K. K., & Malley, J. D. (2009). Predictor correlation impacts machine learning algorithms: Implications for genomic studies. Bioinformatics, 25(15), 1884–1890.
Polikar, R. (2006). Ensemble based systems in decision making. IEEE Circuits and Systems Magazine, 6(3), 21–45.
Quinlan, J. (1986). Induction of decision trees. Machine Learning, 1(1), 81–106.
Quinlan, R. (1993). C4.5: Programs for machine learning. Palo Alto: Morgan Kaufmann.
Rätsch, G., Onoda, T., & Müller, K. (2001). Soft margins for AdaBoost. Machine Learning, 42(3), 287–320.
Sammon, J. W. (1969). A nonlinear mapping for data structure analysis. IEEE Transactions on Computers, C-18(5), 401–409.
Schapire, R. E. (1990). The strength of weak learnability. Machine Learning, 5(2), 197–227.
Schapire, R., Freund, Y., Bartlett, P., & Lee, W. (1998). Boosting the margin: A new explanation for the effectiveness of voting methods. The Annals of Statistics, 26(5), 1651–1686.
Shi, T., & Horvath, S. (2006). Unsupervised learning with random forest predictors. Journal of Computational and Graphical Statistics, 15(1), 118–138.
Strobl, C., Boulesteix, A., Kneib, T., Augustin, T., & Zeileis, A. (2008). Conditional variable importance for random forests. BMC Bioinformatics, 9(1), 307–317.
Strobl, C., Malley, J., & Tutz, G. (2009). An introduction to recursive partitioning: Rationale, application, and characteristics of classification and regression trees, bagging, and random forests. Psychological Methods, 14(4), 323–348.
Valiant, L. G. (1984). A theory of the learnable. Communications of the ACM, 27(11), 1134–1142.
Metadata
Title: Tree-Based Methods
Authors: Chris Aldrich, Lidia Auret
Copyright Year: 2013
Publisher: Springer London
DOI: https://doi.org/10.1007/978-1-4471-5185-2_5
