Open Access
December 2009

Improving the precision of classification trees
Wei-Yin Loh
Ann. Appl. Stat. 3(4): 1710-1737 (December 2009). DOI: 10.1214/09-AOAS260

Abstract

Besides serving as prediction models, classification trees are useful for finding important predictor variables and identifying interesting subgroups in the data. These functions can be compromised by weak split selection algorithms that have variable selection biases or that fail to search beyond local main effects at each node of the tree. The resulting models may include many irrelevant variables or select too few of the important ones. Either eventuality can lead to erroneous conclusions. Four techniques to improve the precision of the models are proposed and their effectiveness compared with that of other algorithms, including tree ensembles, on real and simulated data sets.
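The variable selection bias mentioned in the abstract can be seen even in a toy setting: greedy exhaustive-search trees (CART-style) evaluate every candidate split point, so predictors offering many split points can compete unfairly with simpler ones. The sketch below is illustrative only, using scikit-learn's `DecisionTreeClassifier` rather than the algorithms proposed in the paper; the data-generating setup and variable names are assumptions for the example.

```python
# Illustrative sketch (not the paper's proposed method): fit a CART-style
# tree and inspect which predictors it selects via impurity-based
# feature importances.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
n = 2000
# x_signal: binary predictor that truly drives the class label.
x_signal = rng.integers(0, 2, size=n)
# x_noise: continuous predictor with many candidate split points but no signal.
x_noise = rng.normal(size=n)
# Class label depends on x_signal only, flipped with 10% label noise.
y = (x_signal ^ (rng.random(n) < 0.1)).astype(int)

X = np.column_stack([x_signal, x_noise])
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# Column 0 is the signal variable, column 1 the noise variable.
print(tree.feature_importances_)
```

With a strong signal the tree ranks the relevant predictor first; the biases the paper addresses surface in subtler settings, e.g. when noise predictors have many categories or the signal is an interaction with no main effect.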

Citation


Wei-Yin Loh. "Improving the precision of classification trees." Ann. Appl. Stat. 3(4): 1710–1737, December 2009. https://doi.org/10.1214/09-AOAS260

Information

Published: December 2009
First available in Project Euclid: 1 March 2010

zbMATH: 1184.62109
MathSciNet: MR2752155
Digital Object Identifier: 10.1214/09-AOAS260

Keywords: bagging, discrimination, kernel density, nearest neighbor, prediction, random forest, selection bias, variable selection

Rights: Copyright © 2009 Institute of Mathematical Statistics
