Abstract
Bootstrap aggregation, or bagging, is a method of reducing the prediction error of a statistical learner. The goal of bagging is to construct a new learner which is the expectation of the original learner with respect to the empirical distribution function. In nearly all cases, the expectation cannot be computed analytically, and bootstrap sampling is used to produce an approximation. The k-nearest neighbor learners are exceptions to this generalization, and exact bagging of many k-nearest neighbor learners is straightforward. This article presents computationally simple and fast formulae for exact bagging of k-nearest neighbor learners and extends exact bagging methods from the conventional bootstrap sampling scheme (sampling n observations with replacement from a set of n observations) to bootstrap sub-sampling schemes (with and without replacement). In addition, a partially exact k-nearest neighbor regression learner is developed. The article also compares, on a suite of publicly available data sets, the prediction error of elementary bagged and exact bagged k-nearest neighbor learners and of several other ensemble methods.
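The closed-form weights are easiest to see for the special case k = 1. Under conventional bootstrap sampling, the j-th nearest training point to a query is used by a bootstrap 1-NN learner exactly when it appears in the bootstrap sample and none of the j − 1 closer points do, an event with probability ((n − j + 1)/n)^n − ((n − j)/n)^n. Averaging over bootstrap samples therefore gives the exact bagged 1-NN regression prediction as a fixed weighted sum of the responses ordered by distance to the query, with no Monte Carlo replicates needed. The Python sketch below is a minimal illustration of this special case, not the paper's general k-NN formulae; the function name and the draw-count parameter m (m = n recovers conventional bagging; other m corresponds to sub-sampling with replacement) are this example's own, and ties in distance are assumed away.

```python
import numpy as np

def exact_bagged_1nn_predict(X_train, y_train, x_query, m=None):
    """Exact bagged 1-NN regression prediction at a single query point.

    Weight w_j is the probability that the j-th nearest training point
    is the closest one appearing in a bootstrap sample of m draws with
    replacement (m = n gives conventional bagging). Assumes distinct
    distances, i.e., no ties in the neighbor ordering.
    """
    n = len(y_train)
    if m is None:
        m = n
    # Order the training responses by distance to the query point.
    dist = np.linalg.norm(X_train - x_query, axis=1)
    y_sorted = y_train[np.argsort(dist)]
    # w_j = P(none of the j-1 nearer points drawn) - P(none of the j nearer points drawn)
    j = np.arange(1, n + 1)
    w = ((n - j + 1) / n) ** m - ((n - j) / n) ** m
    return float(np.dot(w, y_sorted))

# Illustrative usage on synthetic data.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X[:, 0] + rng.normal(scale=0.1, size=50)
print(exact_bagged_1nn_predict(X, y, np.zeros(3)))
```

Because the weights telescope to sum to one, the prediction is a proper weighted average of the ordered responses, which is what makes the bagged expectation exact rather than a bootstrap approximation.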
Additional information
Editor: Phil Long.
Cite this article
Steele, B.M. Exact bootstrap k-nearest neighbor learners. Mach Learn 74, 235–255 (2009). https://doi.org/10.1007/s10994-008-5096-0