
Twin Boosting: improved feature selection and prediction


Abstract

We propose Twin Boosting, which has much better feature selection behavior than boosting, particularly with respect to reducing the number of false positives (falsely selected features). In addition, in settings with a few important effective features and many noise features, Twin Boosting also substantially improves the predictive accuracy of boosting. Twin Boosting is as general and generic as (gradient-based) boosting: it can be used with general weak learners and in a wide variety of situations, including generalized regression, classification, and survival modeling. Furthermore, it is computationally feasible for large problems with potentially many more features than observed samples. Finally, for the special case of orthonormal linear models, we prove that Twin Boosting is equivalent to the adaptive Lasso, which provides some theoretical insight into feature selection with Twin Boosting.
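To make the two-stage idea concrete, the following is a minimal sketch for the L2 loss with a componentwise linear least-squares base learner. The function names, step sizes, and stopping rules here are illustrative assumptions, not the paper's exact algorithm: a first round of ordinary L2Boosting produces initial coefficients, and a second round repeats the procedure but weights each feature's selection score by its squared first-round coefficient, so features never selected in the first round cannot enter the final model.

```python
import numpy as np

def l2_boost(X, y, n_steps=100, nu=0.1, weights=None):
    """Componentwise linear least-squares L2Boosting.

    At each step, every single predictor is fit to the current residuals;
    the predictor with the largest (optionally weighted) score is chosen,
    and a small step of size nu is taken towards its least-squares fit.
    """
    n, p = X.shape
    weights = np.ones(p) if weights is None else weights
    coef = np.zeros(p)
    intercept = y.mean()
    resid = y - intercept                # start from the constant fit
    col_ss = (X ** 2).sum(axis=0)        # ||x_j||^2, assumed nonzero
    for _ in range(n_steps):
        dots = X.T @ resid                       # x_j' u for all j
        score = weights * dots ** 2 / col_ss     # weighted reduction in RSS
        j = int(np.argmax(score))
        gamma = dots[j] / col_ss[j]              # componentwise LS coefficient
        coef[j] += nu * gamma
        resid -= nu * gamma * X[:, j]
    return intercept, coef

def twin_boost(X, y, n_steps1=100, n_steps2=100, nu=0.1):
    """Round 1: plain boosting. Round 2: boosting with the selection
    score multiplied by the squared round-1 coefficients, so features
    ignored in round 1 can never be selected in round 2."""
    _, init_coef = l2_boost(X, y, n_steps1, nu)
    return l2_boost(X, y, n_steps2, nu, weights=init_coef ** 2)

# Toy usage on simulated data: many more noise features than samples.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 500))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.standard_normal(200)
_, coef = twin_boost(X, y)
print(np.flatnonzero(coef))   # selected features; should concentrate on 0 and 1
```

In practice the numbers of boosting steps would be tuned, e.g. by cross-validation. For the adaptive Lasso connection stated for orthonormal linear models, recall the standard adaptive Lasso criterion with an initial estimate $\hat{\beta}_{\mathrm{init}}$:

$$\hat{\beta} = \arg\min_{\beta} \; \|y - X\beta\|_2^2 + \lambda \sum_{j=1}^{p} \frac{|\beta_j|}{|\hat{\beta}_{\mathrm{init},j}|},$$

so coordinates with small initial estimates are penalized more heavily, mirroring the down-weighting of the selection score in the second boosting round above.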



Author information


Corresponding author

Correspondence to Peter Bühlmann.


About this article

Cite this article

Bühlmann, P., Hothorn, T. Twin Boosting: improved feature selection and prediction. Stat Comput 20, 119–138 (2010). https://doi.org/10.1007/s11222-009-9148-5

