Abstract
We study resource-limited online learning, motivated by the problem of conditional-branch outcome prediction in computer architecture. In particular, we consider (parallel) time- and space-efficient ensemble learners for online settings, empirically demonstrating benefits similar to those shown previously for offline ensembles. Our learning algorithms are inspired by the previously published “boosting by filtering” framework as well as the offline Arc-x4 boosting-style algorithm. We train ensembles of online decision trees using a novel variant of the ID4 online decision-tree algorithm as the base learner, and show empirical results for both boosting- and bagging-style online ensemble methods. Our results evaluate these methods on both our branch-prediction domain and online variants of three familiar machine-learning benchmarks. Our data justifies three key claims. First, we show empirically that our extensions to ID4 significantly improve performance for single trees and, additionally, are critical to achieving performance gains in tree ensembles. Second, our results indicate significant improvements in predictive accuracy with ensemble size for the boosting-style algorithm. The bagging algorithms we tried performed poorly relative to the boosting-style algorithm, though they still improved upon individual base learners. Third, we show that ensembles of small trees often outperform large single trees with the same total number of nodes (and similarly outperform smaller ensembles of larger trees that use the same total number of nodes). This makes online boosting particularly useful in domains, such as branch prediction, with tight space restrictions (i.e., the available real estate on a microprocessor chip).
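For concreteness, the following Python sketch illustrates the Arc-x4-style weighting the abstract refers to, applied online: each ensemble member sees every example, weighted by 1 + m^4, where m is the number of earlier members that misclassified it. This is an illustration under stated assumptions, not the authors' implementation; the toy WeightedCountLearner is a hypothetical stand-in for the paper's ID4-variant online decision trees.

```python
# Minimal sketch (not the authors' code) of an online Arc-x4-style
# boosting ensemble for binary prediction (labels 0/1).
# WeightedCountLearner is a hypothetical stand-in for the paper's
# ID4-variant online decision trees.

class WeightedCountLearner:
    """Toy weighted online base learner: predicts the weighted
    majority label it has seen so far."""
    def __init__(self):
        self.score = 0.0  # > 0 favors label 1, < 0 favors label 0

    def predict(self, x):
        return 1 if self.score >= 0 else 0

    def learn(self, x, y, weight=1.0):
        self.score += weight if y == 1 else -weight


class OnlineArcX4Ensemble:
    """Each member sees every example, weighted by 1 + m**4, where m
    is the number of earlier members that misclassified it (the
    Arc-x4 weighting of Breiman, 1996b, applied online)."""
    def __init__(self, make_base, n_members):
        self.members = [make_base() for _ in range(n_members)]

    def predict(self, x):
        # Unweighted majority vote over the members.
        votes = sum(1 if m.predict(x) == 1 else -1 for m in self.members)
        return 1 if votes >= 0 else 0

    def update(self, x, y):
        mistakes = 0  # mistakes by members earlier in the sequence
        for member in self.members:
            wrong = member.predict(x) != y  # predict before updating
            member.learn(x, y, weight=1 + mistakes ** 4)
            if wrong:
                mistakes += 1


# Test-then-train evaluation over a (toy) example stream.
ensemble = OnlineArcX4Ensemble(WeightedCountLearner, n_members=8)
stream = [(None, 1), (None, 1), (None, 0), (None, 1)]
correct = 0
for x, y in stream:
    correct += int(ensemble.predict(x) == y)
    ensemble.update(x, y)
print(f"online accuracy: {correct / len(stream):.2f}")
```

Note that all members' predictions on an example can be computed in parallel, and the mistake counts m are simply prefix sums of those predictions, which is what makes this weighting scheme amenable to the parallel-time hardware implementation the abstract emphasizes.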
References
Bauer, E., & Kohavi, R. (1999). An empirical comparison of voting classification algorithms: Bagging, boosting and variants. Machine Learning, 36:1/2, 105-142.
Breiman, L. (1996a). Bagging predictors. Machine Learning, 24:2, 123-140.
Breiman, L. (1996b). Arcing classifiers. Technical Report 460, Department of Statistics, University of California, Berkeley, CA.
Burger, D., & Austin, T. (1997). The SimpleScalar tool set, version 2.0. Technical Report 1342, Department of Computer Science, University of Wisconsin-Madison.
Calder, B., Grunwald, D., Lindsay, D., Jones, M., Martin, J., Mozer, M., & Zorn, B. (1997). Evidence-based static branch prediction using machine learning. ACM Transactions on Programming Languages and Systems, 19:1, 188-222.
Chang, P.-Y., Hao, E., & Patt, Y. (1995). Alternative implementations of hybrid branch predictors. In Proceedings of the 28th ACM/IEEE International Symposium on Microarchitecture (pp. 252-257).
Cormen, T. H., Leiserson, C. E., & Rivest, R. L. (1997). Arithmetic circuits. In Introduction to Algorithms. Cambridge, MA: MIT Press.
Dietterich, T. G. (2000). An experimental comparison of three methods for constructing ensembles of decision trees: Bagging, boosting, and randomization. Machine Learning, 40:2, 139-158.
Domingo, C., & Watanabe, O. (2000). MadaBoost: A modification of AdaBoost. In Proceedings of the Thirteenth Annual Conference on Computational Learning Theory (pp. 180-189).
Eden, A., & Mudge, T. (1998). The YAGS branch predictor. In Proceedings of the 31st Annual IEEE/ACM International Symposium on Microarchitecture (pp. 69-77).
Emer, J., & Gloy, N. (1997). A language for describing predictors and its application to automatic synthesis. In Proceedings of the 24th International Symposium on Computer Architecture (pp. 304-314).
Fan, W., Stolfo, S., & Zhang, J. (1999). The application of AdaBoost for distributed, scalable and on-line learning. In Proceedings of the Fifth SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 362-366).
Fern, A., Givan, R., Falsafi, B., & Vijaykumar, T. N. (2000). Dynamic feature selection for hardware prediction. Technical Report TR-ECE 00-12, School of Electrical & Computer Engineering, Purdue University.
Freund, Y. (1995). Boosting a weak learning algorithm by majority. Information and Computation, 121:2, 256-285.
Freund, Y., & Schapire, R. E. (1996). Experiments with a new boosting algorithm. In Proceedings of the Thirteenth International Conference on Machine Learning (pp. 148-156).
Freund, Y., & Schapire, R. E. (1997). A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55:1, 119-139.
Grove, A., & Schuurmans, D. (1998). Boosting in the limit: Maximizing the margin of learned ensembles. In Proceedings of the Fifteenth National Conference on Artificial Intelligence (pp. 692-699).
Gwennap, L. (1996). Digital 21264 sets new standard. In Microprocessor Report, Oct. (pp. 9-15).
Heil, T., Smith, Z., & Smith, J. (1999). Improving branch predictors by correlation on data values. In Proceedings of the 32nd Annual International Symposium on Microarchitecture (pp. 28-37).
Jimenez, D., Hanson, H., & Lin, C. (2001). Boolean formula-based branch prediction for future technologies. In Proceedings of the International Conference on Parallel Architectures and Compilation Technologies (pp. 97-106).
Jimenez, D., & Lin, C. (2001). Dynamic branch prediction with perceptrons. In Proceedings of the 7th International Symposium on High Performance Computer Architecture (pp. 197-206).
Matheus, C. J., & Rendell, L. A. (1989). Constructive induction on decision trees. In Proceedings of the Eleventh International Joint Conference on Artificial Intelligence (pp. 645-650).
McFarling, S. (1993). Combining branch predictors. WRL Technical Note TN-36.
Merz, C. J., & Murphy, P. M. (1996). UCI repository of machine learning databases. http://www.ics.uci.edu/~mlearn/MLRepository.html.
Nair, R. (1995). Dynamic path-based branch correlation. In Proceedings of the 28th International Symposium on Microarchitecture (pp. 15-23).
Quinlan, J. R. (1986). Induction of decision trees. Machine Learning, 1, 81-106.
Quinlan, J. R. (1988). An empirical comparison of genetic and decision-tree classifiers. In Proceedings of the Fifth International Conference on Machine Learning (pp. 135-141).
Quinlan, J. R. (1996). Bagging, boosting and C4.5. In Proceedings of the Thirteenth National Conference on Artificial Intelligence (pp. 725-730).
Reilly, J. (1995). SPEC describes SPEC95 products and benchmarks. Standard Performance Evaluation Corporation August 1995 newsletter, http://www.spec.org.
Schapire, R. E. (1990). The strength of weak learnability. Machine Learning, 5:2, 197-227.
Schlimmer, J. C., & Fisher, D. (1986). A case study of incremental concept induction. In Proceedings of the Fifth National Conference on Artificial Intelligence (pp. 496-501).
Sherwood, T., & Calder, B. (2001). Automated design of finite state machine predictors for customized processors. In Proceedings of the 28th International Symposium on Computer Architecture (pp. 86-97).
Utgoff, P. E. (1989). Incremental induction of decision trees. Machine Learning, 4:2, 161-186.
Utgoff, P. E., Berkman, N. C., & Clouse, J. A. (1997). Decision tree induction based on efficient tree restructuring. Machine Learning, 29:1, 5-44.
Cite this article
Fern, A., Givan, R. Online Ensemble Learning: An Empirical Study. Machine Learning 53, 71–109 (2003). https://doi.org/10.1023/A:1025619426553