
2023 | Original Paper | Book Chapter

12. Ensemble Models

Author: Frank Acito

Published in: Predictive Analytics with KNIME

Publisher: Springer Nature Switzerland


Abstract

Ensemble models in machine learning combine the predictions of multiple diverse models to achieve greater accuracy and stability. This chapter explores several ensemble techniques and their benefits.
The search for the best machine learning algorithm for a particular problem is an ongoing challenge: studies have shown that no single algorithm performs best across all datasets. This finding motivates ensemble learning, in which the predictions of multiple models are aggregated to produce a final estimate.
The effectiveness of combining diverse, independent estimates was popularized in James Surowiecki's "The Wisdom of Crowds." A classic example recounted there is Sir Francis Galton's observation at a country fair that the average of hundreds of individual guesses of an ox's weight was more accurate than almost any single guess.
Ensemble models can be constructed in several ways: employing multiple algorithms, varying the parameters of a single model, sampling different subsets of the predictor variables, or sampling the observations. Their benefits are reduced variance and improved accuracy.
Reduced variance means the model's predictions remain reliable across different data samples, giving a better sense of how it will perform on unseen data. Improved accuracy comes from combining independent predictions, whose individual errors tend to cancel out, as the sketch below illustrates.
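
To make the error-cancellation point concrete, here is a minimal Python simulation (my own illustration, not from the chapter): each simulated "model" predicts a true value with independent noise, and the average of the predictions lands much closer to the truth than a typical single model.

    import numpy as np

    rng = np.random.default_rng(42)
    true_value = 100.0   # the quantity every "model" tries to predict
    n_models = 200       # number of independent estimators
    noise_sd = 10.0      # standard deviation of each model's error

    # Each model's prediction is the truth plus independent noise.
    predictions = true_value + rng.normal(0.0, noise_sd, size=n_models)

    print(f"one model's error:   {abs(predictions[0] - true_value):.2f}")
    print(f"ensemble mean error: {abs(predictions.mean() - true_value):.2f}")
    # With independent errors, the ensemble error shrinks roughly as
    # noise_sd / sqrt(n_models): here 10 / sqrt(200) ≈ 0.7.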
The chapter discusses Bagging, Random Forests, AdaBoost, Gradient Tree Boosting, and XGBoost. These methods are popular for their ability to handle many kinds of data and to achieve state-of-the-art performance in a wide range of contexts.
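
To give a code-level feel for these families, the sketch below fits several of them on the same synthetic regression problem using scikit-learn (an assumption on my part; the chapter builds its models as KNIME nodes, and the data and parameter choices here are invented for illustration).

    from sklearn.datasets import make_regression
    from sklearn.ensemble import (AdaBoostRegressor, BaggingRegressor,
                                  GradientBoostingRegressor, RandomForestRegressor)
    from sklearn.metrics import mean_absolute_error
    from sklearn.model_selection import train_test_split

    # Synthetic data standing in for a real modeling problem.
    X, y = make_regression(n_samples=1000, n_features=10, noise=10.0, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    models = {
        "Bagging": BaggingRegressor(n_estimators=100, random_state=0),
        "Random Forest": RandomForestRegressor(n_estimators=100, random_state=0),
        "AdaBoost": AdaBoostRegressor(n_estimators=100, random_state=0),
        "Gradient Boosting": GradientBoostingRegressor(n_estimators=100, random_state=0),
    }
    for name, model in models.items():
        model.fit(X_tr, y_tr)
        mae = mean_absolute_error(y_te, model.predict(X_te))
        print(f"{name:18s} test MAE: {mae:.1f}")

The key design contrast: Bagging and Random Forests train their trees independently on resampled data and average the results, while AdaBoost and Gradient Boosting fit trees sequentially, each one correcting the errors of its predecessors.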
The chapter includes practical examples of ensemble modeling with continuous and binary targets. One example uses a KNIME workflow to predict used car prices with ordinary least squares (OLS) regression and Gradient Boosted Trees. Another predicts credit status using XGBoost.
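
As a rough Python analogue of the binary-target example (the chapter's version is a KNIME workflow on a real credit dataset; the xgboost package and the synthetic stand-in data below are my own assumptions):

    from sklearn.datasets import make_classification
    from sklearn.metrics import accuracy_score, roc_auc_score
    from sklearn.model_selection import train_test_split
    from xgboost import XGBClassifier

    # Synthetic, imbalanced stand-in for a good/bad credit dataset.
    X, y = make_classification(n_samples=2000, n_features=15, n_informative=8,
                               weights=[0.7, 0.3], random_state=1)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=1)

    clf = XGBClassifier(n_estimators=300, max_depth=4, learning_rate=0.1)
    clf.fit(X_tr, y_tr)

    print(f"accuracy: {accuracy_score(y_te, clf.predict(X_te)):.3f}")
    print(f"AUC:      {roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]):.3f}")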


References
Friedman, J. (2001). Greedy function approximation: A gradient boosting machine. Annals of Statistics, 29(5), 1189–1232.
Ho, T. K. (1998). The random subspace method for constructing decision forests. IEEE Transactions on Pattern Analysis and Machine Intelligence, 20(8), 832–844.
Seni, G., & Elder, J. F. (2010). Ensemble methods in data mining: Improving accuracy through combining predictions. Morgan & Claypool Publishers.
Surowiecki, J. (2005). The wisdom of crowds: Why the many are smarter than the few and how collective wisdom shapes business, economies, societies and nations. Anchor Books.
Metadata
Title
Ensemble Models
Author
Frank Acito
Copyright Year
2023
DOI
https://doi.org/10.1007/978-3-031-45630-5_12