Skip to main content

2021 | OriginalPaper | Buchkapitel

9. Decision Trees

verfasst von : Vladimir Shikhman, David Müller

Erschienen in: Mathematical Foundations of Big Data Analytics

Verlag: Springer Berlin Heidelberg

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Decision tree learning is one of the predictive modelling approaches widely used in the fields of data mining and machine learning. It uses a decision tree to go through testing an object in the nodes to conclusions about its target variable’s value in the leaves. Decision trees are among the most popular machine learning algorithms given their intelligibility and simplicity. The applications range from the prediction of Titanic survival to the artificial intelligence chess playing. In this chapter, we focus on the classification decision trees, so that their leaves represent class labels. The quality of such decision trees is measured by means of both the misclassification rate on the given data, and the average external path length. Especially for identification decision trees with zero misclassification rate, we show that finding those with the minimal average external path length is an NP-complete problem. Due to this negative theoretical result, top-down and bottom-up heuristics are proposed to nevertheless construct decision trees whose quality is sufficient at least from the practical point of view. Based on various generalization errors, such as train error, entropy, and Gini index, we present the iterative dichotomizer algorithm for this purpose. The iterative dichotomizer splits at each step the data set by maximizing the gain derived from the chosen generalization error. Afterwards, we briefly elaborate on the pruning of decision trees.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literatur
Zurück zum Zitat L. Breiman, J. Friedman, R. Olshen, C. Stone, Classification and Regression Trees (Brooks/Cole Publishing, Monterey, 1984)MATH L. Breiman, J. Friedman, R. Olshen, C. Stone, Classification and Regression Trees (Brooks/Cole Publishing, Monterey, 1984)MATH
Zurück zum Zitat L. Hyafil, R.L. Rivest, Constructing optimal binary decision trees is NP-complete. Information Processing Letters 5, 15–17 (2009)MathSciNetCrossRef L. Hyafil, R.L. Rivest, Constructing optimal binary decision trees is NP-complete. Information Processing Letters 5, 15–17 (2009)MathSciNetCrossRef
Zurück zum Zitat R.M. Karp, Reducibility among combinatorial problems, in Complexity of Computer Computations. Proceedings of a Symposium on the Complexity of Computer Computations, ed. by R.E. Miller, J.W. Thatcher (Plenum, New York, 1972), pp. 85–1103CrossRef R.M. Karp, Reducibility among combinatorial problems, in Complexity of Computer Computations. Proceedings of a Symposium on the Complexity of Computer Computations, ed. by R.E. Miller, J.W. Thatcher (Plenum, New York, 1972), pp. 85–1103CrossRef
Zurück zum Zitat J.R. Quinlan, Induction of decision trees. Machine Learning 1, 81–106 (1986) J.R. Quinlan, Induction of decision trees. Machine Learning 1, 81–106 (1986)
Zurück zum Zitat L. Rokach, O. Maimon, Data Mining with Decision Trees: Theory and Applications (World Scientific, Singapore, 2015)MATH L. Rokach, O. Maimon, Data Mining with Decision Trees: Theory and Applications (World Scientific, Singapore, 2015)MATH
Zurück zum Zitat S. Shalev-Shwartz, S. Ben-David, Understanding Machine Learning (Cambridge University Press, New York, 2014)CrossRef S. Shalev-Shwartz, S. Ben-David, Understanding Machine Learning (Cambridge University Press, New York, 2014)CrossRef
Metadaten
Titel
Decision Trees
verfasst von
Vladimir Shikhman
David Müller
Copyright-Jahr
2021
Verlag
Springer Berlin Heidelberg
DOI
https://doi.org/10.1007/978-3-662-62521-7_9