2019 | OriginalPaper | Chapter

Ensembles of Nested Dichotomies with Multiple Subset Evaluation

Authors: Tim Leathart, Eibe Frank, Bernhard Pfahringer, Geoffrey Holmes

Published in: Advances in Knowledge Discovery and Data Mining

Publisher: Springer International Publishing

Abstract

A system of nested dichotomies (NDs) is a method of decomposing a multiclass problem into a collection of binary problems. Such a system recursively applies binary splits to divide the set of classes into two subsets, and trains a binary classifier for each split. Many methods have been proposed to perform this split, each with its own advantages and disadvantages. In this paper, we present a simple, general method for improving the predictive performance of NDs produced by any subset selection technique that employs randomness to construct the subsets. We provide a theoretical expectation for performance improvements, as well as empirical results showing that our method improves the root mean squared error of NDs, regardless of whether they are employed as an individual model or in an ensemble setting.
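
To make the decomposition concrete, the following is a minimal sketch of how a set of classes can be split recursively into a nested dichotomy. It only builds the tree structure and lists the binary problems that would each need a classifier; no models are trained. The function names (`random_split`, `build_nd`, `leaves`, `binary_problems`) and the particular random splitting rule are assumptions made for illustration and are not taken from the paper. In particular, this rule does not sample uniformly from the space of NDs.

```python
import random


def random_split(classes, rng):
    """Randomly partition a list of classes into two non-empty subsets."""
    shuffled = classes[:]
    rng.shuffle(shuffled)
    # Cut point chosen so that both sides contain at least one class.
    cut = rng.randint(1, len(shuffled) - 1)
    return sorted(shuffled[:cut]), sorted(shuffled[cut:])


def build_nd(classes, rng):
    """Build a nested dichotomy as nested tuples; a leaf is a single class label."""
    if len(classes) == 1:
        return classes[0]
    left, right = random_split(classes, rng)
    return (build_nd(left, rng), build_nd(right, rng))


def leaves(tree):
    """Collect the class labels found below a node."""
    if not isinstance(tree, tuple):
        return [tree]
    return leaves(tree[0]) + leaves(tree[1])


def binary_problems(tree):
    """List the (left classes, right classes) pair for every internal node."""
    if not isinstance(tree, tuple):
        return []
    here = (leaves(tree[0]), leaves(tree[1]))
    return [here] + binary_problems(tree[0]) + binary_problems(tree[1])


rng = random.Random(0)
tree = build_nd(["a", "b", "c", "d", "e"], rng)
problems = binary_problems(tree)
for left, right in problems:
    print(left, "vs", right)
```

Each printed pair corresponds to one binary classifier in the system. A nested dichotomy over k classes always has k - 1 internal nodes, so the five-class example yields four binary problems whichever random splits are drawn.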

Footnotes
1
This is a variant of the approach from [11], where each member of the space of NDs has an equal probability of being sampled.
 
2
Appropriate values for \(\alpha\) for a given \(\lambda\) can be found in Table 3 of [15].
 
Literature
1.
Bengio, S., Weston, J., Grangier, D.: Label embedding trees for large multi-class tasks. In: NIPS, pp. 163–171 (2010)
2.
Beygelzimer, A., Langford, J., Lifshits, Y., Sorkin, G., Strehl, A.: Conditional probability tree estimation analysis and algorithms. In: UAI, pp. 51–58 (2009)
4.
Breiman, L.: Bagging predictors. Mach. Learn. 24(2), 123–140 (1996)
5.
Brier, G.: Verification of forecasts expressed in terms of probability. Mon. Weather Rev. 78, 1–3 (1950)
6.
Demšar, J.: Statistical comparisons of classifiers over multiple data sets. JMLR 7(Jan), 1–30 (2006)
7.
Dietterich, T.G., Bakiri, G.: Solving multiclass learning problems via error-correcting output codes. JAIR 2, 263–286 (1995)
8.
10.
Fox, J.: Applied Regression Analysis, Linear Models, and Related Methods. Sage, Thousand Oaks (1997)
11.
Frank, E., Kramer, S.: Ensembles of nested dichotomies for multi-class problems. In: ICML, p. 39. ACM (2004)
12.
Freund, Y., Schapire, R.E.: Game theory, on-line prediction and boosting. In: COLT, pp. 325–332 (1996)
14.
Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.H.: The WEKA data mining software: an update. ACM SIGKDD Explor. Newsl. 11(1), 10–18 (2009)
17.
Kuncheva, L.I., Whitaker, C.J.: Measures of diversity in classifier ensembles and their relationship with the ensemble accuracy. Mach. Learn. 51(2), 181–207 (2003)
18.
Leathart, T., Frank, E., Holmes, G., Pfahringer, B.: On calibration of nested dichotomies. In: Yang, Q., et al. (eds.) Advances in Knowledge Discovery and Data Mining. LNAI, vol. 11439, pp. 69–80. Springer, Heidelberg (2019)
20.
LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proc. IEEE 86(11), 2278–2324 (1998)
21.
Lichman, M.: UCI machine learning repository (2013)
23.
Melnikov, V., Hüllermeier, E.: On the effectiveness of heuristics for learning nested dichotomies: an empirical analysis. Mach. Learn. 107(8–10), 1–24 (2018)
24.
Niculescu-Mizil, A., Caruana, R.: Predicting good probabilities with supervised learning. In: ICML, pp. 625–632. ACM (2005)
25.
Pimenta, E., Gama, J.: A study on error correcting output codes. In: Portuguese Conference on Artificial Intelligence, pp. 218–223. IEEE (2005)
26.
27.
Rodríguez, J.J., García-Osorio, C., Maudes, J.: Forests of nested dichotomies. Pattern Recognit. Lett. 31(2), 125–132 (2010)
28.
Royston, J.: Algorithm AS 177: expected normal order statistics (exact and approximate). J. R. Stat. Soc. Ser. C (Appl. Stat.) 31(2), 161–165 (1982)
29.
Wever, M., Mohr, F., Hüllermeier, E.: Ensembles of evolved nested dichotomies for classification. In: GECCO, pp. 561–568. ACM (2018)
Metadata
Title: Ensembles of Nested Dichotomies with Multiple Subset Evaluation
Authors: Tim Leathart, Eibe Frank, Bernhard Pfahringer, Geoffrey Holmes
Copyright Year: 2019
DOI: https://doi.org/10.1007/978-3-030-16148-4_7