Skip to main content
Top

Hint

Swipe to navigate through the chapters of this book

2020 | OriginalPaper | Chapter

Dynamic Classifier Selection for Data with Skewed Class Distribution Using Imbalance Ratio and Euclidean Distance

Authors : Paweł Zyblewski, Michał Woźniak

Published in: Computational Science – ICCS 2020

Publisher: Springer International Publishing

share
SHARE

Abstract

Imbalanced data analysis remains one of the critical challenges in machine learning. This work aims to adapt the concept of Dynamic Classifier Selection (dcs) to the pattern classification task with the skewed class distribution. Two methods, using the similarity (distance) to the reference instances and class imbalance ratio to select the most confident classifier for a given observation, have been proposed. Both approaches come in two modes, one based on the k-Nearest Oracles (knora) and the other also considering those cases where the classifier makes a mistake. The proposed methods were evaluated based on computer experiments carried out on https://static-content.springer.com/image/chp%3A10.1007%2F978-3-030-50423-6_5/MediaObjects/500804_1_En_5_Figa_HTML.gif datasets with a high imbalance ratio. The obtained results and statistical analysis confirm the usefulness of the proposed solutions.

To get access to this content you need the following product:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 69.000 Bücher
  • über 500 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt 90 Tage mit der neuen Mini-Lizenz testen!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 50.000 Bücher
  • über 380 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe



 


Jetzt 90 Tage mit der neuen Mini-Lizenz testen!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 58.000 Bücher
  • über 300 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko





Jetzt 90 Tage mit der neuen Mini-Lizenz testen!

Literature
1.
go back to reference Alcala-Fdez, J., et al.: Keel data-mining software tool: data set repository, integration of algorithms and experimental analysis framework. J. Mult. Valued Log. Soft Comput. 17, 255–287 (2010) Alcala-Fdez, J., et al.: Keel data-mining software tool: data set repository, integration of algorithms and experimental analysis framework. J. Mult. Valued Log. Soft Comput. 17, 255–287 (2010)
3.
go back to reference Brodersen, K.H., Ong, C.S., Stephan, K.E., Buhmann, J.M.: The balanced accuracy and its posterior distribution. In: Proceedings of the 2010 20th International Conference on Pattern Recognition, ICPR 2010, Washington, DC, USA, pp. 3121–3124. IEEE Computer Society (2010) Brodersen, K.H., Ong, C.S., Stephan, K.E., Buhmann, J.M.: The balanced accuracy and its posterior distribution. In: Proceedings of the 2010 20th International Conference on Pattern Recognition, ICPR 2010, Washington, DC, USA, pp. 3121–3124. IEEE Computer Society (2010)
5.
go back to reference Cruz, R.M.O., Sabourin, R., Cavalcanti, G.D.C.: Dynamic classifier selection: recent advances and perspectives. Inf. Fus. 41, 195–216 (2018) CrossRef Cruz, R.M.O., Sabourin, R., Cavalcanti, G.D.C.: Dynamic classifier selection: recent advances and perspectives. Inf. Fus. 41, 195–216 (2018) CrossRef
6.
go back to reference Duin, R.P.W.: The combining classifier: to train or not to train? In: Object Recognition Supported by User Interaction for Service Robots, vol. 2, pp. 765–770, August 2002 Duin, R.P.W.: The combining classifier: to train or not to train? In: Object Recognition Supported by User Interaction for Service Robots, vol. 2, pp. 765–770, August 2002
7.
go back to reference Ko, A.H., Sabourin, R., Alceu Souza Britto, J.: From dynamic classifier selection to dynamic ensemble selection. Pattern Recogn. 41(5), 1718–1731 (2008) CrossRef Ko, A.H., Sabourin, R., Alceu Souza Britto, J.: From dynamic classifier selection to dynamic ensemble selection. Pattern Recogn. 41(5), 1718–1731 (2008) CrossRef
9.
go back to reference Ksieniewicz, P.: Undersampled majority class ensemble for highly imbalanced binary classification. In: Proceedings of the Second International Workshop on Learning with Imbalanced Domains: Theory and Applications. Proceedings of Machine Learning Research, Dublin, Ireland, vol. 94, pp. 82–94. PMLR, ECML-PKDD, 10 September 2018 Ksieniewicz, P.: Undersampled majority class ensemble for highly imbalanced binary classification. In: Proceedings of the Second International Workshop on Learning with Imbalanced Domains: Theory and Applications. Proceedings of Machine Learning Research, Dublin, Ireland, vol. 94, pp. 82–94. PMLR, ECML-PKDD, 10 September 2018
10.
go back to reference Ksieniewicz, P., Zyblewski, P.: Stream-learn-open-source Python library for difficult data stream batch analysis. arXiv preprint arXiv:​2001.​11077 (2020) Ksieniewicz, P., Zyblewski, P.: Stream-learn-open-source Python library for difficult data stream batch analysis. arXiv preprint arXiv:​2001.​11077 (2020)
11.
go back to reference Kubat, M., Matwin, S.: Addressing the curse of imbalanced training sets: one-sided selection. In: ICML (1997) Kubat, M., Matwin, S.: Addressing the curse of imbalanced training sets: one-sided selection. In: ICML (1997)
12.
go back to reference Pedregosa, F., et al.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011) MathSciNetMATH Pedregosa, F., et al.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011) MathSciNetMATH
13.
go back to reference Powers, D.: Evaluation: from precision, recall and F-measure to ROC, informedness, markedness & correlation. J. Mach. Learn. Technol 2, 2229–3981 (2011) Powers, D.: Evaluation: from precision, recall and F-measure to ROC, informedness, markedness & correlation. J. Mach. Learn. Technol 2, 2229–3981 (2011)
14.
go back to reference Roy, A., Cruz, R.M., Sabourin, R., Cavalcanti, G.D.: A study on combining dynamic selection and data preprocessing for imbalance learning. Neurocomputing 286, 179–192 (2018) CrossRef Roy, A., Cruz, R.M., Sabourin, R., Cavalcanti, G.D.: A study on combining dynamic selection and data preprocessing for imbalance learning. Neurocomputing 286, 179–192 (2018) CrossRef
15.
go back to reference Sasaki, Y.: The truth of the F-measure. Teach Tutor Mater, January 2007 Sasaki, Y.: The truth of the F-measure. Teach Tutor Mater, January 2007
16.
go back to reference Woźniak, M., Graña, M., Corchado, E.: A survey of multiple classifier systems as hybrid systems. Inf. Fus. 16, 3–17 (2014). Special Issue on Information Fusion in Hybrid Intelligent Fusion Systems CrossRef Woźniak, M., Graña, M., Corchado, E.: A survey of multiple classifier systems as hybrid systems. Inf. Fus. 16, 3–17 (2014). Special Issue on Information Fusion in Hybrid Intelligent Fusion Systems CrossRef
Metadata
Title
Dynamic Classifier Selection for Data with Skewed Class Distribution Using Imbalance Ratio and Euclidean Distance
Authors
Paweł Zyblewski
Michał Woźniak
Copyright Year
2020
DOI
https://doi.org/10.1007/978-3-030-50423-6_5