2020 | OriginalPaper | Chapter

Standard Decision Boundary in a Support-Domain of Fuzzy Classifier Prediction for the Task of Imbalanced Data Classification

Author: Pawel Ksieniewicz

Published in: Computational Science – ICCS 2020

Publisher: Springer International Publishing

Abstract

Many real-world classification problems are characterized by a strong disturbance of the prior probabilities, which leads most classification algorithms to favor the majority classes. The most common remedy is oversampling of the minority class with the SMOTE algorithm. This work proposes modifying the support-domain decision boundary of an individual binary classifier, in a manner similar to the fusion of classifier ensembles performed by the Fuzzy Templates method, to handle imbalanced data classification without introducing any repeated or artificial patterns into the training set. The proposed solution has been tested in computer experiments, whose results show its potential for imbalanced data classification.
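The general idea of moving a decision boundary in the support domain, rather than resampling the training set, can be illustrated with a minimal Python sketch. This is not the method proposed in the chapter, whose details the abstract does not give: it is a plain threshold shift, and the support values, the labels, and the `threshold` parameter are all made-up assumptions chosen for illustration.

```python
# Illustrative sketch only: a plain threshold shift in the support domain,
# NOT the Standard Decision Boundary method from the chapter.

def predict_argmax(supports):
    """Default rule: pick the class with the larger support."""
    return [1 if s_min > s_maj else 0 for s_maj, s_min in supports]

def predict_shifted(supports, threshold):
    """Shifted rule: predict the minority class (1) once its support
    reaches `threshold`, a hypothetical parameter set below 0.5."""
    return [1 if s_min >= threshold else 0 for _, s_min in supports]

def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls for a binary problem."""
    recalls = []
    for label in (0, 1):
        idx = [i for i, y in enumerate(y_true) if y == label]
        hits = sum(1 for i in idx if y_pred[i] == label)
        recalls.append(hits / len(idx))
    return sum(recalls) / 2

# Made-up (majority, minority) supports from a classifier biased towards class 0.
supports = [(0.9, 0.1), (0.8, 0.2), (0.85, 0.15), (0.7, 0.3),
            (0.95, 0.05), (0.75, 0.25), (0.6, 0.4), (0.65, 0.35)]
y_true = [0, 0, 0, 0, 0, 0, 1, 1]

default_pred = predict_argmax(supports)
shifted_pred = predict_shifted(supports, threshold=0.33)

print(balanced_accuracy(y_true, default_pred))
print(balanced_accuracy(y_true, shifted_pred))
```

In this toy data the default rule never predicts the minority class, while the shifted rule recovers both minority samples without adding or duplicating any training pattern. How such a boundary should be chosen in practice is exactly the question the chapter addresses.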


Metadata

Title: Standard Decision Boundary in a Support-Domain of Fuzzy Classifier Prediction for the Task of Imbalanced Data Classification
Author: Pawel Ksieniewicz
Copyright Year: 2020
DOI: https://doi.org/10.1007/978-3-030-50423-6_8
