
2019 | OriginalPaper | Book chapter

A Genetic-Based Ensemble Learning Applied to Imbalanced Data Classification

Authors: Jakub Klikowski, Paweł Ksieniewicz, Michał Woźniak

Published in: Intelligent Data Engineering and Automated Learning – IDEAL 2019

Publisher: Springer International Publishing


Abstract

Imbalanced data classification remains a focus of intense research because of its ever-growing presence in real-life decision tasks. In this article, we focus on a classifier ensemble for imbalanced data classification. The ensemble is formed from individual classifiers trained on feature subsets selected in a supervised manner. Several methods employ this concept to ensure a highly diverse ensemble; nevertheless, most of them, such as Random Subspace or Random Forest, select the attributes for a particular classifier randomly. The main drawback of these methods is that they offer no way to supervise and control this task. In this work, we apply a genetic algorithm to the considered problem. The proposal formulates an original learning criterion that not only takes into account the overall classification performance but also ensures that the trained ensemble is characterised by high diversity. The experimental study confirmed the high efficiency of the proposed algorithm and its superiority over other ensemble-forming methods based on random feature selection.
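The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the base classifier, the diversity measure, or the genetic operators, so everything below is an assumption chosen for brevity. The sketch uses a nearest-centroid base classifier, balanced accuracy as the performance term, mean pairwise Hamming distance between feature masks as the diversity term, a hypothetical weight of 0.3 between the two, and synthetic two-class data. For simplicity the fitness is evaluated on the training data itself; a real study would use held-out folds.

```python
import random

def make_data(rng, n_major=90, n_minor=10, n_features=6):
    """Synthetic imbalanced data (assumed): only features 0 and 1 are informative."""
    X, y = [], []
    for label, count in ((0, n_major), (1, n_minor)):
        for _ in range(count):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            row[0] += 2.0 * label
            row[1] += 2.0 * label
            X.append(row)
            y.append(label)
    return X, y

def fit_centroids(X, y, mask):
    """Nearest-centroid base classifier restricted to the features where mask is 1."""
    feats = [j for j, bit in enumerate(mask) if bit]
    cents = {}
    for c in (0, 1):
        rows = [x for x, t in zip(X, y) if t == c]
        cents[c] = [sum(r[j] for r in rows) / len(rows) for j in feats]
    return feats, cents

def predict(model, x):
    feats, cents = model
    dist = {c: sum((x[j] - m) ** 2 for j, m in zip(feats, cents[c])) for c in cents}
    return min(dist, key=dist.get)

def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recalls, so the minority class counts equally."""
    recalls = []
    for c in (0, 1):
        idx = [i for i, t in enumerate(y_true) if t == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / 2

def diversity(masks):
    """Mean pairwise normalised Hamming distance between feature masks (needs >= 2 masks)."""
    pairs = [(a, b) for i, a in enumerate(masks) for b in masks[i + 1:]]
    return sum(sum(p != q for p, q in zip(a, b)) / len(a) for a, b in pairs) / len(pairs)

def fitness(masks, X, y, weight=0.3):
    """Assumed criterion: majority-vote balanced accuracy plus weighted diversity."""
    models = [fit_centroids(X, y, m) for m in masks]
    votes = [sum(predict(m, x) for m in models) for x in X]
    y_pred = [int(2 * v > len(models)) for v in votes]
    return balanced_accuracy(y, y_pred) + weight * diversity(masks)

def repair(mask, rng):
    """Guarantee that every ensemble member sees at least one feature."""
    if not any(mask):
        mask[rng.randrange(len(mask))] = 1
    return mask

def crossover(a, b, rng, p_mut):
    """Uniform crossover of two individuals followed by bit-flip mutation."""
    child = []
    for mask_a, mask_b in zip(a, b):
        mask = [bit_a if rng.random() < 0.5 else bit_b for bit_a, bit_b in zip(mask_a, mask_b)]
        mask = [1 - bit if rng.random() < p_mut else bit for bit in mask]
        child.append(repair(mask, rng))
    return child

def evolve(X, y, rng, n_members=3, pop_size=12, generations=15, p_mut=0.1):
    """Each individual is a whole ensemble: one binary feature mask per member."""
    n_features = len(X[0])
    pop = [[repair([rng.randint(0, 1) for _ in range(n_features)], rng)
            for _ in range(n_members)] for _ in range(pop_size)]
    score = lambda ind: fitness(ind, X, y)
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        elite = pop[: pop_size // 2]  # the better half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            children.append(crossover(a, b, rng, p_mut))
        pop = elite + children
    return max(pop, key=score)

rng = random.Random(0)
X, y = make_data(rng)
best = evolve(X, y, rng)
print("feature masks:", best)
print("fitness:", round(fitness(best, X, y), 3))
```

Encoding the whole ensemble in one individual lets the fitness reward diversity between members directly, which random subspace selection cannot do. The trade-off weight is the main tuning knob in this sketch: a larger value pushes the masks apart at the cost of the performance term.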


Metadata
Title
A Genetic-Based Ensemble Learning Applied to Imbalanced Data Classification
Authors
Jakub Klikowski
Paweł Ksieniewicz
Michał Woźniak
Copyright year
2019
DOI
https://doi.org/10.1007/978-3-030-33617-2_35