2020 | OriginalPaper | Chapter

Employing One-Class SVM Classifier Ensemble for Imbalanced Data Stream Classification

Authors: Jakub Klikowski, Michał Woźniak

Published in: Computational Science – ICCS 2020

Publisher: Springer International Publishing

Abstract

The classification of imbalanced data streams is attracting growing interest. Apart from the problem that one of the classes is underrepresented, there are problems typical of data stream classification, such as limited resources, lack of access to the true labels, and the possible occurrence of concept drift. The possibility of concept drift means the method must include an adaptation mechanism. In this article, we propose the OCEIS classifier (One-Class support vector machine classifier Ensemble for Imbalanced data Stream). The main idea is to supply the committee with one-class classifiers trained on clustered data for each class separately. Experiments on synthetic and real data show that the proposed method performs at a level similar to the state-of-the-art methods it was compared with.
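
The main idea stated in the abstract (one-class classifiers trained on clustered data, separately for each class) can be illustrated with a minimal sketch. It is not the authors' OCEIS implementation: the use of scikit-learn's `KMeans` and `OneClassSVM`, the fixed number of clusters, the `nu` value, the normalised-score decision rule, and the function names `fit_class_ensembles` and `predict_classes` are all assumptions made for illustration, and the sketch handles a single batch with no drift adaptation.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import OneClassSVM


def fit_class_ensembles(X, y, n_clusters=2, nu=0.1, gamma="scale", random_state=0):
    """Fit one OneClassSVM per cluster, separately for each class.

    Hypothetical helper: the cluster count and SVM settings are
    illustrative defaults, not values taken from the paper.
    """
    ensembles = {}
    for label in np.unique(y):
        X_c = X[y == label]
        # Never request more clusters than there are samples in the class.
        k = min(n_clusters, len(X_c))
        cluster_ids = KMeans(
            n_clusters=k, n_init=10, random_state=random_state
        ).fit_predict(X_c)
        models = []
        for cid in range(k):
            members = X_c[cluster_ids == cid]
            if len(members) > 0:
                model = OneClassSVM(kernel="rbf", nu=nu, gamma=gamma)
                models.append(model.fit(members))
        ensembles[label] = models
    return ensembles


def predict_classes(ensembles, X):
    """Assign each sample to the class whose cluster models support it most.

    Assumed decision rule: each model's raw score is divided by its own
    offset (1.0 marks the learned boundary), so that models trained on
    clusters of different sizes are compared on a common scale.
    """
    labels = sorted(ensembles)
    support = np.column_stack([
        np.max(
            [m.score_samples(X) / m.offset_ for m in ensembles[label]],
            axis=0,
        )
        for label in labels
    ])
    return np.asarray(labels)[np.argmax(support, axis=1)]
```

Calling `fit_class_ensembles(X, y)` on one labelled batch and then `predict_classes(ensembles, X_new)` on the next gives a batch-wise classifier. Because every class is modelled only from its own samples, the minority class gets its own models regardless of how few samples it has, which is the appeal of one-class learners under imbalance.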

Metadata
Title
Employing One-Class SVM Classifier Ensemble for Imbalanced Data Stream Classification
Authors
Jakub Klikowski
Michał Woźniak
Copyright Year
2020
DOI
https://doi.org/10.1007/978-3-030-50423-6_9