Published in: Evolutionary Intelligence 3/2019

10.10.2018 | Special Issue

Imbalanced data classification algorithm with support vector machine kernel extensions

Authors: Feng Wang, Shaojiang Liu, Weichuan Ni, Zhiming Xu, Zemin Qiu, Zhiping Wan, Zhihong Pan


Abstract

Learning from imbalanced data samples to achieve accurate classification is an important research topic in data mining. High accuracy is difficult for a classification algorithm to achieve because the uneven distribution of data samples leaves some categories with few samples. This article proposes KE-SVM, an imbalanced data classification algorithm based on support vector machines with kernel extensions. The algorithm first obtains an initial classification of the data samples by training a maximum-margin SVM model. It then derives a new kernel extension function from a chi-square test and a weight coefficient calculation, and retrains on the samples with an SVM that uses the new kernel function to improve classification accuracy. Simulation experiments on real and artificial data sets show that the proposed method achieves higher classification accuracy and faster convergence on unevenly distributed data.
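The abstract does not state the formula of the kernel extension function, so the following is only a minimal sketch of the general idea under stated assumptions, not the KE-SVM method itself. It assumes an RBF base kernel, uses a plain chi-square goodness-of-fit statistic on the class counts as a measure of imbalance, takes inverse-frequency class weights as the weight coefficients, and extends the kernel by a conformal rescaling k'(x, y) = c(x) k(x, y) c(y), which keeps the kernel positive semidefinite for any positive factor c. All function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def rbf(x, y, gamma=0.5):
    """Base RBF kernel exp(-gamma * ||x - y||^2) (assumed base kernel)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def chi_square_imbalance(labels):
    """Chi-square goodness-of-fit statistic of the class counts
    against a perfectly balanced split (assumed imbalance measure)."""
    counts = Counter(labels)
    expected = len(labels) / len(counts)
    return sum((n - expected) ** 2 / expected for n in counts.values())


def class_weights(labels):
    """Inverse-frequency weights n_samples / (n_classes * class_count)
    (assumed stand-in for the paper's weight coefficients)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * m) for c, m in counts.items()}


def extended_kernel_matrix(X, factors, gamma=0.5):
    """Conformally rescaled Gram matrix K'[i][j] = c_i * k(x_i, x_j) * c_j."""
    size = len(X)
    return [
        [factors[i] * rbf(X[i], X[j], gamma) * factors[j] for j in range(size)]
        for i in range(size)
    ]


# Toy data: four majority-class points and one minority-class point.
X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (2.0, 2.0)]
y = [0, 0, 0, 0, 1]

chi2 = chi_square_imbalance(y)
weights = class_weights(y)
# The square root makes a sample's self-similarity K'[i][i] equal its class weight.
factors = [math.sqrt(weights[label]) for label in y]
K_ext = extended_kernel_matrix(X, factors)
```

In this sketch the rescaling factor is taken from the training labels, so it only yields a Gram matrix for retraining. To classify unseen points, the factor would have to come from something available at prediction time, for example the output of the first-stage SVM. A matrix like `K_ext` could then be passed to an SVM implementation that accepts precomputed kernels.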


Metadata
Title
Imbalanced data classification algorithm with support vector machine kernel extensions
Authors
Feng Wang
Shaojiang Liu
Weichuan Ni
Zhiming Xu
Zemin Qiu
Zhiping Wan
Zhihong Pan
Publication date
10.10.2018
Publisher
Springer Berlin Heidelberg
Published in
Evolutionary Intelligence / Issue 3/2019
Print ISSN: 1864-5909
Electronic ISSN: 1864-5917
DOI
https://doi.org/10.1007/s12065-018-0182-0
