Published in: Arabian Journal for Science and Engineering 8/2022

28-01-2022 | Research Article-Computer Engineering and Computer Science

DRBF-DS: Double RBF Kernel-Based Deep Sampling with CNNs to Handle Complex Imbalanced Datasets

Authors: Subhashree Rout, Pradeep Kumar Mallick, Debahuti Mishra

Abstract

Data imbalance, the uneven distribution of samples across classes, splits a dataset into minority and majority classes; it is common in the healthcare domain, where data are often weakly labelled. The challenge in working with such datasets is to build predictive models by augmenting the minority class until its sample count equals that of the majority class. This study integrates a double radial basis function (RBF) kernel with deep sampling and convolutional neural networks (CNNs) into a hybrid sampling method, DRBF-DS, which generates extended samples and adds them to the minority class. The proposed DRBF-DS is compared with variants of simple, widely used probability-based and non-probability-based sampling strategies. Experiments on five publicly available complex imbalanced datasets indicate that DRBF-DS outperforms these strategies in accuracy and in the other performance measures used for validation.
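To make the balancing goal concrete, the sketch below shows plain random oversampling, one of the simple baseline strategies of the kind the abstract mentions, and not the DRBF-DS method itself, whose kernel and network details are not given here. The function name `random_oversample` and the toy data are assumptions introduced only for illustration.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen samples of each smaller class (with
    replacement) until every class matches the largest class in size.

    Illustrative baseline only; this is NOT the DRBF-DS procedure.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())  # size of the majority class

    out_samples, out_labels = list(samples), list(labels)
    for cls, count in counts.items():
        # Pool of original samples belonging to this class.
        pool = [s for s, l in zip(samples, labels) if l == cls]
        # Add as many copies as the class is short of the target.
        for _ in range(target - count):
            out_samples.append(rng.choice(pool))
            out_labels.append(cls)
    return out_samples, out_labels


# Hypothetical toy dataset: 8 majority samples (label 0), 2 minority (label 1).
samples = [[float(i), float(i) / 2] for i in range(10)]
labels = [0] * 8 + [1] * 2

bal_samples, bal_labels = random_oversample(samples, labels)
print(sorted(Counter(bal_labels).items()))  # [(0, 8), (1, 8)]
```

Random oversampling only repeats existing minority samples, so it adds no new information. Methods such as the one proposed in the paper instead generate new samples for the minority class.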

Metadata
Title
DRBF-DS: Double RBF Kernel-Based Deep Sampling with CNNs to Handle Complex Imbalanced Datasets
Authors
Subhashree Rout
Pradeep Kumar Mallick
Debahuti Mishra
Publication date
28-01-2022
Publisher
Springer Berlin Heidelberg
Published in
Arabian Journal for Science and Engineering / Issue 8/2022
Print ISSN: 2193-567X
Electronic ISSN: 2191-4281
DOI
https://doi.org/10.1007/s13369-021-06480-z
