Published in: Pattern Analysis and Applications 3/2020

03-09-2019 | Theoretical advances

Classifying imbalanced data using BalanceCascade-based kernelized extreme learning machine

Authors: Bhagat Singh Raghuwanshi, Sanyam Shukla

Published in: Pattern Analysis and Applications | Issue 3/2020

Abstract

Imbalanced learning is one of the most challenging problems in data mining. Datasets with skewed class distributions hinder conventional learning methods, which assign the same importance to every example and therefore bias predictions in favor of the majority classes. Numerous strategies have been proposed to address this deficiency, such as the weighted extreme learning machine (WELM) and boosting WELM (BWELM). This work designs a novel BalanceCascade-based kernelized extreme learning machine (BCKELM) to tackle the class imbalance problem more effectively. BalanceCascade combines the merits of random undersampling and ensemble methods. The proposed method uses random undersampling to construct balanced training subsets and generates the base learners sequentially. In each iteration, the correctly classified majority-class examples are replaced by other majority-class examples to create a new balanced training subset, so the base learners differ in the balanced training subset they are trained on. The cardinality of the balanced training subsets depends on the imbalance ratio. This work uses the kernelized extreme learning machine (KELM) as the base learner because it is stable and generalizes well. The time complexity of BCKELM is considerably lower than that of BWELM, BalanceCascade, EasyEnsemble and hybrid artificial bee colony WELM. An exhaustive experimental evaluation on real-world benchmark datasets demonstrates the efficacy of the proposed method.
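
To make the procedure concrete, the following Python sketch (NumPy only) outlines a BalanceCascade-style loop around a KELM base learner as described in the abstract. All names and parameters here (rbf_kernel, KELM, balance_cascade_kelm, C, gamma, n_rounds) are illustrative assumptions, not the authors' implementation; the false-positive-rate threshold adjustment of the original BalanceCascade is omitted, and correctly classified majority examples are simply removed from the sampling pool before the next balanced subset is drawn.

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.1):
    # Gaussian (RBF) kernel matrix between the rows of A and the rows of B.
    d = np.sum(A ** 2, axis=1)[:, None] + np.sum(B ** 2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * d)

class KELM:
    """Kernelized ELM: output weights beta = (I/C + K)^-1 y, score f(x) = k(x, X) beta."""
    def __init__(self, C=1.0, gamma=0.1):
        self.C, self.gamma = C, gamma

    def fit(self, X, y):
        # y is expected in {-1, +1}; the closed-form solve replaces iterative training.
        self.X = X
        K = rbf_kernel(X, X, self.gamma)
        self.beta = np.linalg.solve(np.eye(len(X)) / self.C + K, y.astype(float))
        return self

    def decision(self, Xt):
        return rbf_kernel(Xt, self.X, self.gamma) @ self.beta

def balance_cascade_kelm(X, y, n_rounds=5, C=1.0, gamma=0.1, seed=0):
    """Train a small BalanceCascade-style ensemble of KELM base learners."""
    rng = np.random.default_rng(seed)
    X_min, X_maj = X[y == 1], X[y == -1]          # +1 = minority, -1 = majority (assumed)
    maj_pool = X_maj.copy()
    learners = []
    for _ in range(n_rounds):
        # Random undersampling: a balanced subset with as many majority as minority examples.
        idx = rng.choice(len(maj_pool), size=min(len(X_min), len(maj_pool)), replace=False)
        Xb = np.vstack([X_min, maj_pool[idx]])
        yb = np.hstack([np.ones(len(X_min)), -np.ones(len(idx))])
        clf = KELM(C, gamma).fit(Xb, yb)
        learners.append(clf)
        # BalanceCascade step: drop majority examples the current learner already gets right,
        # so later rounds draw their balanced subsets from the harder majority examples.
        still_hard = clf.decision(maj_pool) >= 0  # misclassified majority examples remain
        maj_pool = maj_pool[still_hard]
        if len(maj_pool) < len(X_min):            # majority pool exhausted: stop early
            break
    return learners

def predict(learners, Xt):
    # Aggregate the ensemble by averaging the real-valued KELM outputs.
    scores = np.mean([clf.decision(Xt) for clf in learners], axis=0)
    return np.where(scores >= 0, 1, -1)
```

Because each KELM is solved in closed form on a small balanced subset, each round costs roughly the cube of the subset size rather than of the full training set, which is the intuition behind the complexity advantage claimed over BWELM and related ensembles.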


Metadata
Title
Classifying imbalanced data using BalanceCascade-based kernelized extreme learning machine
Authors
Bhagat Singh Raghuwanshi
Sanyam Shukla
Publication date
03-09-2019
Publisher
Springer London
Published in
Pattern Analysis and Applications / Issue 3/2020
Print ISSN: 1433-7541
Electronic ISSN: 1433-755X
DOI
https://doi.org/10.1007/s10044-019-00844-w
