Published in: Soft Computing 8/2021

17-02-2021 | Methodologies and Application

A noise-aware feature selection approach for classification

Authors: Mostafa Sabzekar, Zafer Aydin

Abstract

This paper uses a noise-aware version of support vector machines (SVM) for feature selection. Combining this method with sequential backward search (SBS), we propose a new algorithm for removing irrelevant features. Although SVM-based feature selection methods in the literature have provided acceptable results, noisy samples and outliers may degrade the performance of the SVM and, consequently, of the feature selection method. We recently proposed the relaxed constraints SVM (RSVM), which handles noisy data and outliers. Each training sample in RSVM is associated with a degree of importance computed with the fuzzy c-means clustering method, so that a lower degree of importance is assigned to noisy data and outliers. Moreover, RSVM has more relaxed constraints, which can reduce the effect of noisy samples. Feature selection can increase the accuracy of machine learning applications by eliminating noisy and irrelevant features. In the proposed RSVM-SBS feature selection algorithm, noisy data have little effect on the elimination of irrelevant features. Experimental results on real-world data indicate that RSVM-SBS outperforms other SVM-based feature selection approaches.
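
To make the idea in the abstract concrete, the following is a minimal Python sketch of sequential backward search driven by an importance-weighted score. It is not the authors' RSVM-SBS implementation. The scorer is a weighted nearest-centroid classifier used as a simple stand-in for RSVM, the importance degrees in `w` are hand-set for illustration rather than computed by fuzzy c-means, and the function names and the toy data are all hypothetical.

```python
def weighted_centroid_score(X, y, w, features):
    """Importance-weighted training accuracy of a nearest-centroid classifier
    restricted to `features`. A stand-in for RSVM, not the paper's model."""
    centroids = {}
    for label in sorted(set(y)):
        members = [(xi, wi) for xi, yi, wi in zip(X, y, w) if yi == label]
        total = sum(wi for _, wi in members)
        # Low-importance samples pull the class centroid less.
        centroids[label] = [sum(wi * xi[f] for xi, wi in members) / total
                            for f in features]

    def sq_dist(xi, centre):
        return sum((xi[f] - c) ** 2 for f, c in zip(features, centre))

    correct = 0.0
    for xi, yi, wi in zip(X, y, w):
        predicted = min(centroids, key=lambda label: sq_dist(xi, centroids[label]))
        if predicted == yi:
            correct += wi  # a misclassified noisy sample costs only its small weight
    return correct / sum(w)


def sequential_backward_search(X, y, w, n_keep, score=weighted_centroid_score):
    """Start from all features and repeatedly drop the one whose removal
    leaves the highest score, until `n_keep` features remain."""
    selected = list(range(len(X[0])))
    while len(selected) > n_keep:
        best_score, best_drop = -1.0, None
        for candidate in selected:
            remaining = [f for f in selected if f != candidate]
            s = score(X, y, w, remaining)
            if s > best_score:  # ties keep the earliest candidate
                best_score, best_drop = s, candidate
        selected.remove(best_drop)
    return selected


# Toy data (hypothetical): feature 0 separates the classes, features 1 and 2 are noise.
X = [
    [0.0, 1.0, 2.0],
    [1.0, 4.0, 1.0],
    [2.0, 2.0, 5.0],
    [8.0, 2.0, 1.0],
    [9.0, 1.0, 4.0],
    [10.0, 4.0, 2.0],
    [9.0, 2.0, 3.0],  # looks like class 1 but is labelled 0: a noisy sample
]
y = [0, 0, 0, 1, 1, 1, 0]
w = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.1]  # assumed importance degrees; noisy sample gets 0.1

selected = sequential_backward_search(X, y, w, n_keep=1)
print(selected)  # prints [0]
```

Because the scorer is passed in as a parameter, a sample-weighted SVM could replace the centroid classifier without changing the search loop. The low weight on the last sample keeps it from shifting the class-0 centroid much and limits how much its misclassification lowers the score.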


Metadata
Title
A noise-aware feature selection approach for classification
Authors
Mostafa Sabzekar
Zafer Aydin
Publication date
17-02-2021
Publisher
Springer Berlin Heidelberg
Published in
Soft Computing / Issue 8/2021
Print ISSN: 1432-7643
Electronic ISSN: 1433-7479
DOI
https://doi.org/10.1007/s00500-021-05630-7
