Published in: Neural Computing and Applications 10/2017

31.03.2017 | New Trends in data pre-processing methods for signal and image classification

An approach for feature selection using local searching and global optimization techniques

By: Sadhana Tiwari, Birmohan Singh, Manpreet Kaur

Abstract

Classification problems such as gene expression array analysis, text processing of Internet documents, combinatorial chemistry, software defect prediction and image retrieval involve datasets with tens or hundreds of thousands of features. Many of these features are irrelevant or redundant: they degrade the performance of learning algorithms and may lead to overfitting. Such superfluous features reduce both the accuracy and the speed of a classification algorithm, so selecting relevant and nonredundant features is an important preprocessing step for any classification problem. Most global optimization techniques can converge to a solution quickly, but they begin by initializing a population randomly, and the choice of this initial population is an important step. In this paper, local searching algorithms are used to generate a subset of relevant and nonredundant features; a global optimization algorithm is then applied, which mitigates, to some extent, the limitations of global optimization algorithms such as inconsistent classification results and very high time complexity. The computation time and classification accuracy are improved by feeding the feature set obtained from the sequential backward selection and mutual information maximization algorithms into a global optimization technique (genetic algorithm, differential evolution or particle swarm optimization). In the proposed work, the computation time of these global optimization techniques is further reduced by using variance as the stopping criterion. The proposed approach has been tested on the publicly available Sonar, Wdbc and German datasets.
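The pipeline the abstract describes — a local search (sequential backward selection) whose result seeds an elitist genetic algorithm, stopped when the variance of recent best fitnesses collapses — can be sketched as below. The scoring function, the per-feature `WEIGHTS`, and all parameter values are hypothetical stand-ins introduced only for illustration; the paper instead evaluates subsets by classification accuracy on the Sonar, Wdbc and German datasets.

```python
import random

random.seed(0)

# Hypothetical per-feature relevance; a stand-in for classifier accuracy.
WEIGHTS = [0.9, 0.1, 0.8, 0.05, 0.7, 0.02, 0.6, 0.01]

def score(subset):
    """Toy fitness of a feature subset: total relevance minus a
    small penalty per retained feature (mimics redundancy cost)."""
    if not subset:
        return 0.0
    return sum(WEIGHTS[i] for i in subset) - 0.05 * len(subset)

def sbs(n_features):
    """Sequential backward selection: start from all features and
    repeatedly drop the feature whose removal does not lower the score."""
    current = set(range(n_features))
    while len(current) > 1:
        best_drop, best_score = None, score(current)
        for f in current:
            s = score(current - {f})
            if s >= best_score:
                best_drop, best_score = f, s
        if best_drop is None:   # every removal hurts; local optimum reached
            break
        current.remove(best_drop)
    return current

def ga(seed_subset, n_features, pop_size=20, var_threshold=1e-6, max_gen=200):
    """Elitist GA over feature bitmasks, seeded with the SBS result and
    stopped when the variance of the last 10 best fitnesses is tiny."""
    def fitness(mask):
        return score({i for i, b in enumerate(mask) if b})
    seed = [1 if i in seed_subset else 0 for i in range(n_features)]
    pop = [seed] + [[random.randint(0, 1) for _ in range(n_features)]
                    for _ in range(pop_size - 1)]
    history = []
    for _ in range(max_gen):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        if len(history) >= 10:
            recent = history[-10:]
            mean = sum(recent) / len(recent)
            var = sum((x - mean) ** 2 for x in recent) / len(recent)
            if var < var_threshold:          # variance stopping criterion
                break
        elite = pop[: pop_size // 2]         # elitism: keep the top half
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = random.sample(elite, 2)
            cut = random.randrange(1, n_features)
            child = a[:cut] + b[cut:]        # one-point crossover
            if random.random() < 0.1:        # bit-flip mutation
                j = random.randrange(n_features)
                child[j] ^= 1
            children.append(child)
        pop = elite + children
    best = max(pop, key=fitness)
    return {i for i, b in enumerate(best) if b}

seed = sbs(len(WEIGHTS))
best = ga(seed, len(WEIGHTS))
print(sorted(best))
```

Because the GA is seeded with an already-good subset, the best fitness stabilizes early and the variance criterion halts the run long before `max_gen`, which is the computation-time saving the abstract claims.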


Metadata
Title
An approach for feature selection using local searching and global optimization techniques
Authors
Sadhana Tiwari
Birmohan Singh
Manpreet Kaur
Publication date
31.03.2017
Publisher
Springer London
Published in
Neural Computing and Applications / Issue 10/2017
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-017-2959-y
