
2018 | OriginalPaper | Chapter

Improving Evolutionary Algorithm Performance for Feature Selection in High-Dimensional Data

Authors: N. Cilia, C. De Stefano, F. Fontanella, A. Scotto di Freca

Published in: Applications of Evolutionary Computation

Publisher: Springer International Publishing


Abstract

In classification and clustering problems, selecting a subset of discriminative features is challenging, especially when hundreds or thousands of features are involved. In this context, Evolutionary Computation (EC) techniques have attracted growing scientific interest in recent years, because they can explore large search spaces without requiring any a priori knowledge or assumptions about the domain considered. Following this line of thought, we developed a novel strategy to improve the performance of EC-based algorithms for feature selection. The proposed strategy first ranks the whole set of available features according to a univariate evaluation function; an evolutionary algorithm then searches the space defined by the first M ranked features for feature subsets with high discriminative power. Comparative results demonstrate that the proposed approach improves the performance obtainable with three effective and widely used EC-based algorithms for feature selection in high-dimensional data problems, namely Ant Colony Optimization (ACO), Particle Swarm Optimization (PSO) and Artificial Bee Colony (ABC).
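
The two-stage strategy described in the abstract (rank all features with a univariate function, then run an evolutionary search restricted to the first M ranked features) can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the abstract does not name the univariate function or the subset evaluation function, so absolute Pearson correlation with the class label is assumed for the ranking, the subset fitness is left as a caller-supplied function, and a small genetic algorithm stands in for ACO, PSO and ABC. The names `univariate_score`, `rank_features` and `evolve_subset` are hypothetical.

```python
import random


def univariate_score(column, labels):
    """Assumed univariate score: |Pearson correlation| of a feature with the labels."""
    n = len(column)
    mean_x = sum(column) / n
    mean_y = sum(labels) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(column, labels))
    var_x = sum((x - mean_x) ** 2 for x in column)
    var_y = sum((y - mean_y) ** 2 for y in labels)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant feature (or constant labels) carries no signal
    return abs(cov) / (var_x * var_y) ** 0.5


def rank_features(X, y):
    """Stage 1: return all feature indices, best univariate score first."""
    n_features = len(X[0])
    scores = [univariate_score([row[j] for row in X], y) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


def evolve_subset(candidates, fitness, pop_size=20, generations=30,
                  p_mut=0.05, seed=0):
    """Stage 2: toy genetic algorithm over bit masks on the candidate features.

    `candidates` is the list of the first M ranked feature indices and
    `fitness` maps a list of feature indices to a number (higher is better).
    """
    rng = random.Random(seed)
    m = len(candidates)

    def decode(mask):
        return [candidates[i] for i in range(m) if mask[i]]

    def score(mask):
        return fitness(decode(mask))

    # Random initial population of bit masks of length M.
    pop = [[rng.random() < 0.5 for _ in range(m)] for _ in range(pop_size)]
    best = max(pop, key=score)
    for _ in range(generations):
        new_pop = [best[:]]  # elitism: the best mask always survives
        while len(new_pop) < pop_size:
            # Binary tournament selection of two parents.
            p1 = max(rng.sample(pop, 2), key=score)
            p2 = max(rng.sample(pop, 2), key=score)
            # One-point crossover followed by bit-flip mutation.
            cut = rng.randrange(1, m) if m > 1 else 0
            child = p1[:cut] + p2[cut:]
            child = [(not bit) if rng.random() < p_mut else bit for bit in child]
            new_pop.append(child)
        pop = new_pop
        best = max(pop, key=score)
    return decode(best)
```

A call would look like `top_m = rank_features(X, y)[:M]` followed by `evolve_subset(top_m, fitness)`. The point of the restriction is that the evolutionary search works on masks of length M rather than on the full feature count, so the search space shrinks from 2^N to 2^M candidate subsets.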


Footnotes
1
Note that the same holds also for the feature-class correlation.
 
Metadata
Title
Improving Evolutionary Algorithm Performance for Feature Selection in High-Dimensional Data
Authors
N. Cilia
C. De Stefano
F. Fontanella
A. Scotto di Freca
Copyright Year
2018
DOI
https://doi.org/10.1007/978-3-319-77538-8_30
