ABSTRACT
High-dimensional data often require techniques such as Feature Selection (FS) to mitigate the curse of dimensionality. One of the most popular approaches to FS is wrapper methods, which combine a search algorithm with a classification technique; in this paper, NSGA-II and k-NN are applied. Since NSGA-II is intended to converge to a small subset of features, the generation of the initial population is crucial to speed up the search: the fewer features selected by the individuals of the initial population, the fewer generations needed to achieve convergence. This work presents a novel technique to reduce the average size (number of selected features) of the individuals forming the initial population. The technique is based on a hyper-parameter, p, which controls the probability of any feature being selected by any individual of the initial population. An analysis of both convergence time and classification accuracy is performed for different values of p, concluding that p can be set to quite low values, markedly accelerating the convergence of the algorithm without affecting the quality of the solutions.
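The initialization idea described above can be sketched as follows. This is a minimal illustration, not the paper's actual implementation: each individual is a binary feature mask whose bits are set independently with probability p, so the expected number of selected features is roughly n_features * p. The function name and the repair step guaranteeing a non-empty mask are assumptions for the sake of the example.

```python
import random

def init_population(pop_size, n_features, p):
    """Generate a population of binary feature masks where each feature
    is selected independently with probability p (hypothetical sketch)."""
    population = []
    for _ in range(pop_size):
        individual = [1 if random.random() < p else 0 for _ in range(n_features)]
        if not any(individual):
            # Repair: guarantee every individual selects at least one feature
            individual[random.randrange(n_features)] = 1
        population.append(individual)
    return population

pop = init_population(pop_size=50, n_features=1000, p=0.01)
avg_size = sum(sum(ind) for ind in pop) / len(pop)  # close to 1000 * 0.01 = 10 on average
```

With p = 0.01 each individual starts with about 10 of the 1000 features, far fewer than the ~500 produced by the usual uniform (p = 0.5) initialization, which is what shortens the path to the small feature subsets NSGA-II converges to.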
Index Terms
- Boosting the convergence of a GA-based wrapper for feature selection problems on high-dimensional data