Published in: Cluster Computing 3/2021

22-02-2021

A novel feature selection method for data mining tasks using hybrid Sine Cosine Algorithm and Genetic Algorithm

Authors: Laith Abualigah, Akram Jamal Dulaimi

Abstract

Feature selection (FS) is a real-world problem that can be solved using optimization techniques. These techniques propose solutions for building a predictive model that minimizes the classifier's prediction error by selecting informative features and discarding redundant, noisy, and irrelevant attributes of the original dataset. A new hybrid feature selection method, called SCAGA, is proposed that combines the Sine Cosine Algorithm (SCA) with the Genetic Algorithm (GA). Optimization methods typically rely on two main search strategies: exploration of the search space and exploitation to determine the optimal solution. The proposed SCAGA achieved better performance by balancing the exploration and exploitation of the search space. SCAGA was evaluated using the following criteria: classification accuracy, worst fitness, mean fitness, best fitness, average number of selected features, and standard deviation. It obtained the maximum classification accuracy with the minimal number of features. The results were also compared with the basic SCA and with related approaches published in the literature, such as Ant Lion Optimization and Particle Swarm Optimization. The comparison showed that SCAGA obtained the best results overall on the tested datasets from the UCI Machine Learning Repository.
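
To make the idea concrete, the following is a minimal sketch of how an SCA position update might be combined with GA operators for feature selection. It is not the authors' implementation, since the abstract does not give the details. The SCA update rule follows the standard formulation of the Sine Cosine Algorithm. Everything else is an assumption made for illustration: the weighted fitness (`alpha` times error plus a penalty on subset size), the 0.5 threshold that turns a continuous position into a feature mask, the way crossover and mutation are applied, the synthetic error term that stands in for a real classifier, and all function names (`fitness`, `sca_step`, `ga_refine`, `scaga_sketch`, `summarize`).

```python
import math
import random
import statistics


def fitness(mask, informative, alpha=0.99):
    """Toy wrapper-style objective (lower is better).

    ASSUMPTION: the error term is a synthetic stand-in for a classifier's
    error rate, namely the fraction of 'informative' features the mask misses.
    """
    if not any(mask):
        return 1.0  # an empty subset is treated as the worst case
    missed = sum(1 for j in informative if not mask[j])
    error = missed / len(informative)
    return alpha * error + (1 - alpha) * sum(mask) / len(mask)


def sca_step(x, best, r1, rng):
    """One SCA position update toward `best`, clipped to [0, 1]."""
    new = []
    for xj, pj in zip(x, best):
        r2 = rng.uniform(0, 2 * math.pi)
        r3 = rng.uniform(0, 2)
        # r4 < 0.5 selects the sine branch, otherwise the cosine branch
        wave = math.sin(r2) if rng.random() < 0.5 else math.cos(r2)
        new.append(min(1.0, max(0.0, xj + r1 * wave * abs(r3 * pj - xj))))
    return new


def ga_refine(x, best, rng, p_mut=0.05):
    """Hypothetical GA stage: uniform crossover with `best`, then mutation."""
    child = [pj if rng.random() < 0.5 else xj for xj, pj in zip(x, best)]
    return [rng.random() if rng.random() < p_mut else v for v in child]


def scaga_sketch(n_features, informative, pop=10, iters=50, a=2.0, seed=0):
    """Return (feature mask, fitness) found by the illustrative hybrid loop."""
    rng = random.Random(seed)

    def to_mask(x):
        return [1 if v > 0.5 else 0 for v in x]  # assumed threshold

    def score(x):
        return fitness(to_mask(x), informative)

    swarm = [[rng.random() for _ in range(n_features)] for _ in range(pop)]
    best = min(swarm, key=score)
    for t in range(iters):
        r1 = a - t * a / iters  # shrinks linearly: exploration to exploitation
        for i, x in enumerate(swarm):
            cand = sca_step(x, best, r1, rng)
            child = ga_refine(cand, best, rng)
            swarm[i] = min(cand, child, key=score)  # keep the better of the two
        leader = min(swarm, key=score)
        if score(leader) < score(best):
            best = leader
    return to_mask(best), score(best)


def summarize(fits):
    """Best, worst, mean, and standard deviation of fitness over several runs."""
    return {"best": min(fits), "worst": max(fits),
            "mean": statistics.mean(fits), "std": statistics.stdev(fits)}


if __name__ == "__main__":
    fits = [scaga_sketch(20, [0, 3, 7], seed=s)[1] for s in range(5)]
    print(summarize(fits))
```

In a real wrapper setup, the error term in `fitness` would come from training and validating a classifier on the selected columns of a dataset. The `summarize` helper mirrors the fitness-based criteria listed in the abstract (best, worst, mean, standard deviation) when the optimizer is run several times with different seeds.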


Metadata
Title: A novel feature selection method for data mining tasks using hybrid Sine Cosine Algorithm and Genetic Algorithm
Authors: Laith Abualigah, Akram Jamal Dulaimi
Publication date: 22-02-2021
Publisher: Springer US
Published in: Cluster Computing, Issue 3/2021
Print ISSN: 1386-7857
Electronic ISSN: 1573-7543
DOI: https://doi.org/10.1007/s10586-021-03254-y
