Published in: International Journal of Machine Learning and Cybernetics 3/2021

30.09.2020 | Original Article

A parallel hybrid krill herd algorithm for feature selection

Authors: Laith Abualigah, Bisan Alsalibi, Mohammad Shehab, Mohammad Alshinwan, Ahmad M. Khasawneh, Hamzeh Alabool

Abstract

This paper introduces a novel feature selection method to tackle the problem of high-dimensional features in text clustering. Text clustering is a prominent direction in large-scale text mining: documents are partitioned into cohesive groups using a carefully selected set of informative features. Swarm-based optimization techniques have been widely used to select relevant text features and have shown promising results on datasets of various sizes. However, the performance of traditional optimization algorithms tends to degrade severely on large-scale datasets. To address this, a novel parallel membrane-inspired framework is proposed to enhance the performance of the krill herd algorithm combined with the swap mutation strategy (MHKHA). In this framework, the krill herd algorithm is hybridized with the swap mutation strategy and incorporated within the parallel membrane framework. Finally, the k-means technique is applied to the features selected by the krill herd algorithm to cluster the documents. Seven benchmark datasets with various characteristics are used. The results show that the proposed MHKHA produces superior results compared with other optimization methods. The paper thus offers the text mining community an alternative method that yields cohesive and informative features.
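
To make the swap mutation idea concrete, the following is a minimal Python sketch of how such an operator might act on a binary feature-selection mask. The abstract does not specify the paper's exact operator or encoding, so everything here is an assumption for illustration: the mask representation (1 = feature kept), the function names `swap_mutation` and `select_features`, and the toy term-weight vectors are all hypothetical.

```python
import random


def swap_mutation(mask, rng):
    """Return a copy of a binary feature mask with two random positions exchanged.

    Assumed encoding: mask[k] == 1 means feature k is selected.
    Swapping keeps the number of selected features unchanged.
    """
    mutated = list(mask)
    if len(mutated) < 2:
        return mutated  # nothing to swap
    i, j = rng.sample(range(len(mutated)), 2)  # two distinct indices
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated


def select_features(documents, mask):
    """Keep only the term weights whose mask bit is 1 (hypothetical helper)."""
    return [[w for w, keep in zip(doc, mask) if keep] for doc in documents]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
mask = [1, 0, 1, 0, 0, 1]
mutated = swap_mutation(mask, rng)

# Toy document vectors (made-up term weights, one row per document).
docs = [
    [0.2, 0.0, 0.5, 0.1, 0.0, 0.9],
    [0.0, 0.3, 0.4, 0.0, 0.7, 0.1],
]
reduced = select_features(docs, mutated)
```

In a pipeline like the one the abstract describes, the reduced vectors would then be passed to a clustering step such as k-means. A swap leaves the count of selected features unchanged, so in this sketch it only changes which features are kept, not how many.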

Metadata
Title
A parallel hybrid krill herd algorithm for feature selection
Authors
Laith Abualigah
Bisan Alsalibi
Mohammad Shehab
Mohammad Alshinwan
Ahmad M. Khasawneh
Hamzeh Alabool
Publication date
30.09.2020
Publisher
Springer Berlin Heidelberg
Published in
International Journal of Machine Learning and Cybernetics / Issue 3/2021
Print ISSN: 1868-8071
Electronic ISSN: 1868-808X
DOI
https://doi.org/10.1007/s13042-020-01202-7
