Published in: Soft Computing 3/2018

07.10.2016 | Methodologies and Application

Feature selection for high-dimensional classification using a competitive swarm optimizer

Authors: Shenkai Gu, Ran Cheng, Yaochu Jin

Abstract

Many machine learning problems, such as classification, involve a large number of input features. However, not all features are relevant for solving the problem, and including irrelevant features may even degrade learning performance. It is therefore essential to select the most relevant features, a task known as feature selection. Many feature selection algorithms have been developed, including evolutionary algorithms and particle swarm optimization (PSO) algorithms, to find a subset of the most important features for a particular machine learning task. However, traditional PSO does not perform well on large-scale optimization problems, which reduces its effectiveness for feature selection when the number of features increases dramatically. In this paper, we propose to use a very recent PSO variant, the competitive swarm optimizer (CSO), which was designed for large-scale optimization, to solve high-dimensional feature selection problems. The CSO was originally developed for continuous optimization, so it is adapted here to perform feature selection, which can be regarded as a combinatorial optimization problem. An archive technique is also introduced to reduce computational cost. Experiments on six benchmark datasets demonstrate that, compared to a canonical PSO-based algorithm and a state-of-the-art PSO variant for feature selection, the proposed CSO-based feature selection algorithm not only selects a much smaller number of features but also achieves better classification performance.
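The abstract does not spell out the update rule, the encoding, or the archive, so the following is only a minimal sketch under stated assumptions rather than the authors' implementation. It assumes the pairwise-competition update commonly described for CSO (the loser of each random pair learns from the winner and from the swarm mean), decodes a continuous position into a feature subset with a 0.5 threshold, and uses a plain dictionary as the archive so that an already evaluated subset is not evaluated again. The names `cso_feature_selection`, `toy_fitness`, and `RELEVANT`, as well as all parameter values, are hypothetical.

```python
import random

def cso_feature_selection(fitness, n_features, swarm_size=20, iterations=50,
                          phi=0.1, threshold=0.5, seed=0):
    """Minimise fitness(mask) over boolean feature masks with a CSO-style loop."""
    rng = random.Random(seed)
    # Continuous positions in [0, 1]; a feature counts as selected when its
    # coordinate exceeds `threshold` (an assumed decoding rule).
    pos = [[rng.random() for _ in range(n_features)] for _ in range(swarm_size)]
    vel = [[0.0] * n_features for _ in range(swarm_size)]
    archive = {}  # mask -> fitness, so repeated subsets are not re-evaluated

    def evaluate(x):
        mask = tuple(v > threshold for v in x)
        if mask not in archive:
            archive[mask] = fitness(mask)
        return archive[mask]

    for _ in range(iterations):
        # Mean position of the current swarm, one value per dimension.
        mean = [sum(p[d] for p in pos) / swarm_size for d in range(n_features)]
        order = list(range(swarm_size))
        rng.shuffle(order)
        # Random pairwise competitions: the winner is kept unchanged, the
        # loser is updated. With an odd swarm size the unpaired particle
        # is simply left unchanged for this iteration.
        for a, b in zip(order[::2], order[1::2]):
            if evaluate(pos[a]) <= evaluate(pos[b]):
                win, lose = a, b
            else:
                win, lose = b, a
            for d in range(n_features):
                r1, r2, r3 = rng.random(), rng.random(), rng.random()
                vel[lose][d] = (r1 * vel[lose][d]
                                + r2 * (pos[win][d] - pos[lose][d])
                                + phi * r3 * (mean[d] - pos[lose][d]))
                # Clamp to [0, 1] so the threshold decoding stays meaningful.
                pos[lose][d] = min(1.0, max(0.0, pos[lose][d] + vel[lose][d]))

    # Best subset among everything that was actually evaluated.
    best = min(archive, key=archive.get)
    return best, archive[best]

# Hypothetical toy objective: features 0, 3 and 7 are "relevant"; each missed
# relevant feature costs 1, and every selected feature adds a small penalty.
RELEVANT = {0, 3, 7}

def toy_fitness(mask):
    missed = sum(1 for i in RELEVANT if not mask[i])
    return missed + 0.01 * sum(mask)

best_mask, best_score = cso_feature_selection(toy_fitness, n_features=10)
print(best_mask, best_score)
```

In a real wrapper setting, `toy_fitness` would be replaced by something like the cross-validated error rate of a classifier trained on the selected columns. That evaluation is expensive, which is where caching results in an archive would be expected to pay off.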

Metadata
Title
Feature selection for high-dimensional classification using a competitive swarm optimizer
Authors
Shenkai Gu
Ran Cheng
Yaochu Jin
Publication date
07.10.2016
Publisher
Springer Berlin Heidelberg
Published in
Soft Computing / Issue 3/2018
Print ISSN: 1432-7643
Electronic ISSN: 1433-7479
DOI
https://doi.org/10.1007/s00500-016-2385-6
