Published in: Neural Computing and Applications 8/2019

17.01.2018 | Original Article

Efficient feature selection and classification algorithm based on PSO and rough sets

Authors: Ramesh Kumar Huda, Haider Banka


Abstract

High-dimensional datasets are often characterized by many features and comparatively few instances, and many of those features are irrelevant or redundant. An extreme number of features inflates the memory required to represent the dataset, while the relatively small training set makes the irrelevant and redundant features harder to detect and evaluate. In this paper, we therefore propose an efficient feature selection and classification method based on Particle Swarm Optimization (PSO) and rough sets. We introduce an inconsistency handler algorithm for resolving inconsistency in the dataset, a new quick reduct algorithm for eliminating irrelevant/noisy features, and a fitness function with three components: the classification quality of the feature subset, the number of remaining features, and the accuracy of approximation. The proposed method is compared with two traditional feature selection methods and three existing methods that fuse PSO with rough sets. Decision Tree and Naive Bayes classifiers are used to measure the classification accuracy of the selected feature subsets on nine benchmark datasets. The results show that the proposed method automatically selects a small feature subset with better classification accuracy than using all features, and that it outperforms the two traditional and three existing PSO and rough set-based feature selection methods in terms of classification accuracy, feature cardinality, and stability indices. We also observe that increasing the weight on the classification-quality component of the fitness function significantly reduces feature cardinality while further improving classification accuracy.
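The overall scheme the abstract describes can be sketched in a few lines: a binary particle mask selects features, the rough-set classification quality (the fraction of instances whose equivalence class under the selected features is label-pure) rewards consistency, and a weighted term rewards smaller subsets. This is a minimal illustrative sketch, not the authors' exact algorithm: the three-component fitness is reduced to two components, and the velocity-based binary PSO update is simplified to a bit-flip pulled toward the global best. All names (`rough_set_quality`, `bpso_select`, `alpha`) are hypothetical.

```python
import random

def rough_set_quality(data, labels, subset):
    """Gamma(subset): fraction of instances whose equivalence class
    (identical values on the selected features) has a single label."""
    if not subset:
        return 0.0
    blocks = {}
    for row, y in zip(data, labels):
        key = tuple(row[i] for i in subset)
        blocks.setdefault(key, set()).add(y)
    consistent = sum(1 for row, y in zip(data, labels)
                     if len(blocks[tuple(row[i] for i in subset)]) == 1)
    return consistent / len(data)

def fitness(data, labels, mask, alpha=0.9):
    """Weighted fitness: alpha * gamma + (1 - alpha) * feature-reduction ratio."""
    subset = [i for i, bit in enumerate(mask) if bit]
    reduction = (len(mask) - len(subset)) / len(mask)
    return alpha * rough_set_quality(data, labels, subset) + (1 - alpha) * reduction

def bpso_select(data, labels, n_particles=10, iters=30, alpha=0.9, seed=0):
    """Simplified binary PSO: each step flips one bit per particle,
    biased toward the global-best mask, keeping non-worsening moves."""
    rng = random.Random(seed)
    n = len(data[0])
    swarm = [[rng.randint(0, 1) for _ in range(n)] for _ in range(n_particles)]
    best = max(swarm, key=lambda m: fitness(data, labels, m, alpha))[:]
    best_fit = fitness(data, labels, best, alpha)
    for _ in range(iters):
        for p in range(n_particles):
            cand = swarm[p][:]
            j = rng.randrange(n)
            # with probability 0.5 copy the global-best bit, else explore by flipping
            cand[j] = best[j] if rng.random() < 0.5 else 1 - cand[j]
            if fitness(data, labels, cand, alpha) >= fitness(data, labels, swarm[p], alpha):
                swarm[p] = cand
            f = fitness(data, labels, swarm[p], alpha)
            if f > best_fit:
                best, best_fit = swarm[p][:], f
    return [i for i, b in enumerate(best) if b], best_fit
```

Raising `alpha` weights the rough-set quality term more heavily, which mirrors the abstract's observation that emphasizing classification quality in the fitness function drives the search toward small, consistent feature subsets.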


Metadata
Title
Efficient feature selection and classification algorithm based on PSO and rough sets
Authors
Ramesh Kumar Huda
Haider Banka
Publication date
17.01.2018
Publisher
Springer London
Published in
Neural Computing and Applications / Issue 8/2019
Print ISSN: 0941-0643
Electronic ISSN: 1433-3058
DOI
https://doi.org/10.1007/s00521-017-3317-9
