Published in: Evolutionary Intelligence 3/2016

01.09.2016 | Special Issue

Mutual information for feature selection: estimation or counting?

Authors: Hoai Bach Nguyen, Bing Xue, Peter Andreae


Abstract

In classification, feature selection is an important pre-processing step that simplifies the dataset and improves the quality of the data representation, making classifiers more accurate, easier to train, and easier to interpret. Because of its ability to analyse non-linear interactions between features, mutual information has been widely applied to feature selection. Alongside counting approaches, the traditional way to calculate mutual information, many mutual information estimators have been proposed that allow mutual information to be computed directly on continuous datasets. This work compares the effect of the counting approach and the kernel density estimation (KDE) approach in feature selection using particle swarm optimisation as the search mechanism. Experimental results on 15 different datasets show that KDE works well on both continuous and discrete datasets. In addition, feature subsets evolved with KDE achieve similar or better classification performance than those evolved with the counting approach. Furthermore, results on artificial datasets with various interactions show that KDE correctly captures the interactions between features, in terms of both relevance and redundancy, which cannot be achieved with the counting approach.
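To make the contrast between the two approaches concrete, the following minimal Python sketch estimates the mutual information I(X;Y) between a single continuous feature X and a discrete class label Y in both ways: by counting over equal-width bins, and by a kernel density estimate of the class-conditional densities. This is an illustrative sketch only, not the implementation used in the paper; the bin count, the Gaussian kernel, and Silverman's rule-of-thumb bandwidth are assumptions made for this example, and the paper's method evaluates whole feature subsets rather than a single feature.

    import numpy as np

    def mi_counting(x, y, n_bins=10):
        # Counting approach: discretise x into equal-width bins and use the
        # empirical joint distribution of (bin, class).
        edges = np.histogram_bin_edges(x, bins=n_bins)
        bx = np.digitize(x, edges[1:-1])          # bin index in 0 .. n_bins-1
        mi = 0.0
        for b in range(n_bins):
            for c in np.unique(y):
                p_bc = np.mean((bx == b) & (y == c))
                if p_bc > 0:
                    mi += p_bc * np.log(p_bc / (np.mean(bx == b) * np.mean(y == c)))
        return mi

    def mi_kde(x, y):
        # KDE approach: resubstitution estimate
        #   I(X;Y) ~ mean_i log( p(x_i | y_i) / p(x_i) )
        # with Gaussian kernels and Silverman's rule-of-thumb bandwidth.
        def kde_pdf(samples, points):
            h = 1.06 * samples.std() * len(samples) ** (-0.2)
            z = (points[:, None] - samples[None, :]) / h
            return np.exp(-0.5 * z ** 2).sum(axis=1) / (len(samples) * h * np.sqrt(2 * np.pi))

        classes = np.unique(y)
        prior = {c: np.mean(y == c) for c in classes}
        cond = {c: kde_pdf(x[y == c], x) for c in classes}       # p(x_i | Y = c)
        marginal = sum(prior[c] * cond[c] for c in classes)       # p(x_i)
        numer = np.array([cond[c][i] for i, c in enumerate(y)])   # p(x_i | y_i)
        return np.mean(np.log(numer / marginal))

    # Toy example: the feature's mean shifts with the class, so it is informative.
    rng = np.random.default_rng(0)
    y = rng.integers(0, 2, size=500)
    x = rng.normal(loc=1.5 * y, scale=1.0)
    print(mi_counting(x, y), mi_kde(x, y))

Both estimates depend on a hyperparameter (the number of bins for counting, the bandwidth for KDE), but the KDE estimate operates on the continuous values directly rather than on an arbitrary discretisation of the feature.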


Metadata
Title
Mutual information for feature selection: estimation or counting?
Authors
Hoai Bach Nguyen
Bing Xue
Peter Andreae
Publication date
01.09.2016
Publisher
Springer Berlin Heidelberg
Published in
Evolutionary Intelligence / Issue 3/2016
Print ISSN: 1864-5909
Electronic ISSN: 1864-5917
DOI
https://doi.org/10.1007/s12065-016-0143-4
