2016 | OriginalPaper | Chapter

A Targeted Estimation of Distribution Algorithm Compared to Traditional Methods in Feature Selection

Authors: Geoffrey Neumann, David Cairns

Published in: Computational Intelligence

Publisher: Springer International Publishing

Abstract

The Targeted Estimation of Distribution Algorithm (TEDA) introduces a ‘targeting’ process into an EDA/GA hybrid framework, whereby the number of active genes, or ‘control points’, in a solution is driven in an optimal direction. For larger feature selection problems with over a thousand features, traditional methods such as forward and backward selection are inefficient. Traditional EAs may perform better, but they are slow to optimize when a problem is sufficiently noisy that most large solutions are equally ineffective and effective optimization can begin only once much smaller solutions are discovered. By using targeting, TEDA drives down the feature set size quickly and so speeds up this process. TEDA, forward and backward selection, and traditional EAs were tested on feature selection problems with between 500 and 20,000 features, and it was confirmed that TEDA finds effective solutions significantly faster than the other approaches.
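
The targeting idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the extrapolation rule in `target_size`, the univariate marginal model in `marginal_probabilities`, and every function name here are assumptions made for illustration only. The sketch moves the feature count from the less fit parent's size past the fitter parent's size, then samples that many features from probabilities estimated over the selected parents.

```python
import random


def target_size(better_size, worse_size, n_features):
    """Hypothetical targeting rule: continue in the direction that led
    from the less fit parent's size to the fitter parent's size."""
    step = better_size - worse_size
    # Keep the target within the valid range of feature counts.
    return max(1, min(n_features, better_size + step))


def marginal_probabilities(selected_parents, n_features):
    """Fraction of selected parents in which each feature is active
    (a simple univariate model, assumed for this sketch)."""
    counts = [0] * n_features
    for parent in selected_parents:
        for index in parent:
            counts[index] += 1
    return [count / len(selected_parents) for count in counts]


def sample_solution(probs, k, rng):
    """Draw up to k distinct features, weighted by their marginals.
    Features with zero probability are never chosen, so fewer than k
    features are returned if not enough candidates exist."""
    candidates = [i for i, p in enumerate(probs) if p > 0]
    chosen = set()
    while candidates and len(chosen) < k:
        weights = [probs[i] for i in candidates]
        pick = rng.choices(candidates, weights=weights, k=1)[0]
        chosen.add(pick)
        candidates.remove(pick)
    return chosen


# Toy example: solutions are sets of active feature indices.
n_features = 10
better = {0, 3, 7, 9}        # fitter parent, 4 active features
worse = {0, 1, 3, 5, 7, 8}   # less fit parent, 6 active features

k = target_size(len(better), len(worse), n_features)  # 4 + (4 - 6) = 2
probs = marginal_probabilities([better, worse], n_features)
child = sample_solution(probs, k, random.Random(0))
print(k, sorted(child))
```

In this toy case the fitter parent is the smaller one, so the target size drops below both parents. That is the behaviour the abstract credits with driving the feature set size down quickly on noisy problems.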


Metadata
Title
A Targeted Estimation of Distribution Algorithm Compared to Traditional Methods in Feature Selection
Authors
Geoffrey Neumann
David Cairns
Copyright Year
2016
DOI
https://doi.org/10.1007/978-3-319-23392-5_5
