2020 | OriginalPaper | Chapter

Optimal Feature Selection for Decision Trees Induction Using a Genetic Algorithm Wrapper - A Model Approach

Authors: Prokopis K. Theodoridis, Dimitris C. Gkikas

Published in: Strategic Innovative Marketing and Tourism

Publisher: Springer International Publishing

Abstract

This paper describes a model for classifying optimised subsets of a dataset's attributes. Two algorithms, a genetic algorithm and a decision tree classifier, run in what appears to be parallel: together they form a wrapper that selects features through optimisation, with the aim of decreasing overfitting while maintaining accuracy. A wrapper method measures how useful a set of features is by the performance of the classifier trained on it. When large datasets are classified, the risk of overfitting is high. Instead of classifying the full dataset, a “smarter” approach classifies subsets of attributes, called chromosomes, which a genetic algorithm generates. The genetic algorithm searches for the best chromosomes across a series of successive populations called generations. It produces a large number of chromosomes, each holding a certain number of attributes, also called genes; the decision tree classifies each chromosome and assigns it a fitness value, which is the classification accuracy that the chromosome achieved. Only the strongest chromosomes pass to the next generation. This method reduces the number of genes classified and, at the same time, the risk of overfitting. At the end, the fittest chromosomes (sets of genes, that is, subsets of attributes) are presented. The method supports faster and more accurate decision making. The wrapper can be applied to digital marketing campaign metrics, analytics metrics, website ranking factors, content curation, keyword research, consumer/visitor behavior analysis and other areas of marketing and business interest.
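The loop the abstract outlines can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic binary dataset, the tiny Gini-based tree (a stand-in for a full decision tree learner), the truncation selection, one-point crossover, bit-flip mutation and every parameter value are assumptions chosen only to keep the example self-contained. Each chromosome is a bitmask over the attributes, and its fitness is the accuracy of a tree trained on the selected attributes and scored on held-out rows.

```python
import random
from collections import Counter

N_FEATURES = 6  # hypothetical dataset: 6 binary attributes, only 0 and 3 matter

def majority(labels):
    return Counter(labels).most_common(1)[0][0]

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def split_cost(rows, labels, f):
    """Size-weighted Gini impurity after splitting on binary feature f."""
    cost = 0.0
    for v in (0, 1):
        side = [y for r, y in zip(rows, labels) if r[f] == v]
        if side:
            cost += len(side) * gini(side)
    return cost

def build_tree(rows, labels, features, depth=3):
    """Tiny greedy tree on 0/1 features; leaves are plain class labels."""
    if depth == 0 or not features or len(set(labels)) == 1:
        return majority(labels)
    best = min(features, key=lambda f: split_cost(rows, labels, f))
    rest = [f for f in features if f != best]
    branches = []
    for v in (0, 1):
        sub = [(r, y) for r, y in zip(rows, labels) if r[best] == v]
        if sub:
            branches.append(build_tree([r for r, _ in sub],
                                       [y for _, y in sub], rest, depth - 1))
        else:
            branches.append(majority(labels))
    return (best, branches)

def predict(tree, row):
    while isinstance(tree, tuple):
        feature, branches = tree
        tree = branches[row[feature]]
    return tree

def fitness(chromosome, train, valid):
    """Wrapper fitness: held-out accuracy of a tree using only selected genes."""
    selected = [i for i, gene in enumerate(chromosome) if gene]
    if not selected:
        return 0.0
    tree = build_tree(train[0], train[1], selected)
    hits = sum(predict(tree, r) == y for r, y in zip(valid[0], valid[1]))
    return hits / len(valid[1])

def ga_select(train, valid, pop_size=20, generations=15, mutation_rate=0.05, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(N_FEATURES)] for _ in range(pop_size)]
    score = lambda c: fitness(c, train, valid)
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        survivors = pop[:pop_size // 2]          # only the strongest pass on
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            cut = rng.randrange(1, N_FEATURES)   # one-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        pop = survivors + children
    best = max(pop, key=score)
    return best, score(best)

def make_data(n, rng):
    rows = [[rng.randint(0, 1) for _ in range(N_FEATURES)] for _ in range(n)]
    labels = [r[0] & r[3] for r in rows]  # assumed ground truth: x0 AND x3
    return rows, labels

data_rng = random.Random(42)
train = make_data(200, data_rng)
valid = make_data(100, data_rng)
best, best_acc = ga_select(train, valid)
print("fittest chromosome:", best, "validation accuracy:", best_acc)
```

Because the surviving half of each population is carried over unchanged, the best fitness never decreases from one generation to the next. In practice the toy tree would be replaced by a real decision tree learner, and a single validation split by cross-validation, so that the fitness estimate itself does not overfit.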

Metadata

Copyright Year: 2020

DOI: https://doi.org/10.1007/978-3-030-36126-6_65