2016 | OriginalPaper | Book chapter

Kaizen Programming for Feature Construction for Classification

Written by: Vinícius Veloso de Melo, Wolfgang Banzhaf

Published in: Genetic Programming Theory and Practice XIII

Publisher: Springer International Publishing

Abstract

A data set for classification is commonly composed of a set of features defining the data space representation and one attribute corresponding to the instances’ class. A classification tool has to discover how to separate classes based on features, but the discovery of useful knowledge may be hampered by inadequate or insufficient features. Pre-processing steps for the automatic construction of new high-level features have been proposed to discover hidden relationships among features and to improve classification quality. Here we present a new tool for high-level feature construction: Kaizen Programming. This tool can construct many complementary/dependent high-level features simultaneously. We show that our approach outperforms related methods on well-known binary-class medical data sets using a decision-tree classifier, achieving greater accuracy and smaller trees.
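To illustrate why a constructed high-level feature can yield both greater accuracy and a smaller tree, the following minimal sketch compares a one-split decision stump on the original features with the same stump on a hand-written combined feature. It is not the Kaizen Programming algorithm, which searches for such features automatically; the toy data, the helper `best_stump_accuracy`, and the feature `x1 - x2` are all assumptions made for illustration.

```python
from itertools import product


def best_stump_accuracy(column, labels):
    """Best training accuracy of a one-split decision stump on one feature."""
    values = sorted(set(column))
    # Candidate thresholds: midpoints between consecutive distinct values.
    thresholds = [(a + b) / 2 for a, b in zip(values, values[1:])]
    # Baseline: predict the majority class without splitting.
    best = max(labels.count(0), labels.count(1)) / len(labels)
    for t in thresholds:
        for left_class in (0, 1):
            correct = sum(
                (y == left_class) if x <= t else (y == 1 - left_class)
                for x, y in zip(column, labels)
            )
            best = max(best, correct / len(labels))
    return best


# Hypothetical toy data: the class is 1 exactly when x1 > x2.
points = [(a, b) for a, b in product(range(1, 6), repeat=2) if a != b]
labels = [1 if a > b else 0 for a, b in points]

x1 = [a for a, _ in points]
x2 = [b for _, b in points]
# Hand-written high-level feature combining both original features.
constructed = [a - b for a, b in points]

# A single axis-aligned split on an original feature cannot separate the classes.
acc_original = max(best_stump_accuracy(x1, labels), best_stump_accuracy(x2, labels))
# One split on the constructed feature separates them.
acc_constructed = best_stump_accuracy(constructed, labels)

print(f"original features:   {acc_original:.2f}")
print(f"constructed feature: {acc_constructed:.2f}")
```

On the original features, a tree would need several nested splits to approximate the diagonal class boundary, whereas a single split suffices once the relationship between the two features is exposed as one new feature.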

Footnotes
1
Thirty-two runs were performed because 32 is a multiple of 8: the runs were executed in parallel on a quad-core machine with hyper-threading (8 logical processors), so all available processing units were used.
 
Metadata
Title
Kaizen Programming for Feature Construction for Classification
Written by
Vinícius Veloso de Melo
Wolfgang Banzhaf
Copyright year
2016
DOI
https://doi.org/10.1007/978-3-319-34223-8_3