Skip to main content
Erschienen in: Soft Computing 9/2010

01.07.2010 | Original Paper

Lossless fitness inheritance in genetic algorithms for decision trees

verfasst von: Dimitris Kalles, Athanasios Papagelis

Erschienen in: Soft Computing | Ausgabe 9/2010

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

When genetic algorithms are used to evolve decision trees, key tree quality parameters can be recursively computed and re-used across generations of partially similar decision trees. Simply storing instance indices at leaves is sufficient for fitness to be piecewise computed in a lossless fashion. We show the derivation of the (substantial) expected speedup on two bounding case problems and trace the attractive property of lossless fitness inheritance to the divide-and-conquer nature of decision trees. The theoretical results are supported by experimental evidence.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Anhänge
Nur mit Berechtigung zugänglich
Fußnoten
1
Note that Costi,leaf is a linear function, hence the approximation refers only to Costi,node. We have calculated the error of the approximation at about 8% for k = 8 (but the error increases with k). However, the approximation is pessimistic and we refer the reader to section 4 for a more detailed discussion on whether such pessimism is really warranted, offering a further at least double-fold improvement.
 
2
We again refer the reader to section 4 where this estimate is substantially improved by partially avoiding some reclassification effort.
 
3
All other GATree parameters were set at their default values, but each experiment was seeded at a different random number seed. We opted to test with different configurations as opposed to running the same experiment several times, each with a new random seed, to indicate the robustness potential of the proposed approach.
 
Literatur
Zurück zum Zitat Altenberg L (1997) NK fitness landscapes. In: Back T, Fogel D, Michalewicz Z (eds) The handbook of evolutionary computation, sect. B2.7.2. Oxford University Press, Oxford Altenberg L (1997) NK fitness landscapes. In: Back T, Fogel D, Michalewicz Z (eds) The handbook of evolutionary computation, sect. B2.7.2. Oxford University Press, Oxford
Zurück zum Zitat Bot M, Langdon W (2000) Application of genetic programming to induction of linear classification trees. In: Proceedings of the genetic and evolutionary computation conference. Las Vegas, NV Bot M, Langdon W (2000) Application of genetic programming to induction of linear classification trees. In: Proceedings of the genetic and evolutionary computation conference. Las Vegas, NV
Zurück zum Zitat Breiman L, Friedman JH, Olshen RA, Stone CJ (1984) Classification and regression trees. Wadsworth, BelmontMATH Breiman L, Friedman JH, Olshen RA, Stone CJ (1984) Classification and regression trees. Wadsworth, BelmontMATH
Zurück zum Zitat Cantú-Paz E, Kamath C (2003) Inducing oblique decision trees with evolutionary algorithms. IEEE Trans Evol Comput 7(1):54–68CrossRef Cantú-Paz E, Kamath C (2003) Inducing oblique decision trees with evolutionary algorithms. IEEE Trans Evol Comput 7(1):54–68CrossRef
Zurück zum Zitat Catlett J (1992). Peepholing: choosing attributes effectively for megainduction. In: Proceedings of the 9th international workshop on machine learning, Aberdeen, Scotland, pp 49–54 Catlett J (1992). Peepholing: choosing attributes effectively for megainduction. In: Proceedings of the 9th international workshop on machine learning, Aberdeen, Scotland, pp 49–54
Zurück zum Zitat Chai B, Huang T, Zhuang X, Zhao Y, Sklansky J (1996) Piecewise-linear classifiers using binary tree structure and genetic algorithm. Pattern Recognit 29(11):1905–1917CrossRef Chai B, Huang T, Zhuang X, Zhao Y, Sklansky J (1996) Piecewise-linear classifiers using binary tree structure and genetic algorithm. Pattern Recognit 29(11):1905–1917CrossRef
Zurück zum Zitat Ehrenburg H (1996) Improved directed acyclic graph evaluation and the combine operator in genetic programming. In: Proceedings of the 1st genetic programming conference. Stanford University, CA Ehrenburg H (1996) Improved directed acyclic graph evaluation and the combine operator in genetic programming. In: Proceedings of the 1st genetic programming conference. Stanford University, CA
Zurück zum Zitat Esmeir S, Markovitch S (2007) Anytime learning of decision trees. J Machine Learn Res 8:891–933 Esmeir S, Markovitch S (2007) Anytime learning of decision trees. J Machine Learn Res 8:891–933
Zurück zum Zitat Handley S (1994a) On the use of a directed acyclic graph to represent a population of computer programs. Proceedings of the IEEE World Congress on Computational Intelligence, Orlando, FL, pp 154–159 Handley S (1994a) On the use of a directed acyclic graph to represent a population of computer programs. Proceedings of the IEEE World Congress on Computational Intelligence, Orlando, FL, pp 154–159
Zurück zum Zitat Handley S (1994) On the use of a directed acyclic graph to represent a population of computer programs. In: Proceedings of the IEEE world congress on computational intelligence, Orlando, FL, pp 154–159 Handley S (1994) On the use of a directed acyclic graph to represent a population of computer programs. In: Proceedings of the IEEE world congress on computational intelligence, Orlando, FL, pp 154–159
Zurück zum Zitat Jansen T (1999) On classifications of fitness functions. Technical report CI-76/99. University of Dortmund, Germany Jansen T (1999) On classifications of fitness functions. Technical report CI-76/99. University of Dortmund, Germany
Zurück zum Zitat Jin Y (2005) A comprehensive survey of fitness approximation in evolutionary computation. Soft Comput 9(1):3–12CrossRef Jin Y (2005) A comprehensive survey of fitness approximation in evolutionary computation. Soft Comput 9(1):3–12CrossRef
Zurück zum Zitat Kalles D (1994) Decision trees and domain knowledge in pattern recognition. Dissertation, University of Manchester Institute of Science and Technology, UK Kalles D (1994) Decision trees and domain knowledge in pattern recognition. Dissertation, University of Manchester Institute of Science and Technology, UK
Zurück zum Zitat Kalles D, Papagelis A (2000) Stable decision trees: Using local anarchy for efficient incremental learning. Int J Artif Intell Tools 9(1):79–95CrossRef Kalles D, Papagelis A (2000) Stable decision trees: Using local anarchy for efficient incremental learning. Int J Artif Intell Tools 9(1):79–95CrossRef
Zurück zum Zitat Kalles D, Pierrakeas C (2006) Analyzing student performance in distance learning with genetic algorithms and decision trees. Appl Artif Intell 20(8):655–674CrossRef Kalles D, Pierrakeas C (2006) Analyzing student performance in distance learning with genetic algorithms and decision trees. Appl Artif Intell 20(8):655–674CrossRef
Zurück zum Zitat Koza JR (1991) Concept formation and decision tree induction using the genetic programming paradigm. In: Proceedings of the 1st workshop on parallel problem solving from nature. Springer, Berlin, pp 124–128 Koza JR (1991) Concept formation and decision tree induction using the genetic programming paradigm. In: Proceedings of the 1st workshop on parallel problem solving from nature. Springer, Berlin, pp 124–128
Zurück zum Zitat Krętowski M (2004) An evolutionary algorithm for oblique decision tree induction. In: Proceedings of ICAISC 2004. LNCS, vol 3070. Springer, Berlin, pp 432–437 Krętowski M (2004) An evolutionary algorithm for oblique decision tree induction. In: Proceedings of ICAISC 2004. LNCS, vol 3070. Springer, Berlin, pp 432–437
Zurück zum Zitat Krętowski M, Grześ M (2005) Global learning of decision trees by an evolutionary algorithm. In: Saeed K, Peja J (eds) Information processing and security systems, Springer, Berlin, pp 401–410 Krętowski M, Grześ M (2005) Global learning of decision trees by an evolutionary algorithm. In: Saeed K, Peja J (eds) Information processing and security systems, Springer, Berlin, pp 401–410
Zurück zum Zitat Linton RC (2004) Adapting binary fitness functions in genetic algorithms. In: Proceedings of the 42nd ACM southeast regional conference, Huntsville, AL, pp 391–395 Linton RC (2004) Adapting binary fitness functions in genetic algorithms. In: Proceedings of the 42nd ACM southeast regional conference, Huntsville, AL, pp 391–395
Zurück zum Zitat Mansour Y, McAllester D (2000) Generalization bounds for decision trees. In: Proceedings of the 13th annual conference on computer learning theory, San Francisco, CA, pp 69–80 Mansour Y, McAllester D (2000) Generalization bounds for decision trees. In: Proceedings of the 13th annual conference on computer learning theory, San Francisco, CA, pp 69–80
Zurück zum Zitat Meisel WS, Michalopoulos DA (1973) A partitioning algorithm with application in pattern classification and the optimization of decision trees. IEEE Trans Comput 22(1):93–103MATHCrossRef Meisel WS, Michalopoulos DA (1973) A partitioning algorithm with application in pattern classification and the optimization of decision trees. IEEE Trans Comput 22(1):93–103MATHCrossRef
Zurück zum Zitat Mitchell M (1996) An introduction to genetic algorithms. MIT Press, Cambridge Mitchell M (1996) An introduction to genetic algorithms. MIT Press, Cambridge
Zurück zum Zitat Musick R, Catlett J, Russel S (1993) Decision theoretic subsampling for induction on large databases. In: Proceedings of the 10th international conference on machine learning, Amherst, MA, pp 212–219 Musick R, Catlett J, Russel S (1993) Decision theoretic subsampling for induction on large databases. In: Proceedings of the 10th international conference on machine learning, Amherst, MA, pp 212–219
Zurück zum Zitat Naumov GE (1991) NP-completeness of problems of construction of optimal decision trees. Sov Phys Doklady 36(4):270–271MATHMathSciNet Naumov GE (1991) NP-completeness of problems of construction of optimal decision trees. Sov Phys Doklady 36(4):270–271MATHMathSciNet
Zurück zum Zitat Nikolaev N, Slavov V (1998) Inductive genetic programming with decision trees. Intell Data Anal 2(1):31–40CrossRef Nikolaev N, Slavov V (1998) Inductive genetic programming with decision trees. Intell Data Anal 2(1):31–40CrossRef
Zurück zum Zitat Pagallo G (1989) Learning DNF by decision trees. In: Proceedings of the 11th international joint conference on artificial intelligence, pp 639–644 Pagallo G (1989) Learning DNF by decision trees. In: Proceedings of the 11th international joint conference on artificial intelligence, pp 639–644
Zurück zum Zitat Papagelis A, Kalles D (2001) Breeding decision trees using evolutionary techniques. In: Proceedings of the 18th international conference on machine learning, Williamstown, MA, pp 393–400 Papagelis A, Kalles D (2001) Breeding decision trees using evolutionary techniques. In: Proceedings of the 18th international conference on machine learning, Williamstown, MA, pp 393–400
Zurück zum Zitat Pelikan M, Sastry K (2004) Fitness inheritance in the Bayesian optimization algorithm. In: Proceedings of the genetic and evolutionary computation conference, Seattle, WA, pp 48–59 Pelikan M, Sastry K (2004) Fitness inheritance in the Bayesian optimization algorithm. In: Proceedings of the genetic and evolutionary computation conference, Seattle, WA, pp 48–59
Zurück zum Zitat Quinlan JR (1986) Induction of decision trees. Machine Learn 1(1):81–106 Quinlan JR (1986) Induction of decision trees. Machine Learn 1(1):81–106
Zurück zum Zitat Quinlan JR (1993) C4.5: programs for machine learning. Morgan Kaufmann, San Mateo Quinlan JR (1993) C4.5: programs for machine learning. Morgan Kaufmann, San Mateo
Zurück zum Zitat Quinlan JR, Rivest RL (1989) Inferring decision trees using the Minimum Description Length principle. Inf Comput 80(3):227–248MATHCrossRefMathSciNet Quinlan JR, Rivest RL (1989) Inferring decision trees using the Minimum Description Length principle. Inf Comput 80(3):227–248MATHCrossRefMathSciNet
Zurück zum Zitat Roberts MA (2003) The effectiveness of cost based sub-tree caching mechanisms in typed genetic programming for image segmentation. In: Proceedings of EvoIASP, applications of evolutionary computation. LNCS, vol 2611, Essex, UK, pp 444–454 Roberts MA (2003) The effectiveness of cost based sub-tree caching mechanisms in typed genetic programming for image segmentation. In: Proceedings of EvoIASP, applications of evolutionary computation. LNCS, vol 2611, Essex, UK, pp 444–454
Zurück zum Zitat Rokach L (2008) Genetic algorithm-based feature set partitioning for classification problems. Pattern Recognit 41(5):1693–1717CrossRef Rokach L (2008) Genetic algorithm-based feature set partitioning for classification problems. Pattern Recognit 41(5):1693–1717CrossRef
Zurück zum Zitat Rokach L, Maimon O (2008) Data mining with decision trees: theory and applications. World Scientific, SingaporeMATH Rokach L, Maimon O (2008) Data mining with decision trees: theory and applications. World Scientific, SingaporeMATH
Zurück zum Zitat Sastry K, Goldberg DE, Pelikan M (2001) Don’t evaluate, inherit. In: Proceedings of the genetic and evolutionary computation conference, San Francisco, CA, pp 551–558 Sastry K, Goldberg DE, Pelikan M (2001) Don’t evaluate, inherit. In: Proceedings of the genetic and evolutionary computation conference, San Francisco, CA, pp 551–558
Zurück zum Zitat Smith RE, Dike BA, Stegmann SA (1995) Fitness inheritance in genetic algorithms. In: Proceedings of the symposium on applied computing, Nashville, TN, pp 345–350 Smith RE, Dike BA, Stegmann SA (1995) Fitness inheritance in genetic algorithms. In: Proceedings of the symposium on applied computing, Nashville, TN, pp 345–350
Zurück zum Zitat Teller A, Andre D (1997) Automatically choosing the number of fitness cases: The rational allocation of trials. In: Proceedings of the 2nd annual conference on genetic programming, Stanford University, CA, pp 321–328 Teller A, Andre D (1997) Automatically choosing the number of fitness cases: The rational allocation of trials. In: Proceedings of the 2nd annual conference on genetic programming, Stanford University, CA, pp 321–328
Zurück zum Zitat Turney PD (1995) Cost-sensitive classification: empirical evaluation of a hybrid genetic decision tree induction algorithm. J Artif Intell Res 2:369–409 Turney PD (1995) Cost-sensitive classification: empirical evaluation of a hybrid genetic decision tree induction algorithm. J Artif Intell Res 2:369–409
Zurück zum Zitat Witten IH, Frank E (2005) Data mining: practical machine learning tools and techniques, 2nd edn. Morgan Kaufmann, San Francisco Witten IH, Frank E (2005) Data mining: practical machine learning tools and techniques, 2nd edn. Morgan Kaufmann, San Francisco
Zurück zum Zitat Woodward JR (2006) Complexity and Cartesian genetic programming. In: Proceedings of the 9th European conference on genetic programming. Lecture notes in computer science, vol 3905, Budapest, Hungary, pp 260–269 Woodward JR (2006) Complexity and Cartesian genetic programming. In: Proceedings of the 9th European conference on genetic programming. Lecture notes in computer science, vol 3905, Budapest, Hungary, pp 260–269
Zurück zum Zitat Zhang B-T, Joung J-G (1999) Genetic programming with incremental data inheritance. In: Proceedings of the Genetic and Evolutionary Computation Conference, Orlando, FL, pp 1217–1224 Zhang B-T, Joung J-G (1999) Genetic programming with incremental data inheritance. In: Proceedings of the Genetic and Evolutionary Computation Conference, Orlando, FL, pp 1217–1224
Zurück zum Zitat Zygounas S (2004) Using genetic algorithms in the construction of decision trees. Dissertation (in Greek), Hellenic Open University, Greece Zygounas S (2004) Using genetic algorithms in the construction of decision trees. Dissertation (in Greek), Hellenic Open University, Greece
Metadaten
Titel
Lossless fitness inheritance in genetic algorithms for decision trees
verfasst von
Dimitris Kalles
Athanasios Papagelis
Publikationsdatum
01.07.2010
Verlag
Springer-Verlag
Erschienen in
Soft Computing / Ausgabe 9/2010
Print ISSN: 1432-7643
Elektronische ISSN: 1433-7479
DOI
https://doi.org/10.1007/s00500-009-0489-y

Weitere Artikel der Ausgabe 9/2010

Soft Computing 9/2010 Zur Ausgabe