Skip to main content
Erschienen in: Natural Computing 4/2008

01.12.2008

Repeated patterns in genetic programming

verfasst von: W. B. Langdon, W. Banzhaf

Erschienen in: Natural Computing | Ausgabe 4/2008

Einloggen

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Evolved genetic programming trees contain many repeated code fragments. Size fair crossover limits bloat in automatic programming, preventing the evolution of recurring motifs. We examine these complex properties in detail using depth vs. size Catalan binary tree shape plots, subgraph and subtree matching, information entropy, sensitivity analysis, syntactic and semantic fitness correlations. Programs evolve in a self-similar fashion, akin to fractal random trees, with diffuse introns. Data mining frequent patterns reveals that as software is progressively improved a large proportion of it is exactly repeated subtrees as well as exactly repeated subgraphs. We relate this emergent phenomenon to building blocks in GP and suggest GP works by jumbling subtrees which already have high fitness on the whole problem to give incremental improvements and create complete solutions with multiple identical components of different importance.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literatur
Zurück zum Zitat Achaz G, Rocha EPC, Netter P, Coissac E (2002) Origin and fate of repeats in bacteria. Nucleic Acids Res 30:2987–2994CrossRef Achaz G, Rocha EPC, Netter P, Coissac E (2002) Origin and fate of repeats in bacteria. Nucleic Acids Res 30:2987–2994CrossRef
Zurück zum Zitat Banzhaf W, Nordin P, Keller RE, Francone FD (1998) Genetic programming – an introduction. Morgan Kaufmann Banzhaf W, Nordin P, Keller RE, Francone FD (1998) Genetic programming – an introduction. Morgan Kaufmann
Zurück zum Zitat Blickle T (1996) Theory of evolutionary algorithms and application to system synthesis. PhD thesis, Swiss Federal Institute of Technology, Zurich Blickle T (1996) Theory of evolutionary algorithms and application to system synthesis. PhD thesis, Swiss Federal Institute of Technology, Zurich
Zurück zum Zitat Britten RJ, Kohnen DE (1968) Repeated sequences in DNA. Science 161:529–540CrossRef Britten RJ, Kohnen DE (1968) Repeated sequences in DNA. Science 161:529–540CrossRef
Zurück zum Zitat Koza JR (1992) Genetic programming: on the programming of computers by means of natural selection. MIT Press Koza JR (1992) Genetic programming: on the programming of computers by means of natural selection. MIT Press
Zurück zum Zitat Langdon WB (1998) Genetic programming and data structures. Kluwer, BostonMATH Langdon WB (1998) Genetic programming and data structures. Kluwer, BostonMATH
Zurück zum Zitat Langdon WB (2000) Size fair and homologous tree genetic programming crossovers. Genet Program Evol Mach 1(1/2):95–119MATHCrossRef Langdon WB (2000) Size fair and homologous tree genetic programming crossovers. Genet Program Evol Mach 1(1/2):95–119MATHCrossRef
Zurück zum Zitat Langdon WB, Banzhaf W (2005a) Repeated sequences in linear genetic programming genomes. Complex Syst 15(4):285–306MathSciNet Langdon WB, Banzhaf W (2005a) Repeated sequences in linear genetic programming genomes. Complex Syst 15(4):285–306MathSciNet
Zurück zum Zitat Langdon WB, Banzhaf W (2005b) Repeated patterns in tree genetic programming. In: Keijzer M, Tettamanzi A, Collet P, van Hemert JI, Tomassini M (eds) Proceedings of the 8th European conference on genetic programming, vol 3447 of Lecture Notes in Computer Science, Lausanne. Springer, Switzerland, pp 190–202 Langdon WB, Banzhaf W (2005b) Repeated patterns in tree genetic programming. In: Keijzer M, Tettamanzi A, Collet P, van Hemert JI, Tomassini M (eds) Proceedings of the 8th European conference on genetic programming, vol 3447 of Lecture Notes in Computer Science, Lausanne. Springer, Switzerland, pp 190–202
Zurück zum Zitat Langdon WB, Barrett SJ (2004) Genetic programming in data mining for drug discovery. In: Ghosh A, Jain LC (eds) Evolutionary computing in data mining, vol 163 of Studies in fuzziness and soft computing, chapter 10. Springer, pp 211– 235 Langdon WB, Barrett SJ (2004) Genetic programming in data mining for drug discovery. In: Ghosh A, Jain LC (eds) Evolutionary computing in data mining, vol 163 of Studies in fuzziness and soft computing, chapter 10. Springer, pp 211– 235
Zurück zum Zitat Langdon WB, Poli R (2002) Foundations of genetic programming. Springer-Verlag Langdon WB, Poli R (2002) Foundations of genetic programming. Springer-Verlag
Zurück zum Zitat Langdon WB, Soule T, Poli R, Foster JA (1999) The evolution of size and shape. In: Spector L, Langdon WB, O’Reilly U-M, Angeline PJ (eds) Advances in genetic programming 3, chapter 8. MIT Press, pp 163–190 Langdon WB, Soule T, Poli R, Foster JA (1999) The evolution of size and shape. In: Spector L, Langdon WB, O’Reilly U-M, Angeline PJ (eds) Advances in genetic programming 3, chapter 8. MIT Press, pp 163–190
Zurück zum Zitat Lupski JR, Weinstock GM (1992) Short, interspersed repetitive DNA sequences in prokaryotic genomes. J Bacteriol 174:4525–4529 Lupski JR, Weinstock GM (1992) Short, interspersed repetitive DNA sequences in prokaryotic genomes. J Bacteriol 174:4525–4529
Zurück zum Zitat Oakley H (1994) Two scientific applications of genetic programming: stack filters and non-linear equation fitting to chaotic data. In: Kinnear Jr KE (ed) Advances in genetic programming, chapter 17. MIT Press, pp 369–389 Oakley H (1994) Two scientific applications of genetic programming: stack filters and non-linear equation fitting to chaotic data. In: Kinnear Jr KE (ed) Advances in genetic programming, chapter 17. MIT Press, pp 369–389
Zurück zum Zitat O’Reilly U-M, Oppacher U-M (1995) The troubling aspects of a building block hypothesis for genetic programming. In: Whitley LD, Vose MD (eds) Foundations of genetic algorithms 3, 31 July–2 August 1994, Estes Park, Colorado, USA. Morgan Kaufmann, pp 73–88 O’Reilly U-M, Oppacher U-M (1995) The troubling aspects of a building block hypothesis for genetic programming. In: Whitley LD, Vose MD (eds) Foundations of genetic algorithms 3, 31 July–2 August 1994, Estes Park, Colorado, USA. Morgan Kaufmann, pp 73–88
Zurück zum Zitat Patience C, Wilkinson DA, Weiss RA (1997) Our retroviral heritage. Trends Genet 13:116–120CrossRef Patience C, Wilkinson DA, Weiss RA (1997) Our retroviral heritage. Trends Genet 13:116–120CrossRef
Zurück zum Zitat Poli R (2003) A simple but theoretically-motivated method to control bloat in genetic programming. In: Ryan C, Soule T, Keijzer M, Tsang E, Poli R, Costa E (eds) Genetic programming, Proceedings of EuroGP’2003, vol 2610 of LNCS, Essex, UK. Springer-Verlag, pp 204–217 Poli R (2003) A simple but theoretically-motivated method to control bloat in genetic programming. In: Ryan C, Soule T, Keijzer M, Tsang E, Poli R, Costa E (eds) Genetic programming, Proceedings of EuroGP’2003, vol 2610 of LNCS, Essex, UK. Springer-Verlag, pp 204–217
Zurück zum Zitat Poli R (2004) TinyGP. See TinyGP GECCO 2004 competition at http://cswww.essex.ac.uk/staff/sml/gecco/TinyGP.html Poli R (2004) TinyGP. See TinyGP GECCO 2004 competition at http://​cswww.​essex.​ac.​uk/​staff/​sml/​gecco/​TinyGP.​html
Zurück zum Zitat Reinhardt A, Hubbard T (1998) Using neural networks for prediction of the subcellular location of proteins. Nucleic Acids Res 26(9):2230–2236CrossRef Reinhardt A, Hubbard T (1998) Using neural networks for prediction of the subcellular location of proteins. Nucleic Acids Res 26(9):2230–2236CrossRef
Zurück zum Zitat Sedgewick R, Flajolet P (1996) An introduction to the analysis of algorithms. Addison-Wesley Sedgewick R, Flajolet P (1996) An introduction to the analysis of algorithms. Addison-Wesley
Zurück zum Zitat Shannon CE and Weaver W (1964) The mathematical theory of communication. The University of Illinois Press, Urbana Shannon CE and Weaver W (1964) The mathematical theory of communication. The University of Illinois Press, Urbana
Zurück zum Zitat Smit AFA (1996) The origin of interspersed repeats in the human genome. Curr Opin Genet Dev 6:743–748CrossRef Smit AFA (1996) The origin of interspersed repeats in the human genome. Curr Opin Genet Dev 6:743–748CrossRef
Zurück zum Zitat Syswerda G (1989) Uniform crossover in genetic algorithms. In Schaffer JD (ed) Proceedings of the third international conference on genetic algorithms, 4–7 June, George Mason University. Morgan Kaufmann, pp 2–9 Syswerda G (1989) Uniform crossover in genetic algorithms. In Schaffer JD (ed) Proceedings of the third international conference on genetic algorithms, 4–7 June, George Mason University. Morgan Kaufmann, pp 2–9
Zurück zum Zitat Toth G, Gaspari Z, Jurka J (2000) Microsatellites in different eukaryotic genomes: survey and analysis. Genome Res 10:967–981CrossRef Toth G, Gaspari Z, Jurka J (2000) Microsatellites in different eukaryotic genomes: survey and analysis. Genome Res 10:967–981CrossRef
Metadaten
Titel
Repeated patterns in genetic programming
verfasst von
W. B. Langdon
W. Banzhaf
Publikationsdatum
01.12.2008
Verlag
Springer Netherlands
Erschienen in
Natural Computing / Ausgabe 4/2008
Print ISSN: 1567-7818
Elektronische ISSN: 1572-9796
DOI
https://doi.org/10.1007/s11047-007-9038-8

Weitere Artikel der Ausgabe 4/2008

Natural Computing 4/2008 Zur Ausgabe

EditorialNotes

Preface

Premium Partner