Foundations of Genetic Programming | springerprofessional.de

Springer Professional

nach oben

2002 | Buch

Kapitel lesen Erstes Kapitel lesen

Foundations of Genetic Programming

verfasst von: William B. Langdon, Riccardo Poli

Verlag: Springer Berlin Heidelberg

Enthalten in: Professional Book Archive

Einloggen, um Zugang zu erhalten

Über dieses Buch

Genetic programming (GP), one of the most advanced forms of evolutionary computation, has been highly successful as a technique for getting computers to automatically solve problems without having to tell them explicitly how. Since its inceptions more than ten years ago, GP has been used to solve practical problems in a variety of application fields. Along with this ad-hoc engineering approaches interest increased in how and why GP works. This book provides a coherent consolidation of recent work on the theoretical foundations of GP. A concise introduction to GP and genetic algorithms (GA) is followed by a discussion of fitness landscapes and other theoretical approaches to natural and artificial evolution. Having surveyed early approaches to GP theory it presents new exact schema analysis, showing that it applies to GP as well as to the simpler GAs. New results on the potentially infinite number of possible programs are followed by two chapters applying these new techniques.

Anzeige

Inhaltsverzeichnis

Frontmatter

1. Introduction

Abstract

Living creatures are divided into species. Over time (which may be comparatively short or very long) species change, i.e. they evolve. There are several essential components to natural evolution. Individuals (e.g. animals or plants) within a species population are different. As a result, some live longer and are more likely to have children that survive to adulthood than others (natural selection). These animals (or plants etc.) are said to be fitter (cf. “survival of the fittest”). Sometimes the variation has a genetic component, i.e. the children may inherit it from their parents. Consequently after a number of generations the proportion of individuals within the species with this favourable inheritable characteristic tends to increase. That is, over time the species as a whole changes or evolves.

William B. Langdon, Riccardo Poli

2. Fitness Landscapes

Abstract

The idea of fitness landscapes, firstly proposed in genetics by Sewall Wright [Wright, 1932], is well established in artificial evolution. They are a metaphor used to help visualise problems. In the metaphor landscapes are viewed as being like an area of countryside. Looking vertically down from a great height the countryside appears like a map. This corresponds to the area to be searched. That is, the plan view is the search space.

William B. Langdon, Riccardo Poli

3. Program Component Schema Theories

Abstract

Since John Holland’s work in the mid-1970s and his well known schema theorem [Holland, 1975, Goldberg, 1989b], schemata are often used to explain why genetic algorithms work. Schemata are similarity templates representing entire groups of chromosomes. The schema theorem describes how schemata are expected to propagate generation after generation under the effects of selection, crossover and mutation. The usefulness of schemata has been recently criticised (see for example [Altenberg, 1995, Macready and Wolpert, 1996, Kargupta, 1995, Kargupta and Goldberg, 1994]) and many researchers nowadays believe that Holland’s schema theorem is nothing more than a trivial tautology of no use whatsoever (see for example [Vose, 1999, preface]). However, as stated in [Radcliffe, 1997], the problem with Holland’s schema theorem is not the theorem itself, rather its over-interpretation.

William B. Langdon, Riccardo Poli

4. Pessimistic GP Schema Theories

Abstract

In Chapter 3 we described early schema theories for genetic programming These are based on the idea of schemata as components within the program tree. In this chapter we will consider two GP schema theories of a different kind. These schemata have the advantage that not only can they be considered as parts of programs but, also that they are true subsets of the search space. In Section 4.1 we summarise the schema theory proposed by Justinian Rosca [Rosca, 1997a] that is applicable to GP with standard crossover. How ever, the main focus of the chapter is on our schema theory. This is based on a concept of schema which is much closer to the original concept of schema in genetic algorithms and it is applicable to GP (and to different kinds of GAs) when one-point crossover is used This is a generalisation of the corresponding GA operator in which the same crossover point is selected in both parents.

William B. Langdon, Riccardo Poli

5. Exact GP Schema Theorems

Abstract

In Chapter 4 we introduced pessimistic schema theories for genetic programming in which schemata are treated as subsets of the search space. These two theories (Rosca’s and ours) do not take schema creation into account and consider only the worst-case scenario for schema disruption. As a result they only provide lower bounds for the expected number of members of the population matching a given schema at the next generation.

William B. Langdon, Riccardo Poli

6. Lessons from the GP Schema Theory

Abstract

This chapter considers different ways in which we can use the schema theory described in the previous chapters to better understand the dynamics of genetic programming populations. We will also illustrate the possible uses of the schema theory to clarify concepts such as effective fitness (next section), biases in GP (Section 6.2), building blocks (Section 6.3), problem hardness and deception (Section 6.5). We also discuss new ideas inspired by schema theories (Section 6.4).

William B. Langdon, Riccardo Poli

7. The Genetic Programming Search Space

Abstract

We shall investigate how the space of all possible programs, i e the space which genetic programming searches, scales. Particularly how it changes with respect to the size of programs We will show, in general, that above some problem dependent threshold, considering all programs, their fitness shows little variation with their size. The distribution of fitness levels, particularly the distribution of solutions, gives us directly the performance of random search. We can use this as a benchmark against which to compare GP and other techniques. These results are demonstrated in this chapter using a combination of enumeration and Monte Carlo sampling. For the interested reader, formal proofs are given in Chapter 8. Informal; arguments are presented in Section 7.5 to extend our results to modular GP, memory and Turing complete GP. Section 7.6 considers the relationship between tree size and depth for various types of program. We finish this chapter with a discussion of these results and their implications.

William B. Langdon, Riccardo Poli

8. The GP Search Space: Theoretical Analysis

Abstract

We start this chapter by formally proving results given in Chapter 7 for linear and tree based GP. The rest of the chapter gives proofs for other properties of program search spaces. The implications of these results have been previously discussed in Section 7.7. For the reader who does not wish to study the proofs in detail, the main results are summarised next.

William B. Langdon, Riccardo Poli

9. Example I: The Artificial Ant

Abstract

In genetic algorithms, deception (i.e. false peaks in the fitness landscape) has long been recognised as an important barrier to performance however in genetic programming it has not received the recognition it deserves. In this and the next chapter we study two benchmark problems using the techniques described in the previous chapters. These show that deception is one of the reasons why these problems appear to be hard.

William B. Langdon, Riccardo Poli

10. Example II: The Max Problem

Abstract

In this second chapter on the application of our analysis techniques, we consider a second GP benchmark problem: the “Max Problem”. Briefly, this problem is to find a program, subject to a size or depth bound, which produces the largest value [Gathercole, 1998]. This problem is unusual in several respects: we know in advance what the solution is, what value it returns and and exactly how many solutions there are. Also, the size or depth limit is fundamental to the problem. MAX is hard for GP because of the way deception interacts with size and depth bounds to make it difficult to evolve solutions.

William B. Langdon, Riccardo Poli

11. GP Convergence and Bloat

Abstract

While genetic programming with one-point crossover behaves like a genetic algorithm (see Sections 4.4.6–4.4.8), the question of convergence in genetic programming, with standard subtree crossover has a mixed history. Using experimental evidence, we will suggest that bloat in GP is a manifestation of convergence.

William B. Langdon, Riccardo Poli

12. Conclusions

Abstract

The role of this concluding chapter is twofold: to summarise the major contributions but also to draw lessons.

William B. Langdon, Riccardo Poli

Backmatter

Titel: Foundations of Genetic Programming
verfasst von: William B. Langdon
Riccardo Poli
Copyright-Jahr: 2002
Verlag: Springer Berlin Heidelberg
Electronic ISBN: 978-3-662-04726-2
Print ISBN: 978-3-642-07632-9
DOI: https://doi.org/10.1007/978-3-662-04726-2

Weitere Formate
Fachgebiete
Bücher
Zeitschriften
Themenseiten
Jetzt Einzelzugang starten
Zugang für Unternehmen
Referenzkunden
SLX-Digitalkonferenz: Zukunftswerkstatt 2024