nach oben

2007 | Buch

Kapitel lesen Erstes Kapitel lesen

Genetic Programming

10th European Conference, EuroGP 2007, Valencia, Spain, April 11-13, 2007. Proceedings

herausgegeben von: Marc Ebner, Michael O’Neill, Anikó Ekárt, Leonardo Vanneschi, Anna Isabel Esparcia-Alcázar

Verlag: Springer Berlin Heidelberg

Buchreihe : Lecture Notes in Computer Science

Enthalten in: Springer Professional "Wirtschaft+Technik" , Springer Professional "Technik" , Springer Professional "Wirtschaft"

Einloggen, um Zugang zu erhalten

Inhaltsverzeichnis

Frontmatter

Plenary Talks

A Grammatical Genetic Programming Approach to Modularity in Genetic Algorithms

The ability of Genetic Programming to scale to problems of increasing difficulty operates on the premise that it is possible to capture regularities that exist in a problem environment by decomposition of the problem into a hierarchy of modules. As computer scientists and more generally as humans we tend to adopt a similar divide-and-conquer strategy in our problem solving. In this paper we consider the adoption of such a strategy for Genetic Algorithms. By adopting a modular representation in a Genetic Algorithm we can make efficiency gains that enable superior scaling characteristics to problems of increasing size. We present a comparison of two modular Genetic Algorithms, one of which is a Grammatical Genetic Programming algorithm, the meta-Grammar Genetic Algorithm (mGGA), which generates binary string sentences instead of traditional GP trees. A number of problems instances are tackled which extend the Checkerboard problem by introducing different kinds of regularity and noise. The results demonstrate some limitations of the modular GA (MGA) representation and how the mGGA can overcome these. The mGGA shows improved scaling when compared the MGA.

Erik Hemberg, Conor Gilligan, Michael O’Neill, Anthony Brabazon

An Empirical Boosting Scheme for ROC-Based Genetic Programming Classifiers

The so-called “boosting” principle was introduced by Schapire and Freund in the 1990s in relation to weak learners in the Probably Approximately Correct computational learning framework. Another practice that has developed in recent years consists in assessing the quality of evolutionary or genetic classifiers with Receiver Operating Characteristics (ROC) curves. Following the RankBoost algorithm by Freund et al., this article is a cross-bridge between these two techniques, and deals about boosting ROC-based genetic programming classifiers. Updating the weights after a boosting round turns to be the algorithm keystone since the ROC curve does not allow to know directly which training cases are learned or misclassified. We propose a geometrical interpretation of the ROC curve to attribute an error measure to every training case. We validate our ROCboost algorithm on several benchmarks from the UCI-Irvine repository, and we compare boosted Genetic Programming performance with published results on ROC-based Evolution Strategies and Support Vector Machines.

Denis Robilliard, Virginie Marion-Poty, Sébastien Mahler, Cyril Fonlupt

Confidence Intervals for Computational Effort Comparisons

When researchers make alterations to the genetic programming algorithm they almost invariably wish to measure the change in performance of the evolutionary system. No one specific measure is standard, but Koza’s computational effort statistic is frequently used [8]. In this paper the use of Koza’s statistic is discussed and a study is made of three methods that produce confidence intervals for the statistic. It is found that an approximate 95% confidence interval can be easily produced.

Matthew Walker, Howard Edwards, Chris Messom

Crossover Bias in Genetic Programming

Path length, or search complexity, is an understudied property of trees in genetic programming. Unlike size and depth measures, path length directly measures the balancedness or skewedness of a tree. Here a close relative to path length, called visitation length, is studied. It is shown that a population undergoing standard crossover will introduce a crossover bias in the visitation length. This bias is due to inserting variable length subtrees at various levels of the tree. The crossover bias takes the form of a covariance between the sizes and levels in the trees that form a population. It is conjectured that the crossover bias directly determines the size distribution of trees in genetic programming. Theorems are presented for the one-generation evolution of visitation length both with and without selection. The connection between path length and visitation length is made explicit.

Maarten Keijzer, James Foster

Density Estimation with Genetic Programming for Inverse Problem Solving

This paper addresses the resolution, by Genetic Programming (GP) methods, of ambiguous inverse problems, where for a single input, many outputs can be expected. We propose two approaches to tackle this kind of many-to-one inversion problems, each of them based on the estimation, by a team of predictors, of a probability density of the expected outputs. In the first one, Stochastic Realisation GP, the predictors outputs are considered as the realisations of an unknown random variable which distribution should approach the expected one. The second one, Mixture Density GP, directly models the expected distribution by the mean of a Gaussian mixture model, for which genetic programming has to find the parameters. Encouraging results are obtained on four test problems of different difficulty, exhibiting the interests of such methods.

Michael Defoin Platel, Sébastien Vérel, Manuel Clergue, Malik Chami

Empirical Analysis of GP Tree-Fragments

Researchers have attempted to explain the power of Genetic Programming (GP) search using various notions of schema. However empirical studies of schemas have been limited due to their vast numbers in typical populations. This paper addresses the problem of analyzing schemas represented by tree-fragments. It describes a new efficient way of representing the huge sets of fragments in a population of GP programs and presents an algorithm to find all fragments using this efficient representation. Using this algorithm, the paper presents an initial analysis of fragments in populations of up to 300 programs, each up to seven nodes deep. The analysis demonstrates a surprisingly large variation in the numbers of fragments through evolution and a non-monotonic rise in the most useful fragments. With his method, empirical investigation of the GP building block hypothesis and schema theory in realistic sized GP systems becomes possible.

Will Smart, Peter Andreae, Mengjie Zhang

Empirical Comparison of Evolutionary Representations of the Inverse Problem for Iterated Function Systems

In this paper we present an empirical comparison between evolutionary representations for the resolution of the inverse problem for iterated function systems (IFS). We introduce a class of problem instances that can be used for the comparison of the inverse IFS problem as well as a novel technique that aids exploratory analysis of experiment data. Our comparison suggests that representations that exploit problem specific information, apart from quality/fitness feedback, perform better for the resolution of the inverse problem for IFS.

Anargyros Sarafopoulos, Bernard Buxton

Evolution of an Efficient Search Algorithm for the Mate-In-N Problem in Chess

We propose an approach for developing efficient search algorithms through genetic programming. Focusing on the game of chess we evolve entire game-tree search algorithms to solve the

Mate-In-N

problem: find a key move such that even with the best possible counterplays, the opponent cannot avoid being mated in (or before) move

. We show that our evolved search algorithms successfully solve several instances of the Mate-In-N problem, for the hardest ones developing 47% less game-tree nodes than CRAFTY—a state-of-the-art chess engine with a ranking of 2614 points. Improvement is thus not over the basic alpha-beta algorithm, but over a world-class program using all standard enhancements.

Ami Hauptman, Moshe Sipper

Fast Genetic Programming on GPUs

As is typical in evolutionary algorithms, fitness evaluation in GP takes the majority of the computational effort. In this paper we demonstrate the use of the Graphics Processing Unit (GPU) to accelerate the evaluation of individuals. We show that for both binary and floating point based data types, it is possible to get speed increases of several hundred times over a typical CPU implementation. This allows for evaluation of many thousands of fitness cases, and hence should enable more ambitious solutions to be evolved using GP.

Simon Harding, Wolfgang Banzhaf

FIFTHTM: A Stack Based GP Language for Vector Processing

FIFTH

, a new stack-based genetic programming language, efficiently expresses solutions to a large class of feature recognition problems. This problem class includes mining time-series data, classification of multivariate data, image segmentation, and digital signal processing (DSP). FIFTH is based on FORTH principles. Key features of FIFTH are a single data stack for all data types and support for vectors and matrices as single stack elements. We demonstrate that the language characteristics allow simple and elegant representation of signal processing algorithms while maintaining the rules necessary to automatically evolve stack correct and control flow correct programs. FIFTH supports all essential program architecture constructs such as automatically defined functions, loops, branches, and variable storage. An XML configuration file provides easy selection from a rich set of operators, including domain specific functions such as the Fourier transform (FFT). The fully-distributed FIFTH environment (GPE5) uses CORBA for its underlying process communication.

Kenneth Holladay, Kay Robbins, Jeffery von Ronne

Genetic Programming with Fitness Based on Model Checking

Model checking is a way of analysing programs and program-like structures to decide whether they satisfy a list of temporal logic statements describing desired behaviour. In this paper we apply this to the fitness checking stage in an evolution strategy for learning finite state machines. We give experimental results consisting of learning the control program for a vending machine.

Colin G. Johnson

Geometric Particle Swarm Optimisation

Using a geometric framework for the interpretation of crossover of recent introduction, we show an intimate connection between particle swarm optimization (PSO) and evolutionary algorithms. This connection enables us to generalize PSO to virtually any solution representation in a natural and straightforward way. We demonstrate this for the cases of Euclidean, Manhattan and Hamming spaces.

Alberto Moraglio, Cecilia Di Chio, Riccardo Poli

GP Classifier Problem Decomposition Using First-Price and Second-Price Auctions

This work details an auction-based model for problem decomposition in Genetic Programming classification. The approach builds on the population-based methodology of Genetic Programming to evolve individuals that bid high for patterns that they can correctly classify. The model returns a set of individuals that decompose the problem by way of this bidding process and is directly applicable to multi-class domains. An investigation of two auction types emphasizes the effect of auction design on the properties of the resulting solution. The work demonstrates that auctions are an effective mechanism for problem decomposition in classification problems and that Genetic Programming is an effective means of evolving the underlying bidding behaviour.

Peter Lichodzijewski, Malcolm I. Heywood

Layered Learning in Boolean GP Problems

Layered learning is a decomposition and reuse technique that has proved to be effective in the evolutionary solution of difficult problems. Although previous work has integrated it with genetic programming (GP), much of the application of that research has been in relation to multi-agent systems. In extending this work, we have applied it to more conventional GP problems, specifically those involving Boolean logic. We have identified two approaches which, unlike previous methods, do not require prior understanding of a problem’s functional decomposition into sub-goals. Experimentation indicates that although one of the two approaches offers little advantage, the other leads to solution-finding performance significantly surpassing that of both conventional GP systems and those which incorporate automatically defined functions.

David Jackson, Adrian P. Gibbons

Mining Distributed Evolving Data Streams Using Fractal GP Ensembles

A Genetic Programming based boosting ensemble method for the classification of distributed streaming data is proposed. The approach handles flows of data coming from multiple locations by building a global model obtained by the aggregation of the local models coming from each node. A main characteristics of the algorithm presented is its adaptability in presence of concept drift. Changes in data can cause serious deterioration of the ensemble performance. Our approach is able to discover changes by adopting a strategy based on self-similarity of the ensemble behavior, measured by its fractal dimension, and to revise itself by promptly restoring classification accuracy. Experimental results on a synthetic data set show the validity of the approach in maintaining an accurate and up-to-date GP ensemble.

Gianluigi Folino, Clara Pizzuti, Giandomenico Spezzano

Multi-objective Genetic Programming for Improving the Performance of TCP

TCP is one of the fundamental components of the Internet. The performance of TCP is heavily dependent on the quality of its

round-trip time (RTT) estimator

, i.e. the formula that predicts dynamically the delay experienced by packets along a network connection. In this paper we apply

multi-objective genetic programming

for constructing an RTT estimator. We used two different approaches for multi-objective optimization and a collection of real traces collected at the mail server of our University. The solutions that we found outperform the RTT estimator currently used by all TCP implementations. This result could lead to several applications of genetic programming in the networking field.

Cyril Fillon, Alberto Bartoli

On Population Size and Neutrality: Facilitating the Evolution of Evolvability

The role of population size is investigated within a neutrality induced local optima free search space. Neutrality decouples genotypic variation in evolvability from fitness variation. Population diversity and neutrality work in conjunction to facilitate evolvability exploration whilst restraining its loss to drift, ultimately facilitating the evolution of evolvability. The characterising dynamics and implications are discussed.

Richard M. Downing

On the Limiting Distribution of Program Sizes in Tree-Based Genetic Programming

We provide strong theoretical and experimental evidence that standard sub-tree crossover with uniform selection of crossover points pushes a population of

-ary GP trees towards a distribution of tree sizes of the form:

$$ \Pr\{n\}= (1-ap_a) {a n+1 \choose n} \, (1-p_a)^{(a-1)n+1}\, p_a^{n} $$

where

is the number of internal nodes in a tree and

is a constant. This result generalises the result previously reported for the case

= 1.

Riccardo Poli, William B. Langdon, Stephen Dignum

Predicting Prime Numbers Using Cartesian Genetic Programming

Prime generating polynomial functions are known that can produce sequences of prime numbers (e.g. Euler polynomials). However, polynomials which produce consecutive prime numbers are much more difficult to obtain. In this paper, we propose approaches for both these problems. The first uses Cartesian Genetic Programming (CGP) to directly evolve integer based prime-prediction mathematical formulae. The second uses multi-chromosome CGP to evolve a digital circuit, which represents a polynomial. We evolved polynomials that can generate 43 primes in a row. We also found functions capable of producing the first 40 consecutive prime numbers, and a number of digital circuits capable of predicting up to 208 consecutive prime numbers, given consecutive input values. Many of the formulae have been previously unknown.

James Alfred Walker, Julian Francis Miller

Real-Time, Non-intrusive Evaluation of VoIP

Speech quality, as perceived by the users of Voice over Internet Protocol (VoIP) telephony, is critically important to the uptake of this service. VoIP quality can be degraded by network layer problems (delay, jitter, packet loss). This paper presents a method for real-time, non-intrusive speech quality estimation for VoIP that emulates the

subjective

listening quality measures based on

Mean Opinion Scores

(MOS). MOS provide the numerical indication of perceived quality of speech. We employ a Genetic Programming based symbolic regression approach to derive a speech quality estimation model. Our results compare favorably with the International Telecommunications Union-Telecommunication Standardization (ITU-T) PESQ algorithm which is the most widely accepted standard for speech quality estimation. Moreover, our model is suitable for real-time speech quality estimation of VoIP while PESQ is not. The performance of the proposed model was also compared to the new ITU-T recommendation P.563 for non-intrusive speech quality estimation and an improved performance was observed.

Adil Raja, R. Muhammad Atif Azad, Colin Flanagan, Conor Ryan

Training Binary GP Classifiers Efficiently: A Pareto-coevolutionary Approach

The conversion and extension of the Incremental Pareto-Coevolution Archive algorithm (IPCA) into the domain of Genetic Programming classification is presented. In particular, the coevolutionary aspect of the IPCA algorithm is utilized to simultaneously evolve a subset of the training data that provides distinctions between candidate classifiers. Empirical results indicate that such a scheme significantly reduces the computational overhead of fitness evaluation on large binary classification data sets. Moreover, unlike the performance of GP classifiers trained using alternative subset selection algorithms, the proposed Pareto-coevolutionary approach is able to match or better the classification performance of GP trained over all training exemplars. Finally, problem decomposition appears as a natural consequence of assuming a Pareto model for coevolution. In order to make use of this property a voting scheme is used to integrate the results of all classifiers from the Pareto front, post training.

Michal Lemczyk, Malcolm I. Heywood

Posters

A Comprehensive View of Fitness Landscapes with Neutrality and Fitness Clouds

We define a set of measures that capture some different aspects of neutrality in evolutionary algorithms fitness landscapes from a qualitative point of view. If considered all together, these measures offer a rather complete picture of the characteristics of fitness landscapes bound to neutrality and may be used as broad indicators of problem hardness. We compare the results returned by these measures with the ones of negative slope coefficient, a quantitative measure of problem hardness that has been recently defined and with success rate statistics on a well known genetic programming benchmark: the multiplexer problem. In order to efficaciously study the search space, we use a sampling technique that has recently been introduced and we show its suitability on this problem.

Leonardo Vanneschi, Marco Tomassini, Philippe Collard, Sébastien Vérel, Yuri Pirola, Giancarlo Mauri

Analysing the Regularity of Genomes Using Compression and Expression Simplification

We propose expression simplification and tree compression as aids in understanding the evolution of regular structure in Genetic Programming individuals. We apply the analysis to two previously-published algorithms, which aimed to promote regular and repeated structure. One relies on subtree duplication operators, the other uses repeated evaluation during a developmental process. Both successfully generated solutions to difficult problems, their success being ascribed to promotion of regular structure. Our analysis modifies this ascription: the evolution of regular structure is more complex than anticipated, and the success of the techniques may have arisen from a combination of promotion of regularity, and other, so far unidentified, effects.

Jungseok Shin, Moonyoung Kang, R. I. (Bob) McKay, Xuan Nguyen, Tuan-Hao Hoang, Naoki Mori, Daryl Essam

Changing the Genospace: Solving GA Problems with Cartesian Genetic Programming

Embedded Cartesian Genetic Programming (ECGP) is an extension of Cartesian Genetic Programming (CGP) capable of acquiring, evolving and re-using partial solutions. In this paper, we apply for the first time CGP and ECGP to the ones-max and order-3 deceptive problems, which are normally associated with Genetic Algorithms. Our approach uses CGP and ECGP to evolve a sequence of commands for a tape-head, which produces an arbitrary length binary string on a piece of tape. Computational effort figures are calculated for CGP and ECGP and our results compare favourably with those of Genetic Algorithms.

James Alfred Walker, Julian Francis Miller

Code Regulation in Open Ended Evolution

We explore a homeostatic approach to program execution in computer systems: the “concentration” of computation services is regulated according to their fitness. The goal is to obtain a self-healing effect so that the system can resist harmful mutations that could happen during on-line evolution. We present a model in which alternative program variants are stored in a repository representing the organism’s “genotype”. Positive feedback signals allow code in the repository to be

expressed

(in analogy to gene expression in biology), meaning that it is injected into a reaction vessel (execution environment) where it is executed and evaluated. Since execution is equivalent to a chemical reaction, the program is consumed in the process, therefore needs more feedback in order to be re-expressed. This leads to services that constantly regulate themselves to a stable condition given by the fitness feedback received from the users or the environment. We present initial experiments using this model, implemented using a chemical computing language.

Lidia Yamamoto

Data Mining of Genetic Programming Run Logs

We have applied a range of data mining techniques to a data base of log file records created from genetic programming runs on twelve different problems. We have looked for unexpected patterns, or golden nuggets in the data. Six were found. The main discoveries were a surprising amount of evaluation of duplicate programs across the twelve problems and one case of pathological behaviour which suggested a review of the genetic programming configuration. For problems with expensive fitness evaluation, the results suggest that there would be considerable speedup by caching evolved programs and fitness values. A data mining analysis performed routinely in a GP application could identify problems early and lead to more effective genetic programming applications.

Vic Ciesielski, Xiang Li

Evolving a Statistics Class Using Object Oriented Evolutionary Programming

Object Oriented Evolutionary Programming is used to evolve programs that calculate some statistical measures on a set of numbers. We compared this technique with a more standard functional representation. We also studied the effects of scalar and Pareto-based multi-objective fitness functions to the induction of multi-task programs. We found that the induction of a program residing in an OO representation space is more efficient, yielding less fitness evaluations, and that scalar fitness performed better than Pareto-based fitness in this problem domain.

Alexandros Agapitos, Simon M. Lucas

Evolving Modular Recursive Sorting Algorithms

A fundamental issue in evolutionary learning is the definition of the solution representation language. We present the application of Object Oriented Genetic Programming to the task of coevolving general recursive sorting algorithms along with their primitive representation alphabet. We report the computational effort required to evolve target solutions and provide a comparison between crossover and mutation variation operators, and also undirected random search. We found that the induction of evolved method signatures (typed parameters and return type) can be realized through an evolutionary fitness-driven process. We also found that the evolutionary algorithm outperformed undirected random search, and that mutation performed better than crossover in this problem domain. The main result is that modular sorting algorithms can be evolved.

Alexandros Agapitos, Simon M. Lucas

Fitness Landscape Analysis and Image Filter Evolution Using Functional-Level CGP

This work analyzes fitness landscapes for the image filter design problem approached using functional-level Cartesian Genetic Programming. Smoothness and ruggedness of fitness landscapes are investigated for five genetic operators. It is shown that the mutation operator and the single-point crossover operator generate the smoothest landscapes and thus they are useful for practical applications in this area. In contrast to the gate-level evolution, a destructive behavior of a simple crossover operator has not been confirmed.

Karel Slaný, Lukáš Sekanina

Genetic Programming Heuristics for Multiple Machine Scheduling

In this paper we present a method for creating scheduling heuristics for parallel proportional machine scheduling environment and arbitrary performance criteria. Genetic programming is used to synthesize the priority function which, coupled with an appropriate meta-algorithm for a given environment, forms the priority scheduling heuristic. We show that the procedures derived in this way can perform similarly or better than existing algorithms. Additionally, this approach may be particularly useful for those combinations of scheduling environment and criteria for which there are no adequate scheduling algorithms.

Domagoj Jakobović, Leonardo Jelenković, Leo Budin

Group-Foraging with Particle Swarms and Genetic Programming

This paper has been inspired by two quite different works in the field of Particle Swarm theory. Its main aims are to obtain particle swarm equations via genetic programming which perform better than hand-designed ones on the group-foraging problem, and to provide insight into behavioural ecology. With this work, we want to start a new research direction: the use of genetic programming together with particle swarm algorithms in the simulation of problems in behavioural ecology.

Cecilia Di Chio, Paolo Di Chio

Multiple Interactive Outputs in a Single Tree: An Empirical Investigation

This paper describes Multiple Interactive Outputs in a Single Tree (MIOST), a new form of Genetic Programming (GP). Our approach is based on two ideas. Firstly, we have taken inspiration from graph-GP representations. With this idea we decided to explore the possibility of representing programs as graphs with oriented links. Secondly, our individuals could have more than one output. This idea was inspired on the divide and conquer principle, a program is decomposed in subprograms, and so, we are expecting to make the original problem easier by breaking down a problem into two or more sub-problems. To verify the effectiveness of our approach, we have used several evolvable hardware problems of different complexity taken from the literature. Our results indicate that our approach has a better overall performance in terms of consistency to reach feasible solutions.

Kerwords:

Multiple Interactive Outputs in a Single Tree, Genetic Programming, Graph-GP representations.

Edgar Galván-López, Katya Rodríguez-Vázquez

Parsimony Doesn’t Mean Simplicity: Genetic Programming for Inductive Inference on Noisy Data

A Genetic Programming algorithm based on Solomonoff’s probabilistic induction is designed and used to face an Inductive Inference task, i.e., symbolic regression. To this aim, some test functions are dressed with increasing levels of noise and the algorithm is employed to denoise the resulting function and recover the starting functions. Then, the algorithm is compared against a classical parsimony–based GP. The results shows the superiority of the Solomonoff–based approach.

Ivanoe De Falco, Antonio Della Cioppa, Domenico Maisto, Umberto Scafuri, Ernesto Tarantino

The Holland Broadcast Language and the Modeling of Biochemical Networks

The Broadcast Language is a programming formalism devised by Holland in 1975, which aims at improving the efficiency of Genetic Algorithms (GAs) during long-term evolution. The key mechanism of the Broadcast Language is to allow GAs to employ an adaptable problem representation. Fixed problem encoding is commonly used by GAs but may limit their performance in particular cases. This paper describes an implementation of the Broadcast Language and its application to modeling biochemical networks. Holland presented the Broadcast Language in his book “Adaptation in Natural and Artificial Systems” where only a description of the language was provided, without any implementation. Our primary motivation for this work was the fact that there is currently no published implementation of the Broadcast Language available. Secondly, no additional examination of the Broadcast Language and its applications can be found in the literature. Holland proposed that the Broadcast Language would be suitable for the modeling of biochemical models. However, he did not support this belief with any experimental work. In this paper, we propose an implementation of the Broadcast Language which is then applied to the modeling of a signal transduction network. We conclude the paper by proposing that with some refinements it will be possible to use the Broadcast Language to evolve biochemical networks

in silico

James Decraene, George G. Mitchell, Barry McMullin, Ciaran Kelly

The Induction of Finite Transducers Using Genetic Programming

This paper reports on the results of a preliminary study conducted to evaluate genetic programming (GP) as a means of evolving finite state transducers. A genetic programming system representing each individual as a directed graph was implemented to evolve Mealy machines. Tournament selection was used to choose parents for the next generation and the reproduction, mutation and crossover operators were applied to the selected parents to create the next generation. The system was tested on six standard Mealy machine problems. The GP system was able to successfully induce solutions to all six problems. Furthermore, the solutions evolved were human-competitive and in all cases the minimal transducer was evolved.

Amashini Naidoo, Nelishia Pillay

Backmatter

Titel: Genetic Programming
herausgegeben von: Marc Ebner
Michael O’Neill
Anikó Ekárt
Leonardo Vanneschi
Anna Isabel Esparcia-Alcázar
Verlag: Springer Berlin Heidelberg
Electronic ISBN: 978-3-540-71605-1
Print ISBN: 978-3-540-71602-0
DOI: https://doi.org/10.1007/978-3-540-71605-1