Skip to main content
Top

2023 | Book

Genetic Programming

26th European Conference, EuroGP 2023, Held as Part of EvoStar 2023, Brno, Czech Republic, April 12–14, 2023, Proceedings

insite
SEARCH

About this book

This book constitutes the refereed proceedings of the 26th European Conference on Genetic Programming, EuroGP 2023, held as part of EvoStar 2023, in Brno, Czech Republic, during April 12–14, 2023, and co-located with the EvoStar events, EvoCOP, EvoMUSART, and EvoApplications.
The 14 revised full papers and 8 short papers presented in this book were carefully reviewed and selected from 38 submissions. The wide range of topics in this volume reflects the current state of research in the field. The collection of papers cover topics including developing new variants of GP algorithms for both optimization and machine learning problems as well as exploring GP to address complex real-world problems.

Table of Contents

Frontmatter

Long Presentations

Frontmatter
A Self-Adaptive Approach to Exploit Topological Properties of Different GAs’ Crossover Operators
Abstract
Evolutionary algorithms (EAs) are a family of optimization algorithms inspired by the Darwinian theory of evolution, and Genetic Algorithm (GA) is a popular technique among EAs. Similar to other EAs, common limitations of GAs have geometrical origins, like premature convergence, where the final population’s convex hull might not include the global optimum. Population diversity maintenance is a central idea to tackle this problem but is often performed through methods that constantly diminish the search space’s area. This work presents a self-adaptive approach, where the non-geometric crossover is strategically employed with geometric crossover to maintain diversity from a geometrical/topological perspective. To evaluate the performance of the proposed method, the experimental phase compares it against well-known diversity maintenance methods over well-known benchmarks. Experimental results clearly demonstrate the suitability of the proposed self-adaptive approach and the possibility of applying it to different types of crossover and EAs.
José Ferreira, Mauro Castelli, Luca Manzoni, Gloria Pietropolli
A Genetic Programming Encoder for Increasing Autoencoder Interpretability
Abstract
Autoencoders are powerful models for non-linear dimensionality reduction. However, their neural network structure makes it difficult to interpret how the high dimensional features relate to the low-dimensional embedding, which is an issue in applications where explainability is important. There have been attempts to replace both the neural network components in autoencoders with interpretable genetic programming (GP) models. However, for the purposes of interpretable dimensionality reduction, we observe that replacing only the encoder with GP is sufficient. In this work, we propose the Genetic Programming Encoder for Autoencoding (GPE-AE). GPE-AE uses a multi-tree GP individual as an encoder, while retaining the neural network decoder. We demonstrate that GPE-AE is a competitive non-linear dimensionality reduction technique compared to conventional autoencoders and a GP based method that does not use an autoencoder structure. As visualisation is a common goal for dimensionality reduction, we also evaluate the quality of visualisations produced by our method, and highlight the value of functional mappings by demonstrating insights that can be gained from interpreting the GP encoders.
Finn Schofield, Luis Slyfield, Andrew Lensen
Graph Networks as Inductive Bias for Genetic Programming: Symbolic Models for Particle-Laden Flows
Abstract
High-resolution simulations of particle-laden flows are computationally limited to a scale of thousands of particles due to the complex interactions between particles and fluid. Some approaches to increase the number of particles in such simulations require information about the fluid-induced force on a particle, which is a major challenge in this research area. In this paper, we present an approach to develop symbolic models for the fluid-induced force. We use a graph network as inductive bias to model the underlying pairwise particle interactions. The internal parts of the network are then replaced by symbolic models using a genetic programming algorithm. We include prior problem knowledge in our algorithm. The resulting equations show an accuracy in the same order of magnitude as state-of-the-art approaches for different benchmark datasets. They are interpretable and deliver important building blocks. Our approach is a promising alternative to “black-box” models from the literature.
Julia Reuter, Hani Elmestikawy, Fabien Evrard, Sanaz Mostaghim, Berend van Wachem
Phenotype Search Trajectory Networks for Linear Genetic Programming
Abstract
In this study, we visualise the search trajectories of a genetic programming system as graph-based models, where nodes are genotypes/phenotypes and edges represent their mutational transitions. We also quantitatively measure the characteristics of phenotypes including their genotypic abundance (the requirement for neutrality) and Kolmogorov complexity. We connect these quantified metrics with search trajectory visualisations, and find that more complex phenotypes are under-represented by fewer genotypes and are harder for evolution to discover. Less complex phenotypes, on the other hand, are over-represented by genotypes, are easier to find, and frequently serve as stepping-stones for evolution.
Ting Hu, Gabriela Ochoa, Wolfgang Banzhaf
GPAM: Genetic Programming with Associative Memory
Abstract
We focus on the evolutionary design of programs capable of capturing more randomness and outliers in the input data set than the standard genetic programming (GP)-based methods typically allow. We propose Genetic Programming with Associative Memory (GPAM) – a GP-based system for symbolic regression which can utilize a small associative memory to store various data points to better approximate the original data set. The method is evaluated on five standard benchmarks in which a certain number of data points is replaced by randomly generated values. In another case study, GPAM is used as an on-chip generator capable of approximating the weights for a convolutional neural network (CNN) to reduce the access to an external weight memory. Using Cartesian genetic programming (CGP), we evolved expression-memory pairs that can generate weights of a single CNN layer. If the associative memory contains 10% of the original weights, the weight generator evolved for a convolutional layer can approximate the original weights such that the CNN utilizing the generated weights shows less than a 1% drop in the classification accuracy on the MNIST data set.
Tadeas Juza, Lukas Sekanina
MAP-Elites with Cosine-Similarity for Evolutionary Ensemble Learning
Abstract
Evolutionary ensemble learning methods with Genetic Programming have achieved remarkable results on regression and classification tasks by employing quality-diversity optimization techniques like MAP-Elites and Neuro-MAP-Elites. The MAP-Elites algorithm uses dimensionality reduction methods, such as variational auto-encoders, to reduce the high-dimensional semantic space of genetic programming to a two-dimensional behavioral space. Then, it constructs a grid of high-quality and diverse models to form an ensemble model. In MAP-Elites, however, variational auto-encoders rely on Euclidean space topology, which is not effective at preserving high-quality individuals. To solve this problem, this paper proposes a principal component analysis method based on a cosine-kernel for dimensionality reduction. In order to deal with unbalanced distributions of good individuals, we propose a zero-cost reference points synthesizing method. Experimental results on 108 datasets show that combining principal component analysis using a cosine kernel with reference points significantly improves the performance of the MAP-Elites evolutionary ensemble learning algorithm.
Hengzhe Zhang, Qi Chen, Alberto Tonda, Bing Xue, Wolfgang Banzhaf, Mengjie Zhang
Small Solutions for Real-World Symbolic Regression Using Denoising Autoencoder Genetic Programming
Abstract
Denoising Autoencoder Genetic Programming (DAE-GP) is a model-based evolutionary algorithm that uses denoising autoencoder long short-term memory networks as probabilistic model to replace the standard recombination and mutation operators of genetic programming (GP). In this paper, we use the DAE-GP to solve a set of nine standard real-world symbolic regression tasks. We compare the prediction quality of the DAE-GP to standard GP, geometric semantic GP (GSGP), and the gene-pool optimal mixing evolutionary algorithm for GP (GOMEA-GP), and find that the DAE-GP shows similar prediction quality using a much lower number of fitness evaluations than GSGP or GOMEA-GP. In addition, the DAE-GP consistently finds small solutions. The best candidate solutions of the DAE-GP are 69% smaller (median number of nodes) than the best candidate solutions found by standard GP. An analysis of the bias of the selection and variation step for both the DAE-GP and standard GP gives insight into why differences in solution size exist: the strong increase in solution size for standard GP is a result of both selection and variation bias. The results highlight that learning and sampling from a probabilistic model is a promising alternative to classic GP variation operators where the DAE-GP is able to generate small solutions for real-world symbolic regression tasks.
David Wittenberg, Franz Rothlauf
Context Matters: Adaptive Mutation for Grammars
Abstract
This work proposes Adaptive Facilitated Mutation, a self-adaptive mutation method for Structured Grammatical Evolution (SGE), biologically inspired by the theory of facilitated variation. In SGE, the genotype of individuals contains a list for each non-terminal of the grammar that defines the search space. In our proposed mutation, each individual contains an array with a different, self-adaptive mutation rate for each non-terminal. We also propose Function Grouped Grammars, a grammar design procedure to enhance the benefits of the propose mutation. Experiments were conducted on three symbolic regression benchmarks using Probabilistic Structured Grammatical Evolution (PSGE), a variant of SGE. Results show our approach is similar or better when compared with the standard grammar and mutation.
Pedro Carvalho, Jessica Mégane, Nuno Lourenço, Penousal Machado
A Boosting Approach to Constructing an Ensemble Stack
Abstract
An approach to evolutionary ensemble learning for classification is proposed using genetic programming in which boosting is used to construct a stack of programs. Each application of boosting identifies a single champion and a residual dataset, i.e. the training records that thus far were not correctly classified. The next program is only trained against the residual, with the process iterating until some maximum ensemble size or no further residual remains. Training against a residual dataset actively reduces the cost of training. Deploying the ensemble as a stack also means that only one classifier might be necessary to make a prediction, so improving interpretability. Benchmarking studies are conducted to illustrate competitiveness with the prediction accuracy of current state-of-the-art evolutionary ensemble learning algorithms, while providing solutions that are orders of magnitude simpler. Further benchmarking with a high cardinality dataset indicates that the proposed method is also more accurate and efficient than XGBoost.
Zhilei Zhou, Ziyu Qiu, Brad Niblett, Andrew Johnston, Jeffrey Schwartzentruber, Nur Zincir-Heywood, Malcolm I. Heywood
Adaptive Batch Size CGP: Improving Accuracy and Runtime for CGP Logic Optimization Flow
Abstract
With the recent advances in the Machine Learning field, alongside digital circuits becoming more complex each day, machine learning based methods are being used in error-tolerant applications to solve the challenges imposed by large integrated circuits, where the designer can obtain a better overall circuit while relaxing its accuracy requirement. One of these methods is the Cartesian Genetic Programming (CGP), a subclass of Evolutionary Algorithms that uses concepts from biological evolution applied in electronic design automation. CGP-based approaches show advantages in the logic learning and logic optimization processes. However, the main challenge of CGP-based flows is the extensive runtime compared to other logic synthesis strategies. We propose a new strategy to tackle this challenge, called Adaptive Batch Size (ABS) CGP, in which the CGP algorithm incrementally improves the fitness estimation of the candidate solutions by using more terms of the truth table for evaluating them along the evolutionary process. The proposed approach was evaluated in nine exemplars from the IWLS 2020 contest, in which 3 exemplars are from the arithmetic domain, and six are from image recognition domain, specifically three from the CIFAR-10 dataset and three from the MNIST dataset. The results show that ABS presented an accuracy increase of up to 8.19% and decreased the number of candidate solutions evaluations required by up to 84.56%, in which directly affects the runtime of the algorithm. Furthermore, for all circuits, no significant accuracy reduction was observed while a significant reduction in the number of evaluations was achieved.
Bryan Martins Lima, Naiara Sachetti, Augusto Berndt, Cristina Meinhardt, Jonata Tyska Carvalho
Faster Convergence with Lexicase Selection in Tree-Based Automated Machine Learning
Abstract
In many evolutionary computation systems, parent selection methods can affect, among other things, convergence to a solution. In this paper, we present a study comparing the role of two commonly used parent selection methods in evolving machine learning pipelines in an automated machine learning system called Tree-based Pipeline Optimization Tool (TPOT). Specifically, we demonstrate, using experiments on multiple datasets, that lexicase selection leads to significantly faster convergence as compared to NSGA-II in TPOT. We also compare the exploration of parts of the search space by these selection methods using a trie data structure that contains information about the pipelines explored in a particular run.
Nicholas Matsumoto, Anil Kumar Saini, Pedro Ribeiro, Hyunjun Choi, Alena Orlenko, Leo-Pekka Lyytikäinen, Jari O. Laurikka, Terho Lehtimäki, Sandra Batista, Jason H. Moore
Using FPGA Devices to Accelerate Tree-Based Genetic Programming: A Preliminary Exploration with Recent Technologies
Abstract
In this paper, we explore the prospect of accelerating tree-based genetic programming (TGP) by way of modern field-programmable gate array (FPGA) devices, which is motivated by the fact that FPGAs can sometimes leverage larger amounts of data/function parallelism, as well as better energy efficiency, when compared to general-purpose CPU/GPU systems. In our preliminary study, we introduce a fixed-depth, tree-based architecture capable of evaluating type-consistent primitives that can be fully unrolled and pipelined. The current primitive constraints preclude arbitrary control structures, but they allow for entire programs to be evaluated every clock cycle. Using a variety of floating-point primitives and random programs, we compare to the recent TensorGP tool executing on a modern 8 nm GPU, and we show that our accelerator implemented on a 14 nm FPGA achieves an average speedup of 43\(\times \). When compared to the popular baseline tool DEAP executing across all cores of a 2-socket, 28-core (56-thread), 14 nm CPU server, our accelerator achieves an average speedup of 4,902\(\times \). Finally, when compared to the recent state-of-the-art tool Operon executing on the same 2-processor CPU system, our accelerator executes about 2.4\(\times \) slower on average. Despite not achieving an average speedup over every tool tested, our single-FPGA accelerator is the fastest in several instances, and we describe five future extensions that could allow for a 32–144\(\times \) speedup over our current design as well as allow for larger program depths/sizes. Overall, we estimate that a future version of our accelerator will constitute a state-of-the-art GP system for many applications.
Christopher Crary, Wesley Piard, Greg Stitt, Caleb Bean, Benjamin Hicks
Memetic Semantic Genetic Programming for Symbolic Regression
Abstract
This paper describes a new memetic semantic algorithm for symbolic regression (SR). While memetic computation offers a way to encode domain knowledge into a population-based process, semantic-based algorithms allow one to improve them locally to achieve a desired output. Hence, combining memetic and semantic enables us to (a) enhance the exploration and exploitation features of genetic programming (GP) and (b) discover short symbolic expressions that are easy to understand and interpret without losing the expressivity characteristics of symbolic regression. Experimental results show that our proposed memetic semantic algorithm can outperform traditional evolutionary and non-evolutionary methods on several real-world symbolic regression problems, paving a new direction to handle both the bloating and generalization endeavors of genetic programming.
Alessandro Leite, Marc Schoenauer
Grammatical Evolution with Code2vec
Abstract
This paper concerns using the code2vec vector embeddings of the source code to improve automatic source code generation in Grammatical Evolution. Focusing on a particular programming language, such as Java in the research presented, and being able to represent each Java function in the form of a continuous vector in a linear space by the code2vec model, GE gains some additional knowledge on similarities between constructed functions in the linear space instead of semantic similarities, which are harder to process. We propose a few improvements to the regular GE algorithm, including a code2vec-based initialization of the evolutionary algorithm and a code2vec-based crossover operator. Computational experiments confirm the efficiency of the approach proposed on a few typical benchmarks.
Michał Kowalczykiewicz, Piotr Lipiński

Short Presentations

Frontmatter
Domain-Aware Feature Learning with Grammar-Guided Genetic Programming
Abstract
Feature Learning (FL) is key to well-performing machine learning models. However, the most popular FL methods lack interpretability, which is becoming a critical requirement of Machine Learning. We propose to incorporate information from the problem domain in the structure of programs on top of the existing M3GP approach. This technique, named Domain-Knowledge M3GP, works by defining the possible feature transformations using a grammar through Grammar-Guided Genetic Programming. While requiring the user to specify the domain knowledge, this approach has the advantage of limiting the search space, excluding programs that make no sense to humans. We extend this approach with the possibility of introducing complex, aggregating queries over historic data. This extension allows to expand the search space to include relevant programs that were not possible before. We evaluate our methods on performance and interpretability in 6 use cases, showing promising results in both areas. We conclude that performance and interpretability of FL methods can benefit from domain-knowledge incorporation and aggregation, and give guidelines on when to use them.
Leon Ingelse, Alcides Fonseca
Genetic Improvement of LLVM Intermediate Representation
Abstract
Evolving LLVM IR is widely applicable, with LLVM Clang offering support for an increasing range of computer hardware and programming languages. Local search mutations are used to hill climb industry C code released to support geographic open standards: Open Location Code (OLC) from Google and Uber’s Hexagonal Hierarchical Spatial Index (H3), giving up to two percent speed up on compiler optimised code.
William B. Langdon, Afnan Al-Subaihin, Aymeric Blot, David Clark
Spatial Genetic Programming
Abstract
An essential characteristic of brains in intelligent organisms is their spatial organization, in which different parts of the brain are responsible for solving different classes of problems. Inspired by this concept, we introduce Spatial Genetic Programming (SGP) - a new GP paradigm in which Linear Genetic Programming (LGP) programs, represented as graph nodes, are spread in a 2D space. Each individual model is represented as a graph and the execution order of these programs is determined by the network of interactions between them. SGP considers space as a first-order effect to optimize which aids with determining the suitable order of execution of LGP programs to solve given problems and causes spatial dynamics to appear in the system. RetCons are internal SGP operators which enhance the evolution of conditional pathways in SGP model structures. To demonstrate the effectiveness of SGP, we have compared its performance and internal dynamics with LGP and TreeGP for a diverse range of problems, most of which require decision making. Our results indicate that SGP, due to its unique spatial organization, outperforms the other methods and solves a wide range of problems. We also carry out an analysis of the spatial properties of SGP individuals.
Iliya Miralavy, Wolfgang Banzhaf
All You Need is Sex for Diversity
Abstract
Maintaining genetic diversity as a means to avoid premature convergence is critical in Genetic Programming. Several approaches have been proposed to achieve this, with some focusing on the mating phase from coupling dissimilar solutions to some form of self-adaptive selection mechanism. In nature, genetic diversity can be the consequence of many different factors, but when considering reproduction Sexual Selection can have an impact on promoting variety within a species. Specifically, Mate Choice often results in different selective pressures between sexes, which in turn may trigger evolutionary differences among them. Although some mechanisms of Sexual Selection have been applied to Genetic Programming in the past, the literature is scarce when it comes to mate choice. Recently, a way of modelling mating preferences by ideal mate representations was proposed, achieving good results when compared to a standard approach. These mating preferences evolve freely in a self-adaptive fashion, creating an evolutionary driving force of its own alongside fitness pressure. The inner mechanisms of this approach operate from personal choice, as each individual has its own representation of a perfect mate which affects the mate to be selected.
In this paper, we compare this method against a random mate choice to assess whether there are advantages in evolving personal preferences. We conducted experiments using three symbolic regression problems and different mutation rates. The results show that self-adaptive mating preferences are able to create a more diverse set of solutions when compared to the traditional approach and a random mate approach (with statistically significant differences) and have a higher success rate in three of the six instances tested.
José Maria Simões, Nuno Lourenço, Penousal Machado
On the Effects of Collaborators Selection and Aggregation in Cooperative Coevolution: An Experimental Analysis
Abstract
Cooperative Coevolution is a way to solve complex optimization problems by dividing them in smaller, simpler sub-problems. Those sub-problems are then tackled concurrently by evolving one population of solutions—actually, components of a larger solution—for each of them. However, components cannot be evaluated in isolation: in the common case of two concurrently evolving populations, each solution of one population must be coupled with another solution of the other population (the collaborator) in order to compute the fitness of the pair. Previous studies have already shown that the way collaborators are chosen and, if more than one is chosen, the way the resulting fitness measures are aggregated, play a key role in determining the success of coevolution. In this paper we perform an experimental analysis aimed at shedding new light on the effects of collaborators selection and aggregation. We first propose a general scheme for cooperative coevolution of two populations that allows to (a) use different EAs and solution representations on the two sub-problems and to (b) set different collaborators selection and aggregation strategies. Second, we instantiate this general scheme in a few variants and apply it to four optimization problems with different degrees of separability: two toy problems and two real prediction problems tackled with different kinds of model (symbolic regression and neural networks). We analyze the outcomes in terms of (a) effectiveness and efficiency of the optimization and (b) complexity and generalization power of the solutions. We find that the degree to which selection and aggregation schemes differ strongly depends on the interaction between the components of the solution.
Giorgia Nadizar, Eric Medvet
To Bias or Not to Bias: Probabilistic Initialisation for Evolving Dispatching Rules
Abstract
The automatic generation of dispatching rules (DRs) for various scheduling problems using genetic programming (GP) has become an increasingly researched topic in recent years. Creating DRs in this way relieves domain experts of the tedious task of manually designing new rules, but also often leads to the discovery of better rules than those already available. However, developing new DRs is a computationally intensive process that takes time to converge to good solutions. One possible way to improve the convergence of evolutionary algorithms is to use a more sophisticated method to generate the initial population of individuals. In this paper, we propose a simple method for initialising individuals that uses probabilistic information from previously evolved DRs. The method extracts the information on how many times each node occurs at each level of the tree and in each context. This information is then used to introduce bias in the selection of the node to be selected at a particular position during the construction of the expression tree. The experiments show that with the proposed method it is possible to improve the convergence of GP when generating new DRs, so that GP can obtain high-quality DRs in a much shorter time.
Marko Đurasević, Francisco Javier Gil-Gala, Domagoj Jakobović
MTGP: Combining Metamorphic Testing and Genetic Programming
Abstract
Genetic programming is an evolutionary approach known for its performance in program synthesis. However, it is not yet mature enough for a practical use in real-world software development, since usually many training cases are required to generate programs that generalize to unseen test cases. As in practice, the training cases have to be expensively hand-labeled by the user, we need an approach to check the program behavior with a lower number of training cases. Metamorphic testing needs no labeled input/output examples. Instead, the program is executed multiple times, first on a given (randomly generated) input, followed by related inputs to check whether certain user-defined relations between the observed outputs hold. In this work, we suggest MTGP, which combines metamorphic testing and genetic programming and study its performance and the generalizability of the generated programs. Further, we analyze how the generalizability depends on the number of given labeled training cases. We find that using metamorphic testing combined with labeled training cases leads to a higher generalization rate than the use of labeled training cases alone in almost all studied configurations. Consequently, we recommend researchers to use metamorphic testing in their systems if the labeling of the training data is expensive.
Dominik Sobania, Martin Briesch, Philipp Röchner, Franz Rothlauf
Interacting Robots in an Artificial Evolutionary Ecosystem
Abstract
In Evolutionary Robotics where both body and brain are malleable, it is common practice to evaluate individuals in isolated environments. With the objective of implementing a more naturally plausible system, we designed a single interactive ecosystem for robots to be evaluated in. In this ecosystem robots are physically present and can interact each other and we implemented decentralized rules for mate selection and reproduction. To study the effects of evaluating robots in an interactive ecosystem has on evolution, we compare the evolutionary process with a more traditional, oracle–based approach. In our analysis, we observe how the different approach has a substantial impact on the final behaviour and morphology of the robots, while maintaining decent fitness performance.
Matteo De Carlo, Eliseo Ferrante, Jacintha Ellers, Gerben Meynen, A. E. Eiben
Backmatter
Metadata
Title
Genetic Programming
Editors
Gisele Pappa
Mario Giacobini
Zdenek Vasicek
Copyright Year
2023
Electronic ISBN
978-3-031-29573-7
Print ISBN
978-3-031-29572-0
DOI
https://doi.org/10.1007/978-3-031-29573-7

Premium Partner