Skip to main content
Top

2021 | Book

Genetic Programming

24th European Conference, EuroGP 2021, Held as Part of EvoStar 2021, Virtual Event, April 7–9, 2021, Proceedings

insite
SEARCH

About this book

This book constitutes the refereed proceedings of the 24th European Conference on Genetic Programming, EuroGP 2021, held as part of Evo*2021, as Virtual Event, in April 2021, co-located with the Evo*2021 events, EvoCOP, EvoMUSART, and EvoApplications. The 11 revised full papers and 6 short papers presented in this book were carefully reviewed and selected from 27 submissions. The wide range of topics in this volume reflects the current state of research in the field. The collection of papers cover interesting topics including developing new operators for variants of GP algorithms, as well as exploring GP applications to the optimisation of machine learning methods and the evolution of complex combinational logic circuits.

Table of Contents

Frontmatter

Long Talks

Frontmatter
Quality Diversity Genetic Programming for Learning Decision Tree Ensembles
Abstract
Quality Diversity (QD) algorithms are a class of population-based evolutionary algorithms designed to generate sets of solutions that are both fit and diverse. In this paper, we describe a strategy for applying QD concepts to the generation of decision tree ensembles by optimizing collections of trees for both individually accurate and collectively diverse predictive behavior. We compare three variants of this QD strategy with two existing ensemble generation strategies over several classification data sets. We then briefly highlight the effect of the evolutionary algorithm at the core of the strategy. The examined algorithms generate ensembles with distinct predictive behaviors as measured by classification accuracy and intrinsic diversity. The plotted behaviors hint at highly data-dependent relationships between these metrics. QD-based strategies are suggested as a means to optimize classifier ensembles along this performance curve along with other suggestions for future work.
Stephen Boisvert, John W. Sheppard
Progressive Insular Cooperative GP
Abstract
This work presents a novel genetic programming system for multi-class classification, called progressively insular cooperative genetic programming (PIC GP). Based on the idea that effective multiclass classification can be obtained by appropriately joining classifiers that are highly specialized on the single classes, PIC GP evolves, at the same time, two populations. The first population contains individuals called specialists, and each specialist is optimized on one specific target class. The second population contains higher-level individuals, called teams, that join specialists to obtain the final algorithm prediction. By means of three simple parameters, PIC GP can tune the amount of cooperation between specialists of different classes. The first part of the paper is dedicated to a study of the influence of these parameters on the evolution dynamics. The obtained results indicate that PIC GP achieves the best performance when the evolution begins with a high level of cooperation between specialists of different classes, and then this type of cooperation is progressively decreased, until only specialists of the same class can cooperate between each other. The last part of the work is dedicated to an experimental comparison between PIC GP and a set of state-of-the-art classification algorithms. The presented results indicate that PIC GP outperforms the majority of its competitors on the studied test problems.
Karina Brotto Rebuli, Leonardo Vanneschi
Regenerating Soft Robots Through Neural Cellular Automata
Abstract
Morphological regeneration is an important feature that highlights the environmental adaptive capacity of biological systems. Lack of this regenerative capacity significantly limits the resilience of machines and the environments they can operate in. To aid in addressing this gap, we develop an approach for simulated soft robots to regrow parts of their morphology when being damaged. Although numerical simulations using soft robots have played an important role in their design, evolving soft robots with regenerative capabilities have so far received comparable little attention. Here we propose a model for soft robots that regenerate through a neural cellular automata. Importantly, this approach only relies on local cell information to regrow damaged components, opening interesting possibilities for physical regenerable soft robots in the future. Our approach allows simulated soft robots that are damaged to partially regenerate their original morphology through local cell interactions alone and regain some of their ability to locomote. These results take a step towards equipping artificial systems with regenerative capacities and could potentially allow for more robust operations in a variety of situations and environments. The code for the experiments in this paper is available at: http://​github.​com/​KazuyaHoribe/​RegeneratingSoft​Robots.
Kazuya Horibe, Kathryn Walker, Sebastian Risi
Inclusive Genetic Programming
Abstract
The promotion and maintenance of the population diversity in a Genetic Programming (GP) algorithm was proved to be an important part of the evolutionary process. Such diversity maintenance improves the exploration capabilities of the GP algorithm, which as a consequence improves the quality of the found solutions by avoiding local optima. This paper aims to further investigate and prove the efficacy of a GP heuristic proposed in a previous work: the Inclusive Genetic Programming (IGP). Such heuristic can be classified as a niching technique, which performs the evolutionary operations like crossover, mutation and selection by considering the individuals belonging to different niches in order to maintain and exploit a certain degree of diversity in the population, instead of evolving the niches separately to find different local optima. A comparison between a standard formulation of GP and the IGP is carried out on nine different benchmarks coming from synthetic and real world data. The obtained results highlight how the greater diversity in the population, measured in terms of entropy, leads to better results on both training and test data, showing that an improvement on the generalization capabilities is also achieved.
Francesco Marchetti, Edmondo Minisci
Towards Incorporating Human Knowledge in Fuzzy Pattern Tree Evolution
Abstract
This paper shows empirically that Fuzzy Pattern Trees (FPT) evolved using Grammatical Evolution (GE), a system we call FGE, meet the criteria to be considered a robust Explainable Artificial Intelligence (XAI) system. Experimental results show FGE achieves competitive results against state of the art black box methods on a set of real world benchmark problems. Various selection methods were investigated to see which was best for finding smaller, more interpretable models and a human expert was recruited to test the interpretability of the models found and to give a confidence score for each model. Models which were deemed interpretable but not trustworthy by the expert were seen to be outperformed in classification accuracy by interpretable models which were judge trustworthy, validating that FGE can be a powerful XAI technique.
Aidan Murphy, Gráinne Murphy, Jorge Amaral, Douglas MotaDias, Enrique Naredo, Conor Ryan
Evolutionary Neural Architecture Search Supporting Approximate Multipliers
Abstract
There is a growing interest in automated neural architecture search (NAS) methods. They are employed to routinely deliver high-quality neural network architectures for various challenging data sets and reduce the designer’s effort. The NAS methods utilizing multi-objective evolutionary algorithms are especially useful when the objective is not only to minimize the network error but also to minimize the number of parameters (weights) or power consumption of the inference phase. We propose a multi-objective NAS method based on Cartesian genetic programming for evolving convolutional neural networks (CNN). The method allows approximate operations to be used in CNNs to reduce power consumption of a target hardware implementation. During the NAS process, a suitable CNN architecture is evolved together with approximate multipliers to deliver the best trade-offs between the accuracy, network size and power consumption. The most suitable approximate multipliers are automatically selected from a library of approximate multipliers. Evolved CNNs are compared with common human-created CNNs of a similar complexity on the CIFAR-10 benchmark problem.
Michal Pinos, Vojtech Mrazek, Lukas Sekanina
Automatic Design of Deep Neural Networks Applied to Image Segmentation Problems
Abstract
A U-Net is a convolutional neural network mainly used for image segmentation domains such as medical image analysis. As other deep neural networks, the U-Net architecture influences the efficiency and accuracy of the network. We propose the use of a grammar-based evolutionary algorithm for the automatic design of deep neural networks for image segmentation tasks. The approach used is called Dynamic Structured Grammatical Evolution (DSGE), which employs a grammar to define the building blocks that are used to compose the networks, as well as the rules that help build them. We perform a set of experiments on the BSDS500 and ISBI12 datasets, designing networks tuned to image segmentation and edge detection. Subsequently, by using image similarity metrics, the results of our best performing networks are compared with the original U-Net. The results show that the proposed approach is able to design a network that is less complex in the number of trainable parameters, while also achieving slightly better results than the U-Net with a more consistent training.
Ricardo Lima, Aurora Pozo, Alexander Mendiburu, Roberto Santana
On the Influence of Grammars on Crossover in Grammatical Evolution
Abstract
Standard grammatical evolution (GE) uses a one-point crossover (“ripple crossover”) that exchanges codons between two genotypes. The two resulting genotypes are then mapped to their respective phenotypes using a Backus-Naur form grammar. This article studies how different types of grammars affect the resulting individuals of a ripple crossover. We distinguish different grammars based on the expected number of non-terminals chosen when mapping genotype codons to phenotypes, \(B_{avg}\). The grammars only differ in \(B_{avg}\) but can express the same phenotypes. We perform crossover operations on the genotypes and find that grammars with \(B_{avg} > 1\) lead to high numbers of either very small trees or invalid individuals. Due to the re-sampling of the invalid individuals, the algorithmic runtime is higher compared to grammars with a small \(B_{avg}\), despite being able to express the same phenotypes. In grammars with \(B_{avg} \le 1\), the bias towards small trees is reduced and instead, the frequency of valid large trees is increased. Our results give insights on favorable grammar designs and underline the central role of grammar design in GE.
Dirk Schweim
On the Generalizability of Programs Synthesized by Grammar-Guided Genetic Programming
Abstract
Grammar-guided Genetic Programming is a common approach for program synthesis where the user’s intent is given by a set of input/output examples. For use in real-world software development, the generated programs must work on previously unseen test cases too. Therefore, we study in this work the generalizability of programs synthesized by grammar-guided GP with lexicase selection. As benchmark, we analyze proportionate and tournament selection too. We find that especially for program synthesis problems with a low output cardinality (e.g., a Boolean output) lexicase selection overfits the training cases and does not generalize well to unseen test cases. An analysis using common software metrics shows for such a problem that lexicase selection generates more complex programs with many code lines and a heavier use of control structures compared to the other studied selection methods. Nevertheless, the generalizability can be improved when we do not stop a GP run as usual after a first program is found that solves all training cases correctly, but give GP more time to find further solution candidates (also solving correctly all training cases) and select the smallest program (measured with different software metrics) out of these.
Dominik Sobania
Evolution of Complex Combinational Logic Circuits Using Grammatical Evolution with SystemVerilog
Abstract
Scalability problems have hindered the progress of Evolvable Hardware in tackling complex circuits. The two key issues are the amount of testing (for example, a 64-bit \(\times \) 64-bit add-shift multiplier problem has \(2^{64 + 64}\) test cases) and low level that hardware works at: a circuit to implement 64-bit \(\times \) 64-bit add-shift multiplier would require approximately 33,234 gates when synthesized using the powerful Yosys Open SYnthesis Suite tool. We use Grammatical Evolution and SystemVerilog, a Hardware Description Language (HDL), to evolve fully functional parameterized adder, multiplier and selective parity circuits with default input bit-width sizes of 64-bit + 64-bit, 64-bit \(\times \) 64-bit and 128-bit respectively.
These are substantially larger than the current state of the art for evolutionary approaches, specifically, 6.4\(\times \) (adder), 10.7\(\times \) (multiplier), and 6.7\(\times \) (parity). We are able to scale so dramatically because our use of an HDL permits us to operate at a far higher level of abstraction than most other approaches. This has the additional benefit that no further evolutionary experiments are needed to design different input bit-width sizes of the same circuit as is the case for existing EHW approaches. Thus, one can evolve once and reuse multiple times, simply by specifying the newly desired input/output bit-width sizes during module instantiation.
For example, 32-bit \(\times \) 32-bit and 256-bit \(\times \) 256-bit multipliers can be instantiated from an evolved parameterized multiplier. We also adopt a method for reducing testing from Digital Circuit Design known as corner case testing, well-known technique heavily relied upon by circuit designers to avoid time-consuming exhaustive testing; we demonstrate a simple way to identify and use corner cases for evolutionary testing and show that it enables the generation of massively complex circuits with a huge number of inputs.
We obtain successful results (ranging from 72% to 100%) on each benchmark and all three problems were tackled without resorting to the use of any standard decomposition methods due to our ability to use high-level programming constructs and operators available in SystemVerilog.
Michael Kwaku Tetteh, Douglas Mota Dias, Conor Ryan
Evofficient: Reproducing a Cartesian Genetic Programming Method
Abstract
Designing Neural Network Architectures requires expert knowledge and extensive parameter searches. Neural Architecture Search (NAS) aims to change that by automating the design process. It is important that these approaches are reproducible so they can be used in real-life scenarios. In our work, we reproduce a genetic programming approach to designing convolutional neural networks called CGP-CNN. We show that this is difficult and requires many changes to the training scheme, reducing real-life applicability. We achieve a final accuracy of \(90.6\% \pm 0.005\), substantially lower than the reported \(93.7\% \pm 0.005\). This negates some of the benefits of using CGP-CNN for NAS. We establish a random search as a consensus baseline and show that it produces similar results to the evolutionary method of CGP-CNN. To assess the adaptability and generality of the presented algorithm, it is applied to CIFAR-100 and SVHN with a final accuracy of 63.1% and 95.6%, respectively. We extend the investigated NAS by two methods for predicting candidate fitnesses from partial learning curves. This improves CGP-CNN runtime efficiency by a factor of 1.69.
Lorenz Wendlinger, Julian Stier, Michael Granitzer

Short Talks

Frontmatter
Software Anti-patterns Detection Under Uncertainty Using a Possibilistic Evolutionary Approach
Abstract
Code smells (a.k.a. anti-patterns) are manifestations of poor design solutions that could deteriorate the software maintainability and evolution. Despite the high number of existing detection methods, the issue of class label uncertainty is usually omitted. Indeed, two human experts may have different degrees of uncertainty about the smelliness of a particular software class not only for the smell detection task but also for the smell type identification one. Thus, this uncertainty should be taken into account and then processed by detection tools. Unfortunately, these latter usually reject and/or ignore uncertain data that correspond to software classes (i.e. dataset instances) with uncertain labels. This practice could considerably degrade the detection/identification process effectiveness. Motivated by this observation and the interesting performance of the Possibilistic K-NN (PK-NN) classifier in dealing with uncertain data, we propose a new possibilistic evolutionary detection method, named ADIPOK (Anti-patterns Detection and Identification using Possibilistic Optimized K-NNs), that is able to deal with label uncertainty using some concepts stemming from the Possibility theory. ADIPOK is validated using a possibilistic base of smell examples that simulates the subjectivity of software engineers’ opinions’ uncertainty. The statistical analysis of the obtained results on a set of comparative experiments with respect to four state-of-the-art methods show the merits of our proposed method.
Sofien Boutaib, Maha Elarbi, Slim Bechikh, Chih-Cheng Hung, Lamjed Ben Said
Probabilistic Grammatical Evolution
Abstract
Grammatical Evolution (GE) is one of the most popular Genetic Programming (GP) variants, and it has been used with success in several problem domains. Since the original proposal, many enhancements have been proposed to GE in order to address some of its main issues and improve its performance.
In this paper we propose Probabilistic Grammatical Evolution (PGE), which introduces a new genotypic representation and new mapping mechanism for GE. Specifically, we resort to a Probabilistic Context-Free Grammar (PCFG) where its probabilities are adapted during the evolutionary process, taking into account the productions chosen to construct the fittest individual. The genotype is a list of real values, where each value represents the likelihood of selecting a derivation rule. We evaluate the performance of PGE in two regression problems and compare it with GE and Structured Grammatical Evolution (SGE).
The results show that PGE has a better performance than GE, with statistically significant differences, and achieved similar performance when comparing with SGE.
Jessica Mégane, Nuno Lourenço, Penousal Machado
Evolving Allocation Rules for Beam Search Heuristics in Assembly Line Balancing
Abstract
We study the evolution of rules that define how to assign tasks to workstations in heuristic procedures for assembly line balancing. In assembly line balancing, a set of partially ordered tasks has to be assigned to workstations. The variant we consider, known as the assembly line worker assignment and balancing problem (ALWABP), has a fixed number of machines and workers, and different workers need different times to execute the tasks. A solution is an assignment of tasks and workers to workstations satisfying the partial order of the tasks, and the problem is to find a solution that maximizes the production rate of the assembly line. These problems are often solved by station-based assignment procedures, which use heuristic rules to select the tasks to be assigned to stations. There are many selection rules available in the literature. We show how efficient rules can be evolved, and demonstrate that rules evolved for simple assignment procedures are also effective in stochastic heuristic procedures using beam search, leading to improved heuristics.
João Pedro Gonçalves Moreira, Marcus Ritt
Incremental Evaluation in Genetic Programming
Abstract
Often GP evolves side effect free trees. These pure functional expressions can be evaluated in any order. In particular they can be interpreted from the genetic modification point outwards. Incremental evaluation exploits the fact that: in highly evolved children the semantic difference between child and parent falls with distance from the syntactic disruption (e.g. crossover point) and can reach zero before the whole child has been interpreted. If so, its fitness is identical to its parent (mum).
Considerable savings in bloated binary tree GP runs are given by exploiting population convergence with existing GPquick data structures, leading to near linear O(gens) runtime. With multi-threading and SIMD AVX parallel computing a 16 core desktop can deliver the equivalent of 571 billion GP operations per second, 571 giga GPop/s.
GP convergence is viewed via information theory as evolving a smooth landscape and software plasticity. Which gives rise to functional resilience to source code changes. On average a mixture of 100 +, −, \(\times \) and (protected) \(\div \) tree nodes remove test case effectiveness at exposing changes and so fail to propagate crossover infected errors.
William B. Langdon
Mining Feature Relationships in Data
Abstract
When faced with a new dataset, most practitioners begin by performing exploratory data analysis to discover interesting patterns and characteristics within data. Techniques such as association rule mining are commonly applied to uncover relationships between features (attributes) of the data. However, association rules are primarily designed for use on binary or categorical data, due to their use of rule-based machine learning. A large proportion of real-world data is continuous in nature, and discretisation of such data leads to inaccurate and less informative association rules. In this paper, we propose an alternative approach called feature relationship mining (FRM), which uses a genetic programming approach to automatically discover symbolic relationships between continuous or categorical features in data. To the best of our knowledge, our proposed approach is the first such symbolic approach with the goal of explicitly discovering relationships between features. Empirical testing on a variety of real-world datasets shows the proposed method is able to find high-quality, simple feature relationships which can be easily interpreted and which provide clear and non-trivial insight into data.
Andrew Lensen
Getting a Head Start on Program Synthesis with Genetic Programming
Abstract
We explore how to give Genetic Programming (GP) a head start to synthesize a programming problem. Our method uses a related problem and introduces a schedule that directs GP to solve the related problem first either fully or to some extent first, or at the same time. In addition, if the related problem’s solutions are written by students or evolved by GP, we explore the extent to which initializing the GP population with some of these solutions provides a head start. We find that having a population solve one programming problem before working to solve a related programming problem helps to a greater extent as the targeted problems and the intermediate problems themselves are selected to be more challenging.
Jordan Wick, Erik Hemberg, Una-May O’Reilly
Backmatter
Metadata
Title
Genetic Programming
Editors
Dr. Ting Hu
Nuno Lourenço
Eric Medvet
Copyright Year
2021
Electronic ISBN
978-3-030-72812-0
Print ISBN
978-3-030-72811-3
DOI
https://doi.org/10.1007/978-3-030-72812-0

Premium Partner