2020 | Book

Parallel Problem Solving from Nature – PPSN XVI

16th International Conference, PPSN 2020, Leiden, The Netherlands, September 5-9, 2020, Proceedings, Part II

Editors: Prof. Thomas Bäck, Mike Preuss, Dr. André Deutz, Hao Wang, Dr. Carola Doerr, Assoc. Prof. Michael Emmerich, Heike Trautmann

Publisher: Springer International Publishing

Book Series: Lecture Notes in Computer Science

About this book

This two-volume set LNCS 12269 and LNCS 12270 constitutes the refereed proceedings of the 16th International Conference on Parallel Problem Solving from Nature, PPSN 2020, held in Leiden, The Netherlands, in September 2020.

The 99 revised full papers were carefully reviewed and selected from 268 submissions. The topics cover classical subjects such as automated algorithm selection and configuration; Bayesian- and surrogate-assisted optimization; benchmarking and performance measures; combinatorial optimization; connection between nature-inspired optimization and artificial intelligence; genetic and evolutionary algorithms; genetic programming; landscape analysis; multiobjective optimization; real-world applications; reinforcement learning; and theoretical aspects of nature-inspired optimization.

Table of Contents

Frontmatter

Genetic Programming

Frontmatter
Generation of New Scalarizing Functions Using Genetic Programming

In recent years, there has been a growing interest in multiobjective evolutionary algorithms (MOEAs) with a selection mechanism different from Pareto dominance. This interest has been mainly motivated by the poor performance of Pareto-based selection mechanisms when dealing with problems having more than three objectives (the so-called many-objective optimization problems). Two viable alternatives for solving many-objective optimization problems are decomposition-based and indicator-based MOEAs. However, it is well-known that the performance of decomposition-based MOEAs (and also of indicator-based MOEAs designed around R2) heavily relies on the scalarizing function adopted. In this paper, we propose an approach for generating novel scalarizing functions using genetic programming. Using our proposed approach, we were able to generate two new scalarizing functions (called AGSF1 and AGSF2), which were validated using an indicator-based MOEA designed around R2 (MOMBI-II). This validation was conducted using a set of standard test problems and two performance indicators (hypervolume and s-energy). Our results indicate that AGSF1 has a similar performance to that obtained when using the well-known Achievement Scalarizing Function (ASF). However, AGSF2 provided a better performance than ASF in most of the test problems adopted. Nevertheless, our most remarkable finding is that genetic programming can indeed generate novel (and possibly more competitive) scalarizing functions.

Amín V. Bernabé Rodríguez, Carlos A. Coello Coello
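
For reference, a common textbook form of the Achievement Scalarizing Function (ASF) mentioned above, with objective vector $\mathbf{f}(\mathbf{x})$, weight vector $\mathbf{w}$ and reference point $\mathbf{z}^*$, is

$$\mathrm{ASF}(\mathbf{x};\, \mathbf{w}, \mathbf{z}^*) \;=\; \max_{i=1,\dots,m} \frac{f_i(\mathbf{x}) - z_i^*}{w_i},$$

to be minimized; this is the baseline the evolved AGSF1 and AGSF2 are compared against, not one of the functions generated in the paper.
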
The Usability Argument for Refinement Typed Genetic Programming

The performance of Evolutionary Algorithms is frequently hindered by arbitrarily large search spaces. In order to overcome this challenge, domain-specific knowledge is often used to restrict the representation or evaluation of candidate solutions to the problem at hand. Due to the diversity of problems and the unpredictable performance impact, the encoding of domain-specific knowledge is a frequent problem in the implementation of evolutionary algorithms. We propose the use of Refinement Typed Genetic Programming, an enhanced hybrid of Strongly Typed Genetic Programming (STGP) and Grammar-Guided Genetic Programming (GGGP) that features an advanced type system with polymorphism and dependent and refined types. We argue that this approach is more usable for describing common problems in machine learning, optimisation and program synthesis, due to the familiarity of the language (when compared to GGGP) and the use of a unifying language to express the representation, the phenotype translation, the evaluation function and the context in which programs are executed.

Alcides Fonseca, Paulo Santos, Sara Silva
Program Synthesis in a Continuous Space Using Grammars and Variational Autoencoders

An important but elusive goal of computer scientists is the automatic creation of computer programs given only input and output examples. We present a novel approach to program synthesis based on the combination of grammars, generative neural models, and evolutionary algorithms. Programs are described by sequences of productions sampled from a Backus-Naur form grammar. A sequence-to-sequence Variational Autoencoder (VAE) is trained to embed randomly sampled programs in a continuous space – the VAE’s encoder maps a sequence of productions (a program) to a point z in the latent space, and the VAE’s decoder reconstructs the program given z. After the VAE has converged, we can engage the decoder as a generative model that maps locations in the latent space to executable programs. Hence, an Evolutionary Algorithm can be employed to search for a vector z (and its corresponding program) that solves the synthesis task. Experiments on the program synthesis benchmark suite suggest that the proposed approach is competitive with tree-based GP and PushGP. Crucially, code can be synthesised in any programming language.

David Lynch, James McDermott, Michael O’Neill
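
As a generic illustration of the search step described above (not the paper's system), the following sketch runs a simple (1+λ) evolution strategy over a continuous latent space; `toy_decode` and `toy_fitness` are hypothetical stand-ins for the trained VAE decoder and the benchmark evaluation.

import numpy as np

def latent_search(decode, fitness, dim=32, sigma=0.5, lam=16, generations=100, seed=0):
    """(1+lambda) ES over a continuous latent space.

    `decode` maps a latent vector to a candidate program (any object) and
    `fitness` scores that program on the I/O examples (higher is better).
    Both are supplied by the caller; here they stand in for the trained VAE
    decoder and the benchmark evaluation described in the abstract.
    """
    rng = np.random.default_rng(seed)
    z = rng.normal(size=dim)                                # parent latent vector
    best = (fitness(decode(z)), z)
    for _ in range(generations):
        children = z + sigma * rng.normal(size=(lam, dim))  # Gaussian mutation in latent space
        scored = [(fitness(decode(c)), c) for c in children]
        cand = max(scored, key=lambda t: t[0])
        if cand[0] >= best[0]:                              # elitist replacement
            best, z = cand, cand[1]
    return best

# Toy stand-ins (NOT a real VAE decoder): decode a latent vector into a
# linear function and score it against example input/output pairs.
examples = [(x, 3 * x + 1) for x in range(5)]
toy_decode = lambda z: (lambda x: z[0] * x + z[1])
toy_fitness = lambda prog: -sum((prog(x) - y) ** 2 for x, y in examples)

score, z_best = latent_search(toy_decode, toy_fitness, dim=2, generations=200)
print(score)
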
Cooperative Co-Evolutionary Genetic Programming for High Dimensional Problems

We propose a framework for Cooperative Co-Evolutionary Genetic Programming (CCGP) that considers co-evolution at three different abstraction levels: genotype, feature and output level. A thorough empirical evaluation is carried out on a real-world high dimensional ML problem (image denoising). Results indicate that GP’s performance is enhanced only when cooperation happens at an output level (ensemble-alike). The proposed co-evolutionary ensemble approach is compared against a canonical GP implementation and a GP customized for image processing tasks. Preliminary results show that the proposed framework obtains superior average performance in comparison to the other GP models. Our most relevant finding is the empirical evidence showing that the proposed CCGP model is a promising alternative to specialized GP implementations that require knowledge of the problem’s domain.

Lino Rodriguez-Coayahuitl, Alicia Morales-Reyes, Hugo Jair Escalante, Carlos A. Coello Coello
Image Feature Learning with Genetic Programming

Learning features from raw data is an important topic in machine learning. This paper presents Genetic Program Feature Learner (GPFL), a novel generative GP feature learner for 2D images. GPFL executes multiple GP runs, each run generates a model that focuses on a particular high-level feature of the training images. Then, it combines the models generated by each run into a function that reconstructs the observed images. As a sanity check, we evaluated GPFL on the popular MNIST dataset of handwritten digits, and compared it with the convolutional neural network LeNet5. Our evaluation results show that when considering smaller training sets, GPFL achieves comparable/slightly-better classification accuracy than LeNet5. However, GPFL drastically outperforms LeNet5 when considering noisy images as test sets.

Stefano Ruberto, Valerio Terragni, Jason H. Moore
Learning a Formula of Interpretability to Learn Interpretable Formulas

Many risk-sensitive applications require Machine Learning (ML) models to be interpretable. Attempts to obtain interpretable models typically rely on tuning, by trial-and-error, hyper-parameters of model complexity that are only loosely related to interpretability. We show that it is instead possible to take a meta-learning approach: an ML model of non-trivial Proxies of Human Interpretability (PHIs) can be learned from human feedback, then this model can be incorporated within an ML training process to directly optimize for interpretability. We show this for evolutionary symbolic regression. We first design and distribute a survey aimed at finding a link between features of mathematical formulas and two established PHIs, simulatability and decomposability. Next, we use the resulting dataset to learn an ML model of interpretability. Lastly, we query this model to estimate the interpretability of evolving solutions within bi-objective genetic programming. We perform experiments on five synthetic and eight real-world symbolic regression problems, comparing to the traditional use of solution size minimization. The results show that the use of our model leads to formulas that are, for the same level of accuracy-interpretability trade-off, either significantly more accurate or equally accurate. Moreover, the formulas are also arguably more interpretable. Given the very positive results, we believe that our approach represents an important stepping stone for the design of next-generation interpretable (evolutionary) ML algorithms.

Marco Virgolin, Andrea De Lorenzo, Eric Medvet, Francesca Randone

Landscape Analysis

Frontmatter
On Stochastic Fitness Landscapes: Local Optimality and Fitness Landscape Analysis for Stochastic Search Operators

Fitness landscape analysis is a well-established tool for gaining insights about optimization problems and informing about the behavior of local and evolutionary search algorithms. In the conventional definition of a fitness landscape, the neighborhood of a given solution is a set containing nearby solutions whose distance is below a threshold, or that are reachable using a deterministic local search operator. In this paper, we generalize this definition in order to analyze the induced fitness landscape for stochastic search operators, that is, when neighboring solutions are reachable under different probabilities. In particular, we define a stochastic local optimum under this setting in terms of a probability to reach strictly improving solutions. We illustrate the relevance of stochastic fitness landscapes for enumerable combinatorial benchmark problems, and we empirically analyze their properties for different stochastic operators, neighborhood sample sizes, and local optimality thresholds. We also portray their differences through stochastic local optima networks, aiming to gain a better understanding of fitness landscapes under stochastic search operators.

Brahim Aboutaib, Sébastien Verel, Cyril Fonlupt, Bilel Derbel, Arnaud Liefooghe, Belaïd Ahiod
Fitness Landscape Analysis of Dimensionally-Aware Genetic Programming Featuring Feynman Equations

Genetic programming is an often-used technique for symbolic regression: finding symbolic expressions that match data from an unknown function. To make the symbolic regression more efficient, one can also use dimensionally-aware genetic programming that constrains the physical units of the equation. Nevertheless, there is no formal analysis of how much dimensionality awareness helps in the regression process. In this paper, we conduct a fitness landscape analysis of dimensionally-aware genetic programming search spaces on a subset of equations from Richard Feynman’s well-known lectures. We define an initialisation procedure and an accompanying set of neighbourhood operators for conducting the local search within the physical unit constraints. Our experiments show that the added information about the variable dimensionality can efficiently guide the search algorithm. Still, further analysis of the differences between the dimensionally-aware and standard genetic programming landscapes is needed to help in the design of efficient evolutionary operators to be used in a dimensionally-aware regression.

Marko Durasevic, Domagoj Jakobovic, Marcella Scoczynski Ribeiro Martins, Stjepan Picek, Markus Wagner
Global Landscape Structure and the Random MAX-SAT Phase Transition

We revisit the fitness landscape structure of random MAX-SAT instances, and address the question: what structural features change when we go from easy underconstrained instances to hard overconstrained ones? Some standard techniques such as autocorrelation analysis fail to explain what makes instances hard to solve for stochastic local search algorithms, indicating that deeper landscape features are required to explain the observed performance differences. We address this question by means of local optima network (LON) analysis and visualisation. Our results reveal that the number, size, and, most importantly, the connectivity pattern of local and global optima change significantly over the easy-hard transition. Our empirical results suggest that the landscape of hard MAX-SAT instances may feature sub-optimal funnels, that is, clusters of sub-optimal solutions where stochastic local search methods can get trapped.

Gabriela Ochoa, Francisco Chicano, Marco Tomassini
Exploratory Landscape Analysis is Strongly Sensitive to the Sampling Strategy

Exploratory landscape analysis (ELA) supports supervised learning approaches for automated algorithm selection and configuration by providing sets of features that quantify the most relevant characteristics of the optimization problem at hand. In black-box optimization, where an explicit problem representation is not available, the feature values need to be approximated from a small number of sample points. In practice, uniformly sampled random point sets and Latin hypercube constructions are commonly used sampling strategies. In this work, we analyze how the sampling method and the sample size influence the quality of the feature value approximations and how this quality impacts the accuracy of a standard classification task. While, not unexpectedly, increasing the number of sample points gives more robust estimates for the feature values, to our surprise we find that the feature value approximations for different sampling strategies do not converge to the same value. This implies that approximated feature values cannot be interpreted independently of the underlying sampling strategy. As our classification experiments show, this also implies that the feature approximations used for training a classifier must stem from the same sampling strategy as those used for the actual classification tasks. As a side result we show that classifiers trained with feature values approximated by Sobol’ sequences achieve higher accuracy than any of the standard sampling techniques. This may indicate improvement potential for ELA-trained machine learning models.

Quentin Renau, Carola Doerr, Johann Dreo, Benjamin Doerr
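
A small sketch of the three kinds of sampling strategies discussed above, assuming SciPy >= 1.7 for its quasi-Monte Carlo module; computing the actual ELA features from the samples (e.g. with a library such as pflacco) is left out.

import numpy as np
from scipy.stats import qmc   # requires SciPy >= 1.7

def sample_points(strategy, n, dim, seed=0):
    """Generate n points in [0, 1]^dim with the given sampling strategy."""
    if strategy == "uniform":
        return np.random.default_rng(seed).random((n, dim))
    if strategy == "lhs":
        return qmc.LatinHypercube(d=dim, seed=seed).random(n)
    if strategy == "sobol":
        return qmc.Sobol(d=dim, scramble=True, seed=seed).random(n)
    raise ValueError(strategy)

# Example: 256 sample points in 5 dimensions per strategy; ELA features would
# then be computed from the pairs (X, f(X)) for the black-box function f.
for strategy in ("uniform", "lhs", "sobol"):
    X = sample_points(strategy, 256, 5)
    print(strategy, X.shape)
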
One PLOT to Show Them All: Visualization of Efficient Sets in Multi-objective Landscapes

Visualization techniques for the decision space of continuous multi-objective optimization problems (MOPs) are rather scarce in research. For a long time, all techniques focused on global optimality, and even for the few available landscape visualizations, e.g., cost landscapes, globality is the main criterion. In contrast, the recently proposed gradient field heatmaps (GFHs) emphasize the location and attraction basins of local efficient sets, but ignore the relation of sets in terms of solution quality. In this paper, we propose a new and hybrid visualization technique, which combines the advantages of both approaches in order to represent local and global optimality together within a single visualization. Therefore, we build on the GFH approach but apply a new technique for approximating the location of locally efficient points and using the divergence of the multi-objective gradient vector field as a robust second-order condition. Then, the relative dominance relationship of the determined locally efficient points is used to visualize the complete landscape of the MOP. Augmented by information on the basins of attraction, this Plot of Landscapes with Optimal Trade-offs (PLOT) becomes one of the most informative multi-objective landscape visualization techniques available.

Lennart Schäpermeier, Christian Grimme, Pascal Kerschke

Multi-objective Optimization

Frontmatter
On Sharing Information Between Sub-populations in MOEA/S

This work investigates the effect of information exchange in decomposition methods that work with multi-membered populations as sub-problems. As an algorithm framework, we use the Multi-objective Evolutionary Algorithm based on Sub-populations (MOEA/S). This algorithm uses parallel sub-populations that can exchange information via migration and/or recombination. For this work, each sub-population is constructed from a few weighted utility functions, grouped by the distance between their weighting vectors. The question investigated in this paper is: How do the distance between sub-populations and the mechanism of information exchange influence the performance of MOEA/S? The study considers two ways of transferring information: (1) migration of individuals, (2) recombination using parents from two different sub-populations. A matrix describing the linkage patterns between sub-populations governs the migration and recombination mechanisms. This work conducts a systematic study using the multi-objective knapsack problem (MOKP) and the multi-objective traveling salesperson problem (MOTSP) with two and three objectives. The results motivated a restriction policy for sharing information. We compare an algorithm using this policy with other state-of-the-art MOEAs, including NSGA-III, MOEA/D, and the previous version of MOEA/S.

Lucas de Almeida Ribeiro, Michael Emmerich, Anderson da Silva Soares, Telma Woerle de Lima
Multi-objective Optimization by Uncrowded Hypervolume Gradient Ascent

Evolutionary algorithms (EAs) are the preferred method for solving black-box multi-objective optimization problems, but when gradients of the objective functions are available, it is not straightforward to exploit these efficiently. By contrast, gradient-based optimization is well-established for single-objective optimization. A single-objective reformulation of the multi-objective problem could therefore offer a solution. Of particular interest to this end is the recently introduced uncrowded hypervolume (UHV) indicator, which is Pareto compliant and also takes into account dominated solutions. In this work, we show that the gradient of the UHV can often be computed, which allows for a direct application of gradient ascent algorithms. We compare this new approach with two EAs for UHV optimization as well as with one gradient-based algorithm for optimizing the well-established hypervolume. On several bi-objective benchmarks, we find that gradient-based algorithms outperform the tested EAs by obtaining a better hypervolume with fewer evaluations whenever exact gradients of the multiple objective functions are available and in case of small evaluation budgets. For larger budgets, however, EAs perform similarly or better. We further find that, when finite differences are used to approximate the gradients of the multiple objectives, our new gradient-based algorithm is still competitive with EAs in most considered benchmarks. Implementations are available at https://github.com/scmaree/uncrowded-hypervolume.

Timo M. Deist, Stefanus C. Maree, Tanja Alderliesten, Peter A. N. Bosman
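
A minimal sketch of the idea of treating a whole solution set as one search point and ascending a hypervolume-based objective with finite-difference gradients. It uses a toy bi-objective problem and the plain 2-D hypervolume rather than the paper's uncrowded hypervolume, and it is not the authors' implementation (linked above).

import numpy as np

def objectives(x):
    """Toy bi-objective problem with a convex Pareto front (a stand-in)."""
    return np.array([x[0] ** 2 + np.sum(x[1:] ** 2),
                     (x[0] - 1.0) ** 2 + np.sum(x[1:] ** 2)])

def hypervolume_2d(points, ref):
    """2-D hypervolume of the non-dominated subset w.r.t. a reference point."""
    pts = np.asarray(points)
    pts = pts[np.all(pts < ref, axis=1)]
    if len(pts) == 0:
        return 0.0
    pts = pts[np.argsort(pts[:, 0])]
    hv, best_f2 = 0.0, ref[1]
    for f1, f2 in pts:
        if f2 < best_f2:                      # non-dominated so far
            hv += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return hv

def set_fitness(flat, p, dim, ref):
    X = flat.reshape(p, dim)
    return hypervolume_2d([objectives(x) for x in X], ref)

# Finite-difference gradient ascent on the concatenated decision variables of
# p solutions.  The paper optimises the uncrowded hypervolume (UHV), which in
# addition penalises dominated solutions; this sketch uses the plain
# hypervolume of the non-dominated subset, so dominated points get no gradient.
p, dim, ref, h, eta = 5, 3, np.array([4.0, 4.0]), 1e-6, 0.05
rng = np.random.default_rng(0)
flat = rng.uniform(0.0, 1.0, size=p * dim)
for _ in range(500):
    grad = np.zeros_like(flat)
    base = set_fitness(flat, p, dim, ref)
    for i in range(flat.size):                # forward differences
        e = np.zeros_like(flat)
        e[i] = h
        grad[i] = (set_fitness(flat + e, p, dim, ref) - base) / h
    flat += eta * grad
print("final hypervolume:", set_fitness(flat, p, dim, ref))
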
An Ensemble Indicator-Based Density Estimator for Evolutionary Multi-objective Optimization

Ensemble learning is one of the most employed methods in machine learning. Its main idea is the construction of stronger mechanisms based on the combination of elementary ones. In this paper, we employ AdaBoost, which is one of the most well-known ensemble methods, to generate an ensemble indicator-based density estimator for multi-objective optimization. It combines the search properties of five density estimators, based on the hypervolume, R2, IGD$^+$, $\epsilon^+$, and $\Delta_p$ quality indicators. Through the multi-objective evolutionary search process, the proposed ensemble mechanism adapts itself using a learning process that takes the preferences of the underlying quality indicators into account. The proposed method gives rise to the ensemble indicator-based multi-objective evolutionary algorithm (EIB-MOEA), which shows a robust performance on different multi-objective optimization problems when compared with several existing indicator-based multi-objective evolutionary algorithms.

Jesús Guillermo Falcón-Cardona, Arnaud Liefooghe, Carlos A. Coello Coello
Ensuring Smoothly Navigable Approximation Sets by Bézier Curve Parameterizations in Evolutionary Bi-objective Optimization

The aim of bi-objective optimization is to obtain an approximation set of (near) Pareto optimal solutions. A decision maker then navigates this set to select a final desired solution, often using a visualization of the approximation front. The front provides a navigational ordering of solutions to traverse, but this ordering does not necessarily map to a smooth trajectory through decision space. This forces the decision maker to inspect the decision variables of each solution individually, potentially making navigation of the approximation set unintuitive. In this work, we aim to improve approximation set navigability by enforcing a form of smoothness or continuity between solutions in terms of their decision variables. Imposing smoothness as a restriction upon common domination-based multi-objective evolutionary algorithms is not straightforward. Therefore, we use the recently introduced uncrowded hypervolume (UHV) to reformulate the multi-objective optimization problem as a single-objective problem in which parameterized approximation sets are directly optimized. We study here the case of parameterizing approximation sets as smooth Bézier curves in decision space. We approach the resulting single-objective problem with the gene-pool optimal mixing evolutionary algorithm (GOMEA), and we call the resulting algorithm BezEA. We analyze the behavior of BezEA and compare it to optimization of the UHV with GOMEA as well as the domination-based multi-objective GOMEA. We show that high-quality approximation sets can be obtained with BezEA, sometimes even outperforming the domination- and UHV-based algorithms, while smoothness of the navigation trajectory through decision space is guaranteed.

Stefanus C. Maree, Tanja Alderliesten, Peter A. N. Bosman
Many-Objective Test Database Generation for SQL

Generating test databases for SQL queries is an important but challenging task in software engineering. Existing approaches have modeled the task as a single-objective optimization problem. However, due to the improper handling of the relationship between different targets, the existing approaches face strong limitations, which we summarize as the inter-objective barrier and the test database bloating barrier. In this study, we propose a two-stage approach, MoeSQL, which combines a many-objective evolutionary algorithm with decomposition-based test database reduction. The effectiveness of MoeSQL lies in its ability to handle multiple targets simultaneously and in a local search that keeps the test database from bloating. Experiments over 1888 SQL queries demonstrate that MoeSQL achieves high coverage comparable to the state-of-the-art algorithm EvoSQL and obtains more compact solutions, only 59.47% the size of those obtained by EvoSQL as measured by the overall number of data rows.

Zhilei Ren, Shaozheng Dong, Xiaochen Li, Zongzheng Chi, He Jiang
A New Paradigm in Interactive Evolutionary Multiobjective Optimization

Over the years, scalarization functions have been used to solve multiobjective optimization problems by converting them to one or more single-objective optimization problem(s). This study proposes a novel idea of solving multiobjective optimization problems in an interactive manner by using multiple scalarization functions to map vectors in the objective space to a new, so-called preference incorporated space (PIS). In this way, the original problem is converted into a new multiobjective optimization problem with typically fewer objectives in the PIS. This mapping enables a modular incorporation of the decision maker’s preferences to convert any evolutionary algorithm to an interactive one, where preference information directs the solution process. Advantages of optimizing in this new space are discussed and the idea is demonstrated with two interactive evolutionary algorithms: IOPIS/RVEA and IOPIS/NSGA-III. According to the experiments conducted, the new algorithms provide solutions that are better in quality than those of state-of-the-art evolutionary algorithms and their variants in which preference information is incorporated in the original objective space. Furthermore, they require fewer function evaluations to obtain these promising results.

Bhupinder Singh Saini, Jussi Hakanen, Kaisa Miettinen
Hypervolume Optimal μ-Distributions on Line-Based Pareto Fronts in Three Dimensions

Hypervolume optimal $\mu$-distribution is a fundamental research topic which investigates the distribution of $\mu$ solutions on the Pareto front for hypervolume maximization. It has been theoretically shown that the optimal distribution of $\mu$ solutions on a linear Pareto front in two dimensions is the one with $\mu$ equispaced solutions. However, the equispaced property of an optimal distribution does not always hold for a single-line Pareto front in three dimensions. It only holds for the single-line Pareto front where one objective of the Pareto front is constant. In this paper, we further theoretically investigate the hypervolume optimal $\mu$-distribution on line-based Pareto fronts in three dimensions. In addition to a single-line Pareto front, we consider Pareto fronts constructed with two lines and three lines, where each line is a Pareto front with one constant objective. We show that even if the equispaced property holds for each single-line Pareto front, it does not always hold for Pareto fronts constructed by combining them. Specifically, whether this property holds or not depends on how the lines are combined.

Ke Shang, Hisao Ishibuchi, Weiyu Chen, Lukáš Adam
Adaptive Operator Selection Based on Dynamic Thompson Sampling for MOEA/D

In evolutionary computation, different reproduction operators have various search dynamics. To strike a good balance between exploration and exploitation, it is attractive to have an adaptive operator selection (AOS) mechanism that automatically chooses the most appropriate operator on the fly according to the current status. This paper proposes a new AOS mechanism for the multi-objective evolutionary algorithm based on decomposition (MOEA/D). More specifically, the AOS is formulated as a multi-armed bandit problem where dynamic Thompson sampling (DYTS) is applied to adapt the bandit learning model, originally proposed under the assumption of a fixed reward distribution, to a non-stationary setup. In particular, each arm of our bandit learning model represents a reproduction operator and is assigned a prior reward distribution. The parameters of these reward distributions are progressively updated according to the operator’s performance collected from the evolutionary process. When generating an offspring, an operator is chosen by sampling from those reward distributions according to DYTS. Experimental results fully demonstrate the effectiveness and competitiveness of our proposed AOS mechanism compared with four other state-of-the-art MOEA/D variants.

Lei Sun, Ke Li
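
A sketch of the bandit view described above: each operator is an arm with a Beta reward model, and the dynamic Thompson sampling update caps the effective sample size so the model can track non-stationary rewards. The capping rule and hyperparameters here are assumptions following the standard DYTS formulation, not the paper's exact algorithm.

import random

class DynamicThompsonAOS:
    """Adaptive operator selection via dynamic Thompson sampling (sketch).

    Each operator (bandit arm) keeps a Beta(alpha, beta) reward model.  Once
    alpha + beta exceeds the threshold C, updates are rescaled so that the
    effective sample size stays at C, letting the model forget old rewards.
    """
    def __init__(self, n_ops, C=100):
        self.alpha = [1.0] * n_ops
        self.beta = [1.0] * n_ops
        self.C = C

    def select(self):
        samples = [random.betavariate(a, b) for a, b in zip(self.alpha, self.beta)]
        return max(range(len(samples)), key=samples.__getitem__)

    def update(self, op, reward):              # reward in [0, 1]
        a, b, C = self.alpha[op], self.beta[op], self.C
        if a + b < C:                          # standard Bayesian update
            self.alpha[op], self.beta[op] = a + reward, b + (1 - reward)
        else:                                  # rescale to keep a + b == C
            scale = C / (C + 1)
            self.alpha[op] = (a + reward) * scale
            self.beta[op] = (b + (1 - reward)) * scale

# Usage inside an EA generation: pick an operator, generate offspring with it,
# then feed back a binary/normalised improvement signal as the reward.
aos = DynamicThompsonAOS(n_ops=4)
op = aos.select()
aos.update(op, reward=1.0)
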
A Study of Swarm Topologies and Their Influence on the Performance of Multi-Objective Particle Swarm Optimizers

It has been shown that swarm topologies influence the behavior of Particle Swarm Optimization (PSO). A large number of connections stimulates exploitation, while a low number of connections stimulates exploration. Furthermore, a topology with four links per particle is known to improve PSO’s performance. In spite of this, there are few studies on the influence of swarm topologies in Multi-Objective Particle Swarm Optimizers (MOPSOs). We analyze the influence of star, tree, lattice, ring and wheel topologies on the performance of the Speed-constrained Multi-objective Particle Swarm Optimizer (SMPSO) when adopting a variety of multi-objective problems, including the well-known ZDT, DTLZ and WFG test suites. Our results indicate that the selection of a proper topology does indeed improve the performance of SMPSO.

Diana Cristina Valencia-Rodríguez, Carlos A. Coello Coello
Visualising Evolution History in Multi- and Many-objective Optimisation

Evolutionary algorithms are widely used to solve optimisation problems. However, challenges of transparency arise both in visualising the processes of an optimiser as it operates on a problem and in understanding the problem features produced by many-objective problems, where comprehending four or more spatial dimensions is difficult. This work considers the visualisation of a population as an optimisation process executes. We have adapted an existing visualisation technique to multi- and many-objective problem data, enabling a user to visualise the EA processes and identify specific problem characteristics, thus providing a greater understanding of the problem landscape. This is particularly valuable if the problem landscape is unknown, contains unknown features or is a many-objective problem. We show that this framework is effective on a suite of multi- and many-objective benchmark test problems, optimising them with NSGA-II and NSGA-III.

Mathew J. Walter, David J. Walker, Matthew J. Craven
Improving Many-Objective Evolutionary Algorithms by Means of Edge-Rotated Cones

Given a point in m-dimensional objective space, any $\varepsilon$-ball around that point can be partitioned into the incomparable, the dominated and the dominating region. The ratio between the size of the dominated (and dominating) region and the incomparable region decreases proportionally to $1/2^{m-1}$, i.e., the volume of the Pareto-dominating orthant as compared to all other volumes. For this reason, it becomes increasingly unlikely that dominating points can be found by random, isotropic mutations. As a remedy to this stagnation of search in many-objective optimization, we suggest in this paper to enhance the Pareto dominance order by involving an obtuse convex dominance cone in the convergence phase of an evolutionary optimization algorithm. We propose edge-rotated cones as generalizations of Pareto dominance cones for which the opening angle can be controlled by a single parameter only. The approach is integrated in several state-of-the-art multi-objective evolutionary algorithms (MOEAs) and tested on benchmark problems with four, five, six and eight objectives. Computational experiments demonstrate the ability of these edge-rotated cones to improve the performance of MOEAs on many-objective optimization problems.

Yali Wang, André Deutz, Thomas Bäck, Michael Emmerich
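
The $1/2^{m-1}$ scaling quoted above follows from a simple volume argument: in an $\varepsilon$-ball around a point of an $m$-dimensional objective space, the dominating orthant occupies a fraction $2^{-m}$ of the ball, and likewise the dominated orthant, so

$$\frac{\mathrm{Vol}(\text{dominating}) + \mathrm{Vol}(\text{dominated})}{\mathrm{Vol}(\text{incomparable})} \;=\; \frac{2 \cdot 2^{-m}}{1 - 2 \cdot 2^{-m}} \;=\; \frac{1}{2^{m-1} - 1} \;\approx\; \frac{1}{2^{m-1}},$$

which vanishes quickly as the number of objectives m grows, making improvements by isotropic mutation increasingly rare.
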

Real-World Applications

Frontmatter
Human-Like Summaries from Heterogeneous and Time-Windowed Software Development Artefacts

Automatic text summarisation has drawn considerable interest in the area of software engineering. It is challenging to summarise the activities related to a software project, (1) because of the volume and heterogeneity of involved software artefacts, and (2) because it is unclear what information a developer seeks in such a multi-document summary. We present the first framework for summarising multi-document software artefacts containing heterogeneous data within a given time frame. To produce human-like summaries, we employ a range of iterative heuristics to minimise the cosine-similarity between texts and high-dimensional feature vectors. A first study shows that users find the automatically generated summaries the most useful when they are generated using word similarity and based on the eight most relevant software artefacts.

Mahfouth Alghamdi, Christoph Treude, Markus Wagner
A Search for Additional Structure: The Case of Cryptographic S-boxes

We investigate whether it is possible to evolve cryptographically strong S-boxes that have additional constraints on their structure. We investigate two scenarios: where S-boxes additionally have a specific sum of values in rows, columns, or diagonals, and the scenario where we check that the difference between the Hamming weights of inputs and outputs is minimal. The first case represents an interesting benchmark problem, while the second one has practical ramifications as such S-boxes could offer better resilience against side-channel attacks. We explore three solution representations by using the permutation, integer, and cellular automata-based encoding. Our results show that it is possible to find S-boxes with excellent cryptographic properties (even optimal ones) and reach the required sums when representing the S-box as a square matrix. On the other hand, for the most promising S-box representation based on trees and cellular automata rules, we did not succeed in finding S-boxes with small differences in the Hamming weights between the inputs and outputs, which opens an interesting future research direction. Our results for this scenario and different encodings inspired a mathematical proof that the values reached by evolutionary algorithms are the best possible ones.

Claude Carlet, Marko Djurasevic, Domagoj Jakobovic, Stjepan Picek
Evolutionary Multi-objective Design of SARS-CoV-2 Protease Inhibitor Candidates

Computational drug design based on artificial intelligence is an emerging research area. At the time of writing this paper, the world suffers from an outbreak of the coronavirus SARS-CoV-2. A promising way to stop the virus replication is via protease inhibition. We propose an evolutionary multi-objective algorithm (EMOA) to design potential protease inhibitors for SARS-CoV-2’s main protease. Based on the SELFIES representation, the EMOA maximizes the binding of candidate ligands to the protein using the docking tool QuickVina 2, while at the same time taking into account further objectives like drug-likeness or the fulfillment of filter constraints. The experimental part analyzes the evolutionary process and discusses the inhibitor candidates.

Tim Cofala, Lars Elend, Philip Mirbach, Jonas Prellberg, Thomas Teusch, Oliver Kramer
Generic Relative Relations in Hierarchical Gene Expression Data Classification

Relative Expression Analysis (RXA) plays an important role in biomarker discovery and disease prediction from gene expression profiles. It deliberately ignores raw data values and investigates only the relative ordering relationships between a small group of genes. The classifiers built on that concept are therefore robust to small data perturbations and normalization procedures, but above all, they are easy to interpret and analyze. In this paper, we propose a novel globally induced decision tree in which node splits are based on the RXA methodology. We have extended a simple ordering with a more generic concept that also explores fractional relative relations between the genes. To cope with the resulting computational complexity, we have replaced the typical brute-force approach with an evolutionary algorithm. As this was not enough, we boosted our solution with OpenMP parallelization, local search components calculated on the GPU, and an embedded ranking of genes to improve the evolutionary convergence. This way we managed to explore a much larger solution space in a reasonable time and to search for more complex but still comprehensible gene-gene interactions. An empirical investigation carried out on 8 cancer-related datasets shows the potential of the proposed algorithm not only in the context of accuracy improvement but also in finding biologically meaningful patterns.

Marcin Czajkowski, Krzysztof Jurczuk, Marek Kretowski
A Variable Neighborhood Search for the Job Sequencing with One Common and Multiple Secondary Resources Problem

In this work we consider a scheduling problem where a set of non-preemptive jobs needs to be scheduled such that the makespan is minimized. Each job requires two resources: (1) a common resource, shared by all jobs, and (2) a secondary resource, shared with only a subset of the other jobs. The secondary resource is required during the job’s entire processing time whereas the common resource is only required during a part of a job’s execution. The problem models, for instance, the scheduling of patients during one day in a particle therapy facility for cancer treatment. We heuristically tackle the problem by a general variable neighborhood search (GVNS) based on move and exchange neighborhoods and an efficient evaluation scheme to scan the neighborhoods of the current incumbent solution. An experimental evaluation on two benchmark instance sets, including instances with up to 2000 jobs, shows the effectiveness of the GVNS. In particular for larger instances our GVNS outperforms an anytime A$^*$ algorithm that was so far the leading method in heuristic terms as well as a constraint programming model solved by the ILOG CP optimizer.

Thomas Kaufmann, Matthias Horn, Günther R. Raidl
Evolutionary Graph-Based V+E Optimization for Protection Against Epidemics

Protection against spreading threats in networks gives rise to a variety of interesting optimization problems. Among others, vertex protection problems such as the Firefighter Problem and the vaccination optimization problem can be tackled. Interestingly, in some cases a networked system can be made more resilient to threats by changing its connectivity, which motivates the study of another type of optimization problem focused on adapting graph connectivity. In this paper the above-mentioned approaches are combined, that is, both vertex and edge protection are applied in order to stop the threat from spreading. Solutions to the proposed problem are evaluated using different cost functions for protected vertices and edges, motivated by real-life observations regarding the costs of epidemic control. Instead of making decisions for each of the vertices and edges, a decision model (based on rules or a neural network) is used, with parameters optimized using an evolutionary algorithm. In the experiments the model using rules was found to perform better than the one based on a neural network.

Krzysztof Michalak
Human-Derived Heuristic Enhancement of an Evolutionary Algorithm for the 2D Bin-Packing Problem

The 2D Bin-Packing Problem (2DBPP) is an NP-Hard combinatorial optimisation problem with many real-world analogues. Fully deterministic methods such as the well-known Best Fit and First Fit heuristics, stochastic methods such as Evolutionary Algorithms (EAs), and hybrid EAs that combine the deterministic and stochastic approaches have all been applied to the problem. Combining derived human expertise with a hybrid EA offers another potential approach. In this work, the moves of humans playing a gamified version of the 2DBPP were recorded and four different Human-Derived Heuristics (HDHs) were created by learning the underlying heuristics employed by those players. Each HDH used a decision tree in place of the mutation operator in the EA. To test their effectiveness, these were compared against hybrid EAs utilising Best Fit or First Fit heuristics as well as a standard EA using a random swap mutation modified with a Next Fit heuristic if the mutation was infeasible. The HDHs were shown to outperform the standard EA and were faster to converge than – but ultimately outperformed by – the First Fit and Best Fit heuristics. This shows that humans can create competitive heuristics through gameplay and helps to understand the role that heuristics can play in stochastic search.

Nicholas Ross, Ed Keedwell, Dragan Savic
Towards Novel Meta-heuristic Algorithms for Dynamic Capacitated Arc Routing Problems

The Capacitated Arc Routing Problem (CARP) is an abstraction of typical real-world applications, like waste collection, winter gritting and mail delivery, that allows the development of efficient optimization algorithms. Most research work focuses on the static CARP, where all information in the problem remains unchanged over time. However, in the real world, dynamic changes may happen while the vehicles are in service, requiring routes to be rescheduled. In this paper, we mainly focus on this kind of Dynamic CARP (DCARP). Some meta-heuristics solve (D)CARP using individuals represented as sequences of tasks to be served. The split of such a sequence into sub-sequences to be served by different vehicles needs to be decided in order to generate an executable solution, which is necessary for calculating the individual’s fitness. However, the existing split schemes for static CARP and DCARP are not capable of obtaining high-quality feasible solutions for DCARP. Therefore, we propose two different split schemes in this paper – an optimal and a greedy split scheme. The optimal split scheme, assisted by the A-star algorithm, can obtain the best vehicle routes from an ordered list. The greedy split scheme is not guaranteed to obtain optimal splits, but it is much more efficient. More importantly, it keeps the rank information between different individuals. Our experiments show that the greedy split scheme has good relative accuracy with respect to the optimal split scheme and that the two proposed split schemes are better than the existing DCARP split scheme in terms of the obtained solutions’ quality.

Hao Tong, Leandro L. Minku, Stefan Menzel, Bernhard Sendhoff, Xin Yao
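
For intuition, a generic capacity-based greedy split of an ordered task list into vehicle routes looks as follows; this is only a simplified illustration, not the paper's greedy or optimal DCARP split schemes, which also account for travel costs and the vehicles' current states.

def greedy_split(tasks, demands, capacity):
    """Split an ordered list of tasks into consecutive vehicle routes,
    opening a new route whenever the next task would exceed the remaining
    capacity (a generic capacity-based split, not the paper's scheme)."""
    routes, load = [[]], 0
    for t in tasks:
        if load + demands[t] > capacity and routes[-1]:
            routes.append([])                 # start a new vehicle route
            load = 0
        routes[-1].append(t)
        load += demands[t]
    return routes

# Example: an individual is an ordered task list; the split turns it into an
# executable multi-route solution whose cost defines the individual's fitness.
demands = {"t1": 4, "t2": 3, "t3": 5, "t4": 2, "t5": 6}
print(greedy_split(["t3", "t1", "t5", "t2", "t4"], demands, capacity=9))
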
Robust Evolutionary Bi-objective Optimization for Prostate Cancer Treatment with High-Dose-Rate Brachytherapy

We address the real-world problem of automating the design of high-quality prostate cancer treatment plans in case of high-dose-rate brachytherapy, a form of internal radiotherapy. For this, recently a bi-objective real-valued problem formulation was introduced. With a GPU parallelization of the Multi-Objective Real-Valued Gene-pool Optimal Mixing Evolutionary Algorithm (MO-RV-GOMEA), good treatment plans were found in clinically acceptable running times. However, optimizing a treatment plan and delivering it to the patient in practice is a two-stage decision process and involves a number of uncertainties. Firstly, there is uncertainty in the identified organ boundaries due to the limited resolution of the medical images. Secondly, the treatment involves placing catheters inside the patient, which always end up (slightly) different from what was optimized. An important factor is therefore the robustness of the final treatment plan to these uncertainties. In this work, we show how we can extend the evolutionary optimization approach to find robust plans using multiple scenarios without linearly increasing the amount of required computation effort, as well as how to deal with these uncertainties efficiently when taking into account the sequential decision-making moments. The performance is tested on three real-world patient cases. We find that MO-RV-GOMEA is equally well capable of solving the more complex robust problem formulation, resulting in a more realistic reflection of the treatment plan qualities.

Marjolein C. van der Meer, Arjan Bel, Yury Niatsetski, Tanja Alderliesten, Bradley R. Pieters, Peter A. N. Bosman

Open Access

A Hybrid Evolutionary Algorithm for Reliable Facility Location Problem

The reliable facility location problem (RFLP) is an important research topic in operational research and plays a vital role in the decision-making and management of modern supply chains and logistics. By solving the RFLP, the decision-maker can obtain reliable location decisions under the risk of facilities’ disruptions or failures. In this paper, we propose a novel model for the RFLP. Instead of assuming that a fixed number of facilities is allocated to each customer, as in existing works, we set the number of allocated facilities as an independent variable in our proposed model, which makes our model closer to real-life scenarios but more difficult to solve with traditional methods. To handle this, we propose EAMLS, a hybrid evolutionary algorithm, which combines a memorable local search (MLS) method and an evolutionary algorithm (EA). Additionally, a novel metric called l3-value is proposed to assist the analysis of the algorithm’s convergence speed and to examine the evolution process. The experimental results show the effectiveness and superior performance of EAMLS, compared to a CPLEX solver and a Genetic Algorithm (GA), on large-scale problems.

Han Zhang, Jialin Liu, Xin Yao

Reinforcement Learning

Frontmatter
Optimality-Based Analysis of XCSF Compaction in Discrete Reinforcement Learning

Learning classifier systems (LCSs) are population-based predictive systems that were originally envisioned as agents to act in reinforcement learning (RL) environments. These systems can suffer from population bloat and so are amenable to compaction techniques that try to strike a balance between population size and performance. A well-studied LCS architecture is XCSF, which in the RL setting acts as a Q-function approximator. We apply XCSF to a deterministic and stochastic variant of the FrozenLake8x8 environment from OpenAI Gym, with its performance compared in terms of function approximation error and policy accuracy to the optimal Q-functions and policies produced by solving the environments via dynamic programming. We then introduce a novel compaction algorithm (Greedy Niche Mass Compaction—GNMC) and study its operation on XCSF’s trained populations. Results show that given a suitable parametrisation, GNMC preserves or even slightly improves function approximation error while yielding a significant reduction in population size. Reasonable preservation of policy accuracy also occurs, and we link this metric to the commonly used steps-to-goal metric in maze-like environments, illustrating how the metrics are complementary rather than competitive.

Jordan T. Bishop, Marcus Gallagher
Hybridizing the 1/5-th Success Rule with Q-Learning for Controlling the Mutation Rate of an Evolutionary Algorithm

It is well known that evolutionary algorithms (EAs) achieve peak performance only when their parameters are suitably tuned to the given problem. Moreover, it is known that the best parameter values can change during the optimization process. Parameter control mechanisms are techniques developed to identify and to track these values. Recently, a series of rigorous theoretical works confirmed the superiority of several parameter control techniques over EAs with the best possible static parameters. Among these results are examples for controlling the mutation rate of the $(1+\lambda)$ EA when optimizing the OneMax problem. However, it was shown in [Rodionova et al., GECCO’19] that the quality of these techniques strongly depends on the offspring population size $\lambda$. We introduce in this work a new hybrid parameter control technique, which combines the well-known one-fifth success rule with Q-learning. We demonstrate that our HQL mechanism achieves equal or superior performance to all techniques tested in [Rodionova et al., GECCO’19] and this – in contrast to previous parameter control methods – simultaneously for all offspring population sizes $\lambda$. We also show that the promising performance of HQL is not restricted to OneMax, but extends to several other benchmark problems.

Arina Buzdalova, Carola Doerr, Anna Rodionova
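
For reference, a sketch of the classic one-fifth success rule that HQL builds on, applied to the mutation rate of a (1+λ) EA on OneMax; parameter values are illustrative and this is not the HQL mechanism itself.

import random

def one_fifth_ea(n, lam=8, budget=50_000, F=1.5, seed=0):
    """(1+lambda) EA on OneMax with the classic 1/5-th success rule.

    The mutation strength r (mutation rate r/n) is multiplied by F after a
    generation that produced a strict improvement and divided by F**(1/4)
    otherwise, so the rate stays put when roughly one in five generations
    succeeds.
    """
    rng = random.Random(seed)
    x = [rng.randint(0, 1) for _ in range(n)]
    fx, r, evals = sum(x), 1.0, 0
    while fx < n and evals < budget:
        best_y, best_fy = None, -1
        for _ in range(lam):
            y = [bit ^ (rng.random() < r / n) for bit in x]   # standard bit mutation
            fy = sum(y)
            evals += 1
            if fy > best_fy:
                best_y, best_fy = y, fy
        if best_fy > fx:                       # success: increase the rate
            x, fx, r = best_y, best_fy, min(r * F, n / 2)
        else:                                  # failure: decrease the rate
            r = max(r / F ** 0.25, 0.5)
    return evals, fx

print(one_fifth_ea(100))
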
Fitness Landscape Features and Reward Shaping in Reinforcement Learning Policy Spaces

Reinforcement learning (RL) algorithms have received a lot of attention in recent years. However, relatively little work has been dedicated to analysing RL problems, which are thought to contain unique challenges, such as sparsity of the reward signal. Reward shaping is one approach that may help alleviate the sparse reward problem. In this paper we use fitness landscape features to study how reward shaping affects the underlying optimisation landscape of RL problems. Our results indicate that features such as deception, ruggedness, searchability, and symmetry can all be greatly affected by reward shaping, while neutrality, dispersion, and the number of local optima remain relatively invariant. This may provide some guidance as to the potential effectiveness of reward shaping for different algorithms, depending on what features they are sensitive to. Additionally, all of the reward functions we studied produced policy landscapes that contain a single local optimum and very high neutrality. This suggests that algorithms that explore spaces globally, rather than locally, may perform well on RL problems, and may help explain the success of evolutionary methods on RL problems. Furthermore, we suspect that the high neutrality of these landscapes is connected to the issue of reward sparsity in RL.

Nathaniel du Preez-Wilkinson, Marcus Gallagher
ClipUp: A Simple and Powerful Optimizer for Distribution-Based Policy Evolution

Distribution-based search algorithms are a powerful approach for evolutionary reinforcement learning of neural network controllers. In these algorithms, gradients of the reward function with respect to the policy parameters are estimated using a population of solutions drawn from a search distribution, and then used for policy optimization with stochastic gradient ascent. A common choice is to use the Adam optimization algorithm for obtaining an adaptive behavior during gradient ascent, due to its success in a variety of supervised learning settings. As an alternative to Adam, we propose to enhance classical momentum-based gradient ascent with two simple-yet-effective techniques: gradient normalization and update clipping. We argue that the resulting optimizer called ClipUp (short for clipped updates) is a better choice for distribution-based policy evolution because its working principles are simple and easy to understand and its hyperparameters can be tuned more intuitively in practice. Moreover, it avoids the need to re-tune hyperparameters if the reward scale changes. Experiments show that ClipUp is competitive with Adam despite its simplicity and is effective at some of the most challenging continuous control benchmarks, including the Humanoid control task based on the Bullet physics simulator.

Nihat Engin Toklu, Paweł Liskowski, Rupesh Kumar Srivastava
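
A sketch of the two techniques named above (gradient normalization and update clipping) wrapped around classical momentum; hyperparameter names and defaults are illustrative assumptions, not taken from the paper's reference implementation.

import numpy as np

class ClipUpSketch:
    """ClipUp-style optimizer: normalized gradient + momentum + update clipping."""
    def __init__(self, step_size=0.15, momentum=0.9, max_speed=0.3):
        self.step_size, self.momentum, self.max_speed = step_size, momentum, max_speed
        self.velocity = None

    def step(self, params, grad):
        g = grad / (np.linalg.norm(grad) + 1e-12)          # gradient normalization
        if self.velocity is None:
            self.velocity = np.zeros_like(params)
        self.velocity = self.momentum * self.velocity + self.step_size * g
        speed = np.linalg.norm(self.velocity)
        if speed > self.max_speed:                         # update clipping
            self.velocity *= self.max_speed / speed
        return params + self.velocity                      # gradient ascent step

# Usage with an estimated reward gradient from a search distribution:
opt = ClipUpSketch()
theta = np.zeros(10)
grad_estimate = np.random.default_rng(0).normal(size=10)
theta = opt.step(theta, grad_estimate)
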
Warm-Start AlphaZero Self-play Search Enhancements

Recently, AlphaZero has achieved landmark results in deep reinforcement learning, by providing a single self-play architecture that learned three different games at superhuman level. AlphaZero is a large and complicated system with many parameters, and success requires much compute power and fine-tuning. Reproducing results in other games is a challenge, and many researchers are looking for ways to improve results while reducing computational demands. AlphaZero’s design is purely based on self-play and makes no use of labeled expert data or domain-specific enhancements; it is designed to learn from scratch. We propose a novel approach to deal with this cold-start problem by employing simple search enhancements at the beginning phase of self-play training, namely Rollout, Rapid Action Value Estimate (RAVE) and dynamically weighted combinations of these with the neural network, and Rolling Horizon Evolutionary Algorithms (RHEA). Our experiments indicate that most of these enhancements improve the performance of their baseline player in three different (small) board games, with the RAVE-based variants in particular playing strongly.

Hui Wang, Mike Preuss, Aske Plaat

Theoretical Aspects of Nature-Inspired Optimization

Frontmatter
Runtime Analysis of a Heavy-Tailed Genetic Algorithm on Jump Functions

It was recently observed that the $(1+(\lambda,\lambda))$ genetic algorithm can comparably easily escape the local optimum of the jump functions benchmark. Consequently, this algorithm can optimize the jump function with jump size k in an expected runtime of only $n^{(k+1)/2} k^{-k/2} e^{O(k)}$ fitness evaluations (Antipov, Doerr, Karavaev (GECCO 2020)). This performance, however, was obtained with a non-standard parameter setting depending on the jump size k. To overcome this difficulty, we propose to choose two parameters of the $(1+(\lambda,\lambda))$ genetic algorithm randomly from a power-law distribution. Via a mathematical runtime analysis, we show that this algorithm with natural instance-independent choices of the power-law parameters on all jump functions with jump size at most n/4 has a performance close to what the best instance-specific parameters in the previous work obtained. This price for instance-independence can be made as small as an $O(n \log n)$ factor. Given the difficulty of the jump problem and the runtime losses from using mildly suboptimal fixed parameters (also discussed in this work), this appears to be a fair price.

Denis Antipov, Benjamin Doerr
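
For context, a heavy-tailed (power-law) parameter choice of the kind described above typically draws an integer from a truncated power-law distribution; a common form (the paper's exact parameterization may differ) is

$$\Pr[X = i] \;=\; \frac{i^{-\beta}}{\sum_{j=1}^{u} j^{-\beta}}, \qquad i \in \{1, \dots, u\},\ \beta > 1,$$

so that small values are most likely but much larger values still occur with only polynomially (rather than exponentially) small probability.
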
First Steps Towards a Runtime Analysis When Starting with a Good Solution

The mathematical runtime analysis of evolutionary algorithms traditionally regards the time an algorithm needs to find a solution of a certain quality when initialized with a random population. In practical applications it may be possible to guess solutions that are better than random ones. We start a mathematical runtime analysis for such situations. We observe that different algorithms profit to a very different degree from a better initialization. We also show that the optimal parameterization of the algorithm can depend strongly on the quality of the initial solutions. To overcome this difficulty, self-adjusting and randomized heavy-tailed parameter choices can be profitable. Finally, we observe a larger gap between the performance of the best evolutionary algorithm we found and the corresponding black-box complexity. This could suggest that evolutionary algorithms better exploiting good initial solutions are still to be found. These first findings stem from analyzing the performance of the $(1+1)$ evolutionary algorithm and the static, self-adjusting, and heavy-tailed $(1+(\lambda,\lambda))$ GA on the OneMax benchmark, but we are optimistic that the question of how to profit from good initial solutions is interesting beyond these first examples.

Denis Antipov, Maxim Buzdalov, Benjamin Doerr
Optimal Mutation Rates for the (1+λ) EA on OneMax

The OneMax problem, alternatively known as the Hamming distance problem, is often referred to as the “drosophila of evolutionary computation (EC)” because of its high relevance in theoretical and empirical analyses of EC approaches. It is therefore surprising that even for the simplest of all mutation-based algorithms, Randomized Local Search and the (1+1) EA, the optimal mutation rates were determined only very recently, in a GECCO 2019 poster. In this work, we extend the analysis of optimal mutation rates to two variants of the $(1+\lambda)$ EA and to the $(1+\lambda)$ RLS. To do this, we use dynamic programming and, for the $(1+\lambda)$ EA, numeric optimization, both requiring $\Theta(n^3)$ time for problem dimension n. With this in hand, we compute for all population sizes $\lambda \in \{2^i \mid 0 \le i \le 18\}$ and for problem dimensions $n \in \{1000, 2000, 5000\}$ which mutation rates minimize the expected running time and which ones maximize the expected progress. Our results do not only provide a lower bound against which we can measure common evolutionary approaches, but we also obtain insight into the structure of these optimal parameter choices. For example, we show that, for large population sizes, the best number of bits to flip is not monotone in the distance to the optimum. We also observe that the expected remaining running times are not necessarily unimodal for the $(1+\lambda)$ EA$_{0 \rightarrow 1}$ with shifted mutation.

Maxim Buzdalov, Carola Doerr
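
To illustrate the dynamic-programming idea on a simpler relative, the sketch below computes, for RLS flipping exactly k bits on OneMax, the expected remaining runtime and the best k for every distance to the optimum; the paper analyses the (1+λ) EA and (1+λ) RLS, so treat this only as a structural illustration.

from math import comb

def optimal_rls_flips(n):
    """For each distance d to the optimum of OneMax, compute by dynamic
    programming the number of flipped bits k that minimises the expected
    remaining runtime of an elitist k-bit-flip RLS, and that runtime."""
    E = [0.0] * (n + 1)        # E[d] = expected remaining evaluations at distance d
    best_k = [0] * (n + 1)
    for d in range(1, n + 1):
        best = None
        for k in range(1, n + 1):
            total = comb(n, k)
            p_improve, acc = 0.0, 0.0
            # flip j of the d wrong bits and k - j of the n - d correct ones
            for j in range(max(1, k - (n - d)), min(k, d) + 1):
                if 2 * j <= k:
                    continue                   # not a strict improvement
                p = comb(d, j) * comb(n - d, k - j) / total
                p_improve += p
                acc += p * E[d - (2 * j - k)]  # new distance after acceptance
            if p_improve == 0.0:
                continue
            exp_time = (1.0 + acc) / p_improve # renewal argument for elitist search
            if best is None or exp_time < best[0]:
                best = (exp_time, k)
        E[d], best_k[d] = best
    return E, best_k

E, best_k = optimal_rls_flips(100)
print("expected runtime from distance n/2:", E[50])
print("optimal flips at distances 50, 10, 1:", best_k[50], best_k[10], best_k[1])
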
Maximizing Submodular or Monotone Functions Under Partition Matroid Constraints by Multi-objective Evolutionary Algorithms

Many important problems can be regarded as maximizing submodular functions under some constraints. A simple multi-objective evolutionary algorithm called GSEMO has been shown to achieve good approximations for submodular functions efficiently. While there have been many studies on the subject, most existing runtime analyses for GSEMO assume a single cardinality constraint. In this work, we extend the theoretical results to partition matroid constraints, which generalize cardinality constraints, and show that GSEMO can generally guarantee good approximation performance within polynomial expected runtime. Furthermore, we conduct an experimental comparison against a baseline GREEDY algorithm in maximizing undirected graph cuts on random graphs, under various partition matroid constraints. The results show that GSEMO tends to outperform GREEDY within quadratic runtime.

Anh Viet Do, Frank Neumann
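
For readers unfamiliar with GSEMO, here is a generic sketch (our illustration; the objective encoding and constraint handling used in the paper may differ): an archive of mutually non-dominated bit strings is maintained, and offspring are created by standard bit mutation of a uniformly chosen archive member.

```python
import random

def weakly_dominates(a, b):
    """a weakly dominates b (maximization): a is at least as good everywhere."""
    return all(x >= y for x, y in zip(a, b))

def gsemo(objectives, n, iterations):
    """Generic GSEMO sketch: keep an archive of mutually non-dominated bit
    strings, pick a parent uniformly at random, apply standard bit mutation,
    and insert the offspring unless some archive member is at least as good
    on every objective.  `objectives(x)` returns a tuple to be maximized,
    e.g. a set-function value paired with the negated constraint usage."""
    x = [0] * n                        # common GSEMO start: the all-zeros string
    pop = [(tuple(objectives(x)), x)]
    for _ in range(iterations):
        _, parent = random.choice(pop)
        child = [b ^ 1 if random.random() < 1.0 / n else b for b in parent]
        fc = tuple(objectives(child))
        if any(weakly_dominates(fp, fc) for fp, _ in pop):
            continue                   # offspring adds nothing new
        pop = [(fp, p) for fp, p in pop if not weakly_dominates(fc, fp)]
        pop.append((fc, child))
    return pop

# Toy usage: maximize the number of ones using at most k bits,
# encoded as the bi-objective (value if feasible else -1, -cost).
k = 3
objs = lambda x: (sum(x) if sum(x) <= k else -1, -sum(x))
print(len(gsemo(objs, n=20, iterations=5000)))
```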
Lower Bounds for Non-elitist Evolutionary Algorithms via Negative Multiplicative Drift

A decent number of lower bounds for non-elitist population-based evolutionary algorithms have been shown by now. Most of them are technically demanding due to the (hard to avoid) use of negative drift theorems – general results which translate an expected progress away from the target into a high hitting time. We propose a simple negative drift theorem for multiplicative drift scenarios and show that it can simplify existing analyses. We discuss in more detail Lehre’s (PPSN 2010) negative drift in populations method, one of the most general tools to prove lower bounds on the runtime of non-elitist mutation-based evolutionary algorithms for discrete search spaces. Together with other arguments, we obtain an alternative and simpler proof, which also strengthens and simplifies this method. In particular, now only three of the five technical conditions of the previous result have to be verified. The lower bounds we obtain are explicit instead of only asymptotic. This allows us to compute concrete lower bounds for concrete algorithms, but also enables us to show that super-polynomial runtimes appear already when the reproduction rate is only a $$(1 - \omega (n^{-1/2}))$$ factor below the threshold. As one particular result, we apply this method and a novel domination argument to show an exponential lower bound for the runtime of the mutation-only simple GA on OneMax for arbitrary population size.

Benjamin Doerr
Exponential Upper Bounds for the Runtime of Randomized Search Heuristics

We argue that proven exponential upper bounds on runtimes, an established area in classic algorithms, are also interesting in evolutionary computation, and we prove several such results. We show that any of the algorithms randomized local search, Metropolis algorithm, simulated annealing, and $$(1+1)$$ evolutionary algorithm can optimize any pseudo-Boolean weakly monotonic function under a large set of noise assumptions in a runtime that is at most exponential in the problem dimension n. This drastically extends a previous such result, which was limited to the $$(1+1)$$ EA, the LeadingOnes function, and one-bit or bit-wise prior noise with noise probability at most 1/2, and at the same time simplifies its proof. With the same general argument, among others, we also derive a sub-exponential upper bound for the runtime of the $$(1,\lambda )$$ evolutionary algorithm on the OneMax problem when the offspring population size $$\lambda $$ is logarithmic, but below the efficiency threshold.

Benjamin Doerr
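
As an illustration of the kind of prior-noise model mentioned above, the following sketch (ours, under the assumption of one-bit prior noise with probability p and fresh re-evaluation of parent and offspring in every iteration) shows a (1+1) EA whose selection is based on noisy fitness values.

```python
import random

def noisy_value(f, x, p):
    """One-bit prior noise: with probability p a uniformly chosen bit is
    flipped before the (otherwise exact) evaluation; x itself is unchanged."""
    y = list(x)
    if random.random() < p:
        y[random.randrange(len(y))] ^= 1
    return f(y)

def one_plus_one_ea_noisy(f, n, p, budget):
    """(1+1) EA where parent and offspring both receive fresh noisy
    evaluations in every iteration (one possible modelling choice)."""
    x = [random.randint(0, 1) for _ in range(n)]
    for _ in range(budget):
        y = [b ^ 1 if random.random() < 1.0 / n else b for b in x]
        if noisy_value(f, y, p) >= noisy_value(f, x, p):
            x = y
    return x

# Example: OneMax (f = sum) under one-bit prior noise with probability 0.3.
print(sum(one_plus_one_ea_noisy(sum, 50, 0.3, 20000)))
```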
Analysis on the Efficiency of Multifactorial Evolutionary Algorithms

Many experimental studies have demonstrated the superiority of multifactorial evolutionary algorithms (MFEAs) over traditional methods that solve each task independently. In this paper, we investigate this topic from the perspective of theoretical analysis. We present a runtime analysis of a (4+2) MFEA on several benchmark pseudo-Boolean functions, which include problems with similar tasks and problems with dissimilar tasks. Our analysis shows that, for the group of problems with similar tasks, properly setting the parameter rmp (i.e., the random mating probability) improves the upper bound on the expected runtime of the (4+2) MFEA on the harder task to match that on the easier one. For the group of problems with dissimilar tasks, the upper bound on the expected runtime of the (4+2) MFEA on each task is the same as that of solving the tasks independently. This study theoretically explains why some existing MFEAs perform better than traditional methods in experimental studies and provides insights into the parameter setting of MFEAs.

Zhengxin Huang, Zefeng Chen, Yuren Zhou
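
A hedged sketch of the rmp-controlled assortative mating step of a generic MFEA follows (our simplification; the concrete (4+2) MFEA analysed in the paper may differ in the crossover operator and the number of offspring): crossover between parents assigned to different tasks happens only with probability rmp, otherwise each parent is mutated for its own task.

```python
import random

def mate(parent_a, parent_b, rmp):
    """Assortative mating step of a generic MFEA sketch.  Each parent
    carries a 'task' field (its skill factor).  Parents of different tasks
    recombine only with probability rmp; otherwise each produces a mutant
    for its own task."""
    same_task = parent_a["task"] == parent_b["task"]
    if same_task or random.random() < rmp:
        # uniform crossover; the child inherits the task of one parent at random
        bits = [random.choice(pair) for pair in zip(parent_a["bits"], parent_b["bits"])]
        task = random.choice([parent_a["task"], parent_b["task"]])
        return [{"bits": bits, "task": task}]
    children = []
    for p in (parent_a, parent_b):
        n = len(p["bits"])
        bits = [b ^ 1 if random.random() < 1.0 / n else b for b in p["bits"]]
        children.append({"bits": bits, "task": p["task"]})
    return children
```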
Improved Fixed-Budget Results via Drift Analysis

Fixed-budget theory is concerned with computing or bounding the fitness value achievable by randomized search heuristics within a given budget of fitness function evaluations. Despite recent progress in fixed-budget theory, there is a lack of general tools to derive such results. We transfer drift theory, the key tool for deriving expected optimization times, to the fixed-budget perspective. A first and easy-to-use statement concerned with iterating drift in so-called greed-admitting scenarios immediately translates into bounds on the expected function value. Afterwards, we consider a more general tool based on the well-known variable drift theorem. Applications of this technique to the LeadingOnes benchmark function yield statements that are more precise than the previous state of the art.

Timo Kötzing, Carsten Witt
On Averaging the Best Samples in Evolutionary Computation

Choosing the right selection rate is a long-standing issue in evolutionary computation. In the continuous unconstrained case, we prove mathematically that a single parent $$\mu =1$$ leads to a sub-optimal simple regret in the case of the sphere function. We provide a theoretically-based selection rate $$\mu /\lambda $$ that leads to better progress rates. With our choice of selection rate, we get a provable regret of order $$O(\lambda ^{-1})$$, which has to be compared with $$O(\lambda ^{-2/d})$$ in the case where $$\mu =1$$. We complete our study with experiments confirming our theoretical claims.

Laurent Meunier, Yann Chevaleyre, Jeremy Rapin, Clément W. Royer, Olivier Teytaud
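
One iteration of the scheme discussed above can be sketched as follows (our simplified illustration with a fixed step size; the paper derives a specific selection rate $$\mu /\lambda $$, which is not reproduced here): sample $$\lambda $$ Gaussian points around the current iterate, keep the $$\mu $$ best on the sphere function, and average them.

```python
import numpy as np

def sphere(x):
    """Sphere function: squared Euclidean norm."""
    return float(np.dot(x, x))

def average_best_step(parent, sigma, lam, mu):
    """Draw lam Gaussian samples around the current point, keep the mu best
    on the sphere function, and recombine them by plain averaging."""
    samples = parent + sigma * np.random.randn(lam, parent.size)
    values = np.array([sphere(s) for s in samples])
    best = samples[np.argsort(values)[:mu]]
    return best.mean(axis=0)

# Compare mu = 1 (single best sample) with averaging a larger fraction of the best.
x = np.random.randn(10)
for _ in range(200):
    x = average_best_step(x, sigma=0.05, lam=64, mu=16)
print(sphere(x))
```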
Filter Sort Is $$\varOmega (N^3)$$ in the Worst Case

Non-dominated sorting is a crucial operation used in many popular evolutionary multiobjective algorithms. The problem of non-dominated sorting, although solvable in polynomial time, is surprisingly difficult, and no algorithm is yet known which solves any instance on N points and M objectives in time asymptotically smaller than $$MN^2$$. For this reason, many algorithm designers concentrate on reducing the leading constant and on (implicitly) tailoring their algorithms to inputs typical of evolutionary computation. While doing that, they sometimes forget to ensure that the worst-case running time of their algorithm is still $$O(MN^2)$$. This is undesirable, especially if the inputs which make the algorithm run too slowly can occur spontaneously. However, even if a counterexample is hard to find, the fact that it exists is still a weak point, as it can be exploited and lead to denial of service and other kinds of misbehavior. In this paper we prove that a recent algorithm for non-dominated sorting, called Filter Sort, has a worst-case complexity of $$\varOmega (N^3)$$. In particular, we present a scenario which requires Filter Sort to perform $$\varTheta (N^3)$$ dominance comparisons, where each comparison, however, needs only O(1) elementary operations. Our scenario contains $$\varTheta (N)$$ non-domination layers, which is a necessary, but by no means a sufficient, condition for being difficult for Filter Sort.

Sumit Mishra, Maxim Buzdalov
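
For context, the following is a baseline non-dominated sorting routine in the classical $$O(MN^2)$$ style (our sketch, in the spirit of the fast non-dominated sort of NSGA-II); it is shown only to illustrate the operation whose worst case is analysed above and is not Filter Sort itself.

```python
def non_dominated_sort(points):
    """Baseline non-dominated sorting for minimization: count, for each
    point, how many points dominate it and which points it dominates, then
    peel off non-domination layers one by one."""
    n = len(points)
    dominates = lambda a, b: all(x <= y for x, y in zip(a, b)) and a != b
    dominated_by = [[] for _ in range(n)]   # indices each point dominates
    counts = [0] * n                        # how many points dominate it
    for i in range(n):
        for j in range(n):
            if i != j and dominates(points[i], points[j]):
                dominated_by[i].append(j)
                counts[j] += 1
    fronts, current = [], [i for i in range(n) if counts[i] == 0]
    while current:
        fronts.append(current)
        nxt = []
        for i in current:
            for j in dominated_by[i]:
                counts[j] -= 1
                if counts[j] == 0:
                    nxt.append(j)
        current = nxt
    return fronts   # fronts[k] holds the indices of the (k+1)-th non-domination layer

print(non_dominated_sort([(1, 4), (2, 2), (3, 1), (3, 3)]))
```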
Approximation Speed-Up by Quadratization on LeadingOnes

We investigate the quadratization of LeadingOnes in the context of the landscape for local search. We prove that a standard quadratization (i.e., its expression as a degree-2 multilinear polynomial) of LeadingOnes transforms the search space for local search in such a way that faster progress can be made. In particular, we prove there is an $$\varOmega (n/\log n)$$ speed-up for constant-factor approximations by RLS when using the quadratized version of the function. This suggests that well-known transformations for classical pseudo-Boolean optimization might have an interesting impact on search heuristics. We derive and present numerical results that investigate the difference in correlation structure between the untransformed landscape and its quadratization. Finally, we report experiments that provide a detailed glimpse into the convergence properties on the quadratized function.

Andrew M. Sutton, Darrell Whitley
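
To illustrate what a quadratization looks like, here is one standard construction for LeadingOnes (our sketch; whether it coincides exactly with the quadratization analysed in the paper is not asserted): each prefix product $$\prod _{j \le i} x_j$$ is replaced by $$z_i (\sum _{j \le i} x_j - (i-1))$$ with an auxiliary binary variable $$z_i$$, which is degree 2 in the joint variables and recovers LeadingOnes when maximized over the $$z_i$$.

```python
def leading_ones(x):
    """LeadingOnes: length of the prefix of one-bits."""
    k = 0
    for bit in x:
        if bit == 0:
            break
        k += 1
    return k

def quadratized_value(x, z):
    """One standard quadratization (our illustration): each monomial
    prod_{j<=i} x_j is replaced by z_i * (sum_{j<=i} x_j - (i - 1)),
    degree 2 in the joint variables (x, z)."""
    total, prefix_sum = 0, 0
    for i, (xi, zi) in enumerate(zip(x, z), start=1):
        prefix_sum += xi
        total += zi * (prefix_sum - (i - 1))
    return total

# Maximizing over z recovers LeadingOnes(x): set z_i = 1 exactly where the term is positive.
x = [1, 1, 0, 1]
z_best = [1 if sum(x[:i]) - (i - 1) > 0 else 0 for i in range(1, len(x) + 1)]
assert quadratized_value(x, z_best) == leading_ones(x)
```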
Benchmarking a Genetic Algorithm with Configurable Crossover Probability

We investigate a family of $$(\mu +\lambda )$$ Genetic Algorithms (GAs) which creates offspring either by mutation or by recombining two randomly chosen parents. By scaling the crossover probability, we can thus interpolate from a fully mutation-only algorithm towards a fully crossover-based GA. We analyze, by empirical means, how the performance depends on the interplay of population size and crossover probability. Our comparison on 25 pseudo-Boolean optimization problems reveals an advantage of crossover-based configurations on several easy optimization tasks, whereas the picture for more complex optimization problems is rather mixed. Moreover, we observe that the “fast” mutation scheme with its power-law distributed mutation strengths outperforms standard bit mutation on complex optimization tasks when it is combined with crossover, but performs worse in the absence of crossover. We then take a closer look at the surprisingly good performance of the crossover-based $$(\mu +\lambda )$$ GAs on the well-known LeadingOnes benchmark problem. We observe that the optimal crossover probability increases with increasing population size $$\mu $$. At the same time, it decreases with increasing problem dimension, indicating that the advantages of crossover are not visible in the asymptotic view classically applied in runtime analysis. We therefore argue that a mathematical investigation for fixed dimensions might help us observe effects which are not visible when focusing exclusively on asymptotic performance bounds.

Furong Ye, Hao Wang, Carola Doerr, Thomas Bäck
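
The offspring-creation step of the GA family studied above can be sketched as follows (our illustration; parameter names such as pc and beta are ours): with probability pc two uniformly chosen parents are recombined by uniform crossover, otherwise a single parent is mutated, optionally with a heavy-tailed "fast" mutation strength.

```python
import random

def create_offspring(population, pc, n, beta=1.5, fast_mutation=True):
    """With probability pc: uniform crossover of two uniformly chosen parents.
    Otherwise: mutate one parent, either with standard bit mutation (rate 1/n)
    or with rate alpha/n, where alpha is drawn from a power law with exponent
    beta on {1, ..., n/2} (in the spirit of the 'fast' mutation scheme)."""
    if random.random() < pc:
        a, b = random.sample(population, 2)
        return [random.choice(pair) for pair in zip(a, b)]
    parent = random.choice(population)
    if fast_mutation:
        weights = [k ** (-beta) for k in range(1, n // 2 + 1)]
        alpha = random.choices(range(1, n // 2 + 1), weights=weights)[0]
        rate = alpha / n
    else:
        rate = 1.0 / n
    return [bit ^ 1 if random.random() < rate else bit for bit in parent]

# Example: one offspring from a population of five parents, crossover probability 0.5.
pop = [[random.randint(0, 1) for _ in range(100)] for _ in range(5)]
print(sum(create_offspring(pop, pc=0.5, n=100)))
```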
Backmatter
Metadata
Title: Parallel Problem Solving from Nature – PPSN XVI
Editors: Prof. Thomas Bäck, Mike Preuss, Dr. André Deutz, Hao Wang, Dr. Carola Doerr, Assoc. Prof. Michael Emmerich, Heike Trautmann
Copyright Year: 2020
Electronic ISBN: 978-3-030-58115-2
Print ISBN: 978-3-030-58114-5
DOI: https://doi.org/10.1007/978-3-030-58115-2
