
2022 | Book

Genetic Programming Theory and Practice XVIII

Edited by: Wolfgang Banzhaf, Leonardo Trujillo, Stephan Winkler, Bill Worzel

Publisher: Springer Nature Singapore

Book series: Genetic and Evolutionary Computation


About this book

This book, written by the foremost international researchers and practitioners of genetic programming (GP), explores the synergy between theoretical and empirical results on real-world problems, producing a comprehensive view of the state of the art in GP. In this year’s edition, the topics covered include many of the most important issues and research questions in the field, such as opportune application domains for GP-based methods, game playing and co-evolutionary search, symbolic regression and efficient learning strategies, encodings and representations for GP, schema theorems, and new selection mechanisms. The book includes several chapters on best practices and lessons learned from hands-on experience. Readers will discover large-scale, real-world applications of GP to a variety of problem domains via in-depth presentations of the latest and most significant results.

Table of Contents

Frontmatter
Chapter 1. Finding Simple Solutions to Multi-Task Visual Reinforcement Learning Problems with Tangled Program Graphs
Abstract
Tangled Program Graphs (TPG) is a genetic programming framework in which emergent modularity incrementally composes programs into teams of programs, and teams into graphs of teams. To date, the framework has been demonstrated on reinforcement learning tasks with stochastic, partially observable state spaces, as well as on time series prediction. However, evolving solutions to reinforcement tasks often requires agents to demonstrate and juggle multiple properties simultaneously. Hence, we are interested in maintaining a population of diverse agents. Specifically, agent performance on a reinforcement learning task controls how much of the task the agent is exposed to. Premature convergence might therefore preclude solving aspects of a task that the agent only encounters later. Moreover, 'pointless complexity' may also result, in which graphs largely consist of hitchhikers. In this research, we benchmark rampant mutation (multiple mutations applied simultaneously during offspring creation) and action programs (multiple actions per state). Several parameterizations are also introduced that potentially penalize the introduction of hitchhikers. Benchmarking over five VizDoom tasks demonstrates that rampant mutation reduces the likelihood of encountering pathologically bad offspring, while action programs appear to improve performance in four out of five tasks. Finally, TPG parameterizations that actively limit the complexity of solutions appear to result in very efficient, low-dimensional solutions that generalize best across all combinations of 3, 4 and 5 VizDoom tasks.
Caleidgh Bayer, Ryan Amaral, Robert J. Smith, Alexandru Ianta, Malcolm I. Heywood
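The rampant mutation operator described in the abstract (several mutations applied in one offspring-creation step) can be sketched as follows. This is a minimal illustration, not the chapter's TPG implementation; the list-based genome, the `mutation_ops` argument and the fixed `n_mutations` count are all assumptions made for the sketch.

```python
import random

def rampant_mutate(genome, mutation_ops, n_mutations=3, rng=random):
    """Apply several randomly chosen mutation operators in a single
    offspring-creation step, instead of the usual single mutation.

    genome: a mutable representation (here a plain list stands in for
            a TPG graph); mutation_ops: functions genome -> genome.
    """
    child = list(genome)  # copy so the parent is left untouched
    for _ in range(n_mutations):
        op = rng.choice(mutation_ops)
        child = op(child)
    return child
```

With a single deterministic operator the effect of stacking mutations is easy to see: three applications of an "append 1" operator grow the genome by three elements.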
Chapter 2. Grammar-Based Vectorial Genetic Programming for Symbolic Regression
Abstract
Vectorial Genetic Programming (GP) is a young branch of GP in which the training data for symbolic models not only include regular, scalar variables but also vector variables. The model's abilities are extended with operations on vectors, most of which are simply performed component-wise. Additionally, new aggregation functions are introduced that reduce vectors to scalars, allowing the model to extract information from vectors by itself and eliminating the need for the prior feature engineering that is otherwise necessary for traditional GP to utilize vector data. Due to the white-box nature of symbolic models, operations on vectors can be interpreted as easily as regular operations on scalars. In this paper, we extend the ideas of vectorial GP of previous authors and propose a grammar-based approach for vectorial GP that addresses the various challenges noted. To evaluate grammar-based vectorial GP, we have designed new benchmark functions that contain both scalar and vector variables, and show that traditional GP falls short very quickly for certain scenarios. Grammar-based vectorial GP, however, is able to solve all presented benchmarks.
Philipp Fleck, Stephan Winkler, Michael Kommenda, Michael Affenzeller
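The two primitive families the abstract describes, component-wise vector operations and aggregation functions that reduce vectors to scalars, can be sketched as follows. This is a hypothetical illustration of the idea, not the chapter's grammar or function set; the primitive names (`v_add`, `v_mul`, the `AGGREGATES` table) and the example model are assumptions.

```python
import numpy as np

# Component-wise primitives: NumPy broadcasting lets the same node
# accept any mix of scalar and vector arguments.
def v_add(a, b):
    return np.asarray(a) + np.asarray(b)

def v_mul(a, b):
    return np.asarray(a) * np.asarray(b)

# Aggregation primitives reduce a vector argument to a scalar, so the
# evolved model can extract features from vector variables on its own.
AGGREGATES = {"mean": np.mean, "max": np.max, "sum": np.sum}

# Example symbolic model mixing both kinds of primitive:
#   y = mean(v * 2.0) + x   with vector variable v and scalar variable x
def model(v, x):
    return AGGREGATES["mean"](v_mul(v, 2.0)) + x
```

Because every operation is either component-wise or an explicit reduction, the resulting expression stays as interpretable as a purely scalar one.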
Chapter 3. Grammatical Evolution Mapping for Semantically-Constrained Genetic Programming
Abstract
Search-Based Software Engineering problems frequently have semantic constraints that can be used to deterministically restrict what type of programs can be generated, improving the performance of Genetic Programming. Strongly-Typed and Grammar-Guided Genetic Programming are two examples of using domain knowledge to improve the performance of Genetic Programming by preventing solutions that are known to be invalid from ever being added to the population. However, the restrictions in real-world challenges like program synthesis, automated program repair, or test generation are more complex than what context-free grammars or simple types can express. We illustrate these limitations with examples, and discuss the process of efficiently generating individuals in the context of Christiansen Grammatical Evolution and Refined-Typed Genetic Programming. We present three new approaches for the population initialization procedure of semantically constrained GP that are more efficient and promote more diversity than traditional Grammatical Evolution.
Alcides Fonseca, Paulo Santos, Guilherme Espada, Sara Silva
Chapter 4. What Can Phylogenetic Metrics Tell us About Useful Diversity in Evolutionary Algorithms?
Abstract
It is generally accepted that “diversity” is associated with success in evolutionary algorithms. However, diversity is a broad concept that can be measured and defined in a multitude of ways. To date, most evolutionary computation research has measured diversity using the richness and/or evenness of a particular genotypic or phenotypic property. While these metrics are informative, we hypothesize that other diversity metrics are more strongly predictive of success. Phylogenetic diversity metrics are a class of metrics popularly used in biology, which take into account the evolutionary history of a population. Here, we investigate the extent to which (1) these metrics provide different information than those traditionally used in evolutionary computation, and (2) these metrics better predict the long-term success of a run of evolutionary computation. We find that, in most cases, phylogenetic metrics behave meaningfully differently from other diversity metrics. Moreover, our results suggest that phylogenetic diversity is indeed a better predictor of success.
Jose Guadalupe Hernandez, Alexander Lalejini, Emily Dolson
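A minimal example of the kind of phylogenetic metric the abstract refers to is pairwise distance through a phylogeny's most recent common ancestor (MRCA). The sketch below is an illustration under simplifying assumptions (unit-length edges, a phylogeny stored as child-to-parent pointers); it is not the chapter's metric suite, and the function names are hypothetical.

```python
def path_to_root(node, parents):
    """Return the path from node up to the root, inclusive."""
    path = [node]
    while node in parents:
        node = parents[node]
        path.append(node)
    return path

def phylo_distance(a, b, parents):
    """Count edges on the path a -> MRCA -> b in a tree of parent pointers."""
    ancestors_a = {n: i for i, n in enumerate(path_to_root(a, parents))}
    for j, n in enumerate(path_to_root(b, parents)):
        if n in ancestors_a:  # first shared ancestor is the MRCA
            return ancestors_a[n] + j
    raise ValueError("nodes are not in the same tree")

def mean_pairwise_distance(taxa, parents):
    """Average phylogenetic distance over all pairs of extant taxa."""
    pairs = [(a, b) for i, a in enumerate(taxa) for b in taxa[i + 1:]]
    return sum(phylo_distance(a, b, parents) for a, b in pairs) / len(pairs)
```

Unlike richness or evenness of a genotypic property, this quantity grows with how long ago lineages diverged, which is the extra evolutionary-history information such metrics capture.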
Chapter 5. An Exploration of Exploration: Measuring the Ability of Lexicase Selection to Find Obscure Pathways to Optimality
Abstract
Parent selection algorithms (selection schemes) steer populations through a problem’s search space, often trading off between exploitation and exploration. Understanding how selection schemes affect exploitation and exploration within a search space is crucial to tackling increasingly challenging problems. Here, we introduce an “exploration diagnostic” that diagnoses a selection scheme’s capacity for search space exploration. We use our exploration diagnostic to investigate the exploratory capacity of lexicase selection and several of its variants: epsilon lexicase, down-sampled lexicase, cohort lexicase, and novelty-lexicase. We verify that lexicase selection out-explores tournament selection, and we show that lexicase selection’s exploratory capacity can be sensitive to the ratio between population size and the number of test cases used for evaluating candidate solutions. Additionally, we find that relaxing lexicase’s elitism with epsilon lexicase can further improve exploration. Both down-sampling and cohort lexicase—two techniques for applying random subsampling to test cases—degrade lexicase’s exploratory capacity; however, we find that cohort partitioning better preserves lexicase’s exploratory capacity than down-sampling. Finally, we find evidence that novelty-lexicase’s addition of novelty test cases can degrade lexicase’s capacity for exploration. Overall, our findings provide hypotheses for further exploration and actionable insights and recommendations for using lexicase selection. Additionally, this work demonstrates the value of selection scheme diagnostics as a complement to more conventional benchmarking approaches to selection scheme analysis.
Jose Guadalupe Hernandez, Alexander Lalejini, Charles Ofria
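For readers unfamiliar with the selection scheme under study, standard lexicase selection can be sketched in a few lines: each selection event shuffles the test cases and repeatedly filters the candidate pool down to the individuals that are elite on the next case. The representation below (numeric individuals, an `error(individual, case)` callback) is a stand-in chosen for the sketch, not the chapter's code.

```python
import random

def lexicase_select(population, test_cases, error):
    """Select one parent via standard lexicase selection.

    population: list of candidate solutions
    test_cases: list of test-case identifiers
    error(individual, case) -> numeric error (lower is better)
    """
    candidates = list(population)
    cases = list(test_cases)
    random.shuffle(cases)  # each selection event uses a fresh case ordering
    for case in cases:
        best = min(error(ind, case) for ind in candidates)
        candidates = [ind for ind in candidates
                      if error(ind, case) == best]
        if len(candidates) == 1:
            break
    return random.choice(candidates)
```

The variants studied in the chapter modify this loop: epsilon lexicase keeps candidates within a tolerance of the best error, while down-sampled and cohort lexicase restrict which test cases each selection event may draw from.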
Chapter 6. Feature Discovery with Deep Learning Algebra Networks
Abstract
Deep learning neural networks have produced notable, well-publicized successes in several fields, as has Genetic Programming. Inspired by the deep learning successes with neural nets, we experiment with deep learning algebra networks, where the network architecture remains unchanged but the neurons are replaced with general algebraic expressions. The training algorithms replace back propagation, counter propagation, etc. with a combination of genetic programming to generate the algebraic expressions, and multiple regression, logit regression, and discriminant analysis to train the deep learning algebra network. These enhanced algebra networks are trained on ten theoretical classification problems, showing a clear statistical performance improvement as the network architecture is expanded.
Michael F. Korns
Chapter 7. Back to the Future—Revisiting OrdinalGP and Trustable Models After a Decade
Abstract
OrdinalGP (2006) [4] embraced a fail-fast philosophy to efficiently model very large data sets. Recently, we realized that it is also effective on small data sets as a way to reward model generalization. ESSENCE (2009) [6] extended the OrdinalGP concept to handle imbalanced data, using the SMITS algorithm to rank data records by information content and thereby avoid locking into the behavior of heavily sampled data regions; however, it had the disadvantage of computationally intensive data conditioning and a correspondingly fixed data ranking. With BalancedGP (2019) we shifted to stochastic sampling to achieve a similar benefit. Trustable models (2007) [3] exploited the diversity of model forms developed during symbolic regression to define ensembles that feature accurate prediction as well as detection of both extrapolation into new regions of parameter space and changes in the underlying system. Although the deployed implementation has been effective, the diversity metric used was data-centric, so alternatives have been explored to improve the robustness of ensemble definition. This chapter documents our latest thinking, realizations, and the benefits of revisiting OrdinalGP and trustable models.
Mark Kotanchek, Nathan Haut
Chapter 8. Fitness First
Abstract
With side-effect-free terminals and functions, it is possible to evaluate the fitness of genetic programming trees from their parents without creating them. This allows selection before the next generation is formed, avoiding unfit runt Genetic Algorithm individuals, which would themselves have no children. In highly diverse GA populations with strong selection, more than 50% of children need not be created. Even with two-parent crossover in converged populations, \(e^{-2}\) = 13.5% can be saved. Eliminating bachelors and spinsters and extracting the smaller genetic material of each mating before crossover reduces storage in an N-threaded implementation for a population M to \(\le \)0.63M+N, compared to the usual M+2N. Memory-efficient crossover achieves 692 billion GP operations per second, 692 giga GPops, at runtime on a 16-core 3.8 GHz desktop.
W. B. Langdon
Chapter 9. Designing Multiple ANNs with Evolutionary Development: Activity Dependence
Abstract
We use Cartesian genetic programming to evolve developmental programs that construct neural networks. One program represents the neuron soma and the other the dendrite. We show that the evolved programs can build a network from which multiple conventional ANNs can be extracted each of which can solve a different computational problem. We particularly investigate the utility of activity dependence (AD), where the signals passing through dendrites and neurons affect their properties.
Julian Francis Miller
Chapter 10. Evolving and Analyzing Modularity with GLEAM (Genetic Learning by Extraction and Absorption of Modules)
Abstract
General methods for the evolution of programs with modular structure have long been sought by genetic programming researchers, in part because modularity has long been considered to be essential, or at least helpful, for human programmers when they develop large-scale software projects. Multiple efforts have been made in this direction, and while success has been demonstrated in specific contexts, no general scheme has yet been demonstrated to provide benefits for evolutionary program synthesis that are similar in generality and significance to those provided by modularity in human software engineering. In this chapter, we present and analyze a new framework for the study of the evolution of modularity, called GLEAM (Genetic Learning by Extraction and Absorption of Modules). GLEAM’s flexible architecture and tunable parameters allow researchers to test different methods related to the generation, propagation, and use of modules in genetic programming.
Anil Kumar Saini, Lee Spector
Chapter 11. Evolution of the Semiconductor Industry, and the Start of X Law
Abstract
In this paper, we explore the use of evolutionary concepts to predict what comes next for the Semiconductor Industry. At its core, evolution is the transition of information. Understanding what causes the transitions paves the way to potentially creating a predictive model for the industry. Prediction is one of the essential functions of research; it is challenging to get right; and it is of paramount importance when it comes to determining the next commercial objective, which often depends on a single change. The most critical part of prediction is to explore the components that form the landscape of potential outcomes. With these outcomes, we can decide what careers to take, which areas to dedicate resources to, and, further out, possible methods to increase revenue. The Semiconductor Industry is a complex ecosystem on whose continued advancements many adjacent industries rely. The human appetite to consume more data puts pressure on the industry. Consumption drives three technology vectors, namely storage, compute, and communication. Under this premise, two thoughts lead to this paper. First, the End of Moore’s Law (EoML) [33], where transistor density growth slows down over time, whether due to costs or technology constraints (thermal and energy restrictions). These factors mean that the traditional iterative methods adopted by the Semiconductor Industry may fail to satisfy future data demands. Second, the quote by Leonard Adleman, “Evolution is not the story of life; it is the story of compute” [2], where essentially evolution is used as a method to understand future advancements. Understanding a landscape and its parameterization could lead to a predictive model for the Semiconductor Industry. The plethora of future evolutionary steps available means we should probably stop focusing on EoML and shift our attention to finding the next new law for the industry. The new law is the Start of X Law, where X symbolizes a new beginning.
Evolutionary principles show that co-operation and some form of altruism may be the only methods to achieve these forward steps. Future choices end up being a balancing act between conflicting ideas due to the multi-objective nature of the overall requirements.
Andrew N. Sloss
Backmatter
Metadata
Title
Genetic Programming Theory and Practice XVIII
Edited by
Wolfgang Banzhaf
Leonardo Trujillo
Stephan Winkler
Bill Worzel
Copyright year
2022
Publisher
Springer Nature Singapore
Electronic ISBN
978-981-16-8113-4
Print ISBN
978-981-16-8112-7
DOI
https://doi.org/10.1007/978-981-16-8113-4
