
2015 | Book

Handbook of Genetic Programming Applications

Edited by: Amir H. Gandomi, Amir H. Alavi, Conor Ryan

Publisher: Springer International Publishing

Included in: Springer Professional "Wirtschaft+Technik", Springer Professional "Technik", Springer Professional "Wirtschaft"


About this book

This contributed volume, written by leading international researchers, reviews the latest developments in genetic programming (GP) and its key applications in solving current real-world problems in areas such as energy conversion and management, financial analysis, engineering modeling and design, and software engineering. The use of GP, which is inspired by natural evolution, has expanded significantly in the last decade in almost every area of science and engineering. Exploring applications in a variety of fields, the information in this volume can help optimize computer programs throughout the sciences. Taking a hands-on approach, this book is an invaluable reference for practitioners, providing the details required for a successful application of GP and its branches to challenging problems ranging from drought prediction to trading volatility. It also demonstrates the evolution of GP through major developments in GP studies and applications. It is suitable for advanced students who wish to use relevant book chapters as a basis to pursue further research in these areas, as well as experienced practitioners looking to apply GP to new areas. The book also offers valuable supplementary material for design courses and computation in engineering.

Table of Contents

Frontmatter

Overview of Genetic Programming Applications

Frontmatter

Chapter 1. Graph-Based Evolutionary Art

Abstract

A graph-based approach for the evolution of Context Free Design Grammars is presented. Each genotype is a directed hierarchical graph and, as such, the evolutionary engine employs graph-based crossover and mutation. We introduce six different fitness functions based on evolutionary art literature and conduct a wide set of experiments. We begin by assessing the adequacy of the system and establishing the experimental parameters. Afterwards, we conduct evolutionary runs using each fitness function individually. Finally, experiments where a combination of these functions is used to assign fitness are performed. Overall, the experimental results show the ability of the system to optimize the considered functions, individually and combined, and to evolve images that have the desired visual characteristics.

Penousal Machado, João Correia, Filipe Assunção

Chapter 2. Genetic Programming for Modelling of Geotechnical Engineering Systems

Abstract

Over the last decade or so, artificial intelligence (AI) has proved highly competent in solving many geotechnical engineering problems that are beyond the computational capability of classical mathematics and traditional procedures. This chapter presents one of the most interesting AI techniques, genetic programming (GP), and its applications in geotechnical engineering. In the last few years, GP, which is inspired by natural evolution, has proved successful in modelling several geotechnical engineering problems and has demonstrated superior predictive ability compared to traditional methods. In this chapter, the modelling aspects and formulation of GP are described and explained in some detail, and an overview of the most successful GP applications in geotechnical engineering is presented and discussed.

Mohamed A. Shahin

Chapter 3. Application of Genetic Programming in Hydrology

Abstract

As models of physical phenomena grow more complex and accurate, attention has turned to using and improving tools that extract system equations by means of simple rules. Such tools are commonly user-friendly and try to minimize an error criterion between the real (observed) values and the values obtained from the system rules. Appropriate water resource modeling requires the assistance of computer models to connect data sets, management and decision makers. The purpose of this chapter is to review genetic programming (GP) applications in hydrology and to consider future directions for research and application. Previous applications of GP have shown its ability to cope with system characteristics such as high dimensionality, nonlinearity, and convexity. GP is also flexible enough to be coupled with other systems, both internally and externally.

E. Fallah-Mehdipour, O. Bozorg Haddad

Chapter 4. Application of Gene-Expression Programming in Hydraulic Engineering

Abstract

Open-channel hydraulics is probably the most important branch of water resources engineering, and this sub-discipline has been of vital importance throughout human history. The complex and highly nonlinear behavior of most problems in hydraulics has led to the use of various soft computing techniques for their efficient solution. Genetic programming (GP) is a relatively new soft computing technique with a high ability to develop intelligent systems and to provide precise functional relationships as solutions to complicated problems. The capability of GP in solving many engineering problems, and the development and application of its branches, have attracted the attention of many researchers. Gene-Expression Programming (GEP) is becoming an important class of GP and has found extensive applications in hydraulic engineering. GEP is an evolutionary algorithm that mimics biological evolution to model complicated real-world phenomena. This chapter reviews the application and development of GEP for many hydraulic phenomena. The literature review is organized by aspects of hydraulic engineering, including flow through hydraulic structures, scouring at control structures, and stage-discharge rating curves in compound open channels.

A. Zahiri, A. A. Dehghani, H. Md. Azamathulla

Chapter 5. Genetic Programming Applications in Chemical Sciences and Engineering

Abstract

Genetic programming (GP) (Koza, Genetic programming: a paradigm for genetically breeding populations of computer programs to solve problems, Stanford University, Stanford, 1990) was originally proposed for automatically generating computer programs that would perform pre-defined tasks. There exist two other important GP applications, namely classification and “symbolic regression”, that are widely used in pattern recognition and data-driven modeling, respectively. Of the two, GP has found more applications for its capability to perform symbolic regression (SR) effectively. Given an input–output data set, SR can search for and optimize an appropriate linear or non-linear data-fitting function together with all its parameters. GP-based symbolic regression (GPSR) offers an attractive avenue to extract correlations, explore candidate models and provide optimal solutions to data-driven modeling problems. Despite its novelty and effectiveness, GP, unlike artificial neural networks and support vector regression, has not seen explosive growth in its applications. Owing to the availability of feature-rich and user-friendly software packages as well as faster computers (including parallel computing devices), there has been a spate of research publications in recent years exploiting the significant potential of GP for diverse classification and modeling applications in chemistry and related sciences and engineering. Accordingly, this chapter provides a bird’s-eye view of the ever-increasing applications of GP in the chemical sciences and engineering, with the objective of bringing out its immense potential in solving diverse problems. The chapter not only focuses on the important GP applications but also offers guidelines for developing optimal GP models. Additionally, a non-exhaustive list of GP software packages is provided.

Renu Vyas, Purva Goel, Sanjeev S. Tambe

Chapter 6. Application of Genetic Programming for Electrical Engineering Predictive Modeling: A Review

Abstract

Having computers solve problems automatically is central to machine learning, artificial intelligence and the broad area that Turing called ‘machine intelligence’. Genetic programming (GP) is an adaptable and robust evolutionary algorithm with features that make it well suited to getting computers to solve problems automatically, starting from a high-level statement of what needs to be done. Using concepts from natural evolution, GP begins with an ooze of random computer programs and improves them progressively through processes of mutation and sexual recombination until solutions appear, all without the user needing to know or specify the form or structure of solutions in advance. GP has produced a plethora of human-competitive results and applications, including novel scientific discoveries and patentable inventions. The goal of this paper is to give an introduction to the quickly developing field of GP. We begin with a gentle introduction to the basic representation, initialization and operators used in GP, complemented by a step-by-step description of their use and application. Then, we go on to explain the diversity of alternative representations for programs and more advanced specializations of GP. Although this paper has been written with beginners and practitioners in mind, for completeness we also provide an outline of the theory available to date for GP.

Seyyed Soheil Sadat Hosseini, Alireza Nemati
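
The loop summarized in the Chapter 6 abstract (random initial programs, improved through mutation and recombination) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the chapter: the primitive set, the nested-tuple tree encoding, tournament selection, elitism, the size cap `MAX_NODES`, and all parameter values are hypothetical choices.

```python
import operator
import random

# Hypothetical primitive set chosen for this sketch (not from the chapter).
FUNCTIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
TERMINALS = ["x", 1.0, 2.0]
MAX_NODES = 50  # crude size cap to limit bloat

def random_tree(rng, depth):
    """Grow a random program: a terminal or a tuple (function name, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    name = rng.choice(sorted(FUNCTIONS))
    return (name, random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def evaluate(tree, x):
    """Run the program on input x."""
    if isinstance(tree, tuple):
        name, left, right = tree
        return FUNCTIONS[name](evaluate(left, x), evaluate(right, x))
    return x if tree == "x" else tree

def paths(tree, prefix=()):
    """Yield the path (child indices from the root) of every node."""
    yield prefix
    if isinstance(tree, tuple):
        for i in (1, 2):
            yield from paths(tree[i], prefix + (i,))

def get(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, new):
    """Return a copy of tree with the subtree at path swapped for new."""
    if not path:
        return new
    node = list(tree)
    node[path[0]] = replace(tree[path[0]], path[1:], new)
    return tuple(node)

def mutate(rng, tree):
    """Subtree mutation: replace a random node with a fresh random subtree."""
    return replace(tree, rng.choice(list(paths(tree))), random_tree(rng, 2))

def crossover(rng, a, b):
    """Subtree crossover: graft a random subtree of b into a random point of a."""
    return replace(a, rng.choice(list(paths(a))), get(b, rng.choice(list(paths(b)))))

def error(tree, cases):
    """Sum of squared errors over (x, y) fitness cases; lower is better."""
    return sum((evaluate(tree, x) - y) ** 2 for x, y in cases)

def evolve(cases, pop_size=60, generations=30, seed=0):
    rng = random.Random(seed)
    pop = [random_tree(rng, 3) for _ in range(pop_size)]
    history = []  # best error seen at the start of each generation
    for _ in range(generations):
        pop.sort(key=lambda t: error(t, cases))
        history.append(error(pop[0], cases))
        nxt = [pop[0]]  # elitism: the best program always survives
        while len(nxt) < pop_size:
            a = min(rng.sample(pop, 3), key=lambda t: error(t, cases))
            b = min(rng.sample(pop, 3), key=lambda t: error(t, cases))
            child = crossover(rng, a, b)
            if rng.random() < 0.2:
                child = mutate(rng, child)
            if len(list(paths(child))) > MAX_NODES:
                child = a  # reject oversized offspring
            nxt.append(child)
        pop = nxt
    best = min(pop, key=lambda t: error(t, cases))
    return best, history + [error(best, cases)]
```

With fitness cases sampled from a target such as y = x*x + x, `evolve` returns the best program found together with the best error per generation. Because of elitism that error never increases, although nothing in this sketch guarantees that an exact solution is found.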

Chapter 7. Mate Choice in Evolutionary Computation

Abstract

Darwin considered two major theories that account for the evolution of species. Natural Selection was described as the result of competition within or between species affecting individuals’ relative survival ability, while Sexual Selection was described as the result of competition within species affecting individuals’ relative rate of reproduction. The latter theory emerged from Darwin’s need to explain complex ornamentation and behaviour that, while costly to maintain, bring no apparent survival advantage to individuals. Mate Choice is one of the processes described by Darwin’s theory of Sexual Selection as responsible for the emergence of a wide range of characteristics such as the peacock’s tail, bright coloration in different species, certain bird songs or extravagant courtship behaviours. As the theory attracted more and more researchers, the role of Mate Choice has been extensively discussed and backed up by supporting evidence, showing how a force which adapts individuals not to their habitat but to each other can have a strong impact on the evolution of species. While Mate Choice is highly regarded in many research fields, its role in Evolutionary Computation (EC) is still far from being explored and understood. Following Darwin’s ideas on Mate Choice, as well as Fisher’s contributions regarding the heritability of mating preferences, we propose computational models of Mate Choice, which follow three key rules: individuals choose their mating partners based on their perception mechanisms and mating preferences; mating preferences are heritable in the same way as any other trait; Mate Choice introduces its own selection pressure but is subjected to selection pressure itself. The use of self-adaptive methods allows individuals to encode their own mating preferences, use them to evaluate mating candidates and pass preferences on to future generations.
Self-adaptive Mate Choice also allows evaluation functions to adapt to the problem at hand as well as to the individuals in the population. In this study we show how Genetic Programming (GP) can be used to represent and evolve mating preferences. In our approach the genotype of each individual is composed of two chromosomes encoding (1) a candidate solution to the problem at hand and (2) a mating partner evaluation function. During the reproduction step of the algorithm, the first parent is chosen based on fitness, as in conventional EC approaches; the mating partner evaluation function encoded in the genotype of this individual is then used to evaluate its potential partners and choose a second parent. Being part of the genotype, the evaluation functions are subjected to evolution and there is an evolutionary pressure to evolve adequate mate evaluation functions. We analyze and discuss the impact of this approach on the evolutionary process, showing how valuable and innovative mate evaluation functions, which would be unlikely to be designed by humans, arise. We also explain how GP non-terminal and terminal sets can be defined in order to allow the representation of mate selection functions. Finally, we show how self-adaptive Mate Choice can be applied in both academic and real-world applications, having achieved encouraging results in both cases. Future avenues of research are also proposed, such as applications to dynamic environments or multi-objective problems.

António Leitão, Penousal Machado
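
The parent selection step described in the Chapter 7 abstract can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the chapter evolves the mate evaluation function as a GP tree, whereas here it is replaced by a hypothetical two-weight linear function over two made-up candidate features (fitness and distance to the chooser). The toy fitness function, the tournament size and the variation operators are likewise assumptions.

```python
import random
from dataclasses import dataclass

@dataclass
class Individual:
    solution: list    # chromosome 1: candidate solution to the problem
    preference: list  # chromosome 2: weights of a simplified mate evaluation function

def fitness(ind):
    """Toy objective assumed for this sketch: maximize -sum(v^2)."""
    return -sum(v * v for v in ind.solution)

def distance(a, b):
    return sum(abs(x - y) for x, y in zip(a.solution, b.solution))

def mate_score(chooser, candidate):
    """Score a candidate using the chooser's own heritable preference weights."""
    features = [fitness(candidate), distance(chooser, candidate)]
    return sum(w * f for w, f in zip(chooser.preference, features))

def select_parents(rng, pop, tournament=3, candidates=5):
    # First parent: chosen on fitness, as in conventional EC.
    first = max(rng.sample(pop, tournament), key=fitness)
    # Second parent: chosen by the first parent's mate evaluation function.
    others = [p for p in pop if p is not first]
    pool = rng.sample(others, candidates)
    second = max(pool, key=lambda c: mate_score(first, c))
    return first, second

def reproduce(rng, first, second, sigma=0.1):
    """Both chromosomes are inherited, so mating preferences evolve too."""
    def mix(a, b):
        # Uniform crossover followed by Gaussian mutation.
        return [rng.choice(pair) + rng.gauss(0.0, sigma) for pair in zip(a, b)]
    return Individual(mix(first.solution, second.solution),
                      mix(first.preference, second.preference))
```

In this sketch a chooser with weights `[1.0, 0.0]` simply prefers the fittest candidate, while `[0.0, 1.0]` prefers the most dissimilar one. Because the weights are part of the genotype, selection acts on them indirectly through the offspring they help produce.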

Specialized Applications

Frontmatter

Chapter 8. Genetically Improved Software

Abstract

Genetic programming (GP) can dramatically increase computer programs’ performance. It can automatically port or refactor legacy code written by domain experts and specialist software engineers. After reviewing search-based software engineering (SBSE) research on evolving software, we describe an open source parallel StereoCamera image processing application in which genetic improvement (GI) optimisation gave a sevenfold speedup on nVidia Tesla GPU hardware not even imagined when the original state-of-the-art CUDA GPGPU C++ code was written.

William B. Langdon

Chapter 9. Design of Real-Time Computer-Based Systems Using Developmental Genetic Programming

Abstract

This chapter presents applications of developmental genetic programming (DGP) to the design and optimization of real-time computer-based systems. We show that the DGP approach may be efficiently used to solve the following problems: scheduling of real-time tasks in multiprocessor systems, hardware/software codesign of distributed embedded systems, and budget-aware real-time cloud computing. The goal of optimization is to minimize the cost of the system while satisfying all real-time constraints. Since finding the best solution is very complex, only efficient heuristics may be applied to real-life systems. Unlike other genetic approaches, where chromosomes represent solutions, in DGP chromosomes represent system construction procedures. Thus, not the system architecture, but the synthesis process evolves. Finally, a tree describing the construction of a (sub-)optimal solution is obtained and the genotype-to-phenotype mapping is applied to create the target system. Some ideas concerning other applications of DGP for the optimization of computer-based systems are also outlined.

Stanisław Deniziak, Leszek Ciopiński, Grzegorz Pawiński

Chapter 10. Image Classification with Genetic Programming: Building a Stage 1 Computer Aided Detector for Breast Cancer

Abstract

This chapter describes a general approach for image classification using Genetic Programming (GP) and demonstrates this approach through the application of GP to the task of stage 1 cancer detection in digital mammograms. We detail an automated work-flow that begins with image processing and culminates in the evolution of classification models which identify suspicious segments of mammograms. Early detection of breast cancer is directly correlated with survival of the disease and mammography has been shown to be an effective tool for early detection, which is why many countries have introduced national screening programs. However, this presents challenges, as such programs involve screening a large number of women and thus require more trained radiologists at a time when there is a shortage of these professionals in many countries. Also, as mammograms are difficult to read and radiologists typically only have a few minutes allocated to each image, screening programs tend to be conservative, involving many callbacks which increase both the workload of the radiologists and the stress and worry of patients. Fortunately, the relatively recent increase in the availability of mammograms in digital form means that it is now much more feasible to develop automated systems for analysing mammograms. Such systems, if successful, could provide a very valuable second reader function. We present a work-flow that begins by processing digital mammograms to segment them into smaller sub-images and to extract features which describe textural aspects of the breast. The most salient of these features are then used in a GP system which generates classifiers capable of identifying which particular segments may have suspicious areas requiring further investigation. An important objective of this work is to evolve classifiers which detect as many cancers as possible but which are not overly conservative.
The classifiers give results of 100 % sensitivity and a false positive per image rating of just 0.33, which is better than prior work. Not only this, but our system can use GP as part of a feedback loop, to both select and help generate further features.

Conor Ryan, Jeannie Fitzgerald, Krzysztof Krawiec, David Medernach
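
The two figures quoted in the Chapter 10 abstract, sensitivity and false positives per image, are standard detection metrics. As a rough illustration, the sketch below computes them from per-segment labels. The data layout (one list of `(is_cancer, flagged)` pairs per image) and the function name are assumptions made here; the chapter may count true and false positives at a different granularity.

```python
def sensitivity_and_fppi(images):
    """Compute (sensitivity, false positives per image).

    images: list of images; each image is a list of (is_cancer, flagged)
    pairs, one per segment (a hypothetical layout chosen for this sketch).
    """
    tp = fn = fp = 0
    for segments in images:
        for is_cancer, flagged in segments:
            if is_cancer and flagged:
                tp += 1   # cancerous segment correctly flagged
            elif is_cancer:
                fn += 1   # cancerous segment missed
            elif flagged:
                fp += 1   # healthy segment flagged as suspicious
    # Sensitivity is undefined when there are no cancerous segments at all.
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    fppi = fp / len(images)
    return sensitivity, fppi
```

For example, three images containing one correctly flagged cancerous segment and one wrongly flagged healthy segment in total give a sensitivity of 1.0 and one false positive per three images. The trade-off mentioned in the abstract is visible here: flagging more segments can only raise sensitivity, but it also raises false positives per image.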

Chapter 11. On the Application of Genetic Programming for New Generation of Ground Motion Prediction Equations

Abstract

Ground-motion prediction equations (GMPEs) generally predict ground-motion intensities such as peak ground acceleration (PGA), peak ground velocity (PGV), and response spectral acceleration (SA) as a function of magnitude, site-to-source distance, site condition, and other seismological parameters. An adequate prediction of the expected ground-motion intensities plays a fundamental role in practical seismic hazard assessment; GMPEs are therefore regarded as among the most influential elements in Seismic Hazard Analysis (SHA). Recently, besides the two common traditional methodologies, i.e. empirical and physical relationships, the application of Genetic Programming, an optimization technique based on Evolutionary Algorithms (EA), has expanded considerably. In recent decades, the complexity of obtaining an appropriate predictive model has led to various studies that aim to develop Genetic Programming-based GMPEs. In this chapter, the concepts, methodologies and results of different studies on deriving new ground motion relationships based on Genetic Programming are discussed.

Mehdi Mousavi, Alireza Azarbakht, Sahar Rahpeyma, Ali Farhadi

Chapter 12. Evaluation of Liquefaction Potential of Soil Based on Shear Wave Velocity Using Multi-Gene Genetic Programming

Abstract

In this chapter, the liquefaction potential of soil is evaluated within deterministic as well as probabilistic frameworks based on post-liquefaction shear wave velocity (V_s) measurement data using a soft computing technique, multi-gene genetic programming (MGGP), which is a variant of genetic programming (GP). On the basis of the limit state function developed by MGGP, a mapping function is presented to correlate the probability of liquefaction (P_L) with the factor of safety (F_s) against liquefaction using the Bayesian theory of conditional probability. Two examples are presented to compare the developed MGGP-based deterministic and probabilistic methods with available artificial neural network (ANN)-based methods. The findings from these two examples confirm that the MGGP-based methods are more accurate than the ANN-based methods in predicting both liquefied and non-liquefied cases.

Pradyut Kumar Muduli, Sarat Kumar Das

Chapter 13. Site Characterization Using GP, MARS and GPR

Abstract

This article examines the capability of Genetic Programming (GP), Multivariate Adaptive Regression Spline (MARS) and Gaussian Process Regression (GPR) for developing a site characterization model of Bangalore (India) based on the corrected Standard Penetration Test (SPT) value (N_c). GP, MARS and GPR have been used as regression techniques. GP is developed on the basis of the genetic algorithm. MARS does not assume any functional relationship between input and output variables. GPR is a probabilistic, non-parametric model in which different kinds of prior knowledge can be applied. In the three-dimensional analysis, the function \( N_c = f(X, Y, Z) \), where X, Y and Z are the coordinates of a point corresponding to an N_c value, is to be approximated so that the N_c value at any half-space point in Bangalore can be determined. A comparative study of the developed GP, MARS and GPR models has been carried out in this book chapter. The developed GP, MARS and GPR models give the spatial variability of N_c values at Bangalore.

Pijush Samui, Yıldırım Dalkiliç, J Jagan

Chapter 14. Use of Genetic Programming Based Surrogate Models to Simulate Complex Geochemical Transport Processes in Contaminated Mine Sites

Abstract

Reactive transport of chemical species in contaminated groundwater systems, especially with multiple species, is a complex and highly non-linear geochemical process. Simulation of such complex geochemical processes using numerical models is generally computationally intensive. In order to increase model reliability for real field data, uncertainties in hydrogeological parameters and boundary conditions need to be considered as well. This chapter presents the development and performance evaluation of ensemble Genetic Programming (GP) models that serve as computationally efficient approximate simulators of complex groundwater contaminant transport processes with reactive chemical species under uncertainty in aquifer parameters. The GP models are trained and tested using sets of random input contaminant sources and the corresponding aquifer responses, expressed as the resulting spatio-temporal concentrations of the contaminants obtained as solutions of the hydrogeological and geochemical numerical simulation model. A three-dimensional transient flow and reactive contaminant transport process is considered. Performance evaluation of the ensemble GP models as surrogate models for reactive species transport in groundwater demonstrates the feasibility of their use and the associated computational advantages. The evaluation results show that it is feasible to use ensemble GP models as approximate simulators of complex hydrogeologic and geochemical processes in a contaminated groundwater aquifer while incorporating uncertainties in the description of the physical system.

Hamed Koohpayehzadeh Esfahani, Bithin Datta

Chapter 15. Potential of Genetic Programming in Hydroclimatic Prediction of Droughts: An Indian Perspective

Abstract

Past studies have established the presence of hydroclimatic teleconnections between hydrological variables across the world and large-scale coupled oceanic-atmospheric circulation patterns, such as the El Niño-Southern Oscillation (ENSO), Equatorial Indian Ocean Oscillation (EQUINOO), Pacific Decadal Oscillation (PDO), Atlantic Multi-decadal Oscillation (AMO) and Indian Ocean Dipole (IOD). For the purpose of modelling hydroclimatic teleconnections, artificial intelligence (AI) tools including Genetic Programming (GP) have been successfully applied in several studies. In this chapter, we explore the potential of Linear Genetic Programming (LGP) for the prediction of droughts using local and global climate inputs in the context of Indian hydroclimatology. The global anomaly fields of five different climate variables, namely Sea Surface Temperature (SST), Surface Pressure (SP), Air Temperature (AT), Wind Speed (WS) and Total Precipitable Water (TPW), are explored during extreme rainfall events (isolated by standardizing monthly rainfall from 1959 to 2010 using an anomaly-based index) to identify the Global Climate Pattern (GCP). The GCP for the target area is characterized by 14 variables, where each variable is designated by a particular climate variable from a distinct zone on the globe. The potential of an LGP-based approach is explored to extract the climate information hidden in the GCP and to predict the ensuing drought status. The LGP-based approach is found to produce reasonably good results. Many of the dry and wet events observed during the last few decades are found to be predicted successfully.

Rajib Maity, Kironmala Chanda

Chapter 16. Application of Genetic Programming for Uniaxial and Multiaxial Modeling of Concrete

Abstract

This chapter gives an overview of recently established genetic programming based techniques for strength modeling of concrete. It concentrates on comprehensive uniaxial and multiaxial strength modeling of hardened concrete, one of the main areas of interest in concrete modeling for structural engineers. For this engineering case, the literature is reviewed and the most widely applied numerical/analytical/experimental models and national building codes are introduced. After a review of artificial intelligence/machine learning based models, genetic programming based models are presented, with emphasis on the applicability, efficiency and suitability of each model. The advantages and weaknesses of these models are summarized and compared with existing numerical/analytical/experimental models and national building codes, and a few illustrative examples are briefly presented. The genetic programming based techniques are remarkably straightforward and have enabled reliable, stable, and robust tools for pre-design and design applications.

Saeed K. Babanajad

Chapter 17. Genetic Programming for Mining Association Rules in Relational Database Environments

Abstract

Most approaches for the extraction of association rules look for associations in a dataset in the form of a single table. However, with the growing interest in the storage of information, relational databases comprising a series of relations (tables) and relationships have become essential. We present the first grammar-guided genetic programming approach for mining association rules directly from relational databases. We represent the relational databases as trees by means of genetic programming, preserving the original database structure and enabling rules to be defined in an expressive and very flexible way. The proposed model deals with both positive and negative items, and also with both discrete and quantitative attributes. We exemplify the utility of the proposed approach with an artificially generated database having different characteristics. We also analyse a real case study, discovering interesting student behaviors from a Moodle database.

J. M. Luna, A. Cano, S. Ventura

Chapter 18. Evolving GP Classifiers for Streaming Data Tasks with Concept Change and Label Budgets: A Benchmarking Study

Abstract

Streaming data classification requires that several additional challenges are addressed that are not typically encountered in offline supervised learning formulations. Specifically, access to data at any training generation is limited to a small subset of the data, and the data itself is potentially generated by a non-stationary process. Moreover, there is a cost to requesting labels, thus a label budget is enforced. Finally, an anytime classification requirement implies that it must be possible to identify a ‘champion’ classifier for predicting labels as the stream progresses. In this work, we propose a general framework for deploying genetic programming (GP) to streaming data classification under these constraints. The framework consists of a sampling policy and an archiving policy that enforce criteria for selecting data to appear in a data subset. Only the exemplars of the data subset are labeled, and it is the content of the data subset that training epochs are performed against. Specific recommendations include support for GP task decomposition/modularity and making additional training epochs per data subset. Both recommendations make significant improvements to the baseline performance of GP under streaming data with label budgets. Benchmarking issues addressed include the identification of datasets and performance measures.

Ali Vahdat, Jillian Morgan, Andrew R. McIntyre, Malcolm I. Heywood, Nur Zincir-Heywood

Hybrid Approaches

Frontmatter

Chapter 19. A New Evolutionary Approach to Geotechnical and Geo-Environmental Modelling

Abstract

In many cases, models based on certain laws of physics can be developed to describe the behaviour of physical systems. However, for more complex phenomena whose contributing parameters or variables are less well known or understood, physics-based modelling techniques may not be applicable. Evolutionary Polynomial Regression (EPR) offers a new way of rendering models, in the form of easily interpretable polynomial equations, explicitly expressing the relationship between the contributing parameters of a complex system and the behaviour of the system. EPR is a recently developed hybrid regression method that provides symbolic expressions for models and works with formulae based on pseudo-polynomial expressions. In this chapter the application of EPR to two important geotechnical and geo-environmental engineering systems is presented. These systems include the thermo-mechanical behaviour of unsaturated soils and the optimisation of the performance of an aquifer system subjected to seawater intrusion. Comparisons are made between the EPR model predictions and the actual measured or synthetic data. The results show that the proposed methodology is able to develop highly accurate models with excellent capability of reflecting the real and expected physical effects of the contributing parameters on the performance of the systems. Merits and advantages of the suggested methodology are highlighted.

Mohammed S. Hussain, Alireza Ahangar-asr, Youliang Chen, Akbar A. Javadi

Chapter 20. Application of GFA-MLR and G/PLS Techniques in QSAR/QSPR Studies with Application in Medicinal Chemistry and Predictive Toxicology

Abstract

Quantitative structure–activity/property/toxicity relationship (QSAR/QSPR/QSTR) models enable predictions of activities, properties and toxicities to be made directly from the chemical structure. Feature selection is an integral part of the development of QSAR/QSPR models and is also covered by the Organisation for Economic Co-operation and Development (OECD) principle of “an unambiguous algorithm” for QSAR model development and validation. Genetic algorithms (GAs), based on Darwin’s theory of natural selection and evolution, have recently been widely used for the selection of descriptors in the development of predictive models for toxicity assessment, virtual screening of hazardous chemicals and the design of drug compounds with therapeutic activity. A GA can handle a huge number of descriptors and generate a population of models competitive with or superior to the results of standard regression analysis. Genetic function approximation (GFA) combines the multivariate adaptive regression splines (MARS) algorithm of Friedman with the genetic algorithm of Holland to evolve a population of equations. GFA calculations are based on three operators: selection, crossover and mutation. Using spline-based terms in the model construction, GFA can either remove outlier compounds or identify a range of effect. GFA followed by multiple linear regression (GFA-MLR) or partial least squares (G/PLS) regression is frequently used by different research groups for the development of predictive QSAR/QSPR models. This chapter presents case studies of the use of GFA-MLR and G/PLS techniques in developing predictive models in medicinal chemistry and predictive toxicology applications.

Partha Pratim Roy, Supratim Ray, Kunal Roy

Chapter 21. Trading Volatility Using Highly Accurate Symbolic Regression

Abstract

Research efforts directed at increasing the accuracy and dependability of Symbolic Regression (SR) have significantly improved its range, accuracy, and dependability. Previous research has also demonstrated the practicability of estimating corporate forward 12 month earnings using advanced symbolic regression. In this paper we combine these prior results and techniques to select a 100 stock semi-passive index portfolio, extracted from the Value Line Timeliness stocks (Value Line), which delivers consistent performance in both bull and bear decades, and we compare its performance to the Standard & Poor’s 100 index.

We intend to produce our 100 stock semi-passive index buy list on a weekly basis using automated forward 12 month EPS (ftmEPS) prediction. This involves the analysis of many securities, with multiple training regressions, each on hundreds of thousands of training examples. In addition, the timeliness issue requires that our analytic tools be strong and thoroughly matured. The 100 stock buy list will be the foundation for a new semi-passive Value Line 100 index fund which should have great appeal to many high net worth clients, enjoy low management costs, and be easily acceptable to the compliance and regulatory authorities.

Valuation of Value Line securities via their forward 12 month price earnings ratio (ftmPE) is a very common securities valuation method in the industry. Obviously the ftmPE valuation depends heavily on the estimate of forward 12 month corporate earnings per share (ftmEPS). Several obvious inputs to the ftmEPS prediction process are the past earnings time series plus one or more analyst predictions.

Valuation via ftmEPS is a necessary but not a sufficient attraction for a semi-passive index fund. So we will introduce the advantages of trading volatility. Our thesis will be that emotional trading patterns tend to make markets less efficient.

The efficient market hypothesis depends upon equal access to information and rational trading patterns. Trading on insider information is illegal in most developed securities markets; however, trading when others are emotional is unregulated. In this paper we will develop a set of factors—all of which incorporate a measure of volatility indicating possible overly emotional trading patterns. The theme of our new semi-passive index fund will be “Buy value from those who are selling in a highly emotional state”.

Michael F. Korns

Tools

Frontmatter

Chapter 22. GPTIPS 2: An Open-Source Software Platform for Symbolic Data Mining

Abstract

GPTIPS is a free, open source MATLAB based software platform for symbolic data mining (SDM). It uses a multigene variant of the biologically inspired machine learning method of genetic programming (MGGP) as the engine that drives the automatic model discovery process. Symbolic data mining is the process of extracting hidden, meaningful relationships from data in the form of symbolic equations. In contrast to other data-mining methods, the structural transparency of the generated predictive equations can give new insights into the physical systems or processes that generated the data. Furthermore, this transparency makes the models very easy to deploy outside of MATLAB.

The rationale behind GPTIPS is to reduce the technical barriers to using, understanding, visualising and deploying GP based symbolic models of data, whilst at the same time remaining highly customisable and delivering robust numerical performance for power users. In this chapter, notable new features of the latest version of the software—GPTIPS 2—are discussed with these aims in mind. Additionally, a simplified variant of the MGGP high level gene crossover mechanism is proposed.

It is demonstrated that the new functionality of GPTIPS 2 (a) facilitates the discovery of compact symbolic relationships from data using multiple approaches, e.g. using novel gene-centric visualisation analysis to mitigate horizontal bloat and reduce complexity in multigene symbolic regression models; (b) provides numerous methods for visualising the properties of symbolic models; (c) emphasises the generation of graphically navigable libraries of models that are optimal in terms of the Pareto trade-off surface of model performance and complexity; and (d) expedites real-world applications by the simple, rapid and robust deployment of symbolic models outside the software environment they were developed in.

Dominic P. Searson
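Point (c) of the Chapter 22 abstract refers to models that are optimal on the Pareto trade-off between performance and complexity. The sketch below shows one way such a set could be picked out. It is written in Python and is not part of GPTIPS (which is MATLAB based). The function name `pareto_front`, the `(name, error, complexity)` tuple layout and the example values are assumptions made for illustration.

```python
def pareto_front(models):
    """Return the models not dominated in (error, complexity).

    Each model is a (name, error, complexity) tuple; lower is better for
    both measures. A model is dominated if another model is at least as
    good on both measures and strictly better on at least one.
    """
    front = []
    for name, error, complexity in models:
        dominated = any(
            (e <= error and c <= complexity) and (e < error or c < complexity)
            for _, e, c in models
        )
        if not dominated:
            front.append((name, error, complexity))
    # Order the front from simplest to most complex model.
    return sorted(front, key=lambda m: m[2])


# Made-up candidate models: (name, prediction error, expression complexity).
candidates = [
    ("m1", 0.10, 40),
    ("m2", 0.12, 15),
    ("m3", 0.30, 5),
    ("m4", 0.15, 30),  # worse than m2 on both measures
    ("m5", 0.30, 9),   # same error as m3 but more complex
]

front = pareto_front(candidates)
print([name for name, _, _ in front])
```

Here `m4` and `m5` drop out because another candidate matches or beats them on both measures. The remaining models each represent a different balance between accuracy and simplicity.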

Chapter 23. eCrash: a Genetic Programming-Based Testing Tool for Object-Oriented Software

Abstract

This paper describes the methodology, architecture and features of the eCrash framework, a Java-based tool which employs Strongly-Typed Genetic Programming to automate the generation of test data for the structural unit testing of Object-Oriented programs. The application of Evolutionary Algorithms to Test Data generation is often referred to as Evolutionary Testing. eCrash implements an Evolutionary Testing strategy developed with three major purposes: improving the level of performance and automation of the Software Testing process; minimising the interference of the tool’s users in the Test Object analysis; and mitigating the impact of users’ decisions on the Test Data generation process.

José Carlos Bregieiro Ribeiro, Ana Filipa Nogueira, Francisco Fernández de Vega, Mário Alberto Zenha-Rela

Title: Handbook of Genetic Programming Applications
Edited by: Amir H. Gandomi, Amir H. Alavi, Conor Ryan
Publisher: Springer International Publishing
Electronic ISBN: 978-3-319-20883-1
Print ISBN: 978-3-319-20882-4
DOI: https://doi.org/10.1007/978-3-319-20883-1
