Skip to main content



Genetic Algorithms

Design of Multithreaded Estimation of Distribution Algorithms

Estimation of Distribution Algorithms (EDAs) use a probabilistic model of promising solutions found so far to obtain new candidate solutions of an optimization problem. This paper focuses on the design of parallel EDAs. More specifically, the paper describes a method for parallel construction of Bayesian networks with local structures in form of decision trees in the Mixed Bayesian Optimization Algorithm. The proposed Multithreaded Mixed Bayesian Optimization Algorithm (MMBOA) is intended for implementation on a cluster of workstations that communicate by Message Passing Interface (MPI). Communication latencies between workstations are eliminated by multithreaded processing, so in each workstation the high-priority model-building thread, which is communication demanding, can be overlapped by low-priority model sampling thread when necessary. High performance of MMBOA is verified via simulation in TRANSIM tool.

Jiri Ocenasek, Josef Schwarz, Martin Pelikan

Reinforcement Learning Estimation of Distribution Algorithm

This paper proposes an algorithm for combinatorial optimizations that uses reinforcement learning and estimation of joint probability distribution of promising solutions to generate a new population of solutions. We call it Reinforcement Learning Estimation of Distribution Algorithm (RELEDA). For the estimation of the joint probability distribution we consider each variable as univariate. Then we update the probability of each variable by applying reinforcement learning method. Though we consider variables independent of one another, the proposed method can solve problems of highly correlated variables. To compare the efficiency of our proposed algorithm with other Estimation of Distribution Algorithms (EDAs) we provide the experimental results of the two problems: four peaks problem and bipolar function.

Topon Kumar Paul, Hitoshi Iba

Hierarchical BOA Solves Ising Spin Glasses and MAXSAT

Theoretical and empirical evidence exists that the hierarchical Bayesian optimization algorithm (hBOA) can solve challenging hierarchical problems and anything easier. This paper applies hBOA to two important classes of real-world problems: Ising spin-glass systems and maximum satisfiability (MAXSAT). The paper shows how easy it is to apply hBOA to real-world optimization problems—in most cases hBOA can be applied without any prior problem analysis, it can acquire enough problem-specific knowledge automatically. The results indicate that hBOA is capable of solving enormously difficult problems that cannot be solved by other optimizers and still provide competitive or better performance than problem-specific approaches on other problems. The results thus confirm that hBOA is a practical, robust, and scalable technique for solving challenging real-world problems.

Martin Pelikan, David E. Goldberg

ERA: An Algorithm for Reducing the Epistasis of SAT Problems

A novel method, for solving satisfiability (SAT) instances is presented. It is based on two components: a) An Epistasis Reducer Algorithm (ERA) that produces a more suited representation (with lower epistasis) for a Genetic Algorithm (GA) by preprocessing the original SAT problem; and b) A Genetic Algorithm that solves the preprocesed instances.ERA is implemented by a simulated annealing algorithm (SA), which transforms the original SAT problem by rearranging the variables to satisfy the condition that the most related ones are in closer positions inside the chromosome.Results of experimentation demonstrated that the proposed combined approach outperforms GA in all the tests accomplished.

Eduardo Rodriguez-Tello, Jose Torres-Jimenez

Learning a Procedure That Can Solve Hard Bin-Packing Problems: A New GA-Based Approach to Hyper-heuristics

The idea underlying hyper-heuristics is to discover some combination of familiar, straightforward heuristics that performs very well across a whole range of problems. To be worthwhile, such a combination should outperform all of the constituent heuristics. In this paper we describe a novel messy-GA-based approach that learns such a heuristic combination for solving one-dimensional bin-packing problems. When applied to a large set of benchmark problems, the learned procedure finds an optimal solution for nearly 80% of them, and for the rest produces an answer very close to optimal. When compared with its own constituent heuristics, it ranks first in 98% of the problems.

Peter Ross, Javier G. Marín-Blázquez, Sonia Schulenburg, Emma Hart

Population Sizing for the Redundant Trivial Voting Mapping

This paper investigates how the use of the trivial voting (TV) mapping influences the performance of genetic algorithms (GAs). The TV mapping is a redundant representation for binary phenotypes. A population sizing model is presented that quantitatively predicts the influence of the TV mapping and variants of this encoding on the performance of GAs. The results indicate that when using this encoding GA performance depends on the influence of the representation on the initial supply of building blocks. Therefore, GA performance remains unchanged if the TV mapping is uniformly redundant that means on average a phenotype is represented by the same number of genotypes. If the optimal solution is overrepresented, GA performance increases, whereas it decreases if the optimal solution is underrepresented. The results show that redundant representations like the TV mapping do not increase GA performance in general. Higher performance can only be achieved if there is specific knowledge about the structure of the optimal solution that can beneficially be used by the redundant representation.

Franz Rothlauf

Non-stationary Function Optimization Using Polygenic Inheritance

Non-stationary function optimization has proved a difficult area for Genetic Algorithms. Standard haploid populations find it difficult to track a moving target, and tend to converge on a local optimum that appears early in a run.It is generally accepted that diploid GAs can cope with these problems because they have a genetic memory, that is, genes that may be required in the future are maintained in the current population. This paper describes a haploid GA that appears to have this property, through the use of Polygenic Inheritance. Polygenic inheritance differs from most implementations of GAs in that several genes contribute to each phenotypic trait.Two non-stationary function optimization problems from the literature are described, and a number of comparisons performed. We show that Polygenic inheritance enjoys all the advantages normally associated with diploid structures, with none of the usual costs, such as complex crossover mechanisms, huge mutation rates or ambiguity in the mapping process.

Conor Ryan, J. J. Collins, David Wallin

Scalability of Selectorecombinative Genetic Algorithms for Problems with Tight Linkage

Ensuring building-block (BB) mixing is critical to the success of genetic and evolutionary algorithms. This study develops facetwise models to predict the BB mixing time and the population sizing dictated by BB mixing for single-point crossover. The population-sizing model suggests that for moderate-to-large problems, BB mixing – instead of BB decision making and BB supply – bounds the population size required to obtain a solution of constant quality. Furthermore, the population sizing for single-point crossover scales as O (2km1.5), where k is the BB size, and m is the number of BBs.

Kumara Sastry, David E. Goldberg

New Entropy-Based Measures of Gene Significance and Epistasis

A new framework to formulate and quantify the epistasis of a problem is proposed. It is based on Shannon’s information theory. With the framework, we suggest three epistasis-related measures: gene significance, gene epistasis, and problem epistasis. The measures are believed to be helpful to investigate both the individual epistasis of a gene group and the overall epistasis that a problem has. The experimental results on various well-known problems support it.

Dong-Il Seo, Yong-Hyuk Kim, Byung Ro Moon

A Survey on Chromosomal Structures and Operators for Exploiting Topological Linkages of Genes

The building block hypothesis implies that the epistatic property of a given problem must be connected well to the linkage property of the employed representation and crossover operator in the design of genetic algorithms. A good handling of building blocks has much to do with topological linkages of genes in the chromosome. In this paper, we provide a taxonomy of the approaches that exploit topological linkages of genes. They are classified into three models: static linkage model, adaptive linkage model, and evolvable linkage model. We also provide an overview on the chromosomal structures, encodings, and operators supporting each of the models.

Dong-Il Seo, Byung-Ro Moon

Cellular Programming and Symmetric Key Cryptography Systems

The problem of designing symmetric key cryptography algorithms based upon cellular automata (CAs) is considered. The reliability of the Vernam cipher used in the process of encryption highly depends on a quality of used random numbers. One dimensional, nonuniform CAs is considered as a generator of pseudorandom number sequences (PNSs). The quality of PNSs highly depends on a set of applied CA rules. To find such rules nonuniform CAs with two types of rules is considered. The search of rules is based on an evolutionary technique called cellular programming (CP). Resulting from the collective behavior of the discovered set of CA rules very high quality PNSs are generated. The quality of PNSs outperform the quality of known one dimensional CA-based PNS generators used in secret key cryptography. The extended set of CA rules which was found makes the cryptography system much more resistant on breaking a cryptography key.

Franciszek Seredyński, Pascal Bouvry, Albert Y. Zomaya

Mating Restriction and Niching Pressure: Results from Agents and Implications for General EC

This paper presents results and observations from the authors’ continuing explorations of EC systems where population members act as autonomous agents that conduct their own, independent evaluation of (and reproduction with) other agents. In particular, we consider diversity preservation in one such agent-based EC system, applied to the multi-peak functions often used to illustrate and evaluate the effects of fitness-sharing-like schemes in GAs. We show how (somewhat surprisingly) mating restriction alone yields stable niching in agent-based EC. This leads to a consideration of niching as a generalized phenomenon, and the introduction of niching pressure as a concept that parallels selective pressure, and which can yield insight. The utility of the niching pressure concept for general EC is explored, and directions for further research are discussed.

R. E. Smith, Claudio Bonacina

EC Theory: A Unified Viewpoint

In this paper we show how recent theoretical developments have led to a more unified framework for understanding different branches of Evolutionary Computation (EC), as well as distinct theoretical formulations, such as the Schema theorem and the Vose model. In particular, we show how transformations of the configuration space of the model — such as coarse-grainings/projections, embeddings and coordinate transformations — link these different, disparate elements and alter the standard taxonomic classification of EC and different EC theories. As an illustration we emphasize one particular coordinate transformation between the standard string basis and the “Building Block” basis.

Christopher R. Stephens, Adolfo Zamora

Real Royal Road Functions for Constant Population Size

Evolutionary and genetic algorithms (EAs and GAs) are quite successful randomized function optimizers. This success is mainly based on the interaction of different operators like selection, mutation, and crossover. Since this interaction is still not well understood, one is interested in the analysis of the single operators. Jansen and Wegener (2001a) have described so-called real royal road functions where simple steady-state GAs have a polynomial expected optimization time while the success probability of mutation-based EAs is exponentially small even after an exponential number of steps. This success of the GA is based on the crossover operator and a population whose size is moderately increasing with the dimension of the search space. Here new real royal road functions are presented where crossover leads to a small optimization time, although the GA works with the smallest possible population size — namely 2.

Tobias Storch, Ingo Wegener

Two Broad Classes of Functions for Which a No Free Lunch Result Does Not Hold

We identify classes of functions for which a No Free Lunch result does and does not hold, with particular emphasis on the relationship between No Free Lunch and problem description length. We show that a NFL result does not apply to a set of functions when the description length of the functions is sufficiently bounded. We consider sets of functions with non-uniform associated probability distributions, and show that a NFL result does not hold if the probabilities are assigned according either to description length or to a Solomonoff- Levin distribution. We close with a discussion of the conditions under which NFL can apply to sets containing an infinite number of functions.

Matthew J. Streeter

Dimensionality Reduction via Genetic Value Clustering

Feature extraction based on evolutionary search offers new possibilities for improving classification accuracy and reducing measurement complexity in many data mining and machine learning applications. We present a family of genetic algorithms for feature synthesis through clustering of discrete attribute values. The approach uses new compact graph-based encoding for cluster representation, where size of GA search space is reduced exponentially with respect to the number of items in partitioning, as compared to original idea of Park and Song. We apply developed algorithms and study their effectiveness for DNA fingerprinting in population genetics and text categorization.

Alexander Topchy, William Punch

The Structure of Evolutionary Exploration: On Crossover, Buildings Blocks, and Estimation-Of-Distribution Algorithms

Correlations between alleles after selection are an important source of information. Such correlations should be exploited for further search and thereby constitute the building blocks of evolutionary exploration. With this background we analyze the structure of the offspring probability distribution, or exploration distribution, for a simple GA with mutation only and a crossover GA and compare them to Estimation-Of- Distribution Algorithms (EDAs). This will allow a precise characterization of the structure of exploration w.r.t. correlations in the search distribution for these algorithms. We find that crossover transforms, depending on the crossover mask, mutual information between loci into entropy. In total, it can only decrease such mutual information. In contrast, the objective of EDAs is to estimate the correlations between loci and exploit this information during exploration. This may lead to an effective increase of mutual information in the exploration distribution, what we define correlated exploration.

Marc Toussaint

The Virtual Gene Genetic Algorithm

This paper presents the virtual gene genetic algorithm (vgGA) which is a generalization of traditional genetic algorithms that use binary linear chromosomes. In the vgGA, traditional one point crossover and mutation are implemented as arithmetic functions over the integers or reals that the chromosome represents. This implementation allows the generalization to virtual chromosomes of alphabets of any cardinality. Also, the sites where crossover and mutation fall can be generalized in the vgGA to values that do not necessarily correspond to positions between bits or digits of another base, thus implementing generalized digits. Preliminary results that indicate that the vgGA outperforms a GA with binary linear chromosomes on integer and real valued problems where the underlying structure is not binary are presented.

Manuel Valenzuela-Rendón

Quad Search and Hybrid Genetic Algorithms

A bit climber using a Gray encoding is guaranteed to converge to a global optimum in fewer than 2(L2) evaluations on unimodal 1-D functions and on multi-dimensional sphere functions, where L bits are used to encode the function domain. Exploiting these ideas, we have constructed an algorithm we call Quad Search. Quad Search converges to a local optimum on unimodal 1-D functions in not more than 2L + 2 function evaluations. For unimodal 1-D and separable multi-dimensional functions, the result is the global optimum. We empirically assess the performance of steepest ascent local search, next ascent local search, and Quad Search. These algorithms are also compared with Evolutionary Strategies. Because of its rapid convergence time, we also use Quad Search to construct a hybrid genetic algorithm. The resulting algorithm is more effective than hybrid genetic algorithms using steepest ascent local search or the RBC next ascent local search algorithm.

Darrell Whitley, Deon Garrett, Jean-Paul Watson

Distance between Populations

Gene space, as it is currently formulated, cannot provide a solid basis for investigating the behavior of the GA. We instead propose an approach that takes population effects into account. Starting from a discussion of diversity, we develop a distance measure between populations and thereby a population metric space. We finally argue that one specific parameterization of this measure is particularly appropriate for use with GAs.

Mark Wineberg, Franz Oppacher

The Underlying Similarity of Diversity Measures Used in Evolutionary Computation

In this paper we compare and analyze the various diversity measures used in the Evolutionary Computation field. While each measure looks quite different from the others in form, we surprisingly found that the same basic method underlies all of them: the distance between all possible pairs of chromosomes/organisms in the population. This is true even of the Shannon entropy of gene frequencies. We then associate the different varieties of EC diversity measures to different diversity measures used in Biology. Finally we give an O(n) implementation for each of the diversity measures (where n is the population size), despite their basis in an O(n2) number of comparisons.

Mark Wineberg, Franz Oppacher

Implicit Parallelism

This paper assumes a search space of fixed-length strings, where the size of the alphabet can vary from position to position. Structural crossover is mask-based crossover, and thus includes n-point and uniform crossover. Structural mutation is mutation that commutes with a group operation on the search space. This paper shows that structural crossover and mutation project naturally onto competing families of schemata. In other words, the effect of crossover and mutation on a set of string positions can be specified by considering only what happens at those positions and ignoring other positions. However, it is not possible to do this for proportional selection except when fitness is constant on each schema of the family. One can write down an equation which includes selection which generalizes the Holland Schema theorem. However, like the Schema theorem, this equation cannot be applied over multiple time steps without keeping track of the frequency of every string in the search space.

Alden H. Wright, Michael D. Vose, Jonathan E. Rowe

Finding Building Blocks through Eigenstructure Adaptation

A fundamental aspect of many evolutionary approaches to synthesis of complex systems is the need to compose atomic elements into useful higher-level building blocks. However, the ability of genetic algorithms to promote useful building blocks is based critically on genetic linkage — the assumption that functionally related alleles are also arranged compactly on the genome. In many practical problems, linkage is not known a priori or may change dynamically. Here we propose that a problem’s Hessian matrix reveals this linkage, and that an eigenstructure analysis of the Hessian provides a transformation of the problem to a space where first-order genetic linkage is optimal. Genetic algorithms that dynamically transforms the problem space can operate much more efficiently. We demonstrate the proposed approach on a real-valued adaptation of Kaufmann’s NK landscapes and discuss methods for extending it to higher-order linkage.

Danica Wyatt, Hod Lipson

A Specialized Island Model and Its Application in Multiobjective Optimization

This paper discusses a new model of parallel evolutionary algorithms (EAs) called the specialized island model (SIM) that can be used to generate a set of diverse non-dominated solutions to multiobjective optimization problems. This model is derived from the island model, in which an EA is divided into several subEAs that exchange individuals among them. In SIM, each subEA is responsible (i.e., specialized) for optimizing a subset of the objective functions in the original problem. The efficacy of SIM is demonstrated using a three-objective optimization problem. Seven scenarios of the model with a different number of subEAs, communication topology, and specialization are tested, and their results are compared. The results suggest that SIM effectively finds non-dominated solutions to multiobjective optimization problems.

Ningchuan Xiao, Marc P. Armstrong

Adaptation of Length in a Nonstationary Environment

In this paper, we examine the behavior of a variable length GA in a nonstationary problem environment. Results indicate that a variable length GA is better able to adapt to changes than a fixed length GA. Closer examination of the evolutionary dynamics reveals that a variable length GA can in fact take advantage of its variable length representation to exploit good quality building blocks after a change in the problem environment.

Han Yu, Annie S. Wu, Kuo-Chi Lin, Guy Schiavone

Optimal Sampling and Speed-Up for Genetic Algorithms on the Sampled OneMax Problem

This paper investigates the optimal sampling and the speedup obtained through sampling for the sampled OneMax problem. Theoretical and experimental analyses are given for three different population-sizing models: the decision-making model, the gambler’s ruin model, and the fixed population-sizing model. The results suggest that, when the desired solution quality is fixed to a high value, the decision-making model prefers a large sampling size, the fixed population-sizing model prefers a small sampling size, and the gambler’s ruin model has no preference for large or small sizes. Among the three population-sizing models, sampling yields speed-up only when the fixed population-sizing model is valid. The results indicate that when the population is sized appropriately, sampling does not yield speed-up for problems with subsolutions of uniform salience.

Tian-Li Yu, David E. Goldberg, Kumara Sastry

Building-Block Identification by Simultaneity Matrix

We propose a BB identification by simultaneity matrix (BISM) algorithm. The input is a set of ℓ-bit solutions denoted by S. The number of solutions is denoted by n = |S|. The output is a partition of bit positions {0, ... , ℓ − 1}. The BISM is composed of Simultaneity-Matrix-Construction and Fine-Valid-Partition algorithms. Algorithm SMC is outlined as follows (a ij denotes the matrix element at row i and column j, Count S ab (i, j) = |{x ∈ {0, . . . , n − 1} : s x [i] = a and s x [j] = b{| for all (i, j) ∈ {0, ... , ℓ−1}2, (a, b) ∈ {0, 1}2, s x [i] denotes the ith bit of xth solution, Random(0,1) gives a real random number between 0 and 1).

Chatchawit Aporntewan, Prabhas Chongstitvatana

A Unified Framework for Metaheuristics

Over the past decades, a multitude of new search heuristics, often called “metaheuristics” have been proposed, many of them inspired by principles observed in nature. What distinguishes them from random search is primarily that they maintain some sort of memory of the information gathered during the search so far, and that they use this information to select the location where the search space should be tested next. Based on this observation, we propose a general unified framework which is depicted in Fig. 1: A memory is used to construct one or more new solutions which are then evaluated and used to update the memory, after which the cycle repeats.

Jürgen Branke, Michael Stein, Hartmut Schmeck

The Hitting Set Problem and Evolutionary Algorithmic Techniques with ad-hoc Viruses (HEAT-V)

The Weighted Minimum Hitting Set Problem (WMHSP) and the standard Minimum Hitting Set Problem (MHSP), are combinatorial problems of great interest for many applications. Although, these problems lend themselves quite naturally to an evolutionary approach. To our knowledge, there are no significant results of evolutionary algorithms applied to either the WMHSP or the MHSP, except for the results contained in Cutello et al., (2002).

Vincenzo Cutello, Francesco Pappalardo

The Spatially-Dispersed Genetic Algorithm

Spatially structured population models improve the performance of genetic algorithms by assisting the selection scheme in maintaining diversity. A significant concern with these systems is that they need to be carefully configured in order to operate at their optimum. Failure to do so can often result in performance that is significantly under that of an equivalent non-spatial implementation. This paper introduces a GA that uses a population structure that requires no additional configuration. Early experimentation with this paradigm indicates that it is able to improve the searching abilities of the genetic algorithm on some problem domains.

Grant Dick

Non-universal Suffrage Selection Operators Favor Population Diversity in Genetic Algorithms

State-of-the-art concept learning systems based on genetic algorithms evolve a redundant population of individuals, where an individual is a partial solution that covers some instances of the learning set. In this context, it is fundamental that the population be diverse and that as many instances as possible be covered. The universal suffrage selection (US) operator is a powerful selection mechanism that addresses these two requirements. In this paper we compare experimentally the US operator with two variants, called Weighted US (WUS) and Exponentially Weighted US (EWUS), of this operator in the system ECL [1].

Federico Divina, Maarten Keijzer, Elena Marchiori

Uniform Crossover Revisited: Maximum Disruption in Real-Coded GAs

A detailed comparison is presented of five well-known crossover operators on a range of continuous optimization tasks to test longheld beliefs concerning disruption, with respect to real-coded GAs. It is suggested that, contrary to traditional assumptions, disruption is useful.

Stephen Drake

The Master-Slave Architecture for Evolutionary Computations Revisited

The recent availability of cheap Beowulf clusters has generated much interest for Parallel and Distributed Evolutionary Computations (PDEC) [1]. Another often neglected source of CPU power for PDEC are networks of PCs, in many case very powerful workstations, that run idle each day for long periods of time. To exploit efficiently both Beowulfs and networks of heterogeneous workstations we argue that the classic master-slave distribution model is superior to the currently more popular island-model [1].

Christian Gagné, Marc Parizeau, Marc Dubreuil

Genetic Algorithms — Posters

Using Adaptive Operators in Genetic Search

In this paper, we provided an extension of our previous work on adaptive genetic algorithm [1]. Each individual encodes the probability (rate) of its genetic operators. In every generation, each individual is modified by only one operator. This operator is selected according to its encoded rates. The rates are updated according to the performance achieved by the offspring (compared to its parents) and a random learning rate. The proposed approach is augmented with a simple transposition operator and tested on a number of benchmark functions.

Jonatan Gómez, Dipankar Dasgupta, Fabio González

A Kernighan-Lin Local Improvement Heuristic That Solves Some Hard Problems in Genetic Algorithms

We present a Kernighan-Lin style local improvement heuristic for genetic algorithms. We analyze the run-time cost of the heuristic. We demonstrate through experiments that the heuristic provides very quick solutions to several problems which have been touted in the literature as especially hard ones for genetic algorithms, such as hierarchical deceptive problems. We suggest why the heuristic works well.

William A. Greene

GA-Hardness Revisited

Ever since the invention of Genetic Algorithms (GAs), researchers have put a lot of efforts into understanding what makes a function or problem instance hard for GAs to optimize. Many measures have been proposed to distinguish so- called GA-hard from GA-easy problems. None of these, however, has yet achieved the goal of being a reliable predictive GA-hardness measure. In this paper, we first present a general, abstract theoretical framework of instance hardness and algorithm performance based on Kolmogorov complexity. We then list several major misconceptions of GA-hardness research in the context of this theory. Finally, we propose some future directions.

Haipeng Guo, William H. Hsu

Barrier Trees For Search Analysis

The development of genetic algorithms has been hampered by the lack of theory describing their behaviour, particularly on complex fitness landscapes. Here we propose a method for visualising and analysing the progress of a search algorithm on some hard optimisation problems using barrier trees. A barrier tree is a representation of the search space as a tree where the leaves of the tree represent local optima and nodes of the tree show the fittest saddle points between subtrees. The depth of the leaves and nodes corresponds to the fitness of the local optima and saddle points. Each configuration in search space can be mapped to a position on the tree depending on which optima are accessible without ever decreasing the fitness below its starting value. Having computed a barrier tree we can easily visualise how any search algorithm explores the search space.

Jonathan Hallam, Adam Prügel-Bennett

A Genetic Algorithm as a Learning Method Based on Geometric Representations

A number of different methods combining the use of neural networks and genetic algorithms have been described [1]. This paper discusses an approach for training neural networks based on the geometric representation of the network. In doing so, the genetic algorithm becomes applicable as a common training method for a number of machine learning algorithms that can be similarly represented. The experiments described here were specifically derived to construct claim regions for Fuzzy ARTMAP Neural Networks [2],[3].

Gregory A. Holifield, Annie S. Wu

Solving Mastermind Using Genetic Algorithms

The MasterMind game involves decoding a secret code. The classic game is a code of six possible colors in four slots. The game has been analyzed and optimal strategies have been posed by computer scientists and mathematicians. In this paper we will survey previous work done on solving MasterMind, including several approaches using Genetic Algorithms. We will also analyze the solution sets and compare our results using a novel scoring system inside a GA against previous work using Genetic and Heuristic algorithms. Our GA is performing closer to optimal then previously published work. The GA we present is a Steady State GA using Fitness Proportional Reproduction (FPR), where the fitness function incorporates a simple heuristic algorithm. We also present a scoring method that is simpler then those used by other researchers. In larger games such as 10 colors and 8 slots our GA clearly outperform the heuristic algorithm. In fact if one wishes to tradeoff a higher average number of guesses to a faster running time, extremely large games such as 12 × 10 can be solved in a reasonable time (i.e. minutes) of run time.

Tom Kalisker, Doug Camens

Evolutionary Multimodal Optimization Revisited

We revisit a class of multimodal function optimizations using evolutionary algorithms reformulated into a multiobjective framework where previous implementations have needed niching/sharing to ensure diversity. In this paper, we use a steady-state multiobjective algorithm which preserves diversity without niching to produce diverse sampling of the Pareto-front with significantly lower computational effort.

Rajeev Kumar, Peter Rockett

Integrated Genetic Algorithm with Hill Climbing for Bandwidth Minimization Problem

In this paper, we propose an integrated Genetic Algorithm with Hill Climbing to solve the matrix bandwidth minimization problem, which is to reduce bandwidth by permuting rows and columns resulting in the nonzero elements residing in a band as close as possible to the diagonal. Experiments show that this approach achieves the best solution quality when compared with the GPS [1] algorithm, Tabu Search [3], and the GRASP with Path Relinking methods [4], while being faster than the latter two newly-developed heuristics.

Andrew Lim, Rodrigues Brian, Fei Xiao

A Fixed-Length Subset Genetic Algorithm for the p-Median Problem

In this paper, we review some classical recombination operations and devise new heuristic recombinations for the fixed-length subset. Our experimental results on the classical p-median problem indicate that our method is superior and very close to the optimal solution.

Andrew Lim, Zhou Xu

Performance Evaluation of a Parameter-Free Genetic Algorithm for Job-Shop Scheduling Problems

The job-shop scheduling problem (JSSP) is well known as one of the most difficult NP-hard combinatorial optimization problems. Genetic Algorithms (GAs) for solving the JSSP have been proposed, and they perform well compared with other approaches [1].

Shouichi Matsui, Isamu Watanabe, Ken-ichi Tokoro

SEPA: Structure Evolution and Parameter Adaptation in Feed-Forward Neural Networks

In developing algorithms that dynamically changes the structure and weights of ANN (Artificial Neural Networks), there must be a proper balance between network complexity and its generalization capability. SEPA addresses these issues using an encoding scheme where network weights and connections are encoded in matrices of real numbers. Network parameters are locally encoded and locally adapted with fitness evaluation consisting mainly of fast feed-forward operations. Experimental results in some well-known classification problems demonstrate SEPA’s high consistency performance in classification, fast convergence, and good optimality of structure.

Paulito P. Palmes, Taichi Hayasaka, Shiro Usui

Real-Coded Genetic Algorithm to Reveal Biological Significant Sites of Remotely Homologous Proteins

Discovering biological importance from protein structures needs to utilize heuristic approaches. Since three-dimensional(3D) protein structures play a crucial role in biological reactions, comparing new proteins with well-studied proteins is vital for understanding the native functions. A few residues forming in geometrically close positions activate such biological functions.

Sung-Joon Park, Masayuki Yamamura

Understanding EA Dynamics via Population Fitness Distributions

This paper introduces a new tool to be used in conjunction with existing ones for a more comprehensive understanding of the behavior of evolutionary algorithms. Several research groups including [1],[3],[4] have shown how deeper insights into EA behavior can be obtained by focusing on the changes to the entire population fitness distribution rather than just ”best-so-far” curves. But characterizing how repeated applications of selection and reproduction modify this distribution over time proved to be very difficult to achieve analytically and was done successfully for only a few very specialized EAs and/or very simple fitness landscapes.

Elena Popovici, Kenneth De Jong

Evolutionary Feature Space Transformation Using Type-Restricted Generators

Data preprocessing, especially in terms of feature selection and generation, is an important issue in data mining and knowledge discovery tasks. Genetic algorithms proved to work well on feature selection problems where the search space produced by the initial feature set already contains the target hypothesis. In cases where this precondition is not fulfilled, one needs to construct new features to adequately extend the search space. As a solution to this representation problem, we introduce a framework combining feature selection and type-restricted feature generation in a wrapper-based approach using a modified canonical genetic algorithm for the feature space transformation and an inductive learner for the evaluation of the constructed feature set.

Oliver Ritthoff, Ralf Klinkenberg

On the Locality of Representations

It is well known that using high-locality representations is important for efficient evolutionary search. This paper discusses how the locality of a representation influences the difficulty of a problem when using mutation-based search approaches. The results show that high-locality representations do not change problem difficulty. In contrast, low-locality representations randomize the search process and make problems that are phenotypically easy for mutation-based search more difficult and phenotypically difficult problems more easy.

Franz Rothlauf

New Subtour-Based Crossover Operator for the TSP

Genetic algorithm (GA) is a very useful method for the global search of large search space and has been applied to various problems. It has two kinds of important search mechanisms, crossover and mutation. Because the performance of GA depends on these operators, a large number of operators have been developed for improving the performance of GA. Especially many researchers have more interested in crossover operator than mutation operator because crossover operator has charge of the responsibility of local search. We only deal with crossover operator.

Sang-Moon Soak, Byung-Ha Ahn

Is a Self-Adaptive Pareto Approach Beneficial for Controlling Embodied Virtual Robots?

A self-adaptive Pareto Evolutionary Multi-objective Optimization (EMO) algorithm is proposed for evolving controllers for a virtually embodied robot. The main contribution of the self-adaptive Pareto approach is its ability to produce controllers with different locomotion capabilities in a single run, therefore reducing the evolutionary computational cost significantly. The aim of this paper is to verify this hypothesis.

Jason Teo, Hussein A. Abbass

A Genetic Algorithm for Energy Efficient Device Scheduling in Real-Time Systems

Power is becoming a critical constraint for designing embedded applications, because the amount of power available to these systems is limited due to limitation of battery life [1]. For this reason energy consumption is an important parameter in evaluating performance of embedded systems.DPM (dynamic power management) has gained considerable attention over the last few years as a way to save energy in device that can be turned on and off by operating system control [2].Some successful designs have been made using this technique [3],[4].

Lirong Tian, Tughrul Arslan

Metropolitan Area Network Design Using GA Based on Hierarchical Linkage Identification

Network design is a difficult combinatorial optimization problem which searches an optimal network configuration that satisfies geographical constraints, user traffic constraints etc. from a huge number of candidates. In solving the problem with GAs, building block (BB) destructions occur frequently which misleads the algorithms to unfavorable local optima, because appropriate encoding only with prior knowledge is not usually an easy task. To overcome this difficulty, linkage identification techniques that identify linkage groups - sets of loci tightly linked to form BBs - must be employed. In addition, some real-world problems, especially large and complex problems, seem to have interactions not only among loci but among linkage groups in a hierarchical manner. Therefore we propose a hierarchical version the LIEM2 [1] (Linkage Identification with Epistasis Measure considering Monotonicity) and apply it to the network design problem.

Miwako Tsuji, Masaharu Munetomo, Kiyoshi Akama

Statistics-Based Adaptive Non-uniform Mutation for Genetic Algorithms

A statistics-based adaptive non-uniform mutation (SANUM) is presented for genetic algorithms (GAs), within which the probability that each gene will subject to mutation is learnt adaptively over time and over the loci. SANUM uses the statistics of the allele distribution in each locus to adaptively adjust the mutation probability of that locus. The experiment results demonstrate that SANUM performs persistently well over a range of typical test problems while the performance of traditional mutation operators with fixed rates greatly depends on the problems. SANUM represents a robust adaptive mutation that needs no advanced knowledge about the problem landscape.

Shengxiang Yang

Genetic Algorithm Design Inspired by Organizational Theory: Pilot Study of a Dependency Structure Matrix Driven Genetic Algorithm

This study proposes a dependency structure matrix driven genetic algorithm (DSMDGA) which utilizes the dependency structure matrix (DSM) clustering to extract building block (BB) information and use the information to accomplish BB-wise crossover. Three cases: tight, loose, and random linkage, are tested on both a DSMDGA and a simple genetic algorithm (SGA). Experiments showed that the DSMDGA is able to correctly identify BBs and outperforms a SGA.

Tian-Li Yu, David E. Goldberg, Ali Yassine, Ying-Ping Chen

Are the “Best” Solutions to a Real Optimization Problem Always Found in the Noninferior Set? Evolutionary Algorithm for Generating Alternatives (EAGA)

Evolutionary algorithms (EAs) continue to offer an effective, powerful, and sometimes exclusive way to search for solutions to real optimization problems. While these algorithms can help solve a complex optimization problem, whether the results represent the “best” choices for making decisions about a solution to a real problem is questionable. In decision-making problems that are ill posed, all objectives may not be defined clearly and therefore not quantitatively captured in the optimization model [1]. The noninferior set of solutions to the optimization model being solved may not necessarily contain the best solution to the actual problem.

Emily M. Zechman, S. Ranji Ranjithan

Population Sizing Based on Landscape Feature

Population size for Evolutionary Algorithms is usually an empirical parameter. We study the population size from aspects of fitness landscapes’ ruggedness and Probably Approximately Correct (PAC) learning theory.

Jian Zhang, Xiaohui Yuan, Bill P. Buckles

Genetic Programming

Structural Emergence with Order Independent Representations

This paper compares two grammar based Evolutionary Automatic Programming methods, Grammatical Evolution (GE) and Chorus. Both systems evolve sequences of derivation rules which can be used to produce computer programs, however, Chorus employs a position independent representation, while GE uses polymorphic codons, the meaning of which depends on the context in which they are used.We consider issues such as the order in which rules appear in individuals, and demonstrate that an order always emerges with Chorus, which is similar to that of GE, but more flexible.The paper also examines the final step of evolution, that is, how perfect individuals are produced, and how they differ from their immediate neighbours.We demonstrate that, although Chorus appears to be more flexible structure-wise, GE tends to produce individuals with a higher neutrality, suggesting that its representation can, in some cases, make finding the perfect solution easier.

R. Muhammad Atif Azad, Conor Ryan

Identifying Structural Mechanisms in Standard Genetic Programming

This paper presents a hypothesis about an undiscovered class of mechanisms that exist in standard GP. Rather than being intentionally designed, these mechanisms would be an unintended consequence of using trees as information structures. A model is described that predicts outcomes in GP that would arise solely from such mechanisms. Comparisons with empirical results from GP lend support to the existence of these mechanisms.

Jason M. Daida, Adam M. Hilss

Visualizing Tree Structures in Genetic Programming

This paper presents methods to visualize the structure of trees that occur in genetic programming. These methods allow for the inspection of structure of entire trees of arbitrary size. The methods also scale to allow for the inspection of structure for an entire population. Examples are given from a typical problem. The examples indicate further studies that might be enabled by visualizing structure at these scales.

Jason M. Daida, Adam M. Hilss, David J. Ward, Stephen L. Long

What Makes a Problem GP-Hard? Validating a Hypothesis of Structural Causes

This paper provides an empirical test of a hypothesis, which describes the effects of structural mechanisms in genetic programming. In doing so, the paper offers a test problem anticipated by this hypothesis. The problem is tunably difficult, but has this property because tuning is accomplished through changes in structure. Content is not involved in tuning. The results support a prediction of the hypothesis — that GP search space is significantly constrained as an outcome of structural mechanisms.

Jason M. Daida, Hsiaolei Li, Ricky Tang, Adam M. Hilss

Generative Representations for Evolving Families of Designs

Since typical evolutionary design systems encode only a single artifact with each individual, each time the objective changes a new set of individuals must be evolved. When this objective varies in a way that can be parameterized, a more general method is to use a representation in which a single individual encodes an entire class of artifacts. In addition to saving time by preventing the need for multiple evolutionary runs, the evolution of parameter-controlled designs can create families of artifacts with the same style and a reuse of parts between members of the family. In this paper an evolutionary design system is described which uses a generative representation to encode families of designs. Because a generative representation is an algorithmic encoding of a design, its input parameters are a way to control aspects of the design it generates. By evaluating individuals multiple times with different input parameters the evolutionary design system creates individuals in which the input parameter controls specific aspects of a design. This system is demonstrated on two design substrates: neural-networks which solve the 3/5/7-parity problem and three-dimensional tables of varying heights.

Gregory S. Hornby

Evolutionary Computation Method for Promoter Site Prediction in DNA

This paper develops an evolutionary method that learns inductively to recognize the makeup and the position of very short consensus sequences, which are a typical feature of promoters in eukaryotic genomes. This class of method can be used to discover candidate promoter sequences in primary sequence data. If further developed, it has the potential to discover genes which are regulated together.

Daniel Howard, Karl Benson

Convergence of Program Fitness Landscapes

Point mutation has no effect on almost all linear programs. In two genetic programming (GP) computers (cyclic and bit flip) we calculate the fitness evaluations needed using steepest ascent and first ascent hill climbers and evolutionary search. We describe how the average fitness landscape scales with program length and give general bounds.

W. B. Langdon

Multi-agent Learning of Heterogeneous Robots by Evolutionary Subsumption

Many multi-robot systems are heterogeneous cooperative systems, systems consisting of different species of robots cooperating with each other to achieve a common goal. This paper presents the emergence of cooperative behaviors of heterogeneous robots by means of GP. Since directly using GP to generate a controller for complex behaviors is inefficient and intractable, especially in the domain of multi-robot systems, we propose an approach called Evolutionary Subsumption, which applies GP to subsumption architecture. We test our approach in an “eye”-“hand” cooperation problem. By comparing our approach with direct GP and artificial neural network (ANN) approaches, our experimental results show that ours is more efficient in emergence of complex behaviors.

Hongwei Liu, Hitoshi Iba

Population Implosion in Genetic Programming

With the exception of a small body of adaptive-parameter literature, evolutionary computation has traditionally favored keeping the population size constant through the course of the run. Unfortunately, genetic programming has an aging problem: for various reasons, late in the run the technique become less effective at optimization. Given a fixed number of evaluations, allocating many of them late in the run may thus not be a good strategy. In this paper we experiment with gradually decreasing the population size throughout a genetic programming run, in order to reallocate more evaluations to early generations. Our results show that over four problem domains and three different numbers of evaluations, decreasing the population size is always as good as, and frequently better than, various fixed-sized population strategies.

Sean Luke, Gabriel Catalin Balan, Liviu Panait

Methods for Evolving Robust Programs

Many evolutionary computation search spaces require fitness assessment through the sampling of and generalization over a large set of possible cases as input. Such spaces seem particularly apropos to Genetic Programming, which notionally searches for computer algorithms and functions. Most existing research in this area uses ad-hoc approaches to the sampling task, guided more by intuition than understanding. In this initial investigation, we compare six approaches to sampling large training case sets in the context of genetic programming representations. These approaches include fixed and random samples, and adaptive methods such as coevolution or fitness sharing. Our results suggest that certain domain features may lead to the preference of one approach to generalization over others. In particular, coevolution methods are strongly domain-dependent. We conclude the paper with suggestions for further investigations to shed more light onto how one might adjust fitness assessment to make various methods more effective.

Liviu Panait, Sean Luke

On the Avoidance of Fruitless Wraps in Grammatical Evolution

Grammatical Evolution (GE) is an evolutionary system that employs variable length linear chromosomes to represent computer programs. GE uses the individuals to produce derivation trees that adhere to a Backus Naur Form grammar, which are then mapped onto a program. One unusual characteristic of the system is the manner in which chromosomes can be “wrapped”, that is, if an individual has used up all of its genes before a program is completely mapped, the chromosome is reread. While this doesn’t guarantee that an individual will map, prior work suggested that wrapping is beneficial for the system, both in terms of increased success rates and a reduced number of invalid individuals. However, there has been no research into the number of times an individual should be wrapped before the system gives up, and an arbitrary upper limit is usually chosen.This paper discusses the different types of grammars that could be used with this system, and indicates the circumstances under which individuals will fail. It then presents a heuristic to minimize the number of wraps that have to be made before the system can determine that an individual will fail. It is shown that this can drastically reduce the amount of wrapping on a pathologically difficult problem, as well as on two classes of grammar often used by the system.

Conor Ryan, Maarten Keijzer, Miguel Nicolau

Dense and Switched Modular Primitives for Bond Graph Model Design

This paper suggests dense and switched modular primitives for a bond-graph-based GP design framework that automatically synthesizes designs for multi-domain, lumped parameter dynamic systems. A set of primitives is sought that will avoid redundant junctions and elements, based on preassembling useful functional blocks of bond graph elements and (optionally) using a switched choice mechanism for inclusion of some elements. Motivation for using these primitives is to improve performance through greater search efficiency and thereby to reduce computational effort. As a proof of concept for this approach, an eigenvalue assignment problem, which is to find bond graph models exhibiting minimal distance errors from target sets of eigenvalues, was tested and showed improved performance for various sets of eigenvalues.

Kisung Seo, Zhun Fan, Jianjun Hu, Erik D. Goodman, Ronald C. Rosenberg

Dynamic Maximum Tree Depth

A Simple Technique for Avoiding Bloat in Tree-Based GP

We present a technique, designated as dynamic maximum tree depth, for avoiding excessive growth of tree-based GP individuals during the evolutionary process. This technique introduces a dynamic tree depth limit, very similar to the Koza-style strict limit except in two aspects: it is initially set with a low value; it is increased when needed to accommodate an individual that is deeper than the limit but is better than any other individual found during the run. The results show that the dynamic maximum tree depth technique efficiently avoids the growth of trees beyond the necessary size to solve the problem, maintaining the ability to find individuals with good fitness values. When compared to lexicographic parsimony pressure, dynamic maximum tree depth proves to be significantly superior. When both techniques are coupled, the results are even better.

Sara Silva, Jonas Almeida

Difficulty of Unimodal and Multimodal Landscapes in Genetic Programming

This paper presents an original study of fitness distance correlation as a measure of problem difficulty in genetic programming. A new definition of distance, called structural distance, is used and suitable mutation operators for the program space are defined. The difficulty is studied for a number of problems, including, for the first time in GP, multimodal ones, both for the new hand-tailored mutation operators and standard crossover. Results are in agreement with empirical observations, thus confirming that fitness distance correlation can be considered a reasonable index of difficulty for genetic programming, at least for the set of problems studied here.

Leonardo Vanneschi, Marco Tomassini, Manuel Clergue, Philippe Collard

Genetic Programming — Posters

Ramped Half-n-Half Initialisation Bias in GP

Tree initialisation techniques for genetic programming (GP) are examined in [4],[3], highlighting a bias in the standard implementation of the initialisation method Ramped Half-n-Half (RHH) [1]. GP trees typically evolve to random shapes, even when populations were initially full or minimal trees [2]. In canonical GP, unbalanced and sparse trees increase the probability that bigger subtrees are selected for recombination, ensuring code growth occurs faster and that subtree crossover will have more difficultly in producing trees within specified depth limits. The ability to evolve tree shapes which allow more legal crossover operations, by providing more possible crossover points (by being bushier), and control code growth is critical. The GP community often uses RHH [4]. The standard implementation of the RHH method selects either the grow or full method with 0.5 probability to produce a tree. If the tree is already in the initial population it is discarded and another is created by grow or full. As duplicates are typically not allowed, this standard implementation of RHH favours full over grow and possibly biases the evolutionary process.

Edmund Burke, Steven Gustafson, Graham Kendall

Improving Evolvability of Genetic Parallel Programming Using Dynamic Sample Weighting

This paper investigates the sample weighting effect on Genetic Parallel Programming (GPP) that evolves parallel programs to solve the training samples captured directly from a real-world system. The distribution of these samples can be extremely biased. Standard GPP assigns equal weights to all samples. It slows down evolution because crowded regions of samples dominate the fitness evaluation and cause premature convergence. This paper compares the performance of four sample weighting (SW) methods, namely, Equal SW (ESW), Class-equal SW (CSW), Static SW (SSW) and Dynamic SW (DSW) on five training sets. Experimental results show that DSW is superior in performance on tested problems.

Sin Man Cheang, Kin Hong Lee, Kwong Sak Leung

Enhancing the Performance of GP Using an Ancestry-Based Mate Selection Scheme

The performance of genetic programming relies mostly on population-contained variation. If the population diversity is low then there will be a greater chance of the algorithm being unable to find the global optimum. We present a new method of approximating the genetic similarity between two individuals using ancestry information. We define a new diversity-preserving selection scheme, based on standard tournament selection, which encourages genetically dissimilar individuals to undergo genetic operation. The new method is illustrated by assessing its performance in a well-known problem domain: algebraic symbolic regression.

Rodney Fry, Andy Tyrrell

A General Approach to Automatic Programming Using Occam’s Razor, Compression, and Self-Inspection

Extended Abstract

This paper describes a novel general method for automatic programming which can be seen as a generalization of techniques such as genetic programming and ADATE. The approach builds on the assumption that data compression can be used as a metaphor for cognition and intelligence. The proof-of-concept system is evaluated on sequence prediction problems. As a starting point, the process of inferring a general law from a data set is viewed as an attempt to compress the observed data. From an artificial intelligence point of view, compression is a useful way of measuring how deeply the observed data is understood. If the sequence contains redundancy it exists a shorter description i.e. the sequence can be compressed.

Peter Galos, Peter Nordin, Joel Olsén, Kristofer Sundén Ringnér

Building Decision Tree Software Quality Classification Models Using Genetic Programming

Predicting the quality of software modules prior to testing or system operations allows a focused software quality improvement endeavor. Decision trees are very attractive for classification problems, because of their comprehensibility and white box modeling features. However, optimizing the classification accuracy and the tree size is a difficult problem, and to our knowledge very few studies have addressed the issue. This paper presents an automated and simplified genetic programming (GP) based decision tree modeling technique for calibrating software quality classification models. The proposed technique is based on multi-objective optimization using strongly typed GP. Two fitness functions are used to optimize the classification accuracy and tree size of the classification models calibrated for a real-world high-assurance software system. The performances of the classification models are compared with those obtained by standard GP. It is shown that the gp-based decision tree technique yielded better classification models.

Yi Liu, Taghi M. Khoshgoftaar

Evolving Petri Nets with a Genetic Algorithm

In evolutionary computation many different representations (“genomes”) have been suggested as the underlying data structures, upon which the genetic operators act. Among the most prominent examples are the evolution of binary strings, real-valued vectors, permutations, finite automata, and parse trees. In this paper the use of place-transition nets, a low-level Petri net (PN) class [1],[2], as the structures that undergo evolution is examined. We call this approach “Petri Net Evolution” (PNE). Structurally, Petri nets can be considered as specialized bipartite graphs. In their extended version (adding inhibitor arcs) PNs are as powerful as Turing machines. PNE is therefore a form of Genetic Programming (GP). Preliminary results obtained by evolving variable-size place-transition nets show the success of this approach when applied to the problem areas of boolean function learning and classification.

Holger Mauch

Diversity in Multipopulation Genetic Programming

In the past few years, we have done a systematic experimental investigation of the behavior of multipopulation GP [2] and we have empirically observed that distributing the individuals among several loosely connected islands allows not only to save computation time, due to the fact that the system runs on multiple machines, but also to find better solution quality. These results have often been attributed to better diversity maintenance due to the periodic migration of groups of “good” individuals among the subpopulations. We also believe that this might be the case and we study the evolution of diversity in multi-island GP. All the diversity measures that we use in this paper are based on the concept of entropy of a population P, defined as H(P) = -∑ j=1 NF j log(F j ). If we are considering phenotypic diversity, we define F j as the fraction n j /N of individuals in P having a certain fitness j, where N is the total number of fitness values in P. In this case, the entropy measure will be indicated as H p (P) or simply H p . To define genotypic diversity, we use two different techniques. The first one consists in partitioning individuals in such a way that only identical individuals belong to the same group. In this case, we have considered F j as the fraction of trees in the population P having a certain genotype j, where N is the total number of genotypes in P and the entropy measure will be indicated as H G (P) or simply H G . The second technique consists in defining a distance measure, able to quantify the genotypic diversity between two trees. In this case, F j is the fraction of individuals having a given distance j from a fixed tree (called origin), where N is the total number of distance values from the origin appearing in P and the entropy measure will be indicated as H g (P) or simply H g . The tree distance used is Ekárt's and Németh's definition [1].

Marco Tomassini, Leonardo Vanneschi, Francisco Fernández, Germán Galeano

An Encoding Scheme for Generating λ-Expressions in Genetic Programming

To apply genetic programming (GP) to evolve λ-expressions, we devised an encoding scheme that encodes λ-expressions into trees. This encoding has closure property, i.e., any combination of terminal and non-terminal symbols forms a valid λ-expression. We applied this encoding to a simple symbolic regression problem over Church numerals and the objective function f(x) = 2x was successfully obtained. This encoding scheme will provide a good foothold for exploring fundamental properties of GP by making use of lambda-calculus.

Kazuto Tominaga, Tomoya Suzuki, Kazuhiro Oka

AVICE: Evolving Avatar’s Movernent

In recent years, the widespread penetration of the Internet has led to the use of diverse expressions over the Web. Among them, many appear to have strong video elements. However, few expressions are based on human beings, with whom we are most familiar. This is deemed to be attributable to the fact that it is not easy to generate human movements. This is attributable to the fact that the creation of movements by hand requires intuition and technology.With this in mind, this study proposes a system that supports the creation of 3D avatars’ movements based on interactive evolutionary computation, as a system that facilitates creativity and enables ordinary Users with no special skills to generate dynamic animations easily, and as a method of movement description.

Hiromi Wakaki, Hitoshi Iba

Learning Classifier Systems

Evolving Multiple Discretizations with Adaptive Intervals for a Pittsburgh Rule-Based Learning Classifier System

One of the ways to solve classification problems with real-value attributes using a Learning Classifier System is the use of a discretization algorithm, which enables traditional discrete knowledge representations to solve these problems. A good discretization should balance losing the minimum of information and having a reasonable number of cut points. Choosing a single discretization that achieves this balance across several domains is not easy. This paper proposes a knowledge representation that uses several discretization (both uniform and non-uniform ones) at the same time, choosing the correct method for each problem and attribute through the iterations. Also, the intervals proposed by each discretization can split and merge among them along the evolutionary process, reducing the search space where possible and expanding it where necessary. The knowledge representation is tested across several domains which represent a broad range of possibilities.

Jaume Bacardit, Josep Maria Garrell

Limits in Long Path Learning with XCS

The development of the XCS Learning Classifier System [26] has produced a stable implementation, able to consistently identify the accurate and optimally general population of classifiers mapping a given reward landscape [15],[16],[29]. XCS is particularly powerful within direct-reward environments, and notably within problems suitable for commercial application [3]. The application of XCS within delayed reward environments has also shown promise, although early investigations were within enviroments with a comparatively short delay to reward (e.g. [28],[19]). Subsequent systematic investigation [19],[20],[1],[2] have suggested that XCS has difficulty accurately mapping and exploiting even simple environments with moderate reward delays. This paper summarises these results and presents new results that identify some limits and their implications. A modification to the error computation within XCS is introduced that allows the minimum error parameter to be applied relative to the magnitude of the payoff to each classifier. First results demonstrate that this modification enables XCS to successfully map longer delayed-reward enviroments.

Alwyn Barry

Bounding the Population Size in XCS to Ensure Reproductive Opportunities

Despite several recent successful comparisons and applications of the accuracy-based learning classifier system XCS, it is hardly understood how crucial parameters should be set in XCS nor how XCS can be expect to scale up in larger problems. Previous research identified a covering challenge in XCS that needs to be obeyed to ensure that the genetic learning process takes place. Furthermore, a schema challenge was identified that, once obeyed, ensures the existence of accurate classifiers. This paper departs from these challenges deriving a reproductive. opportunity bound. The bound assures that more accurate classifiers get a chance for reproduction. The relation to the previous bounds as well as to the specificity pressure in XCS are discussed as well. The derived bound shows that XCS scales in a machine learning competitive way.

Martin V. Butz, David E. Goldberg

Tournament Selection: Stable Fitness Pressure in XCS

Although it is known from GA literature that proportionate selection is subject to many pitfalls, the LCS community somewhat adhered to proportionate selection. Also in the accuracy-based learning classifier system XCS, introduced by Wilson in 1995, proportionate selection is used. This paper identifies problem properties in which performance of proportionate selection is impaired. Consequently, tournament selection is introduced which makes XCS more parameter independent, noise independent, and more efficient in exploiting fitness guidance.

Martin V. Butz, Kumara Sastry, David E. Goldberg

Improving Performance in Size-Constrained Extended Classifier Systems

Extended Classifier Systems, or XCS, have been shown to be successful at developing accurate, complete and compact mappings of a problem’s payoff landscape. However, the experimental results presented in the literature frequently utilize population sizes significantly larger than the size of the search space. This resource requirement may limit the range of problem/hardware combinations to which XCS can be applied. In this paper two sets of modifications are presented that are shown to improve performance in small sizeconstrained 6-Multiplexer and Woods-2 problems.

Devon Dawson

Designing Efficient Exploration with MACS: Modules and Function Approximation

MACS (Modular Anticipatory Classifier System) is a new Anticipatory Classifier System. With respect to its predecessors, ACS, ACS2 and YACS, the latent learning process in MACS is able to take advantage of new regularities. Instead of anticipating all attributes of the perceived situations in the same classifier, MACS only anticipates one attribute per classifier. In this paper we describe how the model of the environment represented by the classifiers can be used to perform active exploration and how this exploration policy is aggregated with the exploitation policy. The architecture is validated experimentally. Then we draw more general principles from the architectural choices giving rise to MACS. We show that building a model of the environment can be seen as a function approximation problem which can be solved with Anticipatory Classifier Systems such as MACS, but also with accuracy-based systems like XCS or XCSF, organized into a Dyna architecture.

Pierre Gérard, Olivier Sigaud

Estimating Classifier Generalization and Action’s Effect: A Minimalist Approach

We present an online technique to estimate the generality of classifiers conditions. We show that this technique can be extended to gather some basic information about the effect of classifier actions in the environment. The approach we present is minimalist in that it is aimed at obtaining as much information as possible from online experience, with as few modifications as possible to the classifier structure. Because of its plainness, the method we propose can be applied virtually to any classifier system model.

Pier Luca Lanzi

Towards Building Block Propagation in XCS: A Negative Result and Its Implications

The accuracy-based classifier system XCS is currently the most successful learning classifier system. Several recent studies showed that XCS can produce machine-learning competitive results. Nonetheless, until now the evolutionary mechanisms in XCS remained somewhat ill-understood. This study investigates the selectorecombinative capabilities of the current XCS system. We reveal the accuracy dependence of XCS’s evolutionary algorithm and identify a fundamental limitation of the accuracy-based fitness approach in certain problems. Implications and future research directions conclude the paper.

Kurian K. Tharakunnel, Martin V. Butz, David E. Goldberg

Learning Classifier Systems — Posters

Data Classification Using Genetic Parallel Programming

A novel Linear Genetic Programming (LGP) paradigm called Genetic Parallel Programming (GPP) has been proposed to evolve parallel programs based on a Multi-ALU Processor. It is found that GPP can evolve parallel programs for Data Classification problems. In this paper, five binary-class UCI Machine Learning Repository databases are used to test the effectiveness of the proposed GPP-classifier. The main advantages of employing GPP for data classification are: 1) speeding up evolutionary process by parallel hardware fitness evaluation; and 2) discovering parallel algorithms automatically. Experimental results show that the GPP-classifier evolves simple classification programs with good generalization performance. The accuracies of these evolved classifiers are comparable to other existing classification algorithms.

Sin Man Cheang, Kin Hong Lee, Kwong Sak Leung

Dynamic Strategies in a Real-Time Strategy Game

Most modern real-time strategy computer games have a sophisticated but fixed ‘AI’ component that controls the computer’s actions. Once the user has learned how such a game will react, the game quickly loses its appeal. This paper describes an example of how a learning classifier system (based on Wilson’s ZCS [1]) can be used to equip the computer with dynamically-changing strategies that respond to the user’s strategies, thus greatly extending the games playability for serious gamers.

William Joseph Falke, Peter Ross

Using Raw Accuracy to Estimate Classifier Fitness in XCS

In XCS classifier fitness is based on the relative accuracy of the classifier prediction [3]. A classifier is more fit if its prediction of the expected payoff is more accurate than the prediction given by the other classifiers that are applied in the same situations. The use of relative accuracy has two major implications. First, because the evaluation of fitness is based on the relevance that classifiers have in some situations, classifiers that are the only ones applying in a certain situation have a high fitness, even if they are inaccurate. As a consequence, inaccurate classifiers might be able to reproduce so to cause reduced performance; as already noted by Wilson (personal communication reported in [1]). In addition, because the computation of classifier fitness is based both (i) on the classifier accuracy and (ii) on the classifier relevance in situations in which it applies, in XCS, classifier fitness does not provide information about the problem solution, but rather an indication of the classifier relevance in the encountered situations. Accordingly, it is not generally possible to tell whether a classifier with a high fitness is accurate or not, just looking at the fitness. To have this kind of information, we need the prediction error ε which provides an indication of the raw classifier accuracy.

Pier Luca Lanzi

Towards Learning Classifier Systems for Continuous-Valued Online Environments

Previous work has studied the use of interval representations in XCS to allow its use in continuous-valued environments. Here we compare the speed of learning of continuous-valued versions of ZCS and XCS with a simple model of an online environment.

Christopher Stone, Larry Bull

Real World Applications

Artificial Immune System for Classification of Gene Expression Data

DNA microarray experiments generate thousands of gene expression measurement simultaneously. Analyzing the difference of gene expression in cell and tissue samples is useful in diagnosis of disease. This paper presents an Artificial Immune System for classifying microarray-monitored data. The system evolutionarily selects important features and optimizes their weights to derive classification rules. This system was applied to two datasets of cancerous cells and tissues. The primary result found few classification rules which correctly classified all the test samples and gave some interesting implications for feature selection.

Shin Ando, Hitoshi Iba

Automatic Design Synthesis and Optimization of Component-Based Systems by Evolutionary Algorithms

A novel approach for automatic design synthesis and optimization using evolutionary algorithms (EA) is introduced in the paper. The approach applies to component-based systems in general and is demonstrated with a heating, ventilating and air-conditioning (HVAC) systems problem. The whole process of the system design, including the initial stages that usually entail significant human involvement, is treated as a constraint satisfaction problem. The formulation of the optimization process realizes the complex nature of the design problem using different types of variables (real and integer) that represent both the physical and the topological properties of the system; the objective is to design a feasible and efficient system. New evolutionary operators tailored to the component-based, spatially distributed system design problem have been developed. The process of design has been fully automated. Interactive supervision of the optimization process by a human-designer is possible using a specialized GUI. An example of automatic design of HVAC system for two-zone buildings is presented.

P. P. Angelov, Y. Zhang, J. A. Wright, V. I. Hanby, R. A. Buswell

Studying the Advantages of a Messy Evolutionary Algorithm for Natural Language Tagging

The process of labeling each word in a sentence with one of its lexical categories (noun, verb, etc) is called tagging and is a key step in parsing and many other language processing and generation applications. Automatic lexical taggers are usually based on statistical methods, such as Hidden Markov Models, which works with information extracted from large tagged available corpora. This information consists of the frequencies of the contexts of the words, that is, of the sequence of their neighbouring tags. Thus, these methods rely on the assumption that the tag of a word only depends on its surrounding tags. This work proposes the use of a Messy Evolutionary Algorithm to investigate the validity of this assumption. This algorithm is an extension of the fast messy genetic algorithms, a variety of Genetic Algorithms that improve the survival of high quality partial solutions or building blocks. Messy GAs do not require all genes to be present in the chromosomes and they may also appear more than one time. This allows us to study the kind of building blocks that arise, thus obtaining information of possible relationships between the tag of a word and other tags corresponding to any position in the sentence. The paper describes the design of a messy evolutionary algorithm for the tagging problem and a number of experiments on the performance of the system and the parameters of the algorithm.

Lourdes Araujo

Optimal Elevator Group Control by Evolution Strategies

Efficient elevator group control is important for the operation of large buildings. Recent developments in this field include the use of fuzzy logic and neural networks. This paper summarizes the development of an evolution strategy (ES) that is capable of optimizing the neuro-controller of an elevator group controller. It extends the results that were based on a simplified elevator group controller simulator. A threshold selection technique is presented as a method to cope with noisy fitness function values during the optimization run. Experimental design techniques are used to analyze first experimental results.

Thomas Beielstein, Claus-Peter Ewald, Sandor Markon

A Methodology for Combining Symbolic Regression and Design of Experiments to Improve Empirical Model Building

A novel methodology for empirical model building using GP-generated symbolic regression in combination with statistical design of experiments as well as undesigned data is proposed. The main advantage of this methodology is the maximum data utilization when extrapolation is necessary. The methodology offers alternative non-linear models that can either linearize the response in the presence of Lack or Fit or challenge and confirm the results from the linear regression in a cost effective and time efficient fashion. The economic benefit is the reduced number of additional experiments in the presence of Lack of Fit.

Flor Castillo, Kenric Marshall, James Green, Arthur Kordon

The General Yard Allocation Problem

The General Yard Allocation Problem (GYAP) is a resource allocation problem faced by the Port of Singapore Authority. Here, space allocation for cargo is minimized for all incoming requests for space required in the yard within time intervals. The GYAP is NP-hard for which we propose several heuristic algorithms, including Tabu Search, Simulated Annealing, Genetic Algorithms and the recently emerged “Squeaky Wheel” Optimization (SWO). Extensive experiments give solutions to the problem while comparisons among approaches developed show that the Genetic Algorithm method gives best results.

Ping Chen, Zhaohui Fu, Andrew Lim, Brian Rodrigues

Connection Network and Optimization of Interest Metric for One-to-One Marketing

With the explosive growth of data in electronic commerce, rule finding becomes a crucial part in marketing. In this paper, we discuss the essential limitations of the existing metrics to quantify the interests of rules, and present the need of optimizing the interest metric. We describe the construction of the connection network that represents the relationships between items and propose a natural marketing model using the network. Although simple interest metrics were used, the connection network model showed stable performance in the experiment with field data. By constructing the network based on the optimized interest metric, the performance of the model was significantly improved.

Sung-Soon Choi, Byung-Ro Moon

Parameter Optimization by a Genetic Algorithm for a Pitch Tracking System

The emergence of multimedia data in databases requires adequate methods for information retrieval. In a music data retrieval system by humming, the first stage is to extract exact pitch periods from a flow of signals. Due to the complexity of speech signals, it is difficult to make a robust and practical pitch tracking system. We adopt genetic algorithm in optimizing the control parameters for note segmentation and pitch determination. We applied the results to HumSearch, a commercialized product, as a pitch tracking engine. Experimental results showed that the proposed engine notably improved the performance of the existing engine in HumSearch.

Yoon-Seok Choi, Byung-Ro Moon

Secret Agents Leave Big Footprints: How to Plant a Cryptographic Trapdoor, and Why You Might Not Get Away with It

This paper investigates whether optimisation techniques can be used to evolve artifacts of cryptographic significance which are apparently secure, but which have hidden properties that may facilitate cryptanalysis. We show how this might be done and how such sneaky use of optimisation may be detected.

John A. Clark, Jeremy L. Jacob, Susan Stepney

GenTree: An Interactive Genetic Algorithms System for Designing 3D Polygonal Tree Models

The creation of individual 3D models to include within a virtual world can be a time-consuming process. The standard approach to streamline this is to use procedural modeling tools, where the user adjusts a set of parameters that defines the tree. We have designed Gen- Tree, an interactive system that uses a genetic algorithms (GA) approach to evolve procedural 3D tree models. GenTree is a hybrid system, combining the standard parameter adjustment and an interactive GA. The parameters may be changed by the user directly or via the GA process, which combines parameters from pairs of parent trees into those that describe novel trees. The GA component enables the system to be used by someone who is ignorant of the parameters that define the trees. Furthermore, combining the standard interactive design process with GA design decreases the time and patience required to design realistic 3D polygonal trees by either method alone.

Clare Bates Congdon, Raymond H. Mazza

Optimisation of Reaction Mechanisms for Aviation Fuels Using a Multi-objective Genetic Algorithm

In this study a multi-objective genetic algorithm approach is developed for determining new reaction rate parameters for the combustion of kerosene/air mixtures. The multi-objective structure of the genetic algorithm employed allows for the incorporation of both perfectly stirred reactor and laminar premixed flame data into the inversion process, thus producing more efficient reaction mechanisms.

Lionel Elliott, Derek B. Ingham, Adrian G. Kyne, Nicolae S. Mera, Mohamed Pourkashanian, Chritopher W. Wilson

System-Level Synthesis of MEMS via Genetic Programming and Bond Graphs

Initial results have been achieved for automatic synthesis of MEMS system-level lumped parameter models using genetic programming and bond graphs. This paper first discusses the necessity of narrowing the problem of MEMS synthesis into a certain specific application domain, e.g., RF MEM devices. Then the paper briefly introduces the flow of a structured MEMS design process and points out that system-level lumped-parameter model synthesis is the first step of the MEMS synthesis process. Bond graphs can be used to represent a system-level model of a MEM system. As an example, building blocks of RF MEM devices are selected carefully and their bond graph representations are obtained. After a proper and realizable function set to operate on that category of building blocks is defined, genetic programming can evolve both the topologies and parameters of corresponding RF MEM devices to meet predefined design specifications. Adaptive fitness definition is used to better direct the search process of genetic programming. Experimental results demonstrate the feasibility of the approach as a first step of an automated MEMS synthesis process. Some methods to extend the approach are also discussed.

Zhun Fan, Kisung Seo, Jianjun Hu, Ronald C. Rosenberg, Erik D. Goodman

Congressional Districting Using a TSP-Based Genetic Algorithm

The drawing of congressional districts by legislative bodies in the United States creates a great deal of controversy each decade as political parties and special interest groups attempt to divide states into districts beneficial to their candidates. The genetic algorithm presented in this paper attempts to find a set of compact and contiguous congressional districts of approximately equal population. This genetic algorithm utilizes a technique based on an encoding and genetic operators used to solve Traveling Salesman Problems (TSP). This encoding forces near equality of district population and uses the fitness function to promote district contiguity and compactness. A post-processing step further refines district population equality. Results are provided for three states (North Carolina, South Carolina, and Iowa) using 2000 census data.

Sean L. Forman, Yading Yue

Active Guidance for a Finless Rocket Using Neuroevolution

Finless rockets are more efficient than finned designs, but are too unstable to fly unassisted. These rockets require an active guidance system to control their orientation during flight and maintain stability. Because rocket dynamics are highly non-linear, developing such a guidance system can be prohibitively costly, especially for relatively small-scale rockets such as sounding rockets. In this paper, we propose a method for evolving a neural network guidance system using the Enforced SubPopulations (ESP) algorithm. Based on a detailed simulation model, a controller is evolved for a finless version of the Interorbital Systems RSX-2 sounding rocket. The resulting performance is compared to that of an unguided standard full-finned version. Our results show that the evolved active guidance controller can greatly increase the final altitude of the rocket, and that ESP can be an effective method for solving real-world, non-linear control tasks.

Faustino J. Gomez, Risto Miikkulainen

Simultaneous Assembly Planning and Assembly System Design Using Multi-objective Genetic Algorithms

This paper aims to demonstrate the application of multi-objective evolutionary optimization, namely an adaptation of NSGA-II, to simultaneously optimize the assembly sequence plan as well as selection of the type and number of assembly stations for a production shop that produces three different models of wind propelled ventilators. The decision variables, which are the assembly sequences of each product and the machine selection at each assembly station, are encoded in a manner that allows efficient implementation of a repair operator to maintain the feasibility of the offspring. Test runs are conducted for the sample assembly system using a crossover operator tailored for the proposed encoding and some conventional crossover schemes. The results show overall good performance for all schemes with the best performance achieved by the tailored crossover, which illustrates the applicability of multi-objective GA’s. The presented framework proposed is generic to be applicable to other products and assembly systems.

Karim Hamza, Juan F. Reyes-Luna, Kazuhiro Saitou

Multi-FPGA Systems Synthesis by Means of Evolutionary Computation

Multi-FPGA systems (MFS) are used for a great variety of applications, for instance, dynamically re-configurable hardware applications, digital circuit emulation, and numerical computation. There are a great variety of boards for MFS implementation. In this paper a methodology for MFS design is presented. The techniques used are evolutionary programs and they solve all of the design tasks (partitioning placement and routing). Firstly a hybrid compact genetic algorithm solves the partitioning problem and then genetic programming is used to obtain a solution for the two other tasks.

J. I. Hidalgo, F. Fernández, J. Lanchares, J. M. Sánchez, R. Hermida, M. Tomassini, R. Baraglia, R. Perego, O. Garnica

Genetic Algorithm Optimized Feature Transformation — A Comparison with Different Classifiers

When using a Genetic Algorithm (GA) to optimize the feature space of pattern classification problems, the performance improvement is not only determined by the data set used, but also depends on the classifier. This work compares the improvements achieved by GA-optimized feature transformations on several simple classifiers. Some traditional feature transformation techniques, such as Principal Components Analysis (PCA) and Linear Discriminant Analysis (LDA) are also tested to see their effects on the GA optimization. The results based on some real-world data and five benchmark data sets from the UCI repository show that the improvements after GA-optimized feature transformation are in reverse ratio with the original classification rate if the classifier is used alone. It is also shown that performing the PCA and LDA transformations on the feature space prior to the GA optimization improved the final result.

Zhijian Huang, Min Pei, Erik Goodman, Yong Huang, Gaoping Li

Web-Page Color Modification for Barrier-Free Color Vision with Genetic Algorithm

In this paper, we propose a color modification scheme for web-pages described by HTML markup language in order to realize barrier-free color vision on the internet. First, we present an abstracted image model, which describes a color image as a combination of several regions divided with color information, and define some mutual color relations between regions. Next, based on fundamental research on the anomalous color vision, we design some fitness functions to modify colors in a web-page properly and effectively. Then we solve the color modification problem, which contains complex mutual color relations, by using Genetic Algorithm. Experimental results verify that the proposed scheme can make the colors in a web-page more recognizable for anomalous vision users through not only computer simulation but also psychological experiments with them.

Manabu Ichikawa, Kiyoshi Tanaka, Shoji Kondo, Koji Hiroshima, Kazuo Ichikawa, Shoko Tanabe, Kiichiro Fukami

Quantum-Inspired Evolutionary Algorithm-Based Face Verification

Face verification is considered to be the main part of the face detection system. To detect human faces in images, face candidates are extracted and face verification is performed. This paper proposes a new face verification algorithm using Quantum-inspired Evolutionary Algorithm (QEA). The proposed verification system is based on Principal Components Analysis (PCA). Although PCA related algorithms have shown outstanding performance, the problem lies in the selection of eigenvectors. They may not be the optimal ones for representing the face features. Moreover, a threshold value should be selected properly considering the verification rate and false alarm rate. To solve these problems, QEA is employed to find out the optimal distance measure under the predetermined threshold value which distinguishes between face images and non-face images. The proposed verification system is tested on the AR face database and the results are compared with the previous works to show the improvement in performance.

Jun-Su Jang, Kuk-Hyun Han, Jong-Hwan Kim

Minimization of Sonic Boom on Supersonic Aircraft Using an Evolutionary Algorithm

The aerospace community has an increasing interest in developing super sonic transport class vehicles for civil aviation. One of the concerns in such a project is to minimize the sonic boom produced by the aircraft as demonstrated by its ground signature. One approach being considered is to attach a spike/keel on the front of the aircraft to attenuate the magnitude of an aircraft’s ground signature. This paper describes an effort to develop an automatic method for designing the spike/keel area distribution that satisfies constraints on the ground signature of a specified aircraft. In this work a genetic algorithm is used to perform the design optimization. A modified version of Whitham’s theory is used to generate the near field pressure signature. The ground signature is computed with the NFBOOM atmospheric propagation code. Results indicate that genetic algorithms are effective tools for solving the design problem presented.

Charles L. Karr, Rodney Bowersox, Vishnu Singh

Optimizing the Order of Taxon Addition in Phylogenetic Tree Construction Using Genetic Algorithm

Phylogenetics has gained in public favor for the analysis of DNA sequence data as molecular biology has advanced. Among a number of algorithms for phylogenetics, the fastDNAml is considered to have reasonable computational cost and performance. However, it has a defect that its performance is likely to be significantly affected by the order of taxon addition. In this paper, we propose a genetic algorithm for optimizing the order of taxon addition in the fastDNAml. Experimental results show that the fastDNAml with the optimized order of taxon addition constructs more probable evolutionary trees in terms of the maximum likelihood.

Yong-Hyuk Kim, Seung-Kyu Lee, Byung-Ro Moon

Multicriteria Network Design Using Evolutionary Algorithm

In this paper, we revisit a general class of multi-criteria multi-constrained network design problems and attempt to solve, in a novel way, with Evolutionary Algorithms (EAs). A major challenge to solving such problems is to capture possibly all the (representative) equivalent and diverse solutions. In this work, we formulate, without loss of generality, a bi-criteria bi- constrained communication network topological design problem. Two of the primary objectives to be optimized are network delay and cost subject to satisfaction of reliability and flow-constraints. This is a NP-hard problem so we use a hybrid approach (for initialization of the population) along with EA. Furthermore, the two-objective optimal solution front is not known a priori. Therefore, we use a multiobjective EA which produces diverse solution space and monitors convergence; the EA has been demonstrated to work effectively across complex problems of unknown nature. We tested this approach for designing networks of different sizes and found that the approach scales well with larger networks. Results thus obtained are compared with those obtained by two traditional approaches namely, the exhaustive search and branch exchange heuristics.

Rajeev Kumar, Nilanjan Banerjee

Control of a Flexible Manipulator Using a Sliding Mode Controller with Genetic Algorithm Tuned Manipulator Dimension

The tip position control of a single-link flexible manipulator is considered in this paper. The cross-sectional dimension of the manipulator is tuned by genetic algorithms such that the first vibration mode dominates the higher order vibrations. A sliding mode controller of reduced dimension is employed to control the manipulator using the rigid and vibration measurements as feedback. Control effort is shared dynamically between the rigid mode tracking and vibration suppression. Performance and effectiveness of the proposed reduced dimension sliding mode controller in vibration suppression and against payload variations are demonstrated with simulations.

N. M. Kwok, S. Kwong

Daily Stock Prediction Using Neuro-genetic Hybrids

We propose a neuro-genetic daily stock prediction model. Traditional indicators of stock prediction are utilized to produce useful input features of neural networks. The genetic algorithm optimizes the neural networks under a 2D encoding and crossover. To reduce the time in processing mass data, a parallel genetic algorithm was used on a Linux cluster system. It showed notable improvement on the average over the buy-and-hold strategy. We also observed that some companies were more predictable than others.

Yung-Keun Kwon, Byung-Ro Moon

Finding the Optimal Gene Order in Displaying Microarray Data

The rapid advances of genome-scale sequencing have brought out the necessity of developing new data processing techniques for enormous genomic data. Microarrays, for example, can generate such a large number of gene expression data that we usually analyze them with some clustering algorithms. However, the clustering algorithms have been ineffective for visualization in that they are not concerned about the order of genes in each cluster. In this paper, a hybrid genetic algorithm for finding the optimal order of microarray data, or gene expression profiles, is proposed. We formulate our problem as a new type of traveling salesman problem and apply a hybrid genetic algorithm to the problem. To use the 2D natural crossover, we apply the Sammon’s mapping to the microarray data. Experimental results showed that our algorithm found improved gene orders for visualizing the gene expression profiles.

Seung-Kyu Lee, Yong-Hyuk Kim, Byung-Ro Moon

Learning Features for Object Recognition

Features represent the characteristics of objects and selecting or synthesizing effective composite features are the key factors to the performance of object recognition. In this paper, we propose a co-evolutionary genetic programming (CGP) approach to learn composite features for object recognition. The motivation for using CGP is to overcome the limitations of human experts who consider only a small number of conventional combinations of primitive features during synthesis. On the other hand, CGP can try a very large number of unconventional combinations and these unconventional combinations may yield exceptionally good results in some cases. Our experimental results with real synthetic aperture radar (SAR) images show that CGP can learn good composite features. We show results to distinguish objects from clutter and to distinguish objects that belong to several classes.

Yingqiang Lin, Bir Bhanu

An Efficient Hybrid Genetic Algorithm for a Fixed Channel Assignment Problem with Limited Bandwidth

We need an efficient channel assignment algorithm for increasing channel re-usability, reducing call-blocking rate and reducing interference in any cellular systems with limited bandwidth and a large number of subscribers. We propose an efficient hybrid genetic algorithm for a fixed channel assignment problem with limited bandwidth constraint. The proposed GA finds a good sequence of codes for a virtual machine that produces channel assignment. Results are given which show that our GA produces far better solutions to several practical problems than existing GAs.

Shouichi Matsui, Isamu Watanabe, Ken-ichi Tokoro

Using Genetic Algorithms for Data Mining Optimization in an Educational Web-Based System

This paper presents an approach for classifying students in order to predict their final grade based on features extracted from logged data in an education web-based system. A combination of multiple classifiers leads to a significant improvement in classification performance. Through weighting the feature vectors using a Genetic Algorithm we can optimize the prediction accuracy and get a marked improvement over raw classification. It further shows that when the number of features is few; feature weighting is works better than just feature selection.

Behrouz Minaei-Bidgoli, William F. Punch

Improved Image Halftoning Technique Using GAs with Concurrent Inter-block Evaluation

In this paper we propose a modified evaluation method to improve the performance of an image halftoning technique using GAs. We design the algorithm to avoid noise in the fitness function by evolving all image blocks concurrently, exploiting the inter-block correlation, and sharing information between neighbor image blocks. The effectiveness of the method when the population and image block size are reduced, and the configuration of selection and genetic operators are investigated in detail. Simulation results show that the proposed method can remarkably reduce the entire processing time to generate high quality bi-level halftone images.

Emi Myodo, Hernán Aguirre, Kiyoshi Tanaka

Complex Function Sets Improve Symbolic Discriminant Analysis of Microarray Data

Our ability to simultaneously measure the expression levels of thousands of genes in biological samples is providing important new opportunities for improving the diagnosis, prevention, and treatment of common diseases. However, new technologies such as DNA microarrays are generating new challenges for variable selection and statistical modeling. In response to these challenges, a genetic programming-based strategy called symbolic discriminant analysis (SDA) for the automatic selection of gene expression variables and mathematical functions for statistical modeling of clinical endpoints has been developed. The initial development and evaluation of SDA has focused on a function set consisting of only the four basic arithmetic operators. The goal of the present study is to evaluate whether adding more complex operators such as square root to the function set improves SDA modeling of microarray data. The results presented in this paper demonstrate that adding complex functions to the terminal set significantly improves SDA modeling by reducing model size and, in some cases, reducing classification error and runtime. We anticipate SDA will be an important new evolutionary computation tool to be added to the repertoire of methods for the analysis of microarray data.

David M. Reif, Bill C. White, Nancy Olsen, Thomas Aune, Jason H. Moore

GA-Based Inference of Euler Angles for Single Particle Analysis

Single particle analysis is one of the methods for structural studies of protein and macromolecules developed in image analysis on electron microscopy. Reconstructing 3D structure from microscope images is not an easy analysis because of the low resolution of images and lack of the directional information of images in 3D structure. To improve the resolution, different projections are aligned, classified and averaged. Inferring the orientations of these images is so difficult that the task of reconstructing 3D structures depends upon the experience of researchers. But recently, a method to reconstruct 3D structures is automatically devised [6]. In this paper, we propose a new method for determining Euler angles of projections by applying Genetic Algorithms (i.e., GAs). We empirically show that the proposed approach has improved the previous one in terms of computational time and acquired precision.

Shusuke Saeki, Kiyoshi Asai, Katsutoshi Takahashi, Yutaka Ueno, Katsunori Isono, Hitoshi Iba

Mining Comprehensible Clustering Rules with an Evolutionary Algorithm

In this paper, we present a novel evolutionary algorithm, called NOCEA, which is suitable for Data Mining (DM) clustering applications. NOCEA evolves individuals that consist of a variable number of non-overlapping clustering rules, where each rule includes d intervals, one for each feature. The encoding scheme is non-binary as the values for the boundaries of the intervals are drawn from discrete domains, which reflect the automatic quantization of the feature space. NOCEA uses a simple fitness function, which is radically different from any distance-based criterion function suggested so far. A density-based merging operator combines adjacent rules forming the genuine clusters in data. NOCEA has been evaluated on challenging datasets and we present results showing that it meets many of the requirements for DM clustering, such as ability to discover clusters of different shapes, sizes, and densities. Moreover, NOCEA is independent of the order of input data and insensitive to the presence of outliers, and to initialization phase. Finally, the discovered knowledge is presented as a set of non-overlapping clustering rules, contributing to the interpretability of the results.

Ioannis Sarafis, Phil Trinder, Ali Zalzala

Evolving Consensus Sequence for Multiple Sequence Alignment with a Genetic Algorithm

In this paper we present an approach that evolves the consensus sequence [25] for multiple sequence alignment (MSA) with genetic algorithm (GA). We have developed an encoding scheme such that the number of generations needed to find the optimal solution is approximately the same regardless the number of sequences. Instead it only depends on the length of the template and similarity between sequences. The objective function gives a sum-of-pairs (SP) score as the fitness values. We conducted some preliminary studies and compared our approach with the commonly used heuristic alignment program Clustal W. Results have shown that the GA can indeed scale and perform well.

Conrad Shyu, James A. Foster

A Linear Genetic Programming Approach to Intrusion Detection

Page-based Linear Genetic Programming (GP) is proposed and implemented with two-layer Subset Selection to address a two-class intrusion detection classification problem as defined by the KDD-99 benchmark dataset. By careful adjustment of the relationship between subset layers, over fitting by individuals to specific subsets is avoided. Moreover, efficient training on a dataset of 500,000 patterns is demonstrated. Unlike the current approaches to this benchmark, the learning algorithm is also responsible for deriving useful temporal features. Following evolution, decoding of a GP individual demonstrates that the solution is unique and comparative to hand coded solutions found by experts.

Dong Song, Malcolm I. Heywood, A. Nur Zincir-Heywood

Genetic Algorithm for Supply Planning Optimization under Uncertain Demand

Supply planning optimization is one of the most important issues for manufacturers and distributors. Supply is planned to meet the future demand. Under the uncertainty involved in demand forecasting, profit is maximized and risk is minimized. In order to simulate the uncertainty and evaluate the profit and risk, we introduced Monte Carlo simulation. The fitness function of GA used the statistics of the simulation. The supply planning problems are multi-objective, thus there are several Pareto optimal solutions from high-risk and high-profit to low-risk and low-profit. Those solutions are very helpful as alternatives for decision-makers. For the purpose of providing such alternatives, a multi-objective genetic algorithm was employed. In practice, it is important to obtain good enough solutions in an acceptable time. So as to search the solutions in a short time, we propose Boundary Initialization which initializes population on the boundary of constrained space. The initialization makes the search efficient. The approach was tested on the supply planning data of an electric appliances manufacturer, and has achieved a remarkable result.

Tezuka Masaru, Hiji Masahiro

Genetic Algorithms: A Fundamental Component of an Optimization Toolkit for Improved Engineering Designs

Optimization is being increasing applied to engineering design problems throughout the world. iSIGHT is a generic engineering design environment that provides engineers with an optimization toolkit of leading optimization algorithms and an optimization advisor to solve their optimization needs. This paper focuses on the key role played by the toolkit’s genetic algorithm in providing a robust, general purpose solution to nonlinear continuous, mixed integer nonlinear and integer combinatorial problems. The robustness of the genetic algorithm is demonstrated on successful application to 30 engineering benchmark problems and the following three real world problems: a marine naval propeller, a heart pacemaker and a jet engine turbine airfoil.

Siu Tong, David J. Powell

Spatial Operators for Evolving Dynamic Bayesian Networks from Spatio-temporal Data

Learning Bayesian networks from data has been studied extensively in the evolutionary algorithm communities [Larranaga96], [Wong99]. We have previously explored extending some of these search methods to temporal Bayesian networks [Tucker01]. A characteristic of many datasets from medical to geographical data is the spatial arrangement of variables. In this paper we investigate a set of operators that have been designed to exploit the spatial nature of such data in order to learn dynamic Bayesian networks more efficiently. We test these operators on synthetic data generated from a Gaussian network where the architecture is based upon a Cartesian coordinate system, and real-world medical data taken from visual field tests of patients suffering from ocular hypertension.

Allan Tucker, Xiaohui Liu, David Garway-Heath

An Evolutionary Approach for Molecular Docking

We have developed an evolutionary approach for the flexible docking that is now an important component of a rational drug design. This automatic docking tool, referred to as the GEMDOCK (Generic Evolutionary Method for DOCKing molecules), combines both global and local search strategies search mechanisms. GEMDOCK used a simple scoring function to recognize compounds by minimizing the energy of molecular interactions. The interactive types of atoms between ligands and proteins of our linear scoring function consist only hydrogen bonding and steric terms. GEMDOCK has been tested on a diverse dataset of 100 protein-ligand complexes from Protein Data Bank. In total 76% of these complexes, it obtained docked ligand conformations with root mean square derivations (RMSD) to the crystal ligand structures less than 2.0 Å when the ligand was docked back into the binding site. Experiments shows that the scoring function is simple and efficiently discriminates between native and non-native docked conformations. This study suggests that GEMDOCK is a useful tool for molecular recognition and is a potential docking tool for protein structure variations.

Jinn-Moon Yang

Evolving Sensor Suites for Enemy Radar Detection

Designing optimal teams of sensors to detect the enemy radars for military operations is a challenging design problem. Many applications require the need to manage sensor resources. There is a tradeoff between the need to decrease the cost and to increase the capabilities of a sensor suite. In this paper, we address this design problem using genetic algorithms. We attempt to evolve the characteristics, size, and arrangement of a team of sensors, focusing on minimizing the size of sensor suite while maximizing its detection capabilities. The genetic algorithm we have developed has produced promising results for different environmental configurations as well as varying sensor resources.

Ayse S. Yilmaz, Brian N. McQuay, Han Yu, Annie S. Wu, John C. Sciortino

Real World Applications — Posters

Optimization of Spare Capacity in Survivable WDM Networks

A network with restoration capability requires spare capacity to be used in the case of failure. Optimization of spare capacity is to find the minimum amount of spare capacity for the network to survive from network component failures. In this paper, this problem is investigated for wavelength division multiplexing (WDM) mesh networks without wavelength conversion. We propose a hybrid genetic algorithm approach (GA) for the problem. Simulated Annealing (SA) and Tabu Search (TS) are also applied to this problem for comparison purpose. Simulation results show very favorable results for the Genetic Algorithm approach.

H. W. Chong, Sam Kwong

Partner Selection in Virtual Enterprises by Using Ant Colony Optimization in Combination with the Analytical Hierarchy Process

Within the scope of the collaborative research centre “Non-hierarchical regional production networks” at Chemnitz University of Technology, a virtual enterprise model is developed. This is based on very small performance units - the competence cells (CCs). The precondition of the efficient and thereby competitive operation of those networks is a network controlling for objectively selecting the best suitable CCs for every order. This problem gains complexity because the manufacture of a product can be carried through in different ways.

Marco Fischer, Hendrik Jähn, Tobias Teich

Quadrilateral Mesh Smoothing Using a Steady State Genetic Algorithm

This paper investigates the use of a steady state genetic algorithm (GA) to perform quadrilateral finite element mesh smoothing. GAS short for genetic algorithm smoother moves one to 64 nodes at the same time. GAs smooth as well as untangle (removing twisted and inverted elements), which has been a separate operation in the past.

Mike Holder, Charles L. Karr

Evolutionary Algorithms for Two Problems from the Calculus of Variations

A brachistochrone is the path along which a weighted particle falls most quickly from one point to another, and a catenary is the smooth curve connecting two points whose surface of revolution has minimum area. Two evolutionary algorithms find piecewise linear curves that closely approximate brachistochrones and catenaries.

Bryant A. Julstrom

Genetic Algorithm Frequency Domain Optimization of an Anti-Resonant Electromechanical Controller

The new standard for actuators in the aerospace industry is electromechanical actuators. This paper describes an approach whereby PID and DFF controllers are used in unison to effectively manipulate a thrust vectoring system.

Charles L. Karr, Douglas A. Scott

Genetic Algorithm Optimization of a Filament Winding Process

This paper describes a research effort to improve the efficiency of a filament winding process by combining the simulation capabilities of the WITNESS modeling program with the search capabilities of a genetic algorithm. Results show that the genetic algorithm is able to reduce the cost of producing filament wound mandrels used in the defense industry.

Charles L. Karr, Eric Wilson, Sherri Messimer

Circuit Bipartitioning Using Genetic Algorithm

In this paper, we propose a hybrid genetic algorithm for partitioning a VLSI circuit graph into two disjoint graphs of minimum cut size. The algorithm includes a local optimization heuristic which is a modification of Fiduccia-Matheses algorithm. Using well-known benchmarks (including ACM/SIGDA benchmarks), the combination of genetic algorithm and the local heuristic outperformed hMetis [3], a representative circuit partitioning algorithm.

Jong-Pil Kim, Byung-Ro Moon

Multi-campaign Assignment Problem and Optimizing Lagrange Multipliers

Customer relationship management is crucial in acquiring and maintaining royal customers. To maximize revenue and customer satisfaction, companies try to provide personalized services for customers. A representative effort is one-to-one marketing. The fast development of Internet and mobile communication boosts up the market of one-to-one marketing. A personalized campaign targets the most attractive customers with respect to the subject of the campaign. So it is important to expect customer preferences for campaigns. Collaborative Filtering (CF) and various data mining techniques are used to expect customer preferences for campaigns. Especially, since CF is fast and simple, it is widely used for personalization in e-commerce. There have been a number of customer-preference estimation methods based on CF. As personalized campaigns are frequently performed, several campaigns often happen to run simultaneously. It is often the case that an attractive customer for a specific campaign tends to be attractive for other campaigns. If we perform separate campaigns without considering this problem, some customers may be bombarded by a considerable number of campaigns. We call this overlapped recommendation problem. The larger the number of recommendations for a customer, the lower the customer interest for campaigns. In the long run, the customer response for campaigns drops. It lowers the marketing efficiency as well as customer satisfaction. Unfortunately, traditional methods only focused on the effectiveness of a single campaign and did not consider the problem with respect to the overlapped recommendations. In this paper, we define the multi-campaign assignment problem (MCAP) considering the overlapped recommendation problem and propose a number of methods for the issue including a genetic approach. We also verify the effectiveness of the proposed methods with field data.

Yong-Hyuk Kim, Byung-Ro Moon

Grammatical Evolution for the Discovery of Petri Net Models of Complex Genetic Systems

We propose here a grammatical evolution approach for the automatic discovery of Petri net models of biochemical systems that are consistent with population level genetic models of disease susceptibility. We demonstrate the grammatical evolution approach routinely identifies interesting and useful Petri net models in a human-competitive manner. This study opens the door for hierarchical systems modeling of the relationship between genes, biochemistry, and measures of health.

Jason H. Moore, Lance W. Hahn

Evaluation of Parameter Sensitivity for Portable Embedded Systems through Evolutionary Techniques

Power consumption and portability issues are becoming increasingly significant in embedded system architectures. Therefore, it is important that chip architects and integrated circuit designers focus on power and portability, they must also be conscious of choosing the right set of parameters for an embedded processor. An evolutionary approach to configure the parameters of an embedded processor can be used to address these issues. From a given set of parameters the design space is explored using a simple genetic algorithm. This paper investigates the sensitivity of these parameters as they change during the process of the genetic algorithm.

James Northern, Michael Shanblatt

An Evolutionary Algorithm for the Joint Replenishment of Inventory with Interdependent Ordering Costs

The joint replenishment of inventory problem (JRP) requires independence of minor ordering costs. In this paper we propose an evolutionary algorithm (EA) for a modification of the JRP. The modified JRP allows for interdependence of minor ordering costs. Our proposed EA (EARP) is a nested EA that searches for a solution to minimize the total cost of inventory replenishment. It combines an EA which uses a direct grouping method with an EA that uses an indirect grouping approach (EA ind) by nesting EA ind inside EARP. We test EARP against partial enumeration and show that it provides close to optimal results for some problems. We know of no other algorithm to solve this problem.

Anne Olsen

Benefits of Implicit Redundant Genetic Algorithms for Structural Damage Detection in Noisy Environments

A robust structural damage detection method that can handle noisy frequency response function information is discussed. The inherent unstructured nature of damage detection problems is exploited by applying an implicit redundant representation (IRR) genetic algorithm. The unbraced frame structure results obtained show that the IRR GA is less sensitive to noise than a SGA.

Anne Raich, Tamás Liszkai

Multi-objective Traffic Signal Timing Optimization Using Non-dominated Sorting Genetic Algorithm II

This paper presents the application of Non- dominated Sorting Genetic Algorithm II (NSGA II) in solving multiple-objective signal timing optimization problem (MOSTOP). Some recent researches on intersection signal timing design optimization and multi-objective evolutionary algorithms are summarized. NSGA II, which can find more of the Pareto Frontiers and maintain the diversity of the population, is applied to solve three signal timing optimization problems with 2-objective and 3-constraint, which account for both deterministic and stochastic traffic patterns. Mathematical approximation of the resulting Pareto Frontiers are developed to provide more insight into the trade-off between different objectives. GAs experimental design and result analysis are presented with some recommendations for prospective applications.

Dazhi Sun, Rahim F. Benekohal, S. Travis Waller

Exploration of a Two Sided Rendezvous Search Problem Using Genetic Algorithms

The problem of searching for a walker that wants to be found, when the walker moves toward the helicopter when it can hear it, is an example of a two sided search problem which is intrinsically difficult to solve. Thomas et al [1] considered the effectiveness of three standard NATO search paths [2] for this type of problem. In this paper a genetic algorithm is used to show that more effective search paths exist. In addition it is shown that genetic algorithms can be effective in finding a near optimal path of length 196 when searching a 14×14 cell area, that is a search space of 10100. . . .

T. Q. S. Truong, A. Stacey

Taming a Flood with a T-CUP — Designing Flood-Control Structures with a Genetic Algorithm

This paper describes the use of a genetic algorithm to solve a hydrology design problem - determining an optimal or near-optimal prescription of Best Management Practices in a flood-prone watershed. The model has proved capable of discovering design prescriptions that are more cost-effective than existing designs, promising significant financial benefits in a shorter time period. The approach is flexible enough to be applied to any watershed with basic precipitation and soil data.

Jeff Wallace, Sushil J. Louis

Assignment Copy Detection Using Neuro-genetic Hybrids

It is difficult for teachers to find out copies in their program assignments. We propose a method which detects the similarity of assignments. We first transform a program into a sequence of tokens and provide it to an artificial neural network. The neural network is optimized by a hybrid genetic algorithm. Experimental results showed considerable improvement on artificial neural networks.

Seung-Jin Yang, Yong-Geon Kim, Yung-Keun Kwon, Byung-Ro Moon

Search Based Software Engineering

Structural and Functional Sequence Test of Dynamic and State-Based Software with Evolutionary Algorithms

Evolutionary Testing (ET) has been shown to be very successful for testing real world applications [10]. The original ET approach focuses on searching for a high coverage of the test object by generating separate inputs for single function calls.We have identified a large set of real world application for which this approach does not perform well because only sequential calls of the tested function can reach a high structural coverage (white box test) or can check functional behavior (black box tests). Especially, control software which is responsible for controlling and constraining a system cannot be tested successfully with ET. Such software is characterized by storing internal data during a sequence of calls.In this paper we present the Evolutionary Sequence Testing approach for white box and black box tests. For automatic sequence testing, a fitness function for the application of ET will be introduced, which allows the optimization of input sequences that reach a high coverage of the software under test. The authors also present a new compact description for the generation of real-world input sequences for functional testing. A set of objective functions to evaluate the test output of systems under test have been developed. These approaches are currently used for the structural and safety testing of car control systems.

André Baresel, Hartmut Pohlheim, Sadegh Sadeghipour

Evolutionary Testing of Flag Conditions

Evolutionary Testing (ET) has been shown to be very successful in testing real world applications [16]. However, it has been pointed out [11], that further research is necessary if flag variables appear in program expressions. The problems increase when ET is used to test state-based applications where the encoding of states hinders successful evolutionary tests. This is because the ET performance is reduced to a random test in case of the use of flag variables or variables that encode an enumeration type.The authors have developed an ET System to provide easy access to automatic testing. An extensive set of programs has been tested using this system [4], [16]. This system is extended for new areas of software testing and research has been carried out to improve its performance. This paper introduces a new approach for solving ET problems with flag conditions. The problematic constructs are explained with the help of code examples originally found in large real world applications.

André Baresel, Harmen Sthamer

Predicate Expression Cost Functions to Guide Evolutionary Search for Test Data

Several researchers are using evolutionary search methods to search for test data with which to test a program. The fitness or cost function depends on the test goal but almost invariably an important component of the cost function is an estimate of the cost of satisfying a predicate expression as might occur in branches, exception conditions, etc. This paper reviews the commonly used cost functions and points out some deficiencies. Alternative cost functions are proposed to overcome these deficiencies. The evidence from an experiment is that they are more reliable.

Leonardo Bottaci

Extracting Test Sequences from a Markov Software Usage Model by ACO

The aim of the paper is to investigate methods for deriving a suitable set of test paths for a software system. The design and the possible uses of the software system are modelled by a Markov Usage Model which reflects the operational distribution of the software system and is enriched by estimates of failure probabilities, losses in case of failure and testing costs. Exploiting this information, we consider the tradeoff between coverage and testing costs and try to find an optimal compromise between both. For that purpose, we use a heuristic optimization procedure inspired by nature, Ant Colony Optimization, which seems to fit very well to the problem structure under consideration. A real world software system is studied to demonstrate the applicability of our approach and to obtain first experimental results.

Karl Doerner, Walter J. Gutjahr

Using Genetic Programming to Improve Software Effort Estimation Based on General Data Sets

This paper investigates the use of various techniques including genetic programming, with public data sets, to attempt to model and hence estimate software project effort. The main research question is whether genetic programs can offer ‘better’ solution search using public domain metrics rather than company specific ones. Unlike most previous research, a realistic approach is taken, whereby predictions are made on the basis of the data available at a given date. Experiments are reported, designed to assess the accuracy of estimates made using data within and beyond a specific company. This research also offers insights into genetic programming’s performance, relative to alternative methods, as a problem solver in this domain. The results do not find a clear winner but, for this data, GP performs consistently well, but is harder to configure and produces more complex models. The evidence here agrees with other researchers that companies would do well to base estimates on in house data rather than incorporating public data sets. The complexity of the GP must be weighed against the small increases in accuracy to decide whether to use it as part of any effort prediction estimation.

Martin Lefley, Martin J. Shepperd

The State Problem for Evolutionary Testing

This paper shows how the presence of states in test objects can hinder or render impossible the search for test data using evolutionary testing. Additional guidance is required to find sequences of inputs that put the test object into some necessary state for certain test goals to become feasible. It is shown that data dependency analysis can be used to identify program statements responsible for state transitions, and then argued that an additional search is needed to find required transition sequences. In order to be able to deal with complex examples, the use of ant colony optimization is proposed. The results of a simple initial experiment are reported

Phil McMinn, Mike Holcombe

Modeling the Search Landscape of Metaheuristic Software Clustering Algorithms

Software clustering techniques are useful for extracting architectural information about a system directly from its source code structure. This paper starts by examining the Bunch clustering system, which uses metaheuristic search techniques to perform clustering. Bunch produces a subsystem decomposition by partitioning a graph formed from the entities (e.g., modules) and relations (e.g., function calls) in the source code, and then uses a fitness function to evaluate the quality of the graph partition. Finding the best graph partition has been shown to be a NP-hard problem, thus Bunch attempts to find a sub-optimal result that is “good enough” using search algorithms. Since the validation of software clustering results often is overlooked, we propose an evaluation technique based on the search landscape of the graph being clustered. By gaining insight into the search landscape, we can determine the quality of a typical clustering result. This paper defines how the search landscape is modeled and how it can be used for evaluation. A case study that examines a number of open source systems is presented.

Brian S. Mitchell, Spiros Mancoridis

Search Based Software Engineering — Posters

Search Based Transformations

Program transformation [1],[2],[3] can be described as the act of changing one program to another. The aim being an alteration of the program syntax but not its semantics, hence leaving both source and target programs functionally equivalent. Consider as examples of transformations the following program fragments:

Deji Fatiregun, Mark Harman, Robert Hierons

Finding Building Blocks for Software Clustering

It is generally believed that good modularization of software leads to systems which are easier to design, develop, test, maintain and evolve [1].Software clustering using search-based techniques has been well studied using a hill climbing approach [2],[4],[5],[6]. Hill climbing suffers from the problem of local optima, so some improvement may be expected by by considering more sophisticated search-based techniques. However, hitherto, the use of other techniques to overcome this problem such as Genetic Algorithms (GA) [3] and Simulated Annealing [7] have been disappointing.This poster paper looks at the possibility of using results from multiple hill climbs to form a basis for subsequent search. The findings will be presented in the poster created for the poster session in GECCO 2003.

Kiarash Mahdavi, Mark Harman, Robert Hierons


Weitere Informationen