Genetic programming: principles and applications

https://doi.org/10.1016/S0952-1976(02)00013-1Get rights and content

Abstract

Genetic algorithms (GA) has given rise to two new fields of research where (global) optimisation is of crucial importance: ‘genetic based machine learning’ (GBML) and ‘genetic programming’ (GP). An introduction by the authors to GA and GBML was given in two previous papers (Eng. Appl. Artif. Intell. 9(6) (1996) 681; Eng. Appl. Artif. Intell. 13(4) (2000) 381). In this paper, the last domain (GP) will be introduced, thereby making up a trilogy which gives a general overview of the whole field. In this third part, an overview will be given of the basic concepts of GP as defined by Koza. A first (educational) example of GP is given by solving a simple symbolic regression of a sinus function. Finally, a more complex application is presented in which GP is used to construct the mathematical equations for an industrial process. To this end, the case study ‘fibre-to-yarn production process’ is introduced. The goal of this study is the automatic development of mathematical equations for the prediction of spinnability and (possible) resulting yarn strength. It is shown that (relatively) simple equations can be obtained which describe accurately 90% of the fibre-to-yarn database.

Introduction

Genetic algorithms (GAs) are generally used as an optimisation technique to search the global optimum of a function. However, this is not the only possible use for GAs. Other fields of applications where robustness and global optimisation are needed could also benefit greatly from GAs. The two most important alternative domains applying GAs are genetic based machine learning (GBML) and genetic programming (GP).

GBML is a GA driven implementation of rule based machine learning (RBML). The goal in RBML is to generate a set of rules using an automated learning algorithm. This set of rules should allow the machine to perform optimally in its environment. A well-known GBML architecture is the so-called ‘learning classifier system’ (LCS) developed by Goldberg (1989), Holland (1968) and Holland and Reitman (1978). More recent GBML architectures are for example the XCS developed by Wilson (1994), ELF by Bonnarini (1997) and (F)ECS by Sette and Boullart (2000). An introduction to GBML was given by Boullart and Sette (1998).

GP is basically a GA applied to a population of computer programs (CP). While a GA usually operates on (coded) strings of numbers a GP has to operate on CP. This demands a special representation of the operands which can, for example, be implemented by using the programming language LISP. GP allows, in comparison with GA, the optimisation of much more complicated structures and can therefore be applied to a greater diversity of problems. Koza has extensively described GP in his book ‘Genetic Programming, on the programming of computers by means of natural selection’ (1992).

An introduction to GAs and GBML was given by the authors in Sette et al. (1996) and Sette and Boullart (2000). Both papers illustrated the use of these algorithms by applying them to an industrial production process in textiles: the (cotton) fibre-to-yarn production process. The GA was used to search for a new yarn with optimal strength and elongation characteristics. GBML was used to generate a number of rules for predicting the spinnability of the yarn (based upon fibre blend and spinning machine settings).

This paper is the last paper in this introductory series and will illustrate the concepts of GP (as described more extensively by Koza) and apply it to the same aforementioned real life industrial production problem, generating equations for the spinnability and the yarn strength. Finally, some conclusions based upon the comparison of the three different (but related) techniques will be presented.

Section snippets

Introducing GP

The first part of this introduction will give a schematical overview of the GP algorithm. The operation is very similar to the steps performed in a GA. The second part will focus on the most essential feature of GP: the representation of the solution/operand in order to apply GAs to CP. Due to the obvious restriction in scope, the underlying description cannot be complete, but will concentrate on the important aspects in view of the application at hand.

Simple symbolic regression application

This (educational) experiment has the (limited) goal of finding a mathematical expression for a sin(x), 0⩽x⩽180, using only the function set {+, −, *,/} and the terminal T={x,[0...9]).

To this end, the software package GPCPP4 (freeware) from Adam Fraser has been used to demonstrate the use of GP for modelling the sinus function. The software was only slightly modified with regard to the function (using the symbolic regression module) and the corresponding fitness calculation. The function, which

Industrial application: constructing mathematical equations for the fibre-to-yarn process by means of GP

One of the important production processes in the textile industry is the spinning process. Starting from cotton fibres, yarns are (usually) created on a rotor-spinning machine. The spinnability of a fibre and the yarn characteristics (if spinnable) are dependant on the fibre quality and on the machine settings of the spinning machine. It would be a great benefit to be able to predict the spinnability and yarn strength departing from a certain quality and from machine settings. To this end, two

General conclusions

This paper presented the third and last part of an overview/introduction to the domain of evolutionary computing and its major algorithms. While the first paper (Sette et al., 1996) introduced the general concepts of genetic algorithms, the second paper (Sette and Boullart (2000)) illustrated the use of genetic based machine learning. In this final paper, an introduction was given to the second field of study derived from genetic algorithms: Genetic programming as defined by Koza.

Genetic

Stefan Sette is a doctoral scientific researcher, and has been working at the Department of Textiles, Ghent University, since 1990. He holds university degrees in theoretical physics (1987), computer science (1989) and received his Ph.D. in 1998. His research concentrated first on (conventional) image analysis applied to textile problems. During the last six years this has evolved into a study of neural networks, genetic algorithms en learning classifier systems, supported by the Department of

References (15)

There are more references available in the full text version of this article.

Cited by (0)

Stefan Sette is a doctoral scientific researcher, and has been working at the Department of Textiles, Ghent University, since 1990. He holds university degrees in theoretical physics (1987), computer science (1989) and received his Ph.D. in 1998. His research concentrated first on (conventional) image analysis applied to textile problems. During the last six years this has evolved into a study of neural networks, genetic algorithms en learning classifier systems, supported by the Department of Control Engineering and Automation. He is currently researching implementations of these models for a paper mill industrial process.

Luc Boullart graduated in Electromechanical Engineering at the Ghent University (Belgium) in 1970, where he also received his Ph.D. in 1975. Currently, he is full professor at the Control Engineering and Automation department of the same university, where he teaches computer science, systems simulation and computer systems in process control. His research interest is in the field of artificial intelligence, more especially in neural networks and genetic algorithms. He is also honorary president of the Belgian Institute for Automatic Control (BIRA), and is chairman of IFAC's committee on Computer Aided Control System Design. Furthermore, he is director of the Institute for Permanent Education of the engineering department of the Ghent University. He is a board member of the Ghent University Senate.

View full text