Genetic Programming and Data Structures
W. B. Langdon,
University College, London, 1996, 350 pages
- Building Block A pattern of genes in a contiguous section of
a chromosome which, if present, confers a high fitness to the
individual. According to the building block hypothesis, a complete
solution can be constructed by crossover joining together in a
single individual many building blocks which where originally spread
throughout the population.
- Cellular Automata A regular array of identical finite state
automata whose next state is determined solely by their current state
and the state of their neighbours. The most widely seen is the game
of Life in which complex patterns emerge from a (supposedly infinite)
square lattice of simple two state (living and dead)
automata whose next state is
determined solely by the current states of its four closes neighbours
- Classifiers An extension of genetic algorithms in which the
population consists of a co-operating set of rules (i.e. a rulebase)
which are to
learn to solve a problem given a number of test cases. Between each
generation the population as a whole is evaluated and a fitness is
assigned to each rule using the bucket-brigade algorithm or other
credit sharing scheme (e.g. the Pitt scheme). These schemes aims to
reward or punish rules which contribute to a test case according to
how good the total solution is by adjusting the individual rules
At the end of the test data a new generation is created using a
genetic algorithm as if each rule were independent using its own
fitness (measures may be taken are taken to ensure a given rule only
appears once in the new population).
- Coevolution Two or more populations are evolved at the same
time. Often the separate populations compete against each other.
- Convergence Tendency of members of the population to be the
same. May be used to mean either their representation or behaviour
are identical. Loosely a genetic algorithm solution has been reached.
- Chromosome Normally, in genetic algorithms the bit string
which represents the individual. In genetic programming the
individual and its representation are usually the same, both being the
program parse tree. In nature many species store their genetic
information on more than one chromosome.
- Crossover Creating a new individual's representation from parts
of its parents' representations.
- Deme A separately evolving subset of the whole population.
The subsets may be evolved on a different computers. Emigration
between subset may be used (see Panmixia).
- Elitist An elitist genetic algorithm is one that always
retains in the population the best individual found so far.
Tournament selection is naturally elitist.
- Epistasis A term from biology used to denote that the fitness of
an individual depends upon the interaction of a number of their
genes. In genetic algorithms this would be indicated by the fitness
containing a non-linear combination of components of the string.
- Evolution Programming A population containing a number of
trial solutions each of which is evaluated to yield an error.
Typically, at the end of each generation, the best half of the
population is retained and a new solution is produced from each
survivor. The process is continued with the aim that the population
should evolve to contain an acceptable solution.
Evolutionary programming like Evolution Strategy produces new
children by mutating at random from a single parent solution. The
analogue components (e.g. the connection weights when applied to
networks) are changed by a gaussian function whose standard
deviation is given by a function of the parent's error called its
temperature. Digital components (e.g. presence of a hidden node) are
created and destroyed at random.
- Evolution Strategy or Evolutionsstrategie A search
technique first developed in
Berlin. Each point in the search space is represented by a vector
of real values. In the original Evolution Strategy, (1+1)-ES, the
next point to search is given by adding gaussian random noise to the
current search point. The new point is evaluated and if better the
search continues from it. If not the search continues from the
original point. The level of noise is automatically adjusted as the
Evolutionary Strategies can be thought of as like an analogue version
of genetic algorithms. In (1+1)-ES, 1 parent is used to create 1
offspring. In (u+l)-ES and (u,l)-ES m
parents are used to create l children (perhaps using
- Finite State Automaton (FSA) or Finite State Machine (FSM)
A machine which can be totally described by a finite set of states,
it being in one these at any one time, plus a set of rules which
determine when it moves from one state to another.
- Fitness Function A process which evaluates a member of a
population and gives it a score or fitness. In most cases the goal
is to find an individual with the maximum (or minimum) fitness.
- Function Set The set of operators used in a genetic program,
e.g. + - * divide. These act as the branch points in the
parse tree, linking other functions or terminals. See also
- Generation When the children of one population replace their
parents in that population. Where some part of the original
population is retained, as in steady state GAs, generation typically
refers to the interval during which the number of new individuals
created is equal to the population size.
- Generation Equivalent In a steady state GA, the time taken to
create as many new individuals as there are in the population.
- Genetic Algorithm A population containing a number of trial
solutions each of which is evaluated (to yield a fitness) and a new
generation is created from the better of them. The process is
continued through a number of generations with the aim that the
population should evolve to contain an acceptable solution.
GAs are characterised by representing the solution as an (often fixed
length) string of digital symbols, selecting parents from the
current population in proportion to their fitness (or some
approximation of this) and the use of crossover as the dominate
means of creating new members of the population. The initial
population may be created at random or from some known starting
- GA Deceptive A gene pattern which confers high fitness but is
not present in the optimal solution is said to be deceptive; in that
it may lead the genetic algorithm away from the global optimum
- Genetic Operator An operator in a genetic algorithm or genetic
programming, which acts upon the chromosome to produce a new individual.
Example operators are mutation and crossover.
- Genetic Program A program produced by genetic programming.
- Genetic Programming A subset of genetic algorithms. The
members of the populations are the parse trees of computer programs
whose fitness is evaluated by running them. The reproduction
operators (e.g. crossover) are refined to ensure that the child is
syntactically correct (some protection may be given against semantic
errors too). This is achieved by acting upon subtrees.
Genetic programming is most easily implemented where the computer
language is tree structured so there is no need to explicitly
evaluated its parse tree. This is one of the reasons why Lisp is
often used for genetic programming.
This is the common usage of the term genetic programming
however it has also been used to refer to the programming of
cellular automata and neural networks using a genetic algorithm.
- Hits The number of hits an individual scores is the number
of test cases for which it returns the correct answer (or close
enough to it). This may or may not be a component of the fitness
function. When an individual gains the maximum number of hits this
may terminate the run.
- Infix Notation
Notation in which the operator separates its operands.
E.g. (a + b) * c. Infix notation requires the use of brackets
to specify the order of evaluation, unlike either prefix or postfix
- Non-Terminal Functions used to link parse tree together. This
name may be used to avoid confusion with functions with no
parameters which can only act as end points of the parse tree (i.e.
leafs) and so are part of the terminal set.
- Mutation Arbitrary change to representation, often at random. In
genetic programming, a subtree is replaced by another, some or all of which
is created at random.
- Panmixia When a population is split into a number of separately
evolving populations (demes) but the level of emigration is
sufficiently high that they continue to evolve as if a single
- Parsimony Brevity. In GP, this is measured by counting the
nodes in the tree. The smaller the program, the smaller the tree,
the lower the count and the more parsimonious it is.
- Postfix Notation Reverse Polish Notation or Suffix
Notation Notation in which the operator follows its operands.
E.g. a + b * c represented as abc*+.
- Prefix Notation Polish Notation Notation in which the
operator comes before its operands. E.g. a + b represented as +ab.
- Premature Convergence When a genetic algorithm's population
converges to something which is not the solution you wanted.
- Recombination as crossover.
- Reproduction Production of new member of population from
existing members. May be used to mean an exact copy of the original
- Simulated Annealing Search technique where a single trial
solution is modified at random. An energy is defined which
represents how good the solution is. The goal is to find the best
solution by minimising the energy. Changes which lead to a lower
energy are always accepted; an increase is probabilistically
accepted. The probability is given by exp(-Delta E/kT).
Where Delta E is the change in energy, k is a constant and
T is the Temperature. Initially the temperature is high
corresponding to a liquid or molten state where large changes are
possible and it is progressively reduced using a cooling
schedule so allowing smaller changes until the system
solidifies at a low energy solution.
- Stochastic Random or probabilistic but with some direction. For
example the arrival of people at a post office might be random but
average properties (such as the queue length) can be predicted.
- Terminal Set A set from which all end (leaf) nodes in the parse
trees representing the programs must be drawn. A terminal might be
a variable, a constant or a function with no arguments.
- Tournament Selection A mechanism for choosing individuals from
a population. A group (typically between 2 and 7 individuals) are
selected at random from the population and the best (normally only
one, but possibly more) is chosen.
Glossary of alife terms from
"Adaptive Individuals in Evolving Populations: Models and Algorithms".
2 March 1997