Introduction to AI - Week 5

Genetic Algorithms
Evolutionary Computation (also called evolutionary programming) copies from nature:
• successful organisms evolve by reproduction of fit individuals,
• the fittest individuals reproduce most,
• random mutation introduces variation.

Principles of Darwinism

• A set of individuals that can reproduce themselves.
• Offspring are similar to their parents.
• Each new generation contains organisms similar to the fit members of the previous generation; hence adaptation to a slowly changing environment.

• Random mutations occasionally introduce new traits.

Why Genetic Algorithms

• "Evolution is known to be a successful, robust method for adaptation within biological systems.
• GAs can search spaces of hypotheses containing complex interacting parts, where the impact of each part on overall hypothesis fitness may be difficult to model.
• Genetic algorithms are easily parallelized and can take advantage of the decreasing costs of powerful computer hardware."
Taken from [Tom Mitchell, p.250] (from which most of the rest of this lecture is also adapted).

Genetic Algorithms

• Behaviour is measured by a fitness function.
• Can be seen as a form of reinforcement learning (reward = offspring), but the search is in the space of individuals; no relationship between rewards and actions/states of the environment is learned.
• A form of hill-climbing, with the risk of getting stuck at local maxima; hence spend most (but not all) resources on the most promising individuals (random selection, mutation).
In order to apply genetic algorithms, answer
• What is the fitness function?
• How is an individual represented?
• How are the individuals selected?
• How do individuals reproduce?

Components of Genetic Algorithms

• The fitness function is problem dependent; it maps an individual to a real number.

• Individuals are represented as genes, i.e., strings over a finite alphabet, e.g., {A,G,T,C} or {0,1}.
• Selection for reproduction is randomised; the probability of selection is proportional to fitness. Usually, selection is done with replacement, so that very fit individuals may be selected several times.

• Reproduction is accomplished by crossover and mutation. All selected individuals are randomly paired. For each pair, a crossover point is chosen at random: one offspring gets the first part from the first parent and the second part from the second; for the second offspring it is the other way round. Afterwards, each gene can be mutated with a small independent probability.
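The selection component above can be sketched in a few lines of Python (the toy population and fitness function are invented for illustration):

```python
# Fitness-proportional ("roulette-wheel") selection with replacement.
import random

random.seed(0)  # for reproducibility

population = ["AGTC", "GGTA", "TTTT", "ACGT"]
fitness = lambda ind: ind.count("T")  # toy fitness: number of T's

# random.choices samples with replacement, with probabilities proportional
# to the weights, so very fit individuals can be picked several times.
weights = [fitness(ind) for ind in population]
parents = random.choices(population, weights=weights, k=4)
print(parents)
```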

Pseudo Code for Genetic Algorithms

GA(Fitness, Threshold, p, r, m)
;;; Fitness: a function that assigns a score to a hypothesis
;;; Threshold: specifies the termination criterion
;;; p: the population size
;;; r: fraction of the population replaced at each step
;;; m: mutation rate
Randomly initialize population P;
While max_{h in P} Fitness(h) < Threshold do
    Create new generation PS
end_do
Return h from P with highest fitness.
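The pseudocode can be fleshed out into a runnable Python sketch on a toy "one-max" task (maximise the number of 1-bits); the task, the bit-string length, and the parameter values are invented for illustration:

```python
# A minimal GA following the pseudocode above: fitness-proportional
# selection, single-point crossover, and per-bit mutation.
import random

random.seed(1)

def ga(fitness, threshold, p, r, m, n_bits=20):
    P = ["".join(random.choice("01") for _ in range(n_bits)) for _ in range(p)]
    while max(fitness(h) for h in P) < threshold:
        weights = [fitness(h) for h in P]
        # (1-r)p survivors, selected proportionally to fitness
        PS = random.choices(P, weights=weights, k=round((1 - r) * p))
        # r*p/2 pairs each produce two offspring via single-point crossover
        for _ in range(round(r * p / 2)):
            a, b = random.choices(P, weights=weights, k=2)
            i = random.randrange(1, n_bits)
            PS += [a[:i] + b[i:], b[:i] + a[i:]]
        # flip each bit with small independent probability m
        PS = ["".join("10"[int(c)] if random.random() < m else c for c in h)
              for h in PS]
        P = PS
    return max(P, key=fitness)

best = ga(fitness=lambda h: h.count("1"), threshold=18, p=50, r=0.6, m=0.01)
print(best)  # a 20-bit string with at least 18 one-bits
```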

How to create the next generation

Create new generation PS:
1. Probabilistically select (1−r)·p members of P to add to PS. The probability of an element being selected in one step is proportional to its fitness, i.e., its fitness divided by the sum of all fitness values.

2. Probabilistically select (r·p)/2 pairs from P for crossover; the probability of selection is again proportional to fitness. For each pair, generate two offspring and add all offspring to PS.
3. Randomly flip each bit in the representations in PS with (small) probability m.

4. Update P to PS.

How to create the next generation (Cont'd)

1. Survival from one generation to the next is proportional to the fitness of an element. (A fit gazelle is less likely to be eaten by a lion than a less fit one.)

2. Propagation of elements is proportional to fitness: fit elements are more likely to succeed in reproduction than less fit ones.

3. Totally new features can be generated by random mutations, which occur only with very small probability. They are important in order to avoid local maxima.

Example

Learn a decision list for the restaurant waiting problem.
• Fitness function: the number of examples an individual is consistent with.
• Representation of attribute/value pairs by 5 bits:

 00000: Alternate(x)
 00001: ~Alternate(x)
 ...
 10111: WaitEstimate(x,0-10)
 11000: WaitEstimate(x,10-30)
 11001: WaitEstimate(x,30-60)
 11010: WaitEstimate(x,60+)

One further bit encodes the outcome. To represent decision lists of up to t rules, each with up to k tests, we need t(5k+1) bits. Standard selection and reproduction approaches are used.
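A sketch of how such a bit representation might be decoded (the code table contains only the values listed above; everything beyond that is hypothetical):

```python
# Each test is 5 bits; each rule additionally carries 1 outcome bit,
# so a list of t rules with k tests each needs t(5k+1) bits.
CODES = {
    0b00000: "Alternate(x)",
    0b00001: "~Alternate(x)",
    0b10111: "WaitEstimate(x,0-10)",
    0b11000: "WaitEstimate(x,10-30)",
    0b11001: "WaitEstimate(x,30-60)",
    0b11010: "WaitEstimate(x,60+)",
}

def decode_rule(bits, k=1):
    """Decode one rule: k tests of 5 bits each, then 1 outcome bit."""
    tests = [CODES[int(bits[5 * i:5 * i + 5], 2)] for i in range(k)]
    return tests, bits[5 * k] == "1"

print(decode_rule("110101"))  # → (['WaitEstimate(x,60+)'], True)
```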

Genetic Operators

Single-point crossover
 Parents:  11101001000   00001010101
 Children: 11101010101   00001001000
The procedure takes two bitstrings of length n, randomly selects a number i between 1 and n, breaks both strings at position i, and joins the first part of the first string with the second part of the second string, as well as the first part of the second string with the second part of the first.
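The procedure as a small function; the crossover point i=5 is chosen here to reproduce the example above:

```python
def single_point(a, b, i):
    # first part of one parent joined with the second part of the other
    return a[:i] + b[i:], b[:i] + a[i:]

c1, c2 = single_point("11101001000", "00001010101", i=5)
print(c1, c2)  # → 11101010101 00001001000
```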

Genetic Operators (Cont'd)

Two-point crossover
 Parents:  11101001000   00001010101
 Children: 11001011000   00101000101
The procedure takes two bitstrings of length n, randomly selects two numbers i and j between 1 and n, breaks both strings at positions i and j, and joins the first and last part of the first string with the middle part of the second, as well as the first and last part of the second string with the middle part of the first.
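As a function; the cut points i=2 and j=7 are chosen to reproduce the example above:

```python
def two_point(a, b, i, j):
    # outer parts of one parent joined with the middle part of the other
    return a[:i] + b[i:j] + a[j:], b[:i] + a[i:j] + b[j:]

c1, c2 = two_point("11101001000", "00001010101", i=2, j=7)
print(c1, c2)  # → 11001011000 00101000101
```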

Genetic Operators (Cont'd)

Uniform Crossover
 Parents:  11101001000   00001010101
 Children: 10001000100   01101011001
The procedure takes two bitstrings of length n and randomly decides, for each position, whether the bit of the first or the second parent goes into the first child; the second child receives the other parent's bit.
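As a function (the per-position choice is random, so the children will generally differ from the example above):

```python
import random

random.seed(0)

def uniform(a, b):
    # for each position, one child gets the bit of one parent,
    # the other child gets the bit of the other parent
    pairs = [(x, y) if random.random() < 0.5 else (y, x) for x, y in zip(a, b)]
    return tuple("".join(s) for s in zip(*pairs))

c1, c2 = uniform("11101001000", "00001010101")
print(c1, c2)
```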

Point Mutation

 11101001000 → 11101011000
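Point mutation as a function, plus a deterministic flip of the 7th bit to reproduce the example above:

```python
import random

def mutate(h, m):
    # flip each bit independently with probability m
    return "".join("10"[int(c)] if random.random() < m else c for c in h)

h = "11101001000"
flipped = h[:6] + "10"[int(h[6])] + h[7:]  # flip bit 7 deterministically
print(flipped)  # → 11101011000
```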

Further Aspects

• What should the fitness function look like?
E.g., the number of e-mails classified correctly relative to all e-mails:
correct/all.
Or a more complicated measure such as:

correctly-classified-spam/spam − 100 · wrongly-classified-spam/genuine.

In general, it is a difficult question how to choose a fitness function.
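The two candidate measures as functions (the example counts are invented for illustration):

```python
def simple_fitness(correct, total):
    # fraction of all e-mails classified correctly
    return correct / total

def spam_fitness(correct_spam, spam, wrong_spam, genuine):
    # heavily penalise genuine e-mails wrongly classified as spam
    return correct_spam / spam - 100 * wrong_spam / genuine

print(simple_fitness(90, 100))      # → 0.9
print(spam_fitness(40, 50, 1, 50))  # → -1.2
```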

• Elitism: Protect fittest elements from death in order not to lose the fittest elements in the population.

Genetic Programming

• Extension of genetic algorithms by data structures and general program structures (i.e., not just bit strings).
• Automated generation of programs by similar techniques (i.e., randomly generated initial population and evolution by cross-over and mutation).
• In principle, it is possible to automatically generate arbitrary programs.

Tree Representation of Programs

Example:
Representation of
sin(x) + sqrt(x² + y)
in the form of a tree.

General programs can be represented in a similar form, assuming the corresponding primitives are available, e.g., conditionals, loops, boolean expressions.
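A sketch of such a tree in Python, representing sin(x) + sqrt(x² + y) as nested tuples with a small recursive evaluator (this particular encoding is an assumption; genetic programming systems use various tree representations):

```python
import math

# operator at the root, subtrees (or leaves) as arguments
tree = ("+", ("sin", "x"), ("sqrt", ("+", ("^", "x", 2), "y")))

OPS = {"+": lambda a, b: a + b, "^": lambda a, b: a ** b,
       "sin": math.sin, "sqrt": math.sqrt}

def evaluate(node, env):
    if isinstance(node, tuple):
        op, *args = node
        return OPS[op](*(evaluate(a, env) for a in args))
    return env.get(node, node)  # variable lookup, or a numeric literal

print(evaluate(tree, {"x": 0.0, "y": 4.0}))  # sin(0) + sqrt(0 + 4) = 2.0
```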

Example: Stacking Blocks

Stack blocks which lie on a table into a tower that spells the word "universal." A set of 166 such initial configurations is used to evaluate the fitness of a candidate program.

Program primitives

• CS, current stack: refers to the top block on the stack, or F if there is no current stack.
• TB, top correct block: the block under which all blocks are correct.
• NN, next necessary block: the next block to put on the stack.
• MS(x): move x to the current stack.
• MT(x): move x to the table.
• EQ(x,y): x is equal to y.
• NOT(x): swaps T and F.
• DU(x,y): do x until y.

Generated Program

• The system was initialised with 300 random programs.
• After 10 generations the system came up with a program that solves all 166 reference problems.
• Generated program:
EQ(DU(MT(CS),NOT(CS)), DU(MS(NN),NOT(NN)))
The program reads as follows: the first "do until" repeatedly moves the current top of the stack onto the table, until the stack becomes empty. The second "do until" repeatedly moves the next necessary block from the table onto the stack. EQ is only used to form a sequence of the two sub-programs.
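A sketch of a mini-interpreter for these primitives that runs the generated program on one hypothetical initial configuration (the class and its internals are assumptions; only the primitive names come from the slides; TB is omitted since the generated program does not use it):

```python
GOAL = list("universal")  # target tower, bottom to top

class World:
    def __init__(self, stack, table):
        self.stack = list(stack)  # current tower, bottom to top
        self.table = set(table)   # blocks lying on the table

    def CS(self):  # top block of the current stack, or False
        return self.stack[-1] if self.stack else False

    def correct_prefix(self):  # length of the correct bottom part
        i = 0
        while i < len(self.stack) and self.stack[i] == GOAL[i]:
            i += 1
        return i

    def NN(self):  # next necessary block, or False
        i = self.correct_prefix()
        return GOAL[i] if i == len(self.stack) and i < len(GOAL) else False

    def MS(self, x):  # move x from the table onto the stack
        if x in self.table:
            self.table.remove(x)
            self.stack.append(x)

    def MT(self, x):  # move x (the top of the stack) onto the table
        if self.stack and self.stack[-1] == x:
            self.table.add(self.stack.pop())

def DU(do, until):  # do x until y becomes true
    while not until():
        do()

w = World(stack="uvn", table="iersal")
# EQ(DU(MT(CS), NOT(CS)), DU(MS(NN), NOT(NN)))
DU(lambda: w.MT(w.CS()), lambda: not w.CS())  # unstack everything
DU(lambda: w.MS(w.NN()), lambda: not w.NN())  # rebuild the goal tower
print("".join(w.stack))  # → universal
```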

Co-Evolution

• The examples assumed that a fixed environment is provided, with respect to which an optimal member of the population is generated.
• There is typically not one single optimal species. In a complicated system, different species occupy different niches; e.g., the lion and the gazelle are both very well adapted, having evolved together.
• In co-evolution not only one species is evolved, but different ones simultaneously.
• In the context of e-mail spam filtering, we can imagine that a spam filter evolves and simultaneously evolves a system to trick the spam filter.

Further Issues

• Lamarck had the idea that what an individual learns during its lifetime should not be lost, but be passed on to its offspring. Scientific evidence indicates that this is not the case in biological evolution. Still, it can be useful in artificial evolution.
• Genotype vs Phenotype: The mapping from the genotype (the bitstrings/trees on which the evolutionary operations act) to the phenotype (that is the actual individual) is assumed to be trivial in the examples above, but can be complicated in general.

Summary

Important decisions for a genetic approach:

• Size of the population
• Reproduction rate
• Mutation rate

Wide range of applications; for certain problems very good results (poor on others).

Can be computationally very expensive, but is also rather robust, e.g., the fitness function need not be invariant over time.