Introduction to AI - Week 5 Genetic Algorithms Evolutionary Computation (also called evolutionary programming) copied from nature: * evolving successful organisms by reproduction of fit elements. * Fittest elements reproduce most. * random mutation _________________________________________________________________________________________________________ Principles of Darwinism * Set of individuals which can reproduce themselves * Offsprings are similar to their parents. * Each new generation has organisms that are similar to the fit members of the previous generation. Hence adaptation to slowly changing environment * random mutations _________________________________________________________________________________________________________ Why Genetic Algorithms * "Evolution is known to be a successful, robust method for adaptation within biological systems. * GAs can search spaces of hypotheses containing complex interacting parts, where the impact of each part on overall hypothesis fitness may be difficult to model. * Genetic algorithms are easily parallelized and can take advantage of the decreasing costs of powerful computer hardware." Taken from [Tom Mitchell, p.250] (from where also most of the rest of this lecture is adapted.) _________________________________________________________________________________________________________ Genetic Algorithms * measurement of the behaviour by a fitness function * can be seen as form of reinforcement learning (reward = offspring), but search in the space of individuals, no learning of relationship between rewards and actions/states of the environment. * form of hill-climbing, risk getting stuck at local maxima, hence spend only most (but not all) of the resources to the most promising individuals (random selection, mutation) In order to apply genetic algorithms, answer * What is the fitness function? * How is an individual represented? * How are the individuals selected? * How do individuals reproduce? _________________________________________________________________________________________________________ Components of Genetic Algorithms * Fitness function is problem dependent, maps individual to real number. * Representation of individuals as genes, i.e., string over a finite alphabet, e.g., alphabet {A,G,T,C} or {0,1}. * Selection for reproduction is randomised, probability of selection is proportional to the fitness. Usually, selection is done with replacement, so that very fit individuals are selected several times. * Reproduction is accomplished by cross-over and mutation. All selected individuals are randomly paired. For each pair, a cross-over point is randomly chosen. One offspring gets the first part from the first parent and the second part from the second, for the second offspring it is the other way around. Then each gene can be randomly mutated with small independent probability. _________________________________________________________________________________________________________ Pseudo Code for Genetic Algorithms GA(Fitness,Threshold,p,r,m) ;;; Fitness: a function that assign a score given a hypothesis ;;; Threshold: specifies the termination criterion ;;; p: the population size ;;; r: fraction of population replaced at each step ;;; m: mutation rate Randomly initialize population P; While max[h in P] fitness(h) < Threshold do Create new generation P[S] end_do Return h from P with highest fitness. _________________________________________________________________________________________________________ How to create the next generation Create new generation P[S] 1. Probabilistically select (1-r)p members of P to add to P[S]. Probability of an element to be selected in one step is proportional to its fitness, i.e. ratio of its fitness divided by the sum of all fitness values. 2. Probabilistically select [r* p]/[2] pairs of P for crossover. Probability of an element to be selected in one step is proportional to its fitness. For each pair generate two offsprings. Add all offsprings to P[S]. 3. Randomly flip each bit in P[S] in the representation with (small) probability m. 4. Update P as P[S] _________________________________________________________________________________________________________ How to create the next generation (Cont'd) 1. The survival from one generation to the next is proportional to the fitness of an element. (A fit gazelle is less likely to be eaten by a lion than a less fit one.) 2. Propagation of elements is proportional to fitness. Fit elements are more likely to succeed in reproduction than non-fit ones. 3. Totally new features can be generated by random mutations, which occur only with very small probability. They are important in order to avoid local maxima. _________________________________________________________________________________________________________ Example, Key Steps in Genetic Algorithms [genetic.jpg] _________________________________________________________________________________________________________ Example Learn decision list for the restaurant waiting problem * Fitness function: number of examples an individual is consistent with. * representation of the attribute/value pairs by 5 bits: 00000: Alternate(x) 00001: ~ Alternate(x) . . . 10111: WaitEstimate(x,0-10) 11000: WaitEstimate(x,10-30) 11001: WaitEstimate(x,30-60) 11010: WaitEstimate(x,60+) One further bit for outcome. Representation of decision lists of length k with up to t tests, we need t(5k+1) bits. Standard selection and reproduction approaches. _________________________________________________________________________________________________________ Genetic Operators Single-point crossover 11101001000 11101010101 [cross.jpg] 00001010101 00001001000 The procedure takes two bitstrings of length n, randomly selects a number i between 1 and n and breaks up the strings at position i and puts together the first part of the first string with the second part of the second as well as the first part of the second string with the second part of the first string. _________________________________________________________________________________________________________ Genetic Operators (Cont'd) Two-point crossover 11101001000 11001011000 [cross.jpg] 00001010101 00101000101 The procedure takes two bitstrings of length n, randomly selects two numbers i and j between 1 and n and breaks up the strings at positions at i and j and puts together the first and last part of the first string with the middle part of the second as well as the first and last part of the second string with the middle part of the first string. _________________________________________________________________________________________________________ Genetic Operators (Cont'd) Uniform Crossover 11101001000 10001000100 [cross.jpg] 00001010101 01101011001 The procedure takes two bitstrings of length n, randomly decides for each position of whether the bit of the first or the second should go in the first or in the second child. Point Mutation 11101001000 [arrow.jpg] 11101011000 _________________________________________________________________________________________________________ Further Aspects * How should the fitness function look like? E.g. number of e-mails classified correctly compared to all: [correct]/[all]. Or a more complicated measure such as: [correctly-classified-spam]/[spam] - 100*[wrongly-classified-spam]/[genuine]. In general, difficult question how to choose a fitness function. * Elitism: Protect fittest elements from death in order not to lose the fittest elements in the population. _________________________________________________________________________________________________________ Genetic Programming * Extension of genetic algorithms by data structures and general program structures (i.e., not just bit strings). * Automated generation of programs by similar techniques (i.e., randomly generated initial population and evolution by cross-over and mutation). * In principle it is possible to generate automatically arbitrary programs. _________________________________________________________________________________________________________ Tree Representation of Programs Example: Representation of sin(x)+sqrt(x2+y) in form of a tree. General programs can be represented in a similar form, assumed the corresponding primitives are available, e.g., conditionals, loops, boolean expressions. [genprog1.jpg] _________________________________________________________________________________________________________ Crossover in Genetic Programming [genprog2.jpg] _________________________________________________________________________________________________________ Example: Stacking Blocks [blocks.jpg] Stack blocks which are on a table to spell in a tower the word "universal." Use a set of 166 such initial configurations to evaluate the fitness of a candidate program. _________________________________________________________________________________________________________ Program primitives * CS, current stack, refers to the top block on the stack, or F if there is no current stack. * TB, top correct block, under which all blocks are correct. NN next necessary block to put on the stack. MS(x), move x to the current stack. MT(x), move x to the table. EQ(x,y), x is equal to y. NOT(x), swaps T and F. DU(x,y): Do x until y. _________________________________________________________________________________________________________ Generated Program * The system was initialised with 300 random programs. * After 10 generations the system came up with a program that solves all 166 reference problems. * Generated program: EQ(DU(MT(CS),NOT(CS)), DU(MS(NN),NOT(NN))) The program reads as follows: First move repeatedly the current top of the stack onto the table, until the stack becomes empty. The second "Do Until" statement repeatedly moves the next necessary block form the table onto the stack. EQ is only used to form a sequence of the programs. _________________________________________________________________________________________________________ Co-Evolution * The examples assumed that a fixed environment is provided, with respect to which an optimal member of the population is generated. * There is typically not one single optimal species. In a complicated system different species have different niches. E.g., the lion and the gazelle are very well adapted and evolved. * In co-evolution not only one species is evolved, but different ones simultaneously. * In the context of e-mail spam filtering, we can imagine that a spam filter evolves and simultaneously evolves a system to trick the spam filter. _________________________________________________________________________________________________________ Further Issues * Lamarck had the idea that the learning an individual does during its lifetime shouldn't be lost, but should be inherited to its offsprings. Scientific evidence speaks for the fact that this is not the case in biological evolution. Still it can be useful in artificial evolution. * Genotype vs Phenotype: The mapping from the genotype (the bitstrings/trees on which the evolutionary operations act) to the phenotype (that is the actual individual) is assumed to be trivial in the examples above, but can be complicated in general. _________________________________________________________________________________________________________ Summary Important decisions for a genetic approach: * Size of the population * Reproduction rate * mutation rate Wide range of application, for certain problems very good results (poor on others). Can be computationally very expensive, but is also be rather robust, e.g. the fitness function is not necessarily invariant over time. _________________________________________________________________________________________________________ Further Reading [books-shelf1.jpg] * Tom M. Mitchell. Machine Learning. McGraw-Hill, 1997, Chapter 9. * John Holland. Adaptation in Natural and Artificial Systems. University of Michigan Press, 1975. * John Koza. Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press, 1992. * S. Russell, P. Norvig. Artificial Intelligence - A Modern Approach. 2nd Edition, Pearson Education, 2003. pp. 116-120. _________________________________________________________________________________________________________ _________________________________________________________________________________________________________ © Manfred Kerber, 2004, Introduction to AI 24.4.2005 The URL of this page is http://www.cs.bham.ac.uk/~mmk/Teaching/AI/Teaching/AI/l5.html. URL of module [1]http://www.cs.bham.ac.uk/~mmk/Teaching/AI/ References 1. http://www.cs.bham.ac.uk/~mmk/Teaching/AI/