## A Genetic Algorithm


**A Genetic Algorithm**

1. Obtain several creatures! (Spontaneous generation??)
2. Evolve! Perform selective breeding:
   - (a) Run a couple of tournaments
   - (b) Let the winners breed
   - (c) Mutate and test their children
   - (d) Let the children live in the losers' homes

**Evolution Runs Until:**

- A perfect individual appears (if you know what the goal is),
- Or: improvement appears to be stalled,
- Or: you give up (your computing budget is exhausted).

**Comments**

- Stochastic algorithm
  - randomness has an essential role in genetic algorithms
  - both selection and reproduction need random procedures
- Considers a population of solutions
  - evaluates more than a single solution at each iteration
  - assortment, amenable to parallelisation
- Robustness
  - ability to perform consistently well on a broad range of problem types
  - no particular requirements on the problems before using GAs

**Benefits of Genetic Algorithms**

- Concept is easy to understand
- Modular, separate from application
- Supports multi-objective optimization
- Good for “noisy” environments
- Always an answer; answer gets better with time
- Inherently parallel; easily distributed

**Benefits of Genetic Algorithms (cont.)**

- Many ways to speed up and improve a GA-based application as knowledge about the problem domain is gained
- Easy to exploit previous or alternate solutions
- Flexible building blocks for hybrid applications
- Substantial history and range of use

**Uses of GAs**

GAs (and SAs): the algorithms of despair.
Use a GA when:

- you have no idea how to reasonably solve a problem
- calculus doesn't apply
- generation of all solutions is impractical
- but you can evaluate posed solutions

**Problem & Representations**

- Chromosomes represent problems' solutions as genotypes
- They should be amenable to:
  - Creation (spontaneous generation)
  - Evaluation (fitness) via development of phenotypes
  - Modification (mutation)
  - Crossover (recombination)

**How GAs Represent Problems' Solutions: Genotypes**

- Bit strings: this is the most common method
- Strings on small alphabets (e.g., C, G, A, T)
- Permutations (Queens, Salesmen)
- Trees (Lisp programs)

Genotypes must allow for creation, modification, and crossover.

**Bit Strings**

- Bit strings, $(B_0, B_1, \ldots, B_{N-1})$, often represent solutions well and permit easy fitness evaluations
- Individuals are bit strings we call the chromosomes
- Sample problems:
  - Maximize the ones count $= \sum_k B_k$
  - Optimize $f(x)$, letting $x = 0.B_0B_1B_2\ldots B_{N-1}$ in binary
  - Map coloring (2 bits per country color)
  - Music (3 or 4 bits per note)
- Creation, modification, and crossover are easy.

**Sample Problems**

Search for the best bit string pattern for some application, such as "Baby Problem 1": find the bit string with the largest number of 1's. Not very interesting, but this simple problem can prove that the system works. It's a "stub."

**Sample Problems**

Find the maximum value of the function $f(x) = \sin(2x^3)\,\sin(25x)$ for $0 \le x < 1$. The bit string $(b_1, b_2, \ldots, b_n)$ represents $x$ in binary: $x = 0.b_1b_2\ldots b_n = \sum_{k=1}^{n} b_k 2^{-k}$.

**Sample Problems**

Find the maximum value of the function $f(x,y) = \sin(2x^3)\,\cos(\cos(42y))\,\sin(25x) + y^2$ for $0 \le x < 1$ and $0 \le y < 1$. The bit string represents both $x$ and $y$.

**Gleason's Problem**

- Given a “random” matrix, $M$, whose entries are all ±1
- Change all the signs in some rows, and in some columns.
- Try to maximize the number of +1's.

**Hill Climbing Gets Local Optima**

Locate rows or columns with negative sums and invert them. This gets caught in local optima! Example: a 5 × 5 matrix with 17 +1's.
$$
\begin{pmatrix}
+1 & +1 & +1 & -1 & -1 \\
+1 & +1 & +1 & -1 & -1 \\
+1 & +1 & +1 & +1 & +1 \\
-1 & -1 & +1 & +1 & +1 \\
-1 & -1 & +1 & +1 & +1
\end{pmatrix}
$$

Every row and every column has a positive sum. No single inversion can improve it!

**Solution Representation**

- The size of $M$ is $C \times R$
- Use a bit string of length $C + R$: $B = (b_1, \ldots, b_C, b_{C+1}, \ldots, b_{C+R})$
- Meaning of $B$:
  - for $1 \le j \le C$: $b_j = 1$ means invert column $j$
  - for $C + 1 \le j \le C + R$: $b_j = 1$ means invert row $j - C$
- Apply the changes dictated by $B$ to $M$, to get $M'$
- Then the fitness of $B$ is the sum of all the elements in $M'$

**Four Moves Improves**

Invert rows 1 and 2, to lose two +1's.

$$
\begin{pmatrix}
-1 & -1 & -1 & +1 & +1 \\
-1 & -1 & -1 & +1 & +1 \\
+1 & +1 & +1 & +1 & +1 \\
-1 & -1 & +1 & +1 & +1 \\
-1 & -1 & +1 & +1 & +1
\end{pmatrix}
$$

Invert columns 1 and 2, to gain six +1's.

$$
\begin{pmatrix}
+1 & +1 & -1 & +1 & +1 \\
+1 & +1 & -1 & +1 & +1 \\
-1 & -1 & +1 & +1 & +1 \\
+1 & +1 & +1 & +1 & +1 \\
+1 & +1 & +1 & +1 & +1
\end{pmatrix}
$$

**The Structure of My GA Code**

- Start-up
- Main program loop
- Parent and loser selection
- Initialization subroutine (somewhat problem dependent)
- “fv”: the fitness evaluation code (problem dependent)
- Crossover
- Mutation

**GA Main Program: Init**

```c
int main(int argc, char **argv)
{
    int who;

    params(argc, argv);                 /* read the run-time parameters */
    init();                             /* create the initial population */
    for (who = 0; who < POP_SIZE; who++)
        fitness[who] = fv(who);         /* evaluate every individual once */
    printf("End of initial pop\n. . . Now evolve!\n");
    /* . . . main loop goes here . . . */
}
```

**Notes on GA Main Program: Init**

1. The argument gives the name of a param file
2. `params( argc, argv )` processes that
3. `POP_SIZE` is an example of a run-time param

**Initialization Subroutine**

```c
void init(void)
{
    int i, j;

    MAX_HERO = N;                       /* for problem #1 */
    for (i = 0; i < POP_SIZE; i++)
        for (j = 0; j < N; j++)
            p[i][j] = random_int(2);    /* fill every gene at random */
}
```

**Fitness Calculation**

```c
int fv(int who)
{
    int i, the_fitness = 0;

    fitness_evals++;                    /* count every evaluation */
    for (i = 0; i < N; i++)             /* problem #1: count the 1 bits */
        the_fitness += p[who][i];
    if (print_every_fitness)
        printf( "%4d fitness: ... " );  /* rest of the call is elided in the original */
    if (the_fitness > hero) {           /* a new best-so-far individual */
        hero = the_fitness;
        printf( "New hero %4d ... " );  /* rest of the call is elided in the original */
    }
    return the_fitness;
}
```

**GA Main Program: Loop**

```c
for (trial = 0; trial < LOOPS; trial++) {
    if (hero >= MAX_HERO) {             /* stop once the goal is reached */
        printf("Goal reached: %d\n", hero);
        break;
    }
    selection();
    crossover();
    mutation();
    for (who = 0; who < POP_SIZE; who++)
        fitness[who] = fv(who);         /* re-evaluate the population */
}
printf("%d evaluations\n", fitness_evals);
printf("Hero = %d\n", hero);
```

**Selection**

Selects individuals for reproduction:

- randomly, with a probability depending on the relative fitness of the individuals, so that the best ones are chosen for reproduction more often than poor ones
- Proportionate-based selection picks out individuals based upon their fitness values relative to the fitness of the other individuals in the population
- Ordinal-based selection selects individuals based upon their rank within the population, independent of the fitness distribution

**Roulette Wheel Selection**

Here is a common technique: let $F = \sum_{j=1}^{\text{popsize}} \text{fitness}_j$. Select individual $k$ to be a parent with probability $\text{fitness}_k / F$.

**Roulette Wheel Selection**

- assigns to each solution a sector of a roulette wheel whose size is proportional to the appropriate fitness measure
- chooses a random position on the wheel (spin the wheel)

[Figure: roulette wheel with sectors a to g. Fitness: a: 1, b: 3, c: 5, d: 3, e: 2, f: 2, g: 8]

**Roulette Wheel Example**

- For each chromosome, evaluate the fitness and the cumulative fitness
- N times: calculate a random number between 0 and the total fitness, then select the chromosome whose cumulative fitness is the first value greater than that number

| Individual | Chromosome | Fitness | Cumulative |
|---|---|---|---|
| x1 | 101100 | 20 | 20 |
| x2 | 111000 | 7 | 27 |
| x3 | 001110 | 6 | 33 |
| x4 | 101010 | 10 | 43 |
| x5 | 100011 | 12 | 55 |
| x6 | 011011 | 9 | 64 |

**Roulette Wheel Example**

The last two columns list six independent spins and the individual each spin selects.

| Individual | Chromosome | Fitness | Cumulative | Random | Selected individual |
|---|---|---|---|---|---|
| x1 | 101100 | 20 | 20 | 42.8 | x4 |
| x2 | 111000 | 7 | 27 | 19.78 | x1 |
| x3 | 001110 | 6 | 33 | 42.73 | x4 |
| x4 | 101010 | 10 | 43 | 58.44 | x6 |
| x5 | 100011 | 12 | 55 | 27.31 | x3 |
| x6 | 011011 | 9 | 64 | 28.31 | x3 |

**Roulette Wheel Selection**

There are some problems here:

- fitnesses shouldn't be negative
- only useful for max problems
- probabilities should be “right”: avoid skewing by super heroes.

**Parent Selection: Rank**

Here is another technique.

- Order the individuals by fitness rank
- Worst individual has rank 1. Best individual has rank POP_SIZE
- Let $F = 1 + 2 + 3 + \cdots + \text{POP\_SIZE}$
- Select individual $k$ to be a parent with probability $\text{rank}_k / F$

Benefits of rank selection:

- the probabilities are all positive
- the probability distribution is “even”

**Parent Selection: Rank Power**

Yet another technique.

- Order the individuals by fitness rank
- Worst individual has rank 1. Best individual has rank POP_SIZE
- Let $F = 1^s + 2^s + 3^s + \cdots + \text{POP\_SIZE}^s$
- Select individual $k$ to be a parent with probability $\text{rank}_k^s / F$

Benefits:

- the probabilities are all positive
- the probabilities can be skewed to use more “elitist” selection

**Tournament Selection**

- Pick k members of the population at random
- Select one of them in some manner that depends on fitness

**Tournament Selection**

```c
void tournament(int *winner, int *loser)
{
    int size = tournament_size, i, winfit, losefit;

    for (i = 0; i < size; i++) {
        int j = random_int(POP_SIZE);   /* draw one contestant */
        /* The first draw (i == 0) initializes both records;
           the original tested j == 0 here, which looks like a slip. */
        if (i == 0 || fitness[j] > winfit) {
            winfit = fitness[j];
            *winner = j;
        }
        if (i == 0 || fitness[j] < losefit) {
            losefit = fitness[j];
            *loser = j;
        }
    }
}
```

**Crossover Methods**

Crossover is a primary tool of a GA. (The other main tool is selection.)
CROSS_RATE determines whether a chromosome takes part in crossover.

Common techniques for bit string representations:

- One-point crossover: Parents exchange a random prefix
- Two-point crossover: Parents exchange a random substring
- Uniform crossover: Each child bit comes arbitrarily from either parent

(We need more clever methods for permutations & trees.)

**1-point Crossover**

Suppose we have 2 strings a and b, each consisting of 6 variables,

- a1, a2, a3, a4, a5, a6
- b1, b2, b3, b4, b5, b6

representing two solutions to a problem. A crossover point is chosen at random and a new solution is produced by combining the pieces of the original solutions. If the crossover point was 2, the children are:

- a1, a2, b3, b4, b5, b6
- b1, b2, a3, a4, a5, a6

**1-point Crossover**

[Figure: parents and children under 1-point crossover]

**2-point Crossover**

- With one-point crossover the head and the tail of one chromosome cannot be passed together to the offspring
- If both the head and the tail of a chromosome contain good genetic information, none of the offspring obtained directly with one-point crossover will share the two good features
- A 2-point crossover avoids this drawback

[Figure: parents and children under 2-point crossover]

**Uniform Crossover**

- Each gene in the offspring is created by copying the corresponding gene from one or the other parent, chosen according to a randomly generated binary crossover mask of the same length as the chromosomes
- Where there is a 1 in the crossover mask, the gene is copied from the first parent; where there is a 0, the gene is copied from the second parent
- A new crossover mask is randomly generated for each pair of parents

**Uniform Crossover**

[Figure: parents and child under uniform crossover, with crossover mask 1 0 0 1 0 1 1 0]

**Uniform Crossover**

```c
void make_children(int p1, int p2, int c1, int c2)
{
    int i;

    for (i = 0; i < N; i++) {
        if (random_int(2)) {            /* mask bit 1: c1 copies p1, c2 copies p2 */
            p[c1][i] = p[p1][i];
            p[c2][i] = p[p2][i];
        } else {                        /* mask bit 0: the children swap sources */
            p[c1][i] = p[p2][i];
            p[c2][i] = p[p1][i];
        }
    }
}
```

**Another Clever Crossover**

Select three individuals, A, B, and C. Suppose A has the highest fitness and C the lowest. Create a child like this.
```c
for (i = 0; i < length; i++) {
    if (A[i] == B[i])
        child[i] = A[i];        /* the two fitter parents agree: keep their bit */
    else
        child[i] = 1 - C[i];    /* they disagree: take the opposite of C's bit */
}
```

We just suppose C is a “bad example.”

**Crossover Methods & Schemas**

- Crossovers try to combine good schemas in the good parents.
- The schemas are the good genes, building blocks to gather.
- The simplest schemas are substrings.
- 1-point & 2-point crossovers preserve short substring schemas.
- Uniform crossover is uniformly hostile to all kinds of schemas.

**Crossover for Permutations (A Tricky Issue)**

Small-alphabet techniques fail. Some common methods are:

- OX: ordered crossover
- PMX: partially matched crossover
- CX: cycle crossover

We will address these and others later.

**Crossover for Trees**

- These trees often represent computer programs. Think Lisp.
- Interchange randomly chosen subtrees of parents.

**Mutation: Preserve Genetic Diversity**

- Mutation is a minor GA tool.
- It provides the opportunity to reach parts of the search space which perhaps cannot be reached by crossover alone.
- Without mutation we may get premature convergence to a population of identical clones
  - mutation helps the exploration of the whole search space by maintaining genetic diversity in the population
  - each gene of a string is examined in turn and, with a small probability, its current allele is changed
  - 011001 could become 010011 if the 3rd and 5th alleles are mutated

**Mutate Strings & Permutations**

- Bit strings (or small alphabets)
  - Flip some bits
  - Reverse a substring (nature does this)
- Permutations
  - Transpose some pairs
  - Reverse a substring
- Trees . . .

**Mutation**

Randomize each bit with probability MUT_RATE

```c
void mutate(int who)
{
    int j;

    for (j = 0; j < N; j++) {
        if (MUT_RATE > drand48())       /* true with probability MUT_RATE */
            p[who][j] = random_int(2);  /* re-randomize the bit (it may keep its old value) */
    }
}
```

**Mutation**

Mutation rate determines the probability that a mutation will occur.
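As a small worked check on the `mutate` routine above: a selected bit is re-randomized rather than flipped, so it actually changes only about half the time. Assuming `random_int(2)` returns 0 or 1 with equal probability (the slides do not state this), the per-bit change probability and the expected number of changed bits in a chromosome of length $N$ would be:

$$
\begin{aligned}
P(\text{bit changes}) &= \text{MUT\_RATE} \cdot \tfrac{1}{2}, \\
E[\text{changed bits}] &= N \cdot \text{MUT\_RATE} \cdot \tfrac{1}{2}.
\end{aligned}
$$

For example, with $N = 100$ and MUT_RATE $= 0.02$, about $100 \cdot 0.02 \cdot \tfrac{1}{2} = 1$ bit changes per call on average.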
Mutation is employed to give new information to the population, and it also prevents the population from becoming saturated with similar chromosomes. Large mutation rates increase the probability that good schemata will be destroyed, but increase population diversity. The best mutation rate is application dependent, but for most applications it is between 0.001 and 0.1.

**Mutation**

Some researchers have published "rules of thumb" for choosing the best mutation rate based on the length of the chromosome and the population size.

- DeJong suggested that the mutation rate be inversely proportional to the population size; a rate of $1/L$, where $L$ is the length of the chromosome, is also commonly used.
- Hesser and Männer suggest that the optimal mutation rate is approximately $(M \cdot L^{1/2})^{-1}$, where $M$ is the population size and $L$ is the length of the chromosome.

**Recombination vs Mutation**

- Recombination
  - modifications depend on the whole population
  - decreasing effects with convergence
  - exploitation operator
- Mutation
  - mandatory to escape local optima
  - strong causality principle
  - exploration operator
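
To see recombination and mutation working together, the loop from the opening slide (run a couple of tournaments, let the winners breed, mutate the children, house them in the losers' slots) can be sketched as one small C file for "Baby Problem 1". This is a sketch under stated assumptions, not the author's program: the constants, the little `random_int` generator, and the `run_ga` entry point are all made up for illustration. Children are built in temporary arrays so that a parent is never overwritten while it is still being read.

```c
#include <assert.h>

#define N 32                 /* chromosome length (assumed value) */
#define POP_SIZE 20          /* population size (assumed value) */
#define TOURNAMENT_SIZE 3    /* individuals sampled per tournament (assumed) */
#define MUT_PER_10000 200    /* MUT_RATE = 0.02, kept as an integer (assumed) */
#define MAX_STEPS 5000       /* computing budget (assumed) */

static int p[POP_SIZE][N];   /* the population of bit strings */
static int fitness[POP_SIZE];
static unsigned long rng_state;

/* Hypothetical stand-in for the slides' random_int(): a small linear
   congruential generator, chosen only so that runs are repeatable. */
static int random_int(int n)
{
    rng_state = rng_state * 1103515245UL + 12345UL;
    return (int)((rng_state >> 16) & 0x7fffUL) % n;
}

/* "Baby Problem 1": fitness is the number of 1 bits. */
static int fv(int who)
{
    int i, f = 0;
    for (i = 0; i < N; i++)
        f += p[who][i];
    return f;
}

/* Sample TOURNAMENT_SIZE individuals; report the fittest and the least fit. */
static void tournament(int *winner, int *loser)
{
    int i;
    for (i = 0; i < TOURNAMENT_SIZE; i++) {
        int j = random_int(POP_SIZE);
        if (i == 0 || fitness[j] > fitness[*winner]) *winner = j;
        if (i == 0 || fitness[j] < fitness[*loser])  *loser = j;
    }
}

/* Runs the GA from the given seed; returns the best fitness ever seen. */
int run_ga(unsigned long seed)
{
    int who, i, step, hero = 0;

    rng_state = seed;
    for (who = 0; who < POP_SIZE; who++) {      /* 1. obtain several creatures */
        for (i = 0; i < N; i++)
            p[who][i] = random_int(2);
        fitness[who] = fv(who);
        if (fitness[who] > hero) hero = fitness[who];
    }
    for (step = 0; step < MAX_STEPS && hero < N; step++) {
        int w1, l1, w2, l2, c1[N], c2[N];

        tournament(&w1, &l1);                   /* (a) a couple of tournaments */
        tournament(&w2, &l2);
        for (i = 0; i < N; i++) {
            int take_first = random_int(2);     /* (b) uniform crossover */
            c1[i] = take_first ? p[w1][i] : p[w2][i];
            c2[i] = take_first ? p[w2][i] : p[w1][i];
            /* (c) re-randomize each child bit with probability MUT_RATE */
            if (random_int(10000) < MUT_PER_10000) c1[i] = random_int(2);
            if (random_int(10000) < MUT_PER_10000) c2[i] = random_int(2);
        }
        for (i = 0; i < N; i++) {               /* (d) children replace losers */
            p[l1][i] = c1[i];
            p[l2][i] = c2[i];
        }
        fitness[l1] = fv(l1);                   /* test the children */
        fitness[l2] = fv(l2);
        if (fitness[l1] > hero) hero = fitness[l1];
        if (fitness[l2] > hero) hero = fitness[l2];
    }
    assert(hero <= N);   /* sanity check: a ones count cannot exceed N */
    return hero;
}
```

Calling `run_ga(1UL)` returns the best ones count found for that seed, and the same seed always reproduces the same run. The loop stops early once a perfect individual appears and otherwise gives up when the step budget is exhausted, matching two of the stopping rules listed at the start.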