MAE 552 Heuristic Optimization

MAE 552 Heuristic Optimization Instructor: John Eddy Lecture #12 2/18/02 Intro to Evolutionary Algorithms

So Far: • Up until now, all the algorithms discussed operated on a single “current” design. • Simulated Annealing – randomized perturbations of a single point. • Greedy Algorithms – maximized local improvement.

Now: • Consider operating on and maintaining an entire “population” of points simultaneously. • So what? It would be easier to just run my single point algorithm many times or maybe on multiple processors to save wall clock time.

Advantage • We can now simulate the processes of natural selection and competition within our population. We can have our candidate designs fight for places in the population of future generations (iterations).

Evolutionary Algorithms • Date back to the 1950’s. • Many researchers independently developed different versions. • Examples are: • Genetic Algorithms, Evolution Strategies, Evolutionary Programming.

Basic Terminology Most of the terminology is borrowed from Biology • Phenotype: the "outward, physical manifestation" of an organism. The physical parts, the sum of the atoms, molecules, macromolecules, cells, structures, metabolism, energy utilization, tissues, organs, reflexes and behaviors; anything that is part of the observable structure, function or behavior of a living organism. • Genotype: This is the "internally coded, heritable information" carried by all living organisms. This stored information is used as a "blueprint" or set of instructions for building and maintaining a living creature.

Basic Terminology • Gene: The fundamental physical and functional unit of heredity. A gene is an ordered sequence of nucleotides located in a particular position on a particular chromosome that encodes a specific functional product. • Chromosome: The self-replicating genetic structures of cells each containing the entire genome of an organism.

Basic Terminology • Alleles: Alternative forms of a genetic locus. • Crossing Over: The breaking during meiosis of one maternal and one paternal chromosome, the exchange of corresponding sections of DNA, and the rejoining of the chromosomes. This process can result in an exchange of alleles between chromosomes. • Mutation: A heritable change in the genetic makeup of an organism.

Important Note We are not constrained by any of the rules of biological systems. For example, we can have as many parents as we wish contribute to the makeup of our offspring, we can have members that live forever (don’t age). What is important to note here is that we are using nature as a model for our mathematical algorithms.

5 Basic Components • An encoded representation of solutions to the problem. • Ex. binary encoding, real number encoding, integer encoding, data structure encoding. • A means of generating an initial population. • Ex. random initialization, patterned initialization. • A means of evaluating design fitness. • Need a consistent means of determining which designs are “better” than others. • Operators for producing and selecting new designs. • Ex. selection, crossover, mutation. • Values for the parameters of the algorithm. • Ex. How much crossover and mutation, how big is population.

General Approach • General equation describing most evolutionary algorithms is: x[t + 1] = s(v(x[t])) Where: x[t] is the population at time t, v(·) is/are the variation operator(s), and s(·) is the selection operator.
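
A minimal sketch of this loop, assuming the usual form x[t+1] = s(v(x[t])). The toy problem (maximize the number of 1 bits), the bit-flip variation, and the truncation selection are all illustrative assumptions, not operators prescribed by the lecture.

```python
import random

_rng = random.Random(0)  # fixed seed so the sketch is repeatable


def evolve(population, vary, select, generations):
    """Repeatedly apply x[t+1] = s(v(x[t]))."""
    for _ in range(generations):
        population = select(vary(population))
    return population


def fitness(design):
    """Hypothetical fitness: number of 1 bits in the design."""
    return sum(design)


def vary(population):
    """v(.): keep every parent and add one single-bit-flipped copy of each."""
    children = []
    for parent in population:
        child = list(parent)
        i = _rng.randrange(len(child))
        child[i] = 1 - child[i]
        children.append(child)
    return population + children


def select(candidates, size=4):
    """s(.): keep the `size` fittest candidates."""
    return sorted(candidates, key=fitness, reverse=True)[:size]


start = [[0] * 8 for _ in range(4)]
final = evolve(start, vary, select, generations=50)
```

Because `vary` returns the parents along with the children, the best fitness in the population can never decrease from one generation to the next; that is a design choice of this sketch, not a requirement of the general equation.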

Encoding • Vectors of integers. • Useful for TSP, integer problems. Possible Trips (where return home is implied): [ 1 8 6 5 2 3 4 7 ] [ 8 2 5 6 3 1 7 4 ] [ 2 4 6 3 7 5 1 8 ] [Figure: TSP map with cities numbered 1 through 8]

Encoding • Vectors of real numbers. • Useful for continuous problems. Possible Design Configurations: [ 13.65, -1.25, 30.98 ] [ 0.67, 14.81, 67.15 ] [ 53.74, 12.54, -21.32 ] Min: f(x) = x[1]^2 + x[2] - x[3]^3 - 50 s.t. g(x) ≤ 0, h(x) = 0, x_l ≤ x ≤ x_u

Encoding • Vectors of binary bits. • Useful for packing and shipping problems. What’s in the bag: [ 0 1 1 0 ] [ 1 0 1 0 ] [ 1 1 1 1 ] [Figure: items numbered 1 through 4]

Encoding • Combination of the previous types. • Useful for variable-length lists (perhaps a list of continuous numbers where an integer indicates the size of the list). Example: What do we do? [Figure: plot of y versus t] Form: [ N, a1, a2, …, aN ] Possible Solutions: [ 3, 2.65, 4.25, 3.14 ] [ 2, 5.32, 2.81 ] [ 4, 3.21, 4.25, 9.65, 7.28 ]

Encoding • Symbolic expressions. • Useful for mapping problems (control problems, etc.). Typically requires a parser; the data structure is typically a tree. Combine: *, +, -, /, % (mod), sin, cos, tan, etc. With: [figure] Possible Solutions: [figure]

Encoding Common Method It is common to use binary encoding for problems involving integer, real, and binary type variables. Previously we saw that vectors of bits may be useful for problems involving binary state variables (T, F), such as "is item 1 in the bag?"

Common Method How, in a minute; first, why? Primarily because of flexibility (it handles many types of variables) and because we can take advantage of the way that computers work. Also, this method of encoding lends itself to a number of common variation operators, as we will see.

Common Method A note on flexibility: Flexibility usually comes at the expense of optimality. In our case, this method may not work best for many of the problems we do, but it should work fairly well for many of them. Specialization on a problem by problem basis will usually improve performance.

Binary Encoding: How? Each of our design variable values may be represented as a vector of 1’s and 0’s. For example:
  Binary      Decimal
  00000000    0
  00101101    45
  11111111    255
  1101.11     13.75

Binary Encoding: How? Therefore, since our design is defined by the collection of its variables, the design can be written as a long string of bits. Example: For a design [ 4, 6, 2 ] We can equivalently write [ 0100, 0110, 0010 ]
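
The mapping between a design and its bit string can be sketched as follows. The function names and the fixed width of 4 bits per variable are assumptions chosen to match the [ 4, 6, 2 ] example.

```python
def encode(design, bits=4):
    """Write each non-negative integer variable as a fixed-width bit string."""
    return [format(value, f"0{bits}b") for value in design]


def decode(bit_strings):
    """Convert each bit string back to its integer value."""
    return [int(b, 2) for b in bit_strings]


def to_chromosome(design, bits=4):
    """Concatenate the per-variable bit strings into one long string."""
    return "".join(encode(design, bits))


genes = encode([4, 6, 2])            # ['0100', '0110', '0010']
chromosome = to_chromosome([4, 6, 2])  # '010001100010'
```

A fixed width per variable is what lets the long string be cut back into its variables later; values that need more than `bits` bits would produce longer strings and break that assumption.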

Binary Encoding Back to why. We will not likely be doing this by hand. We will probably use a computer. All numbers in a computer are represented as a string of bits (1’s and 0’s). We can take advantage of this.

Binary Encoding Because of this, it is not necessary to explicitly create vectors of bits to represent our design variables. Advantages (assuming we decided on BE): • Memory efficient • 32 bit integer value requires 4 bytes instead of what would be a minimum of 32 bytes otherwise.

Binary Encoding Advantages (vs. explicit vectors of bits): • Code simplification • No explicit conversion from binary to decimal is necessary for use of the design variables. • Most languages support operation directly on the bits of integers.
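
As a sketch of operating directly on the bits of an integer, with no explicit bit vector: the helper names below are illustrative assumptions, and positions are counted from the least significant bit (position 0).

```python
def get_bit(value, position):
    """Return the bit (0 or 1) at `position`, counting from the right."""
    return (value >> position) & 1


def flip_bit(value, position):
    """Return `value` with the bit at `position` inverted."""
    return value ^ (1 << position)


x = 0b00101101                 # 45, stored as an ordinary integer
lowest = get_bit(x, 0)         # 1
flipped = flip_bit(x, 1)       # 0b00101111 == 47
```

The design variable stays usable as a plain number at all times; the shift, AND, and XOR operators reach the individual bits only when a variation operator needs them.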

Binary Encoding Disadvantages: • Most bitwise operators only work on integral types and probably most of our variables are real. • Solution: Specify a precision with which to keep each design variable and convert to a long integer before any bit manipulation.

Binary Encoding Example of conversion using precision. Given X1 = 12.6345 and a desired precision of 3 decimal places: X1-int = (int)[(12.6345)(10^3)] = 12634. To improve accuracy, perhaps round X1 prior to conversion. To get back the value (to the chosen precision, here 12.634), simply divide X1-int by 10^3.

Binary Encoding In general: Xi-int = Xi * 10^prec (truncated). Xi = Xi-int / 10^prec
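
The two conversions can be sketched as below; `to_int` and `from_int` are hypothetical names, and `prec` is the number of decimal places kept.

```python
def to_int(x, prec):
    """Scale by 10**prec and truncate toward zero."""
    return int(x * 10**prec)


def from_int(x_int, prec):
    """Undo the scaling; digits beyond `prec` places are not recovered."""
    return x_int / 10**prec


x1_int = to_int(12.6345, 3)      # 12634
x1_back = from_int(x1_int, 3)    # 12.634
```

Because real values are stored in binary floating point, a product that should be a whole number can land slightly below it and truncate one unit low, which is one more reason to consider rounding before conversion as suggested above.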

Initial Population • Quite simply, the population may be initialized in any way you wish. • Some Possibilities: Random, Patterned. [Figure: random and patterned initial populations plotted in the x1-x2 plane]
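
A sketch of the two possibilities for real-valued designs. Treating "patterned" as a regular grid is an assumption; the function names and the box bounds `lower` and `upper` are also illustrative.

```python
import itertools
import random


def random_population(size, lower, upper, rng):
    """Draw `size` designs uniformly at random inside the bounds."""
    return [
        [rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
        for _ in range(size)
    ]


def grid_population(points_per_axis, lower, upper):
    """Place designs on a regular grid (needs points_per_axis >= 2)."""
    axes = []
    for lo, hi in zip(lower, upper):
        step = (hi - lo) / (points_per_axis - 1)
        axes.append([lo + i * step for i in range(points_per_axis)])
    return [list(point) for point in itertools.product(*axes)]


rng = random.Random(1)
random_pop = random_population(10, [0.0, 0.0], [1.0, 1.0], rng)
grid_pop = grid_population(3, [0.0, 0.0], [1.0, 1.0])  # 3 x 3 = 9 designs
```

The grid size grows as points_per_axis raised to the number of variables, so a patterned start of this kind is practical only for a few design variables.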

Fitness Evaluation • It is necessary to provide a consistent means of evaluating the fitness of a design. A ≥ B ≥ C implies A ≥ C (those who have studied utility theory and preference ranking know that this does not always hold)

Fitness Evaluation • The closer to optimal a point is, the better its fitness should be (provides direction). Consider the case of functions of binary bits xi = {0, 1} Maximize: What will be the result of these two functions (what is the difference)?

Fitness Evaluation • How can we avoid this Problem? • In this particular case, perhaps we can make our fitness value a count of the number of DV’s with a value of 1. • This is a very problem specific question and will usually require knowledge about the problem.
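
To illustrate why a count of 1-valued design variables gives direction, the sketch below compares it with a hypothetical all-or-nothing objective (the product of the bits). The product objective is an assumption made only for contrast.

```python
from math import prod


def all_or_nothing(bits):
    """Hypothetical objective: 1 only when every bit is 1, otherwise 0."""
    return prod(bits)


def count_ones(bits):
    """Fitness as the number of design variables equal to 1."""
    return sum(bits)


near_miss = [1, 1, 0, 1]
far_off = [0, 0, 0, 1]

flat = (all_or_nothing(near_miss), all_or_nothing(far_off))   # (0, 0)
graded = (count_ones(near_miss), count_ones(far_off))         # (3, 1)
```

Under the product, both designs look equally bad, so selection has nothing to prefer; under the count, the design closer to all ones scores higher.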

Fitness Evaluation • Important concept is that fitness is not limited to the objective function value and commonly is not. • Creative measures of fitness can greatly improve the performance of the algorithm and may have a strong dependency on the choice of encoding and use of operators.

Variation Operators • The variation operators provide means of generating new designs. • They should be set up to leverage information discovered in previous design evaluations.

Variation Operators • Choice of variation operators is tightly coupled with choice of encoding (as we will see as we progress). • Problems are most efficiently solved when the proper operators are chosen and tailored to the problem at hand.

Variation Operators - Crossover • Crossover is the inclusion of or combination of “genetic” material from one or more designs to create new designs. (recall biological definition). • Appropriate choice of a crossover strategy is highly dependent on choice of encoding and evaluation function.

Crossover on Vectors of Integers
• Consider a plain integer problem (not TSP) and two possible designs.
• Could probabilistically choose values from the vectors:
  X1 = [ 10, 15, 9, 7, 19 ]
  X2 = [ 17, 2, 14, 31, 3 ]
  C1 = [ 10, 2, 14, 7, 19 ]
  C2 = [ 17, 15, 9, 31, 3 ]
• Could choose 1 or more random crossover point(s):
  X1 = [ 10, 15, 9, 7, 19 ]
  X2 = [ 17, 2, 14, 31, 3 ]
  C1 = [ 10, 15, 14, 31, 3 ]
  C2 = [ 17, 2, 9, 7, 19 ]
• Could increment or decrement each value according to a Gaussian distribution with a mean of zero; the standard deviation would then be a measure of the probability of large changes.
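
The three strategies can be sketched as follows. The function names, the 50/50 swap probability, and rounding the Gaussian step to the nearest integer are assumptions; the crossover point is passed in explicitly so the slide's example is reproducible, whereas in practice it would be drawn at random.

```python
import random


def uniform_crossover(x1, x2, rng):
    """At each position, swap the parents' values with probability 0.5."""
    c1, c2 = [], []
    for a, b in zip(x1, x2):
        if rng.random() < 0.5:
            a, b = b, a
        c1.append(a)
        c2.append(b)
    return c1, c2


def one_point_crossover(x1, x2, point):
    """Exchange everything from index `point` onward."""
    return x1[:point] + x2[point:], x2[:point] + x1[point:]


def gaussian_perturb(x, sigma, rng):
    """Add a zero-mean Gaussian step, rounded to an integer, to each value."""
    return [v + round(rng.gauss(0, sigma)) for v in x]


X1 = [10, 15, 9, 7, 19]
X2 = [17, 2, 14, 31, 3]
C1, C2 = one_point_crossover(X1, X2, 2)
```

With `point = 2` this gives the same children as the crossover-point example above. A larger `sigma` makes large increments more likely.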

Crossover on Vectors of Reals
• Could use same strategies listed for ints.
• Another approach is to use arithmetic crossover (taken from convex set theory).
• Basic Equations:
  C1 = λ1 X1 + λ2 X2
  C2 = λ1 X2 + λ2 X1
This is a weighted average approach:
  Convex Combination: λ1 + λ2 = 1 and λ1, λ2 > 0
  Affine Combination: λ1 + λ2 = 1
  Linear Combination: λ1, λ2 any real numbers
Note: there are many other approaches you can read about.
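
A minimal sketch of arithmetic crossover. The function name is an assumption, as is the convenience of defaulting λ2 to 1 - λ1, which yields an affine combination (convex when 0 < λ1 < 1).

```python
def arithmetic_crossover(x1, x2, lam1, lam2=None):
    """Return C1 = lam1*X1 + lam2*X2 and C2 = lam1*X2 + lam2*X1."""
    if lam2 is None:
        lam2 = 1.0 - lam1  # weights sum to 1
    c1 = [lam1 * a + lam2 * b for a, b in zip(x1, x2)]
    c2 = [lam1 * b + lam2 * a for a, b in zip(x1, x2)]
    return c1, c2


X1 = [13.65, -1.25, 30.98]
X2 = [0.67, 14.81, 67.15]
C1, C2 = arithmetic_crossover(X1, X2, 0.3)
```

With a convex combination every child value lies between the corresponding parent values, so the children stay inside any box bounds the parents satisfy. Affine or general linear weights can place children outside that range.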

Crossover on Vectors of Bits First Define Hamming Distance: Given 2 expressions, the Hamming distance is the number of characters that must be changed to make the expressions equivalent. So DH for 01101 and 10100 is 3 and DH for 01111 and 10000 is 5 We will consider this later.
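
The Hamming distance can be computed as sketched below, either on bit strings or directly on integers. Both function names are illustrative.

```python
def hamming_distance(a, b):
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("expressions must have the same length")
    return sum(ca != cb for ca, cb in zip(a, b))


def hamming_distance_int(a, b):
    """Same count for non-negative integers: the 1 bits of a XOR b."""
    return bin(a ^ b).count("1")


d1 = hamming_distance("01101", "10100")   # 3
d2 = hamming_distance("01111", "10000")   # 5
```

The integer version fits the binary encoding discussed earlier, where design variables are kept as integers rather than explicit bit vectors.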

Crossover on Vectors of Bits • Could use 1st two strategies for ints. (3rd strategy would not make sense). • Could iterate through vector and according to a given probability, change the bits.

Good Crossover for BE • BE lends itself well to Parameterized Crossover (we have already seen a parameterized approach) • Because each variable (parameter) is in itself a string of bits, we can operate on each variable separately.

Good Crossover for BE • Example Single Point Parameterized: • Given two designs [ 0100, 0110, 0010 ] = [ 4, 6, 2 ] [ 1010, 0111, 1111 ] = [ 10, 7, 15 ]

Good Crossover for BE • Example Single Point Parameterized: • Choose a crossover point for each variable [ 0 100, 01 10, 001 0 ] = [ 4, 6, 2 ] [ 1 010, 01 11, 111 1 ] = [ 10, 7, 15 ] • Then perform crossover as before.

Good Crossover for BE • Example Single Point Parameterized: • Results for this case:
Parents:
  [ 0 100, 01 10, 001 0 ] = [ 4, 6, 2 ]
  [ 1 010, 01 11, 111 1 ] = [ 10, 7, 15 ]
Children:
  [ 0010, 0111, 0011 ] = [ 2, 7, 3 ]
  [ 1100, 0110, 1110 ] = [ 12, 6, 14 ]
• Notice that one child tends to be like one parent and the other tends to be like the other parent.
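
Single-point parameterized crossover can be sketched with integer masks, so no explicit bit vectors are needed. The function names and the 4-bit width are assumptions; the crossover point counts bits from the left, matching the gaps drawn in the example, and the second parent's bits 1010, 0111, 1111 are the integers 10, 7, 15.

```python
def single_point(a, b, point, bits=4):
    """Cross two `bits`-wide integers after the first `point` bits."""
    low_mask = (1 << (bits - point)) - 1        # bits right of the cut
    high_mask = ((1 << bits) - 1) ^ low_mask    # bits left of the cut
    child1 = (a & high_mask) | (b & low_mask)   # head of a, tail of b
    child2 = (b & high_mask) | (a & low_mask)   # head of b, tail of a
    return child1, child2


def parameterized_crossover(p1, p2, points, bits=4):
    """Apply single-point crossover to each variable separately."""
    pairs = [single_point(a, b, pt, bits) for a, b, pt in zip(p1, p2, points)]
    return [c1 for c1, _ in pairs], [c2 for _, c2 in pairs]


parent1 = [0b0100, 0b0110, 0b0010]
parent2 = [0b1010, 0b0111, 0b1111]
child1, child2 = parameterized_crossover(parent1, parent2, points=[1, 2, 3])
```

With cut points 1, 2, and 3 this yields the children [ 2, 7, 3 ] and [ 12, 6, 14 ]. Each child keeps the high-order bits of one parent, which is why it tends to resemble that parent numerically.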

Crossover on Combinations A few of the strategies we mentioned can be used as is (e.g., random picking). However, for maximum efficiency, this case will likely require a combination of the aforementioned crossover strategies and highly specialized operators (very problem dependent).

Crossover on Symbolic Exps. • Assume using tree structures • Now, crossover can occur by combining the branches of one tree with another.
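
A sketch of branch exchange between two expression trees. Storing each tree as a nested list `[operator, operand, ...]` and addressing a branch by a tuple of indices are assumptions made for illustration; the branch positions are given explicitly here, whereas an algorithm would normally pick them at random.

```python
import copy


def get_subtree(tree, path):
    """Follow a sequence of child indices down from the root."""
    for index in path:
        tree = tree[index]
    return tree


def set_subtree(tree, path, new_branch):
    """Replace the branch at `path` in place and return the tree."""
    if not path:
        return new_branch
    parent = get_subtree(tree, path[:-1])
    parent[path[-1]] = new_branch
    return tree


def subtree_crossover(t1, t2, path1, path2):
    """Swap the branch at path1 in t1 with the branch at path2 in t2."""
    c1, c2 = copy.deepcopy(t1), copy.deepcopy(t2)
    branch1 = copy.deepcopy(get_subtree(t1, path1))
    branch2 = copy.deepcopy(get_subtree(t2, path2))
    return set_subtree(c1, path1, branch2), set_subtree(c2, path2, branch1)


tree1 = ["+", "x", ["*", "y", 2]]     # x + (y * 2)
tree2 = ["-", ["sin", "x"], 3]        # sin(x) - 3
kid1, kid2 = subtree_crossover(tree1, tree2, (2,), (1,))
```

Here `kid1` represents x + sin(x) and `kid2` represents (y * 2) - 3. The parents are deep-copied first so they remain available unchanged.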

Crossover on Symbolic Exps. [Figure: expression tree with root node +]
