
Relationships between Evolutionary Computation & Biology


Presentation Transcript


  1. Relationships between Evolutionary Computation & Biology. Focal Points: Combining Evolution & Learning in Hybrid Adaptive Systems; Exploiting Coevolution in Evolutionary Computation.

  2. Biology in Evolutionary Computation • Basic Neo-Darwinian Evolution • The essence of most EAs • Genotype-Phenotype Distinction • Developmental processes that derive ptype from gtype • Fundamental Theoretical Results • Hardy-Weinberg Law • Fisher’s Theorem • Combinations of Evolution & Learning • Lamarckianism • The Baldwin Effect • Coevolution

  3. Diploid Genetics [Diagram: a chromosome pair, A b C d over a B C d; heterozygous for genes 1 & 2, homozygous for genes 3 & 4] • For each gene, the allele pair will interact in some way to produce a protein (phenotypic trait). • In some cases, one allele “dominates” while the other is “recessive” => one allele is completely expressed; the other is ignored. • In other cases, the two alleles have a combined effect.

  4. Diploid Recombination (Sex) [Diagram: crossover within each parent's chromosome pair in generation K (e.g. a b C D and w x Y Z) produces a gamete pool; one gamete from each parent combines to form a chromosome pair of generation K+1]

  5. Population Genetics Notation
  Gene pool (population size N = 8): AA AA AA aa AA aa aa Aa
  Gene frequencies: p = fraction of A's = 9/16; q = fraction of a's = 1 - p = 7/16
  Genotype numbers: N_AA = 4, N_Aa = 1, N_aa = 3
  Genotype frequencies: D = fraction of AA's = N_AA/N = 4/8; H = fraction of Aa's = N_Aa/N = 1/8; R = fraction of aa's = N_aa/N = 3/8
  p = (2 N_AA + N_Aa) / 2N = D + H/2; q = (2 N_aa + N_Aa) / 2N = R + H/2
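As a concrete illustration, here is a minimal Python sketch of these frequency calculations; the string representation of individuals ('AA', 'Aa', 'aa') and the function name are choices made for this example, not part of the original slides.

```python
from collections import Counter

def genotype_and_allele_freqs(population):
    """Genotype frequencies (D, H, R) and allele frequencies (p, q) for a
    one-locus, two-allele diploid population given as strings like 'AA', 'Aa', 'aa'."""
    n = len(population)
    counts = Counter(''.join(sorted(g)) for g in population)  # normalise 'aA' -> 'Aa'
    D = counts['AA'] / n            # fraction of AA homozygotes
    H = counts['Aa'] / n            # fraction of heterozygotes
    R = counts['aa'] / n            # fraction of aa homozygotes
    p = D + H / 2                   # frequency of allele A
    q = R + H / 2                   # frequency of allele a (p + q == 1)
    return D, H, R, p, q

# The slide's example population of N = 8: four AA, one Aa, three aa.
pop = ['AA', 'AA', 'AA', 'AA', 'Aa', 'aa', 'aa', 'aa']
print(genotype_and_allele_freqs(pop))   # (0.5, 0.125, 0.375, 0.5625, 0.4375)
```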

  6. The Hardy-Weinberg Law
  Given: initial genotype distribution D0, H0, R0 => p0, q0
  Assume:
  • Random mating
  • No mutation
  • No selection - all genotypes have the same fitness
  • No genetic drift - population size large enough that the mating probabilities and child genotype distributions equal the statistical estimates
  Results:
  • D1 = p0^2, H1 = 2 p0 q0, R1 = q0^2
  • p1 = p0, q1 = q0
  • D, H, R values remain constant after generation 1.
  • Gene and genotype frequencies are at a stable equilibrium.
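A small sketch, under the same one-locus, two-allele setup as above, of a single Hardy-Weinberg random-mating step; it reproduces the result that genotype frequencies settle to p0^2, 2 p0 q0, q0^2 while the allele frequencies stay fixed.

```python
def hardy_weinberg_step(D0, H0, R0):
    """One generation of random mating with no mutation, selection, or drift:
    returns the new genotype frequencies and the (unchanged) allele frequencies."""
    p0 = D0 + H0 / 2
    q0 = R0 + H0 / 2
    D1, H1, R1 = p0**2, 2 * p0 * q0, q0**2
    p1, q1 = D1 + H1 / 2, R1 + H1 / 2   # equals p0, q0: equilibrium after one step
    return D1, H1, R1, p1, q1

# Starting from the slide-5 population: genotype freqs change once, allele freqs do not.
print(hardy_weinberg_step(0.5, 0.125, 0.375))
# (0.3164..., 0.4921..., 0.1914..., 0.5625, 0.4375)
```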

  7. Lessons for Evolutionary Computation • Diploid EAs can be useful • Allow recessive traits to linger in the population without always being expressed. If the environment (problem) changes such that expression of the recessive gene is desirable, then aa’s will have a selective advantage and rise in frequency. • Without diploidy, we would have to wait for the proper mutation to create an a. (A minimal expression sketch follows below.) • We get recessiveness for free in GP, since some subtrees may not be expressed in one individual, but when combined with other code, they get executed. E.g. (if nil SubtreeA SubtreeB) • Don’t forget Hardy, Weinberg & Fisher! • Without selective pressure and mutation, evolution can quickly stagnate, even when random recombination (i.e., crossover) is used. • Evolutionary speed is a function of fitness variance. • Fitness functions need to have a wide range of outputs. • Selection mechanisms need to maintain selection pressure, or create it in cases where all individuals have similar fitness.
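Here is the expression sketch referred to above: one minimal, illustrative way a diploid EA could map each allele pair to a phenotypic trait with simple dominance, so recessive values persist in the genotype without being expressed. The representation and helper name are assumptions, not a specific published scheme.

```python
def express_diploid(chromosome_a, chromosome_b, dominance):
    """Express a diploid genotype as a phenotype.
    dominance[i] is the dominant allele value at locus i; a recessive value
    is only expressed when both chromosomes carry it."""
    phenotype = []
    for a, b, dom in zip(chromosome_a, chromosome_b, dominance):
        phenotype.append(a if a == b else dom)   # heterozygous locus -> dominant allele
    return phenotype

# The second locus hides a recessive 0 that can be re-expressed later
# (if the problem changes) without waiting for a fresh mutation.
print(express_diploid([1, 0, 1], [1, 1, 1], dominance=[1, 1, 1]))  # [1, 1, 1]
```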

  8. Learning + Evolution • Individuals improve during their lifetimes (plasticity/learning) & populations improve (avg fitness increases) over many generations. • Learning speeds up evolution by: • Lamarckianism: Direct inheritance of acquired traits • Results of learning (the improved phenotype) are reverse-encoded into the genome prior to reproduction => children inherit what their parents learned/acquired. • Disproven biologically, but useful for EC. • The Baldwin Effect: Indirect inheritance of acquired traits • Some genotypes are not inherently optimal, but their corresponding phenotypes can become so via learning. Hence, these genotypes have a selective advantage and will remain in the population until a mutation produces optimal genotypes. • No ptype-to-gtype reverse encoding necessary. • Accepted by biologists, but difficult to test. • Useful in EC.

  9. Lamarck -vs- Baldwin • Direct -vs- Indirect inheritance of acquired characteristics [Diagram: an acquired trait (the blue 3-fingered component) appears in the phenotype; under Lamarck it is written back into the genotype in 1 generation, while under Baldwin its selective advantage plus mutation moves it into the genotype over many generations]

  10. Lamarckianism • Jean-Baptiste Lamarck (1744-1829) • Earliest advocate of an evolutionary theory • Inheritance of Acquired Characteristics (1809) Basic Idea: If organisms adapt during their lifetimes, then those changes can be DIRECTLY passed on to their offspring (inheritance of acquired traits). Modern Implications: The results of individual plasticity, such as learning or physical changes, can be reverse-encoded into the genome prior to reproduction, so children inherit the fruits of their parents’ plasticity as innate traits. Biologically disproven, except in a few rare cases.

  11. Contributions of Lamarckianism 3 Things that Lamarck may have gotten right (Schull, 1996): 1. Individual plasticity can influence evolution. 2. Similar individual & populational adaptivity. 3. Collective activity is relevant to evolutionary theory. Agents behave (and learn) purposely, so these collective activities could have an emergent effect upon the course of evolution. Lamarck’s main mistake is still very useful: Lamarckian inheritance can improve evolutionary computation!!

  12. Lamarckianism: Evolutionary Computation + Local Search (Local Search = Learning/Plasticity) [Diagram: fitness landscape over phenotype/genotype space]

  13. Lamarckianism in EC Basic Process • Convert genotypes to phenotypes • Phenotypes change via learning • At generation end, CONVERT THE UPDATED PHENOTYPES BACK INTO A CORRESPONDING GENOTYPE • Crossover and mutate new genotypes to create the next generation
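A hedged sketch of this Lamarckian loop; all the helper functions (develop, local_search, reverse_encode, fitness, select_pairs, crossover, mutate) are hypothetical placeholders for whatever representation and operators a given EA uses.

```python
def lamarckian_generation(genotypes, develop, local_search, reverse_encode,
                          fitness, select_pairs, crossover, mutate):
    """One generation of a Lamarckian EA, using hypothetical helper functions:
    develop genotypes into phenotypes, improve the phenotypes by local search
    (lifetime learning), write the improved phenotypes back into the genotypes,
    then select, recombine and mutate to build the next generation."""
    phenotypes = [develop(g) for g in genotypes]          # gtype -> ptype
    improved = [local_search(p) for p in phenotypes]      # learning / plasticity
    genotypes = [reverse_encode(p) for p in improved]     # Lamarckian write-back
    scored = list(zip(genotypes, [fitness(p) for p in improved]))
    parents = select_pairs(scored, n=len(genotypes))      # one parent pair per child
    return [mutate(crossover(mom, dad)) for mom, dad in parents]
```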

  14. Partial Lamarckianism (Houck et al., 1997) • Only N% of genotypes are changed via the reverse encoding from phenotypes. • N = 20-40% gives the best results on benchmarks. • This beats pure Lamarckianism (N = 100) and the pure Baldwin Effect (N = 0 with plastic ptypes). • N = 100 is dangerous: • Leads to premature convergence in static environments. • Kids should NOT inherit too much of what parents learned in dynamic environments, since the kids’ world is different from the parents’.
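A minimal sketch of the partial write-back step, assuming the same hypothetical reverse_encode helper as above; the 0.3 default is just an example value in the 20-40% range reported as best.

```python
import random

def partial_lamarckian_writeback(genotypes, improved_phenotypes,
                                 reverse_encode, fraction=0.3):
    """Reverse-encode learned phenotypes into only a fraction of the genotypes;
    the rest stay Baldwinian: they are scored on their learned fitness but
    pass on their original genes."""
    new_genotypes = []
    for g, p in zip(genotypes, improved_phenotypes):
        if random.random() < fraction:
            new_genotypes.append(reverse_encode(p))   # Lamarckian individual
        else:
            new_genotypes.append(g)                   # Baldwinian individual
    return new_genotypes
```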

  15. The Baldwin Effect “A New Factor in Evolution” (James Baldwin, 1896) Basic Idea: If organisms adapt during their lifetimes, then those changes can be INDIRECTLY passed on to their descendants, although it may take many generations. Two Phases: 1. Plasticity enhances evolution by INCREASING FITNESS DIVERSITY and smoothing the fitness landscape. Individuals who adapt their way to higher fitness have a selective advantage over those who, despite adaptivity, cannot. 2. The results of plasticity are assimilated into the gene pool via chance mutations.

  16. Biologically plausible, but hard to prove in natural systems. • Hinton & Nowlan (1987) - Trivial evolutionary simulations clearly show it!

  17. Baldwin Effect Phase I: Learning Enhancement • Learning/Plasticity = local hill climbing in fitness space • Phenotypes near the base of the peak can achieve fitness increases. [Figure: fitness and number of phenotypes plotted over genotype/phenotype space; learning shifts the population distribution from D1 to D2 and yields a smoother fitness landscape around the peak phenotype A]

  18. D1, D2: All phenotypes have the ability to learn, but only those near the base of the peak can achieve fitness increases. This selective advantage moves the genotype/phenotype distribution from D1 to D2, hence closer to the optimal phenotype P*.

  19. Baldwin Effect Phase II: Genetic Assimilation • Mutation produces genotypes hard-wired for phenotype A. • Learning has a cost => selection favors “Natural Born” A’s. Economy of Flexibility: 1. Learning accelerates evolution. 2. Evolution removes the need for learning. 3. Evolution removes the ability to learn! [Figure: fitness and number of phenotypes over genotype/phenotype space; the distribution moves from D2 through D3 to D4]

  20. Genetic operations spread the population distribution from D2 to D3, producing individuals with hard-wired optimal phenotypes, P*. Due to the cost of learning, these natural-born P*’s have a selective advantage over the learned P*’s, and the distribution moves from D3 to D4.

  21. Preconditions for the Baldwin Effect (Mayley, 1996) • Learning has both benefits (Phase I) and costs (Phase II). • The genotype and phenotype spaces are correlated: there needs to be an achievable genotypic change for each learned phenotypic change (observe nature!). Without this, the mutations that should produce Phase II will fail to find a genotype that encodes the previously learned trait. • It is the exploitation of the benefits and the reduction of the costs that provide the selection pressure for: • the adoption of a learned behaviour, and • its genetic assimilation.

  22. Benefits of Learning (Mayley, 1996) • Adaptive members of the population are able to “find” new advantageous behaviours that less plastic individuals are unable to perform, so these adaptive individuals gain the upper hand and are selected for. • Learning provides individuals with the ability to adapt to an environment that is changing on a faster time scale than the one on which evolution operates. • A learning mechanism may be able to provide an individual with behaviours that are simply very hard to evolve (e.g. acquiring a language).

  23. Costs of Learning (Mayley, 1996) • Costs as a function of time in the juvenile (learning) stage • Time wasted in an “ignorant” state • Energy expended during knowledge acquisition • Delayed reproduction (until beyond the juvenile stage) • Catastrophic costs • Unreliability - the environment may never provide the necessary stimuli for learning • Death due to a state of ignorance • Fixed costs • Extra expense to support plasticity (particularly evident in EAs with a learning component) • Population-level costs (don’t affect juveniles directly) • Parental investment in teaching the young. All 4 types of cost can affect the Baldwin Effect: • Phase I: Costs that are too high => no learning => no Baldwin Effect • Phase II: Higher costs => more selective pressure for assimilation => faster convergence (the learning phase is cut short)

  24. Genotype-Phenotype Correlation • Correlated: near(G1, G2) <=> near(P1, P2) [Figure: two fitness landscapes over genotype/phenotype space with phenotypes P1, P2 produced by genotypes G1, G2; in the correlated case the mutation from G1 to G2 is easy, in the uncorrelated case it is harder]

  25. Hinton & Nowlan (1987) • First evidence of the Baldwin Effect in a natural or artificial system. Needle-in-a-haystack task:
  • Goal: find a particular 20-bit string.
  • Evolution: GA with 20-gene chromosomes; 3 alleles: 0, 1, * (wildcard).
  • Learning: max of 1000 random guesses per individual; each guess replaces every * with a random 0 or 1.
  • Fitness function: F = 1 + 19(1000 - NG)/1000, where NG = # guesses needed (lower is better) => learning has a cost!
  • Without learning: needle-in-a-haystack search. With learning: a smoother, navigable fitness landscape over genotype/phenotype space.
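A sketch of the learning-plus-fitness procedure described on this slide; the all-ones target and the list-of-alleles representation are assumptions made for illustration.

```python
import random

TARGET = [1] * 20          # the hidden 20-bit needle (any fixed string works)
MAX_GUESSES = 1000

def hinton_nowlan_fitness(chromosome, rng=random):
    """Chromosome alleles are 0, 1, or '*'. Fixed alleles must already match the
    target; learning tries up to 1000 random settings of the wildcards.
    Finding the target early is rewarded: F = 1 + 19*(1000 - NG)/1000."""
    if any(a != '*' and a != t for a, t in zip(chromosome, TARGET)):
        return 1.0                                   # can never learn its way to the needle
    wildcards = [i for i, a in enumerate(chromosome) if a == '*']
    for n_guesses in range(1, MAX_GUESSES + 1):
        guess = list(chromosome)
        for i in wildcards:
            guess[i] = rng.choice([0, 1])            # one random learning trial
        if guess == TARGET:
            return 1.0 + 19.0 * (MAX_GUESSES - n_guesses) / MAX_GUESSES
    return 1.0                                       # never found the needle
```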

  26. Hinton & Nowlan (2) [Plot: allele ratios (correct, wildcard, wrong) versus generations (0-50), showing Phase I and Phase II of the Baldwin Effect] • Without learning: no fitness variance, since all individuals are equally bad (unless, by pure chance, the needle is found). • Thus, no evolutionary progress (average fitness does not change); search is random. • With learning: partial solutions (that can learn their way to the correct solution) get partial credit and move the search in the proper direction (up the smooth hill). • Learning smooths the fitness landscape by making the points in the neighborhood of the optimum more informative. • Degree of genetic assimilation = size of the correct-wildcard gap, which is directly proportional to the learning cost.

  27. Hinton & Nowlan (3) • Phase I: Learning increases diversity, so the probability that some individuals move closer to the correct solution is higher; this is why the proportion of correct alleles rises while wrong alleles stay low. • Phase II: Individuals closer to the correct phenotype are favored and mutated; wrong alleles are almost completely eliminated, while correct alleles increase even further. “Natural born” solutions are also favored over learned ones.

  28. How does the Baldwin effect work in evolution? • Acquired traits are not coded back into the genotype. However, genotypes that can acquire highly fit traits through learning are expected to keep acquiring similar highly fit traits in future generations if they remain in the population. If so, mutation can eventually produce genotypes in which the traits previously acquired by learning are CODED directly into the genes. This is how learning affects evolution: genotypes that cannot reach a trait through learning are not expected to acquire it through mutation either.

  29. Evolutionary Reinforcement Learning (ERL), Ackley & Littman (1991) [Diagram: the action network (#1) maps the current state to an action in the world; the evaluation network (#2) evaluates the current and next states] Complementary neural networks: #1 learns via backprop, with its original weights from the GA; #2 provides state evaluations for #1, with static weights from the GA. Baldwin Effect: • Phase I: Genes for ANN #2 are relatively fixed early on => selective pressure for learning in ANN #1, which depends on ANN #2. • Phase II: The need for learning diminishes as good behaviors are assimilated into ANN #1’s evolved weights.
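Purely as an illustration of the two-network idea, here is a heavily simplified sketch; it is not Ackley & Littman's actual learning rule (CRBP), both "networks" are reduced to single linear layers, and world_step is a hypothetical environment function.

```python
import numpy as np

def erl_lifetime(state, world_step, w_eval, w_action, lr=0.1, steps=100):
    """Rough ERL-style lifetime loop: the GA supplies w_eval (frozen evaluation
    weights) and the initial w_action; the change in the evaluation network's
    score of the state acts as the reinforcement signal that adjusts the action
    network during life. The learned weights are NOT written back to the genes."""
    for _ in range(steps):
        action = int(np.argmax(w_action @ state))             # action net picks a move
        next_state = world_step(state, action)                # hypothetical environment
        reinforcement = w_eval @ next_state - w_eval @ state  # did the evaluation improve?
        w_action[action] += lr * reinforcement * state        # nudge the chosen action
        state = next_state
    return w_action
```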

  30. Using Learning to Facilitate the Evolution of Features for Recognizing Visual Concepts (Bala et al., 1996) Objectives: • Assess the ability of a hybrid evolution-and-learning architecture to identify feature sets which result in good classification performance. • Understand clearly the role that learning played in this process. • Understand the extent to which various aspects of the Baldwin Effect were present.

  31. Enhancing Machine Learning with Evolution (Bala et al., 1996) [Diagram: the genetic algorithm proposes a feature set; C4.5 induces a decision tree from the training data; the tree classifies the test data] Fitness(Feature Set) = -1*Size + Entropy + Accuracy • Each GA genome is a subset of the universal feature set. • Size(Feature Set) = learning effort/cost
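A hedged sketch of this fitness computation; induce_tree, accuracy, and info_gain are hypothetical stand-ins for C4.5 induction and the paper's entropy/infomax and accuracy measures, and the equal weighting of the three terms mirrors the slide's formula rather than the paper's exact tuning.

```python
def feature_set_fitness(mask, train, test, induce_tree, accuracy, info_gain):
    """GA fitness for a binary feature mask: penalise the number of selected
    features (learning cost), reward information content, and reward the
    accuracy of the induced tree on unseen test data."""
    selected = [i for i, bit in enumerate(mask) if bit]
    tree = induce_tree(train, selected)          # stands in for C4.5 induction
    return (-1.0 * len(selected)                 # smaller feature sets preferred
            + info_gain(train, selected)         # entropy / infomax term
            + accuracy(tree, test, selected))    # generalisation on test data
```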

  32. [Architecture diagram: image data goes through feature extraction to produce examples; the GA module turns the full feature set into candidate feature subsets (feature masks); the evaluation module induces trees from the training data, evaluates them on the testing data, and returns a fitness measure combining the infomax/entropy measure, accuracy, and feature-set size; the loop converges on an optimal feature subset]

  33. GA + C4.5 Tests Tasks: satellite imaging & facial recognition. Evolution + Learning beats: • Learning alone = C4.5 using the universal feature set. • Evolution alone = fitness function without the accuracy term. Baldwin Effect: • Stage 1: Learning is selected for; large feature sets dominate initially. • Stage 2: The GA population converges to small feature sets that match the optimal sets learned by solo C4.5 (genetic assimilation). First example of the Baldwin Effect on a significant problem.

  34. Reinforced Genetic Programming (RGP), Downing (2001) [Diagram: an example strongly-typed GP tree combining if-tests, comparisons (X > 5, Y = 3), a simple move (move north), and Pick-Move choice nodes] • Strongly-typed classify-and-act genetic programs. • Actions = simple moves or monitored choices. • Choice nodes enable Q-Learning by keeping track of previous actions and resulting rewards. • Evolution provides problem-space abstractions for RL to work with. • RL enhances GP evolution via the Baldwin Effect or Lamarckianism.
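As a rough illustration of a monitored choice node, here is a sketch that keeps a Q-value per alternative and updates it from reward; this is a simplified one-step update, not the full RGP machinery, and all names are hypothetical.

```python
import random

class ChoiceNode:
    """Sketch of an RGP-style monitored choice: the node keeps a Q-value per
    child action and updates it from the reward that follows the choice."""
    def __init__(self, actions, epsilon=0.1, alpha=0.2):
        self.q = {a: 0.0 for a in actions}
        self.epsilon, self.alpha = epsilon, alpha

    def pick(self):
        if random.random() < self.epsilon:            # occasional exploration
            return random.choice(list(self.q))
        return max(self.q, key=self.q.get)            # otherwise greedy on Q

    def reinforce(self, action, reward):
        self.q[action] += self.alpha * (reward - self.q[action])

node = ChoiceNode(['move north', 'move south'])
a = node.pick()
node.reinforce(a, reward=1.0)                         # reward from the environment
```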

  35. RGP Maze Walkers [Figure: a 10 x 10 maze with marked start and goal cells; results plot showing Baldwin Effect Phase I and Phase II]

  36. Coevolution • In nature, the playing field is always changing. Populations are never evolving against a static environment, but against an ever-changing world consisting of many other evolving populations. • So changes to both the environment and to the other populations need to be taken into account when assessing design changes for improved fitness. Competitive Coevolution (Arms Race) • Competitors (e.g. predators & prey) must constantly improve to keep pace with one another. • Over time, the individuals have higher absolute fitness (i.e. they are much better than their ancestors), but their relative fitness (vis-à-vis their current competitors) remains unchanged. Cooperative Coevolution • 2 or more species evolve such that each enhances its own selective advantage plus that of the other(s).

  37. Coevolution in EC Dilemma #1: Problem Impossibility The problem can be so difficult, and the fitness function so hard to design, that all individuals in early generations get very low fitness => low fitness variance => slow or no evolution. Dilemma #2: Problem Mastery After many generations, all individuals may have the same high fitness but may not yet have found a solution. Once again, fitness variance is low => evolution may stop, and the optimal solutions may never be found. Competitive Coevolution to the Rescue! • Use 2 competing populations: problems & solutions. • Fitness(problem) = # errors/omissions made by the solution individuals on that particular problem. Early: Problems are easier, so at least some of the early solutions can solve some of the problems => fitness variance => evolution. Late: Problems become more difficult, and all solutions are pretty good. But there are usually tough problems that only some solutions can handle => fitness variance => evolution. • So coevolution keeps selection pressure from getting too low, thus avoiding dilemmas #1 and #2 and maintaining evolutionary progress.
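A minimal sketch of this two-population fitness assignment; solves(solution, problem) is an assumed predicate standing in for whatever evaluation the domain provides.

```python
def coevolution_fitness(problems, solutions, solves):
    """Competitive two-population fitness assignment: a problem scores by how
    many solution individuals fail on it, a solution by how many problems it
    handles. Both scores keep fitness variance (selection pressure) alive."""
    problem_fitness = {i: sum(not solves(s, p) for s in solutions)
                       for i, p in enumerate(problems)}
    solution_fitness = {j: sum(solves(s, p) for p in problems)
                        for j, s in enumerate(solutions)}
    return problem_fitness, solution_fitness
```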

  38. EC in Evolutionary Biology (Mitchell, 1996) Standard evolutionary biology techniques (drawbacks/problems follow each colon): • Fossil records: incomplete. • Viewing adaptations in real ecosystems: can’t do controlled experiments. • Lab experiments (e.g. fruit flies): too few generations for major evolutionary changes. • DNA analysis to reconstruct phylogenetic trees: ambiguous & junk DNA. • Mathematical models (usually differential equations): extremely simplified - abstract away many individual behaviors. Computer Simulations of Evolution (e.g. Artificial Life) • Controllable (via modifiable parameters) • Repeatable • Many generations can be run. • Individual-based models require fewer abstracting assumptions: but it’s still a model. • Generate lots of data: data interpretation problems.

  39. Summary: EC & Biology • Biological metaphors (even failed ones) have useful applications in artificial adaptive systems. • NeoDarwinism • Lamarckianism • Baldwinism • Diploidy • Coevolution • Evolutionary Computation can contribute back to biology • Existence proof of the Baldwin Effect. • Assessing roles of mutation & crossover in evolution • GP for discovering metabolic pathways and solving protein-folding problems. • GA in some algorithms that help to map the human genome. • Evolutionary Simulations for exploring possible complex interactions among evolving species.

  40. Lamarckian Modesty Can there be any more important conclusion...or any to which more attention should be paid than that which I have just set forth? ... Lamarck’s final sentence of an excerpt published in 1914.
