
Evolution Strategies

LIACS Natural Computing Group Leiden University

• Introduction: Optimization and EAs

• Evolution Strategies

• Further Developments

• Multi-Criteria Optimization

• Mixed-Integer ES

• Soft-Constraints

• Response Surface Approximation

• Examples


Introduction

• Modeling

Input: Known (measured)

Output: Known (measured)

Interrelation: Unknown

• Simulation

Input: Will be given

Output: How is the result for the input?

• Optimization

Objective: Will be given

Input: How (with which parameter settings) to achieve this objective?

Introduction: Optimization, Evolutionary Algorithms

• f : M → ℝ objective function

• High-dimensional

• Non-linear, multimodal

• Discontinuous, noisy, dynamic

• M ⊆ M1 × M2 × ... × Mn heterogeneous

• Restrictions possible over

• Good local, robust optimum desired

• Realistic landscapes are like that!

[Figure: objective landscape contrasting a local, robust optimum with the global minimum]

• Illustrative Example: Optimize Efficiency

• Initial:

• Evolution:

• 32% Improvement in Efficiency !


• Direct optimization algorithm: Evolutionary Algorithms

• First order optimization algorithm:

• Second order optimization algorithm: e.g., Newton method

Iterative Optimization Methods

General description: $x_{t+1} = x_t + s_t \cdot v_t$

• $x_t$: Current point

• $v_t$: Directional vector

• $s_t$: Step size (scalar)

• At every iteration:

• Choose direction

• Determine step size

• Direction:

• Random

• Step size:

• 1-dim. optimization

• Random
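The general scheme (choose a direction, determine a step size, move) can be sketched in a few lines. This is only an illustration under assumptions that are not on the slide: the direction and the step size are both drawn at random, a candidate is kept only if it improves the objective, and the names `random_direction`, `iterative_search`, `max_step` and the test function `sphere` are hypothetical.

```python
import math
import random


def random_direction(n, rng):
    """Return a unit vector with uniformly random orientation."""
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def iterative_search(f, x0, iterations=2000, max_step=0.5, seed=0):
    """Repeat: choose a direction, determine a step size, move if it helps."""
    rng = random.Random(seed)
    x, fx = list(x0), f(x0)
    for _ in range(iterations):
        v = random_direction(len(x), rng)   # direction: random
        s = rng.uniform(0.0, max_step)      # step size: random (assumption)
        candidate = [xi + s * vi for xi, vi in zip(x, v)]
        fc = f(candidate)
        if fc < fx:                         # assumption: keep improvements only
            x, fx = candidate, fc
    return x, fx


def sphere(x):
    """Simple convex test function, minimized at the origin."""
    return sum(xi * xi for xi in x)


best_x, best_f = iterative_search(sphere, [3.0, -2.0, 1.0])
print(best_x, best_f)
```

Replacing the random step size by a one-dimensional line search along `v` would give the "1-dim. optimization" variant listed above.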

• Global convergence (with probability 1):

• General statement (holds for all functions)

• Useless for practical situations:

• Time plays a major role in practice

• Not all objective functions are relevant in practice


An Infinite Number of Pathological Cases!

[Figure: surface plot of f(x1, x2) over x1 and x2, with the optimum value f(x*1, x*2) at the point (x*1, x*2)]

• NFL-Theorem:

• All optimization algorithms perform equally well iff performance is averaged over all possible optimization problems.

• Fortunately: We are not interested in "all possible problems"

• Convergence velocity:

• Very specific statements

• Convex objective functions

• Describes convergence in local optima

• Very extensive analysis for Evolution Strategies

Evolution Strategies

Evolutionary Algorithms

Evolution Strategies

• Mixed-integer capabilities

• Emphasis on mutation

• Small population sizes

• Deterministic selection

• Developed in Germany

• Theory focused on convergence velocity

Genetic Algorithms

• Discrete representations

• Emphasis on crossover

• Constant parameters

• Larger population sizes

• Probabilistic selection

• Developed in USA

• Theory focused on schema processing

Other

• Evolutionary Programming

• Differential Evolution

• GP

• PSO

• EDA

• Real-coded GAs

• Mostly real-valued search space ℝ^n

• also mixed-integer, discrete spaces

• Emphasis on mutation

• n-dimensional normal distribution

• expectation zero

• Different recombination operators

• Deterministic selection

• (μ, λ)-selection: Deterioration possible

• (μ+λ)-selection: Only accepts improvements

• λ >> μ, i.e.: Creation of offspring surplus

• Self-adaptation of strategy parameters.

• Simple ES with 1/5 success rule:

• Exogenous adaptation of step size σ

• Mutation: N(0, σ)

• Self-adaptive ES with single step size:

• One σ controls mutation for all x_i

• Mutation: N(0, σ)
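A minimal sketch of the simple ES with the 1/5 success rule may help here. It assumes a (1+1) scheme with minimization, measures the success rate over a fixed window of generations, and rescales the step size by a constant factor. The window length, the factor 0.85 and the function name `one_plus_one_es` are illustrative assumptions, not values taken from the slides.

```python
import random


def one_plus_one_es(f, x0, sigma=1.0, generations=600,
                    window=20, factor=0.85, seed=1):
    """(1+1)-ES with exogenous step size control by the 1/5 success rule."""
    rng = random.Random(seed)
    x, fx = list(x0), f(x0)
    successes = 0
    for g in range(1, generations + 1):
        # Mutation: add N(0, sigma) noise to every object variable.
        child = [xi + sigma * rng.gauss(0.0, 1.0) for xi in x]
        fc = f(child)
        if fc < fx:                      # offspring replaces parent on success
            x, fx = child, fc
            successes += 1
        if g % window == 0:              # adapt sigma once per window
            rate = successes / window
            if rate > 0.2:
                sigma /= factor          # too many successes: larger steps
            elif rate < 0.2:
                sigma *= factor          # too few successes: smaller steps
            successes = 0
    return x, fx, sigma


def sphere(x):
    return sum(xi * xi for xi in x)


x_best, f_best, sigma_final = one_plus_one_es(sphere, [5.0] * 5)
print(f_best, sigma_final)
```

The step size here is controlled from outside the individual, which is what "exogenous" refers to. In the self-adaptive variants σ becomes part of the individual and is mutated as well.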

• Self-adaptive ES with individual step sizes:

• One individual σ_i per x_i

• Mutation: N_i(0, σ_i)

• Self-adaptive ES with correlated mutation:

• Individual step sizes

• One correlation angle per coordinate pair

• Mutation according to covariance matrix: N(0, C)

Evolution Strategy: Algorithms, Mutation

Operators: Mutation – one σ

• Self-adaptive ES with one step size:

• One σ controls mutation for all x_i

• Mutation: N(0, σ)

Individual after mutation, obtained in two steps:

1.: Mutation of step sizes: $\sigma' = \sigma \cdot \exp(\tau_0 \cdot N(0,1))$

2.: Mutation of objective variables: $x'_i = x_i + \sigma' \cdot N_i(0,1)$

Here the new σ' is used!
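The two mutation steps can be sketched as follows. The log-normal update σ' = σ · exp(τ0 · N(0,1)) and the default τ0 = 1/√n are assumptions based on the usual formulation of this operator; the function name `mutate_one_sigma` is hypothetical.

```python
import math
import random


def mutate_one_sigma(x, sigma, rng, tau0=None):
    """Self-adaptive mutation with a single step size for all variables."""
    n = len(x)
    if tau0 is None:
        tau0 = 1.0 / math.sqrt(n)        # assumed default learning rate
    # Step 1: mutate the step size (one N(0,1) realisation).
    new_sigma = sigma * math.exp(tau0 * rng.gauss(0.0, 1.0))
    # Step 2: mutate the object variables using the NEW step size.
    new_x = [xi + new_sigma * rng.gauss(0.0, 1.0) for xi in x]
    return new_x, new_sigma


rng = random.Random(42)
child_x, child_sigma = mutate_one_sigma([1.0, 2.0, 3.0], 0.5, rng)
print(child_x, child_sigma)
```

Using the new step size in step 2 is what couples a step size to the quality of the offspring it produced, so selection acts on σ indirectly.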

Operators: Mutation – one σ

[Figure: contour lines of the objective function]

Offspring of the parent lie on the hypersphere (for n > 10); their position is uniformly distributed.

• Simple adaptation mechanism

• Self-adaptation usually fast and precise

• Bad adaptation in case of complicated contour lines

• Bad adaptation in case of very differently scaled object variables

• e.g. -100 < x_i < 100 and -1 < x_j < 1

Operators: Mutation – individual σ_i

• Self-adaptive ES with individual step sizes:

• One σ_i per x_i

• Mutation: N_i(0, σ_i)

Individual after mutation, obtained in two steps:

1.: Mutation of individual step sizes: $\sigma'_i = \sigma_i \cdot \exp(\tau' \cdot N(0,1) + \tau \cdot N_i(0,1))$

2.: Mutation of object variables: $x'_i = x_i + \sigma'_i \cdot N_i(0,1)$

The new individual σ'_i are used here!

Operators: Mutation – individual σ_i

• τ, τ' are learning rates, again

• τ': Global learning rate

• N(0,1): Only one realisation

• τ: Local learning rate

• N_i(0,1): n realisations

• Suggested by Schwefel: $\tau' \propto 1/\sqrt{2n}$, $\tau \propto 1/\sqrt{2\sqrt{n}}$
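A short sketch of mutation with individual step sizes, showing where the single global realisation and the n local realisations enter. It assumes the log-normal update σ'_i = σ_i · exp(τ' · N(0,1) + τ · N_i(0,1)) and takes the proportionality constants of the learning rates as 1, so τ' = 1/√(2n) and τ = 1/√(2√n). The function name `mutate_individual_sigmas` is hypothetical.

```python
import math
import random


def mutate_individual_sigmas(x, sigmas, rng):
    """Self-adaptive mutation with one step size per object variable."""
    n = len(x)
    tau_global = 1.0 / math.sqrt(2.0 * n)             # tau' (assumed constant 1)
    tau_local = 1.0 / math.sqrt(2.0 * math.sqrt(n))   # tau  (assumed constant 1)
    # Global part: only ONE N(0,1) realisation, shared by all step sizes.
    shared = tau_global * rng.gauss(0.0, 1.0)
    # Local part: a fresh N_i(0,1) realisation for each step size.
    new_sigmas = [s * math.exp(shared + tau_local * rng.gauss(0.0, 1.0))
                  for s in sigmas]
    # Object variables are mutated with the NEW step sizes.
    new_x = [xi + si * rng.gauss(0.0, 1.0) for xi, si in zip(x, new_sigmas)]
    return new_x, new_sigmas


rng = random.Random(7)
child_x, child_sigmas = mutate_individual_sigmas([0.0, 0.0, 0.0],
                                                 [100.0, 1.0, 0.01], rng)
print(child_x, child_sigmas)
```

The shared factor scales all step sizes together, while the local factors let each coordinate drift to its own scale.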

Operators: Mutation – individual σ_i

[Figure: contour lines]

Offspring are located on the hyperellipsoid (for n > 10); their position is uniformly distributed.

Pros and Cons: Individual σ_i

• Individual scaling of object variables

• Increased global convergence reliability

• Slower convergence due to increased learning effort

• No rotation of coordinate system possible

• Required for badly conditioned objective function


Operators: Correlated Mutations

• Self-adaptive ES with correlated mutations:

• Individual step sizes

• One rotation angle for each pair of coordinates

• Mutation according to covariance matrix: N(0, C)

Individual after mutation, obtained in three steps:

1.: Mutation of individual step sizes: $\sigma'_i = \sigma_i \cdot \exp(\tau' \cdot N(0,1) + \tau \cdot N_i(0,1))$

2.: Mutation of rotation angles: $\alpha'_j = \alpha_j + \beta \cdot N_j(0,1)$

3.: Mutation of object variables: $x' = x + N(0, C')$

New covariance matrix C' used here!

Operators: Correlated Mutations

• Interpretation of rotation angles α_ij

[Figure: mutation ellipsoid in the (Δx1, Δx2) plane with axes σ1 and σ2, rotated by the angle α12]

• Mapping onto covariances according to $\tan(2\alpha_{ij}) = \frac{2 c_{ij}}{\sigma_i^2 - \sigma_j^2}$

• τ, τ', β are again learning rates

• τ, τ' as before

• β = 0.0873 (corresponds to 5 degrees)

• Out of boundary correction: $|\alpha'_j| > \pi \Rightarrow \alpha'_j := \alpha'_j - 2\pi \cdot \mathrm{sign}(\alpha'_j)$

Correlated Mutations for ES

[Figure: contour lines]

Offspring are located on the rotatable hyperellipsoid (for n > 10); their position is uniformly distributed.

• How to create the correlated mutation vector?

• Multiplication of uncorrelated mutation vector with n(n-1)/2 rotational matrices

• Generates only feasible (positive definite) correlation matrices

• Structure of rotation matrix

Operators: Correlated Mutations

• Implementation of correlated mutations

    nq := n*(n-1)/2;
    (* mutation vector: uncorrelated, s[i] is the step size sigma_i *)
    for i := 1 to n do
        su[i] := s[i] * Ni(0,1);
    od
    (* rotations: a[nq] is the rotation angle alpha for the pair (n1, n2) *)
    for k := 1 to n-1 do
        n1 := n-k;
        n2 := n;
        for i := 1 to k do
            d1 := su[n1]; d2 := su[n2];
            su[n2] := d1*sin(a[nq]) + d2*cos(a[nq]);
            su[n1] := d1*cos(a[nq]) - d2*sin(a[nq]);
            n2 := n2-1;
            nq := nq-1;
        od
    od
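The pseudocode translates almost line by line into Python. The sketch below is a direct transcription with 0-based indices; the function name `correlated_mutation` and the argument check are additions for illustration.

```python
import math
import random


def correlated_mutation(sigmas, alphas, rng):
    """Draw a correlated mutation vector from step sizes and rotation angles."""
    n = len(sigmas)
    if len(alphas) != n * (n - 1) // 2:
        raise ValueError("need n(n-1)/2 rotation angles")
    # Uncorrelated mutation vector: su[i] = sigma_i * N_i(0,1).
    su = [s * rng.gauss(0.0, 1.0) for s in sigmas]
    # Apply the n(n-1)/2 plane rotations, last angle first.
    nq = n * (n - 1) // 2 - 1            # 0-based index of the last angle
    for k in range(1, n):
        n1 = n - k - 1                   # 0-based version of n1 := n-k
        n2 = n - 1                       # 0-based version of n2 := n
        for _ in range(k):
            d1, d2 = su[n1], su[n2]
            a = alphas[nq]
            su[n2] = d1 * math.sin(a) + d2 * math.cos(a)
            su[n1] = d1 * math.cos(a) - d2 * math.sin(a)
            n2 -= 1
            nq -= 1
    return su


step = correlated_mutation([1.0, 2.0, 0.5], [0.3, -0.2, 0.1], random.Random(5))
print(step)
```

Each rotation only mixes two coordinates, so the length of the vector is unchanged; with all angles set to zero the result is the uncorrelated vector itself.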

• Individual scaling of object variables

• Rotation of coordinate system possible

• Increased global convergence reliability

• Much slower convergence

• Effort for mutations scales quadratically

• Self-adaptation very inefficient


• Generating N(0,1)-distributed random numbers?

If w > 1
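The rest of this slide is not preserved. The fragment "If w > 1" looks like the rejection test of the polar method, so the sketch below assumes that method: draw u and v uniformly from [-1, 1), set w = u² + v², reject the pair if w > 1, and otherwise scale u and v to obtain two independent N(0,1) numbers. The function name `standard_normal_pair` is hypothetical, and whether the slide used exactly this variant is an assumption.

```python
import math
import random


def standard_normal_pair(rng):
    """Return two independent N(0,1) numbers via the polar method."""
    while True:
        u = 2.0 * rng.random() - 1.0     # uniform in [-1, 1)
        v = 2.0 * rng.random() - 1.0
        w = u * u + v * v
        if w > 1.0 or w == 0.0:          # outside the unit circle: try again
            continue
        factor = math.sqrt(-2.0 * math.log(w) / w)
        return u * factor, v * factor


rng = random.Random(123)
samples = []
for _ in range(10000):
    samples.extend(standard_normal_pair(rng))
mean = sum(samples) / len(samples)
variance = sum((s - mean) ** 2 for s in samples) / len(samples)
print(mean, variance)
```

In practice a library routine such as `random.gauss` serves the same purpose; the explicit version only illustrates where a test on w would appear.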

Evolution Strategy: Algorithms, Recombination

Operators: Recombination

• Only for μ > 1

• Directly after selection

• Iteratively generates λ offspring:

    for i := 1 to λ do
        choose recombinant r1 uniformly at random from parent_population;
        choose recombinant r2 <> r1 uniformly at random from parent_population;
        offspring := recombine(r1, r2);
        add offspring to offspring_population;
    od
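The offspring loop can be written compactly in Python. This sketch treats "r2 <> r1" as two different positions in the parent population, and takes the recombination operator as a parameter; the names `make_offspring` and `midpoint` are hypothetical.

```python
import random


def make_offspring(parents, lam, recombine, rng):
    """Create lam offspring, each from two distinct randomly chosen parents."""
    if len(parents) < 2:
        raise ValueError("recombination needs at least two parents (mu > 1)")
    offspring = []
    for _ in range(lam):
        # sample() picks two different positions uniformly at random.
        r1, r2 = rng.sample(parents, 2)
        offspring.append(recombine(r1, r2))
    return offspring


def midpoint(a, b):
    """Example operator: coordinate-wise arithmetic mean of two parents."""
    return [(u + v) / 2.0 for u, v in zip(a, b)]


parents = [[0.0, 0.0], [2.0, 2.0], [4.0, 4.0]]
children = make_offspring(parents, 5, midpoint, random.Random(3))
print(children)
```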

Operators: Recombination

• How does recombination work?

• Discrete recombination:

• Variable at position i will be copied at random (uniformly distr.) from parent 1 or parent 2, position i.

[Figure: Parent 1, Parent 2, Offspring]

Operators: Recombination

• Intermediate recombination:

• Variable at position i is the arithmetic mean of parent 1 and parent 2, position i.

[Figure: Parent 1, Parent 2, Offspring]

Operators: Recombination

• Global discrete recombination:

• Considers all parents

[Figure: Parent 1, Parent 2, ..., Parent μ, Offspring]

Operators: Recombination

• Global intermediary recombination:

• Considers all parents

[Figure: Parent 1, Parent 2, ..., Parent μ, Offspring]
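The four recombination variants can be sketched side by side. The two-parent versions follow the descriptions given for discrete and intermediate recombination. For the global versions the slides only say that all parents are considered, so the sketch assumes one common reading: global discrete copies each position from a parent chosen anew among all μ parents, and global intermediary takes the mean over all μ parents. All function names are hypothetical.

```python
import random


def discrete(p1, p2, rng):
    """Copy each position at random from parent 1 or parent 2."""
    return [rng.choice((a, b)) for a, b in zip(p1, p2)]


def intermediate(p1, p2):
    """Arithmetic mean of parent 1 and parent 2 at each position."""
    return [(a + b) / 2.0 for a, b in zip(p1, p2)]


def global_discrete(parents, rng):
    """Assumed variant: each position comes from a freshly chosen parent."""
    return [rng.choice(parents)[i] for i in range(len(parents[0]))]


def global_intermediate(parents):
    """Assumed variant: mean over all parents at each position."""
    return [sum(column) / len(parents) for column in zip(*parents)]


rng = random.Random(0)
population = [[0.0, 0.0], [3.0, 6.0], [6.0, 3.0]]
print(discrete(population[0], population[1], rng))
print(intermediate(population[0], population[1]))
print(global_discrete(population, rng))
print(global_intermediate(population))
```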

Evolution Strategy: Algorithms, Selection

Operators: Selection

• Example: (2,3)-Selection

Parents don't survive! … but a worse offspring.

• Example: (2+3)-Selection

… now this offspring survives.
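The difference between the two selection schemes fits in two short functions. The sketch assumes minimization and deterministic truncation (keep the μ best); the function names and the toy fitness values are illustrative.

```python
def comma_selection(offspring, mu, f):
    """(mu, lambda)-selection: the mu best OFFSPRING survive, parents never do."""
    return sorted(offspring, key=f)[:mu]


def plus_selection(parents, offspring, mu, f):
    """(mu + lambda)-selection: the mu best of parents AND offspring survive."""
    return sorted(parents + offspring, key=f)[:mu]


def fitness(individual):
    """Toy objective: the first component is the value to minimize."""
    return individual[0]


parents = [[1.0], [2.0]]
offspring = [[0.5], [3.0], [4.0]]

# (2,3): both parents are dropped, so the worse offspring 3.0 survives.
print(comma_selection(offspring, 2, fitness))
# (2+3): the parent with value 1.0 beats the offspring with value 3.0.
print(plus_selection(parents, offspring, 2, fitness))
```

In the comma variant the population can get worse from one generation to the next, whereas the plus variant never loses its best individual.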

Operators: Selection

• Possible occurrences of selection

• (1+1)-ES: One parent, one offspring, 1/5-Rule

• (1,λ)-ES: One parent, λ offspring

• Example: (1,10)-Strategy

• One step size / n self-adaptive step sizes

• Mutative step size control

• Derandomized strategy

• (μ,λ)-ES: μ > 1 parents, λ > μ offspring

• Example: (2,15)-Strategy

• Includes recombination

• Can overcome local optima

• (μ+λ)-strategies: elitist strategies

[Slide annotation: Exception!]

Evolution Strategy: Self-adaptation of step sizes

• No deterministic step size control!

• Rather: Evolution of step sizes

• Biology: Repair enzymes, mutator-genes

• Why should this work at all?

• Indirect coupling: step sizes – progress

• Good step sizes improve individuals

• Bad ones make them worse

• This yields an indirect step size selection


• How can we test this at all?

• Need to know optimal step size …

• Only for very simple, convex objective functions

• Here: Sphere model

• Dynamic sphere model

• Optimum location changes occasionally

[Figure label: Optimum]
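A dynamic sphere model is easy to set up for such experiments. The sketch below assumes the squared distance to the current optimum as objective and moves the optimum by a random offset after every `period` evaluations. The class name `DynamicSphere`, the uniform offset and the evaluation-based schedule are assumptions; the slides only state that the optimum location changes occasionally.

```python
import random


class DynamicSphere:
    """Sphere function whose optimum jumps after every `period` evaluations."""

    def __init__(self, n, period, shift=1.0, seed=0):
        self.optimum = [0.0] * n
        self.period = period
        self.shift = shift
        self.evaluations = 0
        self._rng = random.Random(seed)

    def __call__(self, x):
        # Move the optimum before evaluation number period+1, 2*period+1, ...
        if self.evaluations > 0 and self.evaluations % self.period == 0:
            self.optimum = [o + self._rng.uniform(-self.shift, self.shift)
                            for o in self.optimum]
        self.evaluations += 1
        return sum((xi - oi) ** 2 for xi, oi in zip(x, self.optimum))


f = DynamicSphere(n=3, period=5)
values = [f([0.0, 0.0, 0.0]) for _ in range(6)]
print(values)          # the sixth value is taken after the optimum has moved
print(f.optimum)
```

After each jump a self-adaptive strategy has to enlarge its step sizes again, which is what makes this model useful for observing the adaptation.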

[Figure: optimal step sizes according to theory, compared with the largest, average, and smallest step size measured in the population]

• Self-adaptation of one step size

• Learning time for back adaptation proportional to n

• Proofs only for convex functions

• Individual step sizes

• Experiments by Schwefel

• Correlated mutations

• Adaptation much slower

Evolution Strategy: Derandomization

Derandomization

• Goals:

• Fast convergence speed

• Fast step size adaptation

• Precise step size adaptation

• Compromise convergence velocity – convergence reliability

• Idea: Realizations of N(0, σ) are important!

• Step sizes and realizations can be much different from each other

• Accumulates information over time

Derandomized (1,λ)-ES

• Current parent: in generation g

• Mutation (k=1,…,λ):

[Equation labels: offspring k; global step size in generation g; individual step sizes in generation g]

• Selection: Choice of best offspring

[Equation label: best of λ offspring in generation g]

Derandomized (1,λ)-ES

• Accumulation of selected mutations:

[Equation label: the mutation vector which created the parent!]

• Also: weighted history of good mutation vectors!

• Initialization:

• Weight factor:

Derandomized (1,λ)-ES

• Step size adaptation:

[Equation labels: norm of vector; vector of absolute values; speed and precision]

Derandomized (1,λ)-ES

• Explanations:

• Normalization of average variations in case of missing selection (no bias):

• Correction for small n: 1/(5n)

• Learning rates:

Evolution Strategy: Rules of thumb

Some Theory Highlights

• Convergence velocity:

• For (1,λ)-strategies:

Speedup by λ is just logarithmic: more processors are useful only to a limited extent for increasing φ.

• For (μ,λ)-strategies (discrete and intermediary recombination):

Genetic Repair Effect of recombination!

• For strategies with global intermediary recombination:

    n      λ       μ
    10     10.91   5.45
    20     12.99   6.49
    30     14.20   7.10
    40     15.07   7.53
    50     15.74   7.87
    60     16.28   8.14
    70     16.75   8.37
    80     17.15   8.57
    90     17.50   8.75
    100    17.82   8.91
    110    18.10   9.05
    120    18.36   9.18
    130    18.60   9.30
    140    18.82   9.41
    150    19.03   9.52

• Good heuristic for (1,λ):
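The formula behind the tabulated λ and μ values is not preserved in the transcript, but the numbers agree to two decimals with λ = 4 + 3 ln n and μ = λ/2. The sketch below recomputes them under that inferred assumption; the function name `population_sizes` is hypothetical, and in practice the values would be rounded to integers.

```python
import math


def population_sizes(n):
    """Return (lambda, mu) for dimension n under the inferred rule of thumb."""
    lam = 4.0 + 3.0 * math.log(n)    # inferred from the table, not stated there
    mu = lam / 2.0
    return lam, mu


for n in range(10, 151, 10):
    lam, mu = population_sizes(n)
    print(f"{n:4d}  {lam:6.2f}  {mu:5.2f}")
```

The slow logarithmic growth means that even a 150-dimensional problem gets fewer than 20 offspring per generation under this rule.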

• Generalized optimization problem:


Mixed-Integer ES: Mutation

[Equation labels: learning rates (global); geometrical distribution; mutation probabilities]

Literature

• H.-P. Schwefel: Evolution and Optimum Seeking, Wiley, NY, 1995.

• I. Rechenberg: Evolutionsstrategie 94, frommann-holzboog, Stuttgart, 1994.

• Th. Bäck: Evolutionary Algorithms in Theory and Practice, Oxford University Press, NY, 1996.

• Th. Bäck, D.B. Fogel, Z. Michalewicz (eds.): Handbook of Evolutionary Computation, Vols. 1,2, Institute of Physics Publishing, 2000.