
Stochastic Relaxation, Simulated Annealing, Global Minimizers

This presentation discusses the use of Stochastic Relaxation and Simulated Annealing to escape local minima and find global minimizers. It explores different types of relaxation methods, including variable-by-variable relaxation. The Metropolis Algorithm is explained in detail. The presentation also discusses the application of Simulated Annealing to the 2D Ising model and the bisectioning problem. The concept of the Lowest Common Configuration is introduced to improve the global minimization process. Finally, the presentation explores the challenges of heating-cooling scheduling and suggests the use of memory to optimize the process.


Presentation Transcript


  1. Stochastic Relaxation, Simulated Annealing, Global Minimizers

  2. Different types of relaxation • Variable by variable relaxation – strict minimization • Changing a small subset of variables simultaneously – Window strict minimization relaxation • Stochastic relaxation – may increase the energy – should be followed by strict minimization

  3. Complex landscape of E(X)

  4. How to escape local minima? • First go uphill, then possibly hit a lower basin • In order to go uphill, an increase in E(x) must be allowed • Add stochasticity: allow E(x) to increase with a probability governed by an external temperature-like parameter T. The Metropolis Algorithm (Kirkpatrick et al. 1983): Assume xold is the current state, define xnew to be a neighboring state and delE = E(xnew) - E(xold); then if delE < 0, replace xold by xnew; else choose xnew with probability P(xnew) = exp(-delE/T) and xold with probability P(xold) = 1 - P(xnew)
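
A minimal Python sketch of this acceptance rule (not from the slides; the energy function and the neighbor-proposal routine are assumed to be supplied by the caller):

```python
import math
import random

def metropolis_step(x_old, energy, propose_neighbor, T):
    """One Metropolis step: accept downhill moves always,
    uphill moves with probability exp(-delE / T)."""
    x_new = propose_neighbor(x_old)       # pick a neighboring state
    delE = energy(x_new) - energy(x_old)
    if delE < 0:
        return x_new                      # strict improvement: always accept
    # uphill move: accept with probability exp(-delE/T); at T = 0 never accept
    if T > 0 and random.random() < math.exp(-delE / T):
        return x_new
    return x_old
```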

  5. The probability to accept an increasing energy move

  6. The Metropolis Algorithm • As T → 0 and when delE > 0: P(xnew) → 0 • At T = 0: strict minimization • High T randomizes the configuration away from the minimum • Low T cannot escape local minima • Starting from a high T, the slower T is decreased, the lower the E(x) that is achieved • The slow reduction in T allows the material to reach a more ordered configuration: increase the size of its crystals and reduce their defects

  7. Fast cooling – amorphous solid

  8. Slow cooling - crystalline solid

  9. SA for the 2D Ising model: E = -Σ_<ij> s_i s_j, where i and j are nearest neighbors (figure: spin configuration) Eold = -2

  10. SA for the 2D Ising model: E = -Σ_<ij> s_i s_j, where i and j are nearest neighbors (figure: spin configurations before and after a flip) Eold = -2, Enew = 2

  11. SA for the 2D Ising model: E = -Σ_<ij> s_i s_j, where i and j are nearest neighbors (figure: spin configurations before and after a flip) Eold = -2, Enew = 2, delE = Enew - Eold = 4 > 0, P(Enew) = exp(-4/T)

  12. SA for the 2D Ising model: E = -Σ_<ij> s_i s_j, where i and j are nearest neighbors (figure: spin configurations before and after a flip) Eold = -2, Enew = 2, delE = Enew - Eold = 4 > 0, P(Enew) = exp(-4/T) = 0.3 => T = -4/ln 0.3 ≈ 3.3. Reduce T by a factor a, 0 < a < 1: Tn+1 = a·Tn
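
To make the procedure concrete, here is a hedged Python sketch of SA on a 2D Ising lattice with periodic boundaries and geometric cooling; the parameter values (T0, the cooling factor alpha, the number of sweeps) are illustrative choices, not prescriptions from the slides:

```python
import numpy as np

def ising_energy_change(s, i, j):
    """delE for flipping spin (i, j) in E = -sum_<ij> s_i s_j,
    with periodic boundary conditions."""
    n, m = s.shape
    nbrs = (s[(i - 1) % n, j] + s[(i + 1) % n, j]
            + s[i, (j - 1) % m] + s[i, (j + 1) % m])
    return 2.0 * s[i, j] * nbrs

def anneal_ising(s, T0=3.3, alpha=0.9, sweeps_per_T=10, n_temps=30, rng=None):
    """Simulated annealing of a 2D Ising lattice with geometric cooling Tn+1 = alpha*Tn."""
    rng = np.random.default_rng() if rng is None else rng
    T = T0
    for _ in range(n_temps):
        for _ in range(sweeps_per_T * s.size):
            i, j = rng.integers(s.shape[0]), rng.integers(s.shape[1])
            delE = ising_energy_change(s, i, j)
            if delE <= 0 or rng.random() < np.exp(-delE / T):
                s[i, j] *= -1              # accept the flip
        T *= alpha                         # geometric cooling
    return s
```

Starting from, say, s = np.ones((32, 32), dtype=int) with a stripe of flipped spins gives a setup along the lines of Exc#7 below.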

  13. Exc#7: SA for the 2D Ising model (see Exc#1). Consider the following cases:
  1. For h1 = h2 = 0, set a stripe of width 3, 6 or 12 with opposite sign
  2. For h1 = -0.1, h2 = 0.4, set -1 at h1 and +1 at h2
  3. Repeat 2 with two 8x8 squares of plus spins with h2 = 0.4, located apart from each other
  Calculate T0 to allow 10% flips of a spin surrounded by 4 neighbors of the same sign. Use faster / slower cooling schedules.
  a. What were the starting T0 and E in each case?
  b. How was T0 decreased? How many sweeps were employed?
  c. What was the final configuration? Was the global minimum achievable? If not, try a different T0.
  d. Is it harder to flip a wider stripe?
  e. Is it harder to flip 2 squares than just one?

  14. SA for the bisectioning problem (figure: partitions R and R', node i on the cut)

  15. SA for the bisectioning problem: “individual” temperature (figure: partitions R and R', node i on the cut). The probability of i to belong to R' depends on Si = Σ_{j in R'} aij / Σ_j aij: P(i in R') = 1 if delE <= 0, and exp[-delE/(T·Si)] if delE > 0

  16. SA for the bisectioning problem: “individual” temperature (figure: partitions R and R', node i on the cut). The probability of i to belong to R' should increase if a bigger change along the cut line is made. If delE is small enough, it is expected that further moves will indeed eventually produce a lower E.

  17. SA for the bisectioning problem: how to choose T (figure: partitions R and R', node i on the cut) • Calculate delE/Si along the cut line and sort the values • Decide upon the % of changes desired • Find the appropriate T by demanding P(%) = 0.5
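
A small Python sketch of the two formulas above: the individual-temperature acceptance probability and the percentile-based choice of T. How the desired percentage maps to an index in the sorted list of delE/Si values is an assumption of this sketch:

```python
import math

def individual_acceptance(delE, S_i, T):
    """P(i moves to R'): 1 for non-increasing moves, exp[-delE/(T*S_i)] otherwise."""
    if delE <= 0:
        return 1.0
    return math.exp(-delE / (T * S_i))

def choose_T(ratios, accept_fraction):
    """Pick T so the move at the desired quantile of the sorted delE/S_i values
    is accepted with probability 0.5, i.e. exp(-r/T) = 0.5 => T = r / ln 2."""
    ratios = sorted(r for r in ratios if r > 0)
    if not ratios:
        return 0.0
    k = min(int(accept_fraction * len(ratios)), len(ratios) - 1)
    return ratios[k] / math.log(2.0)
```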

  18. SA for linear ordering problems: multiple choices for a variable • Try to move node i up to k positions to the right and to the left: choose between the 2k+1 possibilities • For j = -k,..,-1,1,..,k: P(j) = z·min[1, exp(-delE(j)/T(j))] • For j = 0: P(0) = z·min_j[1 - P(j)/z] • z is calculated from the normalization Σ_j P(j) = 1 • T(j) is calculated a priori for each j, aiming at a certain acceptance rate (e.g. 60%)
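
A hedged sketch of choosing one of the 2k+1 shifts. It treats the "stay" move j = 0 simply as weight 1 and normalizes all weights by z, which glosses over the slide's special formula for P(0); delE and T are assumed to be callables returning the energy change and the per-shift temperature:

```python
import math
import random

def move_weight(dE, Tj):
    """min[1, exp(-dE/Tj)], written so very negative dE cannot overflow."""
    return 1.0 if dE <= 0 else math.exp(-dE / Tj)

def choose_shift(delE, T, k):
    """Pick one of the shifts j = -k..k with probability proportional to its weight."""
    shifts = list(range(-k, k + 1))
    weights = [1.0 if j == 0 else move_weight(delE(j), T(j)) for j in shifts]
    z = sum(weights)                       # normalization so probabilities sum to 1
    r, acc = random.random() * z, 0.0
    for j, w in zip(shifts, weights):
        acc += w
        if r <= acc:
            return j
    return shifts[-1]
```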

  19. The Metropolis Algorithm (cont.) • May result in very slow processing • Still, SA is considered to be a powerful global minimizer • Instead of a very slow cooling schedule, repeat heating-cooling several times

  20. Heating-cooling scheduling (figure: T vs. number of relaxation sweeps)

  21. The Metropolis Algorithm (cont.) • May result in very slow processing • Still, SA is considered to be a powerful global minimizer • Instead of a very slow cooling schedule, repeat heating-cooling several times and keep track of the best-so-far configuration: • The best-so-far has a non-increasing E • It acts as an outside observer • The best-so-far is actually the calculated minimum

  22. Heating-cooling scheduling (figure: T vs. number of relaxation sweeps; store the best-so-far)

  23. The Metropolis Algorithm (cont.) • May result in very slow processing • Still, SA is considered to be a powerful global minimizer • Instead of a very slow cooling schedule, repeat heating-cooling several times and keep track of the best-so-far configuration: • The best-so-far has a non-increasing E • It acts as an outside observer • The best-so-far is actually the calculated minimum • Problem: heating may destroy already achieved minima in various subregions • Add “memory” of the best-so-far for those subregions
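
A schematic sketch of repeated heating-cooling with an outside-observer best-so-far; anneal_cycle is a hypothetical placeholder for one reheat-and-cool run (e.g. the Metropolis loop sketched above):

```python
def repeated_annealing(x0, energy, anneal_cycle, n_cycles, T_hot):
    """Repeat heating-cooling several times; the best-so-far configuration is
    kept outside the annealing process, so its energy never increases."""
    best_x, best_E = x0, energy(x0)
    x = x0
    for _ in range(n_cycles):
        x = anneal_cycle(x, T_hot)        # reheat to T_hot, then cool down to T = 0
        E = energy(x)
        if E < best_E:                    # the best-so-far only improves
            best_x, best_E = x, E
    return best_x, best_E
```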

  24. Lowest Common Configuration (figure: energy landscape; the global minimum is marked)

  25. Lowest Common Configuration (figure: energy landscape with configuration C1 and the global minimum)

  26. Lowest Common Configuration (figure: energy landscape with configurations C1, C2 and the global minimum)

  27. Lowest Common Configuration (figure: energy landscape with C1, C2, their LCC and the global minimum) E(LCC(C1, C2)) <= min[E(C1), E(C2)]

  28. Heating-cooling scheduling (figure: T vs. number of relaxation sweeps; apply the LCC)

  29. Heating-cooling scheduling (figure: T vs. number of relaxation sweeps) best-so-far ← LCC(best-so-far, the new T=0 configuration)

  30. Exc#8: LCC for the bisectioning problem (figure: partitions R and R', node i on the cut). Given 2 partitions, find a linear-time algorithm for the construction of their LCC

  31. Exc#8*: LCC for linear ordering problems. Find a (nearly) linear-time algorithm (e.g. sorting is allowed) for the LCC of 2 permutations, in which subpermutations are detected and chosen into the best-so-far

  32. Multilevel Simulated Annealing • Do not increase T by much: avoid destroying the global solution inherited from the coarser levels • Reduce T quickly: typically 2-3 values of T>0 (followed by strict minimization) are sufficient • Repeat heating-cooling several times per level • Accumulate the minimal solution into the best-so-far by applying the LCC at the end of T=0 • Interpolate the best-so-far to the next level
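
A schematic, heavily simplified Python sketch of the multilevel loop described above; the coarse-to-fine hierarchy, the per-level annealing cycle, the interpolation, and the LCC are all problem-specific and appear here only as hypothetical placeholders:

```python
def multilevel_sa(levels, anneal_cycle, interpolate, lcc, n_cycles=3):
    """Schematic multilevel SA over a coarse-to-fine hierarchy 'levels'.
    anneal_cycle(x, level) should use only a few, quickly decreasing T > 0
    values followed by strict minimization (T = 0)."""
    best = anneal_cycle(levels[0].initial_solution(), levels[0])   # coarsest level
    for level in levels[1:]:
        best = interpolate(best, level)       # inherit the coarser-level solution
        for _ in range(n_cycles):             # mild heating, quick cooling, repeated
            candidate = anneal_cycle(best, level)
            best = lcc(best, candidate)       # accumulate the minimum via the LCC
    return best
```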

  33. Genetic algorithm: a global minimizer

  34. Genetic algorithm • A global search technique inspired by evolutionary biology • Start from a population of individuals (randomly generated) – this is the 1st generation • The next generation follows by: 1. selection of individuals from the current generation to breed the next generation, according to some fitness measure 2. crossover (recombination) of pairs of (randomly chosen) parents to produce offspring 3. mutations applied randomly to enhance the diversity of the individuals in the generation
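
A generic GA skeleton following the three steps above; the selection, crossover, and mutation operators (and the energy/fitness function, where lower is better) are placeholders to be supplied per problem:

```python
import random

def genetic_algorithm(init_population, energy, select, crossover, mutate,
                      n_generations, mutation_rate=0.05):
    """Generic GA loop: selection, crossover, and random mutation per generation."""
    population = init_population()              # 1st generation (random individuals)
    for _ in range(n_generations):
        parents = select(population, energy)    # fitter (lower-energy) individuals breed
        offspring = []
        while len(offspring) < len(population):
            p1, p2 = random.sample(parents, 2)  # randomly chosen pair of parents
            child = crossover(p1, p2)
            if random.random() < mutation_rate:
                child = mutate(child)           # keep the population diverse
            offspring.append(child)
        population = offspring
    return min(population, key=energy)          # best individual found
```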

  35. A genetic algorithm for the linear arrangement problem P=1 • Initial population: 1. select a starting vertex 2. build the permutation by the greedy frontal increase minimization algorithm

  36. A genetic algorithm for the linear arrangement problem P=1 • Initial population: 1. select a starting vertex 2. build the permutation by the greedy frontal increase minimization algorithm (figure: frontier Fi) Choose a node from Fi: the one most strongly connected to the already placed nodes
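
A sketch of this greedy frontal construction, assuming the graph is given as a dict of neighbor-weight dicts; it orders only the component reachable from the starting vertex:

```python
def greedy_placement(adj, start):
    """Build an initial permutation: repeatedly take from the frontier the node
    most strongly connected to the nodes already placed.
    adj: dict mapping node -> {neighbor: weight}."""
    placed = [start]
    placed_set = {start}
    frontier = set(adj[start])
    while frontier:
        # connection strength of each frontier node to the already placed prefix
        best = max(frontier,
                   key=lambda v: sum(w for u, w in adj[v].items() if u in placed_set))
        placed.append(best)
        placed_set.add(best)
        frontier.remove(best)
        frontier |= set(adj[best]) - placed_set
    return placed
```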

  37. A genetic algorithm for the linear arrangement problem P=1 • Initial population: 1. select a starting vertex 2. build the permutation by the greedy frontal increase minimization algorithm • Selection of survivors is based on E(x) • Recombine two randomly chosen parents: Parent 1: 5 7 2 3 8 6 9 1 4 Parent 2: 6 1 2 4 3 5 9 8 7

  38. A genetic algorithm for the linear arrangement problem P=1 • Initial population: 1. select a starting vertex 2. build the permutation by the greedy frontal increase minimization algorithm • Selection of survivors is based on E(x) • Recombine two randomly chosen parents: Parent 1: 5 7 2 3 8 6 9 1 4 Parent 2: 6 1 2 4 3 5 9 8 7 Offspring: 2 9

  39. A genetic algorithm for the linear arrangement problem P=1 • Initial population: 1. select a starting vertex 2. build the permutation by the greedy frontal increase minimization algorithm • Selection of survivors is based on E(x) • Recombine two randomly chosen parents: Parent 1: 5 7 2 3 8 6 9 1 4 Parent 2: 6 1 2 4 3 5 9 8 7 Offspring: 2 9 _ 326 _ 795 _

  40. A genetic algorithm for the linear arrangement problem P=1 • Initial population: 1. select a starting vertex 2. build the permutation by the greedy frontal increase minimization algorithm • Selection of survivors is based on E(x) • Recombine two randomly chosen parents: Parent 1: 5 7 2 3 8 6 9 1 4 Parent 2: 6 1 2 4 3 5 9 8 7 Offspring: 2 9 _ 32 6 _ 795 _ _ 32 687954

  41. A genetic algorithm for the linear arrangement problem P=1 • Initial population: 1. select a starting vertex 2. build the permutation by the greedy frontal increase minimization algorithm • Selection of survivors is based on E(x) • Recombine two randomly chosen parents: Parent 1: 5 7 2 3 8 6 9 1 4 Parent 2: 6 1 2 4 3 5 9 8 7 Offspring: 2 9 _ 32 6 _ 795 _ _ 32 687954 1 32 687954

  42. A genetic algorithm for the linear arrangement problem P=1 • Initial population: 1. select a starting vertex 2. build the permutation by the greedy frontal increase minimization algorithm • Selection of survivors is based on E(x) The next generation is constructed by: 1. Recombination of 2 randomly chosen parents 2. Improving the E(x) of the offspring by local processing, e.g. by Simulated Annealing 3. Choosing the best individuals from the pool of parents and children

  43. Spectral Sequencing: a global minimizer

  44. Spectral Sequencing: a global minimizer • Given a weighted graph where wij is the edge weight between the nodes i and j • Define the graph Laplacian A by aij = -wij, aii = Σ_j wij • A is symmetric positive semidefinite • Consider the eigenvalue problem Ax = λx • Arranging the nodes of the graph according to the eigenvector associated with the 2nd smallest eigenvalue has been shown by Hall (1970) to be the solution to the problem min Σ_ij wij (xi - xj)^2 for real variables x
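
A compact sketch of spectral sequencing with dense NumPy linear algebra; the weight matrix representation is an assumption of this sketch:

```python
import numpy as np

def spectral_ordering(W):
    """Order graph nodes by the eigenvector of the graph Laplacian associated
    with the 2nd smallest eigenvalue (the Fiedler vector).
    W: symmetric weight matrix, W[i, j] = w_ij >= 0."""
    A = np.diag(W.sum(axis=1)) - W          # Laplacian: a_ii = sum_j w_ij, a_ij = -w_ij
    eigvals, eigvecs = np.linalg.eigh(A)    # eigh returns eigenvalues in ascending order
    fiedler = eigvecs[:, 1]                 # eigenvector of the 2nd smallest eigenvalue
    return np.argsort(fiedler)              # node order along the Fiedler vector
```

For large graphs the dense eigh call would be replaced by a sparse or multilevel eigensolver, as the next slide suggests.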

  45. Spectral Sequencing: a global minimizer • SS has been used extensively to solve a large variety of ordering problems: • Linear ordering problems: P = 1, 2, … • Partitioning problems • Embedding into lower dimensions, etc. • To calculate the eigenvectors, use a multilevel method • The direct use of multilevel to solve the original problem produces better results than using the ordering dictated by SS

  46. P=2: Multilevel approach vs. Spectral method (table: result ratios per graph). The results of the multilevel approach were obtained without post-processing! Ilya Safro, Dorit Ron, A. Brandt: J. Graph Alg. Appl. 10 (2006) 237-258
