Simulated annealing for convex optimization
Presentation Transcript

Simulated annealing for convex optimization

Adam . Kalai: TTI-Chicago

Santosh Vempala: MIT

Bar Ilan University

2004


TTI-Chicago

  • 100-million dollar endowment (thanks, Toyoda!)

  • 12 tenure-track slots, 18 visitors

  • On University of Chicago campus

    • Optional teaching

    • Advising graduate students


Outline

Simulated annealing gives the best known run-time guarantees for this problem.

It is optimal among a class of random search techniques.

  • Simulated annealing

    • A method for blind search:

      • f:X!, minx2X f(x)

      • Neighbor structure N(x) µ X

    • Useful in practice

    • Difficult to analyze

  • A generalization of linear programming

    • Minimize a linear function over a convex set S ⊂ ℝⁿ

    • Example: min 2x₁ + 5x₂ − 11x₃ subject to x₁² + 5x₂² + 3x₃² ≤ 1

    • Set S specified by a membership oracle M : ℝⁿ → {0,1}

    • M(x) = 1 ⇔ x ∈ S

    • Difficult, cannot use most linear programming techniques [GLS81,BV02]

In high dimensions
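To make the oracle model concrete, here is the slide's example constraint wrapped as a membership oracle (a minimal Python sketch; the function name is ours). The optimizer may only probe S through 0/1 answers like these and never sees the constraint itself.

```python
def membership_oracle(x):
    """M : R^3 -> {0,1} for the slide's example set
    S = { x : x1^2 + 5*x2^2 + 3*x3^2 <= 1 }."""
    x1, x2, x3 = x
    return 1 if x1 ** 2 + 5 * x2 ** 2 + 3 * x3 ** 2 <= 1 else 0

print(membership_oracle((0.0, 0.0, 0.0)))  # 1: the origin lies in S
print(membership_oracle((1.0, 1.0, 1.0)))  # 0: this point violates the constraint
```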




Simulated Annealing [KGV83]

Phase 1: Hot (Random)

Phase 2: Warm (Bias down)

Phase 3: Cold (Descend)



Simulated Annealing

  • f : X → ℝ, minimize f(x) over x ∈ X

  • Proceed in phases i=0,1,2,…,m

  • Temperature Tᵢ = T₀(1 − ε)ⁱ

  • In phase i, do a random walk with stationary distribution πᵢ: πᵢ(x) ∝ e^(−f(x)/Tᵢ)

  • i = 0: near uniform → i = m: near optima

Geometric temperature schedule

Boltzmann distribution

Metropolis filter for stationary dist. π:

From x, pick random neighbor y.

If π(y) > π(x), move to y.

If π(y) ≤ π(x), move to y with prob. π(y)/π(x)


Simulated Annealing

  • Great blind search technique

  • Works well in practice

  • Little theory

    • Exponential time

    • Planted graph bisection [JS93]

    • Fractal functions [S91]


Convex optimization

minimize f(x) = c·x = height

x ∈ S = hill

Find the bottom of the hill

using few pokes (membership queries)



Convex optimization

minimize f(x) = c·x = height

x ∈ S ⊂ ℝⁿ = hill

Find the bottom of the hill

using few pokes (membership queries)

  • Ellipsoid method: O*(n^10) queries

  • Random walks [BV02]: O*(n^5) queries

n = # dimensions


Walking in a convex set

Metropolis filter for stationary dist. π:

From x, pick random neighbor y.

If π(y) > π(x), move to y.

If π(y) ≤ π(x), move to y with prob. π(y)/π(x)



Hit and run

  • To sample with stationary dist. π:

  • Pick a random direction through the point

  • C = S ∩ line in that direction

  • Take a random point from π|C



Hit and run

  • Start from a point x, random from dist. π

  • After O*(n^3) steps, you have a new random point, “almost independent” from x [LV03]

  • Difficult analysis

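A minimal sketch of one hit-and-run trajectory, assuming S is a Euclidean ball so the chord C = S ∩ line has a closed form (a general convex body would locate the chord via membership queries instead); all parameter values are illustrative.

```python
import math
import random

def hit_and_run(x, n_steps, rng, radius=1.0):
    """One hit-and-run trajectory targeting the uniform distribution
    over a ball of the given radius (standing in for a convex body S)."""
    for _ in range(n_steps):
        u = [rng.gauss(0, 1) for _ in x]                 # random direction
        norm = math.sqrt(sum(ui * ui for ui in u))
        u = [ui / norm for ui in u]
        # chord C = S ∩ line: solve |x + t*u|^2 = radius^2, i.e.
        # t^2 + 2*b*t + c = 0 with b = x.u, c = |x|^2 - radius^2
        b = sum(xi * ui for xi, ui in zip(x, u))
        c = sum(xi * xi for xi in x) - radius ** 2
        disc = math.sqrt(max(b * b - c, 0.0))
        t = rng.uniform(-b - disc, -b + disc)            # uniform point on the chord
        x = [xi + t * ui for xi, ui in zip(x, u)]
    return x

rng = random.Random(1)
pts = [hit_and_run([0.5, 0.0, 0.0], 20, rng) for _ in range(200)]
```

After a modest number of steps the points forget the common starting point (0.5, 0, 0) and spread over the ball, which is the "almost independent" behavior the slide describes.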


Random walks for optimization [BV02]

  • Each phase, volume decreases by ≈ 2/3

  • In n dimensions, O(n) phases to halve distance to opt.


Annealing is slightly faster

  • min_{x ∈ S} c·x

  • Use distributions:

    • πᵢ(x) ∝ e^(−c·x/Tᵢ)

    • After O*(√n) phases, halve distance to opt.

    • That’s compared to O(n) phases [BV02]

Boltzmann distribution

Geometric temperature schedule
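Putting the pieces together, here is a hedged sketch of annealing for min c·x over a convex body given only by a membership oracle: a hit-and-run walk whose chord step samples the 1-D exponential density that the Boltzmann target πᵢ(x) ∝ e^(−c·x/Tᵢ) induces along the chord. The unit-ball body, the binary-search boundary finder, and all parameter values are our illustrative choices.

```python
import math
import random

def membership(x):
    """Membership oracle: S is the unit ball in R^3 (illustrative choice).
    The optimizer touches S only through this 0/1 answer."""
    return sum(xi * xi for xi in x) <= 1.0

def chord_end(x, u, sign, R=2.5, iters=50):
    """Binary-search the boundary of S from x along sign*u using only
    membership queries; R is an a-priori bound on the body's size."""
    lo, hi = 0.0, R
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if membership([xi + sign * mid * ui for xi, ui in zip(x, u)]):
            lo = mid
        else:
            hi = mid
    return sign * lo

def sample_exp(a, lo, hi, rng):
    """Draw t with density proportional to exp(-a*t) on [lo, hi]."""
    if a < 0:
        return lo + hi - sample_exp(-a, lo, hi, rng)   # mirror: mass near hi
    if a * (hi - lo) < 1e-12:
        return rng.uniform(lo, hi)                     # essentially flat
    u = rng.random()
    return lo - math.log(1.0 - u * (1.0 - math.exp(-a * (hi - lo)))) / a

def anneal_lp(c, x, T0=1.0, decay=0.75, phases=20, steps=40, seed=2):
    """Minimize c.x over S: hit-and-run whose chord step samples the
    Boltzmann target pi_i(x) ~ exp(-c.x/T_i), with geometric cooling."""
    rng = random.Random(seed)
    T = T0
    for _ in range(phases):
        for _ in range(steps):
            u = [rng.gauss(0, 1) for _ in c]              # random direction
            norm = math.sqrt(sum(ui * ui for ui in u))
            u = [ui / norm for ui in u]
            t_lo = chord_end(x, u, -1.0)                  # chord C = S ∩ line
            t_hi = chord_end(x, u, +1.0)
            a = sum(ci * ui for ci, ui in zip(c, u)) / T  # exp slope on chord
            t = sample_exp(a, t_lo, t_hi, rng)
            x = [xi + t * ui for xi, ui in zip(x, u)]
        T *= decay
    return x

x_min = anneal_lp([1.0, 0.0, 0.0], [0.0, 0.0, 0.0])  # optimum is (-1, 0, 0)
```

Along the chord x + t·u, the target density is proportional to e^(−(c·u)t/T), so the chord step can be done exactly by inverse-CDF sampling; only the chord endpoints need the oracle.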


Annealing Optimality

  • Assumptions:

    • Sequence of distributions π₁, π₂, …

    • Each density dᵢ is log-concave

    • Consecutive densities dᵢ, dᵢ₊₁ overlap

  • Requires at least Ω*(√n) phases

  • Simulated Annealing does it in O*(√n) phases


Lower bound idea

  • mean mᵢ = Eᵢ[c·x]

  • variance σᵢ² = Eᵢ[(c·x − mᵢ)²]

  • overlap P = ∫ min(dᵢ, dᵢ₊₁)

  • lemma: mᵢ − mᵢ₊₁ ≤ (σᵢ + σᵢ₊₁) ln(2/P)

    • follows from log-concavity of dᵢ

    • log-concave ⇒ P(t std dev's from mean) < e^(−t)

  • In worst case, e.g. cone, small std dev

    • σᵢ ≤ (mᵢ − min c·x)/√n


Worst case: a cone

  • min_{x ∈ S} x₀

  • S = { x ∈ ℝⁿ : −x₀ ≤ x₁, x₂, …, xₙ₋₁ ≤ x₀ ≤ 10 }

  • Uniform dist. on S | x₀ < δ

    • mean ≈ δ − δ/n

    • std dev ≈ δ/n

  • Boltzmann dist. e^(−x₀/T)

    • mean ≈ nT

    • std dev ≈ √n·T

linear program
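The two approximations on this slide can be checked exactly: along the objective x₀ the cone's cross-sections scale like x₀^(n−1), so both induced 1-D densities have closed-form moments (the values of n, δ, and T below are our illustrative choices).

```python
import math

# Induced 1-D densities along x0 for the cone (cross-sections ~ x0^(n-1)):
#   uniform on S | x0 < delta :  density ~ x0^(n-1) on [0, delta]
#   Boltzmann exp(-x0/T)      :  density ~ x0^(n-1) * exp(-x0/T), a Gamma(n, T)
n, delta, T = 100, 1.0, 1.0

# Uniform slice: mean = delta*n/(n+1), E[x^2] = delta^2*n/(n+2)
u_mean = delta * n / (n + 1)
u_std = math.sqrt(delta ** 2 * n / (n + 2) - u_mean ** 2)

# Boltzmann / Gamma(n, T): mean = n*T, std = sqrt(n)*T
b_mean, b_std = n * T, math.sqrt(n) * T

print(u_std / u_mean)  # about 1/n: relative spread shrinks linearly with n
print(b_std / b_mean)  # exactly 1/sqrt(n): much wider relative spread
```

The relative spread is what limits progress per phase, so 1/n versus 1/√n is precisely the gap between the O(n) phases of the uniform-distribution approach and the O*(√n) phases of annealing.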


Any convex shape

  • Fix convex set S and direction c.

  • Fix mean m = E[c·x]

  • d(x) = f(c·x), log-concave

  • Conjecture: The log-concave distribution over S with mean m and largest variance σ² = E[(c·x − m)²] is a Boltzmann dist. (exponential dist.)


Upper bound basics

  • Dist i/ e-c¢x/Ti

  • Lemma: Ei[c ¢ x] · (minx 2 S c ¢ x ) + n|c|Ti


Upper bound difficulties

  • Not sufficient that distributions overlap

  • An expected warm start:

Shape may change


Shape estimation

Estimate covariance with O*(n) samples

Similar issues with hit and run


Shape re-estimation

  • Shape estimate is covariance matrix (normalized)

  • OK as long as relative estimates are accurate within a constant factor

  • In most cases shape changes little

    • No need for re-estimation

    • Cube, ball, cone, …

  • In worst case, shape may change every phase

    • Increase run-time by factor of n

    • Differs from simulated annealing


Run-time guarantees

  • Annealing: O*(√n) phases

  • State-of-the-art walks [LV03]

    • Worst case: O*(n) samples per phase (for shape)

    • O*(n^3) steps per sample

  • Total: O*(n^4.5) (compare to O*(n^10) [GLS81], O*(n^5) [BV02])


Conclusions

  • Random search is useful for convex optimization [BV02]

  • Simulated annealing can be analyzed for convex optimization [KV04]

  • It’s opt among random search procedures

    • Annoying shape re-estimation

    • Difficult analyses of random walks [LV02]

  • Weird: no local minima!

  • Analyzed for other problems?


Reverse annealing [LV03]

  • Start near single point v

  • Idea

    • Sample from density ∝ e^(−|x−v|/Tᵢ) in phase i

    • Temperature increases

    • Move from single point to uniform dist

    • Estimate volume increase each time

  • Able to do in O*(n^4) rather than O*(n^4.5)

  • Similar algorithm analysis