
### Simulated annealing for convex optimization

Adam Kalai: TTI-Chicago

Santosh Vempala: MIT

Bar Ilan University

2004

100-million dollar endowment (thanks, Toyoda!)

- 12 tenure-track slots, 18 visitors
- On University of Chicago campus
- Optional teaching
- Advising graduate students

Outline

Simulated annealing gives the best known run-time guarantees for this problem.

It is optimal among a class of random search techniques.

- Simulated annealing
- A method for blind search:
- f: X → ℝ, min_{x∈X} f(x)
- Neighbor structure N(x) ⊆ X
- Useful in practice
- Difficult to analyze
- A generalization of linear programming
- Minimize a linear function over a convex set S ⊂ ℝⁿ
- Example: min 2x₁ + 5x₂ − 11x₃ with x₁² + 5x₂² + 3x₃² ≤ 1
- Set S specified by a membership oracle M: ℝⁿ → {0,1}
- M(x) = 1 ⇔ x ∈ S
- Difficult, cannot use most linear programming techniques [GLS81,BV02]
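The example above has a closed form, which makes a handy sanity check. For min c·x over an ellipsoid xᵀDx ≤ 1 with D diagonal and positive, the Lagrange conditions give x* = −D⁻¹c/√(cᵀD⁻¹c). A minimal sketch (variable names are mine, not the talk's):

```python
import math

# Slide's example: minimize 2x1 + 5x2 - 11x3 subject to x1^2 + 5x2^2 + 3x3^2 <= 1.
# For min c.x over the ellipsoid x^T D x <= 1 (D diagonal, positive), the
# Lagrange conditions give x* = -D^{-1} c / sqrt(c^T D^{-1} c),
# with optimal value -sqrt(c^T D^{-1} c).
c = [2.0, 5.0, -11.0]
d = [1.0, 5.0, 3.0]  # diagonal of D

s = sum(ci * ci / di for ci, di in zip(c, d))          # c^T D^{-1} c
x_opt = [-ci / (di * math.sqrt(s)) for ci, di in zip(c, d)]
val = sum(ci * xi for ci, xi in zip(c, x_opt))          # optimal value, -sqrt(s)
tight = sum(di * xi * xi for di, xi in zip(d, x_opt))   # constraint value, ~1.0
```

A membership-oracle method never sees this structure, of course; it only gets the 0/1 answers of `tight <= 1`-style queries.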

In high dimensions

Simulated Annealing [KGV83]

Phase 1: Hot (Random)

Phase 2: Warm (Bias down)

Phase 3: Cold (Descent)

Simulated Annealing

- f: X → ℝ, min_{x∈X} f(x)
- Proceed in phases i = 0, 1, 2, …, m
- Temperature Tᵢ = T₀(1 − ε)^i
- In phase i, do a random walk with stationary distribution πᵢ: πᵢ(x) ∝ e^{−f(x)/Tᵢ}
- i = 0: near uniform → i = m: near optima

Geometric temperature schedule

Boltzmann distribution
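Put together, the phases above amount to a short loop. This is a generic sketch, not the talk's algorithm: the schedule parameters and the toy one-dimensional objective are illustrative choices of mine.

```python
import math
import random

def anneal(f, neighbor, x0, T0=10.0, eps=0.1, phases=50, steps_per_phase=100):
    """Simulated annealing sketch: geometric schedule T_i = T0 * (1 - eps)^i,
    Metropolis acceptance targeting pi_i(x) ~ exp(-f(x) / T_i)."""
    x = x0
    for i in range(phases):
        T = T0 * (1 - eps) ** i
        for _ in range(steps_per_phase):
            y = neighbor(x)
            # Accept downhill moves always, uphill with prob e^{-(f(y)-f(x))/T}.
            if f(y) <= f(x) or random.random() < math.exp(-(f(y) - f(x)) / T):
                x = y
    return x

# Toy example (mine, not the talk's): minimize f(x) = x^2 over the line.
random.seed(0)
x_min = anneal(lambda x: x * x, lambda x: x + random.uniform(-1, 1), x0=5.0)
```

Early hot phases wander almost freely; the cold final phases concentrate near the minimizer, so `x_min` ends up close to 0.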

Metropolis filter for stationary dist. π:

From x, pick a random neighbor y.

If π(y) > π(x), move to y.

If π(y) ≤ π(x), move to y with prob. π(y)/π(x).
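The filter as a reusable step; the discrete target distribution and the ±1 neighbor rule in the example are stand-ins I picked for illustration.

```python
import random

def metropolis_step(x, pi, random_neighbor):
    """One Metropolis step for stationary distribution pi: move to a more
    likely neighbor always, to a less likely one with prob pi(y)/pi(x)."""
    y = random_neighbor(x)
    if pi(y) > pi(x) or random.random() < pi(y) / pi(x):
        return y
    return x

# Example target: pi(x) proportional to 2^{-x} on {0, ..., 10}.
random.seed(1)
pi = lambda x: 0.5 ** x if 0 <= x <= 10 else 0.0
neighbor = lambda x: x + random.choice([-1, 1])

x, visits_to_0 = 5, 0
n_steps = 20000
for _ in range(n_steps):
    x = metropolis_step(x, pi, neighbor)
    visits_to_0 += (x == 0)
# Stationary mass at 0 is 1/Z with Z = sum_k 2^{-k} ~ 2, so roughly half the time.
```

Detailed balance for this chain forces π(x+1) = π(x)/2, matching the target, which is the whole point of the filter.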

Simulated Annealing

- Great blind search technique
- Works well in practice
- Little theory
- Exponential time
- Planted graph bisection [JS93]
- Fractal functions [S91]

Convex optimization

minimize f(x) = c · x = height

x ∈ S = hill

Find the bottom of the hill

using few pokes (membership queries)

Convex and linear slope

Convex optimization

minimize f(x) = c · x = height

x ∈ S ⊂ ℝⁿ = hill

Find the bottom of the hill

using few pokes (membership queries)

- Ellipsoid method: O*(n¹⁰) queries
- Random walks [BV02]: O*(n⁵) queries

Convex and linear slope

n = # of dimensions

Walking in a convex set

Metropolis filter for stationary dist. π:

From x, pick a random neighbor y.

If π(y) > π(x), move to y.

If π(y) ≤ π(x), move to y with prob. π(y)/π(x).

Hit and run

- To sample with stationary dist. π:
- Pick a random direction through the point
- C = S ∩ (the line in that direction)
- Take a random point from π|C
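A sketch of one hit-and-run step for the uniform distribution over S, given only a membership oracle. The general version resamples the chord according to π; here `max_r` and the bisection depth are illustrative assumptions, and S is assumed convex, bounded, and to contain x.

```python
import math
import random

def hit_and_run_step(x, in_S, max_r=10.0):
    """One hit-and-run step for the uniform distribution over a convex set S,
    accessed only through the membership oracle in_S."""
    n = len(x)
    # Random direction: a normalized Gaussian vector is uniform on the sphere.
    d = [random.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(di * di for di in d))
    d = [di / norm for di in d]

    def inside(t):
        return in_S([xi + t * di for xi, di in zip(x, d)])

    # Chord C = S intersect the line {x + t*d}: bisect to find each endpoint.
    def endpoint(sign):
        lo, hi = 0.0, max_r
        for _ in range(40):
            mid = (lo + hi) / 2.0
            if inside(sign * mid):
                lo = mid
            else:
                hi = mid
        return sign * lo

    t = random.uniform(endpoint(-1.0), endpoint(+1.0))
    return [xi + t * di for xi, di in zip(x, d)]

# Example: walk inside the unit ball in R^3.
random.seed(2)
ball = lambda p: sum(pi * pi for pi in p) <= 1.0
x = [0.0, 0.0, 0.0]
for _ in range(200):
    x = hit_and_run_step(x, ball)
```

Each step costs only membership queries, which is exactly the access model of the talk.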


Hit and run

- Start from a point x, random from dist. π
- After O*(n³) steps, you have a new random point, “almost independent” from x [LV03]
- Difficult analysis


Random walks for optimization [BV02]

- Each phase, volume decreases by ≈ 2/3
- In n dimensions, O(n) phases to halve distance to opt.

Annealing is slightly faster

- min_{x∈S} c · x
- Use distributions: πᵢ(x) ∝ e^{−c·x/Tᵢ}
- After O*(√n) phases, halve distance to opt.
- That’s compared to O(n) phases [BV02]

Boltzmann distribution

Geometric temperature schedule

Annealing Optimality

- Assumptions:
- Sequence of distributions π₁, π₂, …
- Each density dᵢ is log-concave
- Consecutive densities dᵢ, dᵢ₊₁ overlap
- Requires at least Ω*(√n) phases
- Simulated annealing does it in O*(√n) phases

Lower bound idea

- mean mᵢ = Eᵢ[c · x]
- variance σᵢ² = Eᵢ[(c · x − mᵢ)²]
- overlap
- lemma: mᵢ − mᵢ₊₁ ≤ (σᵢ + σᵢ₊₁) ln(2P)
- follows from log-concavity of πᵢ
- log-concave ⇒ P(t std dev’s from mean) < e^{−t}
- In worst case, e.g. a cone, small std dev:
- σᵢ ≤ (mᵢ − min_{x∈S} c · x)/√n

Worst case: a cone

- minx 2 S x0
- S = { x2n | -x0· x1,x2,…,xn-1· x0 · 10}
- Uniform dist. on S|x0 <
- mean ¼ – /n
- std dev ¼/n
- Boltzmann dist. e- x/
- mean ¼ n
- std dev ¼

linear program
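The cone figures can be spot-checked: under the density e^{−x₀/T} over the cone, the cross-section volume grows like x₀^{n−1}, so the marginal of x₀ is Gamma(shape n, scale T), with mean nT and std dev √n·T. A quick Monte Carlo check (the values of n and T are arbitrary picks of mine):

```python
import math
import random

# Marginal of x0 under e^{-x0/T} over an n-dim cone is Gamma(shape=n, scale=T):
# mean n*T, std dev sqrt(n)*T -- far larger spread than the uniform case's eps/n.
random.seed(3)
n, T = 25, 1.0
samples = [random.gammavariate(n, T) for _ in range(200_000)]

mean = sum(samples) / len(samples)
std = math.sqrt(sum((s - mean) ** 2 for s in samples) / len(samples))
# Expect mean ~ n*T = 25 and std ~ sqrt(n)*T = 5.
```

The σ ≈ (m − min)/√n relation here is exactly the worst case in the lower-bound lemma.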

Any convex shape

- Fix convex set S and direction c.
- Fix mean m = E[c · x]
- density d(x) = f(c · x), log-concave
- Conjecture: the log-concave distribution over S with largest variance σ² = E[(c · x − m)²] is a Boltzmann dist. (exponential dist.)

Upper bound basics

- Dist. πᵢ ∝ e^{−c·x/Tᵢ}
- Lemma: Eᵢ[c · x] ≤ (min_{x∈S} c · x) + n|c|Tᵢ

Upper bound difficulties

- Not sufficient that distributions overlap
- An expected warm start:

Shape may change

Shape re-estimation

- Shape estimate is covariance matrix (normalized)
- OK as long as relative estimates are accurate within a constant factor
- In most cases shape changes little
- No need for re-estimation
- Cube, ball, cone, …
- In worst case, shape may change every phase
- Increase run-time by factor of n
- Differs from simulated annealing
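A toy version of the re-estimation step (my simplification: diagonal covariance only, and samples drawn from a box rather than from the walk; the algorithm proper estimates the full covariance matrix from samples of the current distribution):

```python
import math
import random

def diagonal_shape(samples):
    """Per-coordinate std devs of a sample cloud -- a diagonal stand-in for
    the covariance-matrix shape estimate used to rescale the set."""
    n = len(samples[0])
    means = [sum(s[i] for s in samples) / len(samples) for i in range(n)]
    return [
        math.sqrt(sum((s[i] - means[i]) ** 2 for s in samples) / len(samples))
        for i in range(n)
    ]

# A very elongated box; after dividing by the estimated shape, both
# coordinates have unit empirical variance, so the walk sees a round set.
random.seed(4)
pts = [[random.uniform(-100, 100), random.uniform(-1, 1)] for _ in range(5000)]
scale = diagonal_shape(pts)
rounded = [[p[i] / scale[i] for i in range(2)] for p in pts]
```

Rescaling like this is what keeps relative estimates within a constant factor when the shape drifts between phases.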

Run-time guarantees

- Annealing: O*(√n) phases
- State-of-the-art walks [LV03]:
- Worst case: O*(n) samples per phase (for the shape)
- O*(n³) steps per sample
- Total: O*(n^4.5) (compare to O*(n¹⁰) [GLS81] and O*(n⁵) [BV02])

Conclusions

- Random search is useful for convex optimization [BV02]
- Simulated annealing can be analyzed for convex optimization [KV04]
- It’s optimal among a class of random search procedures
- Annoying shape re-estimation
- Difficult analyses of random walks [LV02]
- Weird: no local minima!
- Analyzed for other problems?

Reverse annealing [LV03]

- Start near a single point v
- Idea:
- Sample from density ∝ e^{−|x−v|/Tᵢ} in phase i
- Temperature increases
- Move from a single point to the uniform dist.
- Estimate the volume increase each time
- Able to do in O*(n⁴) rather than O*(n^4.5)
- Similar algorithm analysis
