
Blind online optimization: gradient descent without a gradient




Presentation Transcript


  1. Blind online optimization: gradient descent without a gradient. Abie Flaxman (CMU), Adam Tauman Kalai (TTI), Brendan McMahan (CMU)

  2. Standard convex optimization • Convex feasible set S ⊆ ℝ^d • Concave function f : S → ℝ • Goal: find x ∈ S with f(x) ≥ max_{z∈S} f(z) − ε = f(x*) − ε

  3. Steepest ascent • Move in the direction of steepest ascent • Compute f′(x) (∇f(x) in higher dimensions) • Works for convex optimization (and many other problems) [Figure: iterates x1, x2, x3, x4 climbing the objective]
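As a concrete full-information baseline (a minimal sketch in Python, not the authors' code; step size and iteration count are illustrative):

```python
import numpy as np

def gradient_ascent(grad_f, x0, eta=0.1, steps=100):
    """Steepest ascent: repeatedly step in the direction of the gradient.

    grad_f: callable returning the gradient of a concave objective at x.
    eta:    step size (illustrative); steps: number of iterations.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = x + eta * grad_f(x)
    return x

# Example: maximize f(x) = -(x - 3)^2, whose gradient is -2(x - 3).
print(gradient_ascent(lambda x: -2 * (x - 3), x0=[0.0]))  # approaches [3.]
```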

  4. Typical application • A company produces certain numbers of cars per month • Production vector x ∈ ℝ^d (#Corollas, #Camrys, …) • The company's profit is a concave function of the production vector • Maximize total (equivalently, average) profit

  5. Problem definition and results • Sequence of unknown concave functions f1, f2, … • In period t: pick x_t ∈ S, then find out only f_t(x_t) • S convex • Theorem: (the slide's bound was an image; it is reconstructed below)
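The headline guarantee of the paper (Flaxman, Kalai, McMahan, SODA '05) is, up to problem-dependent constants, sublinear expected regret:

\[
\mathbb{E}\left[\max_{x \in S}\sum_{t=1}^{T} f_t(x) \;-\; \sum_{t=1}^{T} f_t(x_t)\right] \;=\; O\!\left(T^{3/4}\right).
\]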

  6. Online model: expected regret • The bound holds for arbitrary sequences f1, f2, … • Stronger than the stochastic model, in which f1, f2, … are i.i.d. from a distribution D and x* = arg max_{x∈S} E_D[f(x)]
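In symbols (my notation, following the slides' maximization convention), the expected regret after T periods is

\[
\mathrm{Regret}(T) \;=\; \mathbb{E}\left[\max_{x \in S}\sum_{t=1}^{T} f_t(x) \;-\; \sum_{t=1}^{T} f_t(x_t)\right],
\]

and the guarantee must hold for every sequence f_1, …, f_T; the i.i.d. stochastic model is the easier special case described on the slide.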

  7. Outline • Problem definition • Simple algorithm • Analysis sketch • Variations • Related work & applications

  8. First try • Zinkevich '03: if we could only compute gradients… [Figure: profit vs. #Camrys, showing curves f1, f2, f3, f4, plays x1, x2, x3, x4 with values f1(x1), …, f4(x4), and optimum x*]
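Zinkevich's full-information update is x_{t+1} = Proj_S(x_t + η ∇f_t(x_t)); a minimal Python sketch under that assumption, with a ball-shaped S standing in for a general convex set:

```python
import numpy as np

def project_to_ball(x, radius=1.0):
    """Euclidean projection onto a centered ball (stand-in for Proj_S)."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else (radius / norm) * x

def online_gradient_ascent(grad_fns, x0, eta):
    """Full-information online gradient ascent [Z03]: after playing x_t,
    the gradient of f_t at x_t is revealed and used for the next step."""
    x = np.asarray(x0, dtype=float)
    plays = []
    for grad_f in grad_fns:          # grad_fns[t](x) = gradient of f_t at x
        plays.append(x.copy())
        x = project_to_ball(x + eta * grad_f(x))
    return plays

# Two periods of f_t(x) = -||x - c_t||^2, whose gradient is -2(x - c_t).
grads = [lambda x: -2 * (x - np.array([0.5, 0.0])),
         lambda x: -2 * (x - np.array([0.0, 0.5]))]
print(online_gradient_ascent(grads, x0=[0.0, 0.0], eta=0.1))
```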

  9. Idea: one-point gradient • With probability ½, estimate = f(x + δ)/δ • With probability ½, estimate = −f(x − δ)/δ • E[estimate] ≈ f′(x) [Figure: profit vs. #Camrys, with sample points x − δ, x, x + δ]
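A quick sanity check of the one-point estimator (my sketch): although each period evaluates f at a single point, the estimate's mean is the symmetric difference quotient (f(x+δ) − f(x−δ)) / 2δ ≈ f′(x).

```python
import random

def one_point_gradient(f, x, delta):
    """Estimate f'(x) from ONE evaluation of f at a perturbed point."""
    if random.random() < 0.5:
        return f(x + delta) / delta
    return -f(x - delta) / delta

f = lambda x: -(x - 3) ** 2          # f'(x) = -2(x - 3), so f'(1) = 4
x, d = 1.0, 0.01
mean = sum(one_point_gradient(f, x, d) for _ in range(200_000)) / 200_000
print(mean, (f(x + d) - f(x - d)) / (2 * d))  # both close to 4
```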

  10. d-dimensional online algorithm (sketched below) [Figure: iterates x1, x2, x3, x4 moving inside the feasible set S]
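The slide's algorithm was drawn as a picture; in the paper it is: sample a uniform unit vector u_t, play y_t = x_t + δu_t, observe only f_t(y_t), form the estimate g_t = (d/δ) f_t(y_t) u_t, and take a projected gradient step. A sketch under those assumptions (ball-shaped S, illustrative parameters):

```python
import numpy as np

def random_unit_vector(d, rng):
    """Uniformly random direction on the unit sphere in R^d."""
    u = rng.standard_normal(d)
    return u / np.linalg.norm(u)

def bandit_gradient_ascent(fs, d, x0, eta, delta, radius=1.0):
    """One-point bandit gradient ascent: only f_t(y_t) is ever observed.

    fs: sequence of black-box concave functions, one per period.
    Assumes S is a centered ball of the given radius and delta < radius.
    """
    rng = np.random.default_rng(0)
    x = np.asarray(x0, dtype=float)
    plays = []
    for f in fs:
        u = random_unit_vector(d, rng)
        y = x + delta * u                 # the single exploratory play
        g = (d / delta) * f(y) * u        # one-point gradient estimate
        plays.append(y)
        x = x + eta * g                   # gradient step, then pull back so
        norm = np.linalg.norm(x)          # that x +/- delta*u stays inside S
        if norm > radius - delta:
            x *= (radius - delta) / norm
    return plays
```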

  11. Outline • Problem definition • Simple algorithm • Analysis sketch • Variations • Related work & applications

  12. Analysis ingredients • E[1-point estimate] is the gradient of a smoothed version of f • The smoothing error is small • Online gradient ascent analysis [Z03] • Online expected gradient ascent analysis • (Hidden complications)

  13. 1-pt gradient analysis [Figure: profit vs. #Camrys, with the two evaluation points x − δ and x + δ]

  14. 1-pt gradient analysis (d-dim) • E[1-point estimate] is the gradient of the smoothed function f̂ • |f̂ − f| is small
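This is the key identity (Lemma 1 of the paper, restated here, so treat it as a sketch): averaging f over a δ-ball gives a smoothed f̂ whose exact gradient the one-point estimate equals in expectation:

\[
\hat f(x) \;=\; \mathbb{E}_{v \sim \mathrm{Ball}}\big[f(x+\delta v)\big],
\qquad
\mathbb{E}_{u \sim \mathrm{Sphere}}\!\left[\frac{d}{\delta}\, f(x+\delta u)\, u\right] \;=\; \nabla \hat f(x),
\]

and for Lipschitz f the smoothing error |f̂(x) − f(x)| is O(δ).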

  15. Online gradient ascent [Z03] (concave f_t, bounded gradients; the slide's regret bound is reconstructed below)
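The bound on the slide is Zinkevich's guarantee; with D the diameter of S, G the gradient bound, and step size η = D/(G√T), it reads

\[
\sum_{t=1}^{T} f_t(x^*) \;-\; \sum_{t=1}^{T} f_t(x_t)
\;\le\; \frac{D^2}{2\eta} \;+\; \frac{\eta G^2 T}{2}
\;=\; DG\sqrt{T}.
\]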

  16. Expected gradient ascent analysis • Reduces to regular deterministic gradient ascent on g_t, where E[estimate | x_t] = ∇g_t(x_t) (concave, bounded gradient)
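Assembling the ingredients (my summary; constants omitted): the regret against f_t splits into the smoothing error plus the expected-gradient-ascent regret on the smoothed functions,

\[
\mathbb{E}\Big[\sum_{t} f_t(x^*) - f_t(x_t)\Big]
\;\le\; \underbrace{O(\delta T)}_{\text{smoothing}}
\;+\; \underbrace{\mathbb{E}\Big[\sum_{t} \hat f_t(x^*) - \hat f_t(x_t)\Big]}_{\text{gradient-ascent regret}},
\]

and since the one-point estimates have norm O(d/δ), balancing δ ≈ T^{−1/4} yields the O(T^{3/4}) rate.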

  17. Hidden complication… [Figure: feasible set S]

  18. Hidden complication… [Figure: feasible set S]

  19. Hidden complication… [Figure: shrunken feasible set S′]

  20. Hidden complication… Thin sets are bad [Figure: a thin, elongated feasible set S]

  21. Hidden complication… Round sets are good: reshape S into “isotropic position” [LV03]

  22. Outline • Problem definition • Simple algorithm • Analysis sketch • Variations • Related work & applications

  23. Variations • Works against an adaptive adversary, which chooses f_t knowing x1, x2, …, x_{t−1} • Also works if we only get a noisy estimate of f_t(x_t), i.e. E[h_t(x_t) | x_t] = f_t(x_t) • The bounds depend on the diameter of S and on the gradient bound
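The noisy-feedback variation goes through because only the conditional mean of the feedback enters the unbiasedness argument: if E[h_t(y) | y] = f_t(y), then by the tower rule

\[
\mathbb{E}\!\left[\frac{d}{\delta}\, h_t(y_t)\, u_t \,\Big|\, x_t\right]
\;=\; \mathbb{E}\!\left[\frac{d}{\delta}\, f_t(y_t)\, u_t \,\Big|\, x_t\right]
\;=\; \nabla \hat f_t(x_t).
\]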

  24. Related convex optimization • Offline, with gradients: gradient descent, …, ellipsoid • Offline, evaluations only: random walk [BV02], sim. annealing [KV05], finite differences • Stochastic: gradient descent (stoch.) vs. 1-pt. gradient approx. [G89,S97] and finite differences • Online: gradient descent [Z03] vs. 1-pt. gradient approx. [BKM04] and finite differences [Kleinberg04]

  25. Multi-armed bandit (experts) [R52,ACFS95,…] [Figure: slot machines with per-round payoff tables]

  26. Driving to work (online routing) [TW02,KV02,AK04,BM04] • Exponentially many paths… exponentially many slot machines? • Finite dimensions • Exploration/exploitation tradeoff

  27. Online product design

  28. Conclusions and future work • Can “learn” to optimize a sequence of unrelated functions from evaluations alone • An answer to: “What is the sound of one hand clapping?” • Applications: cholesterol, paper airplanes, advertising • Future work: many players using the same algorithm (game theory)
