
Introduction to introduction to introduction to … Optimization



Presentation Transcript


  1. Introduction to introduction to introduction to … Optimization
  Leonhard Euler: "… since the fabric of the universe is most perfect and the work of a most wise Creator, nothing at all takes place in the universe in which some rule of maximum or minimum does not appear."

  2. Boredom
  [Plot: "Boredom" and "Understanding" curves against lecture time, from 1 min through 10 mins and 30 mins to 1 hour.]
  Optimal listening time for a talk: 8 minutes 25 seconds*

  3. [Plot: height of a thrown object against time.]
  Action at a point := Kinetic Energy − Potential Energy.
  Action for a path := integrate the action at points over the path.
  Nature chooses the path of "least" action! Pierre Louis Moreau de Maupertuis
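  In symbols (a standard formulation, not from the slide itself; here $T$ is kinetic energy, $V$ is potential energy, and $q(t)$ is the path):

```latex
% Action at a point in time (the Lagrangian): kinetic minus potential energy
L(q, \dot{q}) = T - V
% Action for a path q(t): integrate the pointwise action over the path
S[q] = \int_{t_0}^{t_1} L\bigl(q(t), \dot{q}(t)\bigr)\, dt
% Nature chooses the path of "least" (stationary) action
\delta S = 0
```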

  4. A + BC → AB + C. Reference: 'Holy grails of Chemistry', Pushpendu Kumar Das. Acknowledgement: Deepika Viswanathan, PhD student, IPC, IISc

  5. Fermat: The path taken between two points by a ray of light is the path that can be traversed in the least time.
  Gibbs: For all thermodynamic processes between the same initial and final state, the delivery of work is a maximum for a reversible process.
  William of Ockham: Among competing hypotheses, the hypothesis with the fewest assumptions should be selected.

  6. Travelling Salesman Problem (TSP) Courtesy: xkcd

  7. A hungry cow is at position (2,2) in an open field.
  • It is tied to a rope that is 1 unit long.
  • Grass is at position (4,3).
  • An electric fence, perpendicular to the x-axis, passes through the point (2.5,2): the line x = 2.5.
  • How close can the cow get to the fodder?
  • What do we want to find? The position of the cow: let (x,y) be the solution.
  • What do we want the solution to satisfy? Grass: min (x-4)^2 + (y-3)^2.
  • What restrictions does the cow have? Rope: (x-2)^2 + (y-2)^2 <= 1. Fence: x <= 2.5.
  [Figure: the field on a grid, x from 0 to 4, y from 0 to 5, showing the cow, the rope circle, the fence, and the grass.]

  8. Framework
  • Variables: (x,y) (position of the cow)
  • Objective: (x-4)^2 + (y-3)^2 (squared distance from the grass)
  • Constraints: (x-2)^2 + (y-2)^2 <= 1 (rope); x <= 2.5 (fence)
  The general template: minimize/maximize the Objective (a function of the Variables) subject to the Constraints (functions of the Variables). A sketch of this problem in code follows below.
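  As an illustration, here is a minimal sketch of this framework applied to the cow problem, using SciPy's general-purpose solver. It is not part of the original slides; the solver choice (SLSQP) and the starting point are assumptions for illustration:

```python
import numpy as np
from scipy.optimize import minimize

# Objective: squared distance from the cow at (x, y) to the grass at (4, 3)
def objective(p):
    x, y = p
    return (x - 4) ** 2 + (y - 3) ** 2

# SLSQP expects inequality constraints in the form fun(p) >= 0
constraints = [
    # Rope: (x-2)^2 + (y-2)^2 <= 1  =>  1 - (x-2)^2 - (y-2)^2 >= 0
    {"type": "ineq", "fun": lambda p: 1 - (p[0] - 2) ** 2 - (p[1] - 2) ** 2},
    # Fence: x <= 2.5  =>  2.5 - x >= 0
    {"type": "ineq", "fun": lambda p: 2.5 - p[0]},
]

# Start where the cow stands, at (2, 2)
result = minimize(objective, x0=np.array([2.0, 2.0]),
                  method="SLSQP", constraints=constraints)
print(result.x)  # roughly (2.5, 2.87): on the fence and on the rope circle
```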

  9. How?
  • Unconstrained case: min (x-4)^2 + (y-3)^2.
  • The cow starts at (2,2).
  • It does not know where the grass is; it knows only its distance from the grass.
  • It needs a 'good' direction to move in from the current point.
  [Figure: contours of the objective over the field, x from 0 to 4, y from 0 to 5.]
  • The key question in optimization is: 'What is a good direction?'

  10. Y = X^2 + 2
  [Figure: the parabola, with example points Cur_X = 0.8 and Cur_X = -0.5 marked.]
  Update rule: New_X = Cur_X + d.
  In general: if Cur_X > 0 we want d < 0; if Cur_X < 0 we want d > 0.
  How can you choose d? The derivative!

  11. Y = X^2 + 2
  Derivative at Cur_X: 2(Cur_X).
  New_X = Cur_X + d; if Cur_X > 0 we want d < 0, and if Cur_X < 0 we want d > 0.
  The negative of the derivative does the trick!

  12. Y = X^2 + 2
  Example: Cur_X = 0.5.
  d = negative derivative at Cur_X = -2(0.5) = -1. New_X = Cur_X + d = 0.5 - 1 = -0.5.
  Update: Cur_X = New_X = -0.5.
  d = negative derivative at Cur_X = -2(-0.5) = 1. New_X = Cur_X + d = -0.5 + 1 = 0.5.
  The iterates bounce between 0.5 and -0.5 forever. What was the problem? The "step size".
  Think: How should you modify the step size at every step to avoid this problem? (A sketch in code follows below.)
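  A minimal sketch (mine, not the deck's) that reproduces the oscillation on Y = X^2 + 2 and shows that an assumed smaller step size of 0.1 avoids it:

```python
def gradient_descent_1d(x, step, iters):
    """Minimize y = x^2 + 2 by repeatedly stepping along the negative derivative."""
    for _ in range(iters):
        d = -2 * x          # negative derivative of x^2 + 2 at x
        x = x + step * d    # New_X = Cur_X + step * d
    return x

print(gradient_descent_1d(0.5, step=1.0, iters=7))   # oscillates between 0.5 and -0.5
print(gradient_descent_1d(0.5, step=0.1, iters=50))  # converges towards the minimum at 0
```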

  13. Objective: (x-4)^2 + (y-3)^2
  Algorithm: gradient descent.
  Start at any position Cur. Find the gradient at Cur. Cur = Cur - (stepSize)*gradient. Repeat.
  The gradient is the generalization of the derivative to higher dimensions.
  Negative gradient at (2,2) = (4,2): points towards the grass!
  Negative gradient at (1,5) = (6,-4): points towards the grass!
  [Figure: the field with the two negative-gradient arrows, x from 0 to 4, y from 0 to 5.]
  A sketch in code follows below.
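  A minimal sketch of this gradient descent loop on the cow's objective; the step size of 0.1 and the iteration count are assumptions for illustration:

```python
import numpy as np

def gradient(p):
    """Gradient of (x-4)^2 + (y-3)^2 at the point p = (x, y)."""
    x, y = p
    return np.array([2 * (x - 4), 2 * (y - 3)])

cur = np.array([2.0, 2.0])   # start where the cow stands
step_size = 0.1              # assumed; small enough to converge here
for _ in range(100):
    cur = cur - step_size * gradient(cur)

print(cur)  # approaches the unconstrained minimum (4, 3), i.e. the grass
```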

  14. Gradient descent - summary
  • Gradient descent is the simplest unconstrained optimization procedure. It is easy to implement.
  • If the stepSize is chosen properly, it provably converges to a local minimum.
  • Think: Why doesn't the gradient descent algorithm always converge to a global minimum?
  • Think: How would you modify the algorithm to find a local maximum?
  • There is a host of other methods that pick the 'direction' differently.
  • Think: Can you come up with a method that picks a different direction than just the negative gradient?

  15. Constrained Optimization
  Real-world problems are rarely unconstrained! We need to understand gradients better to see how to solve them.
  [Figure: the cow's field again, x from 0 to 4, y from 0 to 5.]

  16. Functions of one variable
  Let us begin with the Taylor series expansion of a function. For small enough $d$, we have $f(x + d) \approx f(x) + f'(x)\,d$.
  What should the value of $d$ be such that $f(x + d)$ is as small as possible? Taking $d = -f'(x)$ makes the first-order change $f'(x)\,d = -(f'(x))^2$ as negative as possible: the negative derivative is the direction of maximum descent.
  Important: Any direction $d$ such that $f'(x)\,d < 0$ is a descent direction!

  17. Functions of many variables
  For small enough $d$, the Taylor expansion gives $f(x + d) \approx f(x) + \nabla f(x)^{\top} d$.
  Any direction $d$ such that $\nabla f(x)^{\top} d < 0$ is a descent direction.

  18. Constrained Optimization
  Minimize f(x) such that g(x) = 0.
  Given a point x:
  Descent direction: any direction which will lead to a point x' such that f(x') < f(x).
  Feasible direction: any direction which will lead to a point x' such that g(x') = 0.
  Say somebody gives you a point x* and claims it is the solution to this problem. What properties should this point satisfy?
  - It must be feasible: g(x*) = 0.
  - There must NOT be a descent direction that is also feasible!

  19. Minimize f(x) such that g(x) = 0.
  There must NOT be a descent direction that is also feasible!
  Geometrically, this forces the two gradients to line up at the solution: $\nabla f(x^*) + \lambda\, \nabla g(x^*) = 0$ for some number $\lambda$; otherwise the component of $-\nabla f$ along the constraint surface would be a feasible descent direction.

  20. Constrained Optimization Problem: minimize f(x) such that g(x) = 0.
  Unconstrained Optimization Problem: minimize $f(x) + \lambda\, g(x)$ over $(x, \lambda)$.
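  Spelling out why this works (standard Lagrange-multiplier reasoning, not spelled out in the deck): setting the partial derivatives of the new objective to zero recovers exactly the two conditions from slides 18 and 19:

```latex
% The unconstrained objective (the Lagrangian)
\mathcal{L}(x, \lambda) = f(x) + \lambda\, g(x)
% Stationarity in x: the gradients of f and g must be parallel (slide 19)
\nabla_x \mathcal{L} = \nabla f(x) + \lambda\, \nabla g(x) = 0
% Stationarity in lambda: the point must be feasible (slide 18)
\frac{\partial \mathcal{L}}{\partial \lambda} = g(x) = 0
```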

  21. What we did not cover
  • Constrained optimization with multiple constraints
  • Constrained optimization with inequality constraints
  • Karush-Kuhn-Tucker (KKT) conditions
  • Linear Programs
  • Convex optimization
  • Duality theory
  • etc., etc.

  22. Summary
  • Optimization is a very useful branch of applied mathematics.
  • It is very well studied, yet there are numerous problems to work on.
  • If you are interested, we can talk more  Thank you!
