Introduction to introduction to introduction to … Optimization
Leonhard Euler: … since the fabric of the universe is most perfect and the work of a most wise Creator, nothing at all takes place in the universe in which some rule of maximum or minimum does not appear.
[Figure: Boredom and Understanding versus lecture time (1 min, 10 mins, 30 mins, 1 hour)]
Optimal listening time for a talk: 8 minutes 25 seconds *
[Figure: height versus time]
Action at a point := Kinetic Energy - Potential Energy.
Action for a path := Integrate action at points over the path.
Nature chooses the path of “least” action!
Pierre Louis Moreau de Maupertuis
A + BC → AB + C
Reference: ‘Holy grails of Chemistry’, Pushpendu Kumar Das.
Acknowledgement: Deepika Viswanathan, PhD student, IPC, IISc
Fermat: The path taken between two points by a ray of light is the path that can be traversed in the least time.
Gibbs: For all thermodynamic processes between the same initial and final state, the delivery of work is a maximum for a reversible process.
William of Ockham: Among competing hypotheses, the hypothesis with the fewest assumptions should be selected.
Travelling Salesman Problem (TSP) Courtesy: xkcd
• A hungry cow is at position (2,2) in an open field.
• It is tied to a rope that is 1 unit long.
• Grass is at position (4,3).
• A vertical electric fence passes through the point (2.5,2), i.e. along the line x = 2.5.
• How close can the cow get to the fodder?
• What do we want to find?
  • Position of cow: Let (x,y) be the solution.
• What do we want the solution to satisfy?
  • Grass: min (x-4)^2 + (y-3)^2
• What restrictions does the cow have?
  • Rope: (x-2)^2 + (y-2)^2 <= 1
  • Fence: x <= 2.5
Framework
• Variables:
  • (x,y) (position of cow)
• Objective:
  • (x-4)^2 + (y-3)^2 (squared distance from grass)
• Constraints:
  • (x-2)^2 + (y-2)^2 <= 1 (rope)
  • x <= 2.5 (fence)
minimize/maximize Objective (a function of Variables)
subject to Constraints (functions of Variables)
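To make the variables/objective/constraints split concrete, here is a minimal brute-force sketch of the cow problem. It is not one of the methods from the talk: it simply scans a grid of candidate positions and keeps the best feasible one. The function names, the grid spacing `step`, and the small tolerance `TOL` are illustrative assumptions.

```python
# Brute-force sketch of the cow problem (illustrative only, not an efficient method).
TOL = 1e-12  # assumed tolerance so grid points on the boundary count as feasible

def objective(x, y):
    # Squared distance from the grass at (4, 3)
    return (x - 4) ** 2 + (y - 3) ** 2

def feasible(x, y):
    rope_ok = (x - 2) ** 2 + (y - 2) ** 2 <= 1 + TOL   # rope of length 1 tied at (2, 2)
    fence_ok = x <= 2.5 + TOL                           # fence along x = 2.5
    return rope_ok and fence_ok

def grid_search(step=0.01):
    # Scan the square [1, 3] x [1, 3], which contains the whole rope circle.
    n = int(round(2 / step))
    best = None
    for i in range(n + 1):
        x = 1 + i * step
        for j in range(n + 1):
            y = 1 + j * step
            if feasible(x, y):
                value = objective(x, y)
                if best is None or value < best[0]:
                    best = (value, x, y)
    return best

best_value, best_x, best_y = grid_search()
print(best_x, best_y, best_value)
```

On this grid the best point sits on the fence with the rope taut, which suggests that both constraints matter at the solution.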
How?
• Unconstrained case:
  • min (x-4)^2 + (y-3)^2
  • Cow starts at (2,2)
  • Does not know where grass is. Knows only distance from grass.
  • Needs ‘good’ direction to move from current point.
• The key question in optimization is ‘What is a good direction?’
Y = X^2 + 2
New_X = Cur_X + d
Examples: Cur_X = 0.8, Cur_X = -0.5
In general: if Cur_X > 0, want d < 0; if Cur_X < 0, want d > 0
How can you choose d? Derivative!
Y = X^2 + 2
New_X = Cur_X + d
If Cur_X > 0, want d < 0; if Cur_X < 0, want d > 0
Derivative at Cur_X: 2(Cur_X)
Negative of the derivative does the trick!
Y = X^2 + 2
Example: Cur_X = 0.5
d = Negative derivative(Cur_X) = -2(0.5) = -1
New_X = Cur_X + d = 0.5 - 1 = -0.5
Update: Cur_X = New_X = -0.5
d = Negative derivative(Cur_X) = -2(-0.5) = 1
New_X = Cur_X + d = -0.5 + 1 = 0.5
What was the problem? “Step Size”
Think: How should you modify step size at every step to avoid this problem?
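The oscillation in this example is easy to reproduce. The sketch below runs the same update on Y = X^2 + 2 with two fixed step sizes. The names `descend` and `step_size`, and the value 0.25, are assumptions chosen for illustration. A fixed smaller step is only one possible answer to the question above.

```python
def derivative(x):
    # Derivative of Y = X^2 + 2
    return 2 * x

def descend(cur_x, step_size, n_steps):
    # Repeatedly apply New_X = Cur_X + d with d = -step_size * derivative(Cur_X)
    path = [cur_x]
    for _ in range(n_steps):
        cur_x = cur_x - step_size * derivative(cur_x)
        path.append(cur_x)
    return path

oscillating = descend(0.5, 1.0, 4)    # step size 1: the slide's example
shrinking = descend(0.5, 0.25, 4)     # assumed smaller step size

print(oscillating)  # [0.5, -0.5, 0.5, -0.5, 0.5]
print(shrinking)    # [0.5, 0.25, 0.125, 0.0625, 0.03125]
```

With step size 1 the iterate jumps over the minimum at 0 every time. With step size 0.25 each update halves Cur_X, so it approaches 0.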
Objective: (x-4)^2 + (y-3)^2
Algorithm: Gradient descent
• Start at any position Cur.
• Find gradient at Cur.
• Cur = Cur - (stepSize) * gradient
• Repeat
Gradient is the generalization of derivative to higher dimensions.
Negative gradient at (2,2) = (4,2). Points towards grass!
Negative gradient at (1,5) = (6,-4). Points towards grass!
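A minimal sketch of the gradient descent loop for this objective is given below. The fixed `step_size` of 0.1 and the iteration count are assumptions picked so the example converges. They are not values from the talk.

```python
def gradient(x, y):
    # Gradient of (x-4)^2 + (y-3)^2
    return (2 * (x - 4), 2 * (y - 3))

def gradient_descent(start, step_size=0.1, n_steps=100):
    x, y = start
    for _ in range(n_steps):
        gx, gy = gradient(x, y)
        # Cur = Cur - stepSize * gradient
        x, y = x - step_size * gx, y - step_size * gy
    return x, y

# Negative gradients quoted on the slide
neg_at_2_2 = tuple(-g for g in gradient(2, 2))   # (4, 2)
neg_at_1_5 = tuple(-g for g in gradient(1, 5))   # (6, -4)

final_x, final_y = gradient_descent((2.0, 2.0))
print(neg_at_2_2, neg_at_1_5)
print(final_x, final_y)  # close to the grass at (4, 3)
```

With this step size each iteration shrinks the remaining distance to (4,3) by a factor of 0.8, so the cow ends up very close to the grass.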
Gradient descent - summary
• Gradient descent is the simplest unconstrained optimization procedure. Easy to implement.
• If stepSize is chosen properly, it will provably converge to a local minimum.
• Think: Why doesn’t the gradient descent algorithm always converge to a global minimum?
• Think: How to modify the algorithm to find a local maximum?
• Host of other methods which pick the ‘direction’ differently.
• Think: Can you come up with a method that picks a different direction than just the negative gradient?
Constrained Optimization
Real world problems are rarely unconstrained!
Need to understand gradients better to understand how to solve them.
Functions of one variable
Let us begin with the Taylor series expansion of a function. For small enough $d$, we have $f(x + d) \approx f(x) + d\, f'(x)$.
What should the value of $d$ be such that $f(x + d)$ is as small as possible?
The negative derivative is the direction of maximum descent.
Important: Any direction $d$ such that $d\, f'(x) < 0$ is a descent direction!
Functions of many variables
Any direction $d$ such that $\nabla f(x)^T d < 0$ is a descent direction.
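The descent-direction test can be tried numerically on the cow objective (x-4)^2 + (y-3)^2. The sketch below checks the sign of the inner product between the gradient and a candidate direction, then confirms it by taking a small step. The helper names and the step length `t` are illustrative assumptions.

```python
def f(p):
    x, y = p
    return (x - 4) ** 2 + (y - 3) ** 2

def grad_f(p):
    x, y = p
    return (2 * (x - 4), 2 * (y - 3))

def is_descent_direction(p, d):
    # d is a descent direction if the inner product of the gradient with d is negative
    g = grad_f(p)
    return g[0] * d[0] + g[1] * d[1] < 0

def step(p, d, t=0.01):
    # Move a small (assumed) distance t along d
    return (p[0] + t * d[0], p[1] + t * d[1])

cur = (2.0, 2.0)
for d in [(4, 2), (1, -1), (-1, 0)]:
    moved = step(cur, d)
    print(d, is_descent_direction(cur, d), f(moved) < f(cur))
```

The direction (1,-1) is not the negative gradient, yet it passes the test and a small step along it still reduces the objective. The direction (-1,0) fails the test and increases it.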
Constrained Optimization
Minimize f(x)
Such that g(x) = 0
Given a point x,
• Descent direction: Any direction which will lead to a point x’ such that f(x’) < f(x)
• Feasible direction: Any direction which will lead to a point x’ such that g(x’) = 0
Say somebody gives you a point x* and claims it is the solution to this problem. What properties should this point satisfy?
• Must be feasible: g(x*) = 0
• There must NOT be a descent direction that is also feasible!
Minimize f(x)
Such that g(x) = 0
There must NOT be a descent direction that is also feasible!
Constrained Optimization Problem
Minimize f(x)
Such that g(x) = 0
Unconstrained Optimization Problem
Minimize $f(x) + \lambda g(x)$ over $(x, \lambda)$
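As a rough illustration of how the constraint gets folded into the objective, the sketch below revisits the cow problem under a simplifying assumption that is not in the talk: the fence is ignored and the rope is kept taut, so the constraint is the equality g(x,y) = (x-2)^2 + (y-2)^2 - 1 = 0. At the candidate point x*, it checks that x* is feasible and that the gradients of f and g are parallel, which is what "no feasible descent direction" amounts to here. The multiplier name `lam` is an assumed label.

```python
import math

def grad_f(x, y):
    # Gradient of f(x, y) = (x-4)^2 + (y-3)^2
    return (2 * (x - 4), 2 * (y - 3))

def g(x, y):
    # Assumed equality constraint: rope taut, fence ignored
    return (x - 2) ** 2 + (y - 2) ** 2 - 1

def grad_g(x, y):
    return (2 * (x - 2), 2 * (y - 2))

# Candidate x*: one rope length from (2, 2) in the direction of the grass at (4, 3)
norm = math.hypot(2, 1)
x_star, y_star = 2 + 2 / norm, 2 + 1 / norm

gf = grad_f(x_star, y_star)
gg = grad_g(x_star, y_star)

# Parallel gradients: the 2D cross product vanishes
cross = gf[0] * gg[1] - gf[1] * gg[0]

# Multiplier for which grad f + lam * grad g = 0
lam = -gf[0] / gg[0]
residual = (gf[0] + lam * gg[0], gf[1] + lam * gg[1])

print(g(x_star, y_star), cross, lam, residual)
```

Every feasible direction at x* is tangent to the rope circle, hence perpendicular to the gradient of g. Because the gradient of f is parallel to it, no tangent direction can be a descent direction.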
What we did not cover?
• Constrained optimization with multiple constraints
• Constrained optimization with inequality constraints
• Karush-Kuhn-Tucker (KKT) conditions
• Linear Programs
• Convex optimization
• Duality theory
• etc.
Summary
• Optimization is a very useful branch of applied mathematics
• Very well studied, yet there are numerous problems to work on
• If interested, we can talk more
Thank you!