Optimization: Multi-Dimensional Unconstrained Optimization, Part I: Non-gradient Methods
Optimization Methods • One-Dimensional Unconstrained Optimization: Golden-Section Search, Quadratic Interpolation, Newton's Method • Multi-Dimensional Unconstrained Optimization: Non-gradient or direct methods, Gradient methods • Linear Programming (Constrained): Graphical Solution, Simplex Method
Multidimensional Unconstrained Optimization Techniques to find the minimum and maximum of f(x1, x2, x3, …, xn) Two classes of techniques: • Non-gradient or direct methods: do not require derivative evaluation • Gradient or descent (or ascent) methods: require derivative evaluation
DIRECT METHODS: Random Search
max = -∞
for i = 1 to N
    for each variable xj
        xj = a value randomly selected from a given interval
    if max < f(x1, x2, x3, …, xn)
        max = f(x1, x2, x3, …, xn)
• N has to be sufficiently large.
• Random numbers have to be evenly distributed.
• Equivalent to selecting evenly distributed points systematically.
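The random search pseudocode can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function name `random_search`, the sampling box [-2, 2] x [-2, 2], the sample count, and the fixed seed are all assumptions chosen for the demo. The test function is the f(x, y) used in the univariate search example of this deck.

```python
import random

def random_search(f, bounds, n_samples, seed=0):
    """Return (best_point, best_value) over n_samples uniformly random points."""
    rng = random.Random(seed)              # fixed seed (assumption) for repeatable runs
    best_point, best_value = None, float("-inf")
    for _ in range(n_samples):
        # Draw every coordinate uniformly from its own interval.
        point = [rng.uniform(lo, hi) for lo, hi in bounds]
        value = f(*point)
        if value > best_value:             # keep the largest value seen so far
            best_point, best_value = point, value
    return best_point, best_value

def f(x, y):
    return y - x - 2 * x**2 - 2 * x * y - y**2

# Assumed search box: both variables in [-2, 2].
best_x, best_f = random_search(f, [(-2.0, 2.0), (-2.0, 2.0)], n_samples=20000)
print(best_x, best_f)
```

Because the samples ignore the shape of f, accuracy improves only slowly as the number of samples grows, which is the inefficiency the deck attributes to random search.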
Random Search Advantages • Works even for discontinuous and nondifferentiable functions. • More likely to find the global optimum rather than a local optimum. Disadvantages • As the number of independent variables grows, the task can become onerous. • Not efficient: it does not account for the behavior of the underlying function.
Finding the Optima Systematically Basic Idea (Like climbing a mountain) • If we keep moving upward, we will eventually reach the peak. Question • If we start from an arbitrary point, how should we "move" so that we can locate the peak in the "shortest amount of time"? • Good guess of direction toward the peak • Minimize computation Which path should you take? [Figure: starting point labeled "You are here"; the peak is covered by the cloud.]
General Optimization Algorithm
All the methods discussed subsequently are iterative methods that can be generalized as:
Select an initial point, x_0 = (x1, x2, …, xn)
for i = 0 to Max_Iteration
    Select a direction S_i
    x_{i+1} = optimal point reached by traveling from x_i in the direction of S_i
    Stop loop if x converges or if the error is small enough
Univariate Search Idea: Travel in alternating directions that are parallel to the coordinate axes. In each direction, we travel until we reach the peak along that direction and then select a new direction.
Univariate Search • More efficient than random search and still doesn't require derivative evaluation • The basic strategy is: • Change one variable at a time while the other variables are held constant. • Thus the problem is reduced to a sequence of one-dimensional searches. • The search becomes less efficient as you approach the maximum. (Why?)
Univariate Search – Example • f(x, y) = y - x - 2x^2 - 2xy - y^2 • Start from (0, 0) Iteration #1 • Current point: (0, 0) • Direction: along the y-axis (i.e., x stays unchanged) • Objective: find y that maximizes f(0, y) = y - y^2 • Let g(y) = y - y^2. Solving g'(y) = 0 => 1 - 2y = 0 => y_max = 0.5 • Next point: (0, 0.5)
Univariate Search – Example • f(x, y) = y - x - 2x^2 - 2xy - y^2 Iteration #2 • Current point: (0, 0.5) • Direction: along the x-axis (i.e., y stays unchanged) • Objective: find x that maximizes f(x, 0.5) = 0.5 - x - 2x^2 - x - 0.25 • Let g(x) = 0.5 - x - 2x^2 - x - 0.25. Solving g'(x) = 0 => -1 - 4x - 1 = 0 => x_max = -0.5 • Next point: (-0.5, 0.5)
Univariate Search – Example • f(x, y) = y - x - 2x^2 - 2xy - y^2 Iteration #3 • Current point: (-0.5, 0.5) • Direction: along the y-axis (i.e., x stays unchanged) • Objective: find y that maximizes f(-0.5, y) = y - (-0.5) - 2(0.25) - 2(-0.5)y - y^2 = 2y - y^2 • Let g(y) = 2y - y^2. Solving g'(y) = 0 => 2 - 2y = 0 => y_max = 1 • Next point: (-0.5, 1) … Repeat until x_{i+1} = x_i or y_{i+1} = y_i or e_a < e_s (the approximate error falls below the stopping tolerance).
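The three iterations above can be continued mechanically. The sketch below is an illustration under stated assumptions: it uses the closed-form maximizers obtained by setting the one-variable derivative to zero, exactly as the slides do, and the helper names `argmax_y` and `argmax_x` and the count of 40 steps are hypothetical choices for the demo.

```python
def f(x, y):
    return y - x - 2 * x**2 - 2 * x * y - y**2

def argmax_y(x):
    # With x fixed: d/dy f = 1 - 2x - 2y = 0  =>  y = (1 - 2x) / 2
    return (1 - 2 * x) / 2

def argmax_x(y):
    # With y fixed: d/dx f = -1 - 4x - 2y = 0  =>  x = -(1 + 2y) / 4
    return -(1 + 2 * y) / 4

x, y = 0.0, 0.0
points = [(x, y)]
for i in range(40):                 # 40 steps is an arbitrary demo length
    if i % 2 == 0:
        y = argmax_y(x)             # travel along the y-axis, x held fixed
    else:
        x = argmax_x(y)             # travel along the x-axis, y held fixed
    points.append((x, y))

print(points[:4])                   # starting point plus the first three iterations
print(points[-1], f(x, y))
```

The first entries reproduce the points (0, 0.5), (-0.5, 0.5), and (-0.5, 1) from the slides. Later iterates creep toward (-1, 1.5), where both partial derivatives vanish, with the remaining error only halving every two steps; this shows the slowdown near the maximum.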
General Optimization Algorithm (Revised)
Select an initial point, x_0 = (x1, x2, …, xn)
for i = 0 to Max_Iteration
    Select a direction S_i
    Find h such that f(x_i + h S_i) is maximized
    x_{i+1} = x_i + h S_i
    Stop loop if x converges or if the error is small enough
Direction represented as a vector (Review) [Figure: points (x1, y1), (x2, y2), and (x3, y3)]
Finding optimal point in direction S • Current point: x = (x1, x2, …, xn) • Direction: S = [s1 s2 … sn]^T • Objective: find h that optimizes f(x + hS) = f(x1 + h s1, x2 + h s2, …, xn + h sn) Note: with x and S fixed, f(x + hS) is a function of the single variable h.
Finding optimal point in direction S (Example)
• f(x, y) = y - x - 2x^2 - 2xy - y^2
• Current point: (x, y) = (-0.5, 0.5)
• Direction: S = [0 1]^T
• Objective: find h that optimizes
f(-0.5, 0.5 + h) = (0.5 + h) - (-0.5) - 2(-0.5)^2 - 2(-0.5)(0.5 + h) - (0.5 + h)^2
= 0.5 + h + 0.5 - 0.5 + 0.5 + h - 0.25 - h - h^2
= 0.75 + h - h^2
• Let g(h) = 0.75 + h - h^2
• Solving g'(h) = 0 => 1 - 2h = 0 => h = 0.5
• Thus the optimum in the direction of S from (-0.5, 0.5) is (-0.5, 1)
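The reduction to a one-variable problem can be checked numerically. The following sketch assumes that g(h) is exactly quadratic (true for this f), so three samples at h = -1, 0, 1 determine its coefficients without any derivative of f; the helper name `restrict` is hypothetical.

```python
def f(x, y):
    return y - x - 2 * x**2 - 2 * x * y - y**2

def restrict(f, point, direction):
    """Return g(h) = f(point + h * direction), a function of h alone."""
    return lambda h: f(*(p + h * s for p, s in zip(point, direction)))

g = restrict(f, (-0.5, 0.5), (0.0, 1.0))

# Assumption: g(h) = a*h^2 + b*h + c exactly, so three samples fix a, b, c.
c = g(0.0)
b = (g(1.0) - g(-1.0)) / 2
a = (g(1.0) + g(-1.0)) / 2 - c
h_best = -b / (2 * a)               # vertex of the parabola (a < 0, so a maximum)

print(a, b, c, h_best)
print((-0.5 + h_best * 0.0, 0.5 + h_best * 1.0))
```

The recovered coefficients match g(h) = 0.75 + h - h^2 and give h = 0.5, that is, the point (-0.5, 1). For a non-quadratic f the same `restrict` helper could be handed to any one-dimensional method, such as golden-section search.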
Univariate Search Algorithm
Let D_k = [d_1 d_2 … d_n]^T, where d_k = 1 and d_j = 0 for j ≠ k, with j, k ≤ n.
E.g., for n = 4: D_2 = [0 1 0 0]^T, D_4 = [0 0 0 1]^T
Select an initial point, x_0 = (x1, x2, …, xn)
for i = 0 to Max_Iteration
    S_i = D_j, where j = i mod n + 1
    Find h such that f(x_i + h S_i) is maximized
    x_{i+1} = x_i + h S_i
    Stop loop if x converges or if the error is small enough
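A runnable sketch of this algorithm is given below. Several details are assumptions that the slides leave open: the line search is done with a golden-section search over a fixed bracket h in [-h_max, h_max], the stopping rule is "a full sweep of n directions moved x by less than tol", and the names `golden_max` and `univariate_search` are hypothetical.

```python
import math

def golden_max(g, a, b, tol=1e-9):
    """Golden-section search for the maximizer of a unimodal g on [a, b]."""
    r = (math.sqrt(5) - 1) / 2
    c, d = b - r * (b - a), a + r * (b - a)
    gc, gd = g(c), g(d)
    while b - a > tol:
        if gc > gd:                      # maximizer lies in [a, d]
            b, d, gd = d, c, gc
            c = b - r * (b - a)
            gc = g(c)
        else:                            # maximizer lies in [c, b]
            a, c, gc = c, d, gd
            d = a + r * (b - a)
            gd = g(d)
    return (a + b) / 2

def univariate_search(f, x0, max_iter=500, h_max=2.0, tol=1e-6):
    x = list(x0)
    n = len(x)
    small_steps = 0
    iterations = 0
    for i in range(max_iter):
        j = i % n                        # 0-based index, i.e. direction D_(j+1)
        def g(h):
            trial = list(x)
            trial[j] += h                # move along coordinate j only
            return f(*trial)
        h = golden_max(g, -h_max, h_max)
        x[j] += h
        iterations = i + 1
        small_steps = small_steps + 1 if abs(h) < tol else 0
        if small_steps == n:             # a full sweep barely moved x: stop
            break
    return x, iterations

def f(x, y):
    return y - x - 2 * x**2 - 2 * x * y - y**2

x_best, iterations = univariate_search(f, [0.0, 0.0])
print(x_best, iterations)
```

Note that with j = i mod n + 1 the first search is along x1, whereas the worked example starts along the y-axis; both orderings approach the same maximum near (-1, 1.5).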
Pattern Search Methods Observation: Lines connecting alternating points (1:3, 2:4, 3:5, etc.) give a better indication of where the peak is than lines parallel to the coordinate axes. The general directions that point toward the optimum are known as pattern directions. Optimization methods that use pattern directions to improve the convergence rate are known as pattern search methods.
Powell's Method Powell's method (a well-known pattern search method) is based on the observation that if points 1 and 2 are obtained by one-dimensional searches in the same direction but from different starting points, then the line formed by 1 and 2 will be directed toward the maximum. The directions represented by such lines are called conjugate directions.
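This observation can be tried on the quadratic f(x, y) = y - x - 2x^2 - 2xy - y^2 from the univariate search example. The sketch below is only an illustration of the observation, not an implementation of Powell's method: the two starting points (0, 0) and (0, 1) are arbitrary choices, and the hypothetical helper `line_max_quadratic` assumes that f restricted to a line is a concave parabola, which it fits through three samples.

```python
def f(x, y):
    return y - x - 2 * x**2 - 2 * x * y - y**2

def line_max_quadratic(f, p, s):
    """Maximize f(p + h*s) over h, assuming the 1-D slice is a concave parabola.

    Fits g(h) = a*h^2 + b*h + c through h = -1, 0, 1; no derivatives of f are used.
    """
    g = lambda h: f(p[0] + h * s[0], p[1] + h * s[1])
    g_minus, g_zero, g_plus = g(-1.0), g(0.0), g(1.0)
    a = (g_plus + g_minus) / 2 - g_zero
    b = (g_plus - g_minus) / 2
    h = -b / (2 * a)
    return (p[0] + h * s[0], p[1] + h * s[1])

s = (1.0, 0.0)                                    # same direction for both searches
p1 = line_max_quadratic(f, (0.0, 0.0), s)         # point 1
p2 = line_max_quadratic(f, (0.0, 1.0), s)         # point 2, different starting point
d = (p2[0] - p1[0], p2[1] - p1[1])                # line formed by points 1 and 2
peak = line_max_quadratic(f, p2, d)               # one more search along that line

print(p1, p2, d, peak)
```

For this f, the single search along the line through points 1 and 2 lands on (-1, 1.5), the point where both partial derivatives of f are zero, whereas searches parallel to the axes only approach it gradually.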
How Powell's method selects directions
• Start with an initial set of n distinct directions, S[1], S[2], …, S[n].
• Let counter[k] be the number of times S[k] has been used.
• Initially, counter[k] = 0 for all k = 1, 2, …, n.
At iteration i:
    S_i = S[j], where j = i mod n + 1
    x_{i+1} = optimum point reached by traveling from x_i in the direction S_i
    counter[j] = counter[j] + 1
    if (counter[j] == 2)
        S[j] = direction defined by x_{i+1} and x_{i+1-n}
        counter[j] = 0
That is, each direction in the set, after being used twice, is immediately replaced by a new conjugate direction.
Quadratically Convergent • Definition: If an optimization method, using exact arithmetic, can find the optimum point in n steps while optimizing a quadratic function with n variables, the method is called a quadratically convergent method. • If f(x) is a quadratic function, sequential search along conjugate directions will converge quadratically. That is, it reaches the optimum in a finite number of steps regardless of the starting point.
Conjugate-based Methods • Since general non-linear functions can often be reasonably approximated by a quadratic function, methods based on conjugate directions are usually quite efficient and are in fact quadratically convergent as they approach the optimum.
Summary • Random Search • General algorithm for locating optimum point • Guess direction • Find optimum point in the guessed direction • How to find h such that f (xi + hSi) is maximized • Univariate Search Method • Pattern Search Method