
Computacion Inteligente

Derivative-Based Optimization


Contents

  • Optimization problems

  • Mathematical background

  • Descent Methods

  • The Method of Steepest Descent

  • Conjugate Gradient



OPTIMIZATION PROBLEMS

  • Objective function – mathematical function which is optimized by changing the values of the design variables.

  • Design Variables – Those variables which we, as designers, can change.

  • Constraints – Functions of the design variables which establish limits on individual variables or on combinations of design variables.



3 basic ingredients…

  • an objective function,

  • a set of decision variables,

  • a set of equality/inequality constraints.

The problem is to search for the values of the decision variables that minimize the objective function while satisfying the constraints…


A general optimization problem can be written as

$\min_{x}\; f(x), \qquad x = (x_1, \ldots, x_n)^T \quad \text{(objective; decision vector)}$

$\text{subject to} \quad h_j(x) = 0, \quad g_i(x) \leq 0 \quad \text{(equality / inequality constraints)}$

$\qquad\qquad\;\; x_k^{L} \leq x_k \leq x_k^{U} \quad \text{(bounds)}$

  • Design Variables: decision and objective vector

  • Constraints: equality and inequality

  • Bounds: feasible ranges for variables

  • Objective Function: maximization can be converted to minimization due to the duality principle
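
To make these ingredients concrete, here is a small illustrative sketch (not part of the original slides; the objective, constraint, and bounds are invented for the example) of how such a problem can be posed and solved with SciPy's `minimize`:

```python
import numpy as np
from scipy.optimize import minimize

# Objective function of the decision vector x = (x1, x2)
def f(x):
    return (x[0] - 1.0) ** 2 + (x[1] - 2.5) ** 2

# One inequality constraint, in SciPy's convention g(x) >= 0
constraints = [{"type": "ineq", "fun": lambda x: x[0] - 2.0 * x[1] + 2.0}]

# Bounds: feasible range of each decision variable
bounds = [(0.0, None), (0.0, None)]

x0 = np.array([2.0, 0.0])   # initial guess
result = minimize(f, x0, method="SLSQP", bounds=bounds, constraints=constraints)
print(result.x)             # constrained minimizer, approximately [1.4, 1.7]
```

SLSQP is chosen here only because it accepts bounds and general inequality constraints at the same time.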



  • Identify the quantity or function, f, to be optimized.

  • Identify the design variables: x1, x2, x3, …,xn.

  • Identify the constraints if any exist

    a. Equalities

    b. Inequalities

  • Adjust the design variables (x’s) until f is optimized and all of the constraints are satisfied.



  • Objective functions may be unimodal or multimodal.

    • Unimodal – only one optimum

    • Multimodal – more than one optimum

  • Most search schemes are based on the assumption of a unimodal surface. The optimum determined in such cases is called a local optimum design.

  • The global optimum is the best of all local optimum designs.



  • Existence of global minimum

  • If f(x) is continuous on the feasible set S which is closed and bounded, then f(x) has a global minimum in S

    • A set S is closed if it contains all its boundary points.

    • A set S is bounded if it is contained in the interior of some sphere of finite radius.

compact = closed and bounded




[Figure: example objective surface showing a local maximum and a saddle point]



  • Derivative-based optimization (gradient based)

    • Capable of determining “search directions” according to an objective function’s derivative information

      • steepest descent method;

      • Newton’s method; Newton-Raphson method;

      • Conjugate gradient, etc.

  • Derivative-free optimization

    • random search method;

    • genetic algorithm;

    • simulated annealing; etc.



MATHEMATICAL BACKGROUND

  • The scalar $x^T M x = \sum_{i=1}^{n}\sum_{j=1}^{n} m_{ij}\, x_i x_j$ is called a quadratic form.

  • A square matrix M is positive definite if $x^T M x > 0$ for all $x \neq 0$.

  • It is positive semidefinite if $x^T M x \geq 0$ for all $x$.


  • A symmetric matrix $M = M^T$ is positive definite if and only if its eigenvalues $\lambda_i > 0$ (semidefinite ↔ $\lambda_i \geq 0$).

    • Proof (→): Let $v_i$ be the eigenvector for the i-th eigenvalue $\lambda_i$.

    • Then $M v_i = \lambda_i v_i$, so

      $0 < v_i^T M v_i = \lambda_i\, v_i^T v_i = \lambda_i \|v_i\|^2$,

    • which implies $\lambda_i > 0$.

    • (←) Exercise: prove that positive eigenvalues imply positive definiteness.
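
As a quick numerical companion to this criterion (my own illustration; the matrix M is an arbitrary example), definiteness can be tested by inspecting the eigenvalues of the symmetrized matrix with NumPy:

```python
import numpy as np

def is_positive_definite(M, tol=1e-12):
    """Test definiteness of a symmetric matrix via its eigenvalues."""
    M = np.asarray(M, dtype=float)
    M_sym = 0.5 * (M + M.T)              # symmetrize to guard against round-off
    eigvals = np.linalg.eigvalsh(M_sym)  # eigenvalues of the symmetric part
    return np.all(eigvals > tol)

M = np.array([[2.0, -1.0],
              [-1.0, 2.0]])
print(is_positive_definite(M))           # True: the eigenvalues are 1 and 3
```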


  • Theorem: If a matrix $M = U^T U$ (with $Ux \neq 0$ for every $x \neq 0$), then M is positive definite.

  • Proof: Let f be defined as $f = x^T M x$. If we can show that f is always positive (for $x \neq 0$), then M must be positive definite. We can write this as

    $f = x^T U^T U x = (Ux)^T (Ux)$

  • Provided that Ux gives a non-zero vector for all values of x except x = 0, we can write $b = Ux$, i.e.

    $f = b^T b = \|b\|^2 > 0$,

  • so f must always be positive.


  • f: Rn → R is a quadratic function if

    $f(x) = \tfrac{1}{2}\, x^T Q\, x - b^T x + c$,

    • where Q is symmetric.


  • Since Q is symmetric, the gradient of this quadratic is

    $\nabla f(x) = Q x - b$,

    so the stationary points of f are the solutions of the linear system $Q x = b$.


If Q is positive definite, then f is a parabolic “bowl.”
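
A small sketch of this "bowl" picture (my own example; Q, b, and c are arbitrary values) builds the quadratic $f(x) = \tfrac{1}{2}x^T Q x - b^T x + c$ with a positive definite Q and verifies that the gradient $Qx - b$ vanishes at the minimizer $x^* = Q^{-1} b$:

```python
import numpy as np

Q = np.array([[4.0, 1.0],
              [1.0, 3.0]])          # symmetric positive definite -> a "bowl"
b = np.array([1.0, 2.0])
c = 0.5

def f(x):
    return 0.5 * x @ Q @ x - b @ x + c

def grad_f(x):
    return Q @ x - b                 # gradient of the quadratic (Q symmetric)

x_star = np.linalg.solve(Q, b)       # stationary point: Q x* = b
print(x_star, grad_f(x_star))        # the gradient is (numerically) zero at x*
print(f(x_star) <= f(x_star + np.array([0.1, -0.2])))  # True: x* is the minimum
```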



  • Quadratics are useful in the study of optimization.

    • Often, objective functions are “close to” quadratic near the solution.

    • It is easier to analyze the behavior of algorithms when applied to quadratics.

    • Analysis of algorithms for quadratics gives insight into their behavior in general.



  • Definition: A real-valued function f: Rn → R is said to be continuously differentiable if the partial derivatives

    $\dfrac{\partial f}{\partial x_1}(x),\; \dfrac{\partial f}{\partial x_2}(x),\; \ldots,\; \dfrac{\partial f}{\partial x_n}(x)$

  • exist for each x in Rn and are continuous functions of x.

  • In this case, we say $f \in C^1$ (f is a smooth, or $C^1$, function).


  • Definition: The gradient of f: R2 → R (in the plane) is the function ∇f: R2 → R2 given by

    $\nabla f(x) = \begin{bmatrix} \dfrac{\partial f}{\partial x_1}(x) \\[4pt] \dfrac{\partial f}{\partial x_2}(x) \end{bmatrix}$


  • Definition: The gradient of f: Rn → R is a function ∇f: Rn → Rn given by

    $\nabla f(x) = \left[ \dfrac{\partial f}{\partial x_1}(x),\; \ldots,\; \dfrac{\partial f}{\partial x_n}(x) \right]^T$
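
As a sanity check on this definition (an illustrative sketch; the test function is my own choice), the partial derivatives can be approximated by central finite differences and compared with the analytic gradient:

```python
import numpy as np

def numerical_gradient(f, x, h=1e-6):
    """Approximate grad f(x) component-wise with central differences."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (f(x + e) - f(x - e)) / (2.0 * h)
    return g

f      = lambda x: x[0] ** 2 + 3.0 * x[0] * x[1]          # f(x1, x2) = x1^2 + 3 x1 x2
grad_f = lambda x: np.array([2.0 * x[0] + 3.0 * x[1], 3.0 * x[0]])

x = np.array([1.0, 2.0])
print(numerical_gradient(f, x))   # ~ [8, 3]
print(grad_f(x))                  # exact:  [8, 3]
```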




Intuition: the gradient points in the direction of greatest change (increase) of f.

Prove it!


  • Proof:

    • Assign $v = \dfrac{\nabla f(p)}{\|\nabla f(p)\|}$ and move from p along v: $g(t) = p + t\, v$.

    • By the chain rule:

      $\dfrac{d}{dt} f(g(t))\Big|_{t=0} = \nabla f(p)^T v = \|\nabla f(p)\|$

    • On the other hand, for a general unit vector v:

      $\nabla f(p)^T v = \|\nabla f(p)\|\,\|v\|\cos\theta \;\leq\; \|\nabla f(p)\|$,

    • so no direction of motion increases f faster than $\nabla f(p)$.

  • Proposition 2: Let f: Rn → R be a smooth function (C1) around p.

  • If f has a local minimum (maximum) at p, then

    $\nabla f(p) = 0$

Intuition: this is a necessary condition for a local min (max).


  • Proof (intuitive): if $\nabla f(p) \neq 0$, then a small step from p in the direction $-\nabla f(p)$ (respectively $+\nabla f(p)$) would strictly decrease (increase) f, contradicting that p is a local minimum (maximum).



  • The derivative of f: Rn → Rm is a function Df: Rn → Rm×n given by

    $Df(x) = \begin{bmatrix} \dfrac{\partial f_1}{\partial x_1}(x) & \cdots & \dfrac{\partial f_1}{\partial x_n}(x) \\ \vdots & & \vdots \\ \dfrac{\partial f_m}{\partial x_1}(x) & \cdots & \dfrac{\partial f_m}{\partial x_n}(x) \end{bmatrix}$,

    called the Jacobian.

Note that for f: Rn → R, we have ∇f(x) = Df(x)T.


  • If the derivative of ∇f exists, we say that f is twice differentiable.

    • Write the second derivative as D2f (or F), and call it the Hessian of f:

      $D^2 f(x) = \left[ \dfrac{\partial^2 f}{\partial x_i\, \partial x_j}(x) \right]_{i,j = 1,\ldots,n}$



  • Fact: ∇f(x0) is orthogonal to the level set $\{x : f(x) = f(x_0)\}$ at x0.


  • Proof of the fact:

    • Imagine a particle traveling along the level set.

    • Let g(t) be the position of the particle at time t, with g(0) = x0.

    • Note that f(g(t)) = constant for all t.

    • Velocity vector g′(t) is tangent to the level set.

    • Consider F(t) = f(g(t)), which is constant, so F′(0) = 0. By the chain rule, $0 = F'(0) = \nabla f(g(0))^T g'(0) = \nabla f(x_0)^T g'(0)$.

    • Hence, ∇f(x0) and g′(0) are orthogonal.


  • Suppose f: R → R is in C1. Then

    $f(x_0 + h) = f(x_0) + f'(x_0)\, h + o(h)$

  • where o(h) denotes a term such that o(h)/h → 0 as h → 0.

  • At x0, f can be approximated by a linear function, and the approximation gets better the closer we are to x0.


  • Suppose f: R → R is in C2. Then

    $f(x_0 + h) = f(x_0) + f'(x_0)\, h + \tfrac{1}{2} f''(x_0)\, h^2 + o(h^2)$

  • At x0, f can be approximated by a quadratic function.


  • Suppose f: Rn → R.

    • If f is in C1, then

      $f(x) = f(x_0) + \nabla f(x_0)^T (x - x_0) + o(\|x - x_0\|)$

    • If f is in C2, then

      $f(x) = f(x_0) + \nabla f(x_0)^T (x - x_0) + \tfrac{1}{2}\,(x - x_0)^T D^2 f(x_0)\,(x - x_0) + o(\|x - x_0\|^2)$
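
A small numerical check of these expansions in one dimension (my own illustration; the choices $f(x) = e^x$ and $x_0 = 0$ are arbitrary):

```python
import numpy as np

# Compare f(x0 + h) with its first- and second-order Taylor approximations
# for f(x) = exp(x) at x0 = 0 (so f = f' = f'' = exp).
f, df, d2f = np.exp, np.exp, np.exp
x0 = 0.0

for h in [0.5, 0.1, 0.01]:
    exact   = f(x0 + h)
    taylor1 = f(x0) + df(x0) * h                  # linear approximation
    taylor2 = taylor1 + 0.5 * d2f(x0) * h ** 2    # quadratic approximation
    print(f"h={h:5}: 1st-order error={abs(exact - taylor1):.2e}, "
          f"2nd-order error={abs(exact - taylor2):.2e}")
# The first-order error shrinks like h^2, the second-order error like h^3.
```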


  • We already know that ∇f(x0) is orthogonal to the level set at x0.

    • Suppose ∇f(x0) ≠ 0.

  • Fact: ∇f points in the direction of increasing f.


  • Consider $x_\alpha = x_0 + \alpha\, \nabla f(x_0)$, with α > 0.

    • By Taylor's formula,

      $f(x_\alpha) = f(x_0) + \alpha\, \|\nabla f(x_0)\|^2 + o(\alpha)$

  • Therefore, for sufficiently small α,

    f(xα) > f(x0)


DESCENT METHODS

Data: x0 ∈ Rn

Step 0: set i = 0

Step 1: if $\nabla f(x_i) = 0$, stop; else, compute the search direction $h_i = h(x_i)$

Step 2: compute the step-size $\lambda_i = \arg\min_{\lambda \ge 0} f(x_i + \lambda h_i)$

Step 3: set $x_{i+1} = x_i + \lambda_i h_i$ and go to step 1


  • Theorem (convergence of the model algorithm):

    • Suppose f: Rn → R is C1 smooth, and there exists a continuous function k: Rn → [0,1] with k(x) > 0 whenever $\nabla f(x) \neq 0$,

    • and the search vectors $h_i$ constructed by the model algorithm make an angle with $-\nabla f(x_i)$ whose cosine is at least $k(x_i)$ (a sufficient-descent condition).

  • Then

    • if $\{x_i\}$ is the sequence constructed by the algorithm model,

    • any accumulation point y of this sequence satisfies $\nabla f(y) = 0$.


The principal differences between the various descent algorithms lie in the procedure used to determine the successive search directions.


STEEPEST DESCENT



Data: x0 ∈ Rn

Step 0: set i = 0

Step 1: if $\nabla f(x_i) = 0$, stop; else, compute the search direction $h_i = -\nabla f(x_i)$

Step 2: compute the step-size $\lambda_i = \arg\min_{\lambda \ge 0} f(x_i + \lambda h_i)$

Step 3: set $x_{i+1} = x_i + \lambda_i h_i$ and go to step 1


  • Theorem:

    • If $\{x_i\}$ is a sequence constructed by the SD algorithm, then every accumulation point y of the sequence satisfies $\nabla f(y) = 0$.

    • Proof: follows from the Wolfe theorem.

Remark: the Wolfe theorem gives us numerical stability even if the derivatives aren't given exactly (i.e., they are calculated numerically).


Note the search direction is $h_i = -\nabla f(x_i)$.

  • We are limited to a line search along this direction:

  • choose λ to minimize $f(x_i + \lambda h_i)$ …

  • … at the minimum, the directional derivative along $h_i$ is equal to zero.


  • How long a step should we take?

    • From the chain rule, at the minimizing step size λ:

      $\dfrac{d}{d\lambda} f(x_i + \lambda h_i) = \nabla f(x_i + \lambda h_i)^T h_i = 0$

  • Therefore the method of steepest descent zig-zags: the new gradient $\nabla f(x_{i+1})$ and the previous search direction $h_i$ are orthogonal!
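
A minimal sketch of steepest descent for the quadratic case (my own code, not taken from the slides; Q and b are example values): for $f(x) = \tfrac{1}{2}x^T Q x - b^T x$, the exact line search along $-\nabla f$ has the closed form $\lambda = g^T g / (g^T Q g)$ with $g = Qx - b$.

```python
import numpy as np

def steepest_descent(Q, b, x0, tol=1e-8, max_iter=1000):
    """Steepest descent with exact line search for f(x) = 1/2 x^T Q x - b^T x."""
    x = x0.astype(float)
    for i in range(max_iter):
        g = Q @ x - b                   # gradient at the current point
        if np.linalg.norm(g) < tol:     # stop when the gradient (nearly) vanishes
            break
        lam = (g @ g) / (g @ Q @ g)     # exact step size along -g
        x = x - lam * g                 # move to the next iterate
    return x, i

Q = np.array([[3.0, 2.0], [2.0, 6.0]])
b = np.array([2.0, -8.0])
x_star, iters = steepest_descent(Q, b, np.zeros(2))
print(x_star, iters)                    # converges to the solution of Q x = b
```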


Given an objective f(x1, x2), find the minimum when x1 is allowed to vary from 0.5 to 1.5 and x2 is allowed to vary from 0 to 2 (λ arbitrary).



CONJUGATE GRADIENT


If A is symmetric and positive definite, minimizing the quadratic $f(x) = \tfrac{1}{2} x^T A x - b^T x + c$ is equivalent to solving the linear system $A x = b$.




In general, the solution x lies at the intersection point of n hyperplanes, each having dimension n − 1.


  • What is the problem with steepest descent?

    • We can repeat the same directions over and over…

  • Wouldn't it be better if, every time we took a step, we got it right the first time?


  • What is the problem with steepest descent?

    • We can repeat the same directions over and over…

  • Conjugate gradient requires only n gradient evaluations and n line searches.


  • First, let's define the error as $e_i = x_i - x$, where x is the exact solution and $x_i$ the current iterate (the start point of step i).

  • $e_i$ is a vector that indicates how far we are from the solution.


  • Let's pick a set of orthogonal search directions $d_0, d_1, \ldots, d_{n-1}$ (they should span Rn).

  • In each search direction, we'll take exactly one step, and that step will be just the right length to line up evenly with the solution x.



    • Unfortunately, this method only works if you already know the answer.


  • For the step length to be exactly right, $e_{i+1}$ should be orthogonal to $d_i$:

    $d_i^T e_{i+1} = 0$


  • On the other hand, $x_{i+1} = x_i + \alpha_i d_i$ implies

    $e_{i+1} = e_i + \alpha_i d_i$


  • So if $d_i^T e_{i+1} = d_i^T (e_i + \alpha_i d_i) = 0$,

  • the correct choice is

    $\alpha_i = -\,\dfrac{d_i^T e_i}{d_i^T d_i}$,

  • which again requires knowing $e_i$. Conjugate gradient escapes this by choosing A-orthogonal (conjugate) directions, whose step sizes can be computed from the residuals alone, as in the algorithm below.


  • Conjugate gradient algorithm for minimizing f:

    Data: x0 ∈ Rn

    Step 0: set i = 0 and $d_0 = r_0 = b - A x_0$ (the negative gradient at x0)

    Step 1: $\alpha_i = \dfrac{r_i^T r_i}{d_i^T A\, d_i}$

    Step 2: $x_{i+1} = x_i + \alpha_i d_i$, $\quad r_{i+1} = r_i - \alpha_i A\, d_i$

    Step 3: $\beta_{i+1} = \dfrac{r_{i+1}^T r_{i+1}}{r_i^T r_i}$, $\quad d_{i+1} = r_{i+1} + \beta_{i+1} d_i$

    Step 4: set i = i + 1 and repeat n times
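
A compact sketch of this algorithm in code (my own implementation of linear conjugate gradient for the quadratic case, following the updates above; A and b are example values):

```python
import numpy as np

def conjugate_gradient(A, b, x0, tol=1e-10):
    """Linear CG for f(x) = 1/2 x^T A x - b^T x, i.e. for solving A x = b."""
    x = x0.astype(float)
    r = b - A @ x                 # residual r_0 = -grad f(x_0)
    d = r.copy()                  # first search direction d_0 = r_0
    for _ in range(len(b)):       # at most n steps in exact arithmetic
        rr = r @ r
        if np.sqrt(rr) < tol:
            break
        alpha = rr / (d @ A @ d)  # Step 1: exact step along d_i
        x = x + alpha * d         # Step 2: new iterate x_{i+1}
        r = r - alpha * (A @ d)   # Step 2: new residual r_{i+1}
        beta = (r @ r) / rr       # Step 3: conjugation coefficient
        d = r + beta * d          # Step 3: next A-conjugate direction
    return x

A = np.array([[3.0, 2.0], [2.0, 6.0]])
b = np.array([2.0, -8.0])
print(conjugate_gradient(A, b, np.zeros(2)))   # ~ [2, -2], the solution of A x = b
```

On this 2×2 example the method reaches the solution in two steps (in exact arithmetic), matching the claim of n gradient evaluations and n line searches.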


    Sources

    • Jyh-Shing Roger Jang, Chuen-Tsai Sun and Eiji Mizutani, Slides for Ch. 5 of “Neuro-Fuzzy and Soft Computing: A Computational Approach to Learning and Machine Intelligence”, First Edition, Prentice Hall, 1997.

    • Djamel Bouchaffra. Soft Computing. Course materials. Oakland University. Fall 2005

    • Lecture slides (lucidi delle lezioni), Soft Computing. Course materials. Dipartimento di Elettronica e Informazione. Politecnico di Milano. 2004

    • Jeen-Shing Wang, Course: Introduction to Neural Networks. Lecture notes. Department of Electrical Engineering. National Cheng Kung University. Fall, 2005


    Sources

    • Carlo Tomasi, Mathematical Methods for Robotics and Vision. Stanford University. Fall 2000

    • Petros Ioannou, Jing Sun, Robust Adaptive Control. Prentice-Hall, Inc, Upper Saddle River: NJ, 1996

    • Jonathan Richard Shewchuk, An Introduction to the Conjugate Gradient Method Without the Agonizing Pain. Edition 1¼. School of Computer Science. Carnegie Mellon University. Pittsburgh. August 4, 1994

    • Gordon C. Everstine, Selected Topics in Linear Algebra. The George Washington University. 8 June 2004