
Qualifier Exam in HPC



  1. Qualifier Exam in HPC February 10th, 2010

  2. Quasi-Newton methods Alexandru Cioaca

  3. Quasi-Newton methods (nonlinear systems)
• Nonlinear systems: F(x) = 0, F : R^n → R^n, F(x) = [ f_i(x_1, …, x_n) ]^T
• Such systems appear in the simulation of processes (physical, chemical, etc.)
• They are solved by iterative algorithms
• Newton's method for systems ≠ nonlinear least-squares

  4. Quasi-Newton methods (nonlinear systems)
Standard assumptions:
• F continuously differentiable in an open convex set D
• F' Lipschitz continuous on D
• There is x* in D s.t. F(x*) = 0 and F'(x*) is nonsingular
Newton's method: starting from an initial iterate x_0, compute
x_{k+1} = x_k − F'(x_k)^{-1} F(x_k), {x_k} → x*
until a termination criterion is satisfied.

  5. Quasi-Newton methods (nonlinear systems)
• Linear model around x_k: M_k(x) = F(x_k) + F'(x_k)(x − x_k)
M_k(x) = 0 ⇒ x_{k+1} = x_k − F'(x_k)^{-1} F(x_k)
• In practice the iterates are computed via a linear solve:
F'(x_k) s_k = F(x_k), x_{k+1} = x_k − s_k
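The two lines above map directly to code. A minimal NumPy sketch, not part of the original slides; the test system (a circle intersected with a line) is an illustrative choice:

```python
import numpy as np

def newton(F, J, x0, tol=1e-10, max_iter=50):
    """Newton's method for F(x) = 0. F returns the residual vector,
    J returns the Jacobian matrix F'(x)."""
    x = x0.astype(float)
    for _ in range(max_iter):
        s = np.linalg.solve(J(x), F(x))   # solve F'(x_k) s_k = F(x_k)
        x = x - s                         # x_{k+1} = x_k - s_k
        if np.linalg.norm(F(x)) < tol:
            break
    return x

# Example: intersect the circle x^2 + y^2 = 2 with the line x = y.
F = lambda x: np.array([x[0]**2 + x[1]**2 - 2.0, x[0] - x[1]])
J = lambda x: np.array([[2*x[0], 2*x[1]], [1.0, -1.0]])
print(newton(F, J, np.array([2.0, 0.5])))   # -> approximately [1, 1]
```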

  6. Quasi-Newton methods (nonlinear systems)
Evaluate F'(x_k):
• Symbolically
• Numerically, with finite differences
• Automatic differentiation
Solve the linear system F'(x_k) s_k = F(x_k):
• Direct solve: LU, Cholesky
• Iterative methods: GMRES, CG
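A sketch of the finite-difference option above: approximating F'(x) column by column with forward differences. The step size eps is an assumed default that trades truncation error against round-off:

```python
import numpy as np

def fd_jacobian(F, x, eps=1e-7):
    """Forward-difference approximation of F'(x), one column per
    component of x. Costs n extra evaluations of F."""
    n = x.size
    Fx = F(x)
    J = np.empty((n, n))
    for j in range(n):
        xp = x.copy()
        xp[j] += eps
        J[:, j] = (F(xp) - Fx) / eps
    return J
```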

  7. Quasi-Newton methods (nonlinear systems)
Computation per iteration:
• F(x_k): n scalar function evaluations
• F'(x_k): n^2 scalar function evaluations
• LU: O(2n^3/3) flops
• Cholesky: O(n^3/3) flops
• Krylov methods: cost depends on the condition number

  8. Quasi-Newton methods (nonlinear systems)
• LU and Cholesky are useful when we want to reuse the factorization across several solves (quasi-implicit schemes); a sketch follows below
• Factorizations are difficult to parallelize and load-balance
• Cholesky is faster and more stable than LU, but requires an SPD matrix (!)
• For large n (n ~ 10^6), a dense factorization is impractical
• Krylov methods are built from easily parallelizable kernels (vector updates, inner products, matrix-vector products)
• CG is faster and more stable than GMRES, but requires an SPD matrix
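To illustrate the factorization-reuse point from the first bullet: with SciPy one can pay the O(n^3) LU cost once and amortize it over many O(n^2) triangular solves. A hedged sketch; the matrix and right-hand sides are synthetic:

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(0)
n = 500
A = rng.standard_normal((n, n)) + n * np.eye(n)  # well-conditioned test matrix

lu, piv = lu_factor(A)            # O(n^3) factorization, paid once
for _ in range(10):               # many cheap O(n^2) solves reuse it
    b = rng.standard_normal(n)
    x = lu_solve((lu, piv), b)
```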

  9. Quasi-Newton methods (nonlinear systems)
Advantages:
• Under the standard assumptions, Newton's method converges locally and quadratically
• There exists a domain of attraction S containing the solution
• Once the iterates enter S, they stay in S and converge to x*
• The method is memoryless (self-corrective): each step depends only on the current iterate

  10. Quasi-Newton methods (nonlinear systems)
Disadvantages:
• Convergence depends on the choice of x_0
• F'(x_k) has to be evaluated at every iterate
• Each step is expensive: evaluating F(x_k) and F'(x_k), then solving for s_k

  11. Quasi-Newton methods (nonlinear systems)
• Implicit schemes for ODEs y' = f(t, y):
Forward Euler: y_{n+1} = y_n + h f(t_n, y_n) (explicit)
Backward Euler: y_{n+1} = y_n + h f(t_{n+1}, y_{n+1}) (implicit)
• Implicit schemes require the solution of a nonlinear system at every step (likewise Crank-Nicolson, Runge-Kutta, and linear multistep formulas)
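A sketch of why implicit schemes need a nonlinear solver: one backward Euler step, with the implicit equation solved by Newton's method. The helper names (backward_euler_step, dfdy) are ours, not from the slides:

```python
import numpy as np

def backward_euler_step(f, dfdy, t, y, h, newton_iters=20, tol=1e-12):
    """One backward Euler step: solve y_new = y + h*f(t+h, y_new) for
    y_new with Newton's method. dfdy returns the Jacobian of f in y."""
    y_new = y + h * f(t, y)                   # explicit predictor
    for _ in range(newton_iters):
        G = y_new - y - h * f(t + h, y_new)   # residual of the implicit equation
        if np.linalg.norm(G) < tol:
            break
        JG = np.eye(y.size) - h * dfdy(t + h, y_new)
        y_new = y_new - np.linalg.solve(JG, G)
    return y_new
```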

  12. Quasi-Newton methods (nonlinear systems)
• How can we avoid evaluating F'(x_k)?
• Broyden's method:
B_{k+1} = B_k + (y_k − B_k s_k) s_k^T / <s_k, s_k>
x_{k+1} = x_k − B_k^{-1} F(x_k)
• Inverse update (Sherman-Morrison formula):
H_{k+1} = H_k + (s_k − H_k y_k) s_k^T H_k / <s_k, H_k y_k>
x_{k+1} = x_k − H_k F(x_k)
where s_k = x_{k+1} − x_k and y_k = F(x_{k+1}) − F(x_k)
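The inverse update above in code: a minimal NumPy sketch of Broyden's method with the Sherman-Morrison form, so the loop contains no Jacobian evaluations and no linear solves:

```python
import numpy as np

def broyden_inverse(F, x0, H0=None, tol=1e-10, max_iter=100):
    """Broyden's method with the inverse (Sherman-Morrison) update."""
    x = x0.astype(float)
    H = np.eye(x.size) if H0 is None else H0     # approximates F'(x)^{-1}
    Fx = F(x)
    for _ in range(max_iter):
        s = -H @ Fx                              # quasi-Newton step
        x_new = x + s
        F_new = F(x_new)
        if np.linalg.norm(F_new) < tol:
            return x_new
        y = F_new - Fx
        Hy = H @ y
        # H_{k+1} = H_k + (s_k - H_k y_k) s_k^T H_k / <s_k, H_k y_k>
        H = H + np.outer(s - Hy, s @ H) / (s @ Hy)
        x, Fx = x_new, F_new
    return x
```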

  13. Quasi-Newton methods (nonlinear systems)
Advantages:
• No need to compute F'(x_k)
• With the inverse update, no linear system to solve
Disadvantages:
• Convergence drops from quadratic to superlinear
• No longer memoryless: errors in the update carry over to later steps

  14. Quasi-Newton methods (unconstrained optimization)
• Problem: find the global minimizer of a cost function f : R^n → R, x* = arg min f(x)
• If f is differentiable, the problem can be attacked by looking for zeros of the gradient ∇f

  15. Quasi-Newton methods (unconstrained optimization)
• Descent methods: x_{k+1} = x_k − λ_k P_k ∇f(x_k)
P_k = I_n: steepest descent
P_k = ∇²f(x_k)^{-1}: Newton's method
P_k = B_k^{-1}: quasi-Newton
• The angle between P_k ∇f(x_k) and ∇f(x_k) must be less than 90° for a descent direction
• B_k has to mimic the behavior of the Hessian ∇²f(x_k)
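A minimal sketch of the generic descent iteration, with a fixed step length λ for simplicity (the line search of the next slide would choose it adaptively); the quadratic test problem is an illustrative assumption:

```python
import numpy as np

def descent(grad, x0, P=None, lam=0.1, max_iter=500, tol=1e-8):
    """Generic descent iteration x_{k+1} = x_k - lam * P @ grad(x_k).
    P = I gives steepest descent; P = inverse Hessian gives Newton."""
    x = x0.astype(float)
    P = np.eye(x.size) if P is None else P
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:
            break
        x = x - lam * P @ g
    return x

# Steepest descent on the quadratic f(x) = 0.5 x^T A x - b^T x.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
print(descent(lambda x: A @ x - b, np.zeros(2)))   # -> A^{-1} b = [0.2, 0.4]
```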

  16. Quasi-Newton methods (unconstrained optimization)
Global convergence:
• Line search
Step length: backtracking, interpolation
Sufficient decrease: Wolfe conditions
• Trust regions
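A sketch of backtracking line search enforcing the sufficient-decrease (Armijo, first Wolfe) condition; the curvature condition is omitted for brevity, and the constants rho and c are conventional choices, not from the slides:

```python
import numpy as np

def backtracking(f, grad, x, p, alpha0=1.0, rho=0.5, c=1e-4):
    """Shrink alpha until the sufficient-decrease condition
    f(x + a p) <= f(x) + c a <grad f(x), p> holds."""
    alpha = alpha0
    fx, slope = f(x), grad(x) @ p   # slope < 0 for a descent direction p
    while f(x + alpha * p) > fx + c * alpha * slope:
        alpha *= rho
    return alpha
```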

  17. Quasi-Newton methods (unconstrained optimization)
For quasi-Newton, B_k has to resemble ∇²f(x_k):
• Single-rank (SR1): B_{k+1} = B_k + (y_k − B_k s_k)(y_k − B_k s_k)^T / <y_k − B_k s_k, s_k>
• Symmetry (PSB): a rank-two correction that keeps B_{k+1} symmetric
• Positive definiteness (BFGS): B_{k+1} = B_k + y_k y_k^T / <y_k, s_k> − B_k s_k s_k^T B_k / <s_k, B_k s_k>
• Inverse update: H_{k+1} = (I − ρ_k s_k y_k^T) H_k (I − ρ_k y_k s_k^T) + ρ_k s_k s_k^T, with ρ_k = 1 / <y_k, s_k>
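As an example of the positive-definite family, the BFGS update of the inverse Hessian approximation in product form; a sketch, assuming <s, y> > 0 (which the Wolfe conditions guarantee):

```python
import numpy as np

def bfgs_inverse_update(H, s, y):
    """BFGS update of the inverse Hessian approximation H, given
    s = x_{k+1} - x_k and y = grad f(x_{k+1}) - grad f(x_k).
    Keeps H symmetric positive definite as long as <s, y> > 0."""
    rho = 1.0 / (y @ s)
    I = np.eye(H.shape[0])
    V = I - rho * np.outer(s, y)
    return V @ H @ V.T + rho * np.outer(s, s)
```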

  18. Quasi-Newton methods (unconstrained optimization)
Computation:
• Matrix updates, inner products
• DFP, PSB: 3 matrix-vector products per update
• BFGS: 2 matrix-matrix products per update
Storage:
• Limited-memory versions (L-BFGS)
• Store {s_k, y_k} for the last m iterations and reapply the updates instead of storing H
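A sketch of the L-BFGS two-loop recursion: it applies the inverse Hessian approximation built from the stored pairs {s_k, y_k} to a gradient without ever forming a matrix, so work and storage stay O(mn):

```python
import numpy as np

def lbfgs_direction(grad, s_list, y_list):
    """Two-loop recursion: returns the quasi-Newton search direction
    -H grad, where H is implied by the last m pairs (s_k, y_k),
    stored oldest first in s_list and y_list."""
    q = grad.copy()
    alphas = []
    for s, y in zip(reversed(s_list), reversed(y_list)):   # newest first
        a = (s @ q) / (y @ s)
        alphas.append(a)
        q -= a * y
    if s_list:                                 # initial scaling H0 = gamma*I
        s, y = s_list[-1], y_list[-1]
        q *= (s @ y) / (y @ y)
    for (s, y), a in zip(zip(s_list, y_list), reversed(alphas)):
        b = (y @ q) / (y @ s)
        q += (a - b) * s
    return -q
```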

  19. Further improvements
Preconditioning the linear system:
• For faster convergence one may solve the preconditioned system K B_k p_k = K F(x_k)
• If B_k is SPD (and sparse), a sparse approximate inverse can be used to generate the preconditioner
• This preconditioner can be refined on a subspace of B_k using an algebraic multigrid technique
• This requires solving an eigenvalue problem

  20. Further improvements
Model reduction:
• Sometimes the dimension of the system is very large
• We want a smaller model that captures the essence of the original
• An approximation of the model variability can be retrieved from an ensemble of forward simulations
• The covariance matrix of the ensemble gives the reduced subspace
• Again, this requires solving an eigenvalue problem
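A hedged sketch of the ensemble idea: collect forward-simulation snapshots as columns, form the sample covariance, and keep its dominant eigenvectors as the reduced subspace (a POD-style basis). Function and argument names are ours, not from the slides:

```python
import numpy as np

def reduced_basis(snapshots, m):
    """Columns of `snapshots` are state vectors from an ensemble of
    forward runs; the leading m eigenvectors of the sample covariance
    span the reduced subspace."""
    X = snapshots - snapshots.mean(axis=1, keepdims=True)
    C = X @ X.T / (X.shape[1] - 1)          # sample covariance, n x n
    w, V = np.linalg.eigh(C)                # symmetric eigenproblem
    return V[:, np.argsort(w)[::-1][:m]]    # m dominant directions
```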

  21. QR/QL algorithms for symmetric matrices
• Solve the eigenvalue problem iteratively
• Use a QR/QL factorization at each step (A = QR, Q unitary, R upper triangular):
for k = 1, 2, …
A_k = Q_k R_k
A_{k+1} = R_k Q_k
end
• The diagonal of A_k converges to the eigenvalues of A
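The unshifted iteration above, verbatim in NumPy; since A_{k+1} = Q_k^T A_k Q_k, every iterate is similar to A and the eigenvalues are preserved:

```python
import numpy as np

def qr_algorithm(A, iters=200):
    """Unshifted QR algorithm: factor A_k = Q_k R_k, form A_{k+1} = R_k Q_k.
    For symmetric A the iterates approach a diagonal matrix."""
    Ak = A.copy()
    for _ in range(iters):
        Q, R = np.linalg.qr(Ak)
        Ak = R @ Q
    return np.sort(np.diag(Ak))

A = np.array([[4.0, 1.0], [1.0, 3.0]])
print(qr_algorithm(A))                  # matches np.linalg.eigvalsh(A)
```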

  22. QR/QL algorithms for symmetric matrices
• The matrix A is reduced to upper Hessenberg form before starting the iterations
• Householder reflections: U = I − 2 v v^T / <v, v>
• The reduction is performed column by column
• If A is symmetric, it is reduced to tridiagonal form

  23. QR/QL algorithms for symmetric matrices
• Convergence to triangular form can be slow
• Origin shifts z_k are used to accelerate it:
for k = 1, 2, …
A_k − z_k I = Q_k R_k
A_{k+1} = R_k Q_k + z_k I
end
• A common choice is the Wilkinson shift, taken from the trailing 2×2 block
• QR makes heavy use of matrix-matrix products
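A sketch of one shifted step with the Wilkinson shift (the eigenvalue of the trailing 2×2 block closest to the bottom-right entry); the shift formula is the standard one from Golub & Van Loan, and the function names are ours:

```python
import numpy as np

def wilkinson_shift(A):
    """Eigenvalue of the trailing 2x2 block of A closest to A[-1, -1]."""
    a, b, c = A[-2, -2], A[-2, -1], A[-1, -1]
    if b == 0.0:
        return c
    d = (a - c) / 2.0
    sgn = 1.0 if d >= 0.0 else -1.0
    return c - sgn * b**2 / (abs(d) + np.hypot(d, b))

def shifted_qr_step(A):
    """One shifted QR step: factor A - zI = QR, form RQ + zI,
    which is again similar to A."""
    z = wilkinson_shift(A)
    I = np.eye(A.shape[0])
    Q, R = np.linalg.qr(A - z * I)
    return R @ Q + z * I
```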

  24. Alternatives to quasi-Newton
Inexact Newton methods:
• Inner iteration: determine a search direction by solving the linear system to a certain tolerance
• Only Hessian-vector products are needed, not the full Hessian
• Outer iteration: line search along the search direction
Nonlinear CG:
• The residual is replaced by the gradient of the cost function
• Step lengths come from a line search
• Different flavors (e.g., Fletcher-Reeves, Polak-Ribiere)

  25. Alternatives to quasi-Newton
Direct search:
• Does not require derivatives of the cost function
• Uses a structure called a simplex to search for decrease in f
• Stops when no further progress can be achieved
• Can get stuck in a local minimum
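For illustration, SciPy's implementation of the simplex method (Nelder-Mead) on the Rosenbrock function; the test function and starting point are conventional choices, not from the slides:

```python
import numpy as np
from scipy.optimize import minimize

# Derivative-free simplex search on the Rosenbrock function.
rosen = lambda x: (1.0 - x[0])**2 + 100.0 * (x[1] - x[0]**2)**2
res = minimize(rosen, x0=np.array([-1.2, 1.0]), method="Nelder-Mead")
print(res.x, res.fun)   # close to the minimizer [1, 1] with f = 0
```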

  26. More alternatives
Monte Carlo:
• A computational method relying on repeated random sampling
• Can be used for optimization (e.g., MDO) and for inverse problems, via random walks
• With multiple correlated variables, the correlation matrix is SPD, so we can factorize it with Cholesky to generate correlated samples
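A sketch of the Cholesky trick mentioned above: if C = L L^T and z is standard normal, then x = L z has covariance L E[z z^T] L^T = C, which is how correlated Monte Carlo samples are drawn:

```python
import numpy as np

C = np.array([[1.0, 0.8],
              [0.8, 1.0]])                 # SPD correlation matrix
L = np.linalg.cholesky(C)                  # C = L @ L.T
rng = np.random.default_rng(0)
z = rng.standard_normal((2, 10000))        # independent N(0, 1) samples
x = L @ z                                  # correlated samples
print(np.corrcoef(x))                      # empirical correlation, close to C
```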

  27. Conclusions
• Newton's method is a powerful tool with many applications: solving nonlinear systems and finding minimizers of cost functions. It combines naturally with many other numerical algorithms (factorizations, linear solvers)
• Optimizing and parallelizing matrix-vector and matrix-matrix products, decompositions, and other numerical kernels can have a significant impact on overall performance

  28. Thank you for your time!
