
Challenges and Opportunities in Using Automatic Differentiation with Object-Oriented Toolkits for Scientific Computing

Paul Hovland (Argonne National Laboratory)

Steven Lee (Lawrence Livermore National Laboratory)

Lois McInnes (ANL)

Boyana Norris (ANL)

Barry Smith (ANL)

The Computational Differentiation Project at Argonne National Laboratory

Acknowledgments
  • Jason Abate
  • Satish Balay
  • Steve Benson
  • Peter Brown
  • Omar Ghattas
  • Lisa Grignon
  • William Gropp
  • Alan Hindmarsh
  • David Keyes
  • Jorge Moré
  • Linda Petzold
  • Widodo Samyono
Outline
  • Intro to AD
  • Survey of Toolkits
    • SensPVODE
    • PETSc
    • TAO
  • Using AD with Toolkits
    • Toolkit Level
    • Parallel Function Level
    • Subdomain Level
    • Element/Vertex Function Level
  • Experimental Results
  • Conclusions and Expectations
Automatic Differentiation
  • Technique for augmenting code for computing a function with code for computing derivatives
  • Analytic differentiation of elementary operations/functions, with propagation by the chain rule (a minimal sketch follows this list)
  • Can be implemented using source transformation or operator overloading
  • Two main modes
    • Forward: propagates derivatives from independent to dependent variables
    • Reverse (adjoint): propagates derivatives from dependent to independent variables
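To make the forward mode concrete, here is a minimal dual-number sketch in C (an illustration only, not ADIC- or ADIFOR-generated code): each value carries its derivative with respect to one chosen independent variable, and each elementary operation propagates both through the chain rule.

  /* Minimal forward-mode AD sketch: value + derivative pairs. */
  #include <math.h>
  #include <stdio.h>

  typedef struct { double v, d; } Dual;           /* value and derivative */

  static Dual dmul(Dual a, Dual b) {              /* d(ab)  = a'b + ab'   */
    Dual r = { a.v*b.v, a.d*b.v + a.v*b.d };
    return r;
  }
  static Dual dexp(Dual a) {                      /* d(e^a) = e^a a'      */
    Dual r = { exp(a.v), exp(a.v)*a.d };
    return r;
  }

  int main(void) {
    Dual x = { 2.0, 1.0 };                        /* seed dx/dx = 1       */
    Dual y = dmul(x, dexp(x));                    /* y = x*exp(x)         */
    printf("y = %g, dy/dx = %g\n", y.v, y.d);     /* dy/dx = (1+x)e^x     */
    return 0;
  }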
Comparison of Methods
  • Finite Differences
    • Advantages: cheap Jv (sketched after this list), easy
    • Disadvantages: inaccurate (not robust)
  • Hand Differentiation
    • Advantages: accurate; cheap Jv, JTv, Hv, …
    • Disadvantages: hard; difficult to maintain consistency
  • Automatic Differentiation
    • Advantages: cheap JTv, Hv; easy?
    • Disadvantages: Jv costs ~2 function evals; hard?
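To illustrate the "cheap Jv" entry above, a matrix-free finite-difference Jacobian-vector product needs only one extra function evaluation; this generic sketch (not PETSc's or PVODE's implementation) also shows where the accuracy problem comes from, since the result depends on the differencing parameter h.

  #include <stdlib.h>

  typedef void (*FuncEval)(int n, const double *u, double *f);

  /* Jv approximated by (F(u + h*v) - F(u)) / h; accuracy hinges on h. */
  void fd_jacvec(FuncEval F, int n, const double *u, const double *v,
                 double h, double *Jv) {
    double *fu  = malloc(n*sizeof(double));
    double *up  = malloc(n*sizeof(double));
    double *fup = malloc(n*sizeof(double));
    int i;
    F(n, u, fu);                                   /* F(u)       */
    for (i = 0; i < n; i++) up[i] = u[i] + h*v[i];
    F(n, up, fup);                                 /* F(u + h*v) */
    for (i = 0; i < n; i++) Jv[i] = (fup[i] - fu[i]) / h;
    free(fu); free(up); free(fup);
  }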
PVODE: Parallel ODE-IVP solver
  • Algorithm developers: Hindmarsh, Byrne, Brown, and Cohen

  • ODE Initial-Value Problems
  • Stiff and non-stiff integrators
  • Written in C
  • MPI calls for communication
PVODE for ODE Simulations
  • ODE initial-value problem (standard form): y' = f(t, y), y(t0) = y0
  • Implicit time-stepping using BDF methods for y(tn)
  • Solve the nonlinear system for y(tn) via inexact Newton
  • Solve for the update to the Newton iterate using Krylov methods
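For concreteness, with a BDF method the implicit step leads to a nonlinear system of the form (written here as in the CVODE family of solvers; the exact bookkeeping of the coefficients varies by reference)

  G(y_n) = y_n - gamma * f(t_n, y_n) - a_n = 0,   with gamma = h_n * beta_{n,0},

where a_n collects known data from previous steps. This is the system handled by the inexact Newton iteration, and each Newton update is obtained with a Krylov method.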
Objective

Diagram: today, given the right-hand side F(y, p), parameters p, and initial state y|t=0, the ODE solver (PVODE) returns y at t = t1, t2, .... The goal is to automatically produce a sensitivity solver that, from the same inputs, returns both y and dy/dp at t = t1, t2, ....

Possible Approaches

Diagram: two approaches. (1) Apply AD to PVODE itself: the differentiated solver ad_PVODE uses ad_F(y, ad_y, p, ad_p) and, from p, ad_p and y, ad_y|t=0, returns y, ad_y at t = t1, t2, .... (2) Solve the sensitivity equations: SensPVODE uses ad_F(y, ad_y, p, ad_p) and, from p and y|t=0, returns y and dy/dp at t = t1, t2, ....

Sensitivity Differential Equations
  • Differentiate y= f(t, y, p) with respect to pi:
  • A linearODE-IVP for the sensitivity vector si(t) :
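A hedged sketch of how this right-hand side could be formed with a single call to a forward-mode differentiated RHS routine; the signature of ad_f below is hypothetical (an actual ADIFOR/ADIC-generated routine would have a different calling convention).

  #include <stdlib.h>

  /* Hypothetical forward-mode differentiated RHS: given seeds ad_y and ad_p,
     it returns f and the directional derivative (df/dy)*ad_y + (df/dp)*ad_p. */
  typedef void (*ADRhs)(double t,
                        int ny, const double *y, const double *ad_y,
                        int np, const double *p, const double *ad_p,
                        double *f, double *ad_f);

  /* Right-hand side of the i-th sensitivity equation:
     sdot_i = (df/dy) s_i + df/dp_i, obtained with one call to ad_f. */
  void sens_rhs(ADRhs ad_f, double t,
                int ny, const double *y, const double *s_i,
                int np, const double *p, int i,
                double *f, double *sdot_i) {
    double *ad_p = (double*)calloc(np, sizeof(double));
    ad_p[i] = 1.0;                      /* seed the i-th parameter direction */
    ad_f(t, ny, y, s_i, np, p, ad_p, f, sdot_i);
    free(ad_p);
  }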
PETSc
  • Portable, Extensible Toolkit for Scientific computing
  • Parallel
  • Object-oriented
  • Free
  • Supported (manuals, email)
  • Interfaces with Fortran 77/90, C/C++
  • Available for virtually all UNIX platforms, as well as Windows 95/NT
  • Flexible: many options for solver algorithms and parameters
Nonlinear PDE Solution

Diagram: the main routine drives the application. PETSc's nonlinear solvers (SNES) solve F(u) = 0, calling the linear solvers (SLES), which combine Krylov methods (KSP) with preconditioners (PC). Application initialization, function evaluation, and post-processing are user code; Jacobian evaluation is AD-generated code; the solver layers are PETSc code.

TAO

Tao: the process of nature by which all things change and which is to be followed for a life of harmony ("The Right Way").
TAO: Toolkit for Advanced Optimization.

  • Object-oriented techniques
  • Component-based (CCA) interaction
  • Leverage existing parallel computing infrastructure
  • Reuse of external toolkits

TAO Goals
  • Portability
  • Performance
  • Scalable parallelism
  • An interface independent of architecture
TAO Algorithms
  • Unconstrained optimization
    • Limited-memory variable-metric method
    • Trust region/line search Newton method
    • Conjugate-gradient method
    • Levenberg-Marquardt method
  • Bound-constrained optimization
    • Trust region Newton method
    • Gradient projection/conjugate gradient method
  • Linearly-constrained optimization
    • Interior-point method with iterative solvers
PETSc and TAO

Diagram: the application driver calls the optimization tools of the Toolkit for Advanced Optimization (TAO); TAO builds on PETSc's linear solvers (KSP and PC), matrices, and vectors. Application initialization, function and gradient evaluation, Hessian evaluation, and post-processing are user code; the remaining layers are PETSc and TAO code.

Using AD with Toolkits
  • Apply AD to toolkit to produce derivative-enhanced toolkit
  • Use AD to provide the Jacobian/Hessian/gradient for use by the toolkit. Apply AD at one of the following levels:
    • Parallel Function Level
    • Subdomain Function Level
    • Element/Vertex Function Level
Differentiated Version of Toolkit
  • Makes possible sensitivity analysis and black-box optimization of models constructed using the toolkit
  • Can take advantage of the high-level structure of algorithms, providing better performance: see Andreas' and Linda's talks
  • Ongoing work with PETSc and PVODE
Levels of Function Evaluation

int FormFunction(SNES snes,Vec X,Vec F,void *ptr)                  /* <-- Parallel Function Level */
{
  /* Variable declarations omitted */
  mx = user->mx; my = user->my; lambda = user->param;               /* <-- Subdomain Function Level */
  hx = one/(double)(mx-1); hy = one/(double)(my-1);
  sc = hx*hy*lambda; hxdhy = hx/hy; hydhx = hy/hx;
  ierr = DAGlobalToLocalBegin(user->da,X,INSERT_VALUES,localX);CHKERRQ(ierr);
  ierr = DAGlobalToLocalEnd(user->da,X,INSERT_VALUES,localX);CHKERRQ(ierr);
  ierr = VecGetArray(localX,&x);CHKERRQ(ierr);
  ierr = VecGetArray(localF,&f);CHKERRQ(ierr);
  ierr = DAGetCorners(user->da,&xs,&ys,PETSC_NULL,&xm,&ym,PETSC_NULL);CHKERRQ(ierr);
  ierr = DAGetGhostCorners(user->da,&gxs,&gys,PETSC_NULL,&gxm,&gym,PETSC_NULL);CHKERRQ(ierr);
  for (j=ys; j<ys+ym; j++) {
    row = (j - gys)*gxm + xs - gxs - 1;
    for (i=xs; i<xs+xm; i++) {
      row++;
      if (i == 0 || j == 0 || i == mx-1 || j == my-1) {             /* <-- Vertex/Element Function Level */
        f[row] = x[row]; continue;
      }
      u   = x[row];
      uxx = (two*u - x[row-1] - x[row+1])*hydhx;
      uyy = (two*u - x[row-gxm] - x[row+gxm])*hxdhy;
      f[row] = uxx + uyy - sc*PetscExpScalar(u);
    }
  }
  ierr = VecRestoreArray(localX,&x);CHKERRQ(ierr);
  ierr = VecRestoreArray(localF,&f);CHKERRQ(ierr);
  ierr = DALocalToGlobal(user->da,localF,INSERT_VALUES,F);CHKERRQ(ierr);
  ierr = PLogFlops(11*ym*xm);CHKERRQ(ierr);
  return 0;
}

Parallel Function Level
  • Advantages
    • Well-defined interface: int Function(SNES, Vec, Vec, void *); void function(integer, Real, N_Vector, N_Vector, void *)
    • No changes to function
  • Disadvantages
    • Differentiation of toolkit support functions (may result in unnecessary work)
    • AD of parallel code (MPI, OpenMP)
    • May need global coloring
Subdomain Function Level
  • Advantages
    • No need to differentiate communication functions
    • Interface may be well defined
  • Disadvantages
    • May need local coloring
    • May need to extract from parallel function
Using AD with PETSc

Diagram: the hand-written parallel function consists of a global-to-local scatter of ghost values, a local function computation, and parallel function assembly. ADIFOR or ADIC, driven by a script file, differentiates the local function computation into a local Jacobian computation. The derivative driver (global-to-local scatter of ghost values, seed matrix initialization, local Jacobian computation, parallel Jacobian assembly) is coded manually but can be automated. A sketch of the seed matrix initialization follows.
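The seed matrix initialization step can be sketched generically (this is not the actual PETSc/ADIC interface; it assumes a precomputed coloring of the local unknowns): column c of the seed matrix is the sum of the unit vectors of all unknowns with color c, so a single forward-mode sweep over ncolor directions recovers every nonzero of the local Jacobian.

  /* Generic seed-matrix initialization from a coloring (sketch only).
     S is n x ncolor, stored row-major; color[i] lies in [0, ncolor). */
  void init_seed(int n, int ncolor, const int *color, double *S) {
    int i, c;
    for (i = 0; i < n; i++)
      for (c = 0; c < ncolor; c++)
        S[i*ncolor + c] = (color[i] == c) ? 1.0 : 0.0;
  }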

Using AD with TAO

Diagram: the structure is analogous, but for gradients. The hand-written parallel function consists of a global-to-local scatter of ghost values, a local function computation, and parallel function assembly. ADIFOR or ADIC, driven by a script file, differentiates the local function computation into a local gradient computation; the derivative driver (global-to-local scatter of ghost values, seed matrix initialization, local gradient computation, parallel gradient assembly) is coded manually but can be automated.

Element/Vertex Function Level
  • Advantages
    • Reduced memory requirements
    • No need for matrix coloring (a vertex-level sketch follows this list)
  • Disadvantages
    • May be difficult to decompose function to this level (boundary conditions, other special cases)
    • Decomposition to this level may impede efficiency
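As an illustration of a vertex-level function, the interior-point residual of the solid fuel ignition (Bratu) problem from the FormFunction listing above can be pulled out as a function of its five stencil values (the extraction shown is illustrative; boundary points would still need the special cases noted above).

  #include <math.h>

  /* Residual at one interior grid point as a function of the point value and
     its four neighbors. Differentiating this tiny routine yields one row of
     the Jacobian (five entries) directly, with no need for matrix coloring. */
  double vertex_residual(double u, double u_west, double u_east,
                         double u_south, double u_north,
                         double hxdhy, double hydhx, double sc) {
    double uxx = (2.0*u - u_west  - u_east )*hydhx;
    double uyy = (2.0*u - u_south - u_north)*hxdhy;
    return uxx + uyy - sc*exp(u);
  }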
Example

int localfunction2d(Field **x, Field **f, int xs, int xm, int ys, int ym,
                    int mx, int my, void *ptr)
{
  xints = xs; xinte = xs+xm; yints = ys; yinte = ys+ym;
  if (yints == 0) {
    j = 0;
    yints = yints + 1;
    for (i=xs; i<xs+xm; i++) {
      f[j][i].u     = x[j][i].u;
      f[j][i].v     = x[j][i].v;
      f[j][i].omega = x[j][i].omega + (x[j+1][i].u - x[j][i].u)*dhy;
      f[j][i].temp  = x[j][i].temp - x[j+1][i].temp;
    }
  }
  if (yinte == my) {
    j = my - 1;
    yinte = yinte - 1;
    for (i=xs; i<xs+xm; i++) {
      f[j][i].u     = x[j][i].u - lid;
      f[j][i].v     = x[j][i].v;
      f[j][i].omega = x[j][i].omega + (x[j][i].u - x[j-1][i].u)*dhy;
      f[j][i].temp  = x[j][i].temp - x[j-1][i].temp;
    }
  }
  if (xints == 0) {
    i = 0;
    xints = xints + 1;
    for (j=ys; j<ys+ym; j++) {
      f[j][i].u     = x[j][i].u;
      f[j][i].v     = x[j][i].v;
      f[j][i].omega = x[j][i].omega - (x[j][i+1].v - x[j][i].v)*dhx;
      f[j][i].temp  = x[j][i].temp;
    }
  }
  if (xinte == mx) {
    i = mx - 1;
    xinte = xinte - 1;
    for (j=ys; j<ys+ym; j++) {
      f[j][i].u     = x[j][i].u;
      f[j][i].v     = x[j][i].v;
      f[j][i].omega = x[j][i].omega - (x[j][i].v - x[j][i-1].v)*dhx;
      f[j][i].temp  = x[j][i].temp - (double)(grashof>0);
    }
  }
  for (j=yints; j<yinte; j++) {
    for (i=xints; i<xinte; i++) {
      vx = x[j][i].u; avx = PetscAbsScalar(vx);
      vxp = p5*(vx+avx); vxm = p5*(vx-avx);
      vy = x[j][i].v; avy = PetscAbsScalar(vy);
      vyp = p5*(vy+avy); vym = p5*(vy-avy);
      u   = x[j][i].u;
      uxx = (two*u - x[j][i-1].u - x[j][i+1].u)*hydhx;
      uyy = (two*u - x[j-1][i].u - x[j+1][i].u)*hxdhy;
      f[j][i].u = uxx + uyy - p5*(x[j+1][i].omega - x[j-1][i].omega)*hx;
      u   = x[j][i].v;
      uxx = (two*u - x[j][i-1].v - x[j][i+1].v)*hydhx;
      uyy = (two*u - x[j-1][i].v - x[j+1][i].v)*hxdhy;
      f[j][i].v = uxx + uyy + p5*(x[j][i+1].omega - x[j][i-1].omega)*hy;
      u   = x[j][i].omega;
      uxx = (two*u - x[j][i-1].omega - x[j][i+1].omega)*hydhx;
      uyy = (two*u - x[j-1][i].omega - x[j+1][i].omega)*hxdhy;
      f[j][i].omega = uxx + uyy
                      + (vxp*(u - x[j][i-1].omega) + vxm*(x[j][i+1].omega - u))*hy
                      + (vyp*(u - x[j-1][i].omega) + vym*(x[j+1][i].omega - u))*hx
                      - p5*grashof*(x[j][i+1].temp - x[j][i-1].temp)*hy;
      u   = x[j][i].temp;
      uxx = (two*u - x[j][i-1].temp - x[j][i+1].temp)*hydhx;
      uyy = (two*u - x[j-1][i].temp - x[j+1][i].temp)*hxdhy;
      f[j][i].temp = uxx + uyy
                     + prandtl*((vxp*(u - x[j][i-1].temp) + vxm*(x[j][i+1].temp - u))*hy
                              + (vyp*(u - x[j-1][i].temp) + vym*(x[j+1][i].temp - u))*hx);
    }
  }
  ierr = PetscLogFlops(84*ym*xm);CHKERRQ(ierr);
  return 0;
}

Experimental Results
  • Toolkit Level – Differentiated PETSc Linear Solver
  • Parallel Nonlinear Function Level – SensPVODE
  • Local Subdomain Function Level
    • PETSc
    • TAO
  • Element Function Level – PETSc
SensPVODE: Problem
  • Diurnal kinetics advection-diffusion equation
  • 100x100 structured grid
  • 16 processors of a Linux cluster with 550 MHz processors and Myrinet interconnect
PETSc Applications
  • Toy problems
    • Solid fuel ignition: finite difference discretization; Fortran & C variants; differentiated using ADIFOR, ADIC
    • Driven cavity: finite difference discretization; C implementation; differentiated using ADIC
  • Euler code
    • Based on legacy F77 code from D. Whitfield (MSU)
    • Finite volume discretization
    • Up to 1,121,320 unknowns
    • Mapped C-H grid
    • Fully implicit steady-state
    • Tools: SNES, DA, ADIFOR
For More Information
  • PETSc: http://www.mcs.anl.gov/petsc/
  • TAO: http://www.mcs.anl.gov/tao/
  • Automatic Differentiation at Argonne
    • http://www.mcs.anl.gov/autodiff/
    • ADIFOR: http://www.mcs.anl.gov/adifor/
    • ADIC: http://www.mcs.anl.gov/adic/
  • http://www.autodiff.org
10 Challenges for PDE Optimization Algorithms
  • Problem size
  • Efficiency vs. intrusiveness
  • “Physics-based” globalizations
  • Inexact PDE solvers
  • Approximate Jacobians
  • Implicitly-defined PDE residuals
  • Non-smooth solutions
  • Pointwise inequality constraints
  • Scalability of adjoint methods to large numbers of inequalities
  • Time-dependent PDE optimization
7. Non-smoothness
  • PDE residual may not depend smoothly on state variables (maybe not even continuously)
    • Solution-adaptivity
    • Discontinuity-capturing, front-tracking
    • Subgrid-scale models
    • Material property evaluation
    • Contact problems
    • Elasto(visco)plasticity
  • PDE residual may not depend smoothly or continuously on decision variables
    • Solid modeler- and mesh generator-induced non-smoothness