Computing a posteriori covariance in variational DA I.Gejadze, F.-X. Le Dimet, V.Shutyaev

Computing a posteriori covariance in variational DA I.Gejadze, F.-X. Le Dimet, V.Shutyaev

What is a posteriori covariance and why one needs to compute it ? - 1 Model of evolution process can be defined in 2D-3D, can be a vector-function Measure of event - target state Uncertainty in initial, boundary conditions, distributed coefficients, model error …. Uncertainty in the event measure Data assimilation is used to reduce uncertainty in controls / parameters, and eventually in the event measure Objective function (for the initial value control)

What is a posteriori covariance and why one needs to compute it ? – 2(idea of adjoint sensitivities) Trivial relationship between (small) uncertainties in controls / parameters and in the event measure How to compute the gradient? 1. Direct method ( ) : for solve forward model with compute 2. Adjoint method: form the Lagrangian zero Gateaux derivative take variation integrate by parts If = zero and For initial value problem Generally

What is a posteriori covariance and why one needs to compute it ? - 3 - n-vector scalar sensitivity vector Discussion is mainly focused on the question: does the linearised error evolution model (adjoint sensitivity) approximate the non-linear model well? However, another issue is equally (if not more) important: do and how well we really now ? Objective function • the covariance matrix of the background error or a priori covariance • (measure of uncertainty in before DA) • must be a posteriori covariance of the estimation error • (measure of uncertainty in after DA)

How to compute a-posteriori covariance?

Hessian Definitions: Function Gradient , where Hessian For large-scale problems, keeping Hessian or its inverse in the matrix form is not always feasible. If the state vector dimension is , the Hessian contains elements.

Hessian – operator-function product form 1 Control problem: Optimality system: Void optimality system for exact solution

Hessian – operator-function product form - 2 Definition of errors: Non-linear optimality system for errors: There exists a unique representation In the form of non-linear operator equation, we call - the Hessian operator. Hessian operator-function product definition: All operators are defined similarly.

Hessian– operator-function product form - 3 Hessianoperator – is the only one to be inverted in error analysis for any inverse problem, therefore plays the key role in defining ! Since we do not store full matrices, we must find a way to define . We do not know (and do not want to know) , only . Main Theorem Assume errors are normally distributed, unbiased and mutually uncorrelated, and , ; a) if b) if (tangent linear hypothesis is a ‘local’ sufficient condition, not necessary!

Optimal solution error covariance reminder Holds for any control or combination of controls!

Optimal solution error covariance Case I: 1) Errors are normally distributed, 2) is moderately non-linear or are small, i.e. Case II: 1) errors have arbitrary pdf, a) pre-compute (define) . b) produce single errors implementation using certain pdf generators. 2) ‘weak’ non-linear conditions hold c) compute (no inversion involved)! d) generate ensemble of e) consider pdf of , which is not normal! f) use it, if you can. Case III: 1) Errors have arbitrary pdf, • 2) is strongly non-linear (chaotic?) • or/and are very big All as above, but compute by iterative procedure using as a pre-conditioner

Methods for computing the inverse Hessian Direct methods: 1. Fisher information matrix: Requires full matrix storage and algebraic inversion. 2. Sensitivity matrix method (A. Bennett), not suitable for large space-temporal data sets, i.e. good for 3DVar; 3. Finite difference method. Not suitable when constraints are partial differential equations due to large truncation errors; requires full matrix storage and algebraic matrix Inversion. Iterative methods: 4. Lanczos type methods Require only the product . The convergence could be erratic and slow if eigenvalues are no well separated. 5. Quasi-Newton BFGS/LBFGS (generate inverse Hessian as a collateral result) Require only the product . The convergence seems uniform. Required storage can be controlled. Exact method (J. Greenstadt, J.Bard)

Preconditioned LBFGS BFGS algorithm: - pairs of vectors to be kept in memory. LBFGS keeps only M latest pairs and the diagonal. Solve by LBFGS the following auxiliary control problem: for chosen . Retrieve pairs which define projected inverse Hessian . Then, How to get ? For example one can use Cholesky factors .

Stupid, yet 100% non-linear ensemble method 1. Consider function as the exact solution to the problem 2. Assume that and . 3. The solution of the control problem is 4. Start ensemble loop k=1,…,m. Generate , which correspond to the given pdf. Put Solve the non-linear optimization problem defined in 3). Compute . 5. End ensemble loop. 6. Compute statistic

NUMERICAL EXAMPLES: case 1 Model (non-linear convection-diffusion): - driving bc

NUMERICAL EXAMPLES: case 1 downstream boundary upstream boundary

NUMERICAL EXAMPLES: case 2

NUMERICAL EXAMPLES: case 3

SUMMARY • Hessian plays the crucial role in the analysis of the • inverse problem solution errors as the only invertible • operator; • If errors are normally distributed and constraints are • linear the inverse Hessian is itself the covariance operator • (matrix) of the optimal solution error; • If the problem is moderately non-linear, the inverse Hessian could be a good • approximation of the optimal solution error covariance far beyond the validity of • the tangent linear hypothesis. Higher order terms could be considered in problem • canonical decomposition. • 4. Inverse Hessian can be well approximated by a sequence of quasi-Newton • updates (LBFGS) using the operator-vector product only. This sequence seems • to converge uniformly; • 5. Preconditioning dramatically accelerates the computing; • 6. The computational cost of computing inverse Hessian should not exceed the • cost of data assimilation procedure itself. • 7. Inverse Hessian is useful for uncertainty analysis, experiment design, adaptive • measuring techniques, etc.

PUBLICATIONS • I.Yu. Gejadze, V.P. Shutyaev, An optimal control problem of initial data • restoration, Comput. Math. & Math. Physics, 39/9 (1999), pp.1416-1425 • F-X. Le-Dimet, V.P. Shutyaev, On deterministic error analysis in variational • data assimilation, Non-linear Processes in Geophysics 14(2005), pp1-10 • F.-X. Le-Dimet, V.P. Shutyaev, I.Yu. Gejadze, On optimal solution error in • variational data assimilation: theoretical aspects. Russ. J. Numer. Analysis and • Math. Modelling (2006), v21/2, pp.139-152 • 4. I. Gejadze, F-X. Le-Dimet and V. Shutyaev. On analysis error covariances • in variational data assimilation, SIAM J. Sci. Comp. (2008), v.30, no.4, 1847-74. • 5. I. Gejadze, F-X. Le-Dimet and V. Shutyaev. Optimal solution error • covariances in variational data assimilation problems, SIAM J. Sci. Comp. • (2008), to be published. Covariance matrix of the initial control problem for the diffusion equation. 3-sensor configuration

Computing a posteriori covariance in variational DA I.Gejadze, F.-X. Le Dimet, V.Shutyaev

Computing a posteriori covariance in variational DA I.Gejadze, F.-X. Le Dimet, V.Shutyaev

Presentation Transcript

About Wattyl Dimet

f(x)

f(x) = x f(x) = x – 1 f(x) = 2x – 1

a. g(x) = f(x) + 5 b. h(x) = f(x+1)

f ( x ) : a function of a variable ( x )

f(x) + a

f(x)

X ( f )

Computing F

F X

Estimating background-error covariance parameters in a variational ocean data

f (x)

f(x) = x 2

x f

f(-x) = f(x)  Even Function

If f ( x )  0 for a  x  b , then b f ( x ) dx a

f(x)