Lecture II-3: Interpolation and Variational Methods


  1. Lecture II-3: Interpolation and Variational Methods
  Lecture Outline:
  • The Interpolation Problem, Estimation Options
  • Regression Methods
    • Linear
    • Nonlinear
  • Input-oriented Bayesian Methods
    • Linear
    • Nonlinear
  • Variational Solutions
  • SGP97 Case Study

  2. A Typical Interpolation Problem -- Groundwater Flow

  The problem is to characterize unknown hydraulic heads at the nodes of a discrete grid. Estimates rely on scattered head measurements and must be compatible with the groundwater flow equation, which serves as the state equation (in continuous form, with conditions specified on the boundaries, and in discretized form on the grid), together with an output equation and a measurement equation.

  Notation:
  • y = vector of hydraulic head at n grid nodes
  • u = vector of recharge values at n grid nodes (uncertain)
  • T = scalar transmissivity (assumed known)
  • M = matrix of coefficients used to interpolate nodal heads to the measurement locations
  • z = vector of head measurements at the well locations
  • ε = vector of measurement errors (uncertain)

  [Figure: plan view of the model grid, showing grid nodes and well observations.]

  How can we characterize the unknown states (heads) and inputs (recharge) at all nodes?
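A plausible reconstruction of the slide's missing equations. The discrete relations y = Du and z = My + ε are confirmed later in the lecture (slide 6); the continuous PDE form shown here is an assumption about what the original slide displayed:

```latex
% State eq. (steady groundwater flow, scalar transmissivity T), with
% conditions specified on the boundaries -- assumed continuous form:
T\,\nabla^{2} y(x) + u(x) = 0
% Discretized state/output equations and measurement equation:
y = D u, \qquad w = M y = M D u = G u, \qquad z = M y + \varepsilon
```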

  3. Options for Solving the Interpolation Problem

  The two most commonly used options for solving the interpolation problem emphasize point and probabilistic estimates, respectively.

  • Classical regression approach: Assume the input u is unknown and the measurement error ε is random with a zero mean and known covariance C_εε. Adjust nodal values of u to obtain the "best" (e.g. least-squares) fit between the model output w and the measurement z. Given certain assumptions, this "point" estimate may be used to derive probabilistic information about the range of likely states.

  • Bayesian estimation approach: Assume u and ε are random vectors described by known unconditional PDFs f_u(u) and f_ε(ε). Derive the conditional PDF of the state, f_y|z(y|z), or, when this is not feasible, identify particular properties of this PDF. Use this information to characterize the uncertain state variable.

  Although these methods can lead to similar results in some cases, they are based on different assumptions and have somewhat different objectives. We will emphasize the Bayesian approach.

  4. Classical Regression -- Linear Problems

  In the regression approach the "goodness of fit" between model outputs and observations is measured in terms of the weighted sum-squared error J_LS:

    J_LS(u) = [z - w]^T C_εε^{-1} [z - w]

  When the problem is linear (as in the groundwater example), the state and output are linear functions of the input:

    y = D u,   w = G u   (G = M D)

  In this case J_LS is a quadratic function of u with a unique minimum which is a linear function of z:

    û = [G^T C_εε^{-1} G]^{-1} G^T C_εε^{-1} z

  This is the classic least-squares estimate of u. The corresponding least-squares estimate of y is:

    ŷ = D û

  Note that the matrix [G^T C_εε^{-1} G] has an inverse only when the number of unknowns in u is less than the number of measurements in z.
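A minimal NumPy sketch of this estimate. All matrices, dimensions, and values here are hypothetical stand-ins for the groundwater quantities (D, M, G = MD, C_εε), not data from the lecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: n unknowns, m measurements
# (m > n, so G^T C^-1 G is invertible, per the slide's note)
n, m = 5, 12
D = rng.standard_normal((n, n))          # state eq.:  y = D u
M = rng.standard_normal((m, n))          # meas. eq.:  z = M y + eps
G = M @ D                                # output map: w = G u
C_eps = 0.1 * np.eye(m)                  # measurement error covariance

u_true = rng.standard_normal(n)
z = G @ u_true + rng.multivariate_normal(np.zeros(m), C_eps)

# Weighted least squares: u_hat = [G^T C^-1 G]^-1 G^T C^-1 z
Ci = np.linalg.inv(C_eps)
u_hat = np.linalg.solve(G.T @ Ci @ G, G.T @ Ci @ z)
y_hat = D @ u_hat                        # corresponding state estimate

print("input estimate:", u_hat)
print("state estimate:", y_hat)
```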

  5. Classical Regression -- Nonlinear Problems

  When the state and/or measurement vectors are nonlinear functions of the input, the regression approach can be applied iteratively. Suppose that w = g(u). At each iteration the linear estimation equations are used, with the nonlinear model approximated by a first-order Taylor series about the previous estimate:

    g(u) ≈ g(û_{k-1}) + G_{k-1} [u - û_{k-1}],   where G_{k-1} = [∂g/∂u] evaluated at û_{k-1}

  Then the least-squares estimation equations at iteration k (k = 1, ..., k_max) become:

    û_k = û_{k-1} + [G_{k-1}^T C_εε^{-1} G_{k-1}]^{-1} G_{k-1}^T C_εε^{-1} [z - g(û_{k-1})]

  The iteration is started with a "first guess" û_0 and then continued until the sequence of estimates converges. An estimate of the state is obtained from the converged estimate of u:

    ŷ = d(û)

  In practice, J_LS may have many local minima in the nonlinear case and convergence is not guaranteed (i.e. the estimation problem may be ill-posed).
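A sketch of this iterative (Gauss-Newton-style) search under the stated first-order Taylor approximation. The model g, its Jacobian jac, and the toy problem at the bottom are hypothetical illustrations, not the groundwater model:

```python
import numpy as np

def gauss_newton(g, jac, z, C_eps, u0, k_max=50, tol=1e-8):
    """Iterative least squares for nonlinear w = g(u): relinearize the
    model about the current estimate and solve the linear normal
    equations for the update.  Convergence is not guaranteed (J_LS may
    have many local minima)."""
    Ci = np.linalg.inv(C_eps)
    u = u0.copy()
    for k in range(k_max):
        Gk = jac(u)                       # first-order Taylor term at iteration k
        r = z - g(u)                      # residual at the current estimate
        du = np.linalg.solve(Gk.T @ Ci @ Gk, Gk.T @ Ci @ r)
        u = u + du
        if np.linalg.norm(du) < tol:      # sequence of estimates has converged
            break
    return u

# Toy usage: scalar input, three nonlinear measurements (hypothetical)
g = lambda u: np.array([u[0]**2, np.sin(u[0]), u[0]])
jac = lambda u: np.array([[2*u[0]], [np.cos(u[0])], [1.0]])
z = g(np.array([1.3])) + 0.01
print(gauss_newton(g, jac, z, 0.01*np.eye(3), u0=np.array([0.5])))
```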

  6. Bayesian Estimation -- Linear Multivariate Normal Problems

  Bayesian estimation focuses on the conditional PDF f_y|z(y|z). This PDF conveys all the information about the uncertain state y contained in the measurement vector z.

  Derivation of f_y|z(y|z) (which is multivariate normal) is straightforward when y and z are jointly normal. This requirement is met in our groundwater example if we assume that:
  • f_u(u) is multivariate normal with specified mean ū and covariance C_uu
  • f_ε(ε) is multivariate normal with a zero mean and covariance C_εε
  • u and ε are independent
  • The state and measurement equations are linear and the measurement error is additive, so y = D u and z = M y + ε = M D u + ε = G u + ε

  In this case, f_y|z(y|z) is completely defined by its conditional mean and covariance, which can be derived from the general expression for a conditional multivariate normal PDF:

    E[y|z] = ȳ + C_yz C_zz^{-1} [z - z̄]
    C_yy|z = C_yy - C_yz C_zz^{-1} C_zy

  These expressions are equivalent to those obtained from kriging with a known mean and from optimal interpolation, when comparable assumptions are made.
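A sketch of the conditional mean and covariance computation for the linear multivariate normal case, using the same kind of hypothetical D, M, G matrices as the earlier sketch:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 5, 12
D = rng.standard_normal((n, n))           # y = D u
M = rng.standard_normal((m, n))           # z = M y + eps
G = M @ D
C_eps = 0.1 * np.eye(m)

u_bar = np.zeros(n)                       # unconditional mean of u
C_uu = np.eye(n)                          # unconditional covariance of u

# Unconditional moments implied by y = Du, z = Gu + eps (u, eps independent)
y_bar, z_bar = D @ u_bar, G @ u_bar
C_yy = D @ C_uu @ D.T
C_yz = D @ C_uu @ G.T
C_zz = G @ C_uu @ G.T + C_eps

# Conditional mean and covariance of y given a measurement vector z
z = G @ rng.standard_normal(n) + rng.multivariate_normal(np.zeros(m), C_eps)
K = C_yz @ np.linalg.inv(C_zz)            # "gain" matrix
y_mean_cond = y_bar + K @ (z - z_bar)     # E[y|z]
C_yy_cond = C_yy - K @ C_yz.T             # conditional covariance

print(y_mean_cond)
print(np.sqrt(np.diag(C_yy_cond)))        # conditional std. dev. at each node
```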

  7. Derivation of the Unconditional Mean and Covariance -- Linear Multivariate Normal Problems

  The groundwater model enters the Bayesian estimation equations through the unconditional means ȳ and z̄ and the unconditional covariances C_yz and C_zz. These can be derived from the linear state and measurement equations and the specified covariances C_uu and C_εε.

  The conditional mean estimate obtained from these expressions can be shown to approach the least-squares estimate when C_uu grows without bound (i.e. when the prior provides essentially no information about u).

  An approach similar to the one outlined above can be used to derive the conditional mean and covariance of the uncertain input u.
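Written out, the unconditional moments implied by the linear equations y = Du and z = Gu + ε, with u and ε independent, are:

```latex
\bar{y} = D\bar{u}, \qquad \bar{z} = G\bar{u}
C_{yy} = D\,C_{uu}\,D^{T}
C_{yz} = E\big[(y-\bar{y})(z-\bar{z})^{T}\big] = D\,C_{uu}\,G^{T}
C_{zz} = E\big[(z-\bar{z})(z-\bar{z})^{T}\big] = G\,C_{uu}\,G^{T} + C_{\varepsilon\varepsilon}
```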

  8. Interpreting Bayesian Estimation Results

  The conditional PDFs produced in the linear multivariate normal case are not particularly informative in themselves. In practice, it is more useful to examine spatial plots of scalar properties of these PDFs, such as the mean and standard deviation, or plots of the marginal conditional PDFs at particular locations.

  [Figures: contours of the conditional mean E[y|z]; contours of the conditional standard deviation σ_y|z; marginal conditional PDF f_y|z(y|z) of y at node 14.]

  The conditional mean is generally used as a point estimate of y, while the conditional standard deviation provides a measure of confidence in this estimate. Note that the conditional standard deviation decreases near well locations, reflecting the local information provided by the head measurements.

  9. Bayesian Estimation -- Nonlinear Problems

  When the state and/or measurement vectors are nonlinear functions of the input, the variables y and z are generally not multivariate normal, even if u and ε are normal. In this case, it is difficult to derive the conditional PDF f_y|z(y|z) directly. An alternative is to work with f_u|z(u|z), the conditional PDF of u. Once f_u|z(u|z) is computed, it may be possible to use it to derive f_y|z(y|z) or some of its properties.

  The PDF f_u|z(u|z) can be obtained from Bayes theorem:

    f_u|z(u|z) = f_z|u(z|u) f_u(u) / f_z(z)

  We suppose that f_u(u) and f_ε(ε) are given (e.g. multivariate normal). If the measurement error is additive but the transformations y = d(u) and w = m(y) are nonlinear, then:

    z = m(d(u)) + ε

  and the PDF f_z|u(z|u) is:

    f_z|u(z|u) = f_ε(z - m(d(u)))

  In this case, we have all the information required to apply Bayes theorem.

  10. Obtaining Practical Bayesian Estimates -- The Conditional Mode

  For problems of realistic size the conditional PDF f_u|z(u|z) is difficult to derive in closed form and is too large to store in numerical form. Even when this PDF can be computed, it is difficult to interpret. Usually spatial plots of scalar PDF properties provide the best characterization of the system's inputs and states.

  In the nonlinear case it is difficult to derive exact expressions for the conditional mean and standard deviation or for the marginal conditional densities. However, it is possible to estimate the conditional mode (maximum) of f_u|z(u|z).

  [Figure: conditional PDF f_u|z(u|z) of u given z for a scalar (single-input) problem, with the mode (peak) marked.]

  11. Deriving the Conditional Mode

  The conditional mode is derived by noting that the maximum (with respect to u) of the PDF f_u|z(u|z) occurs at the same u as the minimum of -ln[f_u|z(u|z)] (since -ln[·] is a monotonically decreasing function of its argument). From Bayes theorem we have (for additive measurement error):

    -ln f_u|z(u|z) = -ln f_ε(z - m(d(u))) - ln f_u(u) + terms that do not depend on u

  If ε and u are multivariate normal this expression may be written as:

    J_B(u) = ½ [z - m(d(u))]^T C_εε^{-1} [z - m(d(u))] + ½ [u - ū]^T C_uu^{-1} [u - ū] + terms that do not depend on u

  The estimated mode of f_u|z(u|z) is the value of u (represented by û) which minimizes J_B. Note that J_B is an extended form of the least-squares error measure J_LS used in nonlinear regression. The second term, which penalizes departures from the prior mean ū, is sometimes called a regularization term.

  û is found with an iterative search similar to the one used to solve the nonlinear regression problem. This search usually converges better than the regression search because the regularization term tends to give a better-defined minimum.
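A direct transcription of J_B into code; the functions d and m, the covariance inverses, and the prior mean are assumed to be supplied by the user (hypothetical names):

```python
import numpy as np

def J_B(u, z, d, m, C_eps_inv, u_bar, C_uu_inv):
    """Negative log posterior (up to a constant) for additive Gaussian
    measurement error and a Gaussian prior on u:

        J_B(u) = 1/2 [z - m(d(u))]^T C_eps^-1 [z - m(d(u))]
               + 1/2 [u - u_bar]^T C_uu^-1 [u - u_bar]

    The first term is the least-squares misfit J_LS; the second is the
    regularization term that pulls the estimate toward the prior mean."""
    r = z - m(d(u))
    dev = u - u_bar
    return 0.5 * r @ C_eps_inv @ r + 0.5 * dev @ C_uu_inv @ dev
```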

  12. Iterative Solution of Nonlinear Bayesian Minimization Problems

  In spatially distributed problems where the dimension of u is large, a gradient-based search is the preferred method for minimizing J_B. The search is carried out iteratively, with the new estimate (at the end of iteration k) computed from the old estimate (at the end of iteration k-1) and the gradient of J_B evaluated at the old estimate:

    û_k = û_{k-1} - α_k [∂J_B/∂u]_{k-1}

  where α_k is the step size on iteration k.

  [Figure: contours of J_B for a problem with two uncertain inputs u_1 and u_2, with the search steps shown in red.]

  Conventional numerical computation of ∂J_B/∂u using, for example, a finite-difference technique is very time-consuming, requiring order-n model runs per iteration, where n is the dimension of u. Variational (adjoint) methods can greatly reduce the effort needed to compute ∂J_B/∂u.
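A sketch of this gradient search with a finite-difference gradient, to make the order-n cost concrete; function and parameter names are hypothetical, and a production search would use a line search or quasi-Newton update rather than a fixed step:

```python
import numpy as np

def grad_fd(J, u, h=1e-6):
    """Finite-difference gradient of J: needs order-n evaluations of J,
    i.e. order-n model runs per iteration -- the cost the adjoint
    method avoids."""
    g = np.zeros_like(u)
    for i in range(u.size):
        e = np.zeros_like(u)
        e[i] = h
        g[i] = (J(u + e) - J(u - e)) / (2 * h)
    return g

def minimize_JB(J, u0, step=0.1, k_max=200, tol=1e-8):
    """Plain gradient descent on J_B (fixed step size, for illustration)."""
    u = u0.copy()
    for k in range(k_max):
        u_new = u - step * grad_fd(J, u)  # u_k = u_{k-1} - step * dJ_B/du
        if np.linalg.norm(u_new - u) < tol:
            return u_new
        u = u_new
    return u
```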

  13. Variational (Adjoint) Methods for Deriving Search Gradients - 1

  Variational methods obtain the search gradient ∂J_B/∂u indirectly, from the first variation of a modified form of J_B. These methods treat the state equation as an equality constraint, which is adjoined to J_B with a Lagrange multiplier (or adjoint) vector λ.

  To illustrate, consider a static interpolation problem with nonlinear state and measurement equations and an additive measurement error:

    y = d(u),   z = m(y) + ε

  When the state equation is adjoined, the part of J_B that depends on u becomes a function of u, y, and λ. At a local minimum the first variation of this adjoined objective must equal zero: it is a sum of a bracketed term multiplying δy and a bracketed term multiplying δu. If λ is selected to ensure that the first bracketed term is zero, then the second bracketed term is the desired gradient ∂J_B/∂u.
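One consistent way to write out the adjoined objective and its first variation for this static problem (a sketch; sign conventions vary between presentations):

```latex
% Adjoined objective (the part of J_B that depends on u):
J_B'(u,y,\lambda) = \tfrac{1}{2}[z-m(y)]^{T} C_{\varepsilon\varepsilon}^{-1}[z-m(y)]
                  + \tfrac{1}{2}[u-\bar{u}]^{T} C_{uu}^{-1}[u-\bar{u}]
                  + \lambda^{T}\,[\,y - d(u)\,]

% First variation:
\delta J_B' = \Big[\lambda - \big(\tfrac{\partial m}{\partial y}\big)^{T}
              C_{\varepsilon\varepsilon}^{-1}\,(z-m(y))\Big]^{T}\delta y
            + \Big[C_{uu}^{-1}(u-\bar{u})
              - \big(\tfrac{\partial d}{\partial u}\big)^{T}\lambda\Big]^{T}\delta u

% Choosing \lambda = (\partial m/\partial y)^{T} C_{\varepsilon\varepsilon}^{-1}(z-m(y))
% zeroes the \delta y bracket, leaving the desired gradient:
\frac{\partial J_B}{\partial u} = C_{uu}^{-1}(u-\bar{u})
                                - \big(\tfrac{\partial d}{\partial u}\big)^{T}\lambda
```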

  14. Variational (Adjoint) Methods for Deriving Search Gradients - 2

  The variational approach for computing ∂J_B/∂u on iteration k of the search can be summarized as follows:
  1. Compute the state ŷ_{k-1} = d(û_{k-1}) using the input estimate from iteration k-1.
  2. Compute the adjoint λ_{k-1} from the new state.
  3. Compute the gradient ∂J_B/∂u at û_{k-1}.
  4. Compute the new input estimate û_k.

  Here the subscripts k-1 on the partial derivatives ∂m/∂y and ∂d/∂u indicate that they are evaluated at ŷ_{k-1} and û_{k-1}, respectively.

  There are many versions of this static variational algorithm, depending on the form used to write the state equation; all give the same final result. In particular, all require only one solution of the state equation, together with inversions of the covariance matrices C_εε and C_uu. When these matrices are diagonal (implying uncorrelated input and measurement errors) the inversions are straightforward. When correlation is included they can be computationally demanding.
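The four steps above, sketched in code. The model functions d and m and the Jacobian callables dm_dy and dd_du are hypothetical user-supplied arguments:

```python
import numpy as np

def grad_adjoint(u, z, d, m, dm_dy, dd_du, C_eps_inv, u_bar, C_uu_inv):
    """Adjoint computation of dJ_B/du for the static problem: one state
    solve plus one adjoint solve, instead of order-n model runs."""
    y = d(u)                                   # 1. state from iteration k-1 estimate
    lam = dm_dy(y).T @ C_eps_inv @ (z - m(y))  # 2. adjoint from the new state
    return C_uu_inv @ (u - u_bar) - dd_du(u).T @ lam   # 3. gradient at u

# 4. new input estimate (one search step; step size is hypothetical):
# u_new = u - step * grad_adjoint(u, z, d, m, dm_dy, dd_du,
#                                 C_eps_inv, u_bar, C_uu_inv)
```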

  15. SGP97 Experiment -- Soil Moisture Campaign

  [Figure: map of the case study area, with aircraft microwave measurement coverage.]

  16. Test of Variational Smoothing Algorithm -- SGP97 Soil Moisture Problem

  Observing System Simulation Experiment (OSSE):

  [Figure: OSSE flow diagram. Mean land-atmosphere boundary fluxes, soil properties and land use, and mean initial conditions, perturbed by random input error and random initial-condition error, drive a land surface model that produces the "true" soil and canopy moisture and temperature. A radiative transfer model converts these to "true" radiobrightness; random measurement error is added to give the "measured" radiobrightness. The variational algorithm, given the soil properties and land use, mean fluxes and initial conditions, and error covariances, produces estimated radiobrightness and soil moisture, which are compared against the truth to yield the estimation error.]

  17. Synthetic Experiment (OSSE) Based on SGP97 Field Campaign

  The synthetic experiment uses real soil, landcover, and precipitation data from SGP97 (Oklahoma). Radiobrightness measurements are generated from our land surface and radiative transfer models, with space/time-correlated model error (process noise) and measurement error added.

  [Figure: SGP97 study area, showing the principal inputs to the data assimilation algorithm.]

  18. Effects of Precipitation Information

  The variational algorithm performs well even without precipitation information. In this case, soil moisture is inferred only from the microwave measurements.

  19. Summary

  The Bayesian estimation approach outlined above is frequently used to solve static data assimilation (or interpolation) problems. It has the following notable features:

  • When the state and measurement equations are linear and the inputs and measurement errors are normally distributed, the conditional PDFs f_y|z(y|z) and f_u|z(u|z) are multivariate normal. In this case the Bayesian conditional mean and conditional mode approaches give the same point estimate (i.e. the conditional mode is equal to the conditional mean).

  • When the problem is nonlinear, the Bayesian conditional mean and mode estimates are generally different. The conditional mean estimate is generally not practical to compute for nonlinear problems of realistic size.

  • The least-squares approach is generally less likely than the Bayesian approach to converge to a reasonable answer for nonlinear problems, since it does not benefit from the "regularization" imparted by the second term in J_B.

  • The variational (adjoint) approach greatly improves the computational efficiency of the Bayesian conditional mode estimation algorithm, especially for large problems.

  • The input-oriented variational approach discussed here is a 3DVAR data assimilation algorithm. The name reflects the fact that 3DVAR is used for problems with variability in three spatial dimensions but not in time. 4DVAR data assimilation methods extend the concepts discussed here to time-dependent (dynamic) problems.
