Introduction to data assimilation: Lecture 3

Saroja Polavarapu, Meteorological Research Division, Environment Canada. PIMS Institute, Victoria, 14-18 July 2008

OUTLINE • Covariance modelling – 2,3 • 4D-Variational assimilation • Nonlinear dynamics • Constrained variational data assimilation

Covariance Modelling • Innovations method • NMC-method • Ensemble method

2. NMC-method • Need global statistics • N. American radiosonde network is only 4000 km in extent defining only up to wavenumber 10. Vertical and horizontal resolution is too coarse. • A posteriori justification: compare resulting statistics with those obtained using other methods

The NMC-method [Timeline figure labels: -48, -24, 0] • Compares 24-h and 48-h forecasts valid at same time • Provides global, multivariate correlations with full vertical and spectral resolution • Not used for variances • Assumes forecast differences approximate forecast error. Why 24 – 48? • Starting from a 24-h forecast avoids “spin-up” problems • 24-h period is short enough to claim similarity with 0-6 h forecast error. Final difference is scaled by an empirical factor • 24-h period is long enough that the forecasts are dissimilar despite lack of data to update the starting analysis • 0-6 h forecast differences reflect assumptions made in OI background error covariances
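The NMC calculation described above can be sketched as follows. This is a minimal illustration on synthetic data, not an operational implementation: the arrays `f48` and `f24`, the helper names, and the value of `scale` (standing in for the empirical factor) are all assumptions made for the example.

```python
import numpy as np

def nmc_covariance(f48, f24, scale=1.0):
    """Covariance of 48-h minus 24-h forecast differences.

    f48, f24: arrays of shape (n_cases, n_state), forecasts valid at the
    same times. `scale` is a placeholder for the empirical scaling factor.
    """
    d = f48 - f24                      # forecast differences, one row per case
    d = d - d.mean(axis=0)             # remove the mean difference
    return scale * (d.T @ d) / (d.shape[0] - 1)

def correlation(cov):
    """Convert a covariance matrix to a correlation matrix."""
    std = np.sqrt(np.diag(cov))
    return cov / np.outer(std, std)

# Synthetic forecasts: a common "truth" plus lead-time dependent errors.
rng = np.random.default_rng(0)
n_cases, n_state = 500, 4
truth = rng.standard_normal((n_cases, n_state))
f24 = truth + 0.5 * rng.standard_normal((n_cases, n_state))
f48 = truth + 1.0 * rng.standard_normal((n_cases, n_state))

B_nmc = nmc_covariance(f48, f24, scale=0.5)
C_nmc = correlation(B_nmc)             # only the correlations would be retained
```

Since the slide notes that the method is not used for variances, only `C_nmc` would be carried forward in this sketch.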

A posteriori justification: compare NMC results to innovation-method results [Figure: horizontal correlation length scale. Panels: Innovations, NMC. Sources: Rabier et al. (1998), Hollingsworth and Lönnberg (1986)]

• Different vertical correlation lengths for different wavenumbers • Different horizontal correlation lengths for different vertical levels [Figures from Rabier et al. (1998)]

Properties of the NMC-method Bouttier (1994) • For linear H, no model error, and a 6-h forecast difference, can compare the NMC-calculated P to what the Kalman Filter suggests. • NMC-method breaks down if there is no data between the launch of the 2 forecasts. With no data, P is under-estimated • For dense, good quality, horizontally uncorrelated obs, P is over-estimated • For obs at every gridpoint, where obs and background error variances are equal, the NMC-method P estimate is equivalent to that from the KF.

NMC-method usage *Later replaced by ensemble-based methods

3. Ensemble-based methods of covariance estimation • These methods attempt to simulate the error of actual assimilation systems by perturbing obs and background states with specified errors and computing the ensemble spread • Generate ensemble of N background states. Belo Pereira and Berre (2006)
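The perturb-and-measure-spread idea can be sketched on a toy linear system. Everything specific here is an assumption for illustration: the forecast model `M`, the observation operator `H`, the covariances `B` and `R`, and the use of a simple optimal-interpolation style gain for each member.

```python
import numpy as np

rng = np.random.default_rng(1)
n, n_members = 3, 2000
M = np.array([[0.9, 0.1, 0.0],
              [0.0, 0.9, 0.1],
              [0.1, 0.0, 0.9]])        # hypothetical linear forecast model
H = np.eye(n)[:2]                      # observe the first two variables
B = np.eye(n)                          # specified background error covariance
R = 0.5 * np.eye(2)                    # specified observation error covariance
K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)   # gain shared by all members

xb = np.zeros(n)                       # unperturbed background
z = np.array([1.0, -1.0])              # unperturbed observations

forecasts = np.empty((n_members, n))
for k in range(n_members):
    xb_k = xb + rng.multivariate_normal(np.zeros(n), B)   # perturbed background
    z_k = z + rng.multivariate_normal(np.zeros(2), R)     # perturbed obs
    xa_k = xb_k + K @ (z_k - H @ xb_k)                    # member analysis
    forecasts[k] = M @ xa_k                               # member forecast

# Ensemble spread of the forecasts, used as the covariance estimate.
P_ens = np.cov(forecasts, rowvar=False)
```

In this linear toy case the spread should approach M A M^T, with A the analysis error covariance, as the ensemble grows. A real system would cycle this over many assimilation steps.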

Comparing NMC and ensemble-based method results • Horizontal correlation length scales are longer with NMC method [Figure panels: temperature, vorticity] Belo Pereira and Berre (2006)

Vertical correlations of temperature background error (at level 21, ~500 hPa) • Vertical correlations are too deep with NMC method [Figure panels: NMC method, Ensemble method] Belo Pereira and Berre (2006)

Background error standard deviations [Figure panels: ψ at 250 hPa, T at 500 hPa] • Specified NMC STD are independent of longitude • Ensemble-based STD show reduced error in data dense regions • Time-averaged background errors from an actual EnsKF are used as reference. Buehner (2005)

Ensemble-method usage

2. Four-Dimensional variational data assimilation

Extension to the time dimension 3D DA schemes make sense when all obs are taken at the same time (e.g. radiosondes). But they don’t take full advantage of measurements which have high temporal resolution (satellite obs, profilers, aircraft, etc.).

4D-Variational assimilation [Figure labels: analysis trajectory, background trajectory]

The benefit of temporal information • 4D-Var experiment with obs at every time step at only 1 of 128 grid points • Initial guess field misplaces front • Dotted red line is 3D-Var solution • With time series of obs from 1 station only, the frontal position is corrected
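The cost function for this slide did not survive extraction. For orientation, the standard strong-constraint 4D-Var cost function is commonly written as below. The notation is an assumption chosen to match the rest of these slides (z for observations, B and R for background and observation error covariances, H for the observation operator, M for the model).

```latex
J(\mathbf{x}_0) = \tfrac{1}{2}(\mathbf{x}_0-\mathbf{x}_b)^{T}\mathbf{B}^{-1}(\mathbf{x}_0-\mathbf{x}_b)
 + \tfrac{1}{2}\sum_{k=0}^{N}\bigl(\mathbf{z}_k-H_k(\mathbf{x}_k)\bigr)^{T}\mathbf{R}_k^{-1}\bigl(\mathbf{z}_k-H_k(\mathbf{x}_k)\bigr),
 \qquad \mathbf{x}_k = M_{0\rightarrow k}(\mathbf{x}_0)
```

The control variable is the initial state x_0 only, so the analysis trajectory is a model solution over the whole window.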

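4D-Var algorithm • Run model with initial conditions x_0^i from t_0 to t_N • Compute the cost function J(x_0^i) • Compute the gradient of J with respect to x_0^i • Find step size: ρ_i • Modify initial state [update formula not recovered] [Figure labels: analysis trajectory, background trajectory]

[Figure labels: TLM (tangent linear model), ADJ (adjoint model)]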
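The iteration above can be sketched on a toy problem. This is an illustration under stated assumptions, not the operational algorithm: the model is a hypothetical 2-variable linear map `M`, the gradient comes from a hand-written backward (adjoint) sweep, and the update is plain steepest descent with a backtracking step size, which is simpler than the quasi-Newton method used in practice.

```python
import numpy as np

theta = 0.3
M = 0.98 * np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])  # hypothetical linear model
H = np.array([[1.0, 0.0]])          # observe the first component only
Binv = np.eye(2)                    # inverse background error covariance
Rinv = np.eye(1) / 0.1              # inverse observation error covariance
N = 10                              # time steps in the assimilation window

rng = np.random.default_rng(2)
x_true0 = np.array([1.0, 0.5])
xb = x_true0 + np.array([0.8, -0.6])            # background at t0
obs, x = [], x_true0.copy()
for k in range(N + 1):                          # synthetic obs at every step
    obs.append(H @ x + 0.05 * rng.standard_normal(1))
    x = M @ x

def cost_and_grad(x0):
    traj = [x0]
    for k in range(N):                          # forward model run, t0 to tN
        traj.append(M @ traj[-1])
    J = 0.5 * (x0 - xb) @ Binv @ (x0 - xb)
    forcing = []
    for k in range(N + 1):
        d = H @ traj[k] - obs[k]                # misfit to obs at time k
        J += 0.5 * d @ Rinv @ d
        forcing.append(H.T @ Rinv @ d)
    adj = np.zeros(2)
    for k in range(N, -1, -1):                  # backward adjoint sweep
        adj = forcing[k] + M.T @ adj
    return J, Binv @ (x0 - xb) + adj

x0 = xb.copy()
J, g = cost_and_grad(x0)
J_start = J
for i in range(200):
    rho = 1.0                                   # step size, found by backtracking
    while True:
        J_new, g_new = cost_and_grad(x0 - rho * g)
        if J_new <= J - 1e-4 * rho * (g @ g) or rho < 1e-12:
            break
        rho *= 0.5
    x0, J, g = x0 - rho * g, J_new, g_new       # modify the initial state
    if np.linalg.norm(g) < 1e-6:
        break
```

Each pass through the outer loop costs one forward run plus one adjoint run per trial step, which is why the iteration count dominates the expense of 4D-Var.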

Minimization algorithm • M1QN3 (Gilbert & Lemaréchal 1989) • limited memory quasi-Newton technique (the L-BFGS method of J. Nocedal) • designed for very large scale problems • http://www-rocq.inria.fr/estime/modulopt/optimization-routines/m1qn3/m1qn3.html [Figure: minimization of a quadratic cost function J(x)] The gradient of the cost function and the cost function itself are supplied to a minimization algorithm, which determines how to change x to get a lower cost.

4D-Var as described • Assumes NWP model is perfect • Complex nonlinear relationships between analysis variables are permitted • Aids in reducing underdeterminacy problem • Needs TLM and ADJ models for NWP model • DA scheme now intimately tied to NWP model • Is expensive • Adjoint model about 1-2 times CPU of NWP model. One iteration=NWP run + adj run. Typically 50 iterations.
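Because 4D-Var needs TLM and ADJ versions of the model, a common sanity check is to verify that the two are consistent. The sketch below does this for one forward-Euler step of the Lorenz (1963) model. The parameter values, the time step, and the function names are assumptions for illustration; the TLM and adjoint are hand-derived from the Jacobian of this toy step only.

```python
import numpy as np

SIGMA, RHO, BETA, DT = 10.0, 28.0, 8.0 / 3.0, 0.01   # assumed values

def model(x):
    """One forward-Euler step of the Lorenz (1963) model (nonlinear)."""
    f = np.array([SIGMA * (x[1] - x[0]),
                  x[0] * (RHO - x[2]) - x[1],
                  x[0] * x[1] - BETA * x[2]])
    return x + DT * f

def tlm(x, dx):
    """Tangent linear model: Jacobian of `model` at x applied to dx."""
    df = np.array([SIGMA * (dx[1] - dx[0]),
                   dx[0] * (RHO - x[2]) - x[0] * dx[2] - dx[1],
                   dx[0] * x[1] + x[0] * dx[1] - BETA * dx[2]])
    return dx + DT * df

def adj(x, y):
    """Adjoint model: transpose of the Jacobian at x applied to y."""
    a = np.array([-SIGMA * y[0] + (RHO - x[2]) * y[1] + x[1] * y[2],
                  SIGMA * y[0] - y[1] + x[0] * y[2],
                  -x[0] * y[1] - BETA * y[2]])
    return y + DT * a

x = np.array([1.0, 2.0, 20.0])
dx = np.array([0.3, -0.2, 0.5])
y = np.array([-1.0, 0.4, 0.7])

# TLM test: compare with a finite difference of the nonlinear model.
eps = 1e-6
fd = (model(x + eps * dx) - model(x)) / eps
tlm_error = np.max(np.abs(fd - tlm(x, dx)))

# Adjoint test: <L dx, y> must equal <dx, L^T y> to rounding error.
lhs = np.dot(tlm(x, dx), y)
rhs = np.dot(dx, adj(x, y))
```

For a full NWP model the same two checks are applied, with the TLM and adjoint coded line by line from the model source.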

[Equation annotations; equation not recovered] • Circled term is 1 column of B matrix, i.e. a vector • LHS is a vector • Term in ( ) is a scalar • Predictability error

Geopotential height analysis increments at the end of a 24-h assimilation period due to 1 obs • 4D-Var single obs experiments show: • The shape of analysis increments depends on location of obs • The spreading of information is flow dependent [Figure panels at 500 hPa and 1000 hPa. 4D-Var: 1 height obs at (42N,170.6E,850 hPa), changes shape with height. 3D-Var: 1 height obs at (42N,180E,500 hPa), no change of shape with height] Thépaut et al. (1996)

Why does 4D-Var beat 3D-Var? 3D-Var: • Treats obs as if valid at 00, 06, 12 or 18Z • Uses temporally continuous obs only close to synoptic times • Uses static error covariances. 4D-Var: • Uses obs at their actual time of measurement • Uses all temporally continuous obs available within window • Evolves error covariances in time

3. Complications due to nonlinear dynamics

Highly nonlinear dynamics • Lorenz (1963) equations [equations and parameter values not recovered]

If assimilation window is too long, 4D-Var fails [Figure: t=7] Miller et al. (1994)
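The equations on this slide were lost in extraction. For reference, the Lorenz (1963) system in its standard form is given below. The parameter values shown are the commonly used chaotic setting and are an assumption here, since the values on the original slide are not recoverable.

```latex
\frac{dx}{dt} = \sigma (y - x), \qquad
\frac{dy}{dt} = \rho x - y - x z, \qquad
\frac{dz}{dt} = x y - \beta z,
\qquad \text{e.g. } \sigma = 10,\ \rho = 28,\ \beta = 8/3 .
```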

Length of 4D-Var assimilation window [Figure panels: t=8, t=10, t=15] • The longer the assimilation window, the greater the number of local minima in the cost function. Miller et al. (1994)
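One way to explore this numerically is to scan the cost function along a single direction in initial-condition space for two window lengths. The sketch below is an illustration only, not a reproduction of Miller et al. (1994). It assumes the standard Lorenz parameters, perfect observations of all three variables every 25 steps, no background term, and window lengths of 1 and 8 time units.

```python
import numpy as np

SIGMA, RHO, BETA, DT = 10.0, 28.0, 8.0 / 3.0, 0.01   # assumed values

def rhs(s):
    x, y, z = s
    return np.array([SIGMA * (y - x), x * (RHO - z) - y, x * y - BETA * z])

def trajectory(s0, n_steps):
    """Integrate with 4th-order Runge-Kutta and return all states."""
    s = np.asarray(s0, dtype=float)
    out = np.empty((n_steps + 1, 3))
    out[0] = s
    for k in range(n_steps):
        k1 = rhs(s)
        k2 = rhs(s + 0.5 * DT * k1)
        k3 = rhs(s + 0.5 * DT * k2)
        k4 = rhs(s + DT * k3)
        s = s + DT * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        out[k + 1] = s
    return out

offsets = 0.15 * np.arange(-20, 21)              # perturbations of x at t0
s_true = trajectory([1.0, 1.0, 1.0], 1000)[-1]   # spin up onto the attractor

def cost_curve(n_steps, obs_every=25):
    """Observation misfit as a function of the initial x perturbation."""
    ref = trajectory(s_true, n_steps)[::obs_every]        # perfect obs
    J = []
    for d in offsets:
        run = trajectory(s_true + np.array([d, 0.0, 0.0]), n_steps)[::obs_every]
        J.append(0.5 * np.sum((run - ref) ** 2))
    return np.array(J)

def count_local_minima(J):
    return int(np.sum((J[1:-1] < J[:-2]) & (J[1:-1] < J[2:])))

J_short = cost_curve(100)     # window of 1 time unit
J_long = cost_curve(800)      # window of 8 time units
print(count_local_minima(J_short), count_local_minima(J_long))
```

Plotting `J_short` and `J_long` against `offsets` gives a one-dimensional slice of the cost function. One would expect the longer window to give the more irregular curve, though the exact counts depend on the starting state and sampling.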

Optimal assimilation period • Examine ability to “fill in” small scales through downscale energy cascade • Barotropic vorticity equation • Perfect model, observations • Initial guess for trajectory is completely decorrelated from truth • Nonlinear time scale is T_NL=9 [Figure panels: ~3 days, ~12 days] Tanguay et al. (1995)

Obs at large scales only • Downscale transfer of information to unobserved scales • Upscale propagation of error to observed scales [Figure panels: ~1.5 days, ~3 days, ~6 days, ~9 days] Tanguay et al. (1995)

Incremental Approach Courtier et al. (1994) • TLM will be valid for large scales but not for some smaller scales • So, solve for analysis increments at lower resolution. Write 4D-Var cost in terms of increments (departures from background). • Use of lower resolution filters scales and processes not well forecast by TLM • Forecast model in cost function is then the TLM • Cost function is purely quadratic • Use of lower resolution reduces cost of 4D-Var • Compute the innovation (z-H(x)) at full resolution • Solve a series of (2-3) 4D-Var problems, updating the background between each one
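The incremental cost function is not written out on the slide. A commonly used form, given here as a sketch with assumed notation and with the change of resolution omitted for simplicity, is:

```latex
J(\delta\mathbf{x}_0) = \tfrac{1}{2}\,\delta\mathbf{x}_0^{T}\mathbf{B}^{-1}\delta\mathbf{x}_0
 + \tfrac{1}{2}\sum_{k=0}^{N}\bigl(\mathbf{d}_k-\mathbf{H}_k\mathbf{M}_{0\rightarrow k}\,\delta\mathbf{x}_0\bigr)^{T}\mathbf{R}_k^{-1}\bigl(\mathbf{d}_k-\mathbf{H}_k\mathbf{M}_{0\rightarrow k}\,\delta\mathbf{x}_0\bigr),
 \qquad \mathbf{d}_k = \mathbf{z}_k - H_k\bigl(\mathbf{x}^{b}_k\bigr)
```

Here the bold H_k and M are the linearized observation operator and the TLM, and the innovations d_k use the full nonlinear model and operator. Because δx_0 enters only linearly inside each term, J is exactly quadratic in the increment.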

4. Constrained variational data assimilation

Does 4D-Var inherently produce balanced analyses? • 4D-Var tries to find the model state which best fits the observations in a time window • The model contains many modes at its disposal, for use in fitting observations: Rossby waves, gravity waves, … • If the obs contain high frequency signals (which they will), the model will use as many gravity waves as needed to fit the obs

Strong Constraints • Minimize J(x0) subject to the constraints: [equation not recovered] • Necessary and sufficient conditions for x0 to be a minimum are: [equations not recovered; annotated terms: projection onto constraint tangent, Hessian of constraints] Gill, Murray, Wright (1981)

Weak Constraints • Penalty Methods: Minimize [equation not recovered] [Figure panels: large α, small α]

4DVAR with NNMI: weak constraint. Courtier and Talagrand (1990). 4DVAR with NNMI: strong constraint. “…owing to the iterative and approximate character of the initialization algorithm, the condition || dG/dt || = 0 cannot in practice be enforced as an exact constraint.” Courtier and Talagrand (1990); Thépaut and Courtier (1991)
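The equations on this slide were lost. As a hedged reconstruction of the textbook result for equality-constrained minimization (notation assumed, following the usual presentation in optimization texts rather than the slide itself), the problem and optimality conditions can be written as:

```latex
\min_{\mathbf{x}_0} J(\mathbf{x}_0) \quad \text{subject to} \quad c_i(\mathbf{x}_0) = 0,\ i = 1,\dots,m
\\[4pt]
\text{(i)}\ \ \mathbf{c}(\mathbf{x}_0) = \mathbf{0}, \qquad
\text{(ii)}\ \ \mathbf{Z}^{T}\nabla J(\mathbf{x}_0) = \mathbf{0}, \qquad
\text{(iii)}\ \ \mathbf{Z}^{T}\Bigl(\nabla^{2}J - \sum_{i}\lambda_i \nabla^{2}c_i\Bigr)\mathbf{Z}\ \text{positive definite}
```

Here the columns of Z span the tangent space of the constraints (the null space of the constraint Jacobian) and the λ_i are Lagrange multipliers. Condition (ii) is the projection of the gradient onto the constraint tangent, and the second term in (iii) involves the Hessians of the constraints. Strictly, positive definiteness in (iii) is sufficient, while semidefiniteness is necessary.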
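The effect of the penalty weight can be seen in a two-variable example. This sketch assumes the usual quadratic penalty form, J(x) + (alpha/2) c(x)^2. The cost `J`, the linear constraint `c(x) = x1 - x2`, and the two weights are hypothetical choices for illustration.

```python
import numpy as np

x_obs = np.array([2.0, 0.0])        # hypothetical unconstrained minimizer of J
a = np.array([1.0, -1.0])           # constraint c(x) = a . x = x1 - x2 = 0

def penalized_minimum(alpha):
    """Minimize 0.5*|x - x_obs|^2 + 0.5*alpha*(a . x)^2 in closed form."""
    # Setting the gradient to zero gives (I + alpha * a a^T) x = x_obs.
    return np.linalg.solve(np.eye(2) + alpha * np.outer(a, a), x_obs)

x_small = penalized_minimum(0.1)    # small weight: constraint weakly enforced
x_large = penalized_minimum(100.0)  # large weight: constraint nearly satisfied

c_small = a @ x_small               # remaining constraint violation
c_large = a @ x_large
```

With a small weight the solution stays close to the unconstrained minimum, and with a large weight it is pulled toward the constraint surface, at the price of a more poorly conditioned minimization.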

Digital Filter Initialization (DFI) Lynch and Huang (1992) [Figure: N=12, Δt=30 min; curves for Tc=6 h and Tc=8 h] Fillion et al. (1995)

4DVAR with DFI: Strong Constraint • Because filter is not perfect, some inversion of intermediate scale noise occurs, but DFI as a strong constraint suppresses small scale noise. Polavarapu et al. (2000). 4DVAR with DFI: Weak Constraint • Introduced by Gustafsson (1993) • Weak constraint can control small scale noise (Polavarapu et al. 2000) • Implemented operationally at Météo-France (Gauthier and Thépaut 2001)
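A low-pass digital filter with the parameters quoted on the slide can be sketched as follows. This is in the spirit of a Lanczos-windowed filter and is not claimed to reproduce the cited implementations exactly. The function name `dfi_weights` and the synthetic test signal (a 48-h wave plus a 2-h oscillation) are assumptions for the example.

```python
import numpy as np

def dfi_weights(N, dt, Tc):
    """Lanczos-windowed low-pass weights h_{-N..N}, normalized to sum to 1."""
    theta_c = 2.0 * np.pi * dt / Tc           # digital cutoff frequency
    n = np.arange(-N, N + 1)
    # np.sinc(x) = sin(pi x)/(pi x), so this equals sin(n*theta_c)/(n*pi).
    h = (theta_c / np.pi) * np.sinc(n * theta_c / np.pi)
    w = np.sinc(n / (N + 1))                  # Lanczos window
    h = h * w
    return h / h.sum()

N, dt, Tc = 12, 0.5, 6.0                      # dt and Tc in hours
h = dfi_weights(N, dt, Tc)

t = dt * np.arange(-N, N + 1)                 # filter span: -6 h to +6 h
slow = np.cos(2.0 * np.pi * t / 48.0)         # slow signal, 48-h period
fast = 0.5 * np.cos(2.0 * np.pi * t / 2.0)    # fast oscillation, 2-h period

filtered = h @ (slow + fast)                  # filtered value at the central time
```

The filtered value at the central time retains the slow signal almost unchanged while the 2-h oscillation is strongly damped, which is the behaviour wanted from an initialization filter.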

Disadvantages of 4D-Var • Model specific (Needs TLM and ADJ) • The U.K. Met Office uses Perturbation Forecast Model and its Adjoint • Assumes NWP model is perfect. • Weak constraint formulations relax this assumption. Already under investigation at ECMWF* (see Trémolet QJ papers) • Expensive. 2-3 x CPU of NWP model per iteration, with ~50 iterations per outer loop • Computing power keeps increasing *European Centre for Medium-Range Weather Forecasts

4D-Var Challenges • Obtaining fast, efficient large-scale optimization routines • Extracting analysis error covariance A^{-1} = B^{-1} + H^T R^{-1} H • Want to know M A M^T to learn about forecast error levels • Cycling 4D-Var (Using evolved covariance at end of one assimilation window to start next assimilation cycle.) • Estimating and incorporating model error

Weather centers using 4D-Var operationally
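For a small system the analysis error covariance formula can be evaluated directly, which is not feasible at NWP dimensions. The matrices below are hypothetical values chosen for illustration. The sketch also checks the result against the equivalent gain form (I - KH)B and propagates it with an assumed linear model `M`.

```python
import numpy as np

B = np.array([[1.0, 0.5],
              [0.5, 1.0]])            # hypothetical background error covariance
R = np.array([[0.25]])                # hypothetical observation error covariance
H = np.array([[1.0, 0.0]])            # observe the first variable only
M = np.array([[1.0, 0.1],
              [0.0, 1.0]])            # hypothetical linear forecast model

# Inverse (information) form: A^{-1} = B^{-1} + H^T R^{-1} H
A = np.linalg.inv(np.linalg.inv(B) + H.T @ np.linalg.inv(R) @ H)

# Equivalent gain form: A = (I - K H) B with the optimal gain K
K = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)
A_gain_form = (np.eye(2) - K @ H) @ B

# Forecast error covariance implied by the analysis error (no model error)
P_forecast = M @ A @ M.T
```

In a real 4D-Var the Hessian of the cost function plays the role of A^{-1}, and it is available only through its action on vectors, which is what makes this extraction a challenge.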

ERA-40 reanalyses: • model, DAS fixed in time • observing system changes with time • Little improvement over 25 years. Operational system: • model, DAS change with time • observing system changes with time • Big improvement in skill in 25 years must be due to model, DAS improvements. Uppala et al. (2005, QJ)

Exciting but missed topics • Ensemble Kalman Filter • Operational at CMC for Ensemble prediction system • Combining variational and Ensemble techniques • WWRP/THORPEX workshop on 4D-Var and Ensemble Kalman Filter Inter-comparisons, Buenos Aires, Argentina, 10-13 Nov. 2008 http://4dvarenkf.cima.fcen.uba.ar/ • Operational ensemble/variational assimilation system at Météo-France on July 1, 2008. Ref: Berre et al. (2007)

Final Summary • The atmospheric data assimilation problem is characterized by huge, nonlinear systems and insufficient observations. • Because the math of the linear estimation problem is well known, the key to progress is using atmospheric physics to make the right approximations • There has been considerable improvement in forecast skill in the past 2.5 decades, partly due to improvements in data assimilation systems.

The End
