An Introduction to Optimal Estimation Theory
Sponsored Links
This presentation is the property of its rightful owner.
1 / 34

An Introduction to Optimal Estimation Theory Chris O´Dell AT652 Fall 2013 PowerPoint PPT Presentation

  • Uploaded on
  • Presentation posted in: General

An Introduction to Optimal Estimation Theory Chris O´Dell AT652 Fall 2013. The Bible of Optimal Estimation : Rodgers, 2000. The Retrieval PROBLEM. DATA = FORWARD MODEL ( State ) + NOISE. OR. State Vector Variables: Temperatures Cloud Variables Precip Quantities / Types

Download Presentation

An Introduction to Optimal Estimation Theory Chris O´Dell AT652 Fall 2013

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript

An Introduction to Optimal Estimation Theory

Chris O´Dell

AT652 Fall 2013

The Bible of Optimal Estimation: Rodgers, 2000

The Retrieval PROBLEM



State Vector Variables:


Cloud Variables

Precip Quantities / Types

Measured Rainfall

Et Cetera


Instrument Noise

Forward Model Error

Sub grid-scale processes




Radar Backscatter

Measured Rainfall

Et Cetera

Forward Model:

Cloud Microphysics


Surface Albedo

Radiative Transfer

Instrument Response

Example: Sample Temperature Retrieval (project 2)

Temperature Profile

Temperature Weighting Functions

  • We must invert this equation to get x, but how to do this:

  • When there are different numbers of unknown variables as measured variables?

  • In the presence of noise?

  • When F(x) is nonlinear (often highly) ?

Retrieval Solution Techniques

  • Direct Solution (requires invertible forward model and same number of measured quantities as unknowns)

  • Multiple Linear Regression

  • Semi-Empirical Fitting

  • Neural Networks

  • Optimal Estimation Theory

Unconstrained Retrieval

Inversion is unstable and solution is essentially amplified noise.

Retrieval with Moderate Constraint

Retrieved profile is a combination of the true and a priori profiles.

Extreme Constraint

Retrieved profile is just the a priori profile, measurements

have no influence on the retrieval.

“Best” x

x from y alone:

x = F-1(y)

Prior x = xa

Mathematical Setup

State Vector Variables:

Temp Profile

(n,1) colvector



(m,1) col vector

Forward Model:

(m,n) matrix

  • Ifweassumethaterrors in eachyaredistributedlike a Gaussian, thenthelikelihoodof a givenxbeingcorrectis proportional to

  • This assumesnopriorconstraint on x.

  • Wecanwritethe2 in a generalizedwayusingmatricesas:

  • This is just a number (a scalar), andreducestothefirstequationwhenSyis diagonal.

Generalized Error Covariance Matrix

  • Variables: y1, y2 , y3 ...

  • 1-sigma errors: 1 , 2 , 3 ...

  • The correlation between y1 and y2 is c12 (between –1 and 1), etc.

  • Then, the Measurement Error Covariance Matrix is given by:

Diagonal Error Covariance Matrix

  • Assumenocorrelations (sometimesistruefor an instrument)

  • Hasvery simple inverse!

Prior Knowledge

  • Prior knowledge about x can be known from many different sources, like other measurements or a weather or climate model prediction or climatology.

  • In order to specify prior knowledge of x, called xa , we must also specify how well we know xa; we must specify the errors on xa .

  • The errors on xa are generally characterized by a Probability Distribution Function (PDF) with as many dimensions as x.

  • For simplicity, people often assume prior errors to be Gaussian; then we simply specify Sa, the error covariance matrix associated with xa .

Example: Temperature Profile Climatology for December over Hilo, Hawaii

P = (1000, 850, 700, 500, 400, 300) mbar

<T> = (22.2, 12.6, 7.6, -7.7, -19.5, -34.1) Celsius

Correlation Matrix:

1.00 0.47 0.29 0.21 0.21 0.16

0.47 1.00 0.09 0.14 0.15 0.11

0.29 0.09 1.00 0.53 0.39 0.24

0.21 0.14 0.53 1.00 0.68 0.40

0.21 0.15 0.39 0.68 1.00 0.64

0.16 0.11 0.24 0.40 0.64 1.00

Covariance Matrix:

2.71 1.42 1.12 0.79 0.82 0.71

1.42 3.42 0.37 0.58 0.68 0.52

1.12 0.37 5.31 2.75 2.18 1.45

0.79 0.58 2.75 5.07 3.67 2.41

0.82 0.68 2.18 3.67 5.81 4.10

0.71 0.52 1.45 2.41 4.10 7.09

A priori error covariance matrix Sa for project 2

  • Asks you to assume a 1-sigma error of 50 K at each level, but these errors are correlated!

  • The correlation falls off exponentially with the distance between two levels, with a decorrelation length l.

  • A couple ways to parameterize this. In the project, use:

The 2 with prior knowledge:

Yes, thatis a scarylookingequation.

But you just codeitup & everything will work!


  • Minimizing the 2 is hard. In general, you can use a look-up table (if you have tabulated values of F(x) ), but if the lookup table approach is not feasible (i.e., it’s too big), then:

  • If the problem is linear (like Project 2), then there is an analytic solution for the x the minimized 2, which we will call

  • If the problem is nonlinear, a minimization technique must be used such as Gauss-Newton iteration, Steepest-Descent, etc.

Linear Forward Model with Prior Constraint

The x the minimizes the cost function is analytic and is given by:

Use for P2!

The associated posterior error covariance matrix, which describes the theoretical errors on the retrieved x, is given by:

Use for P2!

A simple 1D example

You friend measurements the temperature in the room with a simple mercury thermometer.

She estimates the temperature is 20 ± 2 °C.

You have a thermistor which you believe to be pretty accurate. It measures a Voltage V proportional to the temperature T as:

y=V = kT

where k = 3 V/°C, and the measurement error is roughly ± 0.1 V, 1-sigma.

Sy=0.12=0.01 V2


Voltage=1.8± 0.1

Express Measurement Knowledge as a PDF



Prior Knowledge is PDF

Measurement is a PDF

 Posterior is a (narrower) PDF





A Simple Example:

Cloud Optical Thickness and Effective Radius

water cloud forward calculationsfrom Nakajima and King 1990

  • Cloud Optical Thickness and Effective Radius are often derived with a look-up table approach, although errors are not given.

  • The inputs are typically 0.86 micron and 2.06 micron reflectances from the cloud top.

  • x = {reff,  }

  • y = { R0.86 , R2.13 , ... }

  • Forward model must map x to y. Mie Theory, simple cloud droplet size distribution, radiative transfer model.

Simple Solution: The Look-up Table Approach

  • Assume we know .

  • Regardless of noise, find such that

  • But is a vector! Therefore, we minimize the 2 :

  • Can make a look-up table to speed up the search.

is minimized.


xtrue : {reff = 15 m,  = 30 }

Errors determinedbyhowmuchchange in eachparameter (reff ,  ) causesthe2tochangebyoneunit(formally: find theperimeterwhichcontains 68% ofthevolumeoftheposterior PDF)


  • What if the Look-Up Table is too big?

  • What if the errors in y are correlated?

  • How do we account for errors in the forward model?

  • Shouldn’t the output errors be correlated as well?

  • How do we incorporate prior knowledge about x ?

Gauss-Newton Iteration forNonlinear Problems


In general these are updated on every iteration.

Iteration in practice:

  • Not guaranteed to converge.

  • Can be slow, depends on non-linearity of F(x).

  • There are many tricks to make the iteration faster and more accurate.

  • Often, only a few function iterations are necessary.

Error Correlations?

So, again, what is optimal estimation?

Optimal Estimation is a way to infer information about a system, based on observations. It is necessary to be able to simulate the observations, given complete knowledge of the system state.

  • Optimal Estimation can:

  • Combine different observations of different types.

  • Utilize prior knowledge of the system state (climatology, model forecast, etc).

  • Errors are automatically provided, as are error correlations.

  • Estimate the information content of additional measurements.

Applications of Optimal Estimation

  • Retrieval Theory (standard, or using climatology, or using model forecast. Combine radar & satellite. Combine multiple satellites. Etc.)

  • Data Assimilation – optimally combine model forecast with measurements. Not just for weather! Example: carbon source sink models, hydrology models.

  • Channel Selection: can determine information content of additional channels when retrieving certain variables. Example: SIRICE mission is using this technique to select their IR channels to retrieve cirrus IWP.

  • Login