
A Regression Model for Ensemble Forecasts

David Unger

Climate Prediction Center

Summary
  • A linear regression model can be designed specifically for ensemble prediction systems.
  • It is best applied to direct model forecasts of the element in question.
  • Ensemble regression is easy to implement and calibrate.
  • This talk summarizes how it works.
Ensemble Forecasting

The ensemble forecasting approach is based on the following beliefs:

1) Individual solutions represent possible outcomes.

2) Each ensemble member is equally likely to best represent the observation.

3) The ensemble set behaves as a randomly selected sample from the expected distribution of observations.


An individual case: five potential solutions identified.

  • One actual observation (ovals).
  • Four others that “could” happen.
  • Red indicates the best (closest) member.

[Figure labels: “Actual obs”, “Potential Observations”, and four “20% chance” labels.]
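The equal-likelihood belief can be checked with a small simulation. The sketch below is illustrative only: it assumes the members and the observation are independent draws from one standard Gaussian, and the function name `best_member_frequencies` is hypothetical. With five members, each member index should be the closest one in roughly 20% of cases.

```python
import random

def best_member_frequencies(n_members=5, n_cases=20000, seed=0):
    """Fraction of cases in which each member index is closest to the observation."""
    rng = random.Random(seed)
    counts = [0] * n_members
    for _ in range(n_cases):
        # ASSUMPTION: members and observation come from the same N(0, 1) distribution.
        members = [rng.gauss(0.0, 1.0) for _ in range(n_members)]
        obs = rng.gauss(0.0, 1.0)
        best = min(range(n_members), key=lambda i: abs(members[i] - obs))
        counts[best] += 1
    return [c / n_cases for c in counts]

freqs = best_member_frequencies()
print([round(f, 3) for f in freqs])
```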


Ensemble Regression Principal Assumptions
  • Statistics are gathered from the one actual observation.
  • The math is applied under the assumption that each ensemble member could also be a solution.
“Ensemble” Regression

Regression on the best member:
  • The regression equation is the same as for the ensemble mean.
  • Residual errors are much smaller (usually).

What does it mean in English?
  • Derive a regression equation relating the ensemble mean and the observation.
  • Apply this equation to each individual member.
  • Apply an error estimate to each individual regression-corrected forecast.
  • This looks a lot like the “Gaussian kernel” approach (kernel dressing).
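The steps above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the talk's implementation: the function names are hypothetical, the training data are synthetic, and the kernel variance is an assumed choice (the ensemble-mean residual variance minus b² times the mean ensemble variance), picked so that the dressed mixture has, on average, the same variance as the ensemble-mean regression residuals.

```python
import math
import random

def fit_ensemble_regression(ensembles, observations):
    """Fit obs = a + b * ensemble_mean and estimate a kernel standard deviation.

    ensembles: list of cases, each a list of member forecasts.
    observations: one observed value per case.
    """
    n = len(observations)
    means = [sum(m) / len(m) for m in ensembles]
    # Mean within-ensemble variance about the ensemble mean (the "spread").
    spread = sum(
        sum((f - mu) ** 2 for f in m) / len(m) for m, mu in zip(ensembles, means)
    ) / n
    xbar = sum(means) / n
    ybar = sum(observations) / n
    sxx = sum((x - xbar) ** 2 for x in means) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(means, observations)) / n
    syy = sum((y - ybar) ** 2 for y in observations) / n
    b = sxy / sxx
    a = ybar - b * xbar
    resid_var = syy - b * sxy  # residual variance of the ensemble-mean regression
    # ASSUMPTION: kernel variance chosen so the mixture variance matches resid_var.
    kernel_var = max(resid_var - b * b * spread, 0.0)
    return a, b, math.sqrt(kernel_var)

def forecast_pdf(members, a, b, kernel_sd, y):
    """Density at y: one Gaussian kernel per regression-corrected member.

    Requires kernel_sd > 0.
    """
    centers = [a + b * f for f in members]
    norm = kernel_sd * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((y - c) / kernel_sd) ** 2) / norm for c in centers) / len(centers)

# Synthetic training data (purely illustrative).
rng = random.Random(1)
ensembles, observations = [], []
for _ in range(2000):
    signal = rng.gauss(0.0, 1.0)
    ensembles.append([signal + rng.gauss(0.0, 0.5) for _ in range(5)])
    observations.append(0.8 * signal + rng.gauss(0.0, 0.6))

a, b, kernel_sd = fit_ensemble_regression(ensembles, observations)
density_at_zero = forecast_pdf(ensembles[0], a, b, kernel_sd, 0.0)
print(round(a, 3), round(b, 3), round(kernel_sd, 3), round(density_at_zero, 3))
```

The regression coefficients come from the ensemble mean only. The individual members enter at forecast time, when each one is passed through the same equation and dressed with a kernel.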


The regression is computed from statistics similar to those needed for standard linear regression, with only two additional array elements related to the ensemble size and spread.

Multiple linear regression
  • The theory (applying the ensemble mean equation to individual members) also applies to multiple linear regression, PROVIDED all predictors are linear. (Including binary predictors, interactive predictors, etc. would not be theoretically correct.)
  • Ensemble regression may be easier to apply to the MOS forecasts in a second step: derive equations, apply them to get a series of forecasts, and then do second-step processing of those forecasts.

NAEFS
  • Combines GEFS and Canadian ensembles.
  • Bias corrected by EMC (6-hourly).
  • 2-meter temperatures are processed by CPC into probabilities of above-, near-, and below-normal categories (5-day means).
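Category probabilities of this kind can be read off a kernel-dressed ensemble by integrating the mixture density between category boundaries. The sketch below is illustrative: it assumes the three categories are climatological terciles of a standardized variable (boundaries at about ±0.4307), equal member weights, and a single shared kernel width. The function names and the sample numbers are hypothetical.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def category_probabilities(centers, kernel_sd, lower=-0.4307, upper=0.4307):
    """Return (P(below), P(near), P(above)) for a Gaussian-kernel mixture.

    centers: regression-corrected member forecasts in standardized units.
    lower, upper: ASSUMED tercile boundaries of a standard normal climatology.
    """
    n = len(centers)
    p_below = sum(normal_cdf((lower - c) / kernel_sd) for c in centers) / n
    p_not_above = sum(normal_cdf((upper - c) / kernel_sd) for c in centers) / n
    return p_below, p_not_above - p_below, 1.0 - p_not_above

# Hypothetical five-member forecast, shifted toward the warm side.
probs = category_probabilities([0.2, 0.5, 0.9, 1.1, -0.1], kernel_sd=0.6)
print([round(p, 3) for p in probs])
```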
NAEFS Kernel Density Example

[Figure: probability density versus standardized temperature (Z).]

Long Lead Consolidation

Nino 3.4 SST forecasts

Seasonal Forecast Consolidation

NAEFS Performance

6-10 Day Forecast Reliability

8-14 Day Forecast Reliability

NAEFS Performance

[Figure panels: Official Forecast; NAEFS Guidance.]

Climate Forecast System Version 2 (CFSv2)
  • 4 runs per day, one every 6 hours.
  • Lagged ensemble: an ensemble formed from model forecasts from different initial times, all valid for the same target period.
  • Hindcast data are available only every 5th day, from 1982 to present.
  • Example forecast from Jan 26, 2010.
Forecast Situation
  • El Nino conditions were observed in early 2010.
  • CFS was the first to warn of a La Nina.
  • Most models have too little spread (overconfident). This is compensated for by wide kernels.
  • If the mean ensemble spread is too large, adjustments must be made.
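One way to picture the spread adjustment is as a factor K applied to each member's deviation from the ensemble mean, with the kernel width shrinking as K grows. The sketch below rests on that interpretation, which is an assumption: the talk does not define K explicitly, and the function names and numbers are hypothetical. Here `resid_var` is the residual variance of the ensemble-mean regression, `b` its slope, and `mean_spread_var` the average within-ensemble variance. The assumed kernel variance is `resid_var - (K*b)**2 * mean_spread_var`, so K has a maximum where the kernel width reaches zero.

```python
import math

def rescale_spread(members, k):
    """Scale member deviations from the ensemble mean by k (the mean is unchanged)."""
    mu = sum(members) / len(members)
    return [mu + k * (f - mu) for f in members]

def kernel_sd_for_k(resid_var, b, mean_spread_var, k):
    """Kernel standard deviation left over after scaling the spread by k.

    ASSUMPTION: total forecast variance is held at resid_var, so a wider
    (rescaled) ensemble leaves less variance for the kernels.
    """
    var = resid_var - (k * b) ** 2 * mean_spread_var
    if var < 0.0:
        raise ValueError("K exceeds its maximum for these statistics")
    return math.sqrt(var)

def max_k(resid_var, b, mean_spread_var):
    """Largest K for which the kernel variance is still non-negative."""
    return math.sqrt(resid_var / (b * b * mean_spread_var))

# Hypothetical statistics, for illustration only.
resid_var, b, mean_spread_var = 0.39, 0.76, 0.2
k_limit = max_k(resid_var, b, mean_spread_var)
for k in (0.2, 1.0, 1.6):
    print(k, round(kernel_sd_for_k(resid_var, b, mean_spread_var, k), 3))
print("max K:", round(k_limit, 3))
```

Under this reading, a small K gives narrow member spacing with wide kernels, and a K near the maximum gives widely spaced members with very narrow kernels.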

CFSv2 Nino 3.4, K=0.2

[Figure: curves plotted against SST (°C). Red: regression on the ensemble mean (standard regression). Green: individual members. Blue: combined envelope.]


Unaltered Ensemble Regression, K=1.0

[Figure: probability density versus SST (°C). Red: ensemble mean. Blue: kernel envelope. Green: individual members.]


K=1.6 (near maximum)

An information tidbit
  • Generate N values taken randomly from a Gaussian-distributed variable. Label them as the ensemble forecasts. N < 20.
  • Take another value randomly from that same distribution and label it the observation.
  • Do an ensemble regression over many such cases (but not so many that R = 0).
  • Question: What happens?

It maintains a fixed ratio (on average).


Unaltered Ensemble Regression, K=1.0

Very close to the maximum K for a 4-member ensemble.

[Figure: probability density versus SST (°C). Red: ensemble mean. Blue: kernel envelope. Green: individual members.]

Weighting (illustration)

Two forecasts:
  • Red = GFS hi-res ensemble mean standard regression error distribution.
  • Blue = GFS ensembles.

The “best” forecast in this case is the one with the highest PDF.

[Figure annotations: “GFS hi-res is better”; “GEFS is more likely to have the best member if Obs < 26.8 C”.]

Weighting (Continued)
  • Group ensembles into sets of equal skill (GEFS, Canadian ensembles, ECMWF ensembles, hi-res GFS, hi-res ECMWF, etc.).

Pass 1) Calculate PDFs separately.

Pass 2) Choose the highest PDF as best. Keep track of percentages.

Pass 3) Enter WEIGHTED ensembles into an ensemble regression. Weights = P(Best)/N.

An adaptive regression can do this in real time.
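Passes 1 and 2, and the weight formula from pass 3, can be sketched as follows. This is an illustrative sketch, not the operational procedure. It assumes each group's PDF is an equally weighted Gaussian-kernel mixture, that "best" means the highest density at the observed value, and that N in Weights = P(Best)/N is the number of members in the group. The function names and sample numbers are hypothetical, and the final weighted regression is not shown.

```python
import math

def group_pdf(centers, kernel_sd, y):
    """Pass 1: density at y for one group's Gaussian-kernel mixture."""
    norm = kernel_sd * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((y - c) / kernel_sd) ** 2) / norm for c in centers) / len(centers)

def best_group_weights(cases, group_sizes):
    """Pass 2 and the pass 3 weights.

    cases: list of (observation, [(centers, kernel_sd) for each group]).
    group_sizes: number of members in each group (ASSUMED meaning of N).
    Returns (P(best) per group, per-member weight per group).
    """
    wins = [0] * len(group_sizes)
    for obs, groups in cases:
        densities = [group_pdf(centers, sd, obs) for centers, sd in groups]
        wins[densities.index(max(densities))] += 1  # highest PDF counts as best
    p_best = [w / len(cases) for w in wins]
    member_weights = [p / n for p, n in zip(p_best, group_sizes)]
    return p_best, member_weights

# Hypothetical training cases: a single hi-res run and a 3-member ensemble (SST in C).
cases = [
    (26.0, [([27.0], 0.5), ([25.8, 26.1, 26.5], 0.8)]),
    (27.2, [([27.1], 0.5), ([26.0, 26.4, 26.9], 0.8)]),
    (26.5, [([26.6], 0.5), ([25.5, 26.0, 27.5], 0.8)]),
    (25.5, [([26.9], 0.5), ([25.4, 25.9, 26.3], 0.8)]),
]
p_best, member_weights = best_group_weights(cases, group_sizes=[1, 3])
print(p_best, member_weights)
```

Dividing P(Best) by the group size means the member weights, summed over every member of every group, total one.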

Weighted Ensemble CFSv2 Nino 3.4 SSTs, 6-Month Lead

Ensemble Group 1: Jan 26, 2010, for August 2010. Weight: 0.36

Ensemble Group 2: Jan 21, 2010, for August 2010. Weight: 0.36

Ensemble Group 4: Jan 16, 2010, for August 2010. Weight: 0.28

  • It is theoretically sound to derive an equation from the ensemble mean and apply it to individual members.
  • An ensemble regression forecast together with its error estimates resembles Gaussian kernel smoothing except members are first processed by the ensemble mean-based regression equation.
  • Additional control can be achieved by adjusting the spread (K-factor). This capability is required for the case where the ensemble spread is too high.
  • Ensemble regression need not require equally weighted members, only that the probability that each member will be closest be estimated.
  • Weighting coefficients can be derived from the PDFs from component models in relation to the observations.
  • The system delivers reliable probabilistic forecasts that are competitive in skill with manual forecasts (better in reliability).