### Statistical Forecasting [Part 1]

69EG3137 – Impacts & Models of Climate Change

Details for Today:

DATE: 25th November 2004

BY: Mark Cresswell

FOLLOWED BY: Literature Tutorial

Lecture Topics

- What is statistical forecasting?
- Simple linear regression
- Multiple linear regression
- Analog forecasting
- Model Output Statistics (MOS) forecasts
- PCA and EOF
- Canonical Correlation Analysis

What is Statistical Forecasting?

In nature, observed phenomena are intrinsically linked to each other by physical processes. These links are referred to as causality, or causal links.

In statistical forecasting, we can exploit this causality mathematically by replicating the pattern of change observed for a particular set of conditions. The physical processes represent the forcing and the observed pattern (of weather!) is the direct result.

Different sets of conditions (forcing) will give rise to replicable and specific patterns of weather

Example: Unusually warm sea-surface temperature (SST) conditions in the Indian Ocean are usually associated with a greater than normal frequency (and magnitude) of tropical cyclones. The forcing here is the increased SST, whilst the observed pattern is enhanced cyclogenesis.

[Image: Cyclone Eline, Feb 2000]

Model: Since we know there is a causal link (enhanced energy flux, more evaporation over the ocean, greater convection etc.), we can determine a statistical relationship from historical observations

Model: The previous example illustrated how a specific forcing can be seen to alter future weather conditions. We can summarise this relationship mathematically in a regression equation

This type of relationship is known as empirical

Model: The regression model informs us of the dependence one variable has on another. Usually, we will select variables that are correlated with one another

We must be careful, however, when using relationships based purely on a correlation, as association is not causation

In children, shoe size may be strongly correlated with reading skills. This does not mean that children who learn to read new words sprout longer feet! Both simply increase with age.
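The shoe-size example can be made concrete with synthetic data. The sketch below is an illustration only: the ages, coefficients and noise levels are all invented, and age is assumed to be the hidden variable driving both quantities. It shows a strong raw correlation that all but disappears once the effect of age is removed from both variables.

```python
import numpy as np

def residual_after(x, control):
    """Return what is left of x after removing a straight-line fit on control."""
    slope, intercept = np.polyfit(control, x, 1)
    return x - (slope * control + intercept)

def raw_and_partial_correlation(a, b, control):
    """Correlation of a with b, before and after controlling for a third variable."""
    raw = np.corrcoef(a, b)[0, 1]
    partial = np.corrcoef(residual_after(a, control),
                          residual_after(b, control))[0, 1]
    return raw, partial

# Synthetic children: every number here is invented for illustration.
rng = np.random.default_rng(0)
age = rng.uniform(5, 12, size=2000)                       # years
shoe_size = 20 + 1.5 * age + rng.normal(0, 1, size=2000)  # grows with age
reading = 10 * age + rng.normal(0, 8, size=2000)          # improves with age

raw, partial = raw_and_partial_correlation(shoe_size, reading, age)
print(f"raw correlation:     {raw:.2f}")
print(f"controlling for age: {partial:.2f}")
```

The raw correlation is high even though neither variable appears in the formula for the other. Only the shared dependence on age links them.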

Fitting a regression equation so that one variable can be calculated from another is done by the method of least squares. Normally we can insert a line of best fit through a scatter-plot of X and Y data pairs. The line that makes the smallest r.m.s. (root mean square) error in predicting Y from X is the regression line

The regression line is often referred to as the least squares line. We can use the slope and intercept of a least squares line to derive the constants m (slope) and b (intercept) used in our linear regression model equation:

$$Y = mX + b$$

The intercept is the height of the least squares line when X is zero. The slope is the rate at which Y increases per unit increase in X

Thus, for any given value of our X variable (and values for m and b which we calculate from observations) we can estimate a value for Y
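As a minimal sketch of the least squares calculation, the code below derives m and b from a handful of X and Y pairs and compares the r.m.s. error of the fitted line with that of another plausible line. The data values and the function names `fit_line` and `rms_error` are made up for illustration.

```python
import math

def fit_line(xs, ys):
    """Return slope m and intercept b of the least squares line through (xs, ys)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    m = sxy / sxx             # rate of change of Y per unit increase in X
    b = y_mean - m * x_mean   # height of the line where X is zero
    return m, b

def rms_error(xs, ys, m, b):
    """Root mean square error made when predicting Y as m*X + b."""
    squared = [(y - (m * x + b)) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(squared) / len(squared))

# Invented observations, for illustration only.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]

m, b = fit_line(xs, ys)
print(f"m = {m:.2f}, b = {b:.2f}")
print(f"r.m.s. error of fitted line:   {rms_error(xs, ys, m, b):.4f}")
print(f"r.m.s. error of Y = 2X + 1:    {rms_error(xs, ys, 2.0, 1.0):.4f}")
print(f"estimate of Y at X = 5:        {m * 5 + b:.2f}")
```

Any other choice of slope and intercept, such as the Y = 2X + 1 line tried here, gives a larger r.m.s. error on the same points.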

Often, we will not look at the contribution of a single variable in isolation – but instead a number of predictors will be included in a forecast

Different predictors may be causally related to the same weather phenomenon.

If K is the number of predictor variables then:

$$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_K x_K$$

Linear regression models provide a “fit” for our estimate of y for a number of observations of x

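A straight-line fit (simple linear regression) will not pass through every point. Adding predictors does not bend the line: a multiple linear regression fits a flat plane (a hyperplane when K > 2) through the data. The extra predictors can nevertheless explain more of the variation in y, giving a closer fit and a more accurate estimate. A genuinely curved fit requires non-linear terms, such as x², among the predictors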

Multivariate models are often used in weather prediction to estimate future change based on historical observations of a trend.
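The coefficients b0 to bK of a multiple linear regression can be estimated by least squares in a few lines. The sketch below assumes NumPy is available. The function names and the tiny data set are hypothetical, and the data are built to follow y = 2 + 3x1 - x2 exactly so that the recovered coefficients are easy to check.

```python
import numpy as np

def fit_multiple_regression(X, y):
    """Least squares estimate of [b0, b1, ..., bK] for K predictor columns in X."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # A leading column of ones lets the solver estimate the intercept b0.
    design = np.column_stack([np.ones(len(X)), X])
    coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coeffs

def predict(coeffs, X):
    """Evaluate y-hat = b0 + b1*x1 + ... + bK*xK for each row of X."""
    X = np.asarray(X, dtype=float)
    return coeffs[0] + X @ coeffs[1:]

# Invented observations with K = 2 predictors, generated from y = 2 + 3*x1 - x2.
X = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
y = [2, 5, 1, 4, 7]

coeffs = fit_multiple_regression(X, y)
print("b0, b1, b2 =", np.round(coeffs, 3))
print("estimate for x1 = 3, x2 = 2:", predict(coeffs, [[3, 2]]))
```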

Not all objective statistical forecast procedures are based on regression

Some methods were in use prior to the advent of fairly accurate numerical weather prediction (NWP) forecasts (12-48hr range). One such method is analog forecasting (AF)

Analog forecasting is still in use for long-range (seasonal) forecasting, although the climatological database it uses is deemed to be too short for AF forecasts to be competitive in ordinary short-range weather forecasting

The idea underlying AF is to search the archives of climatological synoptic data for maps closely resembling current observations, and assume that the future evolution of the atmosphere will be similar to the flows that followed the historical analogs

The method is intuitive and benefits from the input of experienced weather forecasters

AF is limited, however, because the atmosphere apparently does not repeat itself exactly, so matches can only be approximate
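The analog search can be sketched as a nearest-neighbour lookup. The example below is a simplified illustration, not an operational scheme. It assumes each archived map has been flattened to a vector of grid-point values stored in time order, and it uses the r.m.s. difference as the (assumed) measure of resemblance. The function name, parameters and data are hypothetical.

```python
import numpy as np

def analog_forecast(archive, current, lead=1, n_analogs=3):
    """Forecast by averaging what followed the closest historical analogs.

    archive : (time, gridpoints) array of past maps in time order
    current : (gridpoints,) array holding today's map
    lead    : steps ahead to forecast (must be at least 1)
    """
    archive = np.asarray(archive, dtype=float)
    current = np.asarray(current, dtype=float)
    # Only maps whose subsequent evolution is in the archive can serve as analogs.
    candidates = archive[:-lead]
    distance = np.sqrt(np.mean((candidates - current) ** 2, axis=1))
    best = np.argsort(distance)[:n_analogs]
    # Assume the atmosphere evolves as it did after each analog.
    forecast = archive[best + lead].mean(axis=0)
    return forecast, best, distance[best]

# Invented archive of six two-point "maps", for illustration only.
archive = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0],
                    [6.0, 6.0], [0.1, 0.1], [1.2, 1.2]])
today = np.array([0.02, 0.02])

forecast, analog_days, distances = analog_forecast(archive, today,
                                                   lead=1, n_analogs=2)
print("analog days:", analog_days, "distances:", distances)
print("forecast:", forecast)
```

The distances returned alongside the forecast make the limitation visible: they are never exactly zero, so every match is approximate.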

The MOS approach is a preferred method of incorporating NWP forecast information into statistical weather forecasts

The MOS approach can build directly into the regression equations the influence of the specific characteristics of different NWP models at different projection times into the future

To develop MOS forecast equations it is necessary to have a developmental data set composed of historical records of the predictand, together with archived records of the forecasts produced by the NWP model for the same days on which the predictand was observed
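A MOS equation can be sketched as a regression of the observed predictand on archived NWP output for the same days. The example below is illustrative only. The two predictors (model temperature and model wind speed), the function names and all numbers are assumptions, and the observations are constructed to fit exactly so that the result is easy to verify.

```python
import numpy as np

def fit_mos(nwp_output, observed):
    """Regress the observed predictand on archived NWP forecast values."""
    nwp_output = np.asarray(nwp_output, dtype=float)
    observed = np.asarray(observed, dtype=float)
    design = np.column_stack([np.ones(len(nwp_output)), nwp_output])
    coeffs, *_ = np.linalg.lstsq(design, observed, rcond=None)
    return coeffs  # intercept first, then one coefficient per NWP predictor

def apply_mos(coeffs, new_nwp_output):
    """Turn a fresh NWP forecast into a statistically corrected forecast."""
    new_nwp_output = np.asarray(new_nwp_output, dtype=float)
    return coeffs[0] + new_nwp_output @ coeffs[1:]

# Developmental data set (invented): each row pairs one day's archived model
# forecast [temperature, wind speed] with what was observed on that day.
archived_nwp = [[10, 2], [12, 5], [15, 3], [8, 6], [20, 1], [17, 4]]
observed_temp = [10.1, 11.3, 14.4, 7.5, 19.3, 16.0]

mos_coeffs = fit_mos(archived_nwp, observed_temp)
todays_nwp = [14, 2]
mos_forecast = apply_mos(mos_coeffs, todays_nwp)
print("MOS coefficients:", np.round(mos_coeffs, 3))
print("MOS forecast for today:", round(float(mos_forecast), 2))
```

Because the equation is fitted to the output of one particular model at one projection time, its systematic errors are absorbed into the coefficients. A separate equation would be fitted for each model and lead time.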

Sometimes we might need to compare sets of variables against patterns of change, and synthesise them

It might be the case that environmental change (shifts in weather patterns) is due to more than one variable. In order to determine the spatial limits of their influence (in a geographical sense) we can use a spatially dependent correlation scheme called Principal Components Analysis (PCA). The technique also allows data reduction

PCA became popular for the analysis of atmospheric data following papers by Lorenz in the mid-1950s, who called the technique Empirical Orthogonal Function (EOF) analysis. Both names refer to the same set of procedures

The purpose of PCA is to reduce a data set containing a large number of variables to a new data set containing far fewer new variables – but which nevertheless represent a large fraction of the variability contained in the original data

PCA yields a number of principal components, which constitute a compact representation of the original data.

PCA can yield substantial insights into both the spatial and temporal variations exhibited by the field or fields being analysed
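One common way to compute the principal components is through a singular value decomposition of the anomaly matrix. The sketch below takes that route and assumes the field is stored as a (time, gridpoints) array. The function name and the toy field are hypothetical. The field consists of a single spatial pattern modulated in time, so one component should capture essentially all of the variance.

```python
import numpy as np

def pca(data, n_components):
    """PCA / EOF analysis of a (time, gridpoints) array via SVD.

    Returns the spatial patterns (EOFs), their time series (PCs) and the
    fraction of total variance each retained component explains.
    """
    data = np.asarray(data, dtype=float)
    anomalies = data - data.mean(axis=0)   # remove the time mean at each point
    u, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    eofs = vt[:n_components]                      # (components, gridpoints)
    pcs = u[:, :n_components] * s[:n_components]  # (time, components)
    explained = s ** 2 / np.sum(s ** 2)
    return eofs, pcs, explained[:n_components]

# Invented field: one spatial pattern, scaled up and down through time.
time_signal = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
pattern = np.array([1.0, 2.0, 2.0])
field = np.outer(time_signal, pattern) + np.array([10.0, 20.0, 30.0])

eofs, pcs, explained = pca(field, n_components=1)
print("leading EOF:", np.round(eofs[0], 3))
print("fraction of variance explained:", np.round(explained, 3))

# The compact representation can be expanded back into the original field.
reconstructed = pcs @ eofs + field.mean(axis=0)
print("max reconstruction error:", np.abs(reconstructed - field).max())
```

Here three grid-point series are reduced to a single time series plus one spatial pattern. The sign of an EOF is arbitrary, so the pattern may come out multiplied by -1.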

Canonical Correlation Analysis

CCA is a statistical technique that identifies a sequence of pairs of patterns in two multivariate data sets – and constructs sets of transformed variables by projecting the original data onto these patterns

The patterns are chosen such that the new variables defined by projection of the two data sets onto these patterns exhibit maximum correlation

CCA can be viewed as an extension of multiple regression to the case of several predictands. It is often applied to fields, such as SST or the heights of pressure surfaces (geopotential height).
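A minimal numerical sketch of CCA is given below. It uses a QR decomposition of each centred data set followed by an SVD, which is one standard way to obtain the canonical correlations, and it assumes both data sets have more samples than variables and no redundant columns. The function name and the synthetic data are hypothetical: one shared signal is hidden in the first column of X and the second column of Y.

```python
import numpy as np

def cca(X, Y):
    """Canonical correlations and canonical variates of two data sets.

    X : (samples, p) array, Y : (samples, q) array, same number of samples.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    qx, rx = np.linalg.qr(Xc)   # orthonormal basis for each centred data set
    qy, ry = np.linalg.qr(Yc)
    u, s, vt = np.linalg.svd(qx.T @ qy)
    k = min(X.shape[1], Y.shape[1])
    # Weights (patterns) that project the original variables onto each pair.
    x_weights = np.linalg.solve(rx, u[:, :k])
    y_weights = np.linalg.solve(ry, vt.T[:, :k])
    return s[:k], Xc @ x_weights, Yc @ y_weights

# Synthetic data, invented for illustration.
rng = np.random.default_rng(0)
n = 200
shared = rng.standard_normal(n)
X = np.column_stack([shared + 0.1 * rng.standard_normal(n),
                     rng.standard_normal(n)])
Y = np.column_stack([rng.standard_normal(n),
                     -2.0 * shared + 0.1 * rng.standard_normal(n),
                     rng.standard_normal(n)])

correlations, x_variates, y_variates = cca(X, Y)
print("canonical correlations:", np.round(correlations, 3))
check = np.corrcoef(x_variates[:, 0], y_variates[:, 0])[0, 1]
print("correlation of first pair of variates:", round(float(check), 3))
```

The first pair of transformed variables is strongly correlated because both projections isolate the shared signal. The second pair has nothing in common beyond sampling noise.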

Using library and online literature resources, find paper references for the following topics:

- Natural climate forcing AND sea-level rise
- Anthropogenic climate forcing AND human health
- Dynamical climate modelling AND hydrology/water resources
- Statistical climate modelling AND politics/policies, e.g. Kyoto
