Building Statistical Forecast Models

Wes Wilson, MIT Lincoln Laboratory, April 2001

Experiential Forecasting

- Idea: Base the forecast on observed outcomes in previous similar situations (training data)
- Possible ways to evaluate and condense the training data
  - Categorization: seek comparable cases, usually expert-based
  - Statistical: correlation and significance analysis
  - Fuzzy Logic: combines expert and statistical analysis
- Belief: Incremental changes in predictors relate to incremental changes in the predictand
- Issues
  - Requirements on the training data
  - Development methodology
  - Automation

Outline

- Regression-based Models
- Predictor Selection
- Data Quality and Clustering
- Measuring Success
- An Example

Statistical Forecast Models

- Multi-Linear Regression: $F = w_0 + \sum_i w_i P_i$
  - $w_i$ = predictor weighting
  - $w_0$ = conditional climatology less the weighted mean predictor values
- GAM: Generalized Additive Models: $F = w_0 + \sum_i w_i f_i(P_i)$
  - $f_i$ = structure function, determined during regression
- PGAM: Pre-scaled Generalized Additive Models: $F = w_0 + \sum_i w_i f_i(P_i)$
  - $f_i$ = structure function, determined prior to regression
  - The constant term $w_0$ is conditional climatology less the weighted mean bias of the scaled predictors
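As a rough illustration of the constant term, the sketch below fits a multi-linear regression to synthetic data and checks that the fitted $w_0$ equals the mean outcome less the weighted mean predictor values. The data, the variable names, and the use of the training-set mean as a stand-in for conditional climatology are all assumptions made for this example, not part of the presentation.

```python
import numpy as np

# Synthetic training data (assumption: 200 cases, 3 predictors).
rng = np.random.default_rng(0)
m = 200
P = rng.normal(loc=[2.0, -1.0, 5.0], scale=[1.0, 0.5, 2.0], size=(m, 3))
E = 4.0 + P @ np.array([0.8, -1.5, 0.3]) + rng.normal(scale=0.1, size=m)

# Least-squares fit of F = w0 + sum_i w_i P_i (column of ones carries w0).
A = np.column_stack([np.ones(m), P])
coef, *_ = np.linalg.lstsq(A, E, rcond=None)
w0, w = coef[0], coef[1:]

# Mean outcome (standing in for conditional climatology) less the
# weighted mean predictor values.
w0_from_means = E.mean() - w @ P.mean(axis=0)
print(w0, w0_from_means)
```

The two printed values agree up to rounding because a least-squares fit with an intercept leaves residuals that sum to zero.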

Models Based on Regression

- Regression solutions are obtained by adjusting the parametric description of the forecast model (parameters $w$) until the objective function $J(w) = R^2$ is minimized
- Training Data for one predictor
  - $P$: vector of predictor values
  - $E$: vector of observed events
- Residual: $R^2 = \| F(P) - E \|^2$
- Multi-Linear Regression (MLR): $J(w) = \| A w - E \|^2$
- MLR is solved by matrix algebra; the most stable solution is provided by the singular value decomposition (SVD) of $A$

Regression and Correlation

- Training Data for one predictor
- P vector of predictor values
- E vector of observed events
- Error Residual: $R^2 = \| F(P) - E \|^2$

- Correlation Coefficient: $r(P, E) = \dfrac{\Delta P \cdot \Delta E}{\sigma_{\Delta P}\, \sigma_{\Delta E}}$
- Fundamental Relationship. Let $F_0$ be a forecast equation with error residuals $E_0$ ($\|E_0\| = R_0$). Let $W_0 + W_1 P$ be a BLUE (best linear unbiased estimator) correction for $E_0$, and let $F = F_0 + W_0 + W_1 P$. The error residual $R_F$ of $F$ satisfies
  - $R_F^2 = R_0^2 \left[ 1 - r(P, E_0)^2 \right]$
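The fundamental relationship can be checked numerically. The sketch below uses synthetic residuals and assumes $E_0$ has zero mean, as it would if $F_0$ already contained a fitted constant term; the variable names and data are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)
m = 500
P = rng.normal(size=m)

# Synthetic error residuals of an initial forecast F0 (observed minus F0).
# Assumption: centered, as when F0 already includes a fitted constant.
E0 = 0.6 * P + rng.normal(size=m)
E0 = E0 - E0.mean()
R0_sq = np.sum(E0**2)

# Least-squares linear correction W0 + W1 * P fitted to the residuals.
A = np.column_stack([np.ones(m), P])
(W0, W1), *_ = np.linalg.lstsq(A, E0, rcond=None)

# Residual of the corrected forecast F = F0 + W0 + W1 * P.
RF_sq = np.sum((E0 - (W0 + W1 * P))**2)

r = np.corrcoef(P, E0)[0, 1]
print(RF_sq, R0_sq * (1.0 - r**2))
```

The two printed numbers coincide up to rounding, which is why a predictor's correlation with the current residual measures how much adding it can reduce the error.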

Model Training Considerations

- Assumption: The training data are representative of what is expected during the implementation period
- Simple models are less likely to capture undesirable (non-stationary) short-term fluctuations in the training data
- The climatology of the training period should match that expected in the intended implementation period (decade scale)
- It is irrational to expect that short training periods can lead to models with long-term skill
- Plan for repeated model tuning
- Design self-tuning into the system

- It is desirable to have many more training cases than model parameters

"The only way to prepare for the future is to prepare to be surprised; that doesn't mean we have to be flabbergasted." (Kenneth Boulding)

GAM

- An established statistical technique, which uses the training data to define nonlinear scaling of the predictors
- Standard implementation represents the structure functions as B-splines with many knots, which requires the use of a large set of training data
- The forecast equations are determined by linear regression including the nonlinear scaling of the predictors
  $F = w_0 + \sum_i w_i f_i(P_i)$

- The objective is to minimize the error residual
- The structure functions are influenced by all of the predictors, and may change if the predictor mix is altered
- If a GAM model has $p$ predictors and $k$ knots per structure function, then the regression model has $kp + 1$ (linear) regression parameters

PGAM: Pre-scaled GAM

- A new statistical technique, which permits the use of training sets that are decidedly smaller than those for GAM
- The structure function is determined for each predictor separately
- Composite predictors should be scaled as composites
- The structure functions often have interpretations in terms of scientific principles and forecasting techniques
- Once the structure functions are selected, the forecast equations are determined by linear regression of the pre-scaled predictors: $F = w_0 + \sum_i w_i f_i(P_i)$
- Determination of the structure functions is based on enhancing the correlation of the (scaled) predictor with the error residual of conditional climatology
  - Maximize $r( f_i(P_i), \Delta E )$
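One simple way to picture the "maximize $r(f_i(P_i), \Delta E)$" step is to score a small family of candidate scalings and keep the one with the largest absolute correlation. The candidate family, the synthetic data, and the names below are hypothetical; the presentation does not say how its structure functions were actually chosen.

```python
import numpy as np

rng = np.random.default_rng(2)
m = 300

# Synthetic predictor and residual from conditional climatology (assumption:
# the true dependence is logarithmic, with a little noise).
P = rng.uniform(0.0, 50.0, size=m)
dE = np.log1p(P) + rng.normal(scale=0.05, size=m)
dE = dE - dE.mean()

# Hypothetical candidate structure functions for this one predictor.
candidates = {
    "identity": lambda p: p,
    "sqrt": np.sqrt,
    "log1p": np.log1p,
    "square": np.square,
}

def corr(x, y):
    """Pearson correlation coefficient of two 1-D arrays."""
    return np.corrcoef(x, y)[0, 1]

# Score each candidate by |r(f(P), dE)| and keep the best one.
scores = {name: abs(corr(f(P), dE)) for name, f in candidates.items()}
best = max(scores, key=scores.get)
print(best, scores)
```

Because each predictor is scaled on its own, the choice for one predictor does not change when the predictor mix is altered, in contrast to GAM.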

Predictors

- Every Method Involves a Choice of Predictors
- The Great Predictor Set: Everything relevant and available
- Possible Reduction based on Correlation Analysis
- Predictor Selection Strategies
- Sequential Addition
- Sequential Deletion
- Ensemble Decision ( SVD )

- Changing the predictor list changes the model weights; for GAM, it also changes the structure functions

Computing Solutions for the Basic Regression Problem

- Setting: Predictor list $\{P_i\}_{i=1}^{n}$ and observed outcomes $b$ over the $m$ trials of the training set
- Basic Linear Regression Problem: $A w = b$, where the columns of the $m \times n$ matrix $A$ are the lists of observed predictor values over the trials
- Normal Equations: $A^T A w = A^T b$
- Linear Algebra: $w = (A^T A)^{-1} A^T b$
- Optimization: Find $w$ to minimize $R^2 = \| A w - b \|^2$
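The three formulations above describe the same solution when $A$ has full column rank. A minimal sketch on synthetic data (the sizes and names are assumptions for illustration) compares the normal-equations solution with NumPy's least-squares solver:

```python
import numpy as np

rng = np.random.default_rng(3)
m, n = 50, 4                      # m trials, n predictors (synthetic)
A = rng.normal(size=(m, n))
b = rng.normal(size=m)

# Normal equations: solve (A^T A) w = A^T b without forming the inverse.
w_normal = np.linalg.solve(A.T @ A, A.T @ b)

# Direct minimization of ||A w - b||^2 (SVD-based LAPACK routine).
w_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)

R2 = np.sum((A @ w_lstsq - b)**2)
gradient = A.T @ (A @ w_lstsq - b)   # vanishes at the minimizer
print(w_normal, w_lstsq, R2)
```

Forming $A^T A$ squares the condition number of the problem, which is one reason an SVD-based solution is usually preferred when predictors are strongly correlated.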

SVD: Singular Value Decomposition

- $A = U \Sigma V^T$, where $U$ and $V$ are orthogonal matrices and $\Sigma = [\, S \mid 0 \,]^T$, where $S$ is diagonal with positive diagonal entries
- $U^T A w = \Sigma V^T w = U^T b$
- Set $\omega = V^T w$ and $\beta = U^T b$, with $[U^T b]_n$ denoting its first $n$ components
- Restatement of the Basic Problem: $\Sigma V^T w = \beta$ (original problem space) or $\Sigma \omega = \beta$ ($V^T$-transformed problem space)
- Since $U$ is orthogonal, the error residual is not altered by this restatement of the problem

CAUTION: Analysis of residuals can be misleading unless the dynamic ranges of the predictor values have been standardized

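Structure of the Error Residual Vector

With $\omega = V^T w$ and $\beta = U^T b$, the transformed problem $\Sigma \omega = \beta$ reads

$$
\begin{bmatrix}
s_1 & & & \\
& s_2 & & \\
& & \ddots & \\
& & & s_n \\
0 & \cdots & \cdots & 0 \\
\vdots & & & \vdots \\
0 & \cdots & \cdots & 0
\end{bmatrix}
\begin{bmatrix} \omega_1 \\ \omega_2 \\ \vdots \\ \omega_n \end{bmatrix}
=
\begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_n \\ \beta_{n+1} \\ \vdots \\ \beta_m \end{bmatrix}
$$

- The $s_i$ are usually decreasing
- $s_n > 0$, or reduce the predictor list
- For $i \le n$, $\omega_i = \beta_i / s_i$
- For $i > n$, there is no solution. This is the portion of the problem that is not resolved by these predictors
- Magnitude of the unresolved portion of the problem: $R_*^2 = \sum_{i=n+1}^{m} \beta_i^2$
- Truncated Problem: For $i > k$, set $\omega_i = 0$. This increases the error residual to
  - $R_k^2 = \sum_{i=k+1}^{m} \beta_i^2 = R_*^2 + \sum_{i=k+1}^{n} \beta_i^2$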
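The residual structure can be verified directly. In the sketch below, `omega` and `beta` stand for the transformed weights $V^T w$ and the transformed outcomes $U^T b$; the data and the truncation level `k` are synthetic assumptions chosen only to exercise the formulas.

```python
import numpy as np

rng = np.random.default_rng(4)
m, n = 60, 5
A = rng.normal(size=(m, n))
b = rng.normal(size=m)

# Full SVD: U is m x m, s holds the n singular values, Vt is n x n.
U, s, Vt = np.linalg.svd(A, full_matrices=True)
beta = U.T @ b                  # transformed outcomes, length m
omega = beta[:n] / s            # omega_i = beta_i / s_i for i <= n
w = Vt.T @ omega                # back to the original problem space

# Unresolved portion: components n+1..m of beta.
R_star_sq = np.sum(beta[n:]**2)
R_full_sq = np.sum((A @ w - b)**2)

# Truncated problem: zero omega_i for i > k and compare residuals.
k = 3
omega_k = omega.copy()
omega_k[k:] = 0.0
R_k_sq = np.sum((A @ (Vt.T @ omega_k) - b)**2)
R_k_predicted = R_star_sq + np.sum(beta[k:n]**2)
print(R_full_sq, R_star_sq, R_k_sq, R_k_predicted)
```

Each dropped component adds exactly $\beta_i^2$ to the squared residual, so the cost of truncating at any level can be read off from `beta` without refitting.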

Controlling Predictor Selection

- SVD / PC analysis provides guidance
- Truncation in $\omega$ space ($\omega = V^T w$) reduces the degrees of freedom
- Truncation does not provide nulling of predictors, since zero components of $\omega$ do not lead to zero components of $w = V \omega$
- Predictor Nulling: seek a linear forecast model of the form $F(a) = a^T w = \sum_i w_i a_i$, where $a$ is a vector of predictor values
  - The $i$th predictor is eliminated from the problem if $w_i = 0$
- Benefits of predictor nulling
  - Provides simple models
  - Eliminates designated predictors (missing data problem)
  - Quantifies the incremental benefit provided by essential predictors (sensor benefit problem)
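The difference between truncation and nulling can be seen on a small synthetic problem. In this sketch, nulling is imitated by dropping a predictor column and refitting in the original problem space; that is a simplification assumed for illustration and not necessarily the nulling procedure used in the presentation. The helper `null_predictor` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)
m, n = 80, 4
A = rng.normal(size=(m, n))
b = A @ np.array([1.0, 0.0, -2.0, 0.5]) + rng.normal(scale=0.3, size=m)

# Truncation: zero the last transformed weight, then map back with V.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
omega = (U.T @ b) / s
omega_trunc = omega.copy()
omega_trunc[-1] = 0.0
w_trunc = Vt.T @ omega_trunc     # generally has no exactly-zero entries

def null_predictor(A, b, j):
    """Refit with predictor j forced to zero weight; return (w, residual^2)."""
    keep = [i for i in range(A.shape[1]) if i != j]
    w_keep, *_ = np.linalg.lstsq(A[:, keep], b, rcond=None)
    w = np.zeros(A.shape[1])
    w[keep] = w_keep
    return w, np.sum((A @ w - b)**2)

w_full, *_ = np.linalg.lstsq(A, b, rcond=None)
R2_full = np.sum((A @ w_full - b)**2)
w_null, R2_null = null_predictor(A, b, 1)
print(w_trunc, w_null, R2_null - R2_full)
```

The growth `R2_null - R2_full` is the kind of quantity the "sensor benefit" bullet refers to: the incremental error incurred when one predictor is unavailable.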

Predictor Selection Process

- Gross Predictor Selection (availability & correlation)
- SVD for problem sizing and gross error estimation
- Truncation and Predictor Nulling, yielding the maximal model(s) (there may be more than one good solution)

- Successive Elimination in the Original Problem Space, yielding the minimal model (continue until SD starts to grow rapidly)

- Successive Augmentation in the Original Problem Space
- At this point, the good solutions are bracketed between the maximal and the minimal models; exhaustive searches are probably feasible, cross validation is wise.

Creating 15z Satellite Forecast Models (1)

- 149 marine stratus days from 1996 to 2000
- 51 sectors and 3 potential predictors per sector (153)
- Compute the correlation for each predictor with the residual from conditional climatology
- Retain only predictors with correlation greater than 0.25; this reduces the predictor list to 45 predictors
- Separate analysis for two data sets, Raw and PGAM
- Truncate each when SD reduction drops below 1.5 %
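The screening and truncation steps above can be sketched as follows. The data are synthetic (only the array shapes echo the 149 days and 153 predictors), the threshold is applied to the absolute correlation, and "SD reduction" is read as the relative drop in residual standard deviation when one more principal component is added; all three are assumptions of this example.

```python
import numpy as np

rng = np.random.default_rng(6)
m, n_all = 149, 153
X = rng.normal(size=(m, n_all))                 # synthetic predictors
E = X[:, :5] @ np.array([1.0, -0.8, 0.6, 0.5, -0.4]) + rng.normal(size=m)
dE = E - E.mean()                               # residual from the mean

# Step 1: correlation of every predictor with the residual; keep |r| > 0.25.
Xc = X - X.mean(axis=0)
r = (Xc.T @ dE) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(dE))
keep = np.flatnonzero(np.abs(r) > 0.25)

# Step 2: standardize the retained predictors and take the SVD.
A = Xc[:, keep] / Xc[:, keep].std(axis=0)
U, s, Vt = np.linalg.svd(A, full_matrices=False)
beta = U.T @ dE

# Residual SD after using the first k components, for k = 0..len(s).
resid_sq = dE @ dE - np.concatenate([[0.0], np.cumsum(beta**2)])
sd = np.sqrt(np.maximum(resid_sq, 0.0) / m)

# Step 3: truncate at the first component whose SD reduction is below 1.5 %.
k_trunc = len(s)
for k in range(1, len(s) + 1):
    if (sd[k - 1] - sd[k]) / sd[k - 1] < 0.015:
        k_trunc = k - 1
        break
print(len(keep), k_trunc)
```

Standardizing the retained predictors before the SVD follows the caution about dynamic ranges; without it, the leading components mostly reflect whichever predictors have the largest units.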

[Figure panels: RAW; PGAM; SVD Raw 6; PGAM Data; SVD PGAM 6]

Creating 15z Satellite Forecast Models (2)

[Figure labels, in source order: Sigma PC 6 = 1.134; Sigma PC 6 = 0.999; Sigma = 1.148; Sigma = 0.999]

- SVD Truncate 6 Pred.Nulling
- In the truncation space: null to 7 predictors with acceptable error growth

- Maximal Problems (R-8,P-7)
- Minimal Problems (R-5,P-4)
- Neither problem would accept augmentation according to the strict cross-validation test
- Different predictors were selected

Data Quality and Clustering

- DQA is similar to NWP
- need to do the training set
- probably need to work to tighter standards

- Data Clustering
- During training - manual ++
- For implementation - fully automated

- Conditional Climatology based on Clustering

Satellite Statistical Model (MIT/LL)

- 1-km visible channel (brightness)
- Data pre-processing
- re-mapping to 2 km grid
- 3x3 median smoother
- normalized for sun angle
- calibrated for lens graying

- Grid points grouped into sectors
- topography
- physical forcing
- operational areas

- Sector statistics
- Brightness
- Coverage
- Texture

- 4 year data archive, 153 predictors
- PGAM Regression Analysis

SECTORIZATION

Consensus Forecast

[Diagram labels: Day Characterization (wind direction, inversion height, forcing influences); COBEL; Local SFM; Regional SFM; Satellite SFM; Forecast Weighting Function; Consensus Forecast]

Conclusions

- PGAM, SVD/PC, and Predictor Nulling provide a systematic way to approach the development of linear forecast models via regression
- This methodology provides a way to investigate the elimination of specific predictors, which could be useful in the development of contingency models
- We are investigating full automation
