Best Practices vs. Misuse of PCA in the Analysis of Climate Variability

Bob Livezey

Climate Services/Office of Services/NWS/NOAA

30th Climate Diagnostics and Prediction Workshop

State College, PA, October 26, 2005

Outline
  • Motivation, take-home messages and references
  • Preprocessing considerations
  • S-mode example: Mathematics, characteristics, interpretation, testing, and truncation
  • Rotation: Benefits and truncation considerations
  • Conclusions
Eigenvector-Based Linear Techniques
  • Dealing simultaneously with many time series:
    • Principal Component Analysis (PCA) – efficient representation of the information in multiple time series (time series of gridded maps);
    • Rotation – linear transformation of the results of PCA and other eigenvector-based methods to improve the representation;
    • Canonical Correlation Analysis (CCA) – one of the better ways to efficiently represent linearly the relationships between two different time series of gridded maps (say 500 mb heights and surface temperatures).
Take-Home Messages
  • PCA is an extremely useful linear tool for data compression, orthogonalization, and filtering
  • PCA results are mathematical and (for even the first mode) don’t necessarily have to have physical relevance
    • Even when the first mode has physical relevance its representation may be flawed (e.g. the “Arctic Oscillation”)
  • PCA results can be critically impacted by choices of domain, grid, scaling, etc.
  • Effective PC truncation requires insight and experimentation
  • Rotation can enhance physical relevance and reduce sampling variability
    • Under- and over-rotation can negate these gains
  • Just because an area on a map has a closed loading contour doesn’t make it part of a “dipole” or “tripole”
REFERENCES FOR BASIC PCA AND RPCA
  • Barnston, A. G., and R. E. Livezey, 1987: Classification, seasonality, and persistence of low frequency atmospheric circulation patterns. Mon. Wea. Rev., 115, 1083-1126.
  • Huth, R., 2006: The effect of various methodological options on the detection of leading modes of sea level pressure variability. Tellus, under revision.
  • Jolliffe, I. T., 1995: Rotation of principal components: choice of normalization constraints. J. Appl. Statistics, 22, 29-35.
  • Livezey, R. E., and T. M. Smith, 1999b: Considerations for use of the Barnett and Preisendorfer (1987) algorithm for canonical correlation analysis of climate variations. J. Climate, 12, 303-305.
  • North, G. R., T. L. Bell, and R. F. Cahalan, 1982: Sampling errors in the estimation of empirical orthogonal functions. Mon. Wea. Rev., 110, 699-706.
  • O'Lenic, E., and R. E. Livezey, 1988: Practical considerations in the use of rotated principal components analysis (RPCA) in diagnostic studies of upper-air height fields. Mon. Wea. Rev., 116, 1682-1689.
  • Richman, M. B., 1986: Rotation of principal components. J. Climatology, 6, 293-335.
  • Richman, M. B., and P. J. Lamb, 1985: Climatic pattern analysis of 3- and 7-day summer rainfall in the central United States: Some methodological considerations and a regionalization. J. Clim. Appl. Meteor., 24, 1325-1343.
Preparing Data

1. Preprocessing often has a major impact on results and their interpretation.

2. PCA results are inherently domain dependent, as I will illustrate later.

3. Standardization gives each record equal weight in variance-based multivariate analyses, e.g. high latitudes vs. tropics, January vs. November. If this is desirable, base the PCA on the correlation matrix; if not, base it on the covariance matrix.
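The choice in item 3 can be sketched with NumPy. This is a minimal illustration on synthetic data, not anything from the original analysis; the three "points", their variances, and all variable names are assumptions made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic anomalies: 200 times x 3 points; point 0 has much larger variance
# (think high latitudes vs. tropics). All numbers here are made up.
n_time = 200
data = rng.standard_normal((n_time, 3)) * np.array([10.0, 1.0, 1.0])
anom = data - data.mean(axis=0)            # remove period-of-record means

# Covariance-matrix PCA: high-variance points dominate the leading mode.
cov = anom.T @ anom / (n_time - 1)
cov_vals, cov_vecs = np.linalg.eigh(cov)

# Correlation-matrix PCA: standardize first so each point has unit variance.
std_anom = anom / anom.std(axis=0, ddof=1)
corr = std_anom.T @ std_anom / (n_time - 1)
corr_vals, corr_vecs = np.linalg.eigh(corr)

# eigh returns eigenvalues in ascending order, so the last column is mode 1.
lead_cov = cov_vecs[:, -1]
lead_corr = corr_vecs[:, -1]
print("leading covariance eigenvector: ", np.round(lead_cov, 2))
print("leading correlation eigenvector:", np.round(lead_corr, 2))
```

In the covariance case the leading eigenvector points almost entirely at the high-variance point; after standardization every point contributes exactly one unit of variance to the total.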

Preparing Data

4. PCA should be performed on as narrow a window in the seasonal cycle as sample considerations permit to avoid mixing inhomogeneous climates (like the January vs. November example in 3 above).

5. Area-averaged or gridded data often must be weighted in multivariate analyses:

Without weighting, smaller areas can influence results as much as larger ones;

On lat/lon grids the density of points (and hence their influence) increases with latitude.

Preparing Data

5. Two ways to treat the problem:

Create an approximately equal-area representation (e.g. CPC megadivisions, or the Barnston and Livezey, 1987, grid);

Weight the data, generally in proportion to the square root of the area.
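For a regular lat/lon grid, the weighting option can be sketched as follows. The grid, the random anomalies, and the variable names are hypothetical. The only outside fact used is that grid-box area on a regular lat/lon grid is proportional to cos(latitude), so the data weight is sqrt(cos(latitude)).

```python
import numpy as np

# Hypothetical 10-degree lat/lon grid over part of the Northern Hemisphere.
lats = np.arange(20.0, 90.0, 10.0)          # 20N ... 80N
lons = np.arange(0.0, 360.0, 10.0)
lat2d, _ = np.meshgrid(lats, lons, indexing="ij")

# Area is proportional to cos(latitude), so the weight applied to the data is
# sqrt(cos(latitude)); variances (squares) then scale with area.
weights = np.sqrt(np.cos(np.deg2rad(lat2d))).ravel()

rng = np.random.default_rng(1)
anom = rng.standard_normal((120, weights.size))   # (time, space) anomalies
anom -= anom.mean(axis=0)

weighted = anom * weights                   # broadcast the weights over time
cov_weighted = weighted.T @ weighted / (anom.shape[0] - 1)
```

Each point's contribution to the total variance is now scaled by cos(latitude), which offsets the growing density of points toward the pole.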

Preparing Data

5. If weights are needed and PCA on the correlation matrix is the objective, standardize first, then weight, and then form the covariance matrix. Otherwise the weights are removed by the standardization step.
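The ordering matters because standardization divides each series by its own standard deviation, which cancels any constant weight applied beforehand. A minimal check on synthetic data (the weights and data are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(2)
n_time = 150
anom = rng.standard_normal((n_time, 4)) * np.array([5.0, 3.0, 2.0, 1.0])
anom -= anom.mean(axis=0)
weights = np.sqrt(np.array([1.0, 0.8, 0.5, 0.2]))   # hypothetical sqrt(area) weights

def standardize(x):
    """Scale each column to unit sample standard deviation."""
    return x / x.std(axis=0, ddof=1)

# Recommended order: standardize, then weight, then form the covariance matrix.
right = standardize(anom) * weights
cov_right = right.T @ right / (n_time - 1)

# Reversed order: weight, then standardize. Each column is divided by its own
# (weighted) standard deviation, so the weights cancel.
wrong = standardize(anom * weights)
cov_wrong = wrong.T @ wrong / (n_time - 1)
```

With the recommended order the diagonal of the matrix carries the squared weights; with the reversed order it is back to all ones, i.e. an unweighted correlation matrix.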

Preparing Data

6. In EPCA (see below), CCA, etc., maps of variables with greater numbers of data points will have disproportionate influence on the results unless the maps are weighted, i.e. in proportion to the square root of the ratio of the total variance in all variables to the total variance in the weighted variable (see Livezey and Smith, 1999b).
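Read literally, the weighting in item 6 could look like the sketch below. It follows only the sentence above, not the details of Livezey and Smith (1999b); the two fields, their sizes, and the helper `total_variance` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
n_time = 100
# Two hypothetical fields with different numbers of grid points.
heights = rng.standard_normal((n_time, 50))   # e.g. 50 height points
temps = rng.standard_normal((n_time, 10))     # e.g. 10 temperature points
fields = [f - f.mean(axis=0) for f in (heights, temps)]

def total_variance(x):
    """Sum of the sample variances of all columns of x."""
    return x.var(axis=0, ddof=1).sum()

var_all = sum(total_variance(f) for f in fields)

# Weight each field by sqrt(total variance of all fields / its own total variance).
field_weights = [np.sqrt(var_all / total_variance(f)) for f in fields]
combined = np.hstack([w * f for w, f in zip(field_weights, fields)])
weighted_vars = [total_variance(w * f) for w, f in zip(field_weights, fields)]
```

After weighting, both fields carry the same total variance, so the 50-point field no longer dominates the 10-point field simply by having more points.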

Principal Component Analysis
  • Used principally for data compression and filtering, often as a first step to other analyses; direct physical interpretation is VERY limited.
  • The form most commonly used in climate studies (S-mode) starts with n maps (or groups of maps) z(x,t), t = 1,…,n, each with m data points x and with the period-of-record means removed.
  • The maps are decomposed into a linear combination of map patterns; the first pattern explains the most variance, the second is orthogonal to the first and explains the second most variance, etc.
Principal Component Analysis
  • $z(x,t) = \sum_{i=1}^{N} a_i(t)\, e_i(x)$, with $N = \min(m, n)$
  • $z(x,t)$: Original maps, linear combinations of fixed patterns $e_i(x)$ with time-dependent weights $a_i(t)$
  • $a_i(t)$: Principal component scores (time series), the projections of the maps onto the eigenvectors
  • $e_i(x)$: Principal component loadings (map patterns), also eigenvectors of the covariance matrix of z
  • $\lambda_i$: Eigenvalues of the covariance matrix of z
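The quantities defined above can be computed in a few lines. This sketch uses an SVD of the anomaly matrix, one common route to S-mode PCA and an assumption here, since the slides do not say how the decomposition was computed. The data are synthetic and the array names (`z`, `e`, `a`, `lam`) simply mirror the symbols in the list.

```python
import numpy as np

rng = np.random.default_rng(4)
n, m = 60, 25                                   # n maps (times), m data points
z = rng.standard_normal((n, m)) @ rng.standard_normal((m, m))
z -= z.mean(axis=0)                             # remove period-of-record means

# SVD of the (time x space) anomaly matrix: z = U S V^T.
u, s, vt = np.linalg.svd(z, full_matrices=False)
N = min(m, n)
e = vt.T                # loadings e_i(x): columns are unit-length eigenvectors
a = u * s               # scores a_i(t): projections of the maps onto e_i
lam = s**2 / (n - 1)    # eigenvalues of the covariance matrix of z

cov = z.T @ z / (n - 1)
explained = lam / lam.sum()                     # fraction of variance per mode
```

Summing `a[:, i, None] * e[:, i]` over all N modes reproduces `z` exactly, the patterns are mutually orthogonal, and the score time series are mutually uncorrelated with variances `lam`.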
Principal Component Analysis

4. Example of first four patterns of 3-day precipitation for May-August over the central US (Richman and Lamb, 1985). The sequence of patterns is seen repeatedly in other analyses and can be considered an artifact of the geometry of PCA:

Principal Component Analysis

5. All of the patterns (the e’s) are orthogonal, and the leading ones reflect the data points with the most variance. The eigenvalues give these variances; the first four for the Richman and Lamb patterns are 11.13%, 9.33%, 5.55%, and 4.54%.

6. Usually (always when the PCA is on the correlation matrix) the numbers on the maps are correlations of the original data series with the corresponding scores, so their squares represent explained variance. In the correlation-matrix context:

(a) a point with 0.5 is more than 6 times more important than a point with 0.2, a point with 0.8 more than 7 times more important than one with 0.3, etc.;

(b) summations of the squares over the maps give the total variances listed in 5 above;

(c) comparing the squared central values within closed contours allows practical discrimination between monopoles, dipoles, etc.
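Points (a) and (b) can be checked numerically. In this sketch (synthetic data, hypothetical names) the loadings are eigenvectors of the correlation matrix scaled by the square root of the eigenvalue, which is the scaling under which map values are correlations with the scores.

```python
import numpy as np

rng = np.random.default_rng(5)
n, m = 300, 6
z = rng.standard_normal((n, m)) @ rng.standard_normal((m, m))
z = (z - z.mean(axis=0)) / z.std(axis=0, ddof=1)    # standardized series

corr = z.T @ z / (n - 1)                            # correlation matrix
lam, e = np.linalg.eigh(corr)
lam, e = lam[::-1], e[:, ::-1]                      # descending order
scores = z @ e

# Eigenvectors scaled by sqrt(eigenvalue) are the correlations between each
# original series (rows) and each score series (columns).
loadings = e * np.sqrt(lam)
direct = np.array([[np.corrcoef(z[:, j], scores[:, i])[0, 1]
                    for i in range(m)] for j in range(m)])

# (b) Squared loadings summed over the map give the variance of each mode.
explained = (loadings**2).sum(axis=0)

# (a) Relative importance of two points is the ratio of squared loadings.
ratio_a = 0.5**2 / 0.2**2      # 6.25, i.e. "more than 6 times"
ratio_b = 0.8**2 / 0.3**2      # about 7.1, i.e. "more than 7 times"
```

The ratios make the point of (a): importance goes with the square of the plotted value, so modest-looking differences on the map are large differences in explained variance.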

Principal Component Analysis

7. The time series that go with the patterns (the a’s) are uncorrelated (i.e. not collinear), so they are desirable for multiple linear regression.

8. To compress or filter the data some of the patterns must be thrown out, i.e. the series must be truncated; this is an ART (see O’Lenic and Livezey, 1988 for the best approach I know).

In these applications over-truncation (throwing baby out with the bath water) is of far more concern than under-truncation (retention of some noise). As a pre-step for rotation, CCA, etc., both should be of concern (see below).
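Mechanically, truncation just means reconstructing the data from the leading k modes. The sketch below shows only that mechanics on synthetic data with two planted signals; it is not the O’Lenic and Livezey (1988) procedure for choosing k, and the choice k = 2 is known here only because the data were built that way.

```python
import numpy as np

rng = np.random.default_rng(6)
n, m = 200, 30
t = np.arange(n)

# Two hypothetical large-scale signals plus white noise.
pattern1 = np.sin(np.linspace(0, np.pi, m))
pattern2 = np.cos(np.linspace(0, np.pi, m))
signal = (3.0 * np.outer(np.sin(2 * np.pi * t / 50), pattern1)
          + 2.0 * np.outer(np.cos(2 * np.pi * t / 23), pattern2))
z = signal + 0.5 * rng.standard_normal((n, m))
z -= z.mean(axis=0)

u, s, vt = np.linalg.svd(z, full_matrices=False)
lam = s**2 / (n - 1)
cum_explained = np.cumsum(lam) / lam.sum()      # cumulative variance fraction

def truncate(k):
    """Reconstruct the data from the k leading modes only."""
    return (u[:, :k] * s[:k]) @ vt[:k]

filtered = truncate(2)
# Variance discarded by the truncation equals the sum of the dropped eigenvalues.
residual_var = ((z - filtered)**2).sum() / (n - 1)
```

Inspecting `cum_explained` for different k is one simple way to experiment before committing to a truncation point.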

Principal Component Analysis

9. Physical interpretation of other than the leading PC pattern is usually unwarranted, and this is often the case for the first as well. Richman (1986) shows this for the example in two ways. First he splits the domain in two and does separate PCA on each. Here’s the result for the first PCA mode. Note that the first mode for the southern domain (a monopole covering the domain) is not reproduced in the full domain analysis:

Principal Component Analysis

Next he computes the one-point teleconnection pattern for the point with the largest loading on each pattern. Here’s the result for the second PCA mode. The PCA mode is a dipole, while the teleconnection pattern (reflecting the physical covariance structure around the point) is a monopole:
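A one-point teleconnection pattern is simply the map of correlations between one base point and every other point. The sketch below builds one for the point with the largest loading on a PCA mode and compares the two maps with a pattern correlation. The data, the planted regional signal, and the helper `pattern_correlation` are hypothetical, and the example uses the first mode rather than Richman's second.

```python
import numpy as np

rng = np.random.default_rng(7)
n, m = 250, 40
z = rng.standard_normal((n, m))
# Add a hypothetical regional signal shared by points 10..19.
z[:, 10:20] += 2.0 * rng.standard_normal((n, 1))
z -= z.mean(axis=0)

# Leading PCA pattern from the correlation matrix (eigh sorts ascending).
corr = np.corrcoef(z, rowvar=False)
lam, e = np.linalg.eigh(corr)
pattern = e[:, -1] * np.sqrt(lam[-1])           # loadings of mode 1

# Base point: the point with the largest absolute loading on the pattern.
base = int(np.argmax(np.abs(pattern)))

# One-point teleconnection map: correlation of the base point with all points.
teleconnection = corr[base]

def pattern_correlation(p, q):
    """Spatial correlation between two maps."""
    return np.corrcoef(p, q)[0, 1]

# The sign of an eigenvector is arbitrary, so compare in absolute value.
agreement = abs(pattern_correlation(pattern, teleconnection))
```

In this planted-monopole case the PCA mode and the teleconnection map agree closely; the slide's example is the interesting opposite case, where the two disagree and the PCA mode is therefore suspect.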

Principal Component Analysis

10. The North et al. (1982) test determines whether two consecutive patterns can reasonably be interpreted as distinct patterns or separate signals. It assumes the n samples are independent (heuristically adjust n downward for dependence).

11. Other kinds of PCA:

Combined (CPCA) – more than one mapped variable;

Extended (EPCA) – group of maps of same variable at different lags to capture pattern evolution (MSSA is a variant);

Rotated (RPCA) – to reduce sampling error and improve physical representativeness.
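The formula for the North et al. (1982) test did not survive in this transcript. As commonly stated, the rule of thumb puts the sampling error of an eigenvalue at roughly λ·sqrt(2/n) and treats two neighbouring modes as distinct only if their eigenvalues differ by more than that error. The sketch below applies this to the four percentages quoted earlier. The function name, the separation convention (gap compared with the error of the larger eigenvalue), and both sample sizes are assumptions for illustration, not values from Richman and Lamb (1985).

```python
import numpy as np

def north_rule_of_thumb(eigenvalues, n_eff):
    """Sampling errors lam * sqrt(2 / n_eff) and a separation flag per pair.

    separated[i] is True when eigenvalue i exceeds eigenvalue i+1 by more
    than the sampling error of eigenvalue i (one common convention).
    """
    lam = np.asarray(eigenvalues, dtype=float)
    errors = lam * np.sqrt(2.0 / n_eff)
    gaps = lam[:-1] - lam[1:]
    separated = gaps > errors[:-1]
    return errors, separated

# Explained-variance percentages quoted for the first four patterns.
lam_percent = [11.13, 9.33, 5.55, 4.54]

# Hypothetical effective sample sizes, chosen only to show the sensitivity.
errors_50, separated_50 = north_rule_of_thumb(lam_percent, n_eff=50)
errors_100, separated_100 = north_rule_of_thumb(lam_percent, n_eff=100)
```

With the smaller hypothetical sample size, modes 1 and 2 (and 3 and 4) fall within each other's error bars; with the larger one all three pairs separate, which is why the downward adjustment of n for dependence matters.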

Rotation

1. Rotation, i.e. the linear transformation of a truncated set of patterns (Richman, 1986), should be considered in many problems when patterns with minimum sampling variability, little domain dependence, and increased physical relevance are needed.

2. Note the robustness of rotated patterns in Richman’s split domain example (all patterns are present in both analyses):

Rotation

Now compare rotated mode 2 and its corresponding teleconnection pattern (both are monopoles with similar scales):

Rotation

3. Barnston and Livezey (1987) compared 120 monthly 700 mb height PCA and RPCA patterns with their corresponding one-point teleconnection patterns; the average pattern correlations were 0.69 and 0.90, respectively. They also used sensitivity tests to demonstrate dramatic reductions in sampling error.

Barnston and Livezey (1987) RPCA Patterns

  • Pacific North America
  • North Atlantic Oscillation (a dipole!)
  • Western Pacific Oscillation
  • Tropical Northern Hemisphere

Rotation

4. The most likely reason for the success of rotation is the relaxation of the geometrical and mathematical constraints on the analysis, i.e. the data can speak more for themselves.

In a commonly used variant of varimax, where the eigenvectors are weighted by the square root of the eigenvalue, the resulting patterns do not have to be orthogonal and the resulting time series do not have to be independent (Jolliffe, 1995).
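How the normalization affects what survives rotation can be explored with a small varimax routine. This is a minimal sketch of the widely used SVD-based iteration for raw varimax, applied to synthetic data with two planted regional signals. The function, the data, and the truncation at k = 2 are assumptions for illustration, not the procedure of any paper cited here.

```python
import numpy as np

def varimax(loadings, max_iter=200, tol=1e-10):
    """Orthogonal (raw) varimax rotation of an (m points x k modes) matrix."""
    m, k = loadings.shape
    rotation = np.eye(k)
    objective = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient of the varimax criterion with respect to the rotation.
        grad = loadings.T @ (rotated**3 - rotated * (rotated**2).sum(axis=0) / m)
        u, s, vt = np.linalg.svd(grad)
        rotation = u @ vt                       # nearest orthogonal matrix
        new_objective = s.sum()
        if new_objective < objective * (1 + tol):
            break
        objective = new_objective
    return loadings @ rotation, rotation

rng = np.random.default_rng(8)
n, m, k = 400, 20, 2
z = 0.3 * rng.standard_normal((n, m))
z[:, :10] += rng.standard_normal((n, 1))        # hypothetical regional signal A
z[:, 10:] += 0.9 * rng.standard_normal((n, 1))  # hypothetical regional signal B
z -= z.mean(axis=0)

u, s, vt = np.linalg.svd(z, full_matrices=False)
lam = s[:k]**2 / (n - 1)
e = vt[:k].T                                    # unit-length eigenvectors
weighted = e * np.sqrt(lam)                     # scaled by sqrt(eigenvalue)

rot_unit, r_unit = varimax(e)
rot_weighted, r_weighted = varimax(weighted)

gram_unit = rot_unit.T @ rot_unit               # identity: patterns stay orthogonal
gram_weighted = rot_weighted.T @ rot_weighted   # R^T diag(lam) R, generally not diagonal

# Scores from projecting the data onto the rotated unit-length patterns.
scores_unit = z @ rot_unit
score_cov = scores_unit.T @ scores_unit / (n - 1)   # R^T diag(lam) R
```

In this sketch, rotating unit-length eigenvectors keeps the patterns orthogonal but the projected scores are in general correlated, while rotating the eigenvalue-weighted eigenvectors gives patterns whose inner products (`gram_weighted`) are in general non-zero. Which property is given up depends on the normalization, which is the subject of Jolliffe (1995).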

Under- and Over-Rotation

5. Under-rotation (truncating too many modes before rotating) can discard signal, while over-rotation (truncating too few) can over-regionalize signals (see O’Lenic and Livezey, 1988).

Map (a) here is a dipole, but (b) and (c) are monopoles.

Conclusions
  • PCA is an extremely useful linear tool for data compression, orthogonalization, and filtering
  • PCA results are mathematical and (for even the first mode) don’t necessarily have to have physical relevance
    • Even when the first mode has physical relevance its representation may be flawed (e.g. the “Arctic Oscillation”)
  • PCA results can be critically impacted by choices of domain, grid, scaling, etc.
  • Effective PC truncation requires insight and experimentation
  • Rotation can enhance physical relevance and reduce sampling variability
    • Under- and over-rotation can negate these gains
  • Just because an area on a map has a closed loading contour doesn’t make it part of a “dipole” or “tripole”