
Techniques for studying correlation and covariance structure

Principal Components Analysis (PCA)

Factor Analysis

Principal Component Analysis

Let $\mathbf{x}$ have a $p$-variate Normal distribution with mean vector $\boldsymbol{\mu}$ and covariance matrix $S$.

Definition: The linear combination

$$C_1 = \mathbf{a}_1'\mathbf{x}$$

is called the first principal component if $\mathbf{a}_1$ is chosen to maximize

$$\operatorname{Var}(C_1) = \mathbf{a}_1' S \mathbf{a}_1$$

subject to $\mathbf{a}_1'\mathbf{a}_1 = 1$.

Consider maximizing $\mathbf{a}' S \mathbf{a}$ subject to $\mathbf{a}'\mathbf{a} = 1$. Using the Lagrange multiplier technique, let

$$Q = \mathbf{a}' S \mathbf{a} - \lambda(\mathbf{a}'\mathbf{a} - 1).$$

Now

$$\frac{\partial Q}{\partial \mathbf{a}} = 2S\mathbf{a} - 2\lambda\mathbf{a} = \mathbf{0} \quad\Rightarrow\quad S\mathbf{a} = \lambda\mathbf{a}$$

and

$$\frac{\partial Q}{\partial \lambda} = -(\mathbf{a}'\mathbf{a} - 1) = 0 \quad\Rightarrow\quad \mathbf{a}'\mathbf{a} = 1.$$

Thus $\mathbf{a}$ must be a unit-length eigenvector of $S$, and since $\operatorname{Var}(C_1) = \mathbf{a}' S \mathbf{a} = \lambda\,\mathbf{a}'\mathbf{a} = \lambda$, the maximum is attained at the eigenvector belonging to the largest eigenvalue.

Summary

$C_1 = \mathbf{a}_1'\mathbf{x}$ is the first principal component if $\mathbf{a}_1$ is the eigenvector (of length 1) of $S$ associated with the largest eigenvalue $\lambda_1$ of $S$.
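This characterization is easy to check numerically. A minimal sketch in Python with NumPy, using a small made-up covariance matrix (the matrix is an assumption for illustration, not data from the slides):

```python
import numpy as np

# Illustrative 2x2 covariance matrix (invented for this sketch).
S = np.array([[4.0, 2.0],
              [2.0, 3.0]])

# For a symmetric matrix, eigh returns eigenvalues in ascending order.
eigvals, eigvecs = np.linalg.eigh(S)
lam1 = eigvals[-1]          # largest eigenvalue, lambda_1
a1 = eigvecs[:, -1]         # corresponding unit-length eigenvector, a_1

# a_1 maximizes a' S a subject to a'a = 1, and Var(C1) = lambda_1.
assert np.isclose(a1 @ a1, 1.0)          # length-1 constraint
assert np.isclose(a1 @ S @ a1, lam1)     # Var(C1) = a1' S a1 = lambda_1
```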

The complete set of Principal components

Let $\mathbf{x}$ have a $p$-variate Normal distribution with mean vector $\boldsymbol{\mu}$ and covariance matrix $S$.

Definition: The set of linear combinations $C_i = \mathbf{a}_i'\mathbf{x}$, $i = 1, \dots, p$, are called the principal components of $\mathbf{x}$ if $\mathbf{a}_1, \dots, \mathbf{a}_p$ are chosen such that $\mathbf{a}_i'\mathbf{a}_i = 1$ and

  • Var(C1) is maximized.
  • Var(Ci) is maximized subject to Ci being independent of C1, …, Ci-1 (the previous i - 1 principal components).

Note: we have already shown that $\mathbf{a}_1$ is the eigenvector of $S$ associated with the largest eigenvalue $\lambda_1$ of the covariance matrix, and $\operatorname{Var}(C_1) = \lambda_1$.

We will now show that $\mathbf{a}_i$ is the eigenvector of $S$ associated with the $i$-th largest eigenvalue $\lambda_i$ of the covariance matrix, and that $\operatorname{Var}(C_i) = \lambda_i$.

Proof (by induction: assume the result true for $1, \dots, i-1$, then prove it true for $i$).

Now, for $j < i$, the pair $(C_j, C_i) = (\mathbf{a}_j'\mathbf{x}, \mathbf{a}_i'\mathbf{x})$ has covariance

$$\operatorname{Cov}(C_j, C_i) = \mathbf{a}_j' S \mathbf{a}_i = \lambda_j\,\mathbf{a}_j'\mathbf{a}_i$$

by the induction hypothesis. Hence $C_i$ is independent of $C_1, \dots, C_{i-1}$ if $\mathbf{a}_j'\mathbf{a}_i = 0$ for $j = 1, \dots, i-1$.

We want to maximize $\operatorname{Var}(C_i) = \mathbf{a}' S \mathbf{a}$ subject to $\mathbf{a}'\mathbf{a} = 1$ and $\mathbf{a}_j'\mathbf{a} = 0$ for $j < i$. Let

$$Q = \mathbf{a}' S \mathbf{a} - \lambda(\mathbf{a}'\mathbf{a} - 1) - \sum_{j<i} \phi_j\,\mathbf{a}_j'\mathbf{a}.$$

Now

$$\frac{\partial Q}{\partial \mathbf{a}} = 2S\mathbf{a} - 2\lambda\mathbf{a} - \sum_{j<i}\phi_j \mathbf{a}_j = \mathbf{0},$$

and setting the derivatives with respect to $\lambda$ and the $\phi_j$ to zero recovers the constraints. Hence

$$2S\mathbf{a} - 2\lambda\mathbf{a} = \sum_{j<i}\phi_j \mathbf{a}_j. \qquad (1)$$

Also, for $j < i$, premultiplying (1) by $\mathbf{a}_j'$ gives $2\mathbf{a}_j' S \mathbf{a} - 2\lambda\,\mathbf{a}_j'\mathbf{a} = \phi_j$; since $\mathbf{a}_j' S = \lambda_j \mathbf{a}_j'$ and $\mathbf{a}_j'\mathbf{a} = 0$, both terms on the left vanish. Hence $\phi_j = 0$ for $j < i$, and equation (1) becomes

$$S\mathbf{a} = \lambda\mathbf{a},$$

so $\mathbf{a}_i$ is again a unit eigenvector of $S$; maximizing $\operatorname{Var}(C_i) = \lambda$ subject to the orthogonality constraints picks out the $i$-th largest eigenvalue $\lambda_i$.

Thus $\mathbf{a}_1, \dots, \mathbf{a}_p$ are the eigenvectors of $S$ associated with the eigenvalues $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p$, and $C_i = \mathbf{a}_i'\mathbf{x}$ with $\operatorname{Var}(C_i) = \lambda_i$, where

  • Var(C1) is maximized.
  • Var(Ci) is maximized subject to Ci being independent of C1, …, Ci-1 (the previous i - 1 principal components).

Recall that any positive definite matrix $S$ can be written

$$S = \lambda_1 \mathbf{a}_1\mathbf{a}_1' + \lambda_2 \mathbf{a}_2\mathbf{a}_2' + \cdots + \lambda_p \mathbf{a}_p\mathbf{a}_p'$$

where $\mathbf{a}_1, \dots, \mathbf{a}_p$ are eigenvectors of $S$ of length 1 and $\lambda_1 \ge \cdots \ge \lambda_p$ are the eigenvalues of $S$.
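This spectral decomposition can be verified directly; a short sketch (again with an illustrative matrix, not slide data):

```python
import numpy as np

# Illustrative symmetric positive definite matrix (invented).
S = np.array([[4.0, 1.0, 0.5],
              [1.0, 3.0, 0.2],
              [0.5, 0.2, 2.0]])

eigvals, eigvecs = np.linalg.eigh(S)

# Rebuild S as the sum of lambda_i * a_i a_i' over unit eigenvectors a_i.
S_rebuilt = sum(lam * np.outer(a, a)
                for lam, a in zip(eigvals, eigvecs.T))

assert np.allclose(S, S_rebuilt)
# The eigenvectors are orthonormal.
assert np.allclose(eigvecs.T @ eigvecs, np.eye(3))
```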

Example

In this example wildlife (moose) population density was measured over time (once a year) in three areas.

[Figure: the three study areas — Area 1, Area 2, Area 3]

The Sample Statistics

The mean vector

The covariance matrix

The correlation matrix

Principal Component Analysis

The eigenvalues of S

The eigenvectors of S

The principal components
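The numerical values on these slides did not survive in the transcript, so the following sketch runs the same analysis on synthetic densities for three areas (all numbers are invented for illustration, not the moose data):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic yearly densities for 3 areas over 20 years (invented data).
n, p = 20, 3
X = rng.normal(size=(n, p)) @ np.array([[1.0, 0.6, 0.3],
                                        [0.0, 0.8, 0.4],
                                        [0.0, 0.0, 0.5]])

xbar = X.mean(axis=0)             # the mean vector
S = np.cov(X, rowvar=False)       # the covariance matrix

eigvals, eigvecs = np.linalg.eigh(S)
order = np.argsort(eigvals)[::-1]            # lambda_1 >= ... >= lambda_p
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Principal components: C_i = a_i' x for each observation (centered).
C = (X - xbar) @ eigvecs

# The sample variances of the components equal the eigenvalues of S.
assert np.allclose(np.var(C, axis=0, ddof=1), eigvals)
```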

[Figures (three slides): plots of the data and principal components, labeled Area 1, Area 2, Area 3]

Graphical Picture of Principal Components

Multivariate Normal data fall in an ellipsoidal pattern. The shape and orientation of the ellipsoid are determined by the covariance matrix $S$. The eigenvectors of $S$ give the directions of the axes of the ellipsoid, and the eigenvalues determine the lengths of those axes (the half-lengths are proportional to the square roots of the eigenvalues).

Recall that if $S$ is a positive definite matrix, then

$$S = P D P'$$

where $P$ is an orthogonal matrix ($P'P = PP' = I$) with columns equal to the eigenvectors of $S$, and $D$ is a diagonal matrix with diagonal elements equal to the eigenvalues of $S$.

An orthogonal matrix rotates vectors; thus $P'$ rotates the vector $\mathbf{x}$ into the vector of principal components

$$\mathbf{C} = P'\mathbf{x}, \quad\text{with components } C_i = \mathbf{a}_i'\mathbf{x}.$$

Also

$$\operatorname{tr}(S) = \operatorname{tr}(PDP') = \operatorname{tr}(DP'P) = \operatorname{tr}(D) = \lambda_1 + \lambda_2 + \cdots + \lambda_p.$$
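A quick numerical check of the rotation and the trace identity (illustrative matrix, not from the slides):

```python
import numpy as np

# Illustrative covariance matrix (invented).
S = np.array([[4.0, 1.0],
              [1.0, 3.0]])

eigvals, eigvecs = np.linalg.eigh(S)
P, D = eigvecs, np.diag(eigvals)

# P is orthogonal (P'P = I) and S = P D P'.
assert np.allclose(P.T @ P, np.eye(2))
assert np.allclose(P @ D @ P.T, S)

# The rotated variables C = P'x have covariance P' S P = D (diagonal),
# and tr(S) = tr(D) = sum of the eigenvalues.
assert np.allclose(P.T @ S @ P, D)
assert np.isclose(np.trace(S), eigvals.sum())
```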

The ratio

$$\frac{\lambda_i}{\lambda_1 + \lambda_2 + \cdots + \lambda_p}$$

denotes the proportion of variance explained by the $i$-th principal component $C_i$.
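In code, the proportions of variance explained are simply the eigenvalues divided by their sum; a sketch with an illustrative covariance matrix (not slide data):

```python
import numpy as np

# Illustrative positive definite covariance matrix (invented).
S = np.array([[4.0, 2.0, 0.0],
              [2.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

eigvals = np.linalg.eigvalsh(S)[::-1]     # lambda_1 >= ... >= lambda_p
explained = eigvals / eigvals.sum()       # lambda_i / (lambda_1 + ... + lambda_p)

# The proportions sum to 1, and the denominator equals tr(S),
# the total variance of the original variables.
assert np.isclose(explained.sum(), 1.0)
assert np.isclose(eigvals.sum(), np.trace(S))
```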


Comment: If, instead of the covariance matrix $S$, the correlation matrix $R$ is used to extract the principal components, then the principal components are defined in terms of the standard scores of the observations:

$$z_{ij} = \frac{x_{ij} - \bar{x}_j}{s_j}.$$

The correlation matrix is the covariance matrix of the standard scores of the observations.
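A sketch of this comment: the correlation matrix of the data coincides with the covariance matrix of the standard scores, so PCA on $R$ is covariance-based PCA on the standardized data (synthetic data, invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
# Variables on very different scales (invented data).
X = rng.normal(size=(50, 3)) * np.array([1.0, 10.0, 100.0])

# Standard scores: z_ij = (x_ij - xbar_j) / s_j
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

R = np.corrcoef(X, rowvar=False)     # correlation matrix of X
S_z = np.cov(Z, rowvar=False)        # covariance matrix of the standard scores

# The correlation matrix of X equals the covariance matrix of the standard scores.
assert np.allclose(R, S_z)

# Hence PCA extracted from R has the same eigenvalues as covariance-based
# PCA on the standardized data.
eigvals_R = np.linalg.eigvalsh(R)
eigvals_Z = np.linalg.eigvalsh(S_z)
assert np.allclose(eigvals_R, eigvals_Z)
```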

Recall: computation of the eigenvalues and eigenvectors of $S$.

From the spectral decomposition $S = \lambda_1\mathbf{a}_1\mathbf{a}_1' + \cdots + \lambda_p\mathbf{a}_p\mathbf{a}_p'$ and the orthonormality of the eigenvectors,

$$S^2 = \lambda_1^2\mathbf{a}_1\mathbf{a}_1' + \cdots + \lambda_p^2\mathbf{a}_p\mathbf{a}_p'.$$

Continuing, we see that

$$S^n = \lambda_1^n\mathbf{a}_1\mathbf{a}_1' + \cdots + \lambda_p^n\mathbf{a}_p\mathbf{a}_p'.$$

For large values of $n$, since $\lambda_1$ is the largest eigenvalue,

$$S^n \approx \lambda_1^n\,\mathbf{a}_1\mathbf{a}_1'.$$

The algorithm for computing the eigenvectors

  • Compute successive powers of $S$, rescaling at each step so that the elements do not become too large in value; i.e. rescale so that the largest element is 1. The rescaled power converges to a matrix proportional to $\mathbf{a}_1\mathbf{a}_1'$, from which $\mathbf{a}_1$ can be read off and normalized to length 1.
  • Compute $\lambda_1$ using $S\mathbf{a}_1 = \lambda_1\mathbf{a}_1$.
  • Repeat using the matrix $S_2 = S - \lambda_1\mathbf{a}_1\mathbf{a}_1'$, whose largest eigenvalue is $\lambda_2$ with eigenvector $\mathbf{a}_2$.
  • Continue with $i = 2, \dots, p-1$ using the matrix $S_{i+1} = S_i - \lambda_i\mathbf{a}_i\mathbf{a}_i'$.
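The steps above can be sketched as power iteration with deflation. This version iterates on a vector rather than on matrix powers, but uses the same rescaling rule (largest element set to 1) and the same deflation matrix $S_{i+1} = S_i - \lambda_i\mathbf{a}_i\mathbf{a}_i'$; it is a sketch, not the slides' Excel implementation:

```python
import numpy as np

def largest_eigenpair(S, n_iter=500):
    """Power method: repeatedly multiply by S, rescaling so the
    largest element of the iterate has magnitude 1."""
    x = np.ones(S.shape[0])
    for _ in range(n_iter):
        x = S @ x
        x = x / np.abs(x).max()      # rescale: largest element is 1
    a = x / np.linalg.norm(x)        # normalize to length 1
    lam = a @ S @ a                  # lambda from S a = lambda a
    return lam, a

def all_eigenpairs(S):
    """Deflation: after finding (lambda_i, a_i), continue with
    S_{i+1} = S_i - lambda_i * a_i a_i'."""
    S_i = S.astype(float).copy()
    pairs = []
    for _ in range(S.shape[0]):
        lam, a = largest_eigenpair(S_i)
        pairs.append((lam, a))
        S_i = S_i - lam * np.outer(a, a)
    return pairs

# Illustrative covariance matrix (invented).
S = np.array([[4.0, 2.0],
              [2.0, 3.0]])
pairs = all_eigenpairs(S)
lams = [lam for lam, _ in pairs]

# The recovered eigenvalues match a direct eigendecomposition.
assert np.allclose(sorted(lams, reverse=True), np.linalg.eigvalsh(S)[::-1])
```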

Example – Using Excel - Eigen