The Multiple Correlation Coefficient

1 / 37

# The Multiple Correlation Coefficient - PowerPoint PPT Presentation

The Multiple Correlation Coefficient. Definition. has ( p +1)-variate Normal distribution. with mean vector. and Covariance matrix. We are interested if the variable y is independent of the vector.

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.

## PowerPoint Slideshow about 'The Multiple Correlation Coefficient' - locke

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript

### The Multiple Correlation Coefficient

Definition

has (p +1)-variate Normal distribution

with mean vector

and Covariance matrix

We are interested if the variable y is independent of the vector

The multiple correlation coefficient is the maximum correlation between y and a linear combination of the components of

Derivation

This vector has a bivariate Normal distribution

with mean vector

and Covariance matrix

We are interested if the variable y is independent of the vector

The multiple correlation coefficient is the maximum correlation between y and a linear combination of the components of

The multiple correlation coefficient is the maximum correlation between y and

The correlation between y and

Thus we want to choose to maximize

Equivalently

The sample Multiple correlation coefficient

Then the sample Multiple correlation coefficient is

Testing for independence between y and

The test statistic

If independence is true then the test statistic F will have an F-distributions with n1 = p degrees of freedom in the numerator and n1 = n – p + 1degrees of freedom in the denominator

The test is to reject independence if:

Other tests for Independence

Test for zero correlation (Independence between a two variables)

The test statistic

If independence is true then the test statistic t will have a t -distributions with n = n –2degrees of freedom.

The test is to reject independence if:

Test for non-zero correlation (H0: r = r0 )

The test statistic

If H0 is true the test statistic z will have approximately a Standard Normal distribution

We then rejectH0 if:

Test for zero partial correlation correlation (Conditional independence between a two variables given a set of p Independent variables)

The test statistic

= the partial correlation between yi and yj given x1, …, xp.

If independence is true then the test statistic t will have a t -distributions with n = n – p - 2degrees of freedom.

The test is to reject independence if:

Test for non-zero partial correlation

The test statistic

If H0 is true the test statistic z will have approximately a Standard Normal distribution

We then rejectH0 if:

### Canonical Correlation Analysis

The problem

Quite often when one has collected data on several variables.

The variables are grouped into two (or more) sets of variables and the researcher is interested in whether one set of variables is independent of the other set.

In addition if it is found that the two sets of variates are dependent, it is then important to describe and understand the nature of this dependence.

The appropriate statistical procedure in this case is called Canonical Correlation Analysis.

Canonical Correlation: An Example

In the following study the researcher was interested in whether specific instructions on how to relax when taking tests and how to increase Motivation , would affect performance on standardized achievement tests

• Language and
• Mathematics
A group of 65 third- and fourth-grade students were rated after the instruction and immediately prior taking the Scholastic Achievement tests on:
• how relaxed they were (X1) and
• how motivated they were (X2).

In addition data was collected on the three achievement tests

• Language (Y2) and
• Mathematics (Y3).

The data were tabulated on the next page

Definition: (Canonical variates and Canonical correlations)

have p-variate Normal distribution

and

with

Let

and

be such that U1 and V1 have achieved the maximum correlation f1.

Then U1 and V1 are called the first pair of canonical variates and f1 is called the first canonical correlation coefficient.

Now

Thus

has covariance matrix

Now

Thus

has covariance matrix

hence

Thus we want to choose

so that

is at a maximum

or

is at a maximum

Let

Thus

This shows that

is an eigenvector of

k is the largest eigenvalue of

and

is the eigenvector associated with the largest eigenvalue.

Also

and

Summary:

The first pair of canonical variates

, eigenvectors of the matrices

are found by finding

associated with the largest eigenvalue (same for both matrices)

The largest eigenvalue of the two matrices is the square of the first canonical correlation coefficientf1

The second pair of canonical variates

, so that

are found by finding

1. (U2,V2) are independent of (U1,V1).

2. The correlation between U2 and V2 is maximized

The correlation, f2, between U2 and V2 is called the second canonical correlation coefficient.

The ith pair of canonical variates

, so that

are found by finding

1. (Ui,Vi) are independent of (U1,V1), …, (Ui-1,Vi-1).

2. The correlation between Ui and Vi is maximized

The correlation, f2, between U2 and V2 is called the second canonical correlation coefficient.

Now

has covariance matrix

Now

and maximizing

Is equivalent to maximizing

subject to

Using the Lagrange multiplier technique

Now

and

also

gives the restrictions

These equations can used to show that

are eigenvectors of the matrices

associated with the 2nd largest eigenvalue (same for both matrices)

The 2ndlargest eigenvalue of the two matrices is the square of the 2nd canonical correlation coefficientf2

continuing

Coefficients for the ith pair of canonical variates,

are eigenvectors of the matrices

associated with the ith largest eigenvalue (same for both matrices)

The ithlargest eigenvalue of the two matrices is the square of the ith canonical correlation coefficientfi

Example

Variables

• relaxation Score (X1)
• motivation score (X2).
• Language (Y2) and
• Mathematics (Y3).
Summary

U1 = 0.197 Relax + 0.979 Mot

V1 = 0.504 Read + 0.900 Lang + 0.565 Math

f1 = .592

U2 = 0.980 Relax + 0.203 Mot

V2 = 0.391 Math - 0.361 Read - 0.354 Lang

f2 = .159