# Canonical Correlation Analysis (CCA)

## PowerPoint Slideshow about 'Canonical Correlation Analysis (CCA)' - constance-morse

Presentation Transcript

• This is it!

• The mother of all linear statistical analyses

• When?

• We want to find a structural relation between a set of independent variables and a set of dependent variables.

• When? (part 2)

• To what extent can one set of two or more variables be predicted or “explained” by another set of two or more variables?

• What contribution does a single variable make to the explanatory power of the set of variables to which it belongs?

• What contribution does a single variable make to predicting or “explaining” the composite of the variables in the set to which it does not belong?

• What different dynamics are involved in the ability of one variable set to “explain”, in different ways, different portions of the other variable set?

• What relative power do different canonical functions have to predict or explain relationships?

• How stable are canonical results across samples or sample subgroups?

• How closely do obtained canonical results conform to expected canonical results?

• Assumptions

• Linearity: if not, nonlinear canonical correlation analysis.

• Absence of multicollinearity: If not, Partial Least Squares (PLS) regression to reduce the space.

• Homoscedasticity: If not, data transformation.

• Normality: If not, re-sampling.

• A lot of data: Max(p, q) × 20 × number of pairs.

• Absence of outliers.

• Toy example

[Data matrix X, with columns for the IVs and the DVs]

• Z score transformation

[Standardized data matrix Z, with columns for the IVs and the DVs]
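The z-score step can be sketched in NumPy as follows. The data values are hypothetical (the slide's toy matrix is not reproduced here), and scaling by the sample standard deviation (`ddof=1`) is an assumption.

```python
import numpy as np

def zscore(X):
    """Center each column and scale it to unit sample standard deviation."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Hypothetical data: 6 cases; columns are IV1, IV2, DV1, DV2.
X = np.array([
    [1.0, 4.0, 2.0, 7.0],
    [2.0, 3.5, 2.5, 6.0],
    [3.0, 5.0, 3.5, 6.5],
    [4.0, 2.0, 3.0, 4.0],
    [5.0, 3.0, 5.0, 5.0],
    [6.0, 1.5, 4.5, 3.0],
])
Z = zscore(X)
print(Z.round(2))
```

After this step every column of Z has mean 0 and standard deviation 1, so the correlation matrix can be computed directly from Z.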

• Canonical Correlation Matrix

• Relations with other subspace methods

• Eigenvalues and eigenvectors decomposition

[Equation: definition of the matrix R; slide label: PCA]

• The roots of the eigenvalues are the canonical correlation values
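A minimal sketch of the eigen-decomposition step, assuming the common textbook form R = R_yy⁻¹ R_yx R_xx⁻¹ R_xy, since the slide's own expression for R is not recoverable here. The data are synthetic, not the toy example, and all variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, q = 200, 2, 2

# Synthetic data: the DVs depend linearly on the IVs plus noise.
X = rng.normal(size=(n, p))
Y = X @ np.array([[0.8, 0.1], [0.2, 0.5]]) + rng.normal(size=(n, q))

# Full correlation matrix, partitioned into its four blocks.
R_full = np.corrcoef(np.hstack([X, Y]), rowvar=False)
Rxx, Rxy = R_full[:p, :p], R_full[:p, p:]
Ryx, Ryy = R_full[p:, :p], R_full[p:, p:]

# Assumed form: R = Ryy^-1 Ryx Rxx^-1 Rxy (solve avoids explicit inverses).
R = np.linalg.solve(Ryy, Ryx) @ np.linalg.solve(Rxx, Rxy)

# Eigenvalues are squared canonical correlations; sort largest first.
eigvals = np.sort(np.linalg.eigvals(R).real)[::-1]
canonical_corrs = np.sqrt(np.clip(eigvals, 0.0, 1.0))
print(canonical_corrs)
```

The same eigenvalues result from the mirrored product R_xx⁻¹ R_xy R_yy⁻¹ R_yx, so either side can be used to obtain the canonical correlations.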

• Significance test for the canonical correlation

• A significant result indicates that the IV and DV sets share variance

• Procedure:

• We first test all the canonical variates together (m = 1, …, min(p, q))

• If the test is significant, we remove the first canonical variate (canonical correlate) and test the remaining ones (m = 2, …, min(p, q))

• Repeat

• Significance test for the canonical correlation

Since all canonical variables are significant, we will keep them all.
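The sequential procedure can be sketched as below, assuming Bartlett's chi-square approximation based on Wilks' lambda. The slides do not name the test statistic, so treat the formula as an assumption. The canonical correlations and sample size are hypothetical.

```python
import math

def bartlett_sequence(canonical_corrs, n, p, q):
    """Return (m, chi_square, df) for each step of the sequential test."""
    k = len(canonical_corrs)
    results = []
    for m in range(1, k + 1):
        # Wilks' lambda for canonical correlations m..k.
        wilks = math.prod(1.0 - r ** 2 for r in canonical_corrs[m - 1:])
        chi_square = -(n - 1 - (p + q + 1) / 2.0) * math.log(wilks)
        df = (p - m + 1) * (q - m + 1)
        results.append((m, chi_square, df))
    return results

# Hypothetical values: 2 IVs, 2 DVs, 50 cases.
steps = bartlett_sequence([0.9, 0.5], n=50, p=2, q=2)
for m, chi_square, df in steps:
    print(f"m={m}: chi2={chi_square:.2f}, df={df}")
```

Each chi-square value is then compared with the chi-square distribution for the listed degrees of freedom; `scipy.stats.chi2.sf(chi_square, df)` is one way to obtain a p-value.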

• Canonical Coefficients

• Analogous to regression coefficients

[Equations: B_y, computed from the eigenvectors and the correlation matrix of the dependent variables; B_x]
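One way to compute canonical coefficients is sketched below. It assumes B_y = R_yy^(-1/2) V, where V holds the eigenvectors of the symmetric matrix R_yy^(-1/2) R_yx R_xx⁻¹ R_xy R_yy^(-1/2), and B_x = R_xx⁻¹ R_xy B_y with each column divided by its canonical correlation. These formulas are an assumption and may differ from the slide's exact expressions. The data are synthetic and the function names are illustrative.

```python
import numpy as np

def inv_sqrt(S):
    """Inverse square root of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(w ** -0.5) @ V.T

def canonical_coefficients(Rxx, Rxy, Ryy):
    """Return (Bx, By, rc); assumes q <= p and nonzero canonical correlations."""
    Ryy_is = inv_sqrt(Ryy)
    M = Ryy_is @ Rxy.T @ np.linalg.solve(Rxx, Rxy) @ Ryy_is
    M = (M + M.T) / 2.0                       # remove rounding asymmetry
    eigvals, eigvecs = np.linalg.eigh(M)
    order = np.argsort(eigvals)[::-1]         # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    rc = np.sqrt(np.clip(eigvals, 0.0, 1.0))
    By = Ryy_is @ eigvecs
    Bx = np.linalg.solve(Rxx, Rxy) @ By / rc  # divide column j by rc[j]
    return Bx, By, rc

# Synthetic data (not the slide's toy example): 2 IVs, 2 DVs.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 2))
Y = X @ np.array([[0.8, 0.1], [0.2, 0.5]]) + rng.normal(size=(300, 2))
Zx = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
Zy = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)
n = len(Zx)
Rxx, Rxy, Ryy = Zx.T @ Zx / (n - 1), Zx.T @ Zy / (n - 1), Zy.T @ Zy / (n - 1)

Bx, By, rc = canonical_coefficients(Rxx, Rxy, Ryy)
U, V = Zx @ Bx, Zy @ By                       # canonical variate scores
```

With this scaling each canonical variate has unit variance, and the correlation between the j-th pair of variates (`U[:, j]`, `V[:, j]`) equals `rc[j]`.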

• Canonical Variates

• Analogous to regression coefficients

• Loading matrices A_x and A_y: correlations between the variables and the canonical variates

• Only coefficients whose absolute value exceeds 0.3 are interpreted.

Canonical correlation
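A sketch of how loadings can be computed and screened, assuming the usual definition A = R B for a set's own correlation matrix R and coefficient matrix B. The coefficients here are hypothetical, not canonical coefficients from the slides, so the function rescales by each variate's standard deviation. For properly scaled canonical coefficients that divisor is 1.

```python
import numpy as np

def loadings(R_own, B):
    """Correlations between a set's variables and the variates Z @ B."""
    variate_sd = np.sqrt(np.diag(B.T @ R_own @ B))
    return (R_own @ B) / variate_sd

rng = np.random.default_rng(2)
raw = rng.normal(size=(100, 2))
raw[:, 1] += 0.6 * raw[:, 0]                  # make the two variables correlated
Z = (raw - raw.mean(axis=0)) / raw.std(axis=0, ddof=1)
R = np.corrcoef(Z, rowvar=False)

B = np.array([[0.9, -0.4],                    # hypothetical coefficients
              [0.3,  1.0]])
A = loadings(R, B)

# Cross-check against correlations computed directly from the scores.
direct = np.corrcoef(np.hstack([Z, Z @ B]), rowvar=False)[:2, 2:]

# Keep only loadings whose absolute value exceeds 0.3.
interpreted = np.where(np.abs(A) > 0.3, A, np.nan)
print(interpreted.round(2))
```

Entries replaced by `nan` fall at or below the 0.3 threshold and would be left out of the interpretation.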

• Proportion of variance extracted

• How much variance does each of the canonical variates extract from the variables on its own side of the equation?

[Slide computations for the first and second canonical variates of each set]
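Assuming the proportion of variance is defined as the mean squared loading of a canonical variate over the variables in its own set, it can be sketched as follows. The loading matrices are hypothetical and are not the slide's values.

```python
import numpy as np

# Hypothetical loadings (rows: variables, columns: canonical variates).
Ax = np.array([[0.80, -0.60],
               [0.96,  0.28]])
Ay = np.array([[0.60,  0.80],
               [1.00,  0.00]])

def proportion_of_variance(A):
    """Mean squared loading per canonical variate (one value per column)."""
    return (A ** 2).mean(axis=0)

pv_x = proportion_of_variance(Ax)
pv_y = proportion_of_variance(Ay)
print(pv_x, pv_y)
```

When a set has as many canonical variates as variables, as in this two-by-two case, the proportions for that set sum to 1.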

• Redundancy

• How much variance do the canonical variates from the IVs extract from the DVs, and vice versa?

[Equation: rd_yx, computed using the eigenvalues]

Summary

The first canonical variate from the IVs extracts 40% of the variance in the y variables (the DVs).

The second canonical variate from the IVs extracts 30% of the variance in the y variables.

Together they extract 70% of the variance in the DVs.

The first canonical variate from the DVs extracts 49% of the variance in the x variables (the IVs).

The second canonical variate from the DVs extracts 24% of the variance in the x variables.

Together they extract 73% of the variance in the IVs.
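Redundancy figures like those in the summary can be sketched under the common definition rd = pv × r_c², where r_c² is the eigenvalue of the corresponding canonical variate. That definition is an assumption, since the slide's formula is not recoverable. The inputs below are hypothetical and do not reproduce the percentages above.

```python
import numpy as np

# Hypothetical inputs, not the values behind the slide's summary.
pv_x = np.array([0.7808, 0.2192])   # proportion of variance, IV side
pv_y = np.array([0.68, 0.32])       # proportion of variance, DV side
rc = np.array([0.9, 0.5])           # canonical correlations

eigenvalues = rc ** 2               # squared canonical correlations

# Variance in the DVs extracted by the canonical variates from the IVs.
rd_dv_from_iv = pv_y * eigenvalues
# Variance in the IVs extracted by the canonical variates from the DVs.
rd_iv_from_dv = pv_x * eigenvalues

print(rd_dv_from_iv, rd_dv_from_iv.sum())
print(rd_iv_from_dv, rd_iv_from_dv.sum())
```

Summing each array gives the total redundancy for that direction, which is how the "together" percentages in the summary are obtained.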

• Rotation

• A rotation does not influence the variance proportion or the redundancy.