Canonical Correlation Analysis (CCA)

# Canonical Correlation Analysis (CCA) - PowerPoint PPT Presentation


## PowerPoint Slideshow about 'Canonical Correlation Analysis (CCA)' - constance-morse

Presentation Transcript
CCA
• This is it!
• The mother of all linear statistical analyses
• When ?
• We want to find a structural relation between a set of independent variables and a set of dependent variables.
CCA
• When ? (part 2)
• To what extent can one set of two or more variables be predicted or “explained” by another set of two or more variables?
• What contribution does a single variable make to the explanatory power of the set of variables to which it belongs?
• What contribution does a single variable make to predicting or “explaining” the composite of the variables in the set to which it does not belong?
• What different dynamics are involved in the ability of one variable set to “explain”, in different ways, different portions of the other variable set?
• What relative power do different canonical functions have to predict or explain relationships?
• How stable are canonical results across samples or sample subgroups?
• How closely do obtained canonical results conform to expected canonical results?
CCA
• Assumptions
• Linearity: if not, nonlinear canonical correlation analysis.
• Absence of multicollinearity: If not, Partial Least Squares (PLS) regression to reduce the space.
• Homoscedasticity: If not, data transformation.
• Normality: If not, re-sampling.
• A lot of data: max(p, q) × 20 × nb of pairs.
• Absence of outliers.
CCA
• Toy example

[Figure: data matrix X, made up of the IV columns and the DV columns]

CCA
• Z score transformation

[Figure: matrix Z, the z-scored IV and DV columns of X]
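The Z score transformation can be sketched in a few lines of NumPy. The data values below are invented for illustration (the numbers from the original toy example are not available), and the use of the sample standard deviation (`ddof=1`) is an assumption.

```python
import numpy as np

# Hypothetical toy data (values invented for illustration):
# 8 cases, columns = IV1, IV2, DV1, DV2.
X = np.array([
    [1.0, 4.0, 2.0, 5.0],
    [2.0, 3.0, 3.0, 4.0],
    [3.0, 5.0, 3.5, 6.0],
    [4.0, 2.0, 5.0, 3.0],
    [5.0, 6.0, 4.5, 7.0],
    [6.0, 4.0, 6.0, 5.5],
    [7.0, 7.0, 6.5, 8.0],
    [8.0, 5.0, 8.0, 6.0],
])

# Z score transformation: centre each column, then divide by its
# sample standard deviation (ddof=1 is an assumption).
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

print(Z.round(2))
```

After this step every column of `Z` has mean 0 and standard deviation 1, so the covariance matrix of `Z` is the correlation matrix of `X`.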

CCA
• Canonical Correlation Matrix
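A minimal sketch of how the canonical correlation matrix could be assembled, assuming the common textbook form R = Ryy⁻¹ Ryx Rxx⁻¹ Rxy (the slide's own formula did not survive in this transcript). The data are randomly generated and purely hypothetical.

```python
import numpy as np

# Hypothetical data: n cases, p IVs and q DVs sharing two latent factors.
rng = np.random.default_rng(0)
n, p, q = 200, 2, 2
latent = rng.normal(size=(n, 2))
IV = latent + rng.normal(size=(n, p))
DV = latent + rng.normal(size=(n, q))
X = np.hstack([IV, DV])

# Full correlation matrix, partitioned into four blocks.
R_full = np.corrcoef(X, rowvar=False)
Rxx = R_full[:p, :p]   # correlations among the IVs
Rxy = R_full[:p, p:]   # correlations between IVs and DVs
Ryx = R_full[p:, :p]   # transpose of Rxy
Ryy = R_full[p:, p:]   # correlations among the DVs

# Canonical correlation matrix (assumed textbook form).
R = np.linalg.inv(Ryy) @ Ryx @ np.linalg.inv(Rxx) @ Rxy
print(R.round(3))
```

`R` is q × q here; it is generally not symmetric, even though its eigenvalues are real.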
CCA
• Relations with other subspace methods
CCA
• Eigenvalues and eigenvectors decomposition

[Equation: decomposition of R; the slide labels this step "PCA"]

CCA
• Eigenvalues and eigenvectors decomposition
• The square roots of the eigenvalues are the canonical correlations
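As a sketch of this step, the eigenvalues of R can be extracted with NumPy and their square roots taken. The correlation blocks below are hypothetical values, and the form R = Ryy⁻¹ Ryx Rxx⁻¹ Rxy is an assumption borrowed from standard textbooks rather than from the slides.

```python
import numpy as np

# Hypothetical correlation blocks for 2 IVs and 2 DVs.
Rxx = np.array([[1.0, 0.4], [0.4, 1.0]])
Ryy = np.array([[1.0, 0.3], [0.3, 1.0]])
Rxy = np.array([[0.5, 0.2], [0.3, 0.4]])
Ryx = Rxy.T

# Canonical correlation matrix (assumed textbook form).
R = np.linalg.inv(Ryy) @ Ryx @ np.linalg.inv(Rxx) @ Rxy

# Eigen decomposition; R is not symmetric, so use the general solver
# and keep the real parts (the eigenvalues are real in theory).
eigvals, eigvecs = np.linalg.eig(R)
order = np.argsort(eigvals.real)[::-1]   # largest eigenvalue first
eigvals = eigvals.real[order]
eigvecs = eigvecs.real[:, order]

# Canonical correlations are the square roots of the eigenvalues.
canonical_corr = np.sqrt(eigvals)
print(canonical_corr.round(3))
```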
CCA
• Significance test for the canonical correlation
• A significant result indicates that the IV and DV sets share variance
• Procedure:
• We first test all canonical correlates together (m = 1, …, min(p, q))
• If the test is significant, we remove the first canonical correlate and test the remaining ones (m = 2, …, min(p, q))
• Repeat
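The sequential procedure above can be sketched as follows. The slides do not name the test statistic, so this assumes Bartlett's chi-square approximation based on Wilks' lambda; the function name `bartlett_sequence`, the eigenvalues, and the sample size are all hypothetical.

```python
import numpy as np

def bartlett_sequence(eigvals, n_cases, p, q):
    """Chi-square statistic and degrees of freedom for each step m,
    testing canonical correlates m..k together (Bartlett's approximation,
    an assumption: the slides do not say which test is used)."""
    eigvals = np.asarray(eigvals, dtype=float)
    k = len(eigvals)
    scale = n_cases - 1 - (p + q + 1) / 2
    results = []
    for m in range(1, k + 1):
        # Wilks' lambda for the correlates that have not been removed yet.
        wilks = np.prod(1.0 - eigvals[m - 1:])
        chi2 = -scale * np.log(wilks)
        df = (p - m + 1) * (q - m + 1)
        results.append((m, chi2, df))
    return results

# Hypothetical eigenvalues (squared canonical correlations) and sample size.
results = bartlett_sequence([0.5, 0.2], n_cases=50, p=2, q=2)
for m, chi2, df in results:
    print(f"m={m}: chi2={chi2:.2f}, df={df}")
```

Each statistic is then compared with a chi-square distribution with `df` degrees of freedom, for example with `scipy.stats.chi2.sf(chi2, df)` if SciPy is available.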
CCA
• Significance test for the canonical correlation

Since all canonical variables are significant, we will keep them all.

CCA
• Canonical Coefficients
• Analogous to regression coefficients

[Equations: B_y, annotated "eigenvectors" and "correlation matrix of the dependent variables", and B_x]
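Since the original equations are not recoverable here, the following is only a sketch of one standard way to obtain canonical coefficients: B_y from the eigenvectors of a symmetrised matrix scaled by Ryy^(-1/2), and B_x = Rxx⁻¹ Rxy B_y with each column divided by its canonical correlation. The correlation blocks and the helper `inv_sqrt` are hypothetical.

```python
import numpy as np

# Hypothetical correlation blocks for 2 IVs and 2 DVs.
Rxx = np.array([[1.0, 0.4], [0.4, 1.0]])
Ryy = np.array([[1.0, 0.3], [0.3, 1.0]])
Rxy = np.array([[0.5, 0.2], [0.3, 0.4]])
Ryx = Rxy.T

def inv_sqrt(S):
    """Inverse symmetric square root of a positive definite matrix."""
    w, U = np.linalg.eigh(S)
    return U @ np.diag(w ** -0.5) @ U.T

# Symmetric version of the canonical correlation matrix; it has the
# same eigenvalues as inv(Ryy) @ Ryx @ inv(Rxx) @ Rxy.
Ryy_is = inv_sqrt(Ryy)
M = Ryy_is @ Ryx @ np.linalg.inv(Rxx) @ Rxy @ Ryy_is
eigvals, V = np.linalg.eigh(M)
order = np.argsort(eigvals)[::-1]        # largest eigenvalue first
eigvals, V = eigvals[order], V[:, order]
rc = np.sqrt(eigvals)                    # canonical correlations

# Canonical coefficients for the DVs and the IVs.
By = Ryy_is @ V
Bx = np.linalg.inv(Rxx) @ Rxy @ By / rc  # column j divided by rc[j]

print(By.round(3))
print(Bx.round(3))
```

With this scaling the canonical variates have unit variance on both sides, and `Bx.T @ Rxy @ By` is a diagonal matrix holding the canonical correlations.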

CCA
• Canonical Variates
• Analogous to predicted scores in regression
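A sketch of how canonical variate scores could be computed: the z-scored data on each side are multiplied by that side's canonical coefficients. The data are randomly generated and hypothetical, and the way the coefficients are obtained here is one standard route, not necessarily the one used in the slides.

```python
import numpy as np

# Hypothetical data: 2 IVs and 2 DVs sharing two latent factors.
rng = np.random.default_rng(1)
n, p, q = 300, 2, 2
latent = rng.normal(size=(n, 2))
Zx = latent + rng.normal(size=(n, p))
Zy = latent + rng.normal(size=(n, q))

# Z score both sets.
Zx = (Zx - Zx.mean(axis=0)) / Zx.std(axis=0, ddof=1)
Zy = (Zy - Zy.mean(axis=0)) / Zy.std(axis=0, ddof=1)

# Correlation blocks from the standardised data.
Rxx = Zx.T @ Zx / (n - 1)
Ryy = Zy.T @ Zy / (n - 1)
Rxy = Zx.T @ Zy / (n - 1)

# Canonical correlations and coefficients (assumed standard route).
w, U = np.linalg.eigh(Ryy)
Ryy_is = U @ np.diag(w ** -0.5) @ U.T
M = Ryy_is @ Rxy.T @ np.linalg.inv(Rxx) @ Rxy @ Ryy_is
eigvals, V = np.linalg.eigh(M)
order = np.argsort(eigvals)[::-1]
rc = np.sqrt(eigvals[order])
By = Ryy_is @ V[:, order]
Bx = np.linalg.inv(Rxx) @ Rxy @ By / rc

# Canonical variates: one score per case and per canonical function.
X_variates = Zx @ Bx
Y_variates = Zy @ By

# The correlation within each pair of variates is the canonical correlation.
pair_corr = np.array([
    np.corrcoef(X_variates[:, i], Y_variates[:, i])[0, 1]
    for i in range(len(rc))
])
print(pair_corr.round(3), rc.round(3))
```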
CCA
• Matrices of correlations between the variables and the canonical variates (loadings)

[Equations: loading matrices A_x and A_y]
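A sketch of the loading matrices, assuming the usual definitions A_x = Rxx B_x and A_y = Ryy B_y (the slide's equations are not available in this transcript). The correlation blocks are hypothetical, and the coefficients are computed by one standard route.

```python
import numpy as np

# Hypothetical correlation blocks for 2 IVs and 2 DVs.
Rxx = np.array([[1.0, 0.4], [0.4, 1.0]])
Ryy = np.array([[1.0, 0.3], [0.3, 1.0]])
Rxy = np.array([[0.5, 0.2], [0.3, 0.4]])

# Canonical coefficients (assumed standard route via a symmetric matrix).
w, U = np.linalg.eigh(Ryy)
Ryy_is = U @ np.diag(w ** -0.5) @ U.T
M = Ryy_is @ Rxy.T @ np.linalg.inv(Rxx) @ Rxy @ Ryy_is
eigvals, V = np.linalg.eigh(M)
order = np.argsort(eigvals)[::-1]
rc = np.sqrt(eigvals[order])
By = Ryy_is @ V[:, order]
Bx = np.linalg.inv(Rxx) @ Rxy @ By / rc

# Loadings: correlations between each variable and the canonical
# variates on its own side (assumed definitions).
Ax = Rxx @ Bx
Ay = Ryy @ By

print(Ax.round(3))
print(Ay.round(3))
```

Row i, column j of `Ax` is the correlation between IV i and the j-th canonical variate of the IV set; `Ay` reads the same way for the DVs.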

CCA
• Only coefficients larger than 0.3 in absolute value are interpreted.

[Figure label: canonical correlation]

CCA
• Proportion of variance extracted
• How much variance does each of the canonical variates extract from the variables on its own side of the equation?

[Figure: values for the first and second canonical variates, on each side]
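One common way to compute the proportion of variance extracted, assumed here, is the mean of the squared loadings in each column of the loading matrix. The loading values below are hypothetical, not the ones from the slides.

```python
import numpy as np

# Hypothetical loading matrix for the DV side:
# rows = DVs, columns = first and second canonical variates.
Ay = np.array([
    [0.60, 0.80],
    [0.96, -0.28],
])

# Proportion of variance each canonical variate extracts from the
# variables on its own side: mean squared loading per column.
pv_y = (Ay ** 2).sum(axis=0) / Ay.shape[0]

print(pv_y)
```

With as many canonical variates as variables, as in this 2 × 2 case, the proportions add up to 1.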

CCA
• Redundancy
• How much variance the canonical variates from the IVs extract from the DVs, and vice versa.

rdyx

Eigenvalues
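A sketch of the redundancy computation, assuming the usual definition: the proportion of variance a variate extracts on one side, multiplied by the corresponding eigenvalue (the squared canonical correlation). All numbers are hypothetical and do not reproduce the slide's example.

```python
import numpy as np

# Hypothetical inputs.
pv_y = np.array([0.6408, 0.3592])   # proportion of DV variance per DV-side variate
rc = np.array([0.7, 0.4])           # canonical correlations
eigvals = rc ** 2                   # eigenvalues = squared canonical correlations

# Redundancy: variance in the DVs extracted by each canonical
# variate of the IVs (assumed definition rd = pv * rc^2).
rd_yx = pv_y * eigvals
total = rd_yx.sum()

print(rd_yx.round(4), round(float(total), 4))
```

The same formula with the IV-side proportions gives the redundancy in the other direction, which is why the two directions generally differ.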

CCA
• Redundancy
• How much variance the canonical variates from the IVs extract from the DVs, and vice versa.

Summary

The first canonical variate from the IVs extracts 40% of the variance in the DVs.

The second canonical variate from the IVs extracts 30% of the variance in the DVs.

Together they extract 70% of the variance in the DVs.

The first canonical variate from the DVs extracts 49% of the variance in the IVs.

The second canonical variate from the DVs extracts 24% of the variance in the IVs.

Together they extract 73% of the variance in the IVs.

CCA
• Rotation
• A rotation does not influence the variance proportion or the redundancy.