Canonical Correlation Analysis (CCA)


- This is it!
- The mother of all linear statistical analyses

- When ?
- We want to find a structural relation between a set of independent variables and a set of dependent variables.

- When ? (part 2)
- To what extent can one set of two or more variables be predicted or “explained” by another set of two or more variables?
- What contribution does a single variable make to the explanatory power of the set of variables to which it belongs?
- What contribution does a single variable make to predicting or “explaining” the composite of the variables in the set to which it does not belong?
- What different dynamics are involved in the ability of one variable set to “explain”, in different ways, different portions of the other variable set?
- What relative power do different canonical functions have to predict or explain relationships?
- How stable are canonical results across samples or sample subgroups?
- How closely do obtained canonical results conform to expected canonical results?

- Assumptions
- Linearity: If not, nonlinear canonical correlation analysis.
- Absence of multicollinearity: If not, Partial Least Squares (PLS) regression to reduce the space.
- Homoscedasticity: If not, data transformation.
- Normality: If not, re-sampling.
- A lot of data: max(p, q) × 20 × number of pairs.
- Absence of outliers.

- Toy example

[Figure: data matrix X, with the IVs and DVs side by side]

- Z score transformation

[Figure: z-scored data matrix Z, with IV and DV columns]
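A minimal sketch of the z-score step. The data values are hypothetical (the slide's numbers did not survive), and the use of the sample standard deviation (`ddof=1`) is an assumption.

```python
import numpy as np

def zscore(X):
    """Center each column at 0 and scale it to unit sample standard deviation."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Hypothetical data (not the slide's values): rows are cases,
# columns are IV1, IV2, DV1, DV2.
X = np.array([
    [1.0, 2.0, 3.0, 1.5],
    [2.0, 1.0, 4.0, 2.5],
    [3.0, 4.0, 6.0, 2.0],
    [4.0, 3.0, 7.0, 4.0],
    [5.0, 5.0, 9.0, 3.5],
])
Z = zscore(X)  # same shape as X; every column now has mean 0 and sd 1
```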

- Canonical Correlation Matrix
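A minimal sketch of building the correlation matrix over all variables and splitting it into the four blocks used in CCA. The simulated data, the seed, and the block names `Rxx`, `Rxy`, `Ryx`, `Ryy` are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, q = 50, 2, 2                      # cases, number of IVs, number of DVs

# Hypothetical data: DVs depend linearly on the IVs plus noise.
X = rng.normal(size=(n, p))
Y = X @ np.array([[0.8, 0.1], [0.2, 0.5]]) + rng.normal(size=(n, q))

# Correlation matrix of all p + q variables (columns are variables).
R = np.corrcoef(np.hstack([X, Y]), rowvar=False)

# Partition into within-set and between-set blocks.
Rxx = R[:p, :p]   # IVs with IVs
Rxy = R[:p, p:]   # IVs with DVs
Ryx = R[p:, :p]   # DVs with IVs (transpose of Rxy)
Ryy = R[p:, p:]   # DVs with DVs
```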

- Relations with other subspace methods

- Eigenvalues and eigenvectors decomposition

[Figure: matrix R; slide label: PCA]

- The square roots of the eigenvalues are the canonical correlations.
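A minimal sketch of the eigenvalue step, assuming the symmetric form Ryy^(-1/2) Ryx Rxx^(-1) Rxy Ryy^(-1/2) is decomposed (one common formulation; the slide's exact matrix is not recoverable). The correlation blocks below are hypothetical.

```python
import numpy as np

def inv_sqrt(S):
    """Inverse symmetric square root of a positive-definite matrix."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(w ** -0.5) @ V.T

def canonical_correlations(Rxx, Rxy, Ryy):
    """Return (canonical correlations, eigenvalues, eigenvectors), largest first."""
    Ryy_is = inv_sqrt(Ryy)
    M = Ryy_is @ Rxy.T @ np.linalg.inv(Rxx) @ Rxy @ Ryy_is  # symmetric matrix
    lam, vecs = np.linalg.eigh(M)                           # ascending order
    order = np.argsort(lam)[::-1]
    lam = np.clip(lam[order], 0.0, 1.0)                     # guard against rounding
    return np.sqrt(lam), lam, vecs[:, order]

# Hypothetical correlation blocks for 2 IVs and 2 DVs.
Rxx = np.array([[1.0, 0.3], [0.3, 1.0]])
Ryy = np.array([[1.0, 0.2], [0.2, 1.0]])
Rxy = np.array([[0.5, 0.1], [0.2, 0.4]])

rc, lam, vecs = canonical_correlations(Rxx, Rxy, Ryy)
# lam holds the squared canonical correlations; rc = sqrt(lam)
```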

- Significance test for the canonical correlation

- A significant result indicates that the IV and DV sets share variance.
- Procedure:
- Test all canonical correlates together (m = 1, …, min(p, q)).
- If significant, remove the first canonical correlate and test the remaining ones (m = 2, …, min(p, q)).
- Repeat.
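The sequential procedure above can be sketched as follows, assuming Bartlett's chi-square approximation based on Wilks' lambda (the slides do not name the test statistic). The eigenvalues, sample size, and function name are hypothetical.

```python
import math

def bartlett_tests(eigenvalues, n, p, q):
    """Sequential chi-square tests for canonical correlations.

    eigenvalues: squared canonical correlations, largest first.
    n: number of cases; p, q: number of IVs and DVs.
    Returns a list of (m, wilks_lambda, chi2, df), one entry per step m.
    """
    k = min(p, q)
    results = []
    for m in range(1, k + 1):
        # Wilks' lambda for the correlates m, ..., k still under test.
        wilks = math.prod(1.0 - lam for lam in eigenvalues[m - 1:k])
        chi2 = -(n - 1 - (p + q + 1) / 2.0) * math.log(wilks)
        df = (p - m + 1) * (q - m + 1)
        results.append((m, wilks, chi2, df))
    return results

# Hypothetical eigenvalues for p = q = 2 and 50 cases.
results = bartlett_tests([0.5, 0.2], n=50, p=2, q=2)
for m, wilks, chi2, df in results:
    print(f"m={m}: Wilks lambda={wilks:.3f}, chi2={chi2:.2f}, df={df}")
```

Each chi-square value is then compared with the critical value for its df, for example from a chi-square table or with `scipy.stats.chi2.sf`.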

Since all canonical variables are significant, we will keep them all.

- Canonical Coefficients
- Analogous to regression coefficients

[Figure: coefficient matrices B_y and B_x; slide labels: eigenvectors, correlation matrix of the dependent variables]
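A sketch of one common textbook formulation that matches the slide's labels: B_y is the inverse square root of the DV correlation matrix times the eigenvectors. The formula used for B_x (Rxx^(-1) Rxy B_y with each column divided by its canonical correlation) is an assumption, since it is not recoverable from the slide. The correlation blocks are hypothetical, and the code assumes q ≤ p with nonzero canonical correlations.

```python
import numpy as np

def inv_sqrt(S):
    """Inverse symmetric square root of a positive-definite matrix."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(w ** -0.5) @ V.T

def canonical_coefficients(Rxx, Rxy, Ryy):
    """Return (Bx, By, rc) with canonical pairs ordered largest first."""
    Ryy_is = inv_sqrt(Ryy)
    lam, vecs = np.linalg.eigh(Ryy_is @ Rxy.T @ np.linalg.inv(Rxx) @ Rxy @ Ryy_is)
    order = np.argsort(lam)[::-1]
    lam, vecs = lam[order], vecs[:, order]
    rc = np.sqrt(lam)                          # canonical correlations
    By = Ryy_is @ vecs                         # DV coefficients
    Bx = np.linalg.inv(Rxx) @ Rxy @ By / rc    # column j divided by rc[j]
    return Bx, By, rc

# Hypothetical correlation blocks for 2 IVs and 2 DVs.
Rxx = np.array([[1.0, 0.3], [0.3, 1.0]])
Ryy = np.array([[1.0, 0.2], [0.2, 1.0]])
Rxy = np.array([[0.5, 0.1], [0.2, 0.4]])

Bx, By, rc = canonical_coefficients(Rxx, Rxy, Ryy)
```

With this scaling the canonical variates have unit variance, and the correlation within each pair equals the corresponding canonical correlation. The sign of each column is arbitrary.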

- Canonical Variates
- Analogous to predicted scores in regression
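A minimal end-to-end sketch of computing canonical variates (scores) from raw data, assuming the variates are the z-scored data multiplied by the canonical coefficients. The simulated data, the seed, and all variable names are hypothetical; the coefficient formulas are one common formulation, not necessarily the slide's.

```python
import numpy as np

def inv_sqrt(S):
    """Inverse symmetric square root of a positive-definite matrix."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(w ** -0.5) @ V.T

rng = np.random.default_rng(1)
n, p, q = 200, 2, 2
X = rng.normal(size=(n, p))                                           # hypothetical IVs
Y = X @ np.array([[0.8, 0.1], [0.2, 0.5]]) + rng.normal(size=(n, q))  # hypothetical DVs

# Z-score both sets, then build the correlation blocks.
Zx = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
Zy = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)
R = np.corrcoef(np.hstack([Zx, Zy]), rowvar=False)
Rxx, Rxy, Ryy = R[:p, :p], R[:p, p:], R[p:, p:]

# Eigen-decomposition and canonical coefficients.
Ryy_is = inv_sqrt(Ryy)
lam, vecs = np.linalg.eigh(Ryy_is @ Rxy.T @ np.linalg.inv(Rxx) @ Rxy @ Ryy_is)
order = np.argsort(lam)[::-1]
lam, vecs = lam[order], vecs[:, order]
rc = np.sqrt(lam)
By = Ryy_is @ vecs
Bx = np.linalg.inv(Rxx) @ Rxy @ By / rc

# Canonical variates: one column of scores per canonical pair.
U = Zx @ Bx   # variates on the IV side
V = Zy @ By   # variates on the DV side

# The correlation within each pair reproduces the canonical correlation.
pair_corr = [np.corrcoef(U[:, j], V[:, j])[0, 1] for j in range(min(p, q))]
```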

- Loading matrices
- Matrices of correlations between the variables and the canonical variates

[Figure: loading matrices A_x and A_y]
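Assuming standardized variables and unit-variance canonical variates, the loadings follow directly from the coefficient matrices. Here R_xx and R_yy denote the within-set correlation matrices of the IVs and DVs, a notation assumed for this sketch.

```latex
A_x = R_{xx} B_x, \qquad A_y = R_{yy} B_y
```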

- Loadings and canonical correlations for both canonical variate pairs
- Only loadings greater than 0.3 in absolute value are interpreted.

[Figure: loadings and canonical correlations for the two canonical variate pairs]

- Proportion of variance extracted
- How much variance does each of the canonical variates extract from the variables on its own side of the equation?

[Figure: proportion of variance extracted by the first and second canonical variates on each side]
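A minimal sketch, assuming the proportion of variance a canonical variate extracts from its own set is the mean of its squared loadings. The loading matrix is hypothetical, not the slide's.

```python
import numpy as np

# Hypothetical loading matrix for the IV side:
# rows are variables (IV1, IV2), columns are canonical variates (first, second).
A_x = np.array([
    [0.80,  0.60],
    [0.96, -0.28],
])

# Proportion of variance extracted by each variate from its own set:
# sum of squared loadings in the column, divided by the number of variables.
pv_x = (A_x ** 2).mean(axis=0)

print(pv_x)  # one value per canonical variate
```

With as many variates as variables, the proportions sum to 1, as they do for this hypothetical matrix.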

- Redundancy
- How much variance the canonical variates from the IVs extract from the DVs, and vice versa.

[Figure: redundancy computation; slide labels: rd_yx, eigenvalues]
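A minimal sketch, assuming redundancy is the proportion of variance extracted multiplied by the eigenvalue (squared canonical correlation) of the same canonical pair. All numbers are hypothetical, not the slide's.

```python
import numpy as np

# Hypothetical proportions of variance that each canonical variate
# extracts from its own set (first, second).
pv = np.array([0.7808, 0.2192])

# Hypothetical canonical correlations for the two pairs.
rc = np.array([0.7, 0.4])
eigenvalues = rc ** 2          # squared canonical correlations

# Redundancy per canonical pair, and the total across pairs.
rd = pv * eigenvalues
rd_total = rd.sum()
```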

Summary

The first canonical variate from the IVs extracts 40% of the variance in the y variables.

The second canonical variate from the IVs extracts 30% of the variance in the y variables.

Together they extract 70% of the variance in the DVs.

The first canonical variate from the DVs extracts 49% of the variance in the x variables.

The second canonical variate from the DVs extracts 24% of the variance in the x variables.

Together they extract 73% of the variance in the IVs.

- Rotation
- A rotation does not influence the variance proportion or the redundancy.

[Figure: loading matrix]