Survey on ICA


# Survey on ICA

Survey on ICA. Technical Report, Aapo Hyvärinen, 1999. http://ww.icsi.berkeley.edu/~jagota/NCS. Outline: 2nd-order methods (PCA / factor analysis); higher-order methods (projection pursuit / blind deconvolution); ICA (definitions, criteria for identifiability, relations to other methods).


### Survey on ICA

Technical Report, Aapo Hyvärinen, 1999.

http://ww.icsi.berkeley.edu/~jagota/NCS

### Outline

• 2nd-order methods
  • PCA / factor analysis
• Higher-order methods
  • Projection pursuit / Blind deconvolution
• ICA
  • definitions
  • criteria for identifiability
  • relations to other methods
• Applications
• Contrast functions
• Algorithms

### General model

x = As + n

• x: observations
• A: mixing matrix
• s: latent variables, factors, independent components
• n: noise

### Find transformation

s = f (x)

Consider only linear transformations:

s = Wx
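The relation between the mixing model and the linear transformation can be sketched numerically. This is a minimal illustration of the noise-free case, not part of the original slides: the source distributions, the matrix `A`, and the sample size are assumptions, and `W` is obtained by inverting `A`, which is only possible here because `A` is known.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two independent, non-Gaussian sources (illustrative choices).
n_samples = 1000
s = np.vstack([
    rng.uniform(-1.0, 1.0, n_samples),
    rng.laplace(0.0, 1.0, n_samples),
])

# Hypothetical mixing matrix A; in a real problem it is unknown.
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])

x = A @ s                # noise-free observations, x = As
W = np.linalg.inv(A)     # ideal unmixing matrix, available only because A is known
s_hat = W @ x            # s = Wx

max_error = np.max(np.abs(s_hat - s))
```

In practice W has to be estimated from x alone, which is what the rest of the deck is about.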

### Principal component analysis

• Find the direction(s) w in which the variance of $w^T x$ is maximized.
• Equivalent to finding the eigenvectors of $C = E(xx^T)$ corresponding to the k largest eigenvalues (x is assumed to be centered)
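The eigenvector formulation can be sketched with NumPy. The function name `pca_directions` and the synthetic data are assumptions made for illustration; the sample average stands in for the expectation E.

```python
import numpy as np

def pca_directions(x, k):
    """Return the k leading eigenvectors (as columns) and eigenvalues of C = E(x x^T).

    x has shape (dim, n_samples) and is centered inside the function.
    """
    x = x - x.mean(axis=1, keepdims=True)
    C = (x @ x.T) / x.shape[1]
    eigvals, eigvecs = np.linalg.eigh(C)       # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]      # indices of the k largest
    return eigvecs[:, order], eigvals[order]

rng = np.random.default_rng(0)
# Synthetic data with standard deviation 3 along (1, 1)/sqrt(2) and 0.5 across it.
z = rng.standard_normal((2, 5000)) * np.array([[3.0], [0.5]])
R = np.array([[1.0, -1.0],
              [1.0, 1.0]]) / np.sqrt(2.0)
x = R @ z

directions, variances = pca_directions(x, k=1)
w = directions[:, 0]
projected_variance = np.var(w @ x)   # variance of w^T x along the first direction
```

The variance of the projection onto `w` equals the largest eigenvalue of the sample covariance, which is the sense in which the two formulations agree.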

### Factor analysis

• Closely related to PCA
• x = As + n
• Method of principal factors:
• Assumes knowledge of covariance matrix of the noise: $E(nn^T)$
• PCA on: $C = E(xx^T) - E(nn^T)$
• Factors are not defined uniquely, but only up to a rotation
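A minimal sketch of the method of principal factors, under assumptions that are not in the slides: a hypothetical 3x2 loading matrix `A`, unit-variance Gaussian factors, and a known diagonal noise covariance. It only recovers `A` up to a rotation, which is why the check at the end compares $A A^T$ rather than `A` itself.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 20000

A = np.array([[1.0, 0.0],
              [0.5, 1.0],
              [0.0, 0.7]])                 # hypothetical loading matrix
noise_var = np.array([0.5, 0.2, 0.1])      # assumed known diagonal of E(nn^T)

s = rng.standard_normal((2, n_samples))
n = rng.standard_normal((3, n_samples)) * np.sqrt(noise_var)[:, None]
x = A @ s + n                              # x = As + n

C = (x @ x.T) / n_samples                  # sample estimate of E(xx^T)
C_reduced = C - np.diag(noise_var)         # subtract the known noise covariance

eigvals, eigvecs = np.linalg.eigh(C_reduced)
order = np.argsort(eigvals)[::-1][:2]      # keep the two leading factors
A_hat = eigvecs[:, order] * np.sqrt(eigvals[order])

# A_hat matches A only up to a rotation, so compare the products with their transposes.
reconstruction_error = np.max(np.abs(A_hat @ A_hat.T - A @ A.T))
```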

### Higher order methods

• Projection pursuit
• Redundancy reduction
• Blind deconvolution
• Requires assumption that data are not Gaussian

### Projection pursuit

• Find a direction w such that $w^T x$ has an 'interesting' distribution
• It has been argued that the interesting directions are those in which the projection is least Gaussian

### Differential entropy

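• Differential entropy of a random variable y with density f: $H(y) = -\int f(y) \log f(y)\, dy$
• For a given variance, H is maximised when f is a Gaussian density
• Minimize $H(y)$ for $y = w^T x$ (at fixed variance) to find projection pursuit directions
• Difficult in practice, because it requires estimating the density of $w^T x$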
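The maximum-entropy property of the Gaussian can be checked with closed-form differential entropies. The comparison below is an added illustration: the Laplace and uniform densities are chosen as examples, all three are scaled to the same variance, and entropies are in nats.

```python
import numpy as np

sigma2 = 1.0  # common variance for all three densities

# Gaussian: H = 0.5 * log(2 * pi * e * sigma^2)
h_gauss = 0.5 * np.log(2.0 * np.pi * np.e * sigma2)

# Laplace with scale b has variance 2 b^2 and entropy 1 + log(2 b)
b = np.sqrt(sigma2 / 2.0)
h_laplace = 1.0 + np.log(2.0 * b)

# Uniform with width a has variance a^2 / 12 and entropy log(a)
a = np.sqrt(12.0 * sigma2)
h_uniform = np.log(a)
```

Both non-Gaussian densities have lower entropy than the Gaussian of equal variance, so a lower entropy of a standardized projection indicates a less Gaussian direction.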

### Blind deconvolution

Observe a filtered version of s(t), where * denotes convolution:

x(t) = s(t)*g(t)

Find filter h(t), such that

s(t) = h(t)*x(t)
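The filtering model and the role of the inverse filter can be sketched as follows. This is not a blind method: the filter `g` is a hypothetical two-tap example that is known here, so its inverse `h` can be written down directly. A blind deconvolution algorithm would have to estimate `h` from x(t) alone.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sparse spike train as a stand-in for s(t) (illustrative choice).
s = np.zeros(200)
spike_positions = rng.choice(200, size=10, replace=False)
s[spike_positions] = rng.standard_normal(10)

g = np.array([1.0, 0.5])            # hypothetical filter g(t)
x = np.convolve(s, g)               # x(t) = s(t) * g(t)

# Inverse of g truncated to 30 taps: h[k] = (-0.5)^k
h = (-0.5) ** np.arange(30)
s_hat = np.convolve(x, h)[:len(s)]  # s(t) = h(t) * x(t), up to truncation error

max_error = np.max(np.abs(s_hat - s))
```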

### Example blind deconvolution

Seismic: "statistical deconvolution"

[Figure: plots of g(t) and s(t) against t]

### ICA definitions

Definition 1 (General definition)

ICA of a random vector x consists of finding a linear transformation, s = Wx, so that the components $s_i$ are as independent as possible, in the sense of maximizing some function $F(s_1, \ldots, s_m)$ that measures independence.

### ICA definitions

Definition 2 (Noisy ICA)

ICA of a random vector x consists of estimating the following model for the data:

x = As + n

where the latent variables $s_i$ are assumed independent.

Definition 3 (Noise-free ICA)

The same model without the noise term:

x = As

### Statistical independence

• ICA requires statistical independence
• Distinguish between statistically independent and uncorrelated variables
• Statistically independent: $p(y_1, y_2) = p(y_1)\, p(y_2)$
• Uncorrelated: $E(y_1 y_2) - E(y_1) E(y_2) = 0$
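The distinction between uncorrelated and independent variables can be made concrete with a small example. The pair below is an assumed illustration: `y2` is a deterministic function of `y1`, so the two are dependent, yet their covariance is zero.

```python
import numpy as np

rng = np.random.default_rng(0)
y1 = rng.uniform(-1.0, 1.0, 100_000)
y2 = y1 ** 2                      # fully determined by y1, hence dependent

# Uncorrelated: E(y1 y2) - E(y1) E(y2) is close to zero.
covariance = np.mean(y1 * y2) - np.mean(y1) * np.mean(y2)

# Independence would require E(g(y1) h(y2)) = E(g(y1)) E(h(y2)) for all g and h.
# With g(u) = u^2 and h(u) = u the factorization fails (the gap is about 0.089).
gap = np.mean(y1 ** 2 * y2) - np.mean(y1 ** 2) * np.mean(y2)
```

This is why second-order methods, which only use covariances, cannot in general recover independent components.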

### Identifiability of ICA model

• All the independent components, with the possible exception of one, must be non-Gaussian
• The number of observed mixtures, m, must be at least as large as the number of independent components, n: $m \geq n$
• The matrix A must be of full column rank
• Note: with m < n, A may still be identifiable

### Relations to other methods

• Redundancy reduction
• Noise free case
• Find 'interesting' projections
• Special case of projection pursuit
• Blind deconvolution
• Factor analysis for non-Gaussian data
• Related to non-linear PCA

### Applications of ICA

• Blind source separation
• Cocktail party problem
• Feature extraction
• Blind deconvolution

### Objective (contrast) functions

ICA method = Objective function + Optimization algorithm

• Multi-unit contrast functions
  • Find all independent components
• One-unit contrast functions
  • Find one independent component (at a time)

### Mutual information

• $I(y_1, \ldots, y_m) = \sum_i H(y_i) - H(y)$
• Mutual information is non-negative, and zero if and only if the $y_i$ are independent
• Difficult to estimate, approximations exist

### Mutual information (2)

• Alternative definition

[Figure: Venn diagram relating H(X), H(Y), H(X|Y), H(Y|X) and I(X,Y)]
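For discrete variables the quantities in the diagram can be computed directly from a joint probability table. The sketch below is an added illustration with made-up tables. It uses discrete Shannon entropy in nats, whereas the slides work with differential entropy of continuous variables. It checks that the entropy-sum form and the conditional-entropy form give the same value.

```python
import numpy as np

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution; zero entries are skipped."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return -np.sum(p * np.log(p))

def mutual_information(joint):
    """I(X,Y) = H(X) + H(Y) - H(X,Y) for a joint probability table."""
    joint = np.asarray(joint, dtype=float)
    px = joint.sum(axis=1)
    py = joint.sum(axis=0)
    return entropy(px) + entropy(py) - entropy(joint)

# Product of marginals: X and Y are independent.
independent = np.outer([0.3, 0.7], [0.4, 0.6])
# Mass concentrated on the diagonal: X and Y are dependent.
dependent = np.array([[0.4, 0.1],
                      [0.1, 0.4]])

mi_independent = mutual_information(independent)
mi_dependent = mutual_information(dependent)

# Same quantity through the conditional entropy: I(X,Y) = H(X) - H(X|Y),
# with H(X|Y) = H(X,Y) - H(Y).
h_x_given_y = entropy(dependent) - entropy(dependent.sum(axis=0))
mi_alternative = entropy(dependent.sum(axis=1)) - h_x_given_y
```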

### Non-linear PCA

• Add non-linearity function g(.) in the formula for PCA

### One-unit contrast functions

• Find one vector, w, so that $w^T x$ equals one of the independent components, $s_i$
• Related to projection pursuit
• Prior knowledge of number of independent components not needed

### Negentropy

• Difference between the differential entropy of a Gaussian variable with the same variance as y and the differential entropy of y: $J(y) = H(y_{gauss}) - H(y)$
• If the $y_i$ are uncorrelated, the mutual information can be expressed as $I(y_1, \ldots, y_m) = J(y) - \sum_i J(y_i)$
• J(y) can be approximated by higher-order cumulants, but estimation is sensitive to outliers
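One common cumulant-based approximation, for a standardized scalar y, is $J(y) \approx \frac{1}{12} E(y^3)^2 + \frac{1}{48} \mathrm{kurt}(y)^2$. The slides do not give the formula, so this particular form is an assumption made for the sketch below, as are the test distributions. The last lines illustrate the sensitivity to outliers by replacing a single Gaussian sample with the value 20.

```python
import numpy as np

def negentropy_cumulant(y):
    """Approximate J(y) by E(y^3)^2 / 12 + kurt(y)^2 / 48 after standardizing y."""
    y = (y - y.mean()) / y.std()
    skew = np.mean(y ** 3)
    kurt = np.mean(y ** 4) - 3.0
    return skew ** 2 / 12.0 + kurt ** 2 / 48.0

rng = np.random.default_rng(0)
n = 100_000

gauss = rng.standard_normal(n)
j_gauss = negentropy_cumulant(gauss)                        # close to zero
j_uniform = negentropy_cumulant(rng.uniform(-1.0, 1.0, n))  # sub-Gaussian
j_laplace = negentropy_cumulant(rng.laplace(0.0, 1.0, n))   # super-Gaussian

# A single extreme sample is enough to make Gaussian data look non-Gaussian.
gauss_with_outlier = gauss.copy()
gauss_with_outlier[0] = 20.0
j_outlier = negentropy_cumulant(gauss_with_outlier)
```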

### Algorithms

• Have x=As, want to find s=Wx
• Preprocessing
• Centering of x
• Sphering (whitening) of x
• Find a linear transformation $v = Qx$ such that $E(vv^T) = I$
• Found via PCA / SVD
• Sphering alone does not solve the problem: v is still determined only up to an orthogonal transformation
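The preprocessing steps can be sketched with an eigendecomposition of the sample covariance. The function name `whiten`, the sources, and the mixing matrix are assumptions for illustration; $Q = D^{-1/2} E^T$ is one of several valid whitening matrices.

```python
import numpy as np

def whiten(x):
    """Center x (dim, n_samples) and return v = Qx with E(vv^T) = I, together with Q."""
    x = x - x.mean(axis=1, keepdims=True)          # centering
    C = (x @ x.T) / x.shape[1]                     # sample covariance
    eigvals, E = np.linalg.eigh(C)                 # C = E diag(eigvals) E^T
    Q = np.diag(1.0 / np.sqrt(eigvals)) @ E.T      # Q = D^{-1/2} E^T
    return Q @ x, Q

rng = np.random.default_rng(0)
# Unit-variance, independent uniform sources (illustrative choice).
s = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), (2, 10_000))
A = np.array([[2.0, 1.0],
              [1.0, 1.5]])                         # hypothetical mixing matrix
x = A @ s

v, Q = whiten(x)
cov_v = (v @ v.T) / v.shape[1]                     # identity up to rounding
M = Q @ A                                          # remaining mixing after sphering
```

With unit-variance sources, `M` is approximately orthogonal, so after sphering only a rotation remains to be estimated.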

### Algorithms (2)

• Jutten-Herault
• Cancel non-linear cross-correlations
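• Non-diagonal terms of W are updated according to $\Delta W_{ij} \propto g_1(y_i)\, g_2(y_j)$ for $i \neq j$, where $g_1$ and $g_2$ are odd non-linear functions
• The $y_i$ are updated iteratively as $y = (I + W)^{-1} x$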
• Non-linear decorrelation
• Non-linear PCA
• FastICA, ..., etc.
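As an illustration of the last item, here is a minimal sketch of a kurtosis-based fixed-point iteration with deflation, in the spirit of FastICA. It is not taken from the slides: the update rule $w \leftarrow E(v (w^T v)^3) - 3w$ on whitened data, the function names, the sources, and the mixing matrix are all assumptions chosen to keep the example short.

```python
import numpy as np

def whiten(x):
    """Center x and return whitened data v with identity sample covariance."""
    x = x - x.mean(axis=1, keepdims=True)
    eigvals, E = np.linalg.eigh((x @ x.T) / x.shape[1])
    return (E / np.sqrt(eigvals)).T @ x            # D^{-1/2} E^T x

def fastica_deflation(v, n_components, n_iter=200, tol=1e-10, seed=0):
    """Estimate the rows of an orthogonal unmixing matrix one at a time."""
    rng = np.random.default_rng(seed)
    dim = v.shape[0]
    W = np.zeros((n_components, dim))
    for i in range(n_components):
        w = rng.standard_normal(dim)
        w /= np.linalg.norm(w)
        for _ in range(n_iter):
            y = w @ v
            w_new = (v * y ** 3).mean(axis=1) - 3.0 * w   # fixed-point update
            w_new -= W[:i].T @ (W[:i] @ w_new)            # deflation (Gram-Schmidt)
            w_new /= np.linalg.norm(w_new)
            converged = abs(abs(w_new @ w) - 1.0) < tol   # direction unchanged up to sign
            w = w_new
            if converged:
                break
        W[i] = w
    return W

rng = np.random.default_rng(1)
n_samples = 20_000
s = np.vstack([
    rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), n_samples),   # sub-Gaussian source
    rng.laplace(0.0, 1.0 / np.sqrt(2.0), n_samples),       # super-Gaussian source
])
A = np.array([[1.0, 0.6],
              [0.4, 1.0]])                                 # hypothetical mixing matrix
x = A @ s

v = whiten(x)
W = fastica_deflation(v, n_components=2)
y = W @ v                                                  # estimated components

# Absolute correlation between each estimate (rows) and each true source (columns).
corr = np.abs(np.corrcoef(np.vstack([y, s]))[:2, 2:])
```

The components come back in arbitrary order and with arbitrary sign, so `corr` should be read as a permutation-like matrix and not compared against the identity.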

### Summary

• Definitions of ICA
• Conditions for identifiability of model
• Relations to other methods
• Contrast functions
• One-unit / multi-unit
• Mutual information / Negentropy
• Applications of ICA
• Algorithms

### Future research

• Noisy ICA
• Tailor-made methods for certain applications
• Use of time correlations if x is a stochastic process
• Time delays/echoes in cocktail-party problem
• Non-linear ICA