
# Newton Method for the ICA Mixture Model

Jason A. Palmer (1), Scott Makeig (1), Ken Kreutz-Delgado (2), Bhaskar D. Rao (2). (1) Swartz Center for Computational Neuroscience, (2) Dept. of Electrical and Computer Engineering; University of California San Diego, La Jolla, CA.



Introduction
• Want to model sensor array data with multiple independent sources — ICA
• Non-stationary source activity — mixture model
• Want the adaptation to be computationally efficient — Newton method
Outline
• ICA mixture model
• Basic Newton method
• Positive definiteness of Hessian when model source densities are true source densities
• Newton for ICA mixture model
• Example applications to analysis of EEG
ICA Mixture Model—toy example
• 3 models in two dimensions, 500 points per model
• The Newton method converges in fewer than 200 iterations; the natural gradient fails to converge and has difficulty on poorly conditioned models
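A toy data set of this shape (3 models, two dimensions, 500 points per model) can be generated along the following lines. This is a minimal sketch, not the authors' Matlab generator: the function name `make_toy_ica_mixture`, the Laplacian sources, the random mixing matrices, and the range of the model centers are all assumptions made for illustration.

```python
import numpy as np

def make_toy_ica_mixture(n_models=3, dim=2, n_per_model=500, seed=0):
    """Draw x = A_h s + c_h for each model h, with independent Laplacian sources."""
    rng = np.random.default_rng(seed)
    X, labels, params = [], [], []
    for h in range(n_models):
        A = rng.normal(size=(dim, dim))            # hypothetical random mixing matrix
        c = rng.uniform(-5.0, 5.0, size=dim)       # hypothetical model center
        S = rng.laplace(size=(dim, n_per_model))   # super-Gaussian independent sources
        X.append(A @ S + c[:, None])
        labels.append(np.full(n_per_model, h))     # which model generated each point
        params.append((A, c))
    return np.hstack(X), np.concatenate(labels), params

X, labels, params = make_toy_ica_mixture()
```

`X` holds one observation per column, and `labels` records the generating model so that a fitted segmentation can be compared against it.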
ICA Mixture Model
• Want to model observations x(t), t = 1,…,N, with different models “active” at different times
• Bayesian linear mixture model, h = 1, . . . , M :
• Conditionally linear given the model, :
• Samples are modeled as independent in time:
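One consistent way to write the model described by these three bullets is sketched below. The notation is assumed rather than taken from the slide: $\gamma_h$ for the model probabilities, $A_h$ for the mixing matrix of model $h$, $c_h$ for its center, and $s(t)$ for the sources.

```latex
\begin{align}
p\bigl(x(t);\Theta\bigr) &= \sum_{h=1}^{M} \gamma_h \, p_h\bigl(x(t);\theta_h\bigr),
  \qquad \gamma_h \ge 0, \quad \sum_{h=1}^{M} \gamma_h = 1 \\
x(t) &= A_h \, s(t) + c_h \qquad \text{when model } h \text{ is active} \\
p\bigl(x(1),\dots,x(N);\Theta\bigr) &= \prod_{t=1}^{N} p\bigl(x(t);\Theta\bigr)
\end{align}
```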
Source Density Mixture Model
• Each source density mixture component has unknown location, scale, and shape:
• Generalizes Gaussian mixture model, more peaked, heavier tails
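A source density of this kind can be sketched as a mixture of generalized Gaussian components, each with a location, a scale, and a shape. The generalized Gaussian form and the parameter names `alpha` (weights), `mu` (locations), `beta` (inverse squared scales), and `rho` (shapes) are assumptions chosen for illustration. A shape of 2 gives a Gaussian component, and smaller shapes give more peaked, heavier-tailed components.

```python
import numpy as np
from math import gamma

def gen_gauss(s, rho):
    """Unit-scale generalized Gaussian: exp(-|s|**rho) / (2 * Gamma(1 + 1/rho))."""
    return np.exp(-np.abs(s) ** rho) / (2.0 * gamma(1.0 + 1.0 / rho))

def source_mixture_density(s, alpha, mu, beta, rho):
    """Mixture density: sum_j alpha_j * sqrt(beta_j) * q(sqrt(beta_j) * (s - mu_j); rho_j)."""
    s = np.asarray(s, dtype=float)
    total = np.zeros_like(s)
    for a, m, b, r in zip(alpha, mu, beta, rho):
        total += a * np.sqrt(b) * gen_gauss(np.sqrt(b) * (s - m), r)
    return total

# Hypothetical two-component source: one Laplacian (rho=1), one Gaussian (rho=2).
grid = np.linspace(-60.0, 60.0, 240001)
dens = source_mixture_density(grid, alpha=[0.6, 0.4], mu=[-1.0, 2.0],
                              beta=[1.0, 0.25], rho=[1.0, 2.0])
area = float(np.sum(dens) * (grid[1] - grid[0]))   # Riemann-sum check of normalization
```

The numerical integral `area` should be close to 1 whenever the weights sum to 1.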
ICA Mixture Model—Invariances
• The complete set of parameters to be estimated is:

h = 1, . . ., M, i = 1, . . ., n, j = 1, . . ., m

• Invariances: the row norms of W trade off against the source density scales, and the model centers trade off against the source density locations:
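Both invariances can be checked numerically on a single ICA model. The sketch below assumes Laplacian source shapes and the parameterization $y = \sqrt{\beta}\,(W(x - c) - \mu)$; the function `mean_loglik` and all parameter values are hypothetical.

```python
import numpy as np

def mean_loglik(X, W, c, mu, sqrt_beta):
    """Mean log likelihood of one ICA model with unit Laplacian source shapes (assumed)."""
    Y = sqrt_beta[:, None] * (W @ (X - c[:, None]) - mu[:, None])  # standardized sources
    log_q = -np.abs(Y) - np.log(2.0)                               # log of 0.5 * exp(-|y|)
    log_jac = np.log(abs(np.linalg.det(W))) + np.sum(np.log(sqrt_beta))
    return float(np.mean(np.sum(log_q, axis=0)) + log_jac)

rng = np.random.default_rng(0)
X = rng.laplace(size=(2, 200))
W = np.array([[1.0, 0.5], [-0.3, 2.0]])
c = np.array([0.2, -0.1])
mu = np.array([0.1, -0.4])
sqrt_beta = np.array([1.5, 0.7])
base = mean_loglik(X, W, c, mu, sqrt_beta)

# Scale the rows of W by a, and compensate in the source locations and scales.
a = np.array([3.0, 0.5])
scaled = mean_loglik(X, a[:, None] * W, c, a * mu, sqrt_beta / a)

# Shift the model center by delta, and compensate in the source locations.
delta = np.array([0.7, -1.2])
shifted = mean_loglik(X, W, c + delta, mu - W @ delta, sqrt_beta)
```

All three values agree up to rounding, so some normalization (for example fixing the row norms or the source locations) is needed to make the parameters identifiable.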
Basic ICA Newton Method
• Transform gradient (1st derivative) of cost function using inverse Hessian (2nd derivative)
• Cost function is data log likelihood:
• Natural gradient (positive definite transform):
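For a single ICA model, the natural gradient update can be sketched as follows. The score function `tanh` (corresponding to a source density proportional to 1/cosh) and the step size are assumptions made here, not choices stated on the slide.

```python
import numpy as np

def mean_loglik(W, X):
    """Mean log likelihood up to a constant, for sources with density proportional to 1/cosh(y)."""
    Y = W @ X
    return float(np.log(abs(np.linalg.det(W)))
                 - np.mean(np.sum(np.log(np.cosh(Y)), axis=0)))

def natural_gradient_step(W, X, lrate=0.01):
    """One ascent step W <- W + lrate * (I - E[phi(y) y^T]) W with phi = tanh."""
    Y = W @ X                                                   # current source estimates
    G = np.eye(W.shape[0]) - (np.tanh(Y) @ Y.T) / X.shape[1]
    return W + lrate * G @ W

rng = np.random.default_rng(0)
S = rng.laplace(size=(2, 2000))             # hypothetical super-Gaussian sources
A = np.array([[1.0, 0.6], [0.4, 1.0]])      # hypothetical mixing matrix
X = A @ S
W0 = np.eye(2)
W1 = natural_gradient_step(W0, X)
```

The update multiplies the ordinary gradient by the positive definite transform $W^T W$, so a sufficiently small step increases the log likelihood.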
Newton Method – Hessian
• Take derivative of (i,j)th element of gradient with respect to (k,l)th element of W :
• This defines a linear transform :
• In matrix form, this is:
Newton Method – Hessian
• To invert: rewrite the Hessian transformation in terms of the source estimates:
• Define , , :
• Want to solve linear equation :
Newton Method – Hessian
• The Hessian transformation can be simplified using source independence and zero mean:
• This leads to 2x2 block diagonal form:
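The simplification can be sketched in assumed notation (the symbols $B$, $g_{ij}$, $\sigma_i^2$, $\eta_i$, $\kappa_i$ are chosen here and are not confirmed by the slide text). Write $f_i = -\log q_i$, $y = Wx$, and expand the normalized log likelihood $L(W) = \log|\det W| - \sum_i E[f_i(y_i)]$ around $W$ under the perturbation $W \to (I + B)W$:

```latex
\begin{align}
L\bigl((I+B)W\bigr) &\approx L(W) + \sum_{i,j} g_{ij}\, b_{ij}
   - \tfrac{1}{2} \sum_{i,j} b_{ij}\, [\mathcal{H}(B)]_{ij},
   \qquad g_{ij} = \delta_{ij} - E\bigl[f_i'(y_i)\, y_j\bigr] \\
\sigma_i^2 &= E\bigl[y_i^2\bigr], \qquad
\eta_i = E\bigl[f_i''(y_i)\bigr], \qquad
\kappa_i = E\bigl[y_i^2 f_i''(y_i)\bigr] \\
[\mathcal{H}(B)]_{ii} &= (1 + \kappa_i)\, b_{ii}, \qquad
[\mathcal{H}(B)]_{ij} = \eta_i \sigma_j^2\, b_{ij} + b_{ji} \quad (i \neq j) \\
\begin{bmatrix} \eta_i \sigma_j^2 & 1 \\ 1 & \eta_j \sigma_i^2 \end{bmatrix}
\begin{bmatrix} b_{ij} \\ b_{ji} \end{bmatrix}
&= \begin{bmatrix} g_{ij} \\ g_{ji} \end{bmatrix} \quad (i \neq j)
\end{align}
```

The cross terms $E[f_i''(y_i)\, y_j y_k]$ with $j \neq k$ vanish when the source estimates are independent and zero mean, which is what decouples the Hessian into one scalar equation per diagonal entry and one 2x2 system per pair $(i, j)$.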
Newton Direction
• Invert Hessian transformation, evaluate at gradient:
• Leads to the following equations:
• Calculate the Newton direction:
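The Newton direction can be sketched by solving one scalar equation per diagonal entry and one 2x2 system per pair of sources. The block-diagonal Hessian form used here, and the names `sigma2` ($E[y_i^2]$), `eta` ($E[\varphi_i'(y_i)]$), and `kappa` ($E[y_i^2 \varphi_i'(y_i)]$) for a score function $\varphi_i$, are assumptions of this sketch rather than the authors' code. The `tanh` score and the matching sech-shaped sources are also illustrative choices.

```python
import numpy as np

def hessian_stats(Y, dphi):
    """Per-source statistics from estimates Y (n x N) and score derivatives dphi (n x N)."""
    return np.mean(Y ** 2, axis=1), np.mean(dphi, axis=1), np.mean(Y ** 2 * dphi, axis=1)

def apply_hessian(B, sigma2, eta, kappa):
    """Assumed block-diagonal Hessian: eta_i*sigma_j^2*b_ij + b_ji off the diagonal."""
    H = np.outer(eta, sigma2) * B + B.T
    H[np.diag_indices_from(B)] = (1.0 + kappa) * np.diag(B)
    return H

def newton_direction(G, sigma2, eta, kappa):
    """Solve apply_hessian(B) = G in closed form, one 2x2 block per pair (i, j)."""
    a = np.outer(eta, sigma2)              # a[i, j] = eta_i * sigma_j^2
    det = a * a.T - 1.0                    # determinant of each 2x2 block
    np.fill_diagonal(det, 1.0)             # placeholder; the diagonal is set below
    B = (a.T * G - G.T) / det
    B[np.diag_indices_from(B)] = np.diag(G) / (1.0 + kappa)
    return B

rng = np.random.default_rng(0)
n, N = 3, 20000
U = rng.uniform(1e-9, 1.0 - 1e-9, size=(n, N))
Y = np.log(np.tan(0.5 * np.pi * U))        # inverse-CDF samples from sech(y) / pi
phi, dphi = np.tanh(Y), 1.0 - np.tanh(Y) ** 2
G = np.eye(n) - (phi @ Y.T) / N            # same matrix as in the natural gradient
sigma2, eta, kappa = hessian_stats(Y, dphi)
B = newton_direction(G, sigma2, eta, kappa)
W_new = (np.eye(n) + B) @ np.eye(n)        # step size 1, starting from W = I
```

Where the natural gradient update uses `G @ W` directly, this update passes `G` through the inverse Hessian first, which is why a unit step size is plausible near the solution.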
Positive Definiteness of Hessian
• Conditions for positive definiteness:
• Always true when the model source densities match the true source densities:

1)

2)

3)
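A sketch of why such conditions hold when the densities match, in assumed notation ($f_i = -\log q_i$, $\sigma_i^2 = E[y_i^2]$, $\eta_i = E[f_i''(y_i)]$, $\kappa_i = E[y_i^2 f_i''(y_i)]$), and not necessarily in the form or order of the three conditions on the slide. The block-diagonal Hessian is positive definite when $1 + \kappa_i > 0$, $\eta_i > 0$, and $\eta_i \eta_j \sigma_i^2 \sigma_j^2 > 1$. If $y_i$ is distributed according to $q_i = e^{-f_i}$ and boundary terms vanish, integration by parts gives:

```latex
\begin{align}
E\bigl[y_i f_i'(y_i)\bigr] &= 1 \\
\eta_i = E\bigl[f_i''(y_i)\bigr] &= E\bigl[f_i'(y_i)^2\bigr] > 0 \\
1 + \kappa_i &= E\bigl[(y_i f_i'(y_i))^2\bigr] - 1
   \;\ge\; \bigl(E[y_i f_i'(y_i)]\bigr)^2 - 1 = 0 \\
\eta_i \sigma_i^2 &= E\bigl[f_i'(y_i)^2\bigr]\, E\bigl[y_i^2\bigr]
   \;\ge\; \bigl(E[y_i f_i'(y_i)]\bigr)^2 = 1
\end{align}
```

The last line is the Cauchy-Schwarz inequality, with equality only when $f_i'(y) \propto y$, that is, for a Gaussian source. The product $\eta_i \eta_j \sigma_i^2 \sigma_j^2$ therefore exceeds 1 unless both sources $i$ and $j$ are Gaussian, and the third line holds with equality only if $y_i f_i'(y_i)$ is almost surely constant.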

Newton for ICA Mixture Model
• Similar derivation applies to ICA mixture model:
Convergence Rates
• Convergence is much faster than with the natural gradient, and the Newton method works with step size 1
• Need correct source density model

[Figure: log likelihood vs. iteration, two panels]

Segmentation of EEG experiment trials

[Figure: segmentation with 3 models and with 4 models; trial vs. time panels, each with a log likelihood vs. iteration plot]

Applications to EEG—Epilepsy

[Figure: 1 model and 5 models; log likelihood vs. time, and log likelihood difference from the single model vs. time]

Conclusion
• We applied the method of Amari, Cardoso, and Laheld to formulate a Newton method for the ICA mixture model
• Arbitrary source densities modeled with non-gaussian source mixture model
• Non-stationarity modeled with ICA mixture model (multiple mixing matrices learned)
• It works: the Newton method is substantially faster (superlinear convergence), and it can converge when the natural gradient fails
Code
• Matlab code is available
• Generate toy mixture model data for testing
• Full method implemented: mixture sources, mixture ICA, Newton
• Extended version of paper in preparation, with derivation of mixture model Newton updates