
Connection between Multilayer Perceptrons and Regression Using Independent Component Analysis


Presentation Transcript


  1. Connection between Multilayer Perceptrons and Regression Using Independent Component Analysis
  Aapo Hyvärinen and Ella Bingham
  Preliminary version appeared in Proc. ICANN'99
  Summarized by Seong-woo Chung, 2001.9.14

  2. Introduction
  • Express the observed random variables x1, x2, …, xq as linear combinations of unknown component variables s1, s2, …, sn (n ≥ q is required for a nonsingular joint density)
  • The variables in x are divided into two parts, observed and missing
  • The first k variables form the vector of observed variables xo = (x1, …, xk)T, and the remaining variables form the vector of missing variables xm = (xk+1, …, xq)T; the model is restated below
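In matrix form (restating the slide's notation; the corresponding row split of the mixing matrix A into Ao and Am is implied by the partition of x, following the paper's convention):

  \[
    x = A s, \qquad x_o = A_o s, \qquad x_m = A_m s,
  \]

where A is the q×n mixing matrix, Ao holds its first k rows, and Am holds the remaining q−k rows.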

  3. Introduction (Continued)
  • The problem is to predict xm for a given observation of xo
  • The regression is conventionally defined as the conditional expectation E{xm | xo}, written out below
  • Model the joint density of x by ICA; then, for a given sample of incomplete data, predict the missing values in xm using this conditional expectation, which is well defined once the ICA model has been estimated
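For reference, the conditional expectation in question is the standard one (a textbook definition rather than anything specific to this paper):

  \[
    \hat{x}_m = E\{x_m \mid x_o\} = \int x_m \, p(x_m \mid x_o) \, dx_m
  \]

Under the ICA model, p(xm | xo) is determined by the mixing matrix A and the source densities pi, which is why the regression becomes well defined once the model has been estimated.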

  4. Regression by ICA and by an MLP: The Connection
  • Denote the probability densities of the si by pi, and define gi(u) = p′i(u)/pi(u) + cu
  • The regression function for data modeled by ICA is given by the output of an MLP with one hidden layer (assembled below)
  • The weight vectors of the MLP are simple functions of the mixing matrix, and the nonlinear activation functions of the MLP are functions of the probability densities of the si
  • The vector AoTxo can be interpreted as an initial linear estimate of s
  • The nonlinear action of g(·) consists largely of thresholding the linear estimates of s, giving ŝ = g(AoTxo)
  • The final layer is essentially a linear reconstruction of the form x̂m = Amŝ
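Assembling these bullets, the whole predictor has the one-hidden-layer MLP form (put together from the slide's own pieces; c is the constant from the definition of gi above):

  \[
    \hat{x}_m = A_m \, g(A_o^T x_o), \qquad g_i(u) = \frac{p_i'(u)}{p_i(u)} + c\,u
  \]

Here AoT plays the role of the input-to-hidden weights, the gi act as the hidden-unit activation functions, and Am supplies the hidden-to-output weights.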

  5. Simulation
  • The simulation data are 100-dimensional, with 101,000 data samples in total
  • The independent components, generated according to some probability density, are mixed using a randomly generated n×n mixing matrix
  • The mixtures x are divided into observed (xo) and missing (xm) parts
  • The dimensionality of xo is 99 and the dimensionality of xm is 1
  • The variables in xo are uncorrelated and their variance is set to one
  • A training data set of size 100,000 and a test data set of size 1,000 are used
  • ICA estimation on the training data set gives the estimated source signals s and mixing matrix A
  • The value of the missing variable xm is predicted either by numerical integration or by the MLP approximation; a sketch of the pipeline follows
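The pipeline can be sketched in a few lines of Python. This is a minimal illustration rather than a reproduction of the paper's experiments: the soft-threshold g and its threshold t are assumed stand-ins for the density-derived nonlinearity, and scikit-learn's FastICA stands in for whatever ICA estimator the authors used.

  import numpy as np
  from sklearn.decomposition import FastICA

  rng = np.random.default_rng(0)
  n, k = 100, 99                      # 100 variables; the last one is missing

  # Unit-variance Laplace (supergaussian) sources, mixed by a random matrix
  S_train = rng.laplace(scale=1 / np.sqrt(2), size=(100_000, n))
  S_test = rng.laplace(scale=1 / np.sqrt(2), size=(1_000, n))
  A_true = rng.standard_normal((n, n))
  X_train, X_test = S_train @ A_true.T, S_test @ A_true.T

  # Estimate the ICA model on the complete training data
  ica = FastICA(n_components=n, whiten="unit-variance", random_state=0)
  ica.fit(X_train)
  A = ica.mixing_                     # estimated mixing matrix, shape (n, n)
  A_o, A_m = A[:k, :], A[k:, :]       # rows for observed / missing variables

  def g(u, t=0.1):
      # Soft-thresholding shrinkage: an illustrative choice (t is a free
      # parameter here); the paper derives g_i from the source densities,
      # which for supergaussian sources yields a shrinkage of this flavour
      return np.sign(u) * np.maximum(np.abs(u) - t, 0.0)

  # MLP-form approximate regression: x_m ≈ A_m g(A_o^T x_o)
  s_hat = g(X_test[:, :k] @ A_o)      # linear estimates of s, then thresholded
  x_m_hat = s_hat @ A_m.T
  print("test MSE:", np.mean((x_m_hat - X_test[:, k:]) ** 2))

Note that the paper additionally arranges for the variables in xo to be uncorrelated with unit variance, which is what makes AoTxo a sensible linear estimate of s; the raw mixtures above skip that step for brevity.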

  6. Simulation – Strongly Supergaussian Data

  7. Simulation – Laplace Distributed Data

  8. Simulation – Very Weakly Supergaussian Data

  9. Conclusion
  • Approximation
    • If the distributions of the independent components are close to gaussian, the approximation gives excellent results
    • If they are strongly supergaussian, the approximation is less accurate but still quite reasonable in the range experimented with
  • Regression
    • The stronger the supergaussianity, the better the quality of the regression
    • In contrast, for weakly supergaussian components, ICA regression does not really explain the data

  10. Discussion
  • Regression by ICA is computationally demanding, due to the numerical integration
  • The integration may be approximated by the computationally simple procedure of computing the outputs of an MLP
  • The output of each hidden-layer neuron corresponds to the estimate of one of the independent components
  • The choice of the nonlinearity reduces to the problem of estimating the probability densities of the independent components
