
Matrix Pseudoinversion for Image Neural Processing






Presentation Transcript


  1. Matrix Pseudoinversion for Image Neural Processing. Rossella Cancelliere*, University of Turin, Turin, Italy; Mario Gai, National Institute of Astrophysics, Turin, Italy; Thierry Artières, LIP6, P. et M. Curie University, Paris, France; Patrick Gallinari, LIP6, P. et M. Curie University, Paris, France

  2. Summary • Introduction • How to use pseudoinversion for neural training • How to evaluate pseudoinverse matrices • The application: an astronomical problem • Results and discussion

  3. Introduction (1) Our work builds on some new ideas concerning the use of matrix pseudoinversion to train Single Hidden Layer Feedforward Networks (SLFNs). Many widely used training techniques randomly assign initial weight values, which are then iteratively modified (e.g. by gradient descent methods). In doing so, one must deal with the usual issues: slowness, local minima, and the determination of an optimal learning step.

  4. Introduction (2) Some procedures based on the evaluation of the generalized inverse matrix (or Moore-Penrose pseudoinverse) have recently been proposed, such as the extreme learning machine (ELM, Huang et al., 2006). Their main feature is that input weights are randomly chosen and never modified, while output weights are analytically determined by MP pseudoinversion. These non-iterative procedures make training very fast, but some care is required because of the known numerical instability of pseudoinversion.
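A minimal sketch of this training scheme, in NumPy, is given below. Variable names and the (-1, 1) sampling interval are illustrative, not the authors' exact setup.

```python
import numpy as np

def elm_train(X, T, n_hidden, rng=None):
    """ELM-style training: random, fixed input weights; output weights by MP pseudoinversion."""
    rng = np.random.default_rng(0) if rng is None else rng
    n_inputs = X.shape[1]
    A = rng.uniform(-1.0, 1.0, size=(n_inputs, n_hidden))   # input weights, never modified
    b = rng.uniform(-1.0, 1.0, size=n_hidden)                # hidden biases
    H = np.tanh(X @ A + b)                                    # hidden layer output matrix
    W = np.linalg.pinv(H) @ T                                 # output weights via MP pseudoinverse
    return A, b, W

def elm_predict(X, A, b, W):
    return np.tanh(X @ A + b) @ W
```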

  5. Summary • Introduction • How to use pseudoinversion for neural training • How to evaluate pseudoinverse matrices • The application: an astronomical problem • Results and discussion

  6. Notation Training set: N distinct pairs (x_i, t_i), i = 1, …, N. Training aim: network outputs equal to the targets, o_i = t_i for all i; in matrix notation: H w = T, where H is the hidden layer output matrix and w collects the output weights.

  7. Least-squares solution The number of hidden nodes is much lower than the number of distinct training samples, so H is a non-square matrix. One of the least-squares solutions w of the linear system H w = T is w = H⁺ T, where H⁺ is the Moore-Penrose pseudoinverse of H. Main properties: • it has the smallest norm among all least-squares solutions • it reaches the smallest training error! Potentially dangerous for generalization: in case of many free parameters it can cause overfitting.
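In code, the minimum-norm least-squares solution can be obtained directly; a sketch with illustrative shapes (H and T as defined above, sizes chosen arbitrarily) follows.

```python
import numpy as np

# H: N x n_hidden hidden-layer output matrix, T: N x n_outputs target matrix
rng = np.random.default_rng(1)
H = np.tanh(rng.standard_normal((1000, 90)))    # N = 1000 samples, 90 hidden nodes (illustrative)
T = rng.standard_normal((1000, 1))

w = np.linalg.pinv(H) @ T                        # w = H+ T, minimum-norm least-squares solution
print(np.linalg.norm(H @ w - T))                 # training residual

# lstsq also returns the minimum-norm least-squares solution; the two should agree closely
w_lstsq, *_ = np.linalg.lstsq(H, T, rcond=None)
print(np.abs(w - w_lstsq).max())
```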

  8. Summary • Introduction • How to use Pseudoinversion for neural training • How to evaluate pseudoinverse matrices • The application: an astronomical problem • Results and discussion

  9. Pseudoinverse computation Several methods are available to evaluate the MP matrix: • the orthogonal projection (OP) method • the regularized OP (ROP) method • the singular value decomposition (SVD) method: H⁺ = V Σ⁺ Uᵀ, where V, U are unitary matrices and Σ⁺ is a diagonal matrix whose entries are the inverses of the singular values of H. Potentially sensitive to numerical instability!
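A sketch of the SVD route, including the thresholding of small singular values that makes it sensitive to numerical instability; the threshold values below are illustrative.

```python
import numpy as np

def pinv_svd(H, rcond=1e-15):
    """Moore-Penrose pseudoinverse via SVD: H+ = V diag(1/s) U^T, with small s set to zero."""
    U, s, Vt = np.linalg.svd(H, full_matrices=False)
    cutoff = rcond * s.max()                      # singular values below cutoff are treated as zero
    s_inv = np.where(s > cutoff, 1.0 / s, 0.0)
    return (Vt.T * s_inv) @ U.T                   # V diag(s_inv) U^T

# Nearly rank-deficient H: one tiny singular value makes 1/s explode unless it is thresholded
H = np.array([[1.0, 1.0], [1.0, 1.0 + 1e-12], [2.0, 2.0]])
print(pinv_svd(H, rcond=1e-10))                   # tiny singular value discarded: stable entries
print(pinv_svd(H))                                # kept under the tighter cutoff: entries of order 1e12
```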

  10. Summary • Introduction • How to use pseudoinversion for neural training • How to evaluate pseudoinverse matrices • The application: an astronomical problem • Results and discussion

  11. Chromaticity diagnosis The measured image profile of a star depends on its spectral type: the resulting error on the measured position is called chromaticity. Its correction is a major issue for the European Space Agency (ESA) mission Gaia for global astrometry, approved for launch in 2013. NN inputs: the first 5 statistical moments (K = 1, …, 5) of each simulated image, computed from the signal detected on pixel n, the ideal signal and the signal barycenter, evaluated for both 'blue' and 'red' stars, plus the 'red' barycenter. The different NN models therefore have 11 input neurons and 1 output neuron to estimate chromaticity.
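The exact moment formula did not survive extraction from the slide. Purely as a hedged illustration of how such an 11-dimensional input vector could be assembled (moments of the detected signal about the ideal signal's barycenter; the actual Gaia definition may differ), consider:

```python
import numpy as np

def moments(detected, ideal, orders=(1, 2, 3, 4, 5)):
    """Moments of the detected pixel signal about the ideal signal's barycenter (illustrative)."""
    x = np.arange(detected.size)
    bary = (x * ideal).sum() / ideal.sum()        # barycenter of the ideal (noise-free) signal
    return np.array([((x - bary) ** k * detected).sum() / detected.sum() for k in orders])

rng = np.random.default_rng(5)
x = np.arange(32)
ideal_blue = np.exp(-0.5 * ((x - 15.3) / 2.0) ** 2)       # toy 'blue' star profile
ideal_red = np.exp(-0.5 * ((x - 15.8) / 2.2) ** 2)        # toy 'red' star profile
det_blue = ideal_blue + 0.01 * rng.standard_normal(32)    # simulated detected signals
det_red = ideal_red + 0.01 * rng.standard_normal(32)
red_bary = (x * det_red).sum() / det_red.sum()            # 'red' barycenter

features = np.concatenate([moments(det_blue, ideal_blue),
                           moments(det_red, ideal_red),
                           [red_bary]])
print(features.shape)   # (11,) -> 11 input neurons
```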

  12. Summary • Introduction • How to use Pseudoinversion for neural training • How to evaluate pseudoinverse matrices • The application: an astronomical problem • Results and discussion

  13. Reference result SLFN with 11 input neurons and 1 output neuron, trained with the backpropagation algorithm. Activation function: hyperbolic tangent (fewer saturation problems thanks to its zero-mean output). Training set size: 10000 instances; test set size: 3000 instances. We look for the minimum RMSE as the hidden layer size increases from 10 to 200, with the learning rate η in the range (0.1, 0.9). Best RMSE: 3.81, with 90 hidden neurons.
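For comparison, a minimal sketch of this kind of reference setup: full-batch gradient descent on a tanh SLFN with a linear output (biases omitted for brevity; the actual experiments used a standard backpropagation implementation, and all details here are illustrative).

```python
import numpy as np

def train_slfn_bp(X, T, n_hidden, eta=0.1, epochs=500, rng=None):
    """Tanh SLFN trained by full-batch gradient descent on the squared error."""
    rng = np.random.default_rng(2) if rng is None else rng
    n_in, n_out = X.shape[1], T.shape[1]
    A = rng.uniform(-1, 1, (n_in, n_hidden))       # input weights
    W = rng.uniform(-1, 1, (n_hidden, n_out))      # output weights
    for _ in range(epochs):
        H = np.tanh(X @ A)                         # hidden activations
        E = H @ W - T                              # output error (linear output layer)
        dW = H.T @ E / len(X)                      # gradient w.r.t. output weights
        dA = X.T @ ((E @ W.T) * (1 - H ** 2)) / len(X)   # backpropagated gradient
        W -= eta * dW
        A -= eta * dA
    rmse = np.sqrt(np.mean((np.tanh(X @ A) @ W - T) ** 2))
    return A, W, rmse
```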

  14. Pseudoinversion results (1) Hidden Space Related Pseudoinversion (HSR-Pinv) - Input weights: randomly chosen according to a uniform distribution in an interval that controls saturation issues, forcing the use of the central part of the hyperbolic tangent activation function - Output weights: evaluated by pseudoinversion via SVD - Hidden layer size is gradually increased from 50 to 600 - 10 simulation trials are performed for each selected size. σ-SVD (state of the art): sigmoid activation functions and random weights uniformly distributed in (-1, 1).
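A sketch of this experimental loop follows. The exact HSR-Pinv sampling interval is not reproduced here; the 1/sqrt(n_inputs) scale and the placeholder data are stand-ins for "keep the tanh activations in their central region".

```python
import numpy as np

def hsr_pinv_train(X, T, n_hidden, scale, rng):
    """Random input weights in (-scale, scale), output weights by SVD-based pseudoinversion."""
    A = rng.uniform(-scale, scale, (X.shape[1], n_hidden))
    H = np.tanh(X @ A)
    W = np.linalg.pinv(H) @ T                      # Moore-Penrose pseudoinverse (SVD-based)
    return A, W

rng = np.random.default_rng(3)
X, T = rng.standard_normal((1000, 11)), rng.standard_normal((1000, 1))   # placeholder data
scale = 1.0 / np.sqrt(X.shape[1])                  # placeholder saturation-control interval

for n_hidden in range(50, 601, 50):                # hidden layer size increased from 50 to 600
    rmses = []
    for trial in range(10):                        # 10 simulation trials per selected size
        A, W = hsr_pinv_train(X, T, n_hidden, scale, rng)
        rmses.append(np.sqrt(np.mean((np.tanh(X @ A) @ W - T) ** 2)))
    print(n_hidden, np.mean(rmses))
```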

  15. Pseudoinversion results (2) Best results are achieved with the proposed HSR method (blue curve). The same method used with sigmoid functions performs slightly worse (green curve). The 'constant weight size + pseudoinversion' approach clearly shows worse performance (red and pale blue curves). Hypothesis: saturation control does not allow specialization on particular training instances, thus avoiding overfitting.

  16. Pseudoinversion results (3) Error peak: the ratio between the minimum singular value and the Matlab default threshold approaches unity in the peak region (on a logarithmic scale). Solution: threshold tuning. The new threshold is a function of the size of the singular values near the peak region (180 hidden neurons). Results better than BP are in any case obtained with fewer neurons (roughly 150). Greater robustness, slight RMSE increase.
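In NumPy terms, such a fix amounts to raising the cutoff on small singular values, exposed through the `rcond` argument of `np.linalg.pinv`; the tuned value and the synthetic matrix below are illustrative, not the values used in the experiments.

```python
import numpy as np

def pinv_tuned(H, rcond):
    """Pseudoinverse with an explicit singular-value cutoff instead of the library default."""
    return np.linalg.pinv(H, rcond=rcond)

# Build a matrix whose smallest singular value sits just above the library's default cutoff:
# 1/s_min is then huge, and so are the resulting output weights (the error peak).
rng = np.random.default_rng(4)
U, s, Vt = np.linalg.svd(rng.standard_normal((200, 180)), full_matrices=False)
s[-1] = s[0] * 1e-13                               # plant a near-threshold singular value
H = (U * s) @ Vt

print(np.abs(np.linalg.pinv(H)).max())             # default cutoff keeps it: huge entries
print(np.abs(pinv_tuned(H, rcond=1e-10)).max())    # raised cutoff discards it: bounded entries
```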

  17. Further developments The issues of overfitting and numerical instability seem to have a dramatic impact on performance. Regularization (Tikhonov 1963, 1977) is an established method for dealing with ill-posed problems: thanks to the introduction of a penalty term, it seems promising as a way to avoid overfitting. Possible effects on instability control also have to be investigated.
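A sketch of what a Tikhonov-regularized output layer would look like; λ is a hypothetical regularization parameter, and this anticipates the proposed development rather than reporting an implemented result.

```python
import numpy as np

def ridge_output_weights(H, T, lam=1e-3):
    """Tikhonov-regularized output weights: minimize ||H W - T||^2 + lam * ||W||^2."""
    n_hidden = H.shape[1]
    return np.linalg.solve(H.T @ H + lam * np.eye(n_hidden), H.T @ T)
```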
