
Neural Network Classification of Simulated AUGER data



  1. Neural Network Classification of Simulated AUGER data Giuseppe Longo, on behalf of the Naples team (Aramo, Ambrosio, Donalek, Tagliaferri) Department of Physical Sciences - University of Napoli “Federico II”, Sezione di Napoli; longo@na.infn.it Valencia, October 2003

  2. SOM • The Self-Organizing Map (SOM) is an unsupervised neural network (Kohonen, 1982, 1988) tailored for the visualisation of high-dimensional data. • A SOM consists of neurons (typically 10 to ~1000) organized on a regular low-dimensional grid. • Each neuron is represented by a d-dimensional weight vector, where d is the dimension of the input vectors. Neurons are connected to adjacent neurons by a neighborhood relation, which dictates the topology (structure) of the map. • The neurons organize automatically into a meaningful two-dimensional order in which neurons modelling similar data lie closer to each other on the grid than dissimilar ones. In this sense the SOM is a similarity graph, and a clustering diagram too. • A SOM is trained iteratively. For each vector x from the input data set, the distances between x and all weight vectors of the SOM are calculated using some distance measure (typically the Euclidean distance). The neuron whose weight vector is closest to the input vector x is called the Best-Matching Unit (BMU).
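To make the training loop concrete, here is a minimal sequential-SOM sketch in Python/NumPy. It is not the code used for the AUGER study; the grid size, decay schedules and random initialisation are illustrative assumptions.

```python
import numpy as np

def train_som(data, grid_shape=(10, 10), n_iter=2000, seed=0):
    """Minimal sequential SOM sketch with a shrinking Gaussian
    neighbourhood (hyper-parameters are illustrative assumptions)."""
    rng = np.random.default_rng(seed)
    n_rows, n_cols = grid_shape
    d = data.shape[1]
    # One d-dimensional weight (prototype) vector per map unit.
    weights = rng.normal(size=(n_rows * n_cols, d))
    # Grid coordinates of each unit, used by the neighbourhood function.
    coords = np.array([(i, j) for i in range(n_rows) for j in range(n_cols)], float)

    for t in range(n_iter):
        x = data[rng.integers(len(data))]
        # Best-Matching Unit: the unit whose weight vector is closest (Euclidean).
        bmu = np.argmin(np.linalg.norm(weights - x, axis=1))
        # Learning rate and neighbourhood radius decay over time.
        lr = 0.5 * np.exp(-t / n_iter)
        sigma = max(n_rows, n_cols) / 2.0 * np.exp(-t / n_iter)
        # Gaussian neighbourhood centred on the BMU's grid position.
        grid_dist2 = np.sum((coords - coords[bmu]) ** 2, axis=1)
        h = np.exp(-grid_dist2 / (2.0 * sigma ** 2))
        # Pull every unit towards x, weighted by its closeness to the BMU.
        weights += lr * h[:, None] * (x - weights)
    return weights, coords
```

A 10x10 map would be trained with `weights, coords = train_som(data)`; the resulting `weights` array is reused by the U-matrix and localization sketches further below.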

  3. Relation between U-matrix and maps • In each figure, the hexagon in a given position corresponds to the same map unit. In the U-matrix, additional hexagons exist between all pairs of neighbouring map units. • The component plane (parameter) and the U-matrix may have colour bars showing the scale of the variable. By default, the scale shows the values that the variable takes in the map structure.

  4. Data Mining: SOM - U-matrix - component planes (CDF)

  5. Upper: cell structure. Lower: smoothed (close to confidence levels) U-matrix. The unified distance matrix (U-matrix) visualizes the distances between neighbouring map units and thus shows the cluster structure of the map. Areas of similar colour correspond to neurons that recognize similar objects. It is calculated using all variables.
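As a sketch of how the U-matrix is obtained from the trained map, the helper below averages, for each unit, the distances between its weight vector and those of its grid neighbours. It assumes the `weights` layout of the SOM sketch above and a rectangular 4-neighbour lattice rather than the hexagonal one shown on the slides.

```python
import numpy as np

def u_matrix(weights, grid_shape):
    """Simplified U-matrix: mean distance of each unit's weight vector
    to those of its 4-neighbours on a rectangular grid."""
    n_rows, n_cols = grid_shape
    w = weights.reshape(n_rows, n_cols, -1)
    u = np.zeros((n_rows, n_cols))
    for i in range(n_rows):
        for j in range(n_cols):
            dists = []
            for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < n_rows and 0 <= nj < n_cols:
                    dists.append(np.linalg.norm(w[i, j] - w[ni, nj]))
            u[i, j] = np.mean(dists)
    return u
```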

  6. Data Mining: SOM - U-matrix - component planes

  7. Validation • Where on the map is a specific data sample located? • The simplest answer is to find the BMU of the data sample. • Localization can also be performed using only a subset of the parameters. [Figure: map localization example, labels "S"]
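A minimal illustration of this localization step, assuming the `weights` array from the SOM sketch above; the `feature_idx` argument is a hypothetical way of restricting the match to a subset of the parameters, as the slide describes.

```python
import numpy as np

def locate_bmu(weights, sample, feature_idx=None):
    """Return the index of the BMU for `sample`; if `feature_idx` is
    given, only that subset of parameters is used for the match."""
    if feature_idx is not None:
        w, s = weights[:, feature_idx], sample[feature_idx]
    else:
        w, s = weights, sample
    return int(np.argmin(np.linalg.norm(w - s, axis=1)))
```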

  8. [Figure: map localization example, labels "G"]

  9. One can also investigate whole data sets using the map. Here is the response for a set of 50 similar objects.

  10. Other visualizations: 3-D U-matrix; similarity coloring; BARPLANE, which shows a bar chart in each map unit.

  11. An intriguing pattern-recognition problem: the curves are very similar in shape.

  12. UNSUPERVISED SOM (120 nodes). SOM similarity-coloring map: each hexagon represents a neuron, and different colors denote different clusters. Neurons are labeled using simulated data (A = proton; B = Helium; C = Oxygen; D = Iron). Success rates: p = 34%, He = 30%, O = 28%, Fe = 41%.

  13. SUPERVISED MLP. Results for two MLPs (22 hidden neurons, softmax activation function). Panel a: conjugate-gradient optimization algorithm. Panel b: gradient-descent optimization algorithm.
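The MLP code itself is not part of the transcript; the sketch below sets up a comparable pair of networks with scikit-learn, which is an assumption rather than the original tool. scikit-learn's multiclass `MLPClassifier` applies a softmax output layer automatically, and since it does not expose a conjugate-gradient optimizer, `lbfgs` and `sgd` are used here as stand-ins for panels a and b.

```python
from sklearn.neural_network import MLPClassifier

# X: shower observables, y: primary labels ('p', 'He', 'O', 'Fe');
# both are placeholders for the simulated AUGER data set.
def build_mlps():
    common = dict(hidden_layer_sizes=(22,), max_iter=2000, random_state=0)
    # Panel a analogue: a second-order optimizer (L-BFGS as a stand-in
    # for conjugate gradient, which scikit-learn does not provide).
    mlp_a = MLPClassifier(solver="lbfgs", **common)
    # Panel b analogue: plain (stochastic) gradient descent.
    mlp_b = MLPClassifier(solver="sgd", learning_rate_init=0.01, **common)
    return mlp_a, mlp_b

# Usage: mlp_a.fit(X_train, y_train); mlp_a.score(X_test, y_test)
```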

  14. Something going on in Napoli (DSF-INFN) • Centro Calcolo Parallelo e GRID (Parallel Computing and GRID Centre), Dipartimento di Scienze Fisiche, CNR, INFN, INFM, INGV (3 Beowulf clusters, ca. 512 nodes; 1 IBM machine with 64 processors)

  15. The future: Generative Topographic Mapping. The Generative Topographic Mapping (GTM) model was introduced by Bishop et al. (1998) as a probabilistic re-formulation of the self-organizing map (SOM). It overcomes the limitations of the SOM while introducing no significant disadvantages.

  16. S.O.M. versus G.T.M. • The SOM algorithm is not derived by optimizing an objective function. • SOM does not define a density model. • Neighbourhood preservation is not guaranteed by the SOM procedure. • There is no certainty that the code-book vectors will converge using SOM. • GTM, in contrast: • The neighbourhood-preserving nature of the mapping is an automatic consequence of the choice of a smooth, continuous function y(x; W). • GTM defines an explicit probability density function in data space. • Convergence of the batch GTM algorithm is guaranteed by the EM (Expectation-Maximization) algorithm.

  17. How GTM works. We define a probability distribution p(x) on the latent-variable space; through the mapping y(x; W) this induces a corresponding distribution in the data space. p(x) plays the role of a prior distribution over x. Since in reality the data only approximately lie on a lower-dimensional manifold, it is appropriate to include a noise model for the vector t, for example a spherical Gaussian: p(t | x, W, β) = (β/2π)^{D/2} exp( −(β/2) ‖y(x; W) − t‖² ), where D is the dimension of the data space and β the inverse noise variance.

  18. The pdf in t-space, for a given value of W, is then obtained by integrating over the x-distribution: p(t | W, β) = ∫ p(t | x, W, β) p(x) dx  (1). We take p(x) to be a sum of delta functions centred on the nodes of a regular grid in latent space, p(x) = (1/K) Σ_{k=1}^{K} δ(x − x_k). This form of p(x) allows the integral in (1) to be performed analytically, and the distribution function in data space takes the form of a Gaussian mixture model: p(t | W, β) = (1/K) Σ_{k=1}^{K} p(t | x_k, W, β).
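As a small numerical counterpart to the mixture above, the following sketch evaluates p(t_n | W, β) for every data point. `Y` is assumed to hold the K mixture centres y(x_k; W); the names are chosen for illustration and match the GTM training sketch given later.

```python
import numpy as np

def gtm_density(T, Y, beta):
    """Evaluate the GTM Gaussian-mixture density p(t_n | W, beta).
    T: (N, D) data matrix; Y: (K, D) mixture centres y(x_k; W)."""
    N, D = T.shape
    dist2 = ((Y[:, None, :] - T[None, :, :]) ** 2).sum(-1)   # (K, N)
    norm = (beta / (2.0 * np.pi)) ** (D / 2.0)
    # Uniform mixing coefficients 1/K -> mean over the K components.
    return norm * np.exp(-0.5 * beta * dist2).mean(axis=0)   # (N,)
```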

  19. To maximize this distribution with respect to W and β, we can use the likelihood L(W, β) = Π_{n=1}^{N} p(t_n | W, β). However, it is usually more convenient to use the log-likelihood: ln L(W, β) = Σ_{n=1}^{N} ln [ (1/K) Σ_{k=1}^{K} p(t_n | x_k, W, β) ].

  20. EM algorithm for GTM. Given some initial values for W and β, the E-step for the GTM is the same as for a general Gaussian mixture model: compute the responsibilities r_kn = p(x_k | t_n, W, β) = p(t_n | x_k, W, β) / Σ_{k'} p(t_n | x_{k'}, W, β), which correspond to the posterior probability that the n-th data point was generated by the k-th component. The M-step then maximizes the expected log-likelihood with respect to W and β: writing the mapping as y(x; W) = W φ(x) for a fixed set of basis functions φ, W is obtained from a weighted least-squares problem, Φᵀ G Φ Wᵀ = Φᵀ R T (with G = diag(Σ_n r_kn) and R the matrix of responsibilities), and the noise is re-estimated as 1/β = (1/(N D)) Σ_{k,n} r_kn ‖y(x_k; W) − t_n‖².

  21. Summary of the GTM algorithm
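The slide's summary figure is not reproduced in the transcript; the sketch below implements the same E/M cycle (responsibilities, weighted least squares for W, re-estimation of β) following Bishop, Svensén & Williams (1998). Grid sizes, the RBF width, the regularizer and the random initialisation are illustrative assumptions, not the settings used in the talk, which would typically use a PCA-based initialisation.

```python
import numpy as np

def gtm_fit(T, K=100, M=16, n_iter=30, reg=1e-3, seed=0):
    """Minimal GTM/EM sketch. T: (N, D) data matrix; K latent grid
    points on a square 2-D grid; M Gaussian basis functions."""
    rng = np.random.default_rng(seed)
    N, D = T.shape

    # Regular grid of latent points x_k in [-1, 1]^2 and RBF centres.
    g = int(np.sqrt(K)); K = g * g
    X = np.array([(a, b) for a in np.linspace(-1, 1, g)
                          for b in np.linspace(-1, 1, g)])
    m = int(np.sqrt(M)); M = m * m
    mu = np.array([(a, b) for a in np.linspace(-1, 1, m)
                           for b in np.linspace(-1, 1, m)])
    sigma = 2.0 / (m - 1)                       # RBF width (assumption)

    # Basis matrix Phi (K x M+1, with a bias column): y(x_k; W) = Phi_k W.
    d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(-1)
    Phi = np.hstack([np.exp(-d2 / (2 * sigma ** 2)), np.ones((K, 1))])

    W = rng.normal(scale=0.1, size=(M + 1, D))  # random init (assumption)
    beta = 1.0

    for _ in range(n_iter):
        Y = Phi @ W                                            # (K, D) centres
        # E-step: responsibilities r_kn = p(x_k | t_n, W, beta).
        dist2 = ((Y[:, None, :] - T[None, :, :]) ** 2).sum(-1)  # (K, N)
        logr = -0.5 * beta * dist2
        logr -= logr.max(axis=0, keepdims=True)                 # stabilise
        R = np.exp(logr); R /= R.sum(axis=0, keepdims=True)
        # M-step: weighted least squares for W, then update beta.
        G = np.diag(R.sum(axis=1))
        A = Phi.T @ G @ Phi + reg * np.eye(M + 1)
        W = np.linalg.solve(A, Phi.T @ (R @ T))
        Y = Phi @ W
        dist2 = ((Y[:, None, :] - T[None, :, :]) ** 2).sum(-1)
        beta = N * D / (R * dist2).sum()
    return X, Phi, W, beta, R
```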

  22. The GTM learning process. The plots show the density model in data space at iterations 0 (the initial configuration), 1, 2, 4, 8 and 15. Data points are plotted as 'o', while the centres of the Gaussian mixture are plotted as '+'. The centres are joined by a line according to their ordering in the latent space.

  23. Visualization. An important potential application of the GTM is visualization. We defined a probability distribution in the data space conditioned on the latent variable. We can therefore use Bayes' theorem, in conjunction with the prior distribution over the latent variable, p(x), to compute the corresponding posterior distribution in latent space for any given point t in data space: p(x_k | t) = p(t | x_k, W, β) p(x_k) / Σ_{k'} p(t | x_{k'}, W, β) p(x_{k'}).
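Given the responsibilities R returned by the training sketch above (which already equal the posterior p(x_k | t_n) under the uniform grid prior), the latent-space visualization reduces to a posterior-mean projection. The helper below is an illustrative one-liner, not code from the talk.

```python
def latent_projection(X, R):
    """Posterior-mean projection of each data point onto the latent plane:
    <x | t_n> = sum_k r_kn * x_k.  X: (K, 2) latent grid; R: (K, N)."""
    return R.T @ X          # (N, 2) latent-space coordinates
```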

  24. Latent-space plane: posterior probability distribution for all points of the data set.

  25. Latent-space plane: posterior probability distribution for all galaxies.

  26. Latent-space plane: posterior probability distribution for all objects of class A.

  27. [Figure: latent-space projections; panels: all points, class A, class B, not classified]
