2806 Neural Computation: Self-Organizing Maps, Lecture 9
2005 Ari Visa

Agenda
  • Some historical notes
  • Some theory
  • Self-Organizing Map
  • Learning Vector Quantization
  • Conclusions
Some Historical Notes

Local ordering (von der Malsburg, 1973)

(Amari, 1980) a mathematical analysis elucidating the dynamic stability of a cortical map

Self-organizing feature map (SOM), (Kohonen 1982)

(Erwin, 1992) a convex neighbourhood function should be used (~ Gaussian)

The relationship between the SOM and principal curves is discussed in (Ritter, 1992) and (Cherkassky and Mulier, 1995)

Some Historical Notes
  • Vector quantization: Lloyd algorithm (1957) for scalar quantization (= Max quantizer)
  • The generalized Lloyd algorithm for vector quantization (= k-means algorithm, MacQueen 1967, = LBG algorithm, Linde et al. 1980)
  • The idea of learning vector quantization (LVQ) (Kohonen, 1986)
  • Convergence properties of the LVQ algorithm using the ordinary differential equation (ODE) (Baras and LaVigna, 1990)
Some Theory
  • The spatial location of an output neuron in a topographic map corresponds to a particular domain or feature of data drawn from the input space. (Kohonen 1990)
  • Two approaches: Willshaw-von der Malsburg (explains neurobiological details) and Kohonen (more general than the first model in the sense that it is capable of performing data reduction).
Some Theory
  • The principal goal of the self-organizing map is to transform an incoming signal pattern of arbitrary dimension into a one- or two-dimensional discrete map, and to perform this transformation adaptively in a topologically ordered fashion.
The formation of the self-organizing map

1. Competition

2. Cooperation

3. Synaptic Adaptation

Competitive Process

x = [x1 , x2 , ... ,xm ]T

wj = [wj1,wj2 , ... ,wjm]T

where j = 1,2,...,l (=total number of neurons in the network)

Select the neuron with the largest wjTx; equivalently,

i(x) = arg minj ||x – wj ||

(the maximum inner product and minimum Euclidean distance criteria pick the same winner when all weight vectors have equal norms)

A continuous input space of activation patterns is mapped onto a discrete output space of neurons by a process of competition among the neurons in the network .
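This competitive process can be sketched in a few lines of NumPy (an illustrative sketch, not from the lecture; the function name and array layout are assumptions):

```python
import numpy as np

def find_winner(x, W):
    """Return the index of the best-matching unit: the neuron whose
    weight vector is closest to x, i(x) = arg min_j ||x - w_j||."""
    distances = np.linalg.norm(W - x, axis=1)  # one distance per neuron
    return int(np.argmin(distances))
```

Here `W` is an l-by-m matrix whose rows are the weight vectors wj, so the whole competition is one vectorized distance computation.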

SOM

Cooperation

  • How to define a topological neighborhood that is neurobiologically correct?
  • Let hj,i denote the topological neighborhood centered on winning neuron i, and encompassing a set of excited neurons denoted by j.
  • The topological neighborhood is symmetric about the maximum point
  • The amplitude of the topological neighborhood decreases monotonically with increasing lateral distance: hj,i(x)(n) = exp(−d²j,i / (2σ²(n)))
  • σ(n) = σ0 exp(−n/τ1), n = 0, 1, 2, …
SOM

Adaptation

  • wj(n+1) = wj(n) + η(n) hj,i(x)(n)[x − wj(n)]; note the Hebbian form of the update
  • The synaptic weight vector wj of the winning neuron i moves toward the input vector x. Upon repeated presentations of the training data, the synaptic weight vectors tend to follow the distribution of the input vectors due to the neighborhood updating → topological ordering
  • The learning-rate parameter η(n) should be time-varying: η(n) = η0 exp(−n/τ2), n = 0, 1, 2, …
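A single adaptation step, combining the decaying learning rate, the neighborhood, and the weight update above, might look like this (a sketch; the lattice layout and parameter defaults are assumptions):

```python
import numpy as np

def som_update(W, x, winner, grid, n, eta0=0.1, tau2=1000.0,
               sigma0=2.0, tau1=1000.0):
    """One SOM adaptation step:
    w_j <- w_j + eta(n) * h_{j,i(x)}(n) * (x - w_j)."""
    eta = eta0 * np.exp(-n / tau2)                    # learning rate eta(n)
    sigma = sigma0 * np.exp(-n / tau1)                # neighborhood width sigma(n)
    d2 = np.sum((grid - grid[winner]) ** 2, axis=1)   # squared lattice distances
    h = np.exp(-d2 / (2 * sigma ** 2))                # Gaussian neighborhood
    return W + eta * h[:, None] * (x - W)
```

`grid` holds the lattice coordinates of the neurons, so distances are measured on the output lattice while the weights move in the input space.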
SOM
  • Ordering and Convergence
  • Self-organizing or ordering phase: (max 1000 iterations)
  • η(n) decreases from 0.1 to 0.01; τ2 = 1000
  • hj,i(n) shrinks from a "radius" covering most of the lattice down to the winning neuron and a couple of neighboring neurons; τ1 = 1000 / log σ0
  • Convergence phase: (fine-tunes the feature map; about 500 × the number of neurons in the network iterations)

η(n) = 0.01; hj,i(n) contains the winning neuron and one or zero neighboring neurons.
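The two-phase schedule can be combined with the three processes into a compact trainer (an illustrative sketch; the parameter values, the clamping of η(n) at 0.01, and the random initialization are assumptions, not prescriptions from the lecture):

```python
import numpy as np

def train_som(data, grid, n_ordering=1000, n_convergence=None, seed=0):
    """Two-phase SOM training: an ordering phase with decaying eta(n)
    and sigma(n), then a convergence phase with eta = 0.01 and a
    neighborhood covering (almost) only the winner."""
    rng = np.random.default_rng(seed)
    l = len(grid)
    if n_convergence is None:
        n_convergence = 500 * l              # 500 * number of neurons
    W = rng.random((l, data.shape[1]))       # random initial weights
    eta0, sigma0, tau2 = 0.1, 2.0, 1000.0
    tau1 = 1000.0 / np.log(sigma0)           # tau1 = 1000 / log sigma0
    for n in range(n_ordering + n_convergence):
        x = data[rng.integers(len(data))]    # random training sample
        if n < n_ordering:                   # ordering phase
            eta = max(eta0 * np.exp(-n / tau2), 0.01)
            sigma = sigma0 * np.exp(-n / tau1)
        else:                                # convergence phase
            eta, sigma = 0.01, 0.5
        winner = int(np.argmin(np.linalg.norm(W - x, axis=1)))
        d2 = np.sum((grid - grid[winner]) ** 2, axis=1)
        h = np.exp(-d2 / (2 * sigma ** 2))   # Gaussian neighborhood
        W += eta * h[:, None] * (x - W)
    return W
```

Because every update is a convex combination of a weight vector and a data point, the weights stay inside the range of the data.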

Some Theory
  • Property 1. Approximation of the Input Space. The feature map Φ, represented by the set of synaptic weight vectors {wj} in the output space A, provides a good approximation to the input space H.
  • The theoretical basis of the idea is rooted in vector quantization theory.
Some Theory
  • c(x) acts as an encoder of the input vector x and x’(c) acts as a decoder of c(x). The vector x is selected at random from the training sample, subject to an underlying probability density function fx(x). The optimum encoding-decoding scheme is determined by varying the functions c(x) and x’(c), so as to minimize the expected distortion D = ∫ fx(x) d(x, x’(c(x))) dx.
Some Theory
  • d(x,x’) = ||x – x’||² = (x – x’)T(x – x’)

Generalized Lloyd algorithm

Condition 1. Given the input vector x, choose the code c = c(x) to minimize the squared error distortion ||x – x’(c)||².

Condition 2. Given the code c, compute the reconstruction vector x’(c) as the centroid of the input vectors x that satisfy condition 1.

The Generalized Lloyd algorithm is closely related to the SOM.
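The two Lloyd conditions translate directly into the familiar k-means loop (a sketch; the optional explicit initialization is an added convenience for reproducibility):

```python
import numpy as np

def generalized_lloyd(data, k, n_iter=20, init=None, seed=0):
    """Generalized Lloyd (k-means / LBG) iteration.
    Condition 1: encode each input by its nearest code vector.
    Condition 2: replace each code vector by the centroid of the
    inputs encoded to it."""
    rng = np.random.default_rng(seed)
    codebook = (np.array(init, dtype=float) if init is not None
                else data[rng.choice(len(data), k, replace=False)].astype(float))
    for _ in range(n_iter):
        # Condition 1: nearest-neighbor encoding c(x)
        codes = np.argmin(
            np.linalg.norm(data[:, None, :] - codebook[None, :, :], axis=2),
            axis=1)
        # Condition 2: centroid decoding x'(c)
        for c in range(k):
            if np.any(codes == c):
                codebook[c] = data[codes == c].mean(axis=0)
    return codebook
```

Alternating the two conditions never increases the expected distortion, which is why the iteration converges to a (local) optimum.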

Some Theory

Condition 1. Given the input vector x, choose the code c = c(x) to minimize the distortion measure D2.

Condition 2. Given the code c, compute the reconstruction vector x’(c) to satisfy this condition.

  • x’new(c) ← x’old(c) + η π(c, c(x))[x – x’old(c)], where η is the learning-rate parameter and π(c, c(x)) the neighborhood (noise) function
Some Theory
  • Property 2. Topological Ordering. The feature map Φ computed by the SOM algorithm is topologically ordered in the sense that the spatial location of a neuron in the lattice corresponds to a particular domain or feature of input patterns.
Some Theory
  • Property 3. Density Matching. The feature map Φ reflects variations in the statistics of the input distribution: regions in the input space H from which the sample vectors x are drawn with a high probability of occurrence are mapped onto larger domains of the output space A, and therefore with better resolution than regions in H from which sample vectors x are drawn with a low probability of occurrence.
  • Minimum-distortion encoding, according to which the curvature terms and all higher-order terms in the distortion measure due to the noise model (π) are retained: m(x) ∝ fx⅓(x)
  • Nearest-neighbor encoding, which emerges if the curvature terms are ignored, as in the standard form of the SOM algorithm: m(x) ∝ fx⅔(x)
Some Theory
  • Property 4. Feature Selection. Given data from an input space with a nonlinear distribution, the self-organizing map is able to select a set of best features for approximating the underlying distribution.


Learning Vector Quantizer
  • Vector quantization: an input space is divided into a number of distinct regions, and for each region a reconstruction vector is defined.
  • A vector quantizer with minimum encoding distortion is called a Voronoi or nearest-neighbor quantizer.
  • The collection of possible reproduction vectors is called the code book of the quantizer, and its members are called code vectors.
Learning Vector Quantizer

The SOM algorithm provides an approximate method for computing the Voronoi vectors in an unsupervised manner.

Learning vector quantization (LVQ) is a supervised learning technique that uses class information to move the Voronoi vectors slightly, so as to improve the quality of the classifier decision regions.

Learning Vector Quantizer
An input vector x is picked at random from the input space. If the class labels of the input vector x and a Voronoi vector w agree, the Voronoi vector w is moved in the direction of the input vector x. If the class labels of the input vector x and the Voronoi vector w disagree, the Voronoi vector w is moved away from the input vector x.

Let {wi}, i = 1, …, l, denote the set of Voronoi vectors, and {xi}, i = 1, …, N, denote the set of input vectors.

LVQ:

Suppose that the Voronoi vector wc is the closest to the input vector xi. Let Lwc denote the class associated with the Voronoi vector wc and Lxi denote the class label of the input vector xi. The Voronoi vector wc is adjusted as follows:

I. If Lwc = Lxi, then wc(n+1) = wc(n) + αn[xi – wc(n)], where 0 < αn < 1.

   If Lwc ≠ Lxi, then wc(n+1) = wc(n) – αn[xi – wc(n)], where 0 < αn < 1.

II. The other Voronoi vectors are not modified.
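One LVQ update step, following rules I and II above, can be sketched as (function and variable names are illustrative assumptions):

```python
import numpy as np

def lvq_step(W, labels, x, x_label, alpha=0.1):
    """One LVQ update: move the closest Voronoi vector toward x if
    the class labels agree, away from x if they disagree; all other
    Voronoi vectors are left unchanged."""
    c = int(np.argmin(np.linalg.norm(W - x, axis=1)))  # closest Voronoi vector
    sign = 1.0 if labels[c] == x_label else -1.0       # agree: +, disagree: -
    W = W.copy()
    W[c] += sign * alpha * (x - W[c])
    return W
```

In practice αn is decreased monotonically with the iteration number n; a constant α is used here only to keep the sketch short.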

Learning Vector Quantizer

Vector quantization is a form of lossy data compression.

Rate distortion theory (Gray 1984): Better data compression performance can always be achieved by coding vectors instead of scalars, even if the source of data is memoryless, or if the data compression system has memory.

A multistage hierarchical vector quantizer (Luttrell 1989)

Learning Vector Quantizer
  • First-order autoregressive (AR) model: x(n+1) = ρx(n) + ε(n), where ρ is the AR coefficient and the ε(n) are independent and identically distributed (iid) Gaussian random variables of zero mean and unit variance.
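Such an AR(1) source is easy to simulate (a sketch; the value of ρ, the zero initial state, and the seed are illustrative choices):

```python
import numpy as np

def ar1_series(n, rho=0.5, seed=0):
    """Generate a first-order autoregressive series
    x(n+1) = rho * x(n) + eps(n), with eps(n) iid N(0, 1)."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n)                        # start from x(0) = 0
    for t in range(n - 1):
        x[t + 1] = rho * x[t] + rng.standard_normal()
    return x
```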
Some Theory
  • Attribute code xa
  • Symbol code xs
  • x =[xs, 0]T+ [0,xa]T
  • Contextual map or semantic map
Summary
  • The SOM algorithm is neurobiologically inspired, incorporating all the mechanisms that are basic to self-organization: competition, cooperation, and self-amplification.
  • Kohonen’s SOM algorithm is simple to implement, yet mathematically difficult to analyze in a general setting.
  • The self-organizing map may be viewed as a vector quantizer.