Introduction to Neural Networks

Chapter 5: Adaptive Resonance Theory

- 1987, Carpenter and Grossberg
- ART1: clusters binary vectors
- ART2: clusters continuous vectors

General

- Weights on a cluster unit can be considered to be a prototype pattern
- Relative similarity is used instead of an absolute difference. Thus, a difference of 1 in a vector with only a few non-zero components becomes more significant.
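To illustrate the point about relative similarity, here is a minimal sketch. It assumes the similarity measure is the fraction of the input's active components that the prototype shares, which is the ratio the vigilance test in the training algorithm uses. The function name `match_ratio` and the sample vectors are hypothetical.

```python
def match_ratio(s, prototype):
    """Fraction of the input's 1-components that the prototype also has."""
    overlap = sum(si * ti for si, ti in zip(s, prototype))
    return overlap / sum(s)  # assumes s has at least one non-zero component

# Both pairs differ in exactly one component (absolute difference of 1).
sparse = match_ratio([1, 1, 0, 0, 0, 0, 0, 0], [1, 0, 0, 0, 0, 0, 0, 0])
dense = match_ratio([1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 0])

print(sparse)  # 1 of 2 active components matched
print(dense)   # 7 of 8 active components matched
```

The same one-component mismatch costs the sparse vector half of its match ratio but costs the dense vector only one eighth.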

General

- Training examples may be presented several times.
- Training examples may be presented in any order.
- An example might change clusters.
- Nets are stable (patterns don’t oscillate).
- Nets are plastic (examples can be added).

Architecture

- Input layer (xi)
- Output layer or cluster layer – competitive (yj)
- Units in the output layer can be active, inactive, or inhibited.

Nomenclature

- bij: bottom up weight
- tij: top down weight
- s: input vector
- x: activation vector
- n: number of components in input vector
- m: maximum number of clusters
- || x ||: Σ xi (sum of the components of x)
- p: vigilance parameter

Training Algorithm

1. initialize parameters: L > 1, 0 < p <= 1; initialize weights: tji(0) = 1, 0 < bij(0) < L / (L – 1 + n)

2. while stopping criterion is false do

steps 3 – 12

3. for each training example do

steps 4 - 12

Training Algorithm

4. yj = 0 (for each j)

5. compute || s ||

6. xi = si

7. if yj (do for each j) is not inhibited then

yj = Σ bij xi (sum over i)

8. find largest yj that is not inhibited

9. xi = si * tji

Training Algorithm

10. compute || x ||

11. if || x || / || s || < p then yj = -1 (unit j is inhibited), go to step 8

12. bij = L xi / ( L – 1 + || x || )

tji = xi
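Steps 1 to 12 can be sketched in Python as follows. This is a minimal illustration, not a reference implementation. It assumes binary input patterns with at least one non-zero component, initializes bij(0) to 1/(1 + n), breaks ties in favor of the lowest index, stops when no weight changes or an epoch limit is reached, and treats an input as an outlier when every unit is inhibited. The names `art1_train`, `max_epochs`, and `assignments` are hypothetical. Indices are 0-based, so cluster unit 1 appears as index 0.

```python
def art1_train(patterns, m, p, L=2.0, max_epochs=100):
    """Fast-learning ART1 sketch; returns (b, t, assignments)."""
    n = len(patterns[0])
    b0 = 1.0 / (1.0 + n)                  # step 1: satisfies 0 < b0 < L / (L - 1 + n)
    b = [[b0] * m for _ in range(n)]      # b[i][j]: bottom-up weight, input i to cluster j
    t = [[1] * n for _ in range(m)]       # t[j][i]: top-down weight, cluster j to input i
    assignments = [None] * len(patterns)

    for _ in range(max_epochs):           # step 2: stop at the epoch limit ...
        changed = False
        for k, s in enumerate(patterns):  # step 3
            norm_s = sum(s)               # step 5 (assumed non-zero)
            inhibited = set()             # step 4: no unit is inhibited yet
            winner = None
            while len(inhibited) < m:
                # step 7: net input of every unit that is not inhibited
                y = {j: sum(b[i][j] * s[i] for i in range(n))
                     for j in range(m) if j not in inhibited}
                # step 8: largest y, lowest index on ties
                J = max(y, key=lambda j: (y[j], -j))
                # step 9: x_i = s_i * t_Ji
                x = [s[i] * t[J][i] for i in range(n)]
                norm_x = sum(x)           # step 10
                if norm_x / norm_s < p:   # step 11: vigilance test failed
                    inhibited.add(J)
                else:
                    winner = J
                    break
            assignments[k] = winner
            if winner is None:
                continue                  # every unit inhibited: treat input as an outlier
            # step 12: fast-learning weight update for the winning unit
            new_b = [L * x[i] / (L - 1 + norm_x) for i in range(n)]
            if x != t[winner] or any(b[i][winner] != new_b[i] for i in range(n)):
                changed = True
            for i in range(n):
                b[i][winner] = new_b[i]
            t[winner] = x
        if not changed:                   # ... or when an epoch changes no weight
            break
    return b, t, assignments


b, t, clusters = art1_train([(1, 1, 0, 0)], m=3, p=0.4)
print(clusters)  # index of the cluster unit chosen for each pattern
```

Inhibition is tracked with a set rather than by writing -1 into yj; the effect on step 8 is the same.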

Possible Stopping Criterion

- No weight changes.
- Maximum number of epochs reached.

What Happens If All Units Are Inhibited?

- Lower p.
- Add a cluster unit.
- Throw out the current input as an outlier.
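The second option, adding a cluster unit, might look like the sketch below. It assumes the weights are stored as nested lists `b[i][j]` and `t[j][i]` and that a new unit receives the same initial values used at the start of training (bij = 1/(1 + n), tji = 1). The function name `add_cluster_unit` is hypothetical.

```python
def add_cluster_unit(b, t):
    """Append one uncommitted cluster unit in place; return its index."""
    n = len(b)                 # number of input components
    b0 = 1.0 / (1.0 + n)       # assumed initial bottom-up weight
    for row in b:
        row.append(b0)         # new column in the bottom-up weights
    t.append([1] * n)          # new row of ones in the top-down weights
    return len(t) - 1


# Two inputs, one existing cluster whose prototype is (1, 0).
b = [[1.0], [0.0]]
t = [[1, 0]]
new_index = add_cluster_unit(b, t)
print(new_index, t[new_index])
```

Because the new unit's top-down weights are all 1, step 9 gives x = s, so the ratio || x || / || s || is 1 and the unit passes the vigilance test for any p <= 1.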

Example

- n = 4
- m = 3
- p = 0.4 (low vigilance)
- L = 2
- bij(0) = 1/(1 + n) = 0.2
- tji(0) = 1

[Figure: input units x1 to x4 connected to cluster units y1 to y3]

Example

3. input vector (1 1 0 0)

4. yj = 0

5. || s || = 2

6. x = (1 1 0 0)

7. y1 = .2(1) + .2(1) + .2(0) + .2(0) = 0.4

y2 = y3 = 0.4

Example

8. j = 1 (use lowest index to break ties)

9. x1 = s1 * t11 = 1 * 1 = 1

x2 = s2 * t12 = 1 * 1 = 1

x3 = s3 * t13 = 0 * 1 = 0

x4 = s4 * t14 = 0 * 1 = 0

10. || x || = 2

11. || x || / || s || = 1 >= 0.4, so unit 1 passes the vigilance test

Example

12. b11 = 2 * x1 / (2 - 1 + || x ||)

= 2 * 1 / (1 + 2) = .667

b21 = .667

b31 = b41 = 0

t11 = x1 = 1

t12 = 1

t13 = t14 = 0
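The arithmetic in steps 5 to 12 of this example can be checked with a short script. This is only a sketch of the single presentation of (1 1 0 0). It uses 0-based indices, so `J == 0` corresponds to cluster unit 1, and the list layout `b[i][j]`, `t[j][i]` is an assumption made for illustration.

```python
s = [1, 1, 0, 0]
n, m, p, L = 4, 3, 0.4, 2

b = [[0.2] * m for _ in range(n)]   # bij(0) = 1/(1 + n) = 0.2
t = [[1] * n for _ in range(m)]     # tji(0) = 1

norm_s = sum(s)                                                  # step 5
y = [sum(b[i][j] * s[i] for i in range(n)) for j in range(m)]    # step 7
J = y.index(max(y))                 # step 8: index() returns the lowest tied index
x = [s[i] * t[J][i] for i in range(n)]                           # step 9
norm_x = sum(x)                                                  # step 10
ratio = norm_x / norm_s                                          # step 11

if ratio >= p:                                                   # step 12
    for i in range(n):
        b[i][J] = L * x[i] / (L - 1 + norm_x)
    t[J] = x

print(y, J, ratio)
print([round(b[i][J], 3) for i in range(n)], t[J])
```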

Exercise

- Show the network after the training example (0 0 0 1) is processed.

Observations

- Typically, stable weight matrices are obtained quickly.
- The cluster units are all topologically independent of one another.
- We have just looked at the fast learning version of ART1. There is also a slow learning version that performs just one weight update iteration per training example, so the weights do not reach equilibrium on each presentation.

