Other Network Models. Until now, weight updates have been deterministic: the state consists of the current weight values and unit activations. Alternatively, a probability distribution can be used to decide whether or not a unit should change to its newly calculated state.
Figure: Finding a global minimum using simulated annealing, showing the points tried at medium and low temperatures.
A move that increases f is accepted with probability p = exp(−Δf / T), where Δf is the change in energy and T is the temperature.
5. If there have been a specified number M of changes in x for which the value of f has dropped, or there have been N changes in x since the last change in temperature, then set T = αT.
6. If the minimum value of f has not decreased by more than some specified constant in the last L iterations, stop; otherwise repeat from step 2.
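The cooling schedule above can be sketched in Python. This is a minimal illustration, not the text's exact algorithm: the 1-D perturbation step, the default parameter values, and the function names are my own assumptions.

```python
import math
import random

def simulated_annealing(f, x0, T=10.0, alpha=0.9, M=5, N=20, L=100,
                        eps=1e-6, step=0.5, seed=0):
    """Minimize f over a 1-D x following the schedule in the text:
    lower T to alpha*T after M downhill moves or N moves at the current
    temperature (step 5); stop when the best value has not improved by
    more than eps in the last L iterations (step 6)."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best, since_improve = fx, 0
    drops = moves = 0
    while since_improve < L:
        x_new = x + rng.uniform(-step, step)      # perturb the state
        d = f(x_new) - fx                         # change in "energy"
        # Metropolis rule: accept downhill moves always,
        # uphill moves with probability exp(-d / T)
        if d < 0 or rng.random() < math.exp(-d / T):
            x, fx = x_new, f(x_new)
            if d < 0:
                drops += 1
        moves += 1
        if drops >= M or moves >= N:              # step 5: cool down
            T *= alpha
            drops = moves = 0
        if fx < best - eps:                       # step 6: track progress
            best, since_improve = fx, 0
        else:
            since_improve += 1
    return x, best
```

For example, minimizing f(x) = (x − 2)² from x₀ = −5 should end near the global minimum at x = 2.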
Δw_ij = η (p⁺_ij − p⁻_ij), where p⁺_ij is the correlation between units i and j during the clamped phase and p⁻_ij is the correlation during the free-running phase.
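In practice the two correlations are estimated by averaging products of unit states over samples collected in each phase. A minimal NumPy sketch (the function name and learning-rate default are my own assumptions):

```python
import numpy as np

def boltzmann_weight_update(clamped, free, eta=0.1):
    """One Boltzmann-machine weight step from sampled unit states.

    clamped, free: arrays of shape (n_samples, n_units) holding binary
    unit states collected during the clamped and free-running phases.
    Returns eta * (p_plus - p_minus), the matrix of weight changes.
    """
    p_plus = clamped.T @ clamped / len(clamped)   # clamped-phase correlations
    p_minus = free.T @ free / len(free)           # free-running correlations
    return eta * (p_plus - p_minus)
```

Since both correlation matrices are symmetric, the resulting weight change is symmetric as well, as required for a Boltzmann machine.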
Gaussian function for two variables
The estimated PDF is the summation of the individual Gaussians centered at each sample point. Here σ = 0.1
The same estimate as in the previous figure, but with σ = 0.3. If the width is too large, there is a danger that the classes will blur together (a high chance of misclassification).
The same estimate as in the previous figure, but with σ = 0.05. If the width is too small, there is a danger of poor generalization: the fit hugs the training samples too closely.
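The estimate in these figures can be sketched as a Parzen-window sum of Gaussians (1-D case for simplicity; the function name is my own):

```python
import numpy as np

def parzen_pdf(x, samples, sigma):
    """Parzen-window density estimate: the average of Gaussians of
    width sigma centred at each training sample (1-D case)."""
    k = np.exp(-(x - samples[:, None]) ** 2 / (2 * sigma ** 2))
    k /= np.sqrt(2 * np.pi) * sigma       # normalise each Gaussian
    return k.mean(axis=0)                 # sum over samples / n_samples

samples = np.array([0.0, 0.2, 1.0])
# Narrower windows give sharper, more isolated peaks at the samples;
# wider windows smooth (and eventually blur) the estimate.
narrow = parzen_pdf(samples, samples, 0.05)
wide = parzen_pdf(samples, samples, 0.3)
```

Whatever σ is chosen, the estimate remains a valid PDF: it integrates to 1 over the whole axis.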
This square cancels with the square root in the normalization formula.
Figure. The unknown sample to be classified using a PDF.
x is an unknown input pattern.
If the input vectors are all of unit length, the activation function can take the form g(z_i) = exp[(z_i − 1) / σ²], where z_i = x · w_i is the net input to pattern unit i (for unit-length vectors, ‖x − w_i‖² = 2(1 − z_i), so the Gaussian reduces to this form).
Number of input units = number of features
Number of pattern units = number of training samples
Number of summation units = number of classes
The weights from the pattern units to the summation units are fixed at 1.
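The architecture above can be sketched end to end: one pattern unit per training sample, one summation unit per class, fixed unit weights from pattern to summation units, and the unit-length activation. The function name, σ default, and the two-class example are my own assumptions, not from the text.

```python
import numpy as np

def pnn_classify(x, patterns, labels, sigma=0.1):
    """Probabilistic neural network: one pattern unit per training
    sample, one summation unit per class; returns the class whose
    summation unit has the largest total activation."""
    x = np.asarray(x, float)
    patterns = np.asarray(patterns, float)
    labels = np.asarray(labels)
    x = x / np.linalg.norm(x)                       # normalise the input
    w = patterns / np.linalg.norm(patterns, axis=1, keepdims=True)
    z = w @ x                                       # pattern-unit net inputs
    act = np.exp((z - 1.0) / sigma ** 2)            # unit-length activation
    classes = np.unique(labels)
    # Summation units: weights from pattern units are fixed at 1,
    # so each class unit simply sums its pattern units' activations.
    sums = np.array([act[labels == c].sum() for c in classes])
    return classes[np.argmax(sums)]
```

With two training samples near (1, 0) labelled class 0 and one near (0, 1) labelled class 1, an input close to (1, 0) is assigned to class 0 and one close to (0, 1) to class 1.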
The vectors shown in the previous figure are normalized here.