Debate 2: Boltzmann Machine and Simulated Annealing. Presented by: Yevgeniy Gershteyn, Larisa Perman. 04/24/2003. Boltzmann Machine. The Boltzmann Machine neural network was introduced by Hinton and Sejnowski in 1983. It is used for solving constrained optimization problems. Typical Boltzmann Machine:
wij – weight of the connection between units Xi and Xj
xi, xj – the states of the Xi and Xj units
If units are connected: wij ≠ 0
The bidirectional nature of connections: wij = wji
The consensus function: C = ∑i ∑j≤i wij xi xj
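The consensus function can be sketched numerically. The weight matrix and states below are hypothetical; with zero self-connections and symmetric weights, the pairwise sum reduces to ½·xᵀWx:

```python
import numpy as np

# Hypothetical symmetric weight matrix (wij = wji) with zero self-connections.
w = np.array([[0.0,  1.0, -2.0,  0.0],
              [1.0,  0.0,  3.0, -1.0],
              [-2.0, 3.0,  0.0,  2.0],
              [0.0, -1.0,  2.0,  0.0]])

def consensus(w, x):
    """C = sum over pairs j <= i of wij*xi*xj; with a zero diagonal
    and symmetric weights this equals 0.5 * x^T W x."""
    return 0.5 * x @ w @ x

x = np.array([1, 1, 1, 0])  # binary unit states
print(consensus(w, x))      # each pair of connected 'on' units contributes its weight
```

Only pairs in which both units are 'on' contribute, so maximizing C means switching on the units joined by positive weights while keeping negatively connected pairs apart.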
∆C(i) – the change in consensus if unit Xi were to change its state
xi – the current state of unit Xi
[1 – 2xi] = +1 if Xi is 'off'; -1 if Xi is 'on'
∆C(i) = [1 – 2xi][wii + ∑j≠i wij xj]
T (temperature) – control parameter that is reduced as the net searches for a maximal consensus
A(i,T) = 1 / ( 1 + exp( - ∆C(i) / T ) )
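The acceptance function A(i,T) drives a simulated-annealing search for maximal consensus. A minimal sketch follows; the weight matrix, cooling constants, temperature floor, and random seed are all assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical symmetric weights, zero self-connections.
w = np.array([[0.0,  1.0, -2.0,  0.0],
              [1.0,  0.0,  3.0, -1.0],
              [-2.0, 3.0,  0.0,  2.0],
              [0.0, -1.0,  2.0,  0.0]])

def delta_C(w, x, i):
    """Change in consensus if unit i flips: [1 - 2*xi] * (wii + sum_{j!=i} wij*xj)."""
    field = w[i, i] + sum(w[i, j] * x[j] for j in range(len(x)) if j != i)
    return (1 - 2 * x[i]) * field

def accept_prob(dC, T):
    """A(i,T) = 1 / (1 + exp(-dC / T))."""
    return 1.0 / (1.0 + np.exp(-dC / T))

def anneal(w, x, T0=10.0, alpha=0.9, sweeps=200):
    """Geometric cooling T <- alpha*T with a small floor (assumed schedule)."""
    T = T0
    for _ in range(sweeps):
        for i in range(len(x)):
            if rng.random() < accept_prob(delta_C(w, x, i), T):
                x[i] = 1 - x[i]       # accept the state change
        T = max(alpha * T, 0.1)       # floor avoids numerical overflow
    return x

x = anneal(w, rng.integers(0, 2, size=4))
print(x, "consensus:", 0.5 * x @ w @ x)
```

At high T the net accepts consensus-lowering flips often enough to escape local maxima; as T falls the dynamics become nearly deterministic and the state freezes near a maximum of C.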
This allows a much richer representation of the input data.
The probability that neuron sj flips its state (sj → -sj):
P(sj → -sj) = 1 / (1 + exp(-∆E / T))
Where the energy of the net is:
E = -1/2 ∑j ∑i wji sj si
P(α) = ( exp( -Eα / T ) ) / ( ∑β exp(-Eβ /T) )
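At equilibrium the state probabilities follow this Boltzmann distribution, and for a small net it can be computed exactly by enumerating all states. The 3-neuron weight matrix and temperature below are hypothetical:

```python
import numpy as np
from itertools import product

# Hypothetical 3-neuron net, states si in {-1, +1}, symmetric weights.
w = np.array([[0.0,  1.0, -1.0],
              [1.0,  0.0,  2.0],
              [-1.0, 2.0,  0.0]])
T = 1.0

def energy(s):
    """E = -1/2 * sum_j sum_i wji * sj * si."""
    return -0.5 * s @ w @ s

states = [np.array(s) for s in product([-1, 1], repeat=3)]
E = np.array([energy(s) for s in states])
P = np.exp(-E / T)
P /= P.sum()          # P(alpha) = exp(-E_alpha/T) / sum_beta exp(-E_beta/T)

for s, e, p in zip(states, E, P):
    print(s, f"E={e:+.1f}", f"P={p:.3f}")
```

Lower-energy states receive exponentially more probability mass; as T → 0 the distribution concentrates on the minimum-energy states.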
∆E = -∆sj ∑i wji si = -2 · |vj|
Where: |vj| – the absolute value of the jth neuron's activation
ρ+ij = ‹sj si›+
Where: '+' indicates that the correlation is calculated when the visible neurons are in a clamped state
ρ-ij = ‹sj si›-
Where: '-' indicates that the correlation is calculated when the visible neurons are not clamped (free-running)
∆wij = η(ρ+ij – ρ-ij), ∀i,j
Where: η – the learning rate.
This means: whether a weight changes depends on the difference between the correlations in clamped vs. free mode.
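The learning rule can be sketched for a small, fully visible net: ρ+ then comes directly from the training patterns (clamped phase), while ρ- is estimated by Gibbs-sampling the free-running net. The patterns, learning rate, temperature, and sampling lengths below are all assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
w = np.zeros((n, n))
eta, T = 0.1, 1.0

# Hypothetical training patterns for a fully visible net (si in {-1, +1}).
data = np.array([[ 1,  1, -1, -1],
                 [ 1,  1, -1, -1],
                 [-1, -1,  1,  1]])

def gibbs_sample(w, steps=500):
    """Free-running phase: Gibbs-sample the net at temperature T."""
    s = rng.choice([-1, 1], size=n)
    samples = []
    for _ in range(steps):
        for j in range(n):
            v = w[j] @ s - w[j, j] * s[j]              # activation of neuron j
            p_on = 1.0 / (1.0 + np.exp(-2 * v / T))    # P(sj = +1)
            s[j] = 1 if rng.random() < p_on else -1
        samples.append(s.copy())
    return np.array(samples[steps // 2:])              # discard burn-in

for epoch in range(50):
    rho_plus = (data.T @ data) / len(data)       # clamped correlations <sj si>+
    free = gibbs_sample(w)
    rho_minus = (free.T @ free) / len(free)      # free correlations <sj si>-
    dw = eta * (rho_plus - rho_minus)            # dw_ij = eta*(rho+_ij - rho-_ij)
    np.fill_diagonal(dw, 0.0)                    # keep zero self-connections
    w += dw
```

Units that are correlated in the data but not in the free-running net get their weight increased, and vice versa; learning stops when the free correlations match the clamped ones.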
∆wij,α = ηPα(ρ+ij – ρ-ij), ∀i,j
Where: Pα – the a priori probability of state α at the input neurons.