Introduction to Artificial Intelligence (G51IAI)

Introduction to Artificial Intelligence (G51IAI). Dr Matthew Hyde Neural Networks. More precisely: “ Artificial Neural Networks” Simulating, on a computer, what we understand about neural networks in the brain. Lecture Outline. Recap on perceptrons Linear Separability

Presentation Transcript


  1. Introduction to Artificial Intelligence (G51IAI) Dr Matthew Hyde Neural Networks • More precisely: • “Artificial Neural Networks” • Simulating, on a computer, what we understand about neural networks in the brain

  2. Lecture Outline • Recap on perceptrons • Linear Separability • Learning / Training • The Neuron’s Activation Function

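  3. Recap from last lecture • A ‘Perceptron’ • Single layer NN (one neuron) • Inputs can be any number • Weights on the edges • Output can only be 0 or 1 • [Figure: a perceptron Z with threshold θ = 6 and an output of 0 or 1; the edge labels in the diagram are 5, 0.5, 6, 2, 3 and -3]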

  4. Truth Tables and Linear Separability

  5. AND function, and OR function • [Tables: two truth tables, labelled AND and XOR] • These are called “truth tables”

  6. AND function, and OR function • [Tables: two truth tables, labelled AND and XOR] • These are called “truth tables”
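The tables themselves did not survive in this transcript. As a minimal sketch, the truth tables for the three functions named on these slides can be generated as follows; the helper name `truth_table` is made up for this illustration and is not from the lecture.

```python
from itertools import product

def truth_table(fn):
    """Return [((x, y), output), ...] for a two-input Boolean function."""
    return [((x, y), fn(x, y)) for x, y in product((0, 1), repeat=2)]

# Bitwise operators on 0/1 integers give the usual Boolean functions.
AND = truth_table(lambda x, y: x & y)
OR = truth_table(lambda x, y: x | y)
XOR = truth_table(lambda x, y: x ^ y)

for name, table in (("AND", AND), ("OR", OR), ("XOR", XOR)):
    print(name)
    for (x, y), out in table:
        print(f"  {x} {y} -> {out}")
```

Each row pairs one combination of inputs with the desired output, which is the same shape as the training set used later in the lecture.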

  7. Important!!! • You can represent any truth table graphically, as a diagram • The diagram is 2-dimensional if there are two inputs • 3-dimensional if there are three inputs • Examples on the board in the lecture, and in the handouts

  8. 3 Inputs means 3-dimensions • [Figure: a cube on X, Y and Z axes whose corners are the eight input combinations 0,0,0; 1,0,0; 0,1,0; 1,1,0; 0,0,1; 1,0,1; 0,1,1; 1,1,1]

  9. Linear Separability in 3-dimensions Instead of a line, the dots are separated by a plane
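To make the idea of a separating plane concrete, here is a small sketch using a three-input AND. The weights (1, 1, 1) and threshold 2.5 are hypothetical values chosen for this illustration, not taken from the slides, and the unit is assumed to fire when the weighted sum is greater than or equal to the threshold.

```python
from itertools import product

def fires(inputs, weights, theta):
    """Output 1 if the weighted sum reaches the threshold, else 0."""
    total = sum(i * w for i, w in zip(inputs, weights))
    return 1 if total >= theta else 0

# Hypothetical weights and threshold: the plane x + y + z = 2.5
# cuts the corner (1, 1, 1) off from the other seven corners of the cube.
weights, theta = (1, 1, 1), 2.5

for corner in product((0, 1), repeat=3):
    print(corner, "->", fires(corner, weights, theta))
```

Only the corner 1,1,1 lies on the firing side of the plane, so this single unit represents a three-input AND.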

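  10. Minsky & Papert • [Figure: the AND and XOR functions plotted on the points 0,0; 0,1; 1,0; 1,1; a single straight line separates the outputs of AND, but no single straight line separates the outputs of XOR] • Functions which can be separated in this way are called Linearly Separable • Only linearly separable functions can be represented by a Perceptron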
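One way to get a feel for this claim is a brute-force search over candidate perceptrons. The sketch below tries every combination of two weights and a threshold on a coarse grid (the grid range and step are arbitrary choices for this example) and assumes the unit fires when the weighted sum is greater than or equal to the threshold. Finding nothing on a grid is not a proof, but for XOR no setting exists at all, so the search is certain to come back empty.

```python
from itertools import product

PATTERNS = [(0, 0), (0, 1), (1, 0), (1, 1)]
AND = [0, 0, 0, 1]
XOR = [0, 1, 1, 0]

def output(x, y, w1, w2, theta):
    """Single threshold unit with two inputs."""
    return 1 if x * w1 + y * w2 >= theta else 0

def find_perceptron(targets, grid):
    """Return the first (w1, w2, theta) on the grid that matches all targets, or None."""
    for w1, w2, theta in product(grid, repeat=3):
        if all(output(x, y, w1, w2, theta) == t
               for (x, y), t in zip(PATTERNS, targets)):
            return (w1, w2, theta)
    return None

# Candidate values from -2.0 to 2.0 in steps of 0.5 (an arbitrary grid).
GRID = [k / 2 for k in range(-4, 5)]

print("AND:", find_perceptron(AND, GRID))
print("XOR:", find_perceptron(XOR, GRID))
```

The reason XOR fails: the pattern 0,0 needs a positive threshold, the patterns 0,1 and 1,0 need each weight to reach that threshold, and then the pattern 1,1 is forced to fire as well.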

  11. Examples – Handout 3 • Linear Separability • Fill in the diagrams with the correct dots • black or white, for an output of 1 or 0

  12. How to Train your Perceptron

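  13. Simple Networks • [Figure: AND network with inputs X and Y, a weight of 1 on each, and threshold θ=1.5] • [Figure: AND network with threshold θ=0, inputs X and Y with a weight of 1 each, and a constant input of -1 with weight 1.5] • Both of these represent the AND function. • It is sometimes convenient to set the threshold to zero, and add a constant negative input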
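The equivalence of the two networks can be checked directly. This is a minimal sketch; the function names are invented for the example, the weights are read from the slide's diagram as far as it can be recovered, and the unit is assumed to fire when the weighted sum is greater than or equal to the threshold.

```python
def threshold_unit(inputs, weights, theta):
    """Output 1 if the weighted sum of the inputs reaches theta, else 0."""
    total = sum(i * w for i, w in zip(inputs, weights))
    return 1 if total >= theta else 0

def and_with_threshold(x, y):
    # Weight 1 on X and on Y, threshold 1.5.
    return threshold_unit((x, y), (1, 1), 1.5)

def and_with_constant_input(x, y):
    # Threshold 0, plus a constant input of -1 carrying weight 1.5.
    return threshold_unit((-1, x, y), (1.5, 1, 1), 0)

for x in (0, 1):
    for y in (0, 1):
        print(x, y, and_with_threshold(x, y), and_with_constant_input(x, y))
```

Moving the threshold into a weight means the learning rule can adjust it in exactly the same way as every other weight.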

  14. Training a NN • [Figure: the AND function plotted on the points 0,0; 0,1; 1,0; 1,1]

  15. Randomly Initialise the Network • We set the weights randomly, because we do not know what we want it to learn. • The weights can change to whatever value is necessary • It is normal to initialise them in the range [-1,1]
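A minimal sketch of this initialisation, assuming a uniform distribution over [-1, 1] (the slide gives the range but not the distribution). The function name and the `seed` parameter are illustrative additions.

```python
import random

def init_weights(n_inputs, low=-1.0, high=1.0, seed=None):
    """Draw one weight per input uniformly from [low, high]."""
    rng = random.Random(seed)  # a seed makes the run repeatable
    return [rng.uniform(low, high) for _ in range(n_inputs)]

# Three weights: one for the constant input, one each for X and Y.
weights = init_weights(3, seed=0)
print(weights)
```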

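  16. Randomly Initialise the Network • [Figure: network with threshold θ=0, a constant input of -1 with weight 0.3, input X with weight 0.5, and input Y with weight -0.4]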

  17. Learning
      While epoch produces an error
          Present network with next inputs (pattern) from epoch
          Err = T - O
          If Err <> 0 then
              Wj = Wj + LR * Ij * Err
          End If
      End While
      • Get used to this notation!! • Make sure that you can reproduce this pseudocode AND understand what all of the terms mean
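The pseudocode can be transcribed into Python roughly as follows, reading T as the target (desired) output, O as the actual output, LR as the learning rate, Ij as the j-th input and Wj as the weight on that input. Several details are assumptions of this sketch rather than part of the lecture: the unit fires when the weighted sum is greater than or equal to the threshold of 0, the patterns are presented in the order listed in `AND_EPOCH`, and `max_epochs` is a safety cap added so the loop cannot run forever on a function that is not linearly separable.

```python
def predict(weights, inputs, theta=0.0):
    """Output 1 if the weighted sum reaches theta, else 0."""
    total = sum(w * i for w, i in zip(weights, inputs))
    return 1 if total >= theta else 0

def train(weights, epoch, lr=0.1, max_epochs=1000):
    """Repeat the epoch until it produces no error; return (weights, epochs_used)."""
    weights = list(weights)
    for epochs_used in range(1, max_epochs + 1):
        error_seen = False
        for inputs, target in epoch:
            err = target - predict(weights, inputs)  # Err = T - O
            if err != 0:
                error_seen = True
                # Wj = Wj + LR * Ij * Err, applied to every weight
                weights = [w + lr * i * err for w, i in zip(weights, inputs)]
        if not error_seen:
            return weights, epochs_used
    raise RuntimeError("no error-free epoch within max_epochs")

# AND training set; each input tuple is (constant -1 input, X, Y).
AND_EPOCH = [((-1, 1, 1), 1), ((-1, 1, 0), 0), ((-1, 0, 1), 0), ((-1, 0, 0), 0)]

# Starting weights taken from the randomly initialised network on slide 16.
trained, epochs_used = train([0.3, 0.5, -0.4], AND_EPOCH)
print(trained, epochs_used)
```

The outer loop is the "While epoch produces an error" test: training stops only after a complete pass over the training set in which no weight had to change.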

  18. Epoch • The ‘epoch’ is the entire training set • The training set is the set of four input and output pairs • [Table: the four pairs, with columns INPUT and DESIRED OUTPUT]

  19. The learning algorithm • [Table: the training set, with columns INPUT and DESIRED OUTPUT] • Input the first inputs from the training set into the Neural Network • What does the neural network output? • Is it what we want it to output? • If not then we work out the error and change some weights

  20. First training step • Input 1, 1 • Desired output is 1 • Actual output is 0 • -0.3 + 0.5 + -0.4 = -0.2, which is below θ=0, giving an output of 0 • [Figure: the network with inputs -1, 1, 1 on weights 0.3, 0.5, -0.4 and threshold θ=0]
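The arithmetic of this step can be reproduced in a few lines. The sketch assumes the three inputs are the constant -1 followed by X = 1 and Y = 1, and that the unit fires when the weighted sum is greater than or equal to θ.

```python
inputs = (-1, 1, 1)         # constant input, X, Y
weights = (0.3, 0.5, -0.4)  # weights of the randomly initialised network
theta = 0

# (-1 * 0.3) + (1 * 0.5) + (1 * -0.4)
weighted_sum = sum(i * w for i, w in zip(inputs, weights))
output = 1 if weighted_sum >= theta else 0

# Rounded for display, since floating-point sums are not exact.
print(round(weighted_sum, 10), output)
```

The sum comes out at about -0.2, which is below the threshold, so the unit outputs 0 even though the desired output for the input 1, 1 is 1.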

  21. First training step • We wanted 1 • We got 0 • Error = 1 - 0 = 1
      While epoch produces an error
          Present network with next inputs (pattern) from epoch
          Err = T - O
          If Err <> 0 then
              Wj = Wj + LR * Ij * Err
          End If
      End While
      • If there IS an error, then we change ALL the weights in the network

  22. If there is an error, change ALL the weights • Wj = Wj + ( LR * Ij * Err ) • New Weight = Old Weight + (Learning Rate * Input Value * Error) • New Weight = 0.3 + (0.1 * -1 * 1) = 0.2 • [Figure: the weight on the constant -1 input changes from 0.3 to 0.2; θ=0]

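  23. If there is an error, change ALL the weights • Wj = Wj + ( LR * Ij * Err ) • New Weight = 0.5 + (0.1 * 1 * 1) = 0.6 • [Figure: the weight on the first input of 1 changes from 0.5 to 0.6; the weight on the constant -1 input is now 0.2; the weight -0.4 on the second input of 1 is still to be updated; θ=0]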
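Applying the same rule to all three weights at once gives the complete result of the first training step. This sketch uses the learning rate of 0.1 and the error of 1 from the slides; the variable names are illustrative.

```python
import math

LR = 0.1
inputs = (-1, 1, 1)             # constant input, X, Y
old_weights = (0.3, 0.5, -0.4)
err = 1 - 0                     # Err = T - O: wanted 1, got 0

# Wj = Wj + LR * Ij * Err for every weight j
new_weights = [w + LR * i * err for w, i in zip(old_weights, inputs)]

# Rounded for display, since floating-point arithmetic is not exact.
print([round(w, 10) for w in new_weights])

# The third weight follows the same rule: -0.4 + (0.1 * 1 * 1) = -0.3
assert math.isclose(new_weights[2], -0.3)
```

The weight on the constant input falls to 0.2 because its input is negative, while the weights on the two inputs of 1 both rise by 0.1.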

  24. Effects of the first change • The output was too low (it was 0, but we wanted 1) • Weights that contributed negatively have reduced • Weights that contributed positively have increased • It is trying to ‘correct’ the output gradually • [Figure: the network before the change (weights 0.3, 0.5, -0.4) and after it (weights 0.2, 0.6, -0.3), each with a constant input of -1, inputs X and Y, and θ=0]

  25. Epoch not finished yet • The ‘epoch’ is the entire training set • We do the same for the other 3 input-output pairs • [Table: the training set, with columns INPUT and DESIRED OUTPUT]

  26. The epoch is now finished • Was there an error for any of the inputs? • If yes, then the network is not trained yet • We do the same for another epoch, from the first inputs again

  27. The epoch is now finished • If there were no errors, then we have the network that we want • It has been trained
      While epoch produces an error
          Present network with next inputs (pattern) from epoch
          Err = T - O
          If Err <> 0 then
              Wj = Wj + LR * Ij * Err
          End If
      End While

  28. Effect of the learning rate • Set too high • The network quickly gets near to what you want • But, right at the end, it may ‘bounce around’ the correct weights • It may go too far one way, and then when it tries to compensate it will go too far back the other way • Wj = Wj + ( LR * Ij * Err )

  29. Effect of the learning rate • Set too high • It may ‘bounce around’ the correct weights • [Figure: the AND function plotted on the points 0,0; 0,1; 1,0; 1,1]

  30. Effect of the learning rate • Set too low • The network slowly gets near to what you want • It will eventually converge (for a linearly separable function) • but that could take a long time • When setting the learning rate, you have to strike a balance between speed and effectiveness • Wj = Wj + LR * Ij * Err
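A rough way to explore this trade-off is to count how many epochs the AND example needs at different learning rates. The sketch below makes several assumptions: the starting weights are those of the randomly initialised network from earlier slides, the unit fires when the weighted sum is greater than or equal to 0, the patterns are presented in the order listed, and the three learning rates and the `max_epochs` cap are arbitrary choices for illustration.

```python
def predict(weights, inputs):
    """Threshold unit with theta = 0."""
    return 1 if sum(w * i for w, i in zip(weights, inputs)) >= 0 else 0

def epochs_to_train(lr, weights, epoch, max_epochs=10_000):
    """Return how many epochs run until one passes with no error, or None."""
    weights = list(weights)
    for n in range(1, max_epochs + 1):
        clean = True
        for inputs, target in epoch:
            err = target - predict(weights, inputs)
            if err != 0:
                clean = False
                weights = [w + lr * i * err for w, i in zip(weights, inputs)]
        if clean:
            return n
    return None

# AND training set; each input tuple is (constant -1 input, X, Y).
AND_EPOCH = [((-1, 1, 1), 1), ((-1, 1, 0), 0), ((-1, 0, 1), 0), ((-1, 0, 0), 0)]

results = {lr: epochs_to_train(lr, [0.3, 0.5, -0.4], AND_EPOCH)
           for lr in (0.01, 0.1, 1.0)}
for lr, n in results.items():
    print(f"LR = {lr}: {n} epochs")
```

Because AND is linearly separable, every run should reach an error-free epoch. The smallest rate can be expected to need many more epochs, since each correction moves a weight by only 0.01.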

  31. The Neuron’s Activation Function

  32. Expanding the Model of the Neuron: Outputs other than ‘1’ • Output is 1 or 0 • It doesn’t matter about how far over the threshold we are • [Figure: a multi-layer network of threshold units with inputs X1, X2, X3, units Y1, Y2 and output Z; the thresholds shown in the diagram are θ = 5, θ = 2, θ = 9 and θ = 0]

  33. Example from last lecture • [Figure: a network whose outputs are the left wheel speed and the right wheel speed] • The speed of the wheels is not just 0 or 1

  34. Expanding the Model of the Neuron: Outputs other than ‘1’ • So far, the neurons have only output a value of 1 when they fire. • If the input sum is greater than the threshold the neuron outputs 1. • In fact, the neurons can output any value that you want.

  35. Modelling a Neuron • a_j : Input value (output from unit j) • w_{j,i} : Weight on the link from unit j to unit i • in_i : Weighted sum of inputs to unit i • a_i : Activation value of unit i • g : Activation function
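Put together, these symbols describe a unit in two steps. The slide lists only the definitions; the equations below are the usual way of combining them and are written here as an assumption about what the accompanying diagram showed.

```latex
in_i = \sum_j w_{j,i}\, a_j, \qquad a_i = g(in_i)
```

For the perceptron used so far, g is the threshold function, so a_i is 1 when in_i reaches the threshold and 0 otherwise.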

  36. Activation Functions • Step_t(x) = 1 if x >= t, else 0 • Sign(x) = +1 if x >= 0, else -1 • Sigmoid(x) = 1/(1 + e^{-x}) • a_j : Input value (output from unit j) • in_i : Weighted sum of inputs to unit i • a_i : Activation value of unit i • g : Activation function
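The three functions translate directly into code. This is a minimal sketch following the definitions on the slide; the Python names and the default threshold of 0 for `step` are choices made for this example.

```python
import math

def step(x, t=0.0):
    """Step_t(x): 1 if x >= t, else 0."""
    return 1 if x >= t else 0

def sign(x):
    """Sign(x): +1 if x >= 0, else -1."""
    return 1 if x >= 0 else -1

def sigmoid(x):
    """Sigmoid(x): 1 / (1 + e^(-x)), a smooth value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))

for x in (-2.0, 0.0, 2.0):
    print(x, step(x), sign(x), round(sigmoid(x), 4))
```

Unlike the step and sign functions, the sigmoid changes gradually with its input, so a unit can output intermediate values such as a wheel speed rather than only 0 or 1.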

  37. Summary • Linear Separability • Learning Algorithm Pseudocode • Activation function (threshold, sigmoid, etc)
