
Neural Networks



  1. Neural Networks Slides from: Doug Gray, David Poole

  2. What is a Neural Network? • Information processing paradigm that is inspired by the way biological nervous systems, such as the brain, process information • A method of computing, based on the interaction of multiple connected processing elements

  3. What can a Neural Net do? • Compute a known function • Approximate an unknown function • Pattern Recognition • Signal Processing • Learn to do any of the above

  4. Basic Concepts • A Neural Network generally maps a set of inputs to a set of outputs • The number of inputs/outputs is variable • The network itself is composed of an arbitrary number of nodes with an arbitrary topology

  5. Basic Concepts • Definition of a node: an element which performs the function y = fH(∑(wi xi) + Wb), where the xi are the node’s inputs, the wi are the weights on its incoming connections, Wb is a bias weight, and fH is the activation function
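
  To make the definition concrete, here is a minimal sketch of such a node in Python (the names node_output and f_h and the example weights are illustrative, not from the slides):

      # A node: weighted sum of its inputs plus a bias weight Wb,
      # passed through the activation function fH.
      def node_output(weights, inputs, w_b, f_h):
          s = sum(w * x for w, x in zip(weights, inputs)) + w_b
          return f_h(s)

      step = lambda s: 1 if s > 0 else 0   # one possible fH: a hard threshold
      print(node_output([0.5, -0.2], [1.0, 1.0], 0.1, step))  # -> 1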

  6. Properties • Inputs are flexible: any real values, highly correlated or independent • Target function may be discrete-valued, real-valued, or a vector of discrete or real values • Outputs are real numbers between 0 and 1 • Resistant to errors in the training data • Long training time • Fast evaluation • The function produced can be difficult for humans to interpret

  7. Perceptrons • Basic unit in a neural network • Linear separator • Parts • N inputs, x1 ... xn • Weights for each input, w1 ... wn • A bias input x0 (constant) and associated weight w0 • Weighted sum of inputs, y = w0x0 + w1x1 + ... + wnxn • A threshold (activation) function, i.e., output 1 if y > 0, -1 if y ≤ 0
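
  A direct transcription of these parts into Python (variable names are ours; the bias is handled as the constant input x0 = 1):

      # Perceptron: weighted sum y = w0*x0 + w1*x1 + ... + wn*xn,
      # followed by a hard threshold that outputs 1 or -1.
      def perceptron(w, x):
          y = sum(wi * xi for wi, xi in zip(w, x))
          return 1 if y > 0 else -1

      # x[0] = 1 is the constant bias input with weight w[0].
      print(perceptron([-0.5, 1.0, 1.0], [1, 0, 1]))  # -> 1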

  8. Diagram [Figure: inputs x0, x1, ..., xn, each multiplied by its weight w0, w1, ..., wn, feed a summation unit computing y = Σ wi xi, followed by a threshold unit that outputs 1 if y > 0 and -1 otherwise]

  9. Typical Activation Functions • The logistic sigmoid f(x) = 1 / (1 + e^(-x)) • Using a nonlinear function which approximates a linear threshold allows a network to approximate nonlinear functions
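
  For example, this logistic function can be written directly (a sketch using only the standard library):

      import math

      # Logistic sigmoid: a smooth approximation to the hard threshold.
      def sigmoid(x):
          return 1.0 / (1.0 + math.exp(-x))

      print(sigmoid(0.0))  # 0.5 at the midpoint
      print(sigmoid(5.0))  # ~0.993, close to the threshold's output of 1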

  10. Simple Perceptron • Binary logic application • fH(x) = u(x) [linear threshold] • Wi = random(-1,1) • Y = u(W0X0 + W1X1 + Wb) • Now how do we train it?

  11. Basic Training • Perceptron learning rule: ΔWi = η * (D – Y) * Xi • η = learning rate • D = desired output • Adjust weights based on how well the current weights match the objective

  12. Logic Training • Expose the network to the logical OR operation • Update the weights after each epoch • As the output approaches the desired output for all cases, ΔWi will approach 0 (see the training sketch below)
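
  Putting slides 10-12 together, a minimal training sketch for OR (the threshold u, the random(-1, 1) initialization, and the per-epoch updates follow the slides; the learning-rate value and all names are our assumptions):

      import random

      # OR truth table; x[0] = 1 is a constant input carrying the bias weight Wb.
      data = [((1, 0, 0), 0), ((1, 0, 1), 1), ((1, 1, 0), 1), ((1, 1, 1), 1)]

      u = lambda s: 1 if s > 0 else 0                # fH(x) = u(x), linear threshold
      w = [random.uniform(-1, 1) for _ in range(3)]  # Wi = random(-1, 1)
      eta = 0.1                                      # learning rate (assumed value)

      for epoch in range(100):
          deltas = [0.0] * 3
          for x, d in data:
              y = u(sum(wi * xi for wi, xi in zip(w, x)))
              # Perceptron learning rule: delta_Wi = eta * (D - Y) * Xi
              deltas = [dw + eta * (d - y) * xi for dw, xi in zip(deltas, x)]
          # Update the weights after each epoch (slide 12).
          w = [wi + dw for wi, dw in zip(w, deltas)]

      print(w)  # once all cases are classified correctly, delta_Wi stays at 0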

  13. Results [Figure: the weights W0, W1, and Wb after training]

  14. Details • Network converges on a hyperplane decision surface • In the (X0, X1) plane the boundary is the line X1 = -(W0/W1)X0 - (Wb/W1)
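
  Setting the weighted sum to zero gives this boundary (a one-step derivation in LaTeX, using the slide's symbols):

      \begin{align*}
      W_0 X_0 + W_1 X_1 + W_b &= 0 \\
      \implies\quad X_1 &= -\frac{W_0}{W_1} X_0 - \frac{W_b}{W_1}
      \end{align*}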

  15. Feed-forward neural networks Feed-forward neural networks are the most common models. These are directed acyclic graphs: signals flow from the inputs through any hidden units to the outputs, with no cycles.

  16. Neural Network for the news example

  17. Axiomatizing the Network The values of the attributes are real numbers. Thirteen parameters w0, ..., w12 are real numbers. The attributes h1 and h2 correspond to the values of hidden units. There are 13 real numbers to be learned. The hypothesis space is thus a 13-dimensional real space. Each point in this 13-dimensional space corresponds to a particular logic program that predicts a value for reads given known, new, short, and home.

  18. Prediction Error

  19. Neural Network Learning Aim of neural network learning: given a set of examples, find parameter settings that minimize the error. Back-propagation learning is gradient descent search through the parameter space to minimize the sum-of-squares error.
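
  Written out (the symbols o_e, t_e, and η are ours; the definitions are standard): with network output o_e and target t_e on training example e, the error and the gradient-descent update are

      E(\mathbf{w}) = \sum_e \bigl(o_e(\mathbf{w}) - t_e\bigr)^2,
      \qquad
      w_i \leftarrow w_i - \eta\,\frac{\partial E}{\partial w_i}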

  20. Backpropagation Learning • Inputs: • A network, including all units and their connections • Stopping Criteria • Learning Rate (constant of proportionality of gradient descent search) • Initial values for the parameters • A set of classified training data • Output: Updated values for the parameters

  21. Backpropagation Learning Algorithm • Repeat • evaluate the network on each example given the current parameter settings • determine the derivative of the error for each parameter • change each parameter in proportion to its derivative • until the stopping criteria are met
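
  A compact sketch of this loop for one hidden layer of sigmoid units, trained on XOR (the network shape, learning rate, stopping criterion, and all names are illustrative assumptions, not taken from the slides):

      import math, random

      def sigmoid(x):
          return 1.0 / (1.0 + math.exp(-x))

      n_in, n_hid = 2, 2
      V = [[random.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hid)]
      W = [random.uniform(-1, 1) for _ in range(n_hid + 1)]   # index 0: bias
      eta = 0.5
      data = [((1, 0, 0), 0), ((1, 0, 1), 1), ((1, 1, 0), 1), ((1, 1, 1), 0)]

      for epoch in range(10000):          # stopping criterion: fixed epoch count
          for x, t in data:
              # Evaluate the network on the example (forward pass).
              h = [1] + [sigmoid(sum(v * xi for v, xi in zip(row, x))) for row in V]
              o = sigmoid(sum(w * hi for w, hi in zip(W, h)))
              # Derivative of the error for each parameter (backward pass).
              d_o = (o - t) * o * (1 - o)
              d_h = [d_o * W[j + 1] * h[j + 1] * (1 - h[j + 1]) for j in range(n_hid)]
              # Change each parameter in proportion to its derivative.
              W = [w - eta * d_o * hi for w, hi in zip(W, h)]
              for j in range(n_hid):
                  V[j] = [v - eta * d_h[j] * xi for v, xi in zip(V[j], x)]

      for x, t in data:
          h = [1] + [sigmoid(sum(v * xi for v, xi in zip(row, x))) for row in V]
          print(x[1:], t, round(sigmoid(sum(w * hi for w, hi in zip(W, h))), 2))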

  22. Gradient Descent for Neural Net Learning

  23. Bias in neural networks and decision trees • It’s easy for a neural network to represent “at least two of I1, …, Ik are true”: take a bias weight w0 = -15 and a weight of 10 on each input. This concept forms a large decision tree. • Consider representing a conditional: “If c then a else b”: • Simple in a decision tree. • Needs a complicated neural network to represent (c ∧ a) ∨ (¬c ∧ b).
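
  The “at least two” weighting can be checked numerically (a sketch; the bias -15 and weights of 10 are from the slide, here with k = 3):

      import math, itertools

      sigmoid = lambda x: 1.0 / (1.0 + math.exp(-x))

      # Bias w0 = -15 and weight 10 on each input: the unit's output is
      # near 1 exactly when at least two inputs are true.
      for inputs in itertools.product([0, 1], repeat=3):
          print(inputs, round(sigmoid(-15 + 10 * sum(inputs)), 3))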

  24. Neural Networks and Logic Meaning is attached to the input and output units. There is no a priori meaning associated with the hidden units. What the hidden units actually represent is something that’s learned.
