
Neural Networks



Presentation Transcript


  1. Neural Networks
  • Neural Networks (NNs) are biologically-inspired techniques for learning.
  • NNs are well suited to problems:
    • represented as attribute-value pairs,
    • where an acceptable hypothesis can be represented as a real- or discrete-valued vector,
    • where training examples may contain noise,
    • where long training times are acceptable,
    • where fast evaluation of the learned hypothesis is required,
    • where an understanding of how inputs are mapped to outputs is not a requirement.
  • NNs can be used to approximate:
    • any continuous function (one hidden layer),
    • any discontinuous function (two hidden layers).

  2. Neural Networks
  • NNs attempt to model the way the brain is structured:
    • roughly 10 billion neurons that communicate via some 60 trillion connections (synapses),
    • parallel rather than sequential processing.
  • NNs are composed of the following elements:
    • neurons (soma),
    • inputs (dendrites),
    • outputs of neurons (axons),
    • weights (synapses).

  3. Neural Networks

  4. Neural Networks
  • All the weights of a NN comprise its weight set, W = {w, v}.
  • The number of neurons per layer (hidden units), the number of hidden layers, and the connections specified for each layer comprise the network architecture.

  5. Neural Networks
  • In the preceding figure, all of the zeroth inputs to either the hidden or output layer are referred to as thresholds and are typically set to -1.
  • The weights of a neural network can take any positive or negative value.
  • The input values are multiplied by the weights that connect them to a particular neuron.
  • A neuron takes this weighted sum as input and uses an activation function to compute the neuron's output.
  • The output of one neuron becomes the input to another neuron, multiplied by a different subset of weights.

  6. Neural Networks
  • The input coming into a hidden unit, H_j, can be calculated as:
    H_j = Σ_i x_i w_ij
    where x_i represents the ith input and w_ij represents the weight connecting the ith input to the jth hidden unit.
  • The activation of H_j, f(H_j), can be computed using a variety of functions:
    • sign function (classification, pattern recognition),
    • step function (same as above),
    • sigmoid function (squashes inputs so outputs lie within 0..1),
    • linear function (used for linear approximation).
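The four activation functions listed above can be sketched in a few lines of Python. This is a minimal illustration; the function names are mine, and the convention that the sign function returns +1 at exactly zero is an assumption.

```python
import math

def sign_fn(a):
    # Sign function: outputs -1 or +1 (classification, pattern recognition).
    # Returning +1 at a == 0 is an assumed convention.
    return 1.0 if a >= 0 else -1.0

def step_fn(a):
    # Step function: hard threshold, outputs 0 or 1.
    return 1.0 if a >= 0 else 0.0

def sigmoid(a):
    # Sigmoid: squashes any real input into the range 0..1.
    return 1.0 / (1.0 + math.exp(-a))

def linear(a):
    # Linear (identity) function, used for linear approximation.
    return a

print(sign_fn(-0.3), step_fn(-0.3), sigmoid(0.0), linear(2.5))
# → -1.0 0.0 0.5 2.5
```

Note that only the sigmoid is differentiable everywhere, which is why it is the one used in the backpropagation derivation on the later slides.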

  7. Neural Networks

  8. Neural Networks

  9. Neural Networks: f(A) = 1/(1 + e^(-A))

  10. Neural Networks

  11. Neural Networks
  • Training a neural network (in this case, a backpropagation neural network):
    • Step 1: A training instance is selected and input into the neural network.
    • Step 2: The inputs are propagated through the network to the output layer.
    • Step 3: The squared error of the output is computed (desired output minus network output, squared).
    • Step 4: The error is propagated backwards through the network so that the weights can be updated.
    • Step 5: Go to Step 1.
  • This process is repeated until every training instance has been processed. The resulting NN is then checked against the validation set; if it performs better than the previous best NN, it is saved.
  • These 5 steps are then repeated until the training produces a NN that does not perform better than the current best NN.
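The 5-step loop with validation-based stopping might be sketched as follows. This is a hedged outline, not a real API: `forward`, `backward`, and `evaluate` are hypothetical callables supplied by the caller, and `evaluate` is assumed to return a score where higher is better.

```python
import copy

def train(network, training_set, validation_set, forward, backward, evaluate):
    """Repeat the 5-step pass until the validation score stops improving."""
    best_score, best_net = None, None
    while True:
        for instance, target in training_set:
            output = forward(network, instance)    # Steps 1-2: select and propagate
            error = (target - output) ** 2         # Step 3: squared error
            backward(network, instance, target)    # Step 4: propagate error, update weights
        score = evaluate(network, validation_set)  # check against the validation set
        if best_score is None or score > best_score:
            # Better than the previous best NN: save it.
            best_score, best_net = score, copy.deepcopy(network)
        else:
            # No improvement over the current best NN: stop.
            return best_net
```

The `copy.deepcopy` is there so the saved best network is not mutated by later training passes.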

  12. Neural Networks: Forward Pass
  • H_j = Σ_i x_i w_ij
  • O_k = Σ_j f(H_j) v_jk   (where f is a sigmoid function)
  • Out_k = f(O_k)
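The three forward-pass equations translate directly into code. A minimal sketch, assuming `w[i][j]` holds the input-to-hidden weights and `v[j][k]` the hidden-to-output weights (threshold inputs omitted for brevity):

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def forward_pass(x, w, v):
    # H_j = sum_i x_i * w_ij
    H = [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(len(w[0]))]
    # Hidden-unit activations f(H_j)
    fH = [sigmoid(h) for h in H]
    # O_k = sum_j f(H_j) * v_jk
    O = [sum(fH[j] * v[j][k] for j in range(len(fH))) for k in range(len(v[0]))]
    # Out_k = f(O_k)
    return [sigmoid(o) for o in O]
```

With all weights at zero, every weighted sum is 0 and the output is sigmoid(0) = 0.5, which is a quick sanity check on the wiring.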

  13. Neural Networks: Backward Pass   (f′(y) = f(y)(1 − f(y)))
  • Backward Pass (Output to Hidden Layer)
  • δO_k = f′(O_k)(t_k − Out_k)
  • v_jk = v_jk + Δv_jk
  • Δv_jk = η f(H_j) δO_k
  • Δv_jk = η f(H_j) f′(O_k)(t_k − Out_k)

  14. Neural Networks: Backward Pass   (f′(y) = f(y)(1 − f(y)))
  • Backward Pass (Hidden to Input Layer)
  • δH_j = f′(H_j) Σ_k v_jk δO_k
  • w_ij = w_ij + Δw_ij
  • Δw_ij = η x_ij δH_j
  • Δw_ij = η x_ij f′(H_j) Σ_k (v_jk δO_k)
  • Δw_ij = η x_ij f′(H_j) Σ_k (v_jk f′(O_k)(t_k − Out_k))
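Both backward-pass slides can be combined into one update routine. This is a sketch under the same layout assumptions as the forward pass (`w[i][j]`, `v[j][k]`), using the sigmoid identity f′(y) = f(y)(1 − f(y)); the learning rate η defaults to 0.5 purely for illustration.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def backward_pass(x, w, v, t, eta=0.5):
    # Recompute the forward quantities needed for the gradients.
    H = [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(len(w[0]))]
    fH = [sigmoid(h) for h in H]
    O = [sum(fH[j] * v[j][k] for j in range(len(fH))) for k in range(len(v[0]))]
    Out = [sigmoid(o) for o in O]
    # delta_O_k = f'(O_k)(t_k - Out_k), where f'(O_k) = Out_k (1 - Out_k)
    dO = [Out[k] * (1 - Out[k]) * (t[k] - Out[k]) for k in range(len(Out))]
    # delta_H_j = f'(H_j) * sum_k v_jk * delta_O_k
    dH = [fH[j] * (1 - fH[j]) * sum(v[j][k] * dO[k] for k in range(len(dO)))
          for j in range(len(fH))]
    # v_jk = v_jk + eta * f(H_j) * delta_O_k
    for j in range(len(v)):
        for k in range(len(v[j])):
            v[j][k] += eta * fH[j] * dO[k]
    # w_ij = w_ij + eta * x_i * delta_H_j
    for i in range(len(w)):
        for j in range(len(w[i])):
            w[i][j] += eta * x[i] * dH[j]
    return w, v
```

Note that δH_j is computed with the old values of v, before they are updated, so the hidden-layer gradient reflects the weights that actually produced the error.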

  15. Neural Networks: A Simple Example
  Backpropagation Revisited. Backpropagating the error from the output layer back to the input usually takes about 2(n+1) steps, where n is the number of hidden layers. For a network with one hidden layer (like the one we have seen previously), we need 4 steps.
  Step 1: Calculate the error gradient of the output layer neurons, δO_k:
    δO_k = f′(O_k)(t_k − Out_k)
  In this equation, (t_k − Out_k) tells us the magnitude and direction in which we need to move. Basically we have three choices: move in the positive direction (if t_k > Out_k), move in the negative direction (if t_k < Out_k), or don't move at all (if t_k = Out_k). The factor f′(O_k) tells us the appropriate slope/angle/direction to move along. Now, let's take another look at the sigmoid function!

  16. Neural Networks: A Simple Example
  Step 2: Calculate the new values for the weights v_jk:
    v_jk = v_jk + η f(H_j) δO_k
  In the above equation, notice that if f(H_j) does not change, then v_jk has to be adjusted. Notice also that if the error gradient is positive we increase the value of the weight, and if the error gradient is negative we decrease the value of the weight.
  Step 3: Calculate the error gradient of the hidden layer neurons, δH_j:
    δH_j = f′(H_j) Σ_k v_jk δO_k
  Notice that in the above equation we need to know both the direction and the magnitude. The direction is supplied by f′(H_j), and for the magnitude we need to consider all of the output neurons that H_j is connected to.

  17. Neural Networks: A Simple Example
  Step 4: Calculate the new values for the weights w_ij:
    w_ij = w_ij + η x_ij δH_j
  The above equation is very similar to what we did in Step 2; here, however, the inputs connected to the weights are the x_ij.
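Steps 1 through 4 can be traced numerically on the smallest possible network: one input, one hidden unit, one output. The concrete values (x = 1, w = 0.5, v = 0.5, t = 1, η = 0.1) are illustrative assumptions, not from the slides, and the code follows the slides' step order, updating v in Step 2 before computing δH_j in Step 3.

```python
import math

def f(a):
    # Sigmoid activation, f(A) = 1/(1 + e^(-A))
    return 1.0 / (1.0 + math.exp(-a))

# Illustrative values (assumed): one input, one hidden unit, one output.
x, w, v, t, eta = 1.0, 0.5, 0.5, 1.0, 0.1

# Forward pass
H = x * w          # input to the hidden unit
O = f(H) * v       # input to the output unit
Out = f(O)         # network output

# Step 1: error gradient of the output neuron, using f'(y) = f(y)(1 - f(y))
dO = Out * (1 - Out) * (t - Out)
# Step 2: new hidden-to-output weight
v = v + eta * f(H) * dO
# Step 3: error gradient of the hidden neuron (slide order: uses updated v)
dH = f(H) * (1 - f(H)) * v * dO
# Step 4: new input-to-hidden weight
w = w + eta * x * dH

print(round(Out, 4), round(v, 4), round(w, 4))
```

Since t > Out here, dO is positive and both weights move upward, which matches the sign argument in Step 1: the network's next output on this instance will be closer to the target.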
