
Principles of Back-Propagation Prof. Bart ter Haar Romeny


Presentation Transcript


  1. The relation between biological vision and computer vision. Principles of Back-Propagation. Prof. Bart ter Haar Romeny

  2. How does this actually work? Deep Learning: Convolutional Neural Networks. In AlexNet (Alex Krizhevsky, 2012): error backpropagation. ImageNet challenge: 1.4 million images, 1000 classes; accuracy 75% → 94%. A typical big deep NN has (hundreds of) millions of connections: weights. Convolution, ReLU, max pooling, convolution, convolution, etc.

  3. From Prakash Jay, Senior Data Scientist @FractalAnalytics: https://medium.com/@14prakash/back-propagation-is-very-simple-who-made-it-complicated-97b794c97e5c A numerical example of backpropagation on a simple network:

  4. Approach • Build a small neural network as defined in the architecture at right. • Initialize the weights and biases randomly. • Fix the input and output. • Forward-pass the inputs and calculate the cost. • Compute the gradients and errors. • Backprop and adjust the weights and biases accordingly. We initialize the network randomly (see the sketch below):
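A minimal NumPy sketch of this setup. The layer sizes and input values here are illustrative stand-ins; the slide's exact architecture lives in a figure that is not part of this transcript:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative architecture (not the slide's exact one):
# 3 inputs -> 4 hidden (ReLU) -> 4 hidden (sigmoid) -> 3 outputs (softmax).
sizes = [3, 4, 4, 3]

# One (W, b) pair per layer, with small random weights and zero biases.
weights = [rng.normal(0.0, 0.1, size=(n_in, n_out))
           for n_in, n_out in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(n_out) for n_out in sizes[1:]]

x = np.array([0.1, 0.2, 0.7])   # fixed input (illustrative values)
a = np.array([1.0, 0.0, 0.0])   # fixed target, as on slide 11
```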

  5. Forward pass layer 1:

  6. Forward pass layer 1: Matrix operation: z1 = x · W1 + b1. ReLU operation: h1 = max(0, z1), applied elementwise. Example:
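Continuing the sketch above, layer 1 as a matrix operation followed by an elementwise ReLU:

```python
def relu(z):
    # ReLU: max(0, z), applied elementwise.
    return np.maximum(0.0, z)

# Layer 1: affine map, then ReLU.
z1 = x @ weights[0] + biases[0]
h1 = relu(z1)
```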

  7. Forward pass layer 2:

  8. Forward pass layer 2: Matrix operation: z2 = h1 · W2 + b2. Sigmoid operation: h2 = 1 / (1 + exp(−z2)), applied elementwise. Example:
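Continuing the sketch, layer 2 as a matrix operation followed by an elementwise sigmoid:

```python
def sigmoid(z):
    # Logistic sigmoid: 1 / (1 + exp(-z)), applied elementwise.
    return 1.0 / (1.0 + np.exp(-z))

# Layer 2: affine map, then sigmoid.
z2 = h1 @ weights[1] + biases[1]
h2 = sigmoid(z2)
```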

  9. Forward pass layer 3:

  10. Forward pass output layer: Matrix operation: z3 = h2 · W3 + b3. Softmax operation: p_i = exp(z3_i) / Σ_j exp(z3_j). Example: [0.1985, 0.2855, 0.5158]
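Continuing the sketch, the output layer as a matrix operation followed by a softmax, which turns the scores into class probabilities:

```python
def softmax(z):
    # Subtract max(z) for numerical stability; the result is unchanged
    # because softmax is invariant to a constant shift of its input.
    e = np.exp(z - np.max(z))
    return e / e.sum()

# Output layer: affine map, then softmax -> class probabilities.
z3 = h2 @ weights[2] + biases[2]
p = softmax(z3)   # the slide's run gave [0.19858, 0.28559, 0.51583]
```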

  11. Analysis: • The actual output should be [1.0, 0.0, 0.0], but we got [0.19858, 0.28559, 0.51583]. • To calculate the error, let us use cross-entropy. • Error: E = −Σ_i [ a_i Log(p_i) + (1 − a_i) Log(1 − p_i) ]

  12. Analysis: Example: Error = −(1 · Log[0.19858] + 0 · Log[1 − 0.19858] + 0 · Log[0.28559] + 1 · Log[1 − 0.28559] + 0 · Log[0.51583] + 1 · Log[1 − 0.51583]) = 2.67818. We are done with the forward pass. We know the error of the first iteration (we will do this numerous times). Now let us study the backward pass.
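A self-contained check of this error computation, using the per-class cross-entropy exactly as written on the slide:

```python
import numpy as np

def cross_entropy(p, a):
    # Per-class cross-entropy, summed over classes, as on slide 12:
    # E = -sum_i [ a_i*log(p_i) + (1 - a_i)*log(1 - p_i) ].
    return -np.sum(a * np.log(p) + (1 - a) * np.log(1 - p))

p = np.array([0.19858, 0.28559, 0.51583])
a = np.array([1.0, 0.0, 0.0])
print(cross_entropy(p, a))   # ~2.67818, matching the slide
```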

  13. A chain of functions: From Rohan Kapur: https://ayearofai.com/rohan-lenny-1-neural-networks-the-backpropagation-algorithm-explained-abf4609d4f9d

  14. We recall:

  15. For gradient descent: The derivative of this function with respect to some arbitrary weight (for example w1) is calculated by applying the chain rule: For a simple error measure (p = predicted, a = actual):
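A numerical illustration of this chain rule. The slide's formulas are in images, so this sketch assumes the "simple error measure" is the squared error E = ½(p − a)² and uses a single sigmoid unit p = sigmoid(w1 · x); all values are illustrative:

```python
import numpy as np

# Chain rule for one weight: with E = 1/2 (p - a)^2 and p = sigmoid(w1*x),
# dE/dw1 = dE/dp * dp/dz * dz/dw1 = (p - a) * p * (1 - p) * x.
w1, x_in, a_t = 0.5, 1.5, 1.0             # illustrative values

p = 1.0 / (1.0 + np.exp(-w1 * x_in))
grad = (p - a_t) * p * (1.0 - p) * x_in   # chain rule, term by term

# Finite-difference check that the chain rule gave the right answer.
def E(w):
    p = 1.0 / (1.0 + np.exp(-w * x_in))
    return 0.5 * (p - a_t) ** 2

eps = 1e-6
print(grad, (E(w1 + eps) - E(w1 - eps)) / (2 * eps))   # should agree
```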

  16. Important derivatives: Sigmoid: σ′(z) = σ(z)(1 − σ(z)). ReLU: R′(z) = 1 for z > 0, and 0 otherwise. SoftMax: ∂s_i/∂z_j = s_i (δ_ij − s_j).
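These three derivatives in NumPy form (standard results, stated here as a sketch):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def d_sigmoid(z):
    # sigma'(z) = sigma(z) * (1 - sigma(z))
    s = sigmoid(z)
    return s * (1.0 - s)

def d_relu(z):
    # ReLU'(z) = 1 for z > 0, else 0 (undefined at 0; 0 by convention).
    return (z > 0).astype(float)

def softmax_jacobian(s):
    # For s = softmax(z): ds_i/dz_j = s_i * (delta_ij - s_j).
    return np.diag(s) - np.outer(s, s)
```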

  17. Two slides ago, we saw that:

  18. Going one more layer backwards, we can determine the gradients of the earlier weights in the same way, with the chain rule applied layer by layer. And finally, apply the update rule w ← w − η ∂E/∂w and iterate until convergence:
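A minimal, self-contained sketch of this iteration, reusing the single sigmoid unit from the chain-rule example above (illustrative values, not the slides' network):

```python
import numpy as np

# "Iterate until convergence": gradient descent on one sigmoid unit.
# Update rule each step: w <- w - eta * dE/dw.
w, x_in, a_t, eta = 0.5, 1.5, 1.0, 0.5

for step in range(10_000):
    p = 1.0 / (1.0 + np.exp(-w * x_in))       # forward pass
    grad = (p - a_t) * p * (1.0 - p) * x_in   # backward pass (chain rule)
    w -= eta * grad                           # gradient-descent update
    if abs(grad) < 1e-4:                      # crude convergence test
        break

print(step, w, p)   # p has moved close to the target a_t = 1.0
```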

  19. Numerical example in great detail by Prakash Jay on Medium.com: • https://medium.com/@14prakash/back-propagation-is-very-simple-who-made-it-complicated-97b794c97e5c etc.

  20. Deeper reading: • https://eli.thegreenplace.net/2016/the-softmax-function-and-its-derivative • https://eli.thegreenplace.net/2018/backpropagation-through-a-fully-connected-layer/
