
CSE 473 Introduction to Artificial Intelligence Neural Networks


  1. CSE 473 Introduction to Artificial Intelligence: Neural Networks. Henry Kautz, Spring 2006

  2. Training a Single Neuron • Idea: adjust weights to reduce sum of squared errors over training set • Error = difference between actual and intended output • Algorithm: gradient descent • Calculate derivative (slope) of error function • Take a small step in the “downward” direction • Step size is the “training rate” • Single-layer network: can train each unit separately
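
A minimal sketch of that loop in Python (an illustration, not taken from the slides), assuming a sigmoid squashing function and a made-up toy training set; the sum of squared errors is followed downhill in small steps whose size is the training rate:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # Toy training set (hypothetical): inputs X, intended outputs y (AND).
    X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
    y = np.array([0., 0., 0., 1.])

    w = np.zeros(2)                          # weights
    b = 0.0                                  # bias weight
    eta = 0.5                                # training rate (step size)

    for epoch in range(1000):
        o = sigmoid(X @ w + b)               # actual outputs
        err = y - o                          # intended minus actual
        # Error E = 0.5 * sum(err^2); take a small step "downward" on E.
        grad_w = -(err * o * (1 - o)) @ X    # dE/dw
        grad_b = -np.sum(err * o * (1 - o))  # dE/db
        w -= eta * grad_w
        b -= eta * grad_b

    print(np.round(sigmoid(X @ w + b), 2))   # outputs approach [0, 0, 0, 1]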

  3. Gradient Descent

  4. Computing Partial Derivatives

  5. Single Unit Training Rule Adjust weight i in proportion to… • Training rate • Error • Derivative of the “squashing function” • Degree to which input i was active
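
Read as code, the rule might look like the following sketch; the squashing function g, its derivative g_prime, and the argument names are illustrative placeholders, not part of the slides:

    def update_weights(w, x, target, g, g_prime, eta):
        """One training step for a single unit: each weight change is
        proportional to the training rate, the error, the slope of the
        squashing function, and the activity of input i."""
        net = sum(wi * xi for wi, xi in zip(w, x))    # weighted input
        out = g(net)                                  # unit output
        error = target - out                          # intended minus actual
        return [wi + eta * error * g_prime(net) * xi  # adjust weight i
                for wi, xi in zip(w, x)]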

  6. Sigmoid Units
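
For reference, the standard sigmoid squashing function and its convenient derivative (textbook material, not reproduced from the slide itself):

    import math

    def sigmoid(z):
        # Smooth squashing function: maps any real input into (0, 1).
        return 1.0 / (1.0 + math.exp(-z))

    def sigmoid_deriv(z):
        # Convenient property: g'(z) = g(z) * (1 - g(z)).
        s = sigmoid(z)
        return s * (1.0 - s)

    print(sigmoid(0.0), sigmoid_deriv(0.0))   # 0.5 0.25 -- slope is largest here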

  7. Sigmoid Unit Training Rule Adjust weight i in proportion to… • Training rate • Error • Degree to which output is ambiguous • Degree to which input i was active
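
A sketch of that rule, in which "degree to which output is ambiguous" is the factor o(1 - o): largest when the output sits at 0.5, near zero when the unit is confidently on or off:

    import math

    def sigmoid_update(w, x, target, eta):
        """One training step for a sigmoid unit."""
        out = 1.0 / (1.0 + math.exp(-sum(wi * xi for wi, xi in zip(w, x))))
        error = target - out                 # intended minus actual output
        ambiguity = out * (1.0 - out)        # o(1 - o): peaks at out = 0.5
        return [wi + eta * error * ambiguity * xi   # also scaled by input activity
                for wi, xi in zip(w, x)]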

  8. Expressivity of Neural Networks • Single units can learn any linear function • Single layer of units can learn any set of linear inequalities (convex region) • Two layers can approximate any continuous function • Three layers can approximate any computable function
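
One way to see the single-unit and two-layer claims is with hand-picked weights (illustrative values, not learned): a single sigmoid unit can represent AND, which is one linear inequality, while XOR is not linearly separable and needs a second layer:

    import math

    def unit(w, b, x):
        # One sigmoid unit acting as a soft linear threshold.
        return 1.0 / (1.0 + math.exp(-(sum(wi * xi for wi, xi in zip(w, x)) + b)))

    def AND(x):
        # AND is one linear inequality (x1 + x2 > 1.5), so a single unit suffices.
        return unit([10, 10], -15, x)

    def XOR(x):
        # XOR needs a hidden layer: compute OR and AND, then "OR and not AND".
        h1 = unit([10, 10], -5, x)             # hidden unit ~ OR
        h2 = unit([10, 10], -15, x)            # hidden unit ~ AND
        return unit([10, -10], -5, [h1, h2])

    for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
        print(x, round(AND(x)), round(XOR(x)))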

  9. Character Recognition Demo

  10. BackProp Demo 1 • http://www.neuro.sfc.keio.ac.jp/~masato/jv/sl/BP.html • Local version: BP.html

  11. Backprop Demo 2 • http://www.williewheeler.com/software/bnn.html • Local version: bnn.html

  12. Modeling the Brain • Backpropagation is the most commonly used algorithm for supervised learning with feed-forward neural networks • But most neuroscientists believe that the brain does not implement backprop • Many other learning rules have been studied
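
A minimal two-layer backprop sketch, assuming sigmoid units and the XOR task as a stand-in training set; layer sizes, learning rate, and epoch count are arbitrary illustrative choices:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    rng = np.random.default_rng(0)
    X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
    y = np.array([[0.], [1.], [1.], [0.]])       # XOR: not linearly separable

    W1 = rng.normal(scale=0.5, size=(2, 4)); b1 = np.zeros(4)   # hidden layer
    W2 = rng.normal(scale=0.5, size=(4, 1)); b2 = np.zeros(1)   # output layer
    eta = 0.5

    for epoch in range(10000):
        # Forward pass.
        h = sigmoid(X @ W1 + b1)
        o = sigmoid(h @ W2 + b2)
        # Backward pass: propagate the output error back through the layers.
        d_o = (o - y) * o * (1 - o)              # error signal at output units
        d_h = (d_o @ W2.T) * h * (1 - h)         # error signal at hidden units
        W2 -= eta * h.T @ d_o;  b2 -= eta * d_o.sum(axis=0)
        W1 -= eta * X.T @ d_h;  b1 -= eta * d_h.sum(axis=0)

    # Should approach [0, 1, 1, 0]; convergence depends on the random init.
    print(np.round(o.ravel(), 2))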

  13. Hebbian Learning • Alternative to backprop for unsupervised learning • Increase weights on connected neurons whenever both fire simultaneously • Neurologically plausible (Hebb 1949)
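
A sketch of the basic Hebbian update, strengthening the connection between any two neurons in proportion to their simultaneous activity (the toy activity patterns are invented for illustration):

    import numpy as np

    def hebbian_step(W, activity, eta=0.1):
        # "Cells that fire together wire together": strengthen w_ij in
        # proportion to how active both neuron i and neuron j are.
        return W + eta * np.outer(activity, activity)

    W = np.zeros((3, 3))
    patterns = [np.array([1., 1., 0.]), np.array([1., 1., 0.]), np.array([0., 0., 1.])]
    for p in patterns:
        W = hebbian_step(W, p)
    # The 0-1 connection, reinforced twice, is the strongest off-diagonal weight.
    print(W)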

  14. Self-Organizing Maps • Unsupervised method for clustering data • Learns a “winner take all” network where just one output neuron is on for each cluster
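
A simplified winner-take-all sketch (competitive learning without the SOM neighborhood function, on invented synthetic data): for each input only the closest output unit "wins", and its weight vector is pulled toward that input:

    import numpy as np

    rng = np.random.default_rng(1)
    data = np.vstack([rng.normal(loc, 0.1, size=(50, 2))   # three synthetic clusters
                      for loc in ([0, 0], [1, 0], [0, 1])])
    W = rng.random((3, 2))        # one weight vector per output (cluster) unit

    for epoch in range(20):
        for x in rng.permutation(data):
            winner = np.argmin(np.linalg.norm(W - x, axis=1))  # only one unit is "on"
            W[winner] += 0.1 * (x - W[winner])                  # pull winner toward x

    print(np.round(W, 2))         # rows end up near the three cluster centres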

  15. Why “Self-Organizing”

  16. Recurrent Neural Networks • Include time-delay feedback loops • Can handle temporal data tasks, such as sequence prediction
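
A sketch of an Elman-style recurrent step, where the hidden state is fed back through a one-step time delay; the weights here are random and untrained, so this only shows the data flow over a sequence, not learned sequence prediction:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    rng = np.random.default_rng(2)
    W_in  = rng.normal(size=(1, 8))   # input  -> hidden
    W_rec = rng.normal(size=(8, 8))   # hidden -> hidden (time-delay feedback loop)
    W_out = rng.normal(size=(8, 1))   # hidden -> output

    def run(sequence):
        h = np.zeros(8)                # hidden state carried across time steps
        outputs = []
        for x in sequence:
            h = sigmoid(np.array([x]) @ W_in + h @ W_rec)   # input + delayed state
            outputs.append((h @ W_out).item())              # prediction for next item
        return outputs

    print(run([0., 1., 0., 1., 0.]))   # untrained forward pass over a toy sequence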
