
Adapting Bias by Gradient Descent: An Incremental Version of Delta-Bar-Delta



Presentation Transcript


  1. Adapting Bias by Gradient Descent: An Incremental Version of Delta-Bar-Delta Rich Sutton in AAAI-92

  2. Outline • Remarks on learning and bias • Least Mean Square (LMS) learning • IDBD (new algorithm) • Exp 1: Is IDBD better? • Exp 2: Does IDBD find an optimal bias? • Derivation of LMS as gradient descent • Derivation of IDBD as “meta” grad desc

  3. LMS (least mean square) Learning • Prediction: $y(t) = \sum_i w_i(t)\,x_i(t)$ • Error: $\delta(t) = y^*(t) - y(t)$ • Minimize the expected squared error $E[\delta^2]$ • LMS rule: $w_i(t+1) = w_i(t) + \alpha\,\delta(t)\,x_i(t)$
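As a concrete reading of the LMS rule above, here is a minimal per-example sketch in NumPy. The function and variable names are mine, not the paper's; the point to notice is that a single fixed step-size alpha is shared by all inputs, and that shared step-size is exactly the bias IDBD adapts per input.

```python
import numpy as np

def lms_update(w, x, y_target, alpha):
    """One LMS step: w_i <- w_i + alpha * delta * x_i, with delta = y* - w.x."""
    delta = y_target - np.dot(w, x)   # prediction error delta(t)
    w = w + alpha * delta * x         # gradient-descent step on the sample squared error
    return w, delta
```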

  4. Incremental Delta-Bar-Delta (IDBD)
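A minimal NumPy sketch of one IDBD update: each input i keeps a log step-size beta_i (so alpha_i = exp(beta_i) stays positive) and a trace h_i, and theta is the single meta step-size. The packaging and names are mine; the update rules are my reading of the paper's per-example algorithm, so treat this as a sketch rather than a reference implementation.

```python
import numpy as np

def idbd_update(w, beta, h, x, y_target, theta):
    """One IDBD step: adapt per-input step-sizes alpha_i = exp(beta_i) by meta-gradient descent."""
    delta = y_target - np.dot(w, x)              # prediction error
    beta = beta + theta * delta * x * h          # meta-gradient step on the log step-sizes
    alpha = np.exp(beta)                         # per-input step-sizes, always positive
    w = w + alpha * delta * x                    # LMS-style weight update with per-input alphas
    h = h * np.maximum(0.0, 1.0 - alpha * x * x) + alpha * delta * x   # trace h_i ~ dw_i/dbeta_i
    return w, beta, h, delta
```

Initializing h to zeros and beta to the log of an ordinary LMS step-size makes the first updates behave like plain LMS; theta is then the only free parameter, as the closing slide notes.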

  5. Experiment 1: Does IDBD Help? • 20 inputs, 5 relevant: target $y^*(t) = \sum_{i=1}^{5} s_i\,x_i(t)$ with $s_i \in \{+1,-1\}$ • One $s_i$ switched in sign every 20 examples • 20,000 examples to wash out transients, then 10,000 examples for real • Average MSE measured over the 10,000 examples
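A sketch of a data stream matching this description. The slide fixes the sizes and the 20-example switching interval; the standard-normal inputs and the uniformly random choice of which relevant sign flips are assumptions on my part, and the generator name is hypothetical.

```python
import numpy as np

def make_experiment1_stream(n_examples, n_inputs=20, n_relevant=5, switch_every=20, rng=None):
    """Yield (x, y*) pairs for the nonstationary task sketched on the slide.

    Target: y* = sum_i s_i * x_i over the relevant inputs, s_i in {+1, -1};
    every `switch_every` examples one relevant sign is flipped.  The normal
    input distribution and the random choice of flipped sign are assumptions.
    """
    rng = np.random.default_rng() if rng is None else rng
    s = np.ones(n_relevant)                       # signs of the relevant target weights
    for t in range(n_examples):
        if t > 0 and t % switch_every == 0:
            s[rng.integers(n_relevant)] *= -1     # flip one relevant sign
        x = rng.standard_normal(n_inputs)         # 20 inputs, only the first 5 matter
        y_target = float(np.dot(s, x[:n_relevant]))
        yield x, y_target
```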

  6. Exp 2a: What αs are found by IDBD? • Small θ (= 0.001), long run • Step-sizes for the relevant inputs settle at $\alpha_i \approx 0.13$ • Step-sizes for the irrelevant inputs settle at $\alpha_i \approx 0$

  7. Exp 2b: What are the optimal αs? • Repeat of the first experiment with fixed αs: α = 0 for the irrelevant inputs, vary α for the relevant inputs • Optimal α ≈ 0.13, the value IDBD found in Exp 2a

  8. Derivation of LMS as grad descent (1) • Error is $\delta(t) = y^*(t) - y(t)$ • Seek to minimize the expected squared error $E[\delta^2]$ using the sample squared error $\delta^2(t)$ • Ergo, the gradient descent idea: $w_i(t+1) = w_i(t) - \tfrac{1}{2}\,\alpha\,\frac{\partial \delta^2(t)}{\partial w_i(t)}$

  9. Derivation of LMS as grad descent (2)
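A worked sketch of this step, under the linear form $y(t) = \sum_j w_j(t)\,x_j(t)$ used above:

```latex
\frac{\partial \delta^2(t)}{\partial w_i(t)}
  \;=\; 2\,\delta(t)\,\frac{\partial \delta(t)}{\partial w_i(t)}
  \;=\; -2\,\delta(t)\,x_i(t)
  \quad\Longrightarrow\quad
  w_i(t+1) \;=\; w_i(t) + \alpha\,\delta(t)\,x_i(t)
```

which is exactly the LMS rule of slide 3.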

  10. Derivation of IDBD as GD (1) • Do gradient descent on the squared error with respect to the log step-sizes: $\beta_i(t+1) = \beta_i(t) - \tfrac{1}{2}\,\theta\,\frac{\partial \delta^2(t)}{\partial \beta_i}$ • $\frac{\partial \delta^2(t)}{\partial \beta_i} = \sum_j \frac{\partial \delta^2(t)}{\partial w_j(t)}\,\frac{\partial w_j(t)}{\partial \beta_i} \approx \frac{\partial \delta^2(t)}{\partial w_i(t)}\,\frac{\partial w_i(t)}{\partial \beta_i}$, assuming $\frac{\partial w_j(t)}{\partial \beta_i} \approx 0$ for $j \neq i$ • $= -2\,\delta(t)\,x_i(t)\,h_i(t)$, where $h_i(t) \doteq \frac{\partial w_i(t)}{\partial \beta_i}$ • Hence $\beta_i(t+1) = \beta_i(t) + \theta\,\delta(t)\,x_i(t)\,h_i(t)$

  11. Derivation of IDBD as GD (2)
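A sketch of the remaining step, the update for the trace $h_i(t) = \frac{\partial w_i(t)}{\partial \beta_i}$, using the same ignore-cross-terms approximation and treating $\frac{\partial \beta_i(t+1)}{\partial \beta_i} \approx 1$:

```latex
h_i(t+1) \;=\; \frac{\partial w_i(t+1)}{\partial \beta_i}
        \;=\; \frac{\partial}{\partial \beta_i}\!\left[\, w_i(t) + e^{\beta_i(t+1)}\,\delta(t)\,x_i(t) \,\right]
        \;\approx\; h_i(t) + \alpha_i(t+1)\,\delta(t)\,x_i(t)
                 + \alpha_i(t+1)\,x_i(t)\,\frac{\partial \delta(t)}{\partial \beta_i}
```

With $\frac{\partial \delta(t)}{\partial \beta_i} \approx -x_i(t)\,h_i(t)$ (the same approximation again, since $\delta(t) = y^*(t) - \sum_j w_j(t)\,x_j(t)$), this gives

```latex
h_i(t+1) \;\approx\; h_i(t)\,\big[1 - \alpha_i(t+1)\,x_i^2(t)\big] + \alpha_i(t+1)\,\delta(t)\,x_i(t)
```

and IDBD clips the first factor at zero, writing it as $[1 - \alpha_i(t+1)\,x_i^2(t)]^+$, to keep the trace stable.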

  12. Closing • IDBD improves over earlier DBD methods • Incremental, example-by-example updating • Only one free parameter • Can be derived as “meta” gradient descent • IDBD is a principled way of learning bias (for linear LMS methods) • As such can dramatically speed learning • With only a linear (≈3×) increase in memory & comp. • “Meta” gradient descent is a general idea • An incremental form of hold-one-out cross validation • Fundamentally different from other learning methods • Could be used to learn other kinds of biases
