
Stochastic Optimization and Simulated Annealing


Presentation Transcript


  1. Stochastic Optimization and Simulated Annealing. Psychology 85-419/719, January 25, 2001

  2. In Previous Lecture... • Discussed constraint satisfaction networks, which have: • Units, weights, and a “goodness” function • Updating a unit’s state involves computing its input from the other units • Each update is guaranteed to locally increase goodness • But it is not guaranteed to reach the globally best goodness
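As a concrete reminder, here is a minimal Python sketch of that goodness computation, using the standard constraint-satisfaction form (the name goodness and the variable names are illustrative; the exact formula used in class may differ slightly):

```python
def goodness(state, weights, external):
    """Goodness of a binary activation state: the weighted agreement
    between every pair of units, plus the external input to each active unit."""
    n = len(state)
    pairwise = sum(weights[i][j] * state[i] * state[j]
                   for i in range(n) for j in range(i + 1, n))
    ext = sum(external[i] * state[i] for i in range(n))
    return pairwise + ext
```

Updating one unit at a time from its net input can only raise this quantity, which is exactly why the plain update rule can get stuck at a local optimum.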

  3. The General Problem: Local Optima [Figure: goodness plotted against activation state, with the local optima and the true (global) optimum marked]

  4. How To Solve the Problem of Local Optima? • Exhaustive search? • Nah. Takes too long. n units have 2^n possible states (if binary) • Random re-starts? • Seems wasteful. • How about something that generally goes in the right direction, with some randomness?

  5. Sometimes It Isn’t Best To Always Go Straight Towards The Goal • Rubik’s Cube: undo some moves in order to make progress • Baseball: sacrifice fly • Navigation: move away from the goal to get around obstacles

  6. Randomness Can Help Us Escape Bad Solutions [Figure: goodness plotted against activation state; random jumps let the state escape a local optimum and reach a better one]

  7. So, How Random Do We Want to Be? • We can take a cue from physical systems • In metallurgy, a metal can reach a very strong (stable) state by: • Melting it, which scrambles its molecular structure • Gradually cooling it • The resulting molecular structure is very stable • New terminology: reduce energy (which is roughly the negative of goodness)

  8. Simulated Annealing • The odds that a unit is on are a function of: • The input to the unit, net • The temperature, T • Specifically, P(output = 1) = 1 / (1 + e^(-net/T))
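A minimal Python sketch of that update rule, assuming the standard logistic/Boltzmann form (the names p_on, update_unit, net, and T are illustrative, not from the course software):

```python
import math
import random

def p_on(net, T):
    """Probability that a unit's output is 1, given its net input and temperature T."""
    return 1.0 / (1.0 + math.exp(-net / T))

def update_unit(net, T):
    """Stochastically set the unit's state: 1 with probability p_on(net, T), else 0."""
    return 1 if random.random() < p_on(net, T) else 0
```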

  9. Picking it Apart... • As net increases, the probability that the output is 1 increases • e is raised to -net/T; so as net gets big, e^(-net/T) goes to zero, and the probability goes to 1/(1+0) = 1

  10. The Temperature Term • When T is big, the exponent -net/T goes to zero • e (or anything nonzero) to the zero power is 1 • So the probability that the output is 1 goes to 1/(1+1) = 0.5

  11. The Temperature Term (2) • When T gets small, the exponent -net/T gets big in magnitude • The effect of net becomes amplified: the probability curve gets steeper, closer to a step function

  12. Different Temperatures... [Figure: probability that the output is 1 (from 0 to 1) plotted against net input, with separate curves for low, medium, and high temperature; the lower the temperature, the steeper the curve]
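Numerically, the flattening looks like this: a quick check at a fixed net input of 2, reusing the logistic rule from slide 8 (the name p_on is just illustrative):

```python
import math

def p_on(net, T):                           # same logistic rule as slide 8
    return 1.0 / (1.0 + math.exp(-net / T))

for T in (0.5, 1.0, 5.0):                   # low, medium, high temperature
    print(T, round(p_on(2.0, T), 3))
# approximately: T=0.5 -> 0.982, T=1.0 -> 0.881, T=5.0 -> 0.599
# low temperature pushes the probability toward 0 or 1;
# high temperature pushes it toward 0.5
```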

  13. OK, So At What Rate Do We Reduce Temperature? [Figure: temperature T, on a scale from 0 to 100, gradually decreasing over time] • In general, temperature must be decreased very slowly to guarantee convergence to the global optimum • In practice, we can get away with a more aggressive annealing schedule
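As one illustration of an “aggressive” schedule, the sketch below settles a tiny binary network while multiplying the temperature by a constant factor after each sweep (a geometric schedule). The weights, starting temperature, and cooling factor are made-up values for illustration, not the settings used in the course:

```python
import math, random

def anneal(weights, bias, T_start=10.0, T_min=0.05, cool=0.9):
    """Settle a binary network stochastically while gradually lowering the temperature."""
    n = len(weights)
    state = [random.randint(0, 1) for _ in range(n)]
    T = T_start
    while T > T_min:
        for i in range(n):                          # one sweep over all units
            net = bias[i] + sum(weights[i][j] * state[j]
                                for j in range(n) if j != i)
            p = 1.0 / (1.0 + math.exp(-net / T))    # same rule as slide 8
            state[i] = 1 if random.random() < p else 0
        T *= cool                                   # cool a little after each sweep
    return state

# Two mutually supporting units and one unit that conflicts with both:
W = [[0, 2, -2],
     [2, 0, -2],
     [-2, -2, 0]]
b = [0.5, 0.5, 0.5]
print(anneal(W, b))   # usually settles into the best state, [1, 1, 0]
```

A schedule slow enough to guarantee the global optimum would lower T far more gradually; multiplying by 0.9 each sweep is the “more aggressive” kind of schedule the slide mentions.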

  14. Putting it Together... • We can represent facts, etc. as units • Knowledge about these facts is encoded in the weights • Network processing fills in gaps, makes inferences, forms interpretations • Stable attractors form; the weights and input sculpt these attractors • Randomness in the updating process helps the network settle into more stable (higher-goodness) attractors
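One way to picture “filling in gaps”: clamp the units whose values you know and let the rest settle while cooling. A hypothetical sketch along the lines of the previous one, using the same toy weights (the function complete and the known mask are illustrative, not course code):

```python
import math, random

W = [[0, 2, -2], [2, 0, -2], [-2, -2, 0]]   # same toy weights as before
b = [0.5, 0.5, 0.5]

def complete(weights, bias, known, T_start=10.0, T_min=0.05, cool=0.9):
    """Fill in a partial pattern: units listed in `known` stay clamped,
    the others are updated stochastically while the temperature drops."""
    n = len(weights)
    state = [known.get(i, random.randint(0, 1)) for i in range(n)]
    T = T_start
    while T > T_min:
        for i in range(n):
            if i in known:                           # clamped unit: leave it alone
                continue
            net = bias[i] + sum(weights[i][j] * state[j]
                                for j in range(n) if j != i)
            state[i] = 1 if random.random() < 1.0 / (1.0 + math.exp(-net / T)) else 0
        T *= cool
    return state

print(complete(W, b, known={0: 1}))   # clamp unit 0 on; usually infers [1, 1, 0]
```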

  15. Stable Attractors Can Be Thought Of As Memories • How many stable patterns can be remembered by a network with N units? • There are 2^N possible patterns… • … but only about 0.15*N of them will be stable • To remember 100 things, you need about 100/0.15 ≈ 667 units! • (then again, the brain has about 10^12 neurons…)
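The arithmetic on that slide, written out (the 0.15-patterns-per-unit figure is the slide’s rule of thumb; published capacity estimates vary):

```python
capacity_per_unit = 0.15            # roughly how many stable patterns each unit buys
patterns_needed = 100
units_needed = patterns_needed / capacity_per_unit
print(round(units_needed))          # about 667 units to store 100 patterns
```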

  16. Human Performance, When Damaged (some examples) • Category coordinate errors • Naming a CAT as a DOG • Superordinate errors • Naming a CAT as an ANIMAL • Visual errors (deep dyslexics) • Naming SYMPATHY as SYMPHONY • or, naming SYMPATHY as ORCHESTRA

  17. The Attractors We’ve Talked About Can Be Useful In Understanding This [Figure: attractor basins for CAT and COT. Normal performance: the input “CAT” settles into the CAT attractor. A visual error: the input “CAT” settles into the nearby COT attractor] (see Plaut, Hinton, & Shallice)

  18. Properties of Human Memory • Details tend to go first, more general things next. Not all-or-nothing forgetting. • Things tend to be forgotten, based on • Salience • Recency • Complexity • Age of acquisition?

  19. Do These Networks Have These Properties? • Sort of. • Graceful degradation. Features vanish as a function of strength of input to them. • Complexity: more complex / arbitrary patterns can be more difficult to retain • Salience, recency, age of acquisition? • Depends on learning rule. Stay tuned

  20. Next Time: Psychological Implications: The IAC Model of Word Perception • Optional reading: McClelland and Rumelhart ’81 (handout) • Rest of this class: lab session. Help installing software, help with homework.
