
Latent Learning in Agents

ICML 2003

Robotics/Vision Workshop

Rati Sharma

Problem statement
  • Determine optimal paths in spatial navigation tasks.
  • We use a deterministic grid environment as our world model (a minimal sketch follows this list).
  • Various approaches have been used: ANNs, Q-learning, and Dyna.
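The slides do not spell out the environment, so the sketch below shows one plausible deterministic grid world in Python; the GridWorld class, its start cell, goal reward, and wall handling are all assumptions made for illustration.

```python
class GridWorld:
    """Minimal deterministic grid world (hypothetical: the slides do not
    specify the layout, start cell, or reward scheme)."""

    ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

    def __init__(self, rows, cols, goal, walls=()):
        self.rows, self.cols = rows, cols
        self.goal = goal                 # (row, col) of the rewarded cell
        self.walls = set(walls)          # blocked (row, col) cells
        self.n_states = rows * cols
        self.n_actions = len(self.ACTIONS)

    def reset(self):
        return 0                         # always start in the top-left cell

    def step(self, state, action):
        """One deterministic transition: returns (new state, reward, done)."""
        r, c = divmod(state, self.cols)
        dr, dc = self.ACTIONS[action]
        nr, nc = r + dr, c + dc
        # moves off the grid or into a wall leave the agent in place
        if not (0 <= nr < self.rows and 0 <= nc < self.cols) or (nr, nc) in self.walls:
            nr, nc = r, c
        done = (nr, nc) == self.goal
        return nr * self.cols + nc, (1.0 if done else 0.0), done
```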
Latent Learning
  • Tolman proposed the idea of cognitive maps based on experiments with rats in a T-maze.
  • Latent learning is learning that is not evident to the observer at the time it occurs, but becomes apparent once reinforcement is introduced.
Algorithms used: Q-Learning
  • Q-Learning eventually converges to an optimal policy without ever learning or using an internal model of the environment.
  • Update rule

Q(s, a) = r(s, a) + γ · max_a′ Q(δ(s, a), a′)

where r(s, a) is the reward function, δ(s, a) is the new state given by the deterministic transition function, and γ is the discount factor.

Disadvantage: Convergence to the optimal policy can be very slow
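A minimal tabular sketch of this update rule, assuming the hypothetical GridWorld interface above. Because the environment is deterministic, a direct assignment with no learning rate matches the slide's equation; the episode count and exploration rate are illustrative.

```python
import random

def q_learning(env, episodes=500, gamma=0.9, epsilon=0.1):
    """Tabular Q-learning against the (hypothetical) GridWorld above."""
    Q = [[0.0] * env.n_actions for _ in range(env.n_states)]
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            # epsilon-greedy: mostly exploit, occasionally explore
            if random.random() < epsilon:
                a = random.randrange(env.n_actions)
            else:
                a = max(range(env.n_actions), key=lambda b: Q[s][b])
            s2, reward, done = env.step(s, a)
            # Q(s,a) = r(s,a) + gamma * max_a' Q(delta(s,a), a')
            Q[s][a] = reward + gamma * max(Q[s2])
            s = s2
    return Q
```

The ε-greedy exploration is what the convergence claim leans on: every state–action pair must be tried again and again, which is also why convergence can be slow in practice.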

Model-based learning: Dyna
  • A form of planning is performed in addition to learning.
  • Learning updates the appropriate value-function estimates according to real experience.
  • Planning updates the same value-function estimates for simulated transitions chosen from the world model (see the sketch below).
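A sketch of that learning-plus-planning loop under the same assumed interface. The number of planning steps per real step and the optional Q argument, which lets learning continue across experiment phases, are illustrative choices rather than details from the slides.

```python
import random

def dyna_q(env, episodes=50, planning_steps=20, gamma=0.9, epsilon=0.1, Q=None):
    """Dyna-style learning + planning on the (hypothetical) GridWorld."""
    Q = Q or [[0.0] * env.n_actions for _ in range(env.n_states)]
    model = {}  # learned world model: (state, action) -> (reward, next state)
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            if random.random() < epsilon:
                a = random.randrange(env.n_actions)
            else:
                a = max(range(env.n_actions), key=lambda b: Q[s][b])
            s2, reward, done = env.step(s, a)
            Q[s][a] = reward + gamma * max(Q[s2])   # learning: real experience
            model[(s, a)] = (reward, s2)            # record the transition
            # planning: replay simulated transitions drawn from the model
            for _ in range(planning_steps):
                (ps, pa), (pr, ps2) = random.choice(list(model.items()))
                Q[ps][pa] = pr + gamma * max(Q[ps2])
            s = s2
    return Q
```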
Problems considered
  • Blocking problem: a route the agent has already learned is walled off partway through the experiment (see the sketch below).
  • Shortcut problem: a shorter route than the learned one is opened up partway through the experiment.
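One way the blocking variant might be wired up from the hypothetical pieces above; the grid size, blocked cell, and episode counts are invented for illustration.

```python
# Phase 1: learn a route across an open 6x9 grid.
env = GridWorld(rows=6, cols=9, goal=(0, 8))
Q = dyna_q(env, episodes=100)

# Phase 2: wall off a cell on the learned route and keep learning.
# The replayed model transitions are what let the agent adapt quickly.
env.walls.add((0, 4))
Q = dyna_q(env, episodes=100, Q=Q)
```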
Conclusion
  • Model-based learning performs significantly better than Q-Learning.
  • On the blocking and shortcut problems, the agent demonstrates latent learning.
Acknowledgements
  • Prof. Littman
  • Prof. Horatiu Voicu