Latent Learning in Agents

Presentation Transcript

### Latent Learning in Agents

ICML 03

Robotics/Vision Workshop

Rati Sharma

Problem statement
• Determine optimal paths in spatial navigation tasks.
• We use a deterministic grid environment as our world model.
• Various approaches have been used: ANNs, Q-learning, and Dyna.
Latent Learning
• Tolman proposed the idea of cognitive maps based on experiments with rats in a T-maze.
• Latent Learning is described as learning that is not evident to the observer at the time it occurs but is apparent once a reinforcement is introduced.
Algorithms used: Q-Learning
• Q-Learning eventually converges to an optimal policy without ever having to learn or use an internal model of the environment.
• Update rule

Q(s,a) = r(s,a) + *max Q((s,a),a’)

r(s,a) is the reward function,

(s,a) is the new state

Disadvantage: Convergence to the optimal policy can be very slow
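The deterministic update rule above can be sketched as follows. This is a minimal illustration, not the presentation's actual implementation: the 3×3 grid, the goal position, the discount factor γ = 0.9, and random exploration are all illustrative assumptions.

```python
import random

# Deterministic Q-learning on a toy 3x3 grid (hypothetical setup).
# Update rule: Q(s,a) = r(s,a) + gamma * max_a' Q(delta(s,a), a')
GAMMA = 0.9
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # right, left, down, up
SIZE, GOAL = 3, (2, 2)

def delta(s, a):
    """Deterministic transition: move if in bounds, otherwise stay put."""
    nxt = (s[0] + a[0], s[1] + a[1])
    return nxt if 0 <= nxt[0] < SIZE and 0 <= nxt[1] < SIZE else s

def reward(s, a):
    """Reward 1.0 for entering the goal cell, 0.0 otherwise."""
    return 1.0 if delta(s, a) == GOAL else 0.0

# Tabular Q function over all state-action pairs, initialized to zero.
Q = {((i, j), a): 0.0
     for i in range(SIZE) for j in range(SIZE) for a in ACTIONS}

random.seed(0)
for _ in range(2000):               # many episodes of random exploration
    s = (0, 0)
    while s != GOAL:
        a = random.choice(ACTIONS)
        s2 = delta(s, a)
        # Deterministic update: no learning rate needed.
        Q[(s, a)] = reward(s, a) + GAMMA * max(Q[(s2, a2)] for a2 in ACTIONS)
        s = s2
```

After convergence, Q values along the optimal path equal γ raised to the number of remaining steps minus one, which is why convergence requires each state-action pair to be visited repeatedly and is slow in larger worlds.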

Model-based learning - Dyna
• A form of planning is performed in addition to learning.
• Learning updates the appropriate value function estimates according to experience.
• Planning updates the same value function estimates for simulated transitions chosen from the world model.
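The two bullets above can be sketched as a Dyna-style loop: each real step updates the value function from experience (learning) and also records the transition in a learned world model, from which extra simulated updates are replayed (planning). This is a minimal sketch; the grid layout, the discount factor, and the choice of 10 planning steps per real step are illustrative assumptions, not from the slides.

```python
import random

# Minimal Dyna-style loop on a toy deterministic 3x3 grid (hypothetical setup).
GAMMA, N_PLAN = 0.9, 10
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]
SIZE, GOAL = 3, (2, 2)

def step(s, a):
    """World: deterministic move, reward 1.0 on entering the goal."""
    nxt = (s[0] + a[0], s[1] + a[1])
    nxt = nxt if 0 <= nxt[0] < SIZE and 0 <= nxt[1] < SIZE else s
    return nxt, (1.0 if nxt == GOAL else 0.0)

Q = {((i, j), a): 0.0
     for i in range(SIZE) for j in range(SIZE) for a in ACTIONS}
model = {}                          # learned world model: (s, a) -> (s', r)

random.seed(1)
for _ in range(200):                # far fewer episodes than model-free Q-learning
    s = (0, 0)
    while s != GOAL:
        a = random.choice(ACTIONS)
        s2, r = step(s, a)
        # Learning: update the value estimate from real experience.
        Q[(s, a)] = r + GAMMA * max(Q[(s2, a2)] for a2 in ACTIONS)
        model[(s, a)] = (s2, r)
        # Planning: replay simulated transitions drawn from the model.
        for ps, pa in random.sample(list(model), min(N_PLAN, len(model))):
            ms, mr = model[(ps, pa)]
            Q[(ps, pa)] = mr + GAMMA * max(Q[(ms, a2)] for a2 in ACTIONS)
        s = s2
```

Because planning reuses stored transitions, value information propagates through the grid much faster per unit of real experience, which is the advantage Dyna holds over plain Q-learning.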
Problems considered
• Blocking problem: a previously learned path to the goal is blocked, forcing the agent to find a detour.
• Shortcut problem: a shorter path to the goal opens up after the agent has already learned a longer one.
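The blocking problem can be illustrated independently of any learning algorithm: when a wall appears on the learned shortest path, the optimal path length changes, and an agent with an internal world model can replan without relearning from scratch. The grid size, goal position, and wall placement below are illustrative assumptions.

```python
from collections import deque

# Toy illustration of the blocking problem (hypothetical layout):
# a wall appears on the direct path, forcing a detour.
SIZE, GOAL = 3, (0, 2)
walls_before = set()          # phase 1: open grid, direct path available
walls_after = {(0, 1)}        # phase 2: the direct path is blocked

def neighbors(s, walls):
    """Reachable next states under the current wall set."""
    out = []
    for dx, dy in [(0, 1), (0, -1), (1, 0), (-1, 0)]:
        n = (s[0] + dx, s[1] + dy)
        if 0 <= n[0] < SIZE and 0 <= n[1] < SIZE and n not in walls:
            out.append(n)
    return out

def shortest_path_len(start, walls):
    """BFS path length from start to GOAL, or None if unreachable."""
    seen, q = {start}, deque([(start, 0)])
    while q:
        s, d = q.popleft()
        if s == GOAL:
            return d
        for n in neighbors(s, walls):
            if n not in seen:
                seen.add(n)
                q.append((n, d + 1))
    return None
```

Here the optimal path from (0, 0) lengthens from 2 steps to 4 once the wall appears; a model-based agent corrects the affected transitions in its model and replans, while a model-free learner must re-experience the changed region many times.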
Conclusion
• Model-based learning performs significantly better than Q-Learning.
• On the blocking and shortcut problems, the agent demonstrates latent learning.
Acknowledgements
• Prof. Littman
• Prof. Horatiu Voicu