
Latent Learning in Agents

ICML 2003

Robotics/Vision Workshop

Rati Sharma

Problem statement
  • Determine optimal paths in spatial navigation tasks.
  • We use a deterministic grid environment as our world model (a minimal sketch follows this list).
  • Various approaches have been used: ANNs, Q-learning, and Dyna.
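The slides do not spell out the environment, so the sketch below shows one plausible deterministic grid world in Python; the GridWorld class, its start cell, goal reward, and wall handling are all assumptions made for illustration.

```python
class GridWorld:
    """Minimal deterministic grid world (hypothetical: the slides do not
    specify the layout, start cell, or reward scheme)."""

    ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

    def __init__(self, rows, cols, goal, walls=()):
        self.rows, self.cols = rows, cols
        self.goal = goal                 # (row, col) of the rewarded cell
        self.walls = set(walls)          # blocked (row, col) cells
        self.n_states = rows * cols
        self.n_actions = len(self.ACTIONS)

    def reset(self):
        return 0                         # always start in the top-left cell

    def step(self, state, action):
        """One deterministic transition: returns (new state, reward, done)."""
        r, c = divmod(state, self.cols)
        dr, dc = self.ACTIONS[action]
        nr, nc = r + dr, c + dc
        # moves off the grid or into a wall leave the agent in place
        if not (0 <= nr < self.rows and 0 <= nc < self.cols) or (nr, nc) in self.walls:
            nr, nc = r, c
        done = (nr, nc) == self.goal
        return nr * self.cols + nc, (1.0 if done else 0.0), done
```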
Latent Learning
  • Tolman proposed the idea of cognitive maps based on experiments with rats in a T-maze.
  • Latent learning is learning that is not evident to the observer at the time it occurs, but becomes apparent once reinforcement is introduced.
Algorithms used: Q-Learning
  • Q-Learning eventually converges to an optimal policy without ever learning or using an internal model of the environment.
  • Update rule

Q(s, a) = r(s, a) + γ · max_a′ Q(δ(s, a), a′)

where r(s, a) is the reward function, δ(s, a) is the new state given by the deterministic transition function, and γ is the discount factor.

Disadvantage: Convergence to the optimal policy can be very slow
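A minimal tabular sketch of this update rule, assuming the hypothetical GridWorld interface above. Because the environment is deterministic, a direct assignment with no learning rate matches the slide's equation; the episode count and exploration rate are illustrative.

```python
import random

def q_learning(env, episodes=500, gamma=0.9, epsilon=0.1):
    """Tabular Q-learning against the (hypothetical) GridWorld above."""
    Q = [[0.0] * env.n_actions for _ in range(env.n_states)]
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            # epsilon-greedy: mostly exploit, occasionally explore
            if random.random() < epsilon:
                a = random.randrange(env.n_actions)
            else:
                a = max(range(env.n_actions), key=lambda b: Q[s][b])
            s2, reward, done = env.step(s, a)
            # Q(s,a) = r(s,a) + gamma * max_a' Q(delta(s,a), a')
            Q[s][a] = reward + gamma * max(Q[s2])
            s = s2
    return Q
```

The ε-greedy exploration is what the convergence claim leans on: every state–action pair must be tried again and again, which is also why convergence can be slow in practice.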

Model-based learning: Dyna
  • A form of planning is performed in addition to learning.
  • Learning updates the appropriate value-function estimates according to real experience.
  • Planning updates the same value-function estimates for simulated transitions chosen from the world model (see the sketch below).
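A sketch of that learning-plus-planning loop under the same assumed interface. The number of planning steps per real step and the optional Q argument, which lets learning continue across experiment phases, are illustrative choices rather than details from the slides.

```python
import random

def dyna_q(env, episodes=50, planning_steps=20, gamma=0.9, epsilon=0.1, Q=None):
    """Dyna-style learning + planning on the (hypothetical) GridWorld."""
    Q = Q or [[0.0] * env.n_actions for _ in range(env.n_states)]
    model = {}  # learned world model: (state, action) -> (reward, next state)
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            if random.random() < epsilon:
                a = random.randrange(env.n_actions)
            else:
                a = max(range(env.n_actions), key=lambda b: Q[s][b])
            s2, reward, done = env.step(s, a)
            Q[s][a] = reward + gamma * max(Q[s2])   # learning: real experience
            model[(s, a)] = (reward, s2)            # record the transition
            # planning: replay simulated transitions drawn from the model
            for _ in range(planning_steps):
                (ps, pa), (pr, ps2) = random.choice(list(model.items()))
                Q[ps][pa] = pr + gamma * max(Q[ps2])
            s = s2
    return Q
```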
Problems considered
  • Blocking problem: a route the agent has already learned is walled off partway through the experiment (see the sketch below).
  • Shortcut problem: a shorter route than the learned one is opened up partway through the experiment.
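One way the blocking variant might be wired up from the hypothetical pieces above; the grid size, blocked cell, and episode counts are invented for illustration.

```python
# Phase 1: learn a route across an open 6x9 grid.
env = GridWorld(rows=6, cols=9, goal=(0, 8))
Q = dyna_q(env, episodes=100)

# Phase 2: wall off a cell on the learned route and keep learning.
# The replayed model transitions are what let the agent adapt quickly.
env.walls.add((0, 4))
Q = dyna_q(env, episodes=100, Q=Q)
```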
Conclusion
  • Model-based learning performs significantly better than Q-Learning.
  • On the blocking and shortcut problems, the agent demonstrates latent learning.
Acknowledgements
  • Prof. Littman
  • Prof. Horatiu Voicu