Latent Learning in Agents
ICML 03 Robotics/Vision Workshop
Rati Sharma

Problem statement
Determine optimal paths in spatial navigation tasks.
We use a deterministic grid environment as our world model.
Various approaches have been used: ANNs, Q-learning, Dyna, latent learning.
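$Q(s,a) = r(s,a) + \gamma \max_{a'} Q(\delta(s,a), a')$
$r(s,a)$ is the reward function,
$\delta(s,a)$ is the new state reached by taking action $a$ in state $s$,
$\gamma$ is the discount factor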
Disadvantage: Convergence to the optimal policy can be very slow
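The update rule above can be sketched on a small deterministic grid. This is a minimal illustration under assumptions, not the setup used in the talk: the 4x4 grid size, the goal cell, the reward of 1 for entering the goal, the discount value 0.9, the random exploration scheme, and the helper names `delta`, `reward`, `train`, and `greedy_path` are all hypothetical.

```python
import random

ROWS, COLS = 4, 4            # hypothetical grid size
GOAL = (3, 3)                # hypothetical goal cell
GAMMA = 0.9                  # discount factor (assumed value)
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def delta(state, action):
    """Deterministic transition: move one cell, staying put at a wall."""
    dr, dc = ACTIONS[action]
    row = min(max(state[0] + dr, 0), ROWS - 1)
    col = min(max(state[1] + dc, 0), COLS - 1)
    return (row, col)


def reward(state, action):
    """Assumed reward: 1 for a move that enters the goal, 0 otherwise."""
    return 1.0 if delta(state, action) == GOAL else 0.0


def train(episodes=500, max_steps=100, seed=0):
    """Fill the Q-table by random exploration with the deterministic update."""
    rng = random.Random(seed)
    states = [(r, c) for r in range(ROWS) for c in range(COLS)]
    q = {s: {a: 0.0 for a in ACTIONS} for s in states}
    for _ in range(episodes):
        s = rng.choice(states)
        for _ in range(max_steps):
            if s == GOAL:          # the goal ends the episode
                break
            a = rng.choice(list(ACTIONS))
            s_next = delta(s, a)
            # Q(s,a) = r(s,a) + gamma * max_a' Q(delta(s,a), a')
            q[s][a] = reward(s, a) + GAMMA * max(q[s_next].values())
            s = s_next
    return q


def greedy_path(q, start, max_steps=50):
    """Follow the highest-valued action from start until the goal."""
    path = [start]
    while path[-1] != GOAL and len(path) <= max_steps:
        s = path[-1]
        best = max(q[s], key=q[s].get)
        path.append(delta(s, best))
    return path


if __name__ == "__main__":
    q_table = train()
    print(greedy_path(q_table, (0, 0)))
```

Reward information reaches a cell only after its successor cells have been updated, so cells far from the goal need many visits before their values settle. This is one way to see the slow convergence noted above.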