Sample-based Planning for Continuous Action Markov Decision Processes [on robots]. Ari Weinstein. Reinforcement Learning (RL): an agent takes an action in the world and receives information, including a numerical reward. How does it learn to maximize that reward?
Composing pieces in this manner is novel
<s,a>, get <r,s’>
R((p,v), a) = -(p^2 + a^2)
Uniformly distributed noise of +/- 0.05 units is added to actions
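A minimal Python sketch of the reward and action noise stated on these slides. The function names `reward` and `noisy_action` and the constant `NOISE_HALF_WIDTH` are hypothetical, chosen for illustration. The sketch assumes p is position and v is velocity, that the reward is the negated sum of squared position and squared action, and that the noise is drawn uniformly from [-0.05, 0.05] and added to the commanded action. The slides do not say whether the reward uses the commanded or the perturbed action, and the transition dynamics are not given here, so both are left out.

```python
import random

# Half-width of the uniform action noise, taken from the slide (+/- 0.05 units).
NOISE_HALF_WIDTH = 0.05


def reward(p, v, a):
    """Reward for state (p, v) and action a: the negated sum of p^2 and a^2.

    v is part of the state but does not appear in the reward.
    """
    return -(p ** 2 + a ** 2)


def noisy_action(a, rng=random):
    """Return a perturbed by uniform noise in [-0.05, 0.05].

    Assumption: the noise is additive on the commanded action.
    """
    return a + rng.uniform(-NOISE_HALF_WIDTH, NOISE_HALF_WIDTH)


if __name__ == "__main__":
    print(reward(1.0, 0.0, 2.0))  # -5.0
    rng = random.Random(0)        # seeded generator for repeatable noise
    print(noisy_action(0.3, rng))
```

Passing a seeded `random.Random` instance keeps the noise repeatable across planning runs, which helps when comparing planners on the same noise sequence.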
Order is red, green, blue, yellow, magenta, cyan