
CS b659: Intelligent Robotics


Presentation Transcript


  1. CS b659: Intelligent Robotics Planning Under Uncertainty

  2. [Pipeline diagram] Offline learning (e.g., calibration) -> Prior knowledge (often probabilistic); Sensors -> Perception (e.g., filtering, SLAM) -> Decision-making (control laws, optimization, planning) -> Actions -> Actuators

  3. Dealing with Uncertainty • Sensing uncertainty: localization error, noisy maps, misclassified objects • Motion uncertainty: noisy natural processes, odometry drift, imprecise actuators, uncontrolled agents (treat as state)

  4. Motion Uncertainty [Figure: paths from start s to goal g among obstacles]

  5. Dealing with Motion Uncertainty • Hierarchical strategies: use the generated trajectory as input to a low-level feedback controller • Reactive strategies: Approach #1, re-optimize when the state has diverged from the planned path (online optimization); Approach #2, precompute optimal controls over the state space and just read off the control at the perturbed state (offline optimization) • Proactive strategies: explicitly consider future uncertainty, via heuristics (e.g., grow obstacles, penalize nearness) or Markov Decision Processes; online and offline approaches

  6. Proactive Strategies to Handle Motion Uncertainty [Figure: candidate paths from start s to goal g among obstacles]

  7. Dynamic collision avoidance assuming worst-case behaviors • Optimizing safety in real time under a worst-case behavior model [Figure: robot and obstacle trajectories, labels T=1, T=2, TTPF=2.6]

  8. Markov Decision Process Approaches (Alterovitz et al., 2007)
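
The MDP approach precomputes a policy over the whole state space (the "offline optimization" option from slide 5). Below is a minimal value-iteration sketch on a toy grid with noisy motion; the grid, noise model, and rewards are illustrative assumptions, not taken from the lecture.

```python
# Minimal value-iteration sketch for planning under motion uncertainty.
# The grid, noise model, and rewards are illustrative assumptions.
import numpy as np

ROWS, COLS = 5, 5
GOAL = (4, 4)
OBSTACLES = {(2, 1), (2, 2), (2, 3)}
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
GAMMA, P_INTENDED = 0.95, 0.8  # remaining probability mass slips sideways

def step(state, delta):
    """Apply a move, staying in place if it would leave the grid or hit an obstacle."""
    r, c = state[0] + delta[0], state[1] + delta[1]
    if 0 <= r < ROWS and 0 <= c < COLS and (r, c) not in OBSTACLES:
        return (r, c)
    return state

def transitions(state, action):
    """Probabilistic motion: intended move with prob 0.8, each lateral slip with prob 0.1."""
    dr, dc = ACTIONS[action]
    lateral = [(dc, dr), (-dc, -dr)]  # perpendicular slips
    out = [(P_INTENDED, step(state, (dr, dc)))]
    out += [((1 - P_INTENDED) / 2, step(state, d)) for d in lateral]
    return out

def value_iteration(iters=200):
    """Back up expected values over the whole grid until (approximately) converged."""
    V = np.zeros((ROWS, COLS))
    for _ in range(iters):
        for r in range(ROWS):
            for c in range(COLS):
                if (r, c) == GOAL or (r, c) in OBSTACLES:
                    continue
                V[r, c] = max(
                    sum(p * (1.0 * (s2 == GOAL) + GAMMA * V[s2])
                        for p, s2 in transitions((r, c), a))
                    for a in ACTIONS
                )
    return V

if __name__ == "__main__":
    print(value_iteration())
```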

  9. Dealing with Sensing Uncertainty • Reactive heuristics often work well: optimistic costs, assume the most-likely state, penalize uncertainty • Proactive strategies: explicitly consider future uncertainty; active sensing (reward actions that yield information gain); Partially Observable Markov Decision Processes (POMDPs)
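
For the active-sensing idea (rewarding information gain), a minimal sketch follows: it scores a sensing action by the expected reduction in belief entropy. The discrete belief, the sensor model, and the door/lock example are assumptions made for illustration.

```python
# Illustrative active-sensing scorer: rank a sensing action by the expected
# reduction in belief entropy. Belief and observation models are assumptions.
import math

def entropy(belief):
    """Shannon entropy of a discrete belief {state: probability}."""
    return -sum(p * math.log2(p) for p in belief.values() if p > 0)

def posterior(belief, obs, obs_model):
    """Bayes update; obs_model[(obs, state)] = P(obs | state)."""
    unnorm = {s: obs_model.get((obs, s), 0.0) * p for s, p in belief.items()}
    z = sum(unnorm.values())
    return {s: p / z for s, p in unnorm.items()} if z > 0 else belief

def expected_information_gain(belief, obs_model, observations):
    """Expected entropy reduction of one sensing action described by obs_model."""
    gain = entropy(belief)
    for obs in observations:
        p_obs = sum(obs_model.get((obs, s), 0.0) * p for s, p in belief.items())
        if p_obs > 0:
            gain -= p_obs * entropy(posterior(belief, obs, obs_model))
    return gain

if __name__ == "__main__":
    belief = {"door_locked": 0.5, "door_unlocked": 0.5}
    # Sensor model for the action "look at the door": P(obs | state)
    look = {("sees_locked", "door_locked"): 0.9, ("sees_unlocked", "door_locked"): 0.1,
            ("sees_locked", "door_unlocked"): 0.2, ("sees_unlocked", "door_unlocked"): 0.8}
    print(expected_information_gain(belief, look, ["sees_locked", "sees_unlocked"]))
```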

  10. Assuming no obstacles in the unknown region and taking the shortest path to the goal

  11. Assuming no obstacles in the unknown region and taking the shortest path to the goal

  12. Assuming no obstacles in the unknown region and taking the shortest path to the goal. Works well for navigation because the space of all possible maps is too large to plan over explicitly, and certainty is monotonically nondecreasing

  13. Assuming no obstacles in the unknown region and taking the shortest path to the goal. Works well for navigation because the space of all possible maps is too large to plan over explicitly, and certainty is monotonically nondecreasing

  14. What if the sensor was directed (e.g., a camera)?

  15. What if the sensor was directed (e.g., a camera)?

  16. What if the sensor was directed (e.g., a camera)?

  17. What if the sensor was directed (e.g., a camera)?

  18. What if the sensor was directed (e.g., a camera)?

  19. What if the sensor was directed (e.g., a camera)?

  20. What if the sensor was directed (e.g., a camera)?

  21. What if the sensor was directed (e.g., a camera)?

  22. What if the sensor was directed (e.g., a camera)?

  23. What if the sensor was directed (e.g., a camera)?

  24. What if the sensor was directed (e.g., a camera)? At this point, it would have made sense to turn a bit more to see more of the unknown map

  25. Another Example [Figure, labels: "?", "Locked?", "Key?", "Key?"]

  26. Active Discovery of Hidden Intent [Figure panels: "No motion", "Perpendicular motion"]

  27. Main Approaches to Partial Observability • Ignore and react: doesn’t know what it doesn’t know • Model and react: knows what it doesn’t know • Model and predict: knows what it doesn’t know AND what it will/won’t know in the future; better decisions, but more components in the implementation and harder computationally

  28. Uncertainty models • Model #1: Nondeterministic uncertainty: f(x,u) -> a set of possible successors • Model #2: Probabilistic uncertainty: P(x’|x,u), a probability distribution over successors x’, given state x and control u • Markov assumption
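
To make the contrast concrete, here is a small sketch representing the same noisy 1-D move under both models; the specific dynamics and probabilities are assumed, not from the slides.

```python
# Contrasting the two uncertainty models on a 1-D integer state with control u.
# The dynamics and noise values below are illustrative assumptions.

def successors_nondeterministic(x, u, a=1):
    """Model #1: f(x, u) returns the SET of possible successors (no probabilities)."""
    return {x + u + e for e in range(-a, a + 1)}

def successors_probabilistic(x, u):
    """Model #2: P(x' | x, u) as a dict mapping each successor to its probability."""
    return {x + u - 1: 0.2, x + u: 0.6, x + u + 1: 0.2}

if __name__ == "__main__":
    print(successors_nondeterministic(0, 1))   # {0, 1, 2}
    print(successors_probabilistic(0, 1))      # {0: 0.2, 1: 0.6, 2: 0.2}
```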

  29. Nondeterministic Uncertainty: Reasoning with sets • x’ = x + e, e ∈ [-a, a] • The belief state grows over time: at t = 0 the state is known exactly; at t = 1, x ∈ [-a, a]; at t = 2, x ∈ [-2a, 2a]; in general, x(t) ∈ [-ta, ta]
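
A minimal sketch of propagating this belief set forward with interval arithmetic (the disturbance bound a and the horizon are assumed values):

```python
# Propagate a nondeterministic belief set x(t) in [lo, hi] under x' = x + e, e in [-a, a].
# The disturbance bound a and the horizon are illustrative assumptions.

def propagate_interval(lo, hi, a, steps):
    """Each step widens the belief interval by a on both sides."""
    history = [(lo, hi)]
    for _ in range(steps):
        lo, hi = lo - a, hi + a
        history.append((lo, hi))
    return history

if __name__ == "__main__":
    # Starting from a perfectly known state x(0) = 0 with a = 1:
    for t, (lo, hi) in enumerate(propagate_interval(0, 0, a=1, steps=3)):
        print(f"t={t}: x in [{lo}, {hi}]")   # [-t, t], i.e. [-ta, ta]
```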

  30. Uncertainty with Sensing • Plan = policy (mapping from states to actions) • The policy achieves the goal for every possible sensor result in the belief state • Observations should be chosen wisely to keep the branching factor low [Figure: tree alternating Move and Sense steps, with outcomes 1-4] • Special case: fully observable state
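
A tiny sketch of what such a policy looks like in code: a table from observation histories to actions, executed in a closed loop that branches on every sensor result. The states, observations, and actions are assumptions chosen to echo the locked-door example above.

```python
# Minimal sketch of a conditional plan (policy) that branches on sensor results.
# The policy table, observations, and actions are illustrative assumptions.

# A policy maps what the robot has observed so far to the next action.
policy = {
    (): "move_forward",
    ("door_open",): "go_through",
    ("door_closed",): "sense_for_key",
    ("door_closed", "key_seen"): "pick_up_key",
    ("door_closed", "no_key"): "find_another_route",
}

def execute(policy, sense, max_steps=10):
    """Closed-loop execution: act, observe, and branch on the sensor result."""
    history = ()
    for _ in range(max_steps):
        action = policy.get(history)
        if action is None:
            break
        print(f"history={history} -> action={action}")
        obs = sense(action)          # the environment returns an observation
        if obs is None:              # terminal: no further observation
            break
        history = history + (obs,)

if __name__ == "__main__":
    scripted = iter(["door_closed", "key_seen", None])   # one scripted run
    execute(policy, sense=lambda action: next(scripted))
```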

  31. Target Tracking [Figure: robot and target among obstacles] • The robot must keep a target in its field of view • The robot has a prior map of the obstacles • But it does not know the target’s trajectory in advance

  32. Target-Tracking Example [Figure: robot and target on a grid] • Time is discretized into small steps of unit duration • At each time step, each of the two agents moves by at most one increment along a single axis • The two moves are simultaneous • The robot senses the new position of the target at each step • The target is not influenced by the robot (non-adversarial, non-cooperative target)

  33. Time-Stamped States (no cycles possible) • State = (robot-position, target-position, time) • In each state, the robot can execute 5 possible actions: {stop, up, down, right, left} • Each action has 5 possible outcomes (one for each possible action of the target), with some probability distribution • Example: from ([i,j], [u,v], t), the action "right" leads to ([i+1,j], [u,v], t+1), ([i+1,j], [u-1,v], t+1), ([i+1,j], [u+1,v], t+1), ([i+1,j], [u,v-1], t+1), or ([i+1,j], [u,v+1], t+1) • [Potential collisions are ignored to simplify the presentation]
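
A sketch that enumerates these successors for one robot action; the uniform distribution over the target's five moves is an assumption (the slide only says "some probability distribution").

```python
# Enumerate successors of a time-stamped state (robot, target, t) in the
# target-tracking problem. A uniform distribution over the target's 5 moves
# is an illustrative assumption.

MOVES = {"stop": (0, 0), "up": (0, 1), "down": (0, -1), "right": (1, 0), "left": (-1, 0)}

def successors(state, robot_action):
    """Return [(probability, next_state)]; both agents move simultaneously."""
    (ri, rj), (tu, tv), t = state
    dri, drj = MOVES[robot_action]
    robot_next = (ri + dri, rj + drj)
    out = []
    for dtu, dtv in MOVES.values():          # one outcome per possible target move
        target_next = (tu + dtu, tv + dtv)
        out.append((1 / len(MOVES), (robot_next, target_next, t + 1)))
    return out

if __name__ == "__main__":
    for p, s in successors(((0, 0), (3, 2), 0), "right"):
        print(p, s)
```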

  34. Rewards and Costs • The robot must keep seeing the target as long as possible • Each state where it does not see the target is terminal • The reward collected in every non-terminal state is 1; it is 0 in each terminal state [the sum of the rewards collected in an execution run is exactly the amount of time the robot sees the target] • No cost for moving vs. not moving

  35. Expanding the state/action tree [Figure: tree expanded from the current state out through horizon 1, ..., horizon h]

  36. Assigning rewards • Terminal states: states where the target is not visible • Rewards: 1 in non-terminal states; 0 in terminal states • But how do we estimate the utility of a leaf at horizon h? [Figure: the same tree, with leaves at horizon h]

  37. Estimating the utility of a leaf [Figure: shortest distance d from the target to the edge of the robot’s field of view] • Compute the shortest distance d for the target to escape the robot’s current field of view • If the maximal velocity v of the target is known, estimate the utility of the state as d/v [conservative estimate]
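
A sketch of that leaf estimate, assuming the field of view is given as a set of grid cells and the target moves one cell per step; the grid representation is illustrative scaffolding, not the lecture's implementation.

```python
# Leaf-utility estimate for target tracking: shortest escape distance d of the
# target from the robot's field of view, divided by the target's max speed v.
# The grid representation and visibility set are illustrative assumptions.
from collections import deque

def escape_distance(target, visible_cells):
    """BFS from the target: number of steps to reach a cell outside the field of view."""
    frontier = deque([(target, 0)])
    seen = {target}
    while frontier:
        (x, y), d = frontier.popleft()
        if (x, y) not in visible_cells:          # escaped the field of view
            return d
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, d + 1))
    return 0

def leaf_utility(target, visible_cells, v_max=1.0):
    """Conservative estimate: the target needs at least d / v_max steps to escape."""
    return escape_distance(target, visible_cells) / v_max

if __name__ == "__main__":
    fov = {(x, y) for x in range(5) for y in range(5)}   # 5x5 visible patch
    print(leaf_utility((2, 2), fov))                      # 3.0: three moves to leave the patch
```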

  38. Selecting the next action • Compute the optimal policy over the state/action tree using the estimated utilities at the leaf nodes • Execute only the first step of this policy • Repeat everything again at t+1 (sliding horizon) [Figure: the same horizon-h tree]
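
Putting the pieces together, a sketch of the sliding-horizon lookahead: back up expected utilities through the depth-h tree, use the leaf estimate at the horizon, and return only the first action. The generic interfaces (actions, successors, reward, is_terminal, leaf_utility) are assumptions standing in for the problem-specific pieces above.

```python
# Sliding-horizon action selection: expand the state/action tree to depth h,
# back up expected utilities, and commit only to the first action.
# The callables passed in are assumed interfaces; only the control logic is shown.

def lookahead_value(state, depth, actions, successors, reward, is_terminal, leaf_utility):
    """Expected utility of `state` with `depth` steps of lookahead remaining."""
    if is_terminal(state):
        return 0.0
    if depth == 0:
        return leaf_utility(state)       # heuristic estimate at the horizon
    best = float("-inf")
    for a in actions(state):
        value = sum(p * lookahead_value(s2, depth - 1, actions, successors,
                                        reward, is_terminal, leaf_utility)
                    for p, s2 in successors(state, a))
        best = max(best, reward(state) + value)
    return best

def select_action(state, horizon, actions, successors, reward, is_terminal, leaf_utility):
    """Return the first action of the best policy over the depth-`horizon` tree."""
    return max(actions(state),
               key=lambda a: sum(p * lookahead_value(s2, horizon - 1, actions, successors,
                                                     reward, is_terminal, leaf_utility)
                                 for p, s2 in successors(state, a)))
```

Executing the returned action, observing the target's new position, and calling select_action again at t+1 is the sliding-horizon scheme described on the slide.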

  39. Pure Visual Servoing

  40. Computing and Using a Policy

  41. Next week • More algorithm details

  42. Final information • Final presentations: 20 minutes + 10 minutes of questions • Final reports due 5/3 • Must include a technical report: Introduction, Background, Methods, Results, Conclusion • May include an auxiliary website with figures / examples / implementation details • I will not review code
