
Belief space planning assuming maximum likelihood observations

Belief space planning assuming maximum likelihood observations. Robert Platt, Russ Tedrake, Leslie Kaelbling, Tomas Lozano-Perez. Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology. June 30, 2010.


Presentation Transcript


  1. Belief space planning assuming maximum likelihood observations. Robert Platt, Russ Tedrake, Leslie Kaelbling, Tomas Lozano-Perez. Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology. June 30, 2010

  2. Planning from a manipulation perspective (image from www.programmingvision.com, Rosen Diankov) • The “system” being controlled includes both the robot and the objects being manipulated. • Motion plans are useless if the environment is misperceived. • Perception can be improved by interacting with the environment: move the head, push objects, feel objects, etc.

  3. The general problem: planning under uncertainty • Planning and control with: • Imperfect state information • Continuous states, actions, and observations (as in most robotics problems) (figure: N. Roy, et al.)

  4. Strategy: plan in belief space 1. Redefine the problem in a “belief” state space. 2. Convert the underlying dynamics into belief space dynamics. 3. Create a plan. (Figure: start and goal shown in the underlying state space and in belief space.)

  5. Related work • Prentice, Roy, The Belief Roadmap: Efficient Planning in Belief Space by Factoring the Covariance, IJRR 2009 • Porta, Vlassis, Spaan, Poupart, Point-based value iteration for continuous POMDPs, JMLR 2006 • Miller, Harris, Chong, Coordinated guidance of autonomous UAVs via nominal belief-state optimization, ACC 2009 • Van den Berg, Abbeel, Goldberg, LQG-MP: Optimized path planning for robots with motion uncertainty and imperfect state information, RSS 2010

  6. Simple example: Light-dark domain • Underlying system: defined over the underlying state and the action • Observations: an observation corrupted by observation noise • State-dependent noise: differs between the “dark” and “light” regions • (Figure labels: start, goal)

  7. Simple example: Light-dark domain • Underlying system: defined over the underlying state and the action • Observations: an observation corrupted by observation noise • State-dependent noise: differs between the “dark” and “light” regions • (Figure labels: start, goal, nominal information gathering plan)
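The slide equations for the light-dark domain did not survive in this transcript, so the following is only a minimal sketch of such a domain under stated assumptions. It assumes the state moves by exactly the commanded action, that observations are the position plus Gaussian noise, and that the noise variance grows quadratically with distance from a light located at x = 5. The constant `LIGHT_X`, the quadratic form, and all function names are hypothetical.

```python
import math
import random

LIGHT_X = 5.0  # hypothetical x-coordinate of the "light" region (assumption)


def step(state, action):
    """Deterministic process dynamics (assumed): the state moves by the action."""
    return (state[0] + action[0], state[1] + action[1])


def noise_variance(state):
    """State-dependent observation noise variance (assumed quadratic form).

    Small near x = LIGHT_X ("light"), large far away from it ("dark").
    """
    return 0.5 * (LIGHT_X - state[0]) ** 2 + 1e-3


def observe(state, rng):
    """Noisy position observation: z = x + w, with w ~ N(0, noise_variance(x) * I)."""
    sigma = math.sqrt(noise_variance(state))
    return (state[0] + rng.gauss(0.0, sigma), state[1] + rng.gauss(0.0, sigma))


rng = random.Random(0)
dark = (2.0, 2.0)
light = step(dark, (3.0, 0.0))  # move toward the light before localizing
print("variance in the dark:", noise_variance(dark))
print("variance in the light:", noise_variance(light))
print("sample observation in the light:", observe(light, rng))
```

Under these assumptions, a plan that detours through the light region before heading to the goal trades extra motion for much more informative observations, which is the behavior a nominal information gathering plan illustrates.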

  8. Belief system • Underlying system: deterministic process dynamics (state, action) and stochastic observation dynamics (observation) • Belief system: • Approximate the belief state as a Gaussian

  9. Similarity to an underactuated mechanical system • Compared: Acrobot vs. Gaussian belief • State space: • Planning objective: • Underactuated dynamics: ???

  10. Belief space dynamics • Generalized Kalman filter: • (Figure labels: start, goal)

  11. Belief space dynamics are stochastic • Generalized Kalman filter: • But we don’t know the observations at planning time. • (Figure labels: start, goal, unexpected observation)

  12. Plan for the expected observation • Generalized Kalman filter: • Plan for the expected observation: • Model observation stochasticity as Gaussian noise. • We will use feedback and replanning to handle departures from the expected observation.
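To make the idea of planning for the expected observation concrete, here is a hedged sketch of one Kalman-filter belief update for an isotropic Gaussian belief. It assumes dynamics x' = x + u with zero process noise, observations z = x + w, and a hypothetical quadratic noise model with a light at x = 5; none of these specifics come from the slides. When no observation is supplied, the function substitutes the maximum likelihood observation (the predicted mean), so the innovation is zero: the mean simply follows the dynamics while the variance still shrinks.

```python
LIGHT_X = 5.0  # hypothetical location of the light region (assumption)


def noise_variance(mean_x):
    """Assumed observation noise variance, smallest at x = LIGHT_X."""
    return 0.5 * (LIGHT_X - mean_x) ** 2 + 1e-3


def belief_step(mean, var, action, observation=None):
    """One Kalman-filter step for a belief N(mean, var * I).

    If observation is None, the maximum likelihood observation (the
    predicted mean) is used, which makes the update deterministic.
    """
    # Predict: deterministic dynamics, no process noise, so var is unchanged.
    pred = (mean[0] + action[0], mean[1] + action[1])
    if observation is None:
        observation = pred
    # Update: scalar Kalman gain because the covariance is isotropic.
    w = noise_variance(pred[0])
    gain = var / (var + w)
    new_mean = (
        pred[0] + gain * (observation[0] - pred[0]),
        pred[1] + gain * (observation[1] - pred[1]),
    )
    new_var = (1.0 - gain) * var
    return new_mean, new_var


# Moving into the light collapses the variance; staying in the dark does not.
print(belief_step((2.0, 2.0), 5.0, (3.0, 0.0)))
print(belief_step((2.0, 2.0), 5.0, (0.0, 0.0)))
```

At execution time the same function can be called with the real observation, and any gap between the resulting belief and the planned one is what feedback and replanning have to absorb.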

  13. Belief space planning problem • Find a finite-horizon path, starting at a given initial belief, that minimizes a cost function. • Minimize: • Covariance at the final state: minimize state uncertainty along the specified directions. • Action cost: find the least-effort path. • Subject to: the trajectory must reach the specified final state.
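The cost function itself is missing from the transcript. As an illustration only, the two terms named on the slide could be combined as below for an isotropic belief. The quadratic penalty on the final variance, the weights, and the function names are assumptions, not the authors' exact formulation.

```python
import math


def trajectory_cost(final_var, actions, var_weight=100.0, effort_weight=1.0):
    """Assumed cost: penalty on final uncertainty plus summed squared action effort."""
    uncertainty = var_weight * final_var ** 2
    effort = effort_weight * sum(ux * ux + uy * uy for ux, uy in actions)
    return uncertainty + effort


def reaches_goal(final_mean, goal, tol=1e-6):
    """Terminal constraint: the final belief mean must coincide with the goal."""
    return math.hypot(final_mean[0] - goal[0], final_mean[1] - goal[1]) <= tol


actions = [(1.0, 0.0), (0.0, 2.0)]
print(trajectory_cost(0.1, actions))
print(reaches_goal((1.0, 2.0), (1.0, 2.0)))
```

Raising `var_weight` relative to `effort_weight` favors longer, information-gathering detours; lowering it favors the shortest path to the goal.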

  14. Existing planning and control methods apply • Now we can apply: • Motion planning w/ differential constraints (RRT, …) • Policy optimization • LQR • LQR-Trees

  15. Planning method: direct transcription to SQP 1. Parameterize the trajectory by via points. 2. Shift the via points until a local minimum is reached: • Enforce dynamic constraints during shifting. 3. Accomplished by transcribing the control problem into a Sequential Quadratic Programming (SQP) problem: • Only guaranteed to find locally optimal solutions.
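In generic form, direct transcription treats the via points themselves as decision variables and imposes the dynamics as equality constraints between consecutive via points. The statement below is a sketch in assumed notation, none of which is taken from the slides: b_t is the belief at step t, m_T its final mean, u_t the action, F the belief dynamics under maximum likelihood observations, and J the cost function.

```latex
\begin{aligned}
\min_{b_{1:T},\; u_{1:T-1}} \quad & J\left(b_{1:T},\, u_{1:T-1}\right) \\
\text{subject to} \quad & b_{t+1} = F(b_t, u_t), \qquad t = 1, \dots, T-1, \\
& b_1 = b_{\text{start}}, \qquad m_T = x_{\text{goal}}.
\end{aligned}
```

An SQP solver repeatedly approximates this nonlinear program by a quadratic program around the current via points, which is why only a locally optimal trajectory is guaranteed.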

  16. Example: light-dark problem (plot axes: X, Y) • In this case, the covariance is constrained to remain isotropic.

  17. Replanning • Replan when the deviation from the trajectory exceeds a threshold. • (Figure labels: original trajectory, new trajectory, goal)
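The threshold test on this slide is not preserved in the transcript. One simple way such a trigger could look is sketched below, assuming the deviation is measured as the Euclidean distance between the tracked belief mean and the planned belief mean at the same time step. The function name and the choice of distance are hypothetical.

```python
import math


def needs_replan(actual_mean, planned_mean, threshold):
    """Return True when the belief mean has drifted too far from the plan."""
    deviation = math.hypot(
        actual_mean[0] - planned_mean[0],
        actual_mean[1] - planned_mean[1],
    )
    return deviation > threshold


print(needs_replan((0.0, 0.0), (3.0, 4.0), threshold=4.9))
print(needs_replan((0.0, 0.0), (3.0, 4.0), threshold=5.0))
```

A deviation measure could also include the covariance; the mean-only version is the smallest example that shows the mechanism.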

  18. Replanning: light-dark problem (figure legend: planned trajectory, actual trajectory)

  19-31. Replanning: light-dark problem (successive figure frames, no additional text)

  32. Replanning: light-dark problem (figure legend: originally planned path, path actually followed by the system)

  33. Planning vs. Control in Belief Space • Given our specification, we can also apply control methods: • Control methods find a policy, so there is no need to replan. • A policy can stabilize a stochastic system. • (Figure panels: a plan, a control policy)

  34. Control in belief space: B-LQR • In general, finding an optimal policy for a nonlinear system is hard. • Linear quadratic regulation (LQR) is one way to find an approximate policy. • LQR is optimal only for linear systems with Gaussian noise. • Belief space LQR (B-LQR) for the light-dark domain:
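The B-LQR equations for the light-dark domain are not in the transcript. As background, the sketch below shows the standard finite-horizon discrete-time LQR backward recursion on a scalar linear system x' = a x + b u with stage cost q x^2 + r u^2. It is a simplified stand-in, not B-LQR itself: B-LQR would apply the same kind of recursion to belief dynamics linearized about the planned trajectory. All names and parameter values are illustrative.

```python
def lqr_gains(a, b, q, r, q_final, horizon):
    """Backward Riccati recursion; returns feedback gains k_0 .. k_{horizon-1}.

    The resulting policy is u_t = -k_t * (x_t - x_ref_t).
    """
    p = q_final  # cost-to-go weight at the final step
    gains = []
    for _ in range(horizon):
        k = (b * p * a) / (r + b * p * b)
        p = q + a * p * a - a * p * b * k
        gains.append(k)
    gains.reverse()  # computed last-to-first, returned first-to-last
    return gains


gains = lqr_gains(a=1.0, b=1.0, q=1.0, r=1.0, q_final=1.0, horizon=50)

# Closed-loop rollout: the policy drives an initial error toward zero.
x = 1.0
for k in gains:
    x = x + (-k * x)
print("first gain:", gains[0], "final error:", x)
```

Because the gains are computed offline, the policy can react to every disturbance along the way without calling the planner again.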

  35. Combination of planning and control Algorithm: 1. repeat 2. 3. for 4. 5. if then break 6. if belief mean at goal 7. halt
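Several steps of the algorithm above lost their symbols in extraction. The sketch below shows one plausible reading of its control flow, assuming the blank steps are "plan a trajectory" and "execute one stabilized control step", and that the `if ... then break` test is a deviation check. Every function passed in (`plan`, `execute_step`, `deviates`, `at_goal`) is a hypothetical placeholder supplied by the caller.

```python
def plan_and_control(belief, goal, plan, execute_step, deviates, at_goal,
                     max_replans=100):
    """Alternate planning and stabilized execution until the goal is reached."""
    for replans in range(1, max_replans + 1):
        trajectory = plan(belief, goal)          # assumed step 2: (re)plan
        for planned in trajectory:               # step 3: for each time step
            belief = execute_step(belief, planned)   # assumed step 4: control
            if deviates(belief, planned):        # step 5: break and replan
                break
        if at_goal(belief, goal):                # steps 6-7: halt at the goal
            return belief, replans
    raise RuntimeError("goal not reached within max_replans")


# Toy 1-D usage: beliefs are plain floats and tracking is perfect.
def toy_plan(b, g):
    n = max(1, round(abs(g - b)))
    return [b + (g - b) * (i + 1) / n for i in range(n)]


result, replans = plan_and_control(
    0.0, 3.0,
    plan=toy_plan,
    execute_step=lambda b, planned: planned,
    deviates=lambda b, planned: False,
    at_goal=lambda b, g: abs(b - g) < 1e-9,
)
print(result, replans)
```

Passing the planner and controller in as callables keeps the loop independent of any particular choice of trajectory optimizer or stabilizing policy.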

  36. Analysis of replanning with B-LQR stabilization • Theorem: • Eventually (after a finite number of replanning steps), the belief state mean reaches the goal with low covariance. • Conditions: • Zero process noise. • The underlying system is passively critically stable. • Non-zero measurement noise. • SQP finds a path with length < T to the goal belief region from anywhere in the reachable belief space. • The cost function is of the correct form (given earlier).

  37. Laser-grasp domain

  38. Laser-grasp: the plan

  39. Laser-grasp: reality (figure legend: initially planned path, actual path)

  40. Conclusions • Planning for partially observable problems is one of the keys to robustness. • Our work is one of the few methods for partially observable planning in continuous state/action/observation spaces. • We view the problem as an underactuated planning problem in belief space.
