
# Belief space planning assuming maximum likelihood observations



### Belief space planning assuming maximum likelihood observations

Robert Platt

Russ Tedrake, Leslie Kaelbling, Tomas Lozano-Perez

Computer Science and Artificial Intelligence Laboratory,

Massachusetts Institute of Technology

June 30, 2010

Planning from a manipulation perspective

(image from www.programmingvision.com, Rosen Diankov)

• The “system” being controlled includes both the robot and the objects being manipulated.

• Motion plans are useless if the environment is misperceived.

• Perception can be improved by interacting with environment: move head, push objects, feel objects, etc…

The general problem: planning under uncertainty

• Planning and control with:

• Imperfect state information

• Continuous states, actions, and observations

most robotics problems

N. Roy, et al.

Strategy: plan in belief space

(underlying state space)

(belief space)

1. Redefine problem:

“Belief” state space

2. Convert underlying dynamics into belief space dynamics

goal

3. Create plan

start

Related work

• Prentice, Roy, The Belief Roadmap: Efficient Planning in Belief Space by Factoring the Covariance, IJRR 2009

• Porta, Vlassis, Spaan, Poupart, Point-based value iteration for continuous POMDPs, JMLR 2006

• Miller, Harris, Chong, Coordinated guidance of autonomous UAVs via nominal belief-state optimization, ACC 2009

• Van den Berg, Abbeel, Goldberg, LQG-MP: Optimized path planning for robots with motion uncertainty and imperfect state information, RSS 2010

Simple example: Light-dark domain

underlying state

action

Underlying system:

Observations:

observation noise

observation

“dark”

“light”

State dependent noise:

start

goal
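The light-dark domain can be sketched in a few lines. This is a minimal model, not the talk's exact equations (the slide formulas were images): the constants, the light location, and the quadratic form of the noise are assumptions chosen to match the described behavior, where observation noise shrinks in the "light" region and grows in the "dark".

```python
import numpy as np

# Assumed light-dark model: deterministic dynamics, noisy full-state
# observations whose variance depends on distance from the light.

def dynamics(x, u):
    """Deterministic underlying dynamics: x' = x + u."""
    return x + u

def obs_noise_var(x, light_x=5.0):
    """Assumed state-dependent noise: variance grows quadratically
    with horizontal distance from the light at x[0] = light_x."""
    return 0.5 * (light_x - x[0]) ** 2 + 1e-2

def observe(x, rng):
    """Noisy observation of the state; accurate in the light,
    nearly useless in the dark."""
    return x + rng.normal(0.0, np.sqrt(obs_noise_var(x)), size=x.shape)
```

Under this model an information-gathering detour toward the light pays for itself: a few accurate observations there collapse the belief before the robot heads to the goal.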

Simple example: Light-dark domain

underlying state

action

Underlying system:

Observations:

observation noise

observation

“dark”

“light”

State dependent noise:

start

Nominal information gathering plan

goal

Belief system

state

Underlying system:

action

(deterministic process dynamics)

(stochastic observation dynamics)

observation

• Belief system:

• Approximate belief state as a Gaussian
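With the belief approximated as a Gaussian (mean m, covariance S), the belief system is just the filtering recursion. A minimal EKF-style sketch, assuming deterministic process dynamics x' = f(x, u) and full-state observations z = x + w with state-dependent variance R(x); the function names are placeholders, not from the talk:

```python
import numpy as np

# One step of the Gaussian belief dynamics: predict through the
# deterministic process, then correct with the received observation.

def belief_update(m, S, u, z, f, F_jac, R):
    # Predict: deterministic process, so no process-noise term is added.
    m_pred = f(m, u)
    F = F_jac(m, u)
    S_pred = F @ S @ F.T
    # Correct with the observation (H = I for full-state sensing).
    K = S_pred @ np.linalg.inv(S_pred + R(m_pred))
    m_new = m_pred + K @ (z - m_pred)
    S_new = (np.eye(len(m)) - K) @ S_pred
    return m_new, S_new
```

The stochasticity of the belief system lives entirely in the innovation z − m_pred: the covariance update is deterministic, but the mean jumps with whatever observation arrives.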

Similarity to an underactuated mechanical system

Acrobot

Gaussian belief:

State space:

Planning objective:

Underactuated dynamics:

???

Belief space dynamics

goal

start

Generalized Kalman filter:

Belief space dynamics are stochastic

goal

unexpected observation

start

Generalized Kalman filter:

BUT – we don’t know observations at planning time

Plan for the expected observation

Generalized Kalman filter:

Plan for the expected observation:

Model observation stochasticity as Gaussian noise

We will use feedback and replanning to handle departures from the expected observation.
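Under the maximum-likelihood-observation assumption the innovation is taken to be zero at planning time, so the belief dynamics become deterministic and a nominal belief trajectory can be rolled out for any candidate action sequence. A sketch of that rollout (f, F_jac, and R are placeholders for the model functions, not names from the talk):

```python
import numpy as np

# Nominal belief rollout assuming every observation equals its
# expected value: the mean follows the deterministic dynamics, and
# the covariance still contracts through the Kalman recursion.

def ml_rollout(m0, S0, actions, f, F_jac, R):
    traj = [(m0, S0)]
    for u in actions:
        m, S = traj[-1]
        F = F_jac(m, u)
        m = f(m, u)                         # zero innovation: mean is deterministic
        S_pred = F @ S @ F.T
        K = S_pred @ np.linalg.inv(S_pred + R(m))
        S = (np.eye(len(m)) - K) @ S_pred   # expected information gain
        traj.append((m, S))
    return traj
```

This is what makes the planning problem tractable: the planner searches over deterministic belief trajectories, and feedback plus replanning absorbs the observations that differ from their expected values.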

Belief space planning problem

Find a finite-horizon path, starting at the initial belief, that minimizes the cost function:

Minimize:

• Minimize covariance at final state

• Minimize state uncertainty along specified directions.

• Action cost

• Find least effort path

Subject to:

Trajectory must reach this final state
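A cost of the form described above can be written out directly. This is a sketch: the weights alpha and beta and the direction vector n_hat are assumptions, not values from the talk.

```python
import numpy as np

# Assumed cost: final-state uncertainty along chosen directions
# plus an integrated least-effort action penalty.

def trajectory_cost(actions, cov_final, n_hat, alpha=1.0, beta=0.1):
    info_cost = alpha * float(n_hat @ cov_final @ n_hat)   # covariance along n_hat
    effort = beta * sum(float(u @ u) for u in actions)     # action cost
    return info_cost + effort
```

The reach-the-goal requirement enters separately, as an equality constraint on the final belief mean rather than as a cost term.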

Existing planning and control methods apply

• Now we can apply:

• Motion planning w/ differential constraints (RRT, …)

• Policy optimization

• LQR

• LQR-Trees

Planning method: direct transcription to SQP

1. Parameterize the trajectory by via points:

2. Shift the via points until a local minimum is reached:

• Enforce dynamic constraints during shifting

3. This is accomplished by transcribing the control problem into a Sequential Quadratic Programming (SQP) problem.

• Only guaranteed to find locally optimal solutions
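The transcription can be sketched on a toy 1-D light-dark instance. Everything here is an assumption for illustration (horizon, start, noise model, weights): the decision variables are the actions, the objective is final variance plus effort under the ML-observation rollout, and reaching the goal mean is an equality constraint handled by an off-the-shelf SQP solver (SciPy's SLSQP).

```python
import numpy as np
from scipy.optimize import minimize

# Toy 1-D direct transcription of the belief-space planning problem.
T, x0, s0, goal = 8, 2.0, 5.0, 0.0   # horizon, start mean/variance, goal (assumed)

def final_belief(u):
    m, s = x0, s0
    for ut in u:
        m = m + ut                        # deterministic dynamics
        r = 0.5 * (5.0 - m) ** 2 + 1e-2   # assumed light-dark observation noise
        s = s - s * s / (s + r)           # Kalman variance contraction
    return m, s

cost = lambda u: final_belief(u)[1] + 0.1 * float(np.dot(u, u))
cons = {"type": "eq", "fun": lambda u: final_belief(u)[0] - goal}
res = minimize(cost, np.zeros(T), method="SLSQP", constraints=[cons])
```

As the slides note, SLSQP only finds a local optimum, so the quality of the plan depends on the initial guess for the via points.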

Example: light-dark problem

X

Y

• In this case, covariance is constrained to remain isotropic

Replanning

New trajectory

goal

Original trajectory

• Replan when deviation from trajectory exceeds a threshold:
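The replanning rule above fits in a short loop. A sketch under stated assumptions: `plan`, `step`, `dist`, and `at_goal` are placeholders for the SQP planner, the actual belief update, the deviation measure, and the goal test, and the threshold value is illustrative.

```python
# Follow the nominal belief trajectory; replan from the current belief
# whenever the executed belief drifts too far from the nominal one.

def execute_with_replanning(b0, plan, step, dist, at_goal,
                            theta=1.0, max_replans=50):
    b = b0
    for _ in range(max_replans):
        nominal = plan(b)                 # list of (nominal belief, action) pairs
        for b_nom, u in nominal:
            b = step(b, u)                # actual (possibly stochastic) update
            if dist(b, b_nom) > theta:
                break                     # deviation exceeded threshold -> replan
        if at_goal(b):
            return b
    return b
```

Because planning assumed the maximum-likelihood observations, an unexpected observation is exactly the event that triggers a break out of the inner loop and a fresh plan.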

Replanning: light-dark problem

Planned trajectory

Actual trajectory

Replanning: light-dark problem

Originally planned path

Path actually followed by system

Planning vs. Control in Belief Space

• Given our specification, we can also apply control methods:

• Control methods find a policy – don’t need to replan

• A policy can stabilize a stochastic system

A plan

A control policy

Control in belief space: B-LQR

• In general, finding an optimal policy for a nonlinear system is hard.

• Linear quadratic regulation (LQR) is one way to find an approximate policy

• LQR is optimal only for linear systems w/ Gaussian noise.

Belief space LQR (B-LQR) for light-dark domain:
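The LQR machinery itself is standard once the belief dynamics are linearized about the nominal trajectory. A sketch of the finite-horizon backward Riccati recursion (the belief-space linearizations A_t, B_t are problem-specific and assumed given here):

```python
import numpy as np

# Time-varying LQR backup: given linearizations (A_t, B_t) of the
# belief dynamics along the nominal trajectory and quadratic costs
# Q, R, Qf, compute the feedback gains K_t.

def lqr_gains(A, B, Q, R, Qf):
    P, K = Qf, []
    for A_t, B_t in zip(reversed(A), reversed(B)):
        K_t = np.linalg.solve(R + B_t.T @ P @ B_t, B_t.T @ P @ A_t)
        P = Q + A_t.T @ P @ (A_t - B_t @ K_t)
        K.append(K_t)
    return list(reversed(K))  # apply u_t = -K_t (b_t - b_nominal_t)
```

As the slide warns, this policy is only approximately optimal: the belief dynamics are nonlinear, so the gains are trustworthy only near the nominal trajectory, which is why replanning is still needed.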

Combination of planning and control

Algorithm:

1. repeat

2. plan a belief-space trajectory from the current belief

3. for each step along the planned trajectory

4. execute the B-LQR policy and update the belief

5. if the belief deviates from the plan by more than a threshold, then break

6. if belief mean at goal

7. halt

Analysis of replanning with B-LQR stabilization

• Theorem:

• Eventually (after finite replanning steps) belief state mean reaches goal with low covariance.

• Conditions:

• Zero process noise.

• Underlying system passively critically stable

• Non-zero measurement noise.

• SQP finds a path with length < T to the goal belief region from anywhere in the reachable belief space.

• Cost function is of correct form (given earlier).

Laser-grasp domain

Laser-grasp: the plan

Laser-grasp: reality

Initially planned path

Actual path

Conclusions

• Planning for partially observable problems is one of the keys to robustness.

• Our work is one of the few methods for partially observable planning in continuous state/action/observation spaces.

• We view the problem as an underactuated planning problem in belief space.