
Finding Approximate POMDP Solutions through Belief Compression



  1. Finding Approximate POMDP Solutions through Belief Compression Based on slides by Nicholas Roy, MIT

  2. Reliable Navigation: conventional trajectories may not be robust to localisation error. [Figure: estimated robot position, robot position distribution, true robot position, goal position]

  3. Perception and Control: control algorithms map perception of the world state to control actions. [Diagram: world state → perception → control]

  4. Perception and Control. Two extremes: acting on argmax P(x) from the probabilistic perception model assumes full observability and is brittle; planning over the full distribution P(x) is exact POMDP planning and is intractable. [Diagram: probabilistic perception model P(x) → control → world state, for both cases]

  5. Perception and Control. Middle ground: compress P(x) and plan over the compressed belief, avoiding both the brittleness of assumed full observability and the intractability of exact POMDP planning. [Diagram: probabilistic perception model P(x) → compressed P(x) → control → world state]

  6. Main Insight: good policies for real-world POMDPs can be found by planning over low-dimensional representations of the belief space. [Diagram: probabilistic perception model P(x) → low-dimensional P(x) → control → world state]

  7. Belief Space Structure: the controller may be globally uncertain... but not usually.

  8. Coastal Navigation • Represent beliefs using the maximum-likelihood state and the belief entropy • Discretise into a low-dimensional belief-space MDP
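A minimal sketch of this compression, assuming a discrete state space and a Python/NumPy setting; the function name is hypothetical:

    import numpy as np

    def coastal_belief_features(b):
        # Compress a belief b (a probability vector over states) to
        # (maximum-likelihood state, entropy), the low-dimensional
        # representation used by coastal navigation.
        ml_state = int(np.argmax(b))              # most likely state
        p = b[b > 0]                              # drop zeros to avoid log(0)
        entropy = float(-(p * np.log(p)).sum())   # uncertainty of the belief
        return ml_state, entropy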

  9. Coastal Navigation

  10. A Hard Navigation Problem. [Plot: average distance to goal, in metres]

  11. Dimensionality Reduction • Principal Components Analysis: approximate each original belief as a weighted combination of a small set of characteristic beliefs. [Diagram: original beliefs ≈ characteristic beliefs × weights]

  12. Principal Components Analysis • Given a belief b ∈ ℝ^n, we want b̃ ∈ ℝ^m, m ≪ n. [Plot: a collection of beliefs drawn from a 200-state problem; probability of being in state vs. state]

  13. Principal Components Analysis • Given a belief b ∈ ℝ^n, we want b̃ ∈ ℝ^m, m ≪ n. m = 9 gives this representation for one sample distribution. [Plot: one sample distribution; probability of being in state vs. state]
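As a concrete illustration, a minimal PCA belief-compression sketch in Python/NumPy; the function names are hypothetical, and this is plain squared-error PCA, not the E-PCA variant introduced next:

    import numpy as np

    def pca_compress(B, m):
        # B is (num_beliefs, n), one belief per row. Returns the mean,
        # the m characteristic beliefs (principal components), and the
        # low-dimensional weights b̃ ∈ R^m for each belief.
        mean = B.mean(axis=0)
        U, s, Vt = np.linalg.svd(B - mean, full_matrices=False)
        components = Vt[:m]
        weights = (B - mean) @ components.T
        return mean, components, weights

    def pca_reconstruct(mean, components, weights):
        # Back-project the low-dimensional weights to full beliefs.
        return weights @ components + mean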

  14. Principal Components Analysis. Many real-world POMDP distributions are characterised by large regions of low probability. Idea: create a fitting criterion that is (exponentially) stronger in low-probability regions (E-PCA).
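A gradient-descent sketch of this idea, assuming the exponential-family (Poisson-style) loss sum(exp(WU) − B∘(WU)), whose reconstruction exp(WU) is always non-negative and which penalises errors in low-probability regions far more than squared error does. The function name, learning rate, and iteration count are illustrative; the actual work optimises this loss with a more careful solver:

    import numpy as np

    def epca_fit(B, m, lr=0.01, iters=2000):
        # B is (num_beliefs, n), one belief per row.
        rng = np.random.default_rng(0)
        k, n = B.shape
        U = rng.normal(scale=0.1, size=(m, n))   # m characteristic bases
        W = rng.normal(scale=0.1, size=(k, m))   # low-dimensional weights
        for _ in range(iters):
            R = np.exp(W @ U)                    # current reconstruction
            G = R - B                            # gradient of the loss w.r.t. W @ U
            W -= lr * G @ U.T
            U -= lr * W.T @ G
        return U, W                              # reconstruct with np.exp(W @ U)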

  15. Example E-PCA. [Plot: reconstructions with 1 basis, 2 bases, 3 bases, and 4 bases; probability of being in state vs. state]

  16. Example Reduction

  17. Finding Dimensionality • E-PCA will indicate appropriate number of bases, depending on beliefs encountered

  18. Planning. [Diagram: original POMDP → E-PCA → low-dimensional belief space B̃ → discretise → discrete belief-space MDP with states S1, S2, S3]
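Tying the pipeline together, a schematic sketch; discretise, build_mdp, and value_iteration are hypothetical placeholders for the steps detailed on the next two slides:

    def plan_by_belief_compression(beliefs, m):
        # Compress sampled beliefs, discretise the low-dimensional
        # space, build the grid MDP, and solve it for a policy.
        U, W = epca_fit(beliefs, m)
        grid = discretise(W)
        mdp = build_mdp(grid, U)      # rewards and transitions, slides 19-20
        return value_iteration(mdp)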

  19. Model Parameters • Reward function: back-project b̃ to the high-dimensional belief b, then compute the expected reward from the belief, R̃(b̃) = Σ_s b(s) R(s). [Diagram: belief p(s) over states s1, s2, s3]
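A minimal sketch of this computation, reusing the E-PCA back-projection exp(b̃U) from above; the renormalisation step and the function name are assumptions:

    import numpy as np

    def reward_low_dim(b_tilde, U, R):
        # Recover the full belief from the compressed one, then take
        # the expectation of the state reward vector R under it.
        b = np.exp(b_tilde @ U)   # E-PCA back-projection, non-negative
        b = b / b.sum()           # renormalise to a proper distribution
        return float(b @ R)       # R̃(b̃) = Σ_s b(s) R(s)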

  20. Model Parameters • Transition function: 1. for each belief b̃i and action a, 2. recover the full belief bi, 3. propagate according to the action, 4. propagate according to the observation, 5. recover b̃j, 6. set T̃(b̃i, a, b̃j) to the probability of the observation. [Diagram: low dimension vs. full dimension]
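One entry of that transition model as a sketch, assuming a discrete POMDP with an action transition matrix T_a (T_a[s, s'] = p(s'|s, a)) and a vector O_az of observation likelihoods p(z|s'); compress and recover stand in for the E-PCA maps, and all names are hypothetical:

    import numpy as np

    def transition_entry(b_tilde_i, T_a, O_az, compress, recover):
        b = recover(b_tilde_i)            # step 2: back to full dimension
        b_pred = T_a.T @ b                # step 3: propagate through the action
        p_z = float(O_az @ b_pred)        # probability of the observation z
        b_post = (O_az * b_pred) / p_z    # step 4: condition on the observation
        b_tilde_j = compress(b_post)      # step 5: back to low dimension
        return b_tilde_j, p_z             # step 6: T̃(b̃i, a, b̃j) = p_z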

  21. Robot Navigation Example. [Figure: initial distribution, goal state, true (hidden) robot position, goal position]

  22. Robot Navigation Example. [Figure: true robot position, goal position]

  23. Policy Comparison, using 6 bases. [Plot: average distance to goal, in metres]

  24. People Finding

  25. People Finding as a POMDP: the robot's own position is fully observable, but the position of the person is unknown. [Figure: robot position, true person position]

  26. Finding and Tracking People. [Figure: robot position, true person position]

  27. People Finding as a POMDP • Factored belief space • 2 dimensions: fully-observable robot position • 6 dimensions: distribution over person positions. A regular grid gives ≈ 10^16 states.

  28. Variable Resolution • Non-regular grid using sampled beliefs b̃1 ... b̃5 • Compute model parameters using nearest neighbour. [Diagram: transitions such as T̃(b̃1, a1, b̃2) and T̃(b̃1, a2, b̃5)]
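A small sketch of the nearest-neighbour step, assuming the grid is stored as an array with one sampled low-dimensional belief per row; the function name is hypothetical:

    import numpy as np

    def nearest_grid_state(b_tilde, grid):
        # Map a propagated belief onto the non-regular grid by nearest
        # neighbour, so transition mass lands on existing sample points.
        d = np.linalg.norm(grid - b_tilde, axis=1)
        return int(np.argmin(d))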

  29. Refining the Grid • Sample beliefs according to the policy • Construct the new model • Keep a new belief b̃'1 if V(b̃'1) > V(b̃1). [Diagram: values V(b̃1), V(b̃'1) at grid points b̃1, b̃'1]
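A loose sketch of one refinement pass under that acceptance rule; sample_beliefs and build_and_solve (which returns a value function over low-dimensional beliefs) are hypothetical helpers:

    def refine_grid(grid, sample_beliefs, build_and_solve):
        V_old = build_and_solve(grid)
        for b_new in sample_beliefs():            # roll out the current policy
            V_new = build_and_solve(grid + [b_new])
            if V_new(b_new) > V_old(b_new):       # keep only value-improving beliefs
                grid = grid + [b_new]
                V_old = V_new
        return grid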

  30. The Optimal Policy. [Figure: robot position, true person position; original distribution vs. reconstruction using E-PCA and 6 bases]

  31. E-PCA Policy Comparison. [Plot: average time (number of actions) to find the person, comparing a fully observable MDP, E-PCA with 72 states, and refined E-PCA with 260 states]

  32. Nick’s Thesis Contributions • Good policies for real-world POMDPs can be found by planning over a low-dimensional representation of the belief space, using E-PCA. • POMDPs can scale to bigger, more complicated real-world problems. • POMDPs can be used for real deployed robots.
