
Exact Mode Estimation for POMDPs based on Constraint Decomposition and Symbolic Encoding

Martin Sachenbacher, July 1, 2003.

Presentation Transcript


  1. Exact Mode Estimation for POMDPs based on Constraint Decomposition and Symbolic Encoding Martin Sachenbacher July 1, 2003

  2. Exact vs. Approximate ME • Problems of ME with incomplete belief state • Dead ends (no solutions) • Incorrect leading solutions • Incorrect probabilities of solutions • Usefulness of ME with complete belief state • As accuracy reference • As performance reference • As a starting point for approximations • Key: Compact representation of belief state • Map to semiring-based CSP • Decompose Hypergraph into Hypertree • Encode Tree Nodes symbolically as ADDs

  3. Outline • SCSPs (Semiring-based CSPs) • Mapping State Constraints to SCSPs • Mapping Transition Constraints to SCSPs • ADDs (Algebraic Decision Diagrams) • Hypertree Decompositions of SCSPs • Solving Tree-structured SCSPs • Exact Mode Estimation for POMDPs as Decomposition/ADD-based SCSP Solving • Demonstration: Two Switches Example

  4. SCSPs (Semiring-based CSPs) • Generalization of CSPs [Bistarelli et al. 97] • Domain D, Variables V, Set S, Type T ⊆ V • Constraints are mappings D^k → S • Operations ⊗ (for join) and ⊕ (for projection) on S • (S, ⊕, ⊗, 0, 1) must form a c-semiring • Dynamic Programming applicable to all SCSPs • Examples • ({0,1}, ∨, ∧, 0, 1): Classical CSPs • (R+, min, +, +∞, 0): Weighted CSPs • ([0,1], max, *, 0, 1): Probabilistic CSPs
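The semiring view above can be illustrated with a minimal Python sketch. It assumes constraints are stored as plain tables (a scope tuple plus a dict from value tuples to semiring values); the names `Semiring`, `join`, and `project` are hypothetical and not taken from the slides, and tuples missing from a table are assumed to carry the semiring zero.

```python
from dataclasses import dataclass
from itertools import product
from typing import Callable


@dataclass(frozen=True)
class Semiring:
    plus: Callable    # used for projection
    times: Callable   # used for join
    zero: object
    one: object


CLASSICAL = Semiring(lambda a, b: a or b, lambda a, b: a and b, False, True)
WEIGHTED = Semiring(min, lambda a, b: a + b, float("inf"), 0.0)
PROBABILISTIC = Semiring(max, lambda a, b: a * b, 0.0, 1.0)


def join(sr, c1, c2, domain):
    """Combine two constraints with the semiring's 'times' operation."""
    (s1, t1), (s2, t2) = c1, c2
    scope = tuple(dict.fromkeys(s1 + s2))  # ordered union of both scopes
    table = {}
    for values in product(domain, repeat=len(scope)):
        asg = dict(zip(scope, values))
        v1 = t1.get(tuple(asg[x] for x in s1), sr.zero)
        v2 = t2.get(tuple(asg[x] for x in s2), sr.zero)
        table[values] = sr.times(v1, v2)
    return scope, table


def project(sr, c, keep):
    """Eliminate all variables not in 'keep' with the semiring's 'plus'."""
    scope, table = c
    new_scope = tuple(x for x in scope if x in keep)
    out = {}
    for values, val in table.items():
        key = tuple(v for x, v in zip(scope, values) if x in keep)
        out[key] = sr.plus(out.get(key, sr.zero), val)
    return new_scope, out


# Hypothetical probabilistic constraints over the domain {0, 1}.
c1 = (("x",), {(0,): 0.9, (1,): 0.1})
c2 = (("x", "y"), {(0, 0): 1.0, (0, 1): 0.5, (1, 1): 1.0})
joined = join(PROBABILISTIC, c1, c2, (0, 1))
on_y = project(PROBABILISTIC, joined, {"y"})
```

Swapping `PROBABILISTIC` for `WEIGHTED` or `CLASSICAL` reuses the same two functions, which is the point of the semiring abstraction.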

  5. Encoding States as SCSPs • Example: Or-Gate • P(Or=ok) = 99%, P(Or=fty) = 1%
       xt   in1  in2  out  f
       ok   lo   lo   lo   0.99
       ok   lo   hi   hi   0.99
       ok   hi   lo   hi   0.99
       ok   hi   hi   hi   0.99
       fty  *    *    *    0.01

  6. Encoding Observations as SCSPs • Example: (Probabilistic) Observation • Distribution over values for xi
       xi  f
       0   0.6
       1   0.9
       2   0.3
       3   0.0

  7. Encoding Transitions as SCSPs • Example: (Probabilistic) CCA Transition Function
       xt  cmd  xt+1  f
       0   off  0     0.9
       0   on   0     0.1
       0   off  1     0.1
       0   on   1     0.9
       1   off  0     0.9
       1   on   0     0.1
       1   off  1     0.1
       1   on   1     0.9
       [Figure: transition diagram over states 0 and 1; cmd=off leads to state 0 and cmd=on leads to state 1, each with probability 0.9]

  8. Algebraic Decision Diagrams • ADDs: Symbolic (graph-based) representation of functions {0,1}^n → R • Generalization of BDDs (functions {0,1}^n → {0,1}) • Canonicity of representation (as for BDDs) • Efficient package: CUDD
       [Figure: example ADD over variables A, B, C with leaves 0, 1, 2, 3]
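To make the idea of a canonical, shared graph concrete, here is a small Python sketch that builds a reduced ADD by hash-consing nodes in a unique table. It is an illustration only: it enumerates all 2^n inputs, which a real package such as CUDD avoids, and the names `build_add`, `count`, and `evaluate` are hypothetical.

```python
def build_add(f, n):
    """Build a reduced ADD for f: {0,1}^n -> R with a fixed variable order.

    Nodes are ("leaf", value) or ("node", level, low, high). The unique
    table makes equal subgraphs share one object (canonicity).
    """
    unique = {}

    def mk(node):
        return unique.setdefault(node, node)

    def rec(prefix):
        if len(prefix) == n:
            return mk(("leaf", f(*prefix)))
        low = rec(prefix + (0,))
        high = rec(prefix + (1,))
        if low is high:  # test is redundant, skip the node
            return low
        return mk(("node", len(prefix), low, high))

    return rec(()), unique


def count(unique):
    """Return (internal nodes, leaves) of the diagram."""
    internal = sum(1 for node in unique if node[0] == "node")
    return internal, len(unique) - internal


def evaluate(node, assignment):
    """Follow the low/high branches selected by a 0/1 assignment."""
    while node[0] == "node":
        _, level, low, high = node
        node = high if assignment[level] else low
    return node[1]


# Example function: f(A, B, C) = A + B + C (8 table rows, 4 distinct values).
root, table = build_add(lambda a, b, c: a + b + c, 3)
internal_nodes, leaves = count(table)
```

For this function the 8-row table collapses to 6 internal nodes and 4 leaves, because subgraphs with equal values are shared.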

  9. ADD Join Operations • Multiplication, addition, maximum, … • Generalization of BDD operations
       ABC  f  g  f+g  f*g  5*f  f>1  max(f,g)
       000  0  3  3    0    0    0    3
       001  1  2  3    2    5    0    2
       010  1  0  1    0    5    0    1
       011  2  1  3    2    10   1    2
       100  1  0  1    0    5    0    1
       101  2  0  2    0    10   1    2
       110  2  0  2    0    10   1    2
       111  3  1  4    3    15   1    3
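The join operations are pointwise: each output value depends only on the operand values for the same assignment. A short Python sketch using plain tables instead of ADDs (an assumption made for brevity; the helper name `apply_op` is hypothetical) recomputes the derived columns from f and g:

```python
from itertools import product

ROWS = list(product((0, 1), repeat=3))  # assignments to A, B, C
f = dict(zip(ROWS, [0, 1, 1, 2, 1, 2, 2, 3]))
g = dict(zip(ROWS, [3, 2, 0, 1, 0, 0, 0, 1]))


def apply_op(op, *tables):
    """Apply op pointwise to the values of each assignment."""
    return {row: op(*(t[row] for t in tables)) for row in ROWS}


f_plus_g = apply_op(lambda a, b: a + b, f, g)
f_times_g = apply_op(lambda a, b: a * b, f, g)
five_f = apply_op(lambda a: 5 * a, f)
f_gt_1 = apply_op(lambda a: int(a > 1), f)
max_fg = apply_op(max, f, g)
```

An ADD package performs the same computation on the shared graph, so identical subgraphs are processed once rather than once per row.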

  10. Example • Summation of ADD f, ADD g
       [Figure: ADD g (leaves 3, 2, 1, 0) + ADD f (leaves 0, 1, 2, 3) = ADD f+g (leaves 4, 3, 2, 1), each over variables A, B, C]

  11. ADD Projection Operations • Σ(f,X) (and Π(f,X)) obtained by summing (multiplying) values of tuples that differ only w.r.t. X
       ABC  f        AB  Σ(f,{C})  Π(f,{C})
       000  0        00  1         0
       001  1        01  3         2
       010  1        10  3         2
       011  2        11  5         6
       100  1
       101  2
       110  2
       111  3

  12. ADD Projection Operations • For optimization, we require operation max(f,X) that yields maximum value of tuples differing only w.r.t. X • Not part of CUDD, but easy to implement as variant of Σ/Π(f,X)
       ABC  f        AB  Σ(f,{C})  Π(f,{C})  max(f,{C})
       000  0        00  1         0         1
       001  1        01  3         2         2
       010  1        10  3         2         2
       011  2        11  5         6         3
       100  1
       101  2
       110  2
       111  3
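The three projection variants differ only in how the values of tuples that agree outside X are combined. A minimal Python sketch on a plain table (not an ADD; the helper name `eliminate` is hypothetical) shows this for X = {C}:

```python
from itertools import product
from math import prod

SCOPE = ("A", "B", "C")
F = dict(zip(product((0, 1), repeat=3), [0, 1, 1, 2, 1, 2, 2, 3]))


def eliminate(table, scope, var, op):
    """Combine with op the values of all tuples that differ only in var."""
    idx = scope.index(var)
    groups = {}
    for row, value in table.items():
        key = row[:idx] + row[idx + 1:]  # drop the eliminated column
        groups.setdefault(key, []).append(value)
    return {key: op(values) for key, values in groups.items()}


sum_c = eliminate(F, SCOPE, "C", sum)    # summation over C
prod_c = eliminate(F, SCOPE, "C", prod)  # multiplication over C
max_c = eliminate(F, SCOPE, "C", max)    # maximization over C
```

Only the `op` argument changes between the variants, which matches the remark that the max projection is a straightforward variant of the other two.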

  13. Solving SCSPs using Decomposition • Transform SCSPs into Hypertree H = (T, χ, λ) • Compute constraint φ(v) for each node v • Bottom-up phase for computing values • Top-down phase for extracting solutions

  14. Pseudocode for Bottom-Up Phase • Generalization of (Semi-)Join Operation
       Function solve(v)
         For Each child ∈ children(v)
           φ(v) ← φ(v) ⊗ max(φ(child), χ(child) \ χ(v))
         Next child
         Return φ(v)
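A minimal Python sketch of the bottom-up phase, applied to the Boolean polycell used in the slides that follow. It makes several assumptions: constraints are sparse tables rather than ADDs, the probabilistic semiring (max, *) is used, Or-gates are ok with probability 0.99 and And-gates with 0.995, a faulty gate allows any behaviour, and each child is solved recursively before its message is used. All function names are hypothetical.

```python
from itertools import product

BOOL, MODE = (0, 1), ("ok", "fty")


def gate(mode, a, b, out, fn, p_ok):
    """Table over (mode, a, b, out); missing tuples have value 0."""
    table = {}
    for m, x, y, z in product(MODE, BOOL, BOOL, BOOL):
        if m == "fty":
            table[(m, x, y, z)] = 1 - p_ok  # faulty: any behaviour
        elif z == fn(x, y):
            table[(m, x, y, z)] = p_ok      # ok: nominal behaviour only
    return (mode, a, b, out), table


def join(c1, c2):
    """Multiply two constraints on their consistent tuples."""
    (s1, t1), (s2, t2) = c1, c2
    scope = s1 + tuple(v for v in s2 if v not in s1)
    table = {}
    for k1, u1 in t1.items():
        a1 = dict(zip(s1, k1))
        for k2, u2 in t2.items():
            a2 = dict(zip(s2, k2))
            if all(a1[v] == a2[v] for v in s2 if v in a1):
                merged = {**a1, **a2}
                table[tuple(merged[v] for v in scope)] = u1 * u2
    return scope, table


def max_out(c, drop):
    """Eliminate the variables in 'drop' by maximization."""
    scope, table = c
    keep = tuple(v for v in scope if v not in drop)
    out = {}
    for k, u in table.items():
        key = tuple(x for v, x in zip(scope, k) if v not in drop)
        out[key] = max(out.get(key, 0.0), u)
    return keep, out


def observe(c, obs):
    """Keep only tuples that agree with the observed values."""
    scope, table = c
    return scope, {k: u for k, u in table.items()
                   if all(obs.get(v, x) == x for v, x in zip(scope, k))}


def solve(v, phi, children):
    """Bottom-up pass: fold each child's maximized message into phi[v]."""
    for child in children.get(v, ()):
        msg = solve(child, phi, children)  # assumed recursive call
        private = set(msg[0]) - set(phi[v][0])
        phi[v] = join(phi[v], max_out(msg, private))
    return phi[v]


OBS = {"A": 1, "B": 1, "C": 1, "D": 1, "E": 0, "F": 0, "G": 1}
OR, AND = (lambda x, y: x | y), (lambda x, y: x & y)
phi = {
    "v0": observe(join(gate("O3", "C", "E", "Z", OR, 0.99),
                       gate("A1", "X", "Y", "F", AND, 0.995)), OBS),
    "v1": observe(gate("A2", "Y", "Z", "G", AND, 0.995), OBS),
    "v2": observe(gate("O2", "B", "D", "Y", OR, 0.99), OBS),
    "v3": observe(gate("O1", "A", "C", "X", OR, 0.99), OBS),
}
root = solve("v0", phi, {"v0": ["v1", "v2", "v3"]})
u_max = max(root[1].values())
```

Under these assumptions the root table holds 21 tuples and its largest value is about 0.0097, in line with the best solution reported in the example.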

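  15. Example • Boolean Polycell
       [Figure: Boolean polycell circuit with Or-gates Or1, Or2, Or3 and And-gates And1, And2; observed values A = 1, B = 1, C = 1, D = 1, E = 0, F = 0, G = 1; internal variables X, Y, Z]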

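  16. Example • Hypertree Decomposition of Boolean Polycell
       Tree: root v0 over (O3, A1, C, E, F, X, Y, Z) with children v1 over (A2, G, Y, Z), sharing Y, Z; v2 over (O2, B, D, Y), sharing Y; v3 over (O1, A, C, X), sharing C, X
       v0 tuples: ok ok 1 0 0 0 0 1; ok ok 1 0 0 0 1 1; ok ok 1 0 0 1 0 1 (U=.98505); …
       v1 tuples: ok 1 1 1 (U=.995); fty 1 1 1; fty 1 0 1; fty 1 1 0; fty 1 0 0 (U=.005)
       v2 tuples: ok 1 1 1 (U=.99); fty 1 1 1; fty 1 1 0 (U=.01)
       v3 tuples: ok 1 1 1 (U=.99); fty 1 1 1; fty 1 1 0 (U=.01)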

  17. Example • Initial constraint φ(v0) over (O3, A1, C, E, F, X, Y, Z) • ADD with 20 nodes, 5 leaves
       U=.98505: ok ok 1 0 0 0 0 1; ok ok 1 0 0 0 1 1; ok ok 1 0 0 1 0 1
       U=.00995: fty ok 1 0 0 0 1 1; fty ok 1 0 0 0 1 0; fty ok 1 0 0 1 0 0; fty ok 1 0 0 1 0 1; fty ok 1 0 0 0 1 1; fty ok 1 0 0 1 0 1
       U=.00495: …
       U=.00005: …

  18. Example • After multiplication with max(φ(v1), {A2, G}) • Tuples over (O3, A1, C, E, F, X, Y, Z) • ADD with 28 nodes, 7 leaves
       U=.98012: ok ok 1 0 0 0 1 1
       U=.00990: fty ok 1 0 0 0 1 1
       U=.00492: ok ok 1 0 0 0 0 1; ok ok 1 0 0 1 0 1; …
       U=4.9E-5: …
       U=2.4E-5: …
       U=2.5E-7: …

  19. Example • After multiplication with max(φ(v2), {O2, B, D}) • Tuples over (O3, A1, C, E, F, X, Y, Z) • ADD with 30 nodes, 8 leaves
       U=.97032: ok ok 1 0 0 0 1 1
       U=.00980: fty ok 1 0 0 0 1 1
       U=.00487: ok fty 1 0 0 0 1 1
       U=4.9E-5: …
       U=4.9E-7: …
       U=2.4E-7: …
       U=2.5E-9: …

  20. Example • After multiplication with max(φ(v3), {O1, A}) • Tuples over (O3, A1, C, E, F, X, Y, Z) • ADD with 35 nodes, 10 leaves • Best Solution: Umax = .0097
       U=.00970: ok ok 1 0 0 0 1 1
       U=.00482: ok fty 1 0 0 1 1 1
       U=9.8E-5: fty ok 1 0 0 0 1 1
       U=4.8E-5: …
       U=4.9E-7: …
       U=2.4E-7: …
       U=4.9E-9: …
       U=2.4E-9: …
       U=2.5E-11: …

  21. Pseudocode for Top-Down Phase • No search queue necessary
       Function extractSolutions(vroot)
         E ← edges(vroot)
         σ ← φ(vroot)
         σ ← max(σ, vars(σ) \ (decvars(σ) ∪ vars(E)))    (restrict to decision and shared variables)
         While E ≠ ∅ Do
           e ← choose(E)
           v ← son-node(e)
           E ← (E \ e) ∪ edges(v)
           σ0-1 ← (σ ≠ 0)
           div ← max(σ0-1 ⊗ φ(v), vars(σ))    ("divisor")
           σ ← (σ ⊗ φ(v)) ⊗⁻¹ div
           σ ← max(σ, vars(σ) \ (decvars(σ) ∪ vars(E)))
         End While

  22. Example • Initial σ = max(φ(vroot), {E, F}) • Tuples over (O3, A1, C, X, Y, Z) • ADD with 21 tuples, 33 nodes, 10 leaves
       U=.00970: ok ok 1 0 1 1
       U=.00482: ok fty 1 1 1 1
       U=9.8E-5: fty ok 1 0 1 1
       U=4.8E-5: …
       U=4.9E-7: …
       U=2.4E-7: …
       U=4.9E-9: …
       U=2.4E-9: …
       U=2.5E-11: …

  23. Example • After processing edge(v0,v3) • Tuples over (O1, O3, A1, Y, Z) • ADD with 21 tuples, 32 nodes, 10 leaves
       U=.00970: fty ok ok 1 1
       U=.00482: ok ok fty 1 1
       U=9.8E-5: fty fty ok 1 1
       U=4.8E-5: …
       U=4.9E-7: …
       U=2.4E-7: …
       U=4.9E-9: …
       U=2.4E-9: …
       U=2.5E-11: …

  24. Example • After processing edge(v0,v2) • Tuples over (O1, O2, O3, A1, Y, Z) • ADD with 30 tuples, 47 nodes, 11 leaves
       U=.00970: fty ok ok ok 1 1
       U=.00482: ok ok ok fty 1 1
       U=9.8E-5: fty fty ok ok 1 1; fty ok fty ok 1 1
       U=4.8E-5: …
       U=9.9E-7: …
       U=4.9E-7: …
       …
       U=2.5E-11: …

  25. Example • After processing edge(v0,v1) • Tuples over (O1, O2, O3, A1, A2) • ADD with 26 tuples, 35 nodes, 12 leaves • #Solutions = 26 • Easy to focus on leading solutions.
       U=.00970: fty ok ok ok ok
       U=.00482: ok ok ok fty ok
       U=9.8E-5: fty fty ok ok ok; fty ok fty ok ok
       U=4.8E-5: …
       U=2.4E-5: …
       U=9.9E-7: …
       …
       U=2.5E-11: …

  26. Application: Exact ME for POMDPs • Given: POMDP (Feasible States, Observables, Control Actions, Transitions), Observations • Approach: Complete representation of belief state (through decomposition and symbolic encoding) • Benefit: Allows for exploiting Markov property
       [Figure: state variables S0, S1, …, Sn at time t and at time t+1]

  27. Algorithm: Exact ME for POMDPs • Construct Hypertree (offline) • Construct State-ADDs for each node (offline) • Construct Transition-ADDs for each node (offline) • Repeat for each time step: • Multiply nodes with Obs-ADDs (“Condition on Observations”) • Establish consistency in the tree (Bottom-up) • Extract leading solution(s) from the tree (Top-down) • Multiply nodes with Transition-ADDs, project on xt+1, set xt = xt+1, multiply with State-ADDs (“Transition Expansion”) • Complexity: Polynomial in width of Hypertree
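The per-time-step loop can be sketched on a flat belief table, without the hypertree or ADDs. This Python sketch assumes a single two-valued state variable, reuses the transition values of the CCA example in slide 7, takes summation as the projection onto the next state, and uses a prior and observation likelihoods that are made up for illustration; all function names are hypothetical.

```python
# Transition table from the CCA example: (x_t, cmd, x_t+1) -> probability.
TRANS = {
    (0, "off", 0): 0.9, (0, "on", 0): 0.1,
    (0, "off", 1): 0.1, (0, "on", 1): 0.9,
    (1, "off", 0): 0.9, (1, "on", 0): 0.1,
    (1, "off", 1): 0.1, (1, "on", 1): 0.9,
}


def condition(belief, likelihood):
    """Condition on observations: multiply in the observation constraint."""
    return {x: p * likelihood.get(x, 0.0) for x, p in belief.items()}


def leading(belief, k=1):
    """Extract the k most likely states."""
    return sorted(belief.items(), key=lambda kv: -kv[1])[:k]


def expand(belief, cmd):
    """Transition expansion: multiply with transitions, sum out x_t."""
    nxt = {}
    for x, p in belief.items():
        for (x0, c, x1), q in TRANS.items():
            if x0 == x and c == cmd:
                nxt[x1] = nxt.get(x1, 0.0) + p * q
    return nxt


def step(belief, likelihood, cmd):
    """One time step: condition, extract leading solution, expand."""
    conditioned = condition(belief, likelihood)
    return leading(conditioned), expand(conditioned, cmd)


belief0 = {0: 0.5, 1: 0.5}     # hypothetical prior
likelihood = {0: 0.8, 1: 0.2}  # hypothetical observation likelihoods
best, belief1 = step(belief0, likelihood, "on")
```

The values are left unnormalized, as in the utilities of the polycell example; because the whole belief table is carried forward, no state is discarded between time steps.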

  28. Example • Adapted from Jim Kurien’s thesis • t0: Sw1.cmd = on • t1: Or.out = lo, Sw1.cmd = idl, Sw2.cmd = on • t2: Or.out = lo • Switches more likely to fail than Or-Gate
       [Figure: circuit with switches Sw1 and Sw2 (inputs hi) and an Or-gate]

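  29. Example • Switch Model
       [Figure: mode transition diagram for the switch with modes on, off, and fty; commanded transitions (cmd=on, cmd=off, idl) have probability 0.95, transitions to fty have probability 0.05, and fty persists with probability 1.0; each mode constrains the terminal values t1, t2]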

  30. Example • Switch Model
       Transition constraint:
       xt   cmd  xt+1  f
       on   on   on    0.95
       on   off  off   0.95
       on   idl  on    0.95
       on   *    fty   0.05
       off  on   on    0.95
       off  off  off   0.95
       off  idl  off   0.95
       off  *    fty   0.05
       fty  *    fty   1.0
       State constraint:
       xt   t1  t2  f
       on   lo  lo  1.0
       on   hi  hi  1.0
       off  *   *   1.0
       fty  *   *   1.0

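  31. Example • Or-Gate Model
       State constraint:
       xt   in1  in2  out  f
       ok   lo   lo   lo   1.0
       ok   lo   hi   hi   1.0
       ok   hi   lo   hi   1.0
       ok   hi   hi   hi   1.0
       fty  *    *    *    1.0
       Transition constraint:
       xt   xt+1  f
       ok   ok    0.99
       ok   fty   0.01
       fty  fty   1.0
       [Figure: mode transition diagram for the Or-gate; ok persists with probability 0.99 and fails to fty with probability 0.01; fty persists with probability 1.0]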

  32. Example • Initial belief state (chosen): • p(Sw=on) = p(Sw=off) = 0.475, p(Sw=fty) = 0.05 • p(Or=ok) = 0.99, p(Or=fty) = 0.01 • Observations/Commands: • t0: Sw1.cmd=on • t1: Or.out=lo, Sw1.cmd=idl, Sw2.cmd=on • t2: Or.out=lo • Leading Solutions: • t0: Sw1=on/off, Sw2=on/off, Or=ok • t1: Sw1=fty, Sw2=off, Or=ok • t2: Sw1=on, Sw2=on, Or=fty

  33. Conclusion • SCSPs elegant and general representation • ADDs encoding of SCSPs efficient in average case, exponential in the number of variables in worst case • Decomposition factors problem into set of ADDs, each confined to small numbers of variables • The two methods complement each other well • How far can we get with this combination?
