
A Hybrid Linear Programming and Relaxed Plan Heuristic for Partial Satisfaction Planning Problems

A Hybrid Linear Programming and Relaxed Plan Heuristic for Partial Satisfaction Planning Problems. J. Benton, Menkes van den Briel, Subbarao Kambhampati. Arizona State University.


Presentation Transcript


  1. A Hybrid Linear Programming and Relaxed Plan Heuristic for Partial Satisfaction Planning Problems. J. Benton, Menkes van den Briel, Subbarao Kambhampati (Arizona State University).

  2. PSP^UD: Partial Satisfaction Planning with Utility Dependency (Do, et al., IJCAI 2007; Smith, ICAPS 2004; van den Briel, et al., AAAI 2004). Actions have cost; goal sets have utility. Objective: maximize net benefit (utility - cost).

  Example (figure): a plane and a person among loc1, loc2, and loc3, with flight costs 150, 200, 100, and 101. Goal utilities: utility((at person loc2)) = 1000; utility((at plane loc3)) = 1000; utility dependency utility((at plane loc3) & (at person loc2)) = 10.

  Plan trace:
  - S0: (at plane loc1), (in person plane). Sum cost: 0; util(S0) = 0; net benefit(S0) = 0 - 0 = 0
  - (fly plane loc2), cost 150. S1: (at plane loc2), (in person plane). Sum cost: 150; util(S1) = 0; net benefit(S1) = 0 - 150 = -150
  - (debark person loc2), cost 1. S2: (at plane loc2), (at person loc2). Sum cost: 151; util(S2) = 1000; net benefit(S2) = 1000 - 151 = 849
  - (fly plane loc3), cost 100. S3: (at plane loc3), (at person loc2). Sum cost: 251; util(S3) = 1000 + 1000 + 10 = 2010; net benefit(S3) = 2010 - 251 = 1759
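The example's arithmetic can be checked with a few lines of Python (a toy sketch; the step costs and state utilities are copied from the slide):

```python
# Net benefit of each state along a plan trace: the utility of the goals
# satisfied in that state minus the summed cost of the actions so far.
def net_benefit(step_costs, step_utils):
    total_cost, results = 0, []
    for cost, util in zip(step_costs, step_utils):
        total_cost += cost
        results.append(util - total_cost)
    return results

# S0 start; S1 after fly to loc2 (150); S2 after debark (1); S3 after fly to loc3 (100)
costs = [0, 150, 1, 100]
utils = [0, 0, 1000, 2010]   # util(S3) = 1000 + 1000 + 10
assert net_benefit(costs, utils) == [0, -150, 849, 1759]
```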

  3. Heuristic search for SOFT GOALS: plan quality depends on the interaction between action cost and goal achievement (Do & Kambhampati, KCBS 2004; Do, et al., IJCAI 2007).
  - Relaxed planning graph heuristics: cannot take all of the complex interactions into account.
  - Integer programming (IP) / LP-relaxation heuristics: current encodings don't scale well and can only be optimal up to some plan step.
  - BBOP-LP (this work) combines the two.

  4. Approach:
  - Build a network flow-based IP encoding: no time indices; uses multi-valued variables.
  - Use its LP relaxation for a heuristic value: gives a second relaxation on the heuristic.
  - Perform branch and bound search: uses the LP solution to find a relaxed plan (similar to YAHSP; Vidal, 2004).

  5. Building a Heuristic: a network flow model on variable transitions (no time indices); capture relevant transitions with multi-valued fluents. (Figure: flow networks for the plane and person variables over loc1, loc2, and loc3, annotated with initial states, prevail constraints, goal states, costs on actions (1, 100, 101, 150, 200), and utilities on goals (10, 1000, 1000).)

  6. Building a Heuristic. Constraints of this model:
  1. If an action executes, then all of its effects and prevail conditions must also.
  2. If a fact is deleted, then it must be added to re-achieve a value.
  3. If a prevail condition is required, then it must be achieved.
  4. A goal utility dependency is achieved iff its goals are achieved.
  (Figure: the plane and person flow networks with action costs and goal utilities.)

  7. Building a Heuristic. Constraints of this model (action, effect, prevail, endvalue, and goaldep are variables; the initial-state indicators, costs, and utilities are parameters):
  1. If an action executes, then all of its effects and prevail conditions must also:
     action(a) = Σ_{effects e of a in v} effect(a,v,e) + Σ_{prevails f of a in v} prevail(a,v,f)
  2. If a fact is deleted, then it must be added to re-achieve a value:
     1{f ∈ s0[v]} + Σ_{effects that add f} effect(a,v,e) = Σ_{effects that delete f} effect(a,v,e) + endvalue(v,f)
  3. If a prevail condition is required, then it must be achieved:
     1{f ∈ s0[v]} + Σ_{effects that add f} effect(a,v,e) ≥ prevail(a,v,f) / M
  4. A goal utility dependency k over goal set G_k is achieved iff its goals are achieved:
     goaldep(k) ≥ Σ_{f in dependency k} endvalue(v,f) - (|G_k| - 1)
     goaldep(k) ≤ endvalue(v,f)  ∀ f in dependency k
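Constraint 4's linearization can be sanity-checked in plain Python (an illustrative sketch, not the planner's code): for 0/1 endvalue assignments, the two inequalities pin goaldep(k) to exactly the conjunction of its goals.

```python
# For a dependency k over goals G_k, constraint 4 gives:
#   goaldep(k) >= sum(endvalues) - (|G_k| - 1)   (lower bound)
#   goaldep(k) <= endvalue(v,f) for every f      (i.e., <= min(endvalues))
from itertools import product

def dep_bounds(endvalues):
    """Linear bounds on goaldep(k) implied by constraint 4."""
    lower = sum(endvalues) - (len(endvalues) - 1)
    upper = min(endvalues)
    return lower, upper

# Over every 0/1 assignment to three goals, the only integer goaldep value
# satisfying both bounds is the logical AND of the goals.
for bits in product([0, 1], repeat=3):
    lo, hi = dep_bounds(list(bits))
    feasible = [d for d in (0, 1) if lo <= d <= hi]
    assert feasible == [int(all(bits))]
```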

  8. Objective Function: maximize net benefit
     MAX Σ_{v∈V, f∈D_v} utility(v,f) · endvalue(v,f) + Σ_{k∈K} utility(k) · goaldep(k) - Σ_{a∈A} cost(a) · action(a)
  The 1{f ∈ s0[v]} parameters in constraints 2 and 3 encode the state being evaluated, and are updated at each search node:
  2. 1{f ∈ s0[v]} + Σ_{effects that add f} effect(a,v,e) = Σ_{effects that delete f} effect(a,v,e) + endvalue(v,f)
  3. 1{f ∈ s0[v]} + Σ_{effects that add f} effect(a,v,e) ≥ prevail(a,v,f) / M
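For intuition, the objective can be evaluated on a candidate (possibly fractional, LP-relaxed) assignment with a small sketch; the names and toy numbers are illustrative, mirroring the slide-2 example's final state:

```python
# Net-benefit objective: goal utilities + dependency utilities - action costs,
# each weighted by the corresponding (0/1 or fractional) variable value.
def objective(utility_f, endvalue, utility_k, goaldep, cost, action):
    goal_part = sum(utility_f[vf] * endvalue[vf] for vf in endvalue)
    dep_part = sum(utility_k[k] * goaldep[k] for k in goaldep)
    cost_part = sum(cost[a] * action[a] for a in action)
    return goal_part + dep_part - cost_part

# Toy assignment corresponding to reaching state S3 of the example:
u_f = {("plane", "loc3"): 1000, ("person", "loc2"): 1000}
end = {("plane", "loc3"): 1, ("person", "loc2"): 1}
u_k = {"dep": 10}
dep = {"dep": 1}
cost = {"fly12": 150, "debark2": 1, "fly23": 100}
act = {"fly12": 1, "debark2": 1, "fly23": 1}
assert objective(u_f, end, u_k, dep, cost, act) == 1759
```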

  9. Search: Branch and Bound
  - Branch and bound with a time limit. All goals are soft, so all states are goal states; the search returns the best plan found (i.e., the best bound).
  - Greedy lookahead strategy, similar to YAHSP (Vidal, 2004), to quickly find good bounds.
  - LP-solution-guided relaxed plan extraction, to add informedness.
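The search skeleton might look as follows; this is a hedged sketch, not BBOP-LP's implementation, and `upper_bound` stands in for the optimistic value obtained from the LP relaxation:

```python
# Best-first branch and bound over states: every state is a goal state, so any
# popped state can update the incumbent; a node is pruned when its optimistic
# bound cannot beat the best net benefit found so far.
import heapq

def branch_and_bound(initial, successors, net_benefit, upper_bound, limit=10000):
    best_state, best_value = initial, net_benefit(initial)
    frontier = [(-upper_bound(initial), 0, initial)]
    tie = 1  # tie-breaker so heapq never compares states directly
    while frontier and limit > 0:
        limit -= 1
        neg_bound, _, state = heapq.heappop(frontier)
        if -neg_bound <= best_value:
            continue  # prune: bound cannot improve the incumbent
        if net_benefit(state) > best_value:
            best_state, best_value = state, net_benefit(state)
        for s in successors(state):
            heapq.heappush(frontier, (-upper_bound(s), tie, s))
            tie += 1
    return best_state, best_value

# Toy chain of states 0..5 with made-up net benefits and an admissible bound:
vals = [0, -1, 3, 2, 5, 1]
state, value = branch_and_bound(
    0,
    lambda n: [n + 1] if n < 5 else [],
    lambda n: vals[n],
    lambda n: max(vals[n:]),   # optimistic, like the LP relaxation's value
)
assert (state, value) == (4, 5)
```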

  10.-14. Getting a Relaxed Plan (five animation steps over one figure): a relaxed planning graph over the facts (at plane loc1), (at plane loc2), (at plane loc3), (at person loc2), (in person plane) and the actions (fly loc1 loc2), (fly loc1 loc3), (fly loc2 loc3), (fly loc3 loc2), (drop person loc2). Extraction walks backward from the goals, guided by the LP solution.
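The backward walk can be sketched as follows (assumed toy data structures, not the planner's code): when several actions can support an open subgoal, prefer one that carries positive value in the LP relaxation's solution.

```python
# LP-guided relaxed plan extraction: regress from the goals, supporting each
# open fact with the achiever the LP solution "used" (highest LP value).
def extract_relaxed_plan(goals, init, achievers, preconds, lp_value):
    plan, open_facts = [], [g for g in goals if g not in init]
    closed = set(init)
    while open_facts:
        fact = open_facts.pop()
        if fact in closed:
            continue
        closed.add(fact)
        # LP guidance: among possible achievers, take the highest LP value
        a = max(achievers[fact], key=lambda act: lp_value.get(act, 0.0))
        plan.append(a)
        open_facts.extend(p for p in preconds[a] if p not in closed)
    return plan

# Toy fragment of the slide's domain (illustrative names):
achievers = {"at-person-loc2": ["drop2"],
             "at-plane-loc2": ["fly12", "fly32"]}
preconds = {"drop2": ["at-plane-loc2", "in-person-plane"],
            "fly12": ["at-plane-loc1"],
            "fly32": ["at-plane-loc3"]}
lp = {"fly12": 1.0, "drop2": 1.0}   # fly32 has LP value 0, so it is skipped
plan = extract_relaxed_plan(["at-person-loc2"],
                            {"at-plane-loc1", "in-person-plane"},
                            achievers, preconds, lp)
assert set(plan) == {"drop2", "fly12"}
```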

  15. Experimental Setup
  - Three modified IPC-3 domains (maximize net benefit): zenotravel, satellite, and rovers, extended with action costs, goal utilities, and goal utility dependencies.
  - BBOP-LP: run with and without RP lookahead.
  - Compared with SPUDS, which uses a relaxed plan-based heuristic, and with an admissible cost propagation-based heuristic.
  - 600-second time limit.

  16. Results (figure: net benefit per problem for zenotravel, satellite, and rovers; higher net benefit is better; optimal solutions are marked). Found the optimal solution in 15 of 60 problems.

  17. Results

  18. Summary
  - A novel LP-based heuristic for partial satisfaction planning.
  - Branch and bound search with RP lookahead.
  - A planner that is sensitive to plan quality: BBOP-LP.
  Future work: improve the encoding; explore other lookahead methods.
