BDDs in Planning and General Game Playing - PowerPoint PPT Presentation

Presentation Transcript

  1. BDDs in Planning and General Game Playing
     Peter Kissmann and Stefan Edelkamp
     Graph Search Engineering, Schloss Dagstuhl, 2009

  2. Structure
     • BDDs
     • Symbolic Search
     • BDDs in Planning
       • Sequential Optimal Planning
       • Net-Benefit Planning
       • Conclusion
     • BDDs in General Game Playing
       • Solving Single-Player Games
       • Solving Two-Player Games
       • Results
       • Conclusion

  3. BDDs and Symbolic Search

  4. Binary Decision Diagrams (BDDs)
     • a good variable ordering is crucial
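
The effect of the variable ordering can be illustrated with a small experiment. The sketch below is not taken from the slides: it counts ROBDD nodes by enumerating truth tables (feasible only for a handful of variables), and the function name `robdd_node_count` and the example formula are assumptions chosen for illustration.

```python
from itertools import product

def robdd_node_count(f, order):
    """Count the internal (non-terminal) nodes of the ROBDD of f under `order`."""
    # Truth table of f; the first variable in `order` is the most significant bit.
    table = tuple(
        bool(f(dict(zip(order, bits))))
        for bits in product((False, True), repeat=len(order))
    )
    nodes = set()  # each distinct sub-table whose top variable matters is one node

    def visit(t):
        if len(t) == 1:          # terminal 0 or 1
            return
        half = len(t) // 2
        low, high = t[:half], t[half:]
        if low == high:          # top variable is irrelevant: reduction rule
            visit(low)
        elif t not in nodes:     # identical sub-functions are shared
            nodes.add(t)
            visit(low)
            visit(high)

    visit(table)
    return len(nodes)

def f(a):
    # (x1 and y1) or (x2 and y2) or (x3 and y3)
    return (a["x1"] and a["y1"]) or (a["x2"] and a["y2"]) or (a["x3"] and a["y3"])

interleaved = ["x1", "y1", "x2", "y2", "x3", "y3"]
separated = ["x1", "x2", "x3", "y1", "y2", "y3"]
print(robdd_node_count(f, interleaved))  # 6
print(robdd_node_count(f, separated))    # 14
```

With n pairs instead of three, the interleaved ordering needs 2n internal nodes while the separated one needs 2^(n+1) - 2, which is why the ordering matters so much in practice.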

  5. Symbolic Search
     • uses (Reduced Ordered) Binary Decision Diagrams ((RO)BDDs)
     • set-based search: sets of states and transitions represented as relations
     • unique representation
       • no duplicate elimination within a set required
     • layered exploration (e.g., BFS): duplicate elimination w.r.t. previous layers
     • advantages due to compressed representation:
       • saves RAM
       • might save time
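
A minimal sketch of the layered, set-based exploration described on this slide. Plain Python sets stand in for BDDs here, which is an assumption made for readability: a real symbolic search would hold each layer as a BDD and never enumerate single states. The name `layered_bfs` is hypothetical.

```python
def layered_bfs(initial_states, transitions):
    """Return the BFS layers reachable from `initial_states`.

    `transitions` is a set of (state, successor) pairs, i.e. the
    transition relation written out explicitly.
    """
    reached = set(initial_states)
    layers = [set(initial_states)]
    while True:
        frontier = layers[-1]
        # Expand the whole layer at once, not state by state.
        successors = {t for (s, t) in transitions if s in frontier}
        # Duplicate elimination against all previous layers.
        new_layer = successors - reached
        if not new_layer:
            return layers
        reached |= new_layer
        layers.append(new_layer)

transitions = {(0, 1), (0, 2), (1, 3), (2, 3), (3, 0)}
print(layered_bfs({0}, transitions))  # [{0}, {1, 2}, {3}]
```

Duplicates inside one layer never arise because a set holds each state once; only the subtraction of earlier layers is needed.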

  6. Symbolic Search
     • two sets of variables
       • S for current states
       • S’ for successor states
     • expansion of state sets (not single states) as relation
     • calculation of successors:
     • calculation of predecessors:
       • predecessors with at least one successor in states:
       • predecessors with all successors in states:
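
The formulas for these operations did not survive in the transcript. In BDD-based search they are commonly expressed by quantifying over one of the two variable sets: existentially for successors and for predecessors with at least one successor in the set, universally for predecessors with all successors in the set. The sketch below shows the same three operations on an explicit relation. The function names are assumptions, and so is the choice to exclude states without any successor from the strong pre-image.

```python
def image(trans, states):
    """Successors of `states` under the relation `trans` (set of (s, s') pairs)."""
    return {t for (s, t) in trans if s in states}

def pre_image(trans, states):
    """Predecessors with at least one successor in `states`."""
    return {s for (s, t) in trans if t in states}

def strong_pre_image(trans, states):
    """Predecessors all of whose successors lie in `states`.

    Assumption of this sketch: a state without successors is not a
    predecessor of anything and is therefore left out.
    """
    escaping = {s for (s, t) in trans if t not in states}
    return pre_image(trans, states) - escaping

trans = {(0, 1), (0, 2), (1, 3), (2, 3), (2, 4)}
print(image(trans, {0}))             # {1, 2}
print(pre_image(trans, {3}))         # {1, 2}
print(strong_pre_image(trans, {3}))  # {1}
```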

  7. BDDs in Planning

  8. Structure
     • Sequential Optimal Planning
       • Symbolic Algorithms
       • Competition Results (IPC-6)
     • Net-Benefit Planning
       • Symbolic Algorithms
       • Competition Results (IPC-6)
     • Conclusion

  9. Sequential Optimal Planning
     • Given: problem <S, O, I, T, c> with
       • S: set of states
       • O ⊆ S × S: operators (actions)
       • I ∈ S: initial state
       • T ⊆ S: terminal states
       • c: O → {1, …, C}: action costs
     • Aim: find a plan from the initial state to one of the terminal states
       • no action costs: minimal plan (in the plan’s length) → Symbolic (Bidirectional) BFS
       • with action costs: Symbolic A* (BDDA*)

  10. BDDA*
      [figure with axes h and g]

  11. Competition Results (IPC-6)
      • Extension (of Gamer(comp) to Gamer):
        • use of a hash map instead of a matrix for large action costs
        • the matrix became too large while being sparse

  12. Net-Benefit
      • challenge at IPC-6
      • total plan net-benefit = total achieved goal rewards - total action cost
      • transformation: goal rewards → costs for violating soft constraints
      • net-benefit = total violation cost + total action cost
        • to be minimized
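
The step from the maximized to the minimized form can be written out as follows. The notation is an assumption introduced here, not taken from the slides: r(g) is the reward of soft goal g, π the plan, c(a) the cost of action a, and the cost of violating a soft goal is assumed to equal its reward.

```latex
\begin{align*}
\text{net-benefit}
  &= \sum_{g \in G_{\text{achieved}}} r(g) \;-\; \sum_{a \in \pi} c(a) \\
  &= \Bigl(R - \sum_{g \in G_{\text{violated}}} r(g)\Bigr) - \sum_{a \in \pi} c(a) \\
  &= R - \Bigl(\sum_{g \in G_{\text{violated}}} r(g) + \sum_{a \in \pi} c(a)\Bigr),
  \qquad R = \sum_{g \in G} r(g).
\end{align*}
```

Since R is a constant of the problem, maximizing the original net-benefit is the same as minimizing the bracketed sum of violation cost and action cost.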

  13. Symbolic Branch-and-Bound Search
      • Symbolic Breadth-First Branch-and-Bound
        • by Jensen et al. 2006
        • cost-optimal BFS → ignores action costs
        • improves upper bound U
          • initially: sum of costs for violating all soft constraints + 1
          • can be represented by a BDD: disjunction of all values from 0 to U
      • Symbolic Cost-First Branch-and-Bound
        • expansion according to action costs, not BFS layers
        • action costs still not part of the objective function

  14. Symbolic Net-Benefit
      • adds total action costs to the objective function
        • net-benefit = (total action cost f) + (sum of costs for violated soft constraints)
      • total cost not bounded
        • no BDD representation
        • but: can use the cost-first search’s buckets
      • also stores current best net-benefit V
        • initialized to ∞

  15. Symbolic Net-Benefit
      • Algorithm:
        1. start with the initial state
        2. check whether there are goal states among the current states
           • take only goals with cost < U
           • find the goal with minimal cost U’ (and U’ + f < V) and calculate the plan
           • set U = U’, V = U’ + f
        3. calculate successors (image)
        4. sort successors into the corresponding buckets (f + 1, …, f + C)
        5. repeat from 2. until no new states are found (or all soft constraints are satisfied or total action cost ≥ V)
        6. return the last generated plan
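
The loop above can be sketched on explicit states as follows. This is an illustration under several assumptions, not the authors' implementation: sets of explicit states replace BDD buckets, the violation cost depends only on the state, action costs are integers of at least 1, and instead of a plan the sketch returns only the best goal state with its cost components. All names (`net_benefit_search`, `violation_cost`, `max_violation`) are hypothetical.

```python
import math
from collections import defaultdict

def net_benefit_search(initial, operators, is_goal, violation_cost, max_violation):
    """operators: iterable of (state, successor, cost) triples, cost >= 1."""
    U = max_violation + 1        # bound on the violation cost of acceptable goals
    V = math.inf                 # best net-benefit (violation + action cost) so far
    best = None                  # (goal state, action cost f, violation cost)
    buckets = defaultdict(set)   # total action cost f -> states reached with cost f
    buckets[0].add(initial)
    closed = set()
    f = 0
    while buckets and f < V and U > 0:
        states = buckets.pop(f, set()) - closed
        closed |= states
        # Only goals that improve on the violation bound U are considered.
        goals = [s for s in states if is_goal(s) and violation_cost(s) < U]
        if goals:
            g = min(goals, key=violation_cost)
            if violation_cost(g) + f < V:
                U = violation_cost(g)
                V = U + f
                best = (g, f, U)
        # Successors are sorted into the buckets f + 1, ..., f + C.
        for (s, t, c) in operators:
            if s in states and t not in closed:
                buckets[f + c].add(t)
        f += 1
    return best, V

operators = [("a", "b", 1), ("b", "c", 3), ("a", "d", 2)]
violations = {"a": 5, "b": 5, "c": 0, "d": 2}
result = net_benefit_search(
    "a", operators,
    is_goal=lambda s: s in {"b", "c", "d"},
    violation_cost=lambda s: violations[s],
    max_violation=5,
)
print(result)  # (('d', 2, 2), 4)
```

In the example, state `c` satisfies every soft constraint but costs 4 to reach, so it cannot beat the net-benefit of 4 already achieved by `d`, and the search stops once f reaches V.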

  16. Competition Results (IPC-6)
      • hsp*p: enumerates all possible soft constraint violations and runs an ordinary planner on each sub-instance
      • Mips XXL: external-memory algorithms

  17. Conclusion and Additional Remarks
      • new set-based algorithm for computing optimal net-benefit
      • covers cost-optimal search and over-subscribed planning with preferences
      • Gamer can handle 0-cost actions
        • additional BFS for 0-cost fixpoint calculation
      • extension to partial initial states

  18. BDDs in General Game Playing

  19. Structure
      • Solving Single-Player Games
      • Solving Two-Player Games
        • Zero-Sum Games
        • General Two-Player Turn-Taking Games
      • Results
      • Conclusion

  20. General Game Playing - Games
      • Given a description of a game that is
        • finite
        • discrete
        • deterministic
        • full information
      • Games can be
        • single-player or multi-player
        • simultaneous or turn-taking

  21. Solving Games
      • In General Game Playing, rewards for all players
        • range from 0 to 100 (higher = better)
        • are given only in goal states
      • Solving: find rewards for all states (in case of optimal play)
      • Solving to
        • analyze players
        • play optimally
        • use as endgame database (if not complete)

  22. Solving Single-Player Games
      • might use Planning technology, but
        • in Planning (as in General Game Playing), one is interested in searching only the necessary states
        • here: solve all states
      • approach:
        • calculate reachable states
        • start at goal states giving reward 100
        • apply backward BFS
        • remove all found states from reachable states
        • go to goal states giving reward 99 and repeat steps
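
The approach can be sketched on an explicit game graph as follows. Python sets stand in for BDDs, and the names `solve_single_player`, `moves` and `goal_reward` are assumptions made for this illustration.

```python
def solve_single_player(initial, moves, goal_reward):
    """Assign every reachable state its best achievable reward.

    moves: set of (state, successor) pairs.
    goal_reward: dict mapping terminal states to rewards in 0..100.
    """
    # Forward search: calculate the reachable states.
    reachable = {initial}
    frontier = {initial}
    while frontier:
        frontier = {t for (s, t) in moves if s in frontier} - reachable
        reachable |= frontier

    unsolved = set(reachable)
    value = {}
    # Start with reward 100, then 99, and so on down to 0.
    for reward in range(100, -1, -1):
        layer = {s for s in unsolved if goal_reward.get(s) == reward}
        while layer:                      # backward BFS from these goals
            for s in layer:
                value[s] = reward
            unsolved -= layer             # remove found states
            layer = {s for (s, t) in moves if t in layer and s in unsolved}
    return value

moves = {("s0", "a"), ("s0", "b"), ("a", "g100"), ("b", "g50"),
         ("b", "c"), ("c", "g0"), ("x", "g100")}
goal_reward = {"g100": 100, "g50": 50, "g0": 0}
value = solve_single_player("s0", moves, goal_reward)
print(value["s0"], value["b"], value["c"])  # 100 50 0
```

Because higher rewards are processed first and solved states are removed, each state keeps the best reward from which it can be reached backwards. The state `x` is never valued because it is not reachable from `s0`.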

  23. Solving 2-Player Zero-Sum Games
      [figure legend: player 0’s turn, player 1’s turn, lost for player 0, lost for player 1]
      • two backward searches (one for each player j ∈ {0, 1}):
        • start with goal states lost for player j
        • find all lost predecessors using two steps:
          • find preceding states where the opponent could move to a state lost for j (pre-image)
          • find preceding states where every one of j’s moves results in a state lost for j (strong pre-image)
        • repeat the double step until no new states are found
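
A minimal explicit-state sketch of one of the two backward searches. The data layout (`moves` as a set of pairs, `to_move` as a dictionary giving the player to move in each non-terminal state) and the function name `lost_states` are assumptions for illustration. A symbolic implementation would apply the pre-image and strong pre-image to BDDs instead.

```python
def lost_states(j, moves, to_move, lost_goals):
    """Return all states that are lost for player j under optimal play."""
    lost = set(lost_goals)
    while True:
        size_before = len(lost)
        # Step 1 (pre-image): the opponent is to move and has
        # at least one move into a state lost for j.
        lost |= {s for (s, t) in moves if to_move[s] != j and t in lost}
        # Step 2 (strong pre-image): j is to move and every
        # move leads into a state lost for j.
        can_escape = {s for (s, t) in moves if t not in lost}
        lost |= {s for (s, t) in moves if to_move[s] == j and t in lost} - can_escape
        if len(lost) == size_before:
            return lost

# s0: player 0 moves to a or b; a and b: player 1 moves.
# L0 / L1 are terminal states lost for player 0 / player 1.
moves = {("s0", "a"), ("s0", "b"), ("a", "L0"), ("a", "L1"), ("b", "L1")}
to_move = {"s0": 0, "a": 1, "b": 1}
print(sorted(lost_states(0, moves, to_move, {"L0"})))  # ['L0', 'a']
print(sorted(lost_states(1, moves, to_move, {"L1"})))  # ['L1', 'b', 's0']
```

In the example, player 0 wins from `s0` by moving to `b`, where player 1 has no choice but to move into `L1`.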

  24. Solving General 2-Player Turn-Taking Games
      • 101 × 101 matrix of BDDs
        • BDD at (i, j) represents states achieving reward i for player 0 and j for player 1 (in case of optimal play)
      • only 1 backward search
        • alternating between players within the loop

  25. Algorithm Outline
      • find all reachable states
      • initialize reward matrix with goal states
      • solved states: all states within matrix
      • while (not all states solved) do
        • for each player j ∈ {0, 1} do
          • find all solvable states of j (strongPreImage(solved))
          • solve these states (pre-image from matrix’s buckets)
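
The outline can be sketched on explicit states as below. This is an assumption-laden illustration rather than the authors' code: a dictionary keyed by reward pairs stands in for the 101 × 101 matrix of BDDs, all states occurring in `moves` are assumed to be reachable, and the preference between reward pairs is passed in as a `key` function (here: maximize the difference to the opponent's reward). All names are hypothetical.

```python
def solve_two_player(moves, to_move, goal_rewards, key):
    """moves: set of (state, successor); to_move: state -> 0 or 1;
    goal_rewards: terminal state -> (reward0, reward1);
    key(player, pair): larger means `pair` is preferred by `player`."""
    value = dict(goal_rewards)               # state -> (reward0, reward1)
    solved = set(goal_rewards)
    all_states = {s for move in moves for s in move} | solved
    while solved != all_states:
        progress = False
        for j in (0, 1):
            # Solvable states of j: j to move, all successors solved.
            not_ready = {s for (s, t) in moves if t not in solved}
            solvable = {s for (s, t) in moves
                        if to_move[s] == j and s not in solved} - not_ready
            for s in solvable:
                options = [value[t] for (x, t) in moves if x == s]
                value[s] = max(options, key=lambda pair: key(j, pair))
            solved |= solvable
            progress = progress or bool(solvable)
        if not progress:
            break                            # e.g. cycles, not handled here
    matrix = {}                              # (i, j) -> states with that outcome
    for s, pair in value.items():
        matrix.setdefault(pair, set()).add(s)
    return matrix

def difference(player, pair):
    return pair[player] - pair[1 - player]

moves = {("s0", "a"), ("s0", "b"), ("a", "t1"), ("a", "t2"), ("b", "t3")}
to_move = {"s0": 0, "a": 1, "b": 1}
goal_rewards = {"t1": (100, 0), "t2": (0, 100), "t3": (50, 50)}
matrix = solve_two_player(moves, to_move, goal_rewards, difference)
print(sorted(matrix[(50, 50)]))  # ['b', 's0', 't3']
```

Player 1 would answer `a` with the move to `t2`, so player 0 prefers `b` and the initial state ends up in bucket (50, 50).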

  26. Order to classify states
      [figure: two reward matrices with axes "own" and "opponent", each ranging from 0 to 100]
      • problem in the general case: order in which to classify states
        • maximize own reward (and minimize opponent’s)?
        • or maximize difference to opponent’s reward?
      • might change during one competition
      • we chose the second case for all examples
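
The two candidate orderings can lead to different choices, as the small sketch below shows. The function names and the example reward pairs are assumptions for illustration; a reward pair is written as (reward of player 0, reward of player 1).

```python
def own_first_key(player, pair):
    """Maximize own reward; among equal own rewards, minimize the opponent's."""
    own, opponent = pair[player], pair[1 - player]
    return (own, -opponent)

def difference_key(player, pair):
    """Maximize the difference between own and opponent's reward."""
    own, opponent = pair[player], pair[1 - player]
    return own - opponent

outcomes = [(50, 0), (60, 40)]   # possible outcomes for player 0 to choose from
best_own_first = max(outcomes, key=lambda p: own_first_key(0, p))
best_difference = max(outcomes, key=lambda p: difference_key(0, p))
print(best_own_first)   # (60, 40)
print(best_difference)  # (50, 0)
```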

  27. Example
      [figure: example game with player 0’s and player 1’s turns; states are labeled with reward pairs such as 0/1, 0/3, 2/0 and 3/1]

  28. Results (Reachability Analyses)
      • Single-Player Games:
      • Two-Player Games:

  29. Results (Peg Solitaire)
      • total #reachable: 375,110,246

  30. Results (Connect Four)
      • 85 bits to represent one state
        • 2 bits per cell (blank, red, yellow); 42 cells
        • 1 bit for active player
      • originally solved by Allis (’88)
      • estimate on total #states: 70,728,639,995,483 ≈ 70 × 10^12
      • complete reachability analysis using BDDs
        • 12 GB RAM
        • 2.67 GHz CPU
        • total time: 5:15 h
        • total #states: 4,531,985,219,092 ≈ 4.5 × 10^12
        • explicit representation: ≈ 43.5 TB
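
The size of an explicit representation can be checked with a short calculation. It assumes that each state is stored in exactly 85 tightly packed bits. The slide does not say which unit convention or rounding lies behind its figure, so the result below should be read as the same order of magnitude rather than an exact match.

```python
CELLS = 42
BITS_PER_CELL = 2                 # blank, red, yellow
bits_per_state = CELLS * BITS_PER_CELL + 1   # plus 1 bit for the active player

reachable_states = 4_531_985_219_092
total_bytes = reachable_states * bits_per_state / 8

tib = total_bytes / 2**40         # binary terabytes (TiB)
tb = total_bytes / 10**12         # decimal terabytes (TB)
print(bits_per_state)             # 85
print(round(tib, 1), round(tb, 1))
```

Under these assumptions the explicit storage comes to roughly 43.8 TiB (about 48 decimal TB), compared with the 12 GB of RAM reported for the BDD-based analysis.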

  31. Results (Two-Player Games)

  32. Conclusion
      • Solving single-player games and two-player zero-sum games is fairly easy
      • Solving general two-player games is more involved
        • first approach (Planning & Games Workshop 2007) very slow
        • current one needs a linear number of pre-images
      • for playing still too slow
        • UCT to get good estimates faster
        • UCT works well with endgame databases
      • BDDs for the complete state space can be used as perfect hash functions