
Partial-Observation Stochastic Games: How to Win when Belief Fails
Laurent Doyen (LSV, ENS Cachan & CNRS) & Krishnendu Chatterjee (IST Austria)
GT Jeux, September 20th, 2012

  1. Partial-Observation Stochastic Games: How to Win when Belief Fails Laurent Doyen (LSV, ENS Cachan & CNRS) & Krishnendu Chatterjee (IST Austria) GT Jeux, September 20th, 2012

  2. Outline • Game model: example • Challenges & Results: examples • Solution insights: examples

  3. Examples • Poker • - partial-observation • - stochastic

  4. Examples • Poker • - partial-observation • - stochastic • Bonneteau

  5. Bonneteau Two black cards, one red card. Initially, all cards are face down. Goal: find the red card.

  6. Bonneteau Two black cards, one red card. Initially, all cards are face down. Goal: find the red card. • Rules: • Player 1 points at a card • Player 2 flips one remaining black card • Player 1 may change his mind, and wins if the pointed card is red

  9. Bonneteau: Game Model

  11. Game Model

  16. Observations (for player 1)

  21. Observation-based strategy This strategy is observation-based: the action it plays depends only on the sequence of observations seen so far.

  23. Optimal? This strategy is winning with probability 2/3.
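The 2/3 above can be checked by enumerating the three possible positions of the red card. A minimal Python sketch (not from the slides; the 0/1/2 card encoding and the assumption that player 1 initially points at card 0 and always switches are illustrative):

```python
def play(red, pointed=0):
    """One round of Bonneteau with the 'change your mind' strategy.
    red: position of the red card among cards 0, 1, 2."""
    # Player 2 flips one remaining black card (neither pointed nor red).
    flipped = next(c for c in range(3) if c != pointed and c != red)
    # Player 1 changes his mind: switch to the card that is
    # neither the pointed one nor the flipped one.
    final = next(c for c in range(3) if c != pointed and c != flipped)
    return final == red

wins = sum(play(red) for red in range(3))
print(f"switch strategy wins {wins}/3 placements")  # prints 2/3
```

Whatever player 2 flips, switching loses only when the initially pointed card was already red, which happens in 1 of 3 placements.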

  24. Example • This game is: • turn-based • (almost) non-stochastic • player 2 has perfect observation

  25. Interaction General case: concurrent & stochastic. Players choose their moves simultaneously and independently.

  26. Interaction General case: concurrent & stochastic. The chosen moves determine a probability distribution on the successor state.

  27. Interaction Special case: turn-based games, with • player-1 states, where only player 1's move matters • player-2 states, where only player 2's move matters

  28. Partial-observation General case: 2-sided partial observation. Two partitions of the state space, one for each player. Observations: the partitions induced by a coloring of the states.

  30. Partial-observation Observations: partitions induced by coloring. Special case: 1-sided partial observation, where one of the players has perfect observation (his partition consists of singletons).

  31. Strategies & objective A strategy for a player is a function that maps histories (sequences of observations) to probability distributions over actions.

  32. Strategies & objective A strategy for a player is a function that maps histories (sequences of observations) to probability distributions over actions: strategies are randomized and history-dependent.

  33. Strategies & objective Reachability objective: reach a given set of target states. The winning probability of a strategy is the probability that the play reaches the target when that strategy is followed.
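The strategy definition above can be typed out directly. A sketch (the uniform strategy is a toy assumption, not an example from the talk):

```python
import random

def uniform_strategy(history):
    """A randomized, history-dependent strategy: maps a sequence of
    observations to a probability distribution over actions. This toy
    instance ignores the history and plays uniformly."""
    return {"a": 0.5, "b": 0.5}

def sample_action(strategy, history, rng):
    # Sample an action according to the distribution the strategy
    # returns for this observation history.
    dist = strategy(history)
    actions, weights = zip(*dist.items())
    return rng.choices(actions, weights=weights)[0]

rng = random.Random(0)
action = sample_action(uniform_strategy, ("white", "blue"), rng)
```

A pure strategy is the special case where every returned distribution puts probability 1 on a single action.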

  34. Qualitative analysis The following problem is undecidable: (already for probabilistic automata [Paz71]) Decide if there exists a strategy for player 1 that is winning with probability at least ½. [Paz71] Paz. Introduction to Probabilistic Automata. Academic Press 1971.

  35. Qualitative analysis Since the quantitative problem is undecidable, consider qualitative analysis: • Almost-sure: decide if there exists a strategy for player 1 that is winning with probability 1 • Positive: decide if there exists a strategy for player 1 that is winning with probability > 0
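For intuition only (a one-player, perfect-observation simplification, not the talk's setting): with a single player who sees everything, positive reachability reduces to plain graph reachability, because any available edge can be taken with positive probability. A sketch:

```python
from collections import deque

def positively_winnable(edges, start, target):
    """Positive reachability in the one-player, perfect-observation
    case: the target is reached with probability > 0 iff it is
    reachable in the underlying graph, so a BFS suffices."""
    seen, queue = {start}, deque([start])
    while queue:
        s = queue.popleft()
        if s == target:
            return True
        for t in edges.get(s, ()):
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return False

edges = {"s0": ["s1"], "s1": ["s0", "goal"], "dead": []}
```

Under partial observation and an adversarial player 2, no such simple graph criterion applies, which is what makes the problems in this talk hard.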

  36. Applications in verification • Control with inaccurate digital sensors (e.g. tank control) • Multi-process control with private variables • Multi-agent protocols • Planning with uncertainty/unknown

  void main () {
      int got_lock = 0;
      do {
  1:      if (*) {
  2:          lock ();
  3:          got_lock++;
          }
  4:      if (got_lock != 0) {
  5:          unlock ();
          }
  6:      got_lock--;
      } while (*);
  }

  void lock () { assert(L == 0); L = 1; }
  void unlock () { assert(L == 1); L = 0; }

  37. Outline • Game Model: example • Challenges & Results: examples • Solution insights: examples

  38. Randomization is necessary Player 1 partial, player 2 perfect

  39. Randomization is necessary Player 1 partial, player 2 perfect. No pure strategy of Player 1 is winning with probability 1 (example from [CDHR06]).

  41. Memory and Randomization Player 1 partial, player 2 perfect. Player 1 wins with probability 1, and needs randomization. Belief-based randomized strategies are sufficient.
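A "belief" here is the set of states player 1 considers possible given what he has observed. A sketch of the standard belief update (subset construction); `post` and `obs_of` are assumed helper callbacks, not names from the talk:

```python
def update_belief(belief, action, observation, post, obs_of):
    """After playing `action` and receiving `observation`, keep every
    successor of a belief state whose observation matches what was seen."""
    return frozenset(
        t
        for s in belief
        for t in post(s, action)          # possible successors
        if obs_of(t) == observation       # consistent with what was seen
    )

# Tiny hypothetical transition structure and state coloring:
transitions = {("s0", "a"): {"s1", "s2"}, ("s1", "a"): {"s1"}, ("s2", "a"): {"s2"}}
colors = {"s0": "white", "s1": "blue", "s2": "blue"}

b1 = update_belief(frozenset({"s0"}), "a", "blue",
                   lambda s, a: transitions.get((s, a), set()),
                   lambda s: colors[s])
# b1 == {"s1", "s2"}: both successors look "blue", so player 1
# cannot tell them apart.
```

A belief-based strategy chooses its action distribution as a function of this set alone; the talk's title refers to games where this is not enough.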

  42. Example 2 Player 1 partial, player 2 perfect

  43. Example 2 Player 1 partial, player 2 perfect. To win with probability 1, player 1 needs to observe his own actions (example from [CDH10]). This requires randomized action-visible strategies: the strategy may depend on the actions player 1 has played.

  44. Classes of strategies Classification according to the power of strategies: • pure • rand. action-invisible • rand. action-visible

  45. Classes of strategies Classification according to the power of strategies: pure, rand. action-invisible, rand. action-visible. The model of rand. action-invisible is more general: there is a poly-time reduction from the decision problem for rand. action-visible to the one for rand. action-invisible.

  46. Classes of strategies For each class (pure, rand. action-invisible, rand. action-visible), two questions: computational complexity (algorithms) and strategy complexity (memory).

  47. Known results Reachability - Memory requirement (for player 1)

  48. Known results Reachability - Memory requirement (for player 1) [BGG09] Bertrand, Genest, Gimbert. Qualitative Determinacy and Decidability of Stochastic Games with Signals. LICS’09. [CDHR06] Chatterjee, Doyen, Henzinger, Raskin. Algorithms for ω-Regular games with Incomplete Information. CSL’06. [GS09] Gripon, Serre. Qualitative Concurrent Stochastic Games with Imperfect Information. ICALP’09.

  50. About beliefs Three prevalent beliefs: • Belief is sufficient. • Randomized action-invisible and action-visible strategies are almost the same. • In the general (2-sided) case, the memory needed is similar to the one-sided case (or at most an exponential blow-up).
