
Concurrent Reachability Games

Concurrent Reachability Games. Peter Bro Miltersen, Aarhus University. My apologies for not getting slides ready in time for inclusion in the booklet! Slides available at http://www.daimi.au.dk/~bromille.



Presentation Transcript


  1. Concurrent Reachability Games. Peter Bro Miltersen, Aarhus University. CTW 2009

  2. My apologies… • For not getting slides ready in time for inclusion in the booklet! • Slides available at http://www.daimi.au.dk/~bromille

  3. Concurrent reachability games • Class of two-player zero-sum games generalizing simple stochastic games (Uri’s talk yesterday). • Studied mainly by the formal methods (“Eurotheory”) community (but sometimes at such venues as FOCS and SODA). • Very interesting and challenging algorithmic problems!

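  4. Slide stolen from Uri… Simple stochastic games (SSGs), reachability version [Condon (1992)]. [Figure: game graph with MAX, min and RAND (R) vertices, a min-sink and a MAX-sink; RAND vertices pick each successor with probability 1/2; label ZP’96.] • Two players: MAX and min. • Objective: MAX/min the probability of getting to the MAX-sink.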

  5. Another slide stolen from Uri… Simple stochastic games (SSGs): Strategies • A general strategy may be randomized and history dependent. • A positional strategy is deterministic and history independent. • Positional strategy for MAX: choice of an outgoing edge from each MAX vertex.

  6. Last slide stolen from Uri (I promise!) Simple stochastic games (SSGs): Values • Every vertex i in the game has a value v_i. [Formula: v_i written as a max/min over the players' strategies, annotated “general” and “positional” for each player.] • Both players have positional optimal strategies. • There are strategies that are optimal for every starting position.

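  7. Concurrent Reachability Games. Simple stochastic games (SSGs), reachability version [Condon (1992)]. [Figure: game graph with MAX, min and RAND (R) vertices, a min-sink and a MAX-sink; RAND vertices pick each successor with probability 1/2; label ZP’96.] • Two players: MAX and min. • Objective: MAX/min the probability of getting to the MAX-sink.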

  8. (Simple) concurrent reachability game • Arena: • Finite directed graph. • One Max sink (“goal”) node. • Each non-sink node has assigned a 2x2 matrix of outgoing arcs. • Play: • A pebble moves from node to node as in a simple stochastic game. • In each step, Max chooses a row and Min simultaneously chooses a column of the matrix. • The pebble moves along the appropriate arc. • If Max reaches the goal node, he wins. • If this never happens, Min wins.

  9. Simulation: MAX [Figure: gadget simulating a MAX vertex.]

  10. Simulation: min [Figure: gadget simulating a min vertex.]

  11. Simulation: R [Figure: gadget simulating a random vertex with probabilities 1/2 and 1/2.] … Somewhat more subtle that this works!

  12. “Proof” of correctness • We want values in the CRG to be the same as in the SSG. • In particular, the value of the node simulating a coin toss should be the average of the values of the two nodes it points to. • If these two values are the same, this is “clearly” the case. • If they have different values v1, v2, the simulated coin toss node is a game of Matching Pennies with payoffs v1, v2. This game has value (v1+v2)/2.
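The last step of slide 12 can be checked by hand. The following is a sketch that assumes a particular layout of the gadget (not shown in the transcript): Max picks a row, Min picks a column, and the diagonal arcs lead to the node of value $v_1$ while the off-diagonal arcs lead to the node of value $v_2$.

```latex
\[
A = \begin{pmatrix} v_1 & v_2 \\ v_2 & v_1 \end{pmatrix}
\]
% Max plays each row with probability 1/2: against either column he gets
\[
\tfrac{1}{2} v_1 + \tfrac{1}{2} v_2 = \frac{v_1 + v_2}{2}.
\]
% Min plays each column with probability 1/2: against either row she concedes
\[
\tfrac{1}{2} v_1 + \tfrac{1}{2} v_2 = \frac{v_1 + v_2}{2}.
\]
% The two guarantees coincide, hence
\[
\operatorname{val}(A) = \frac{v_1 + v_2}{2}.
\]
```

Under this assumed layout, uniform mixing by either player pins the payoff to the average, which is exactly what a coin toss node would give.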

  13. Simple stochastic games (SSGs): Values. Concurrent reachability games (CRGs). • Every vertex i in the game has a value v_i. [Formula: v_i written as a max/min over the players' strategies, annotated “general” and “positional” for each player.] • Both players have positional optimal strategies. • There are strategies that are optimal for every starting position.

  14. Simple stochastic games (SSGs): Values. Concurrent reachability games (CRGs). • Every vertex i in the game has a value v_i. [Formula: v_i written as a sup/inf over the players' strategies, annotated “general” and “stationary” for each player.] • Both players have stationary optimal strategies. • There are strategies that are optimal for every starting position. • Stationary: as positional, except that we allow randomization.

  15. Why randomized strategies? • 0-1 matrix games can be immediately simulated. [Figure: gadget with a min-sink and a MAX-sink.]

  16-17. Why sup/inf instead of max/min? [Figure: game with a min-sink and a MAX-sink.]

  18. Why sup/inf instead of max/min • “Conditionally repeated matching pennies”: • Min hides a penny. • Max tries to guess if it is heads up or tails up. • If Max guesses correctly, he gets the penny. • If Max incorrectly guesses tails, he loses (goes into min-sink/trap). • If Max incorrectly guesses heads, the game repeats. • What is the value of this game? 1.

  19. Almost optimal strategy for Max • Guess “heads” with probability 1-ε and “tails” with probability ε (every time). • Guaranteed to win with probability 1-ε. • But no strategy of Max wins with probability 1.
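The 1-ε guarantee can be checked numerically. The sketch below assumes Min answers with a stationary strategy (hiding heads with probability `q` every round); the function name and the grid over `q` are illustrative choices, not from the slides.

```python
def max_win_probability(eps, q):
    """Probability that Max eventually wins when he guesses heads with
    probability 1 - eps every round and Min hides heads with probability q.
    Assumes eps > 0, so the game ends with probability 1."""
    win = q * (1 - eps) + (1 - q) * eps   # Max guesses correctly this round
    lose = q * eps                        # Min hid heads, Max guessed tails
    # With the remaining probability (1 - q) * (1 - eps) Max wrongly guessed
    # heads and the round repeats, so we condition on the game ending.
    return win / (win + lose)


eps = 0.01
# Worst case for Max over a grid of stationary strategies for Min.
worst = min(max_win_probability(eps, k / 100) for k in range(101))
print(worst)
```

On this grid the worst case is attained when Min always hides heads, where Max wins exactly when he guesses heads, that is, with probability 1-ε.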

  20. Values and near-optimal strategies • Each position in a concurrent reachability game has a value. • For any ε>0, each player has a stationary strategy guaranteeing the value within ε (an ε-optimal strategy). • Shown in Everett, “Recursive games”, 1953.

  21. Algorithmic problems • Qualitatively solving a CRG. • Determining which nodes have value 1. • Quantitatively solving a CRG. • Approximately computing the values of the nodes. • Strategically solving a CRG. • Computing an ε-optimal stationary strategy for a given ε.

  22. Qualitatively solving CRGs • De Alfaro, Henzinger, Kupferman, FOCS 1998. • Beautiful algorithm! • Formal methods community type algorithm! • Fixed point computation inside a fixed point computation inside a fixed point computation… • Runs in time O(n^2). • Open (I think): Can this time bound be improved? (For SSGs the corresponding time is linear.)

  23. Quantitatively solving CRGs • We want to approximate the values of the positions. • Why not compute them exactly?

  24. The value of a CRG may be irrational! [Figure: example game from Ferguson, Game Theory.] • Positive payoffs different from 1 can be simulated with scaling and coin toss gadgets. • Negative payoffs are harder to simulate, but in this game we can do it by adding a constant to all payoffs.

  25. Quantitatively solving CRGs • We want to approximate the values of the positions. • Why not compute them exactly? • Maybe we want to look at the decision problem consisting of comparing the value to a given rational?

  26. SUM-OF-SQRT hardness • SUM-OF-SQRT: Given an expression E which is a weighted (by integers) sum of square roots (of integers), does E evaluate to a positive number? • Not known to be in P or NP or even the polynomial hierarchy (open at least since Garey and Johnson). • Etessami and Yannakakis, 2005: Comparing the value of a CRG to a rational number is hard for SUM-OF-SQRT.
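To make the problem statement concrete, here is a small sketch that evaluates such an expression numerically with Python's `decimal` module. It is a heuristic, not a decision procedure: the function name, the default of 50 digits and the error tolerance are all assumptions chosen for illustration, and the difficulty of SUM-OF-SQRT is precisely that no polynomial bound on the required precision is known.

```python
from decimal import Decimal, localcontext


def sum_of_sqrt_sign(terms, digits=50):
    """terms: list of (integer weight, non-negative integer radicand).
    Returns 1 or -1 if the sign of sum(w * sqrt(a)) is clear at the given
    working precision, and None if the sum is too close to zero to call."""
    with localcontext() as ctx:
        ctx.prec = digits
        total = sum((Decimal(w) * Decimal(a).sqrt() for w, a in terms), Decimal(0))
        magnitude = sum((abs(Decimal(w)) * Decimal(a).sqrt() for w, a in terms), Decimal(0))
        # Crude, hypothetical safety margin for accumulated rounding error.
        tolerance = magnitude * Decimal(10) ** (5 - digits)
        if total > tolerance:
            return 1
        if total < -tolerance:
            return -1
        return None


print(sum_of_sqrt_sign([(1, 2), (1, 3), (-1, 10)]))  # sqrt(2) + sqrt(3) - sqrt(10)
print(sum_of_sqrt_sign([(1, 4), (-2, 1)]))           # sqrt(4) - 2*sqrt(1), exactly zero
```

The second call returns `None` at every precision, which illustrates why numerical evaluation alone cannot settle the question.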

  27. Sketch of Proof • We already saw how to make games whose values are the solution to certain quadratic equations, i.e., square roots + rationals. • Once we have a bunch of such games, we can easily make a game whose value is the average by a “coin toss gadget”.

  28. Quantitatively solving CRGs • We want to approximate the values of the positions. • Why not compute them exactly? • Maybe we want to compare the value to a given rational? • Given ε, we want to compute an approximation within ε.

  29. Value iteration • Assign all nodes “value approximation” 0. • Replace pointers with value approximations. Each node is now a matrix game. • Solve and replace approximations. • Theorem: Value approximations converge to values (from below). • Proof sketch: The value approximations are the exact values of a time limited version of the game. • How long does it take to get within 0.01 of the actual values? • Even for SSGs this takes exponential time (Condon’93). • For CRGs, an open problem until recently (see later).
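The steps of slide 29 can be sketched in a few lines of Python. The representation is an assumption made for illustration: a game is a dict from node name to a 2x2 matrix of successor names, `"goal"` is the Max sink with value 1, and `"trap"` is a min-sink treated as having value 0. The example game is the conditionally repeated matching pennies of slide 18.

```python
def matrix_game_value(a, b, c, d):
    """Value of the 2x2 zero-sum game [[a, b], [c, d]]; the row player maximizes."""
    lower = max(min(a, b), min(c, d))   # best pure guarantee for the row player
    upper = min(max(a, c), max(b, d))   # best pure guarantee for the column player
    if lower >= upper:                  # saddle point in pure strategies
        return lower
    return (a * d - b * c) / (a + d - b - c)  # both players must mix


def value_iteration(game, steps):
    """Run `steps` rounds of value iteration, starting from approximation 0."""
    values = {node: 0.0 for node in game}
    values["goal"] = 1.0
    values["trap"] = 0.0
    for _ in range(steps):
        new_values = dict(values)
        for node, ((n00, n01), (n10, n11)) in game.items():
            # Replace pointers with current approximations and solve the matrix game.
            new_values[node] = matrix_game_value(
                values[n00], values[n01], values[n10], values[n11])
        values = new_values
    return values


# Rows: Max guesses heads / tails. Columns: Min hides heads / tails.
pennies = {"s": (("goal", "s"), ("trap", "goal"))}
for t in (1, 2, 3, 99):
    print(t, value_iteration(pennies, t)["s"])
```

For this game the update is v ← 1/(2 - v), so after t steps the approximation is t/(t+1): it converges to the value 1 from below, and 99 steps are needed to get within 0.01.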

  30. Another algorithm for approximating values • The property of being a number larger or smaller than the value of a CRG can be expressed by a polynomial length formula in the existential first order theory of the reals. • There exists a stationary strategy such that… • As a corollary to Renegar’89, approximating the value is in PSPACE. • This is the best known “complexity class” upper bound! • … also the best known concrete “big-O” complexity bound (using Basu et al instead of Renegar).

  31. Why no NP ∩ coNP upper bound? • Guess a strategy and verify that it works? • Chatterjee, Majumdar, Jurdzinski, On Nash equilibria in stochastic games, CSL’04 claims such a result. • In 2007, Kousha Etessami found a technical issue in the proof and the authors retracted the claim.

  32. Computing values vs. finding strategies • It is not obvious that computing the values gives any information about the strategies. • In contrast, for SSGs, optimal strategies can be computed from values in linear time (Andersson and M., ISAAC’09). [Figure: game with a MAX-sink.]

  33. Algorithms strategically solving concurrent reachability games • Chatterjee, de Alfaro, Henzinger. Strategy improvement for concurrent reachability games. QEST’06. • Chatterjee, de Alfaro, Henzinger. Termination criteria for solving concurrent safety and reachability games. SODA’09. • Policy improvement! No time bounds given…

  34. “Hardness” of solving CRGs Theorem [Hansen, Koucky and M., LICS’09]: • Any algorithm that manipulates ε-optimal strategies of concurrent reachability games must use exponential space (so no NP ∩ coNP algorithm comes from guessing strategies). • Value iteration requires worst case doubly exponential time to come within non-trivial distance of actual values (in contrast, value iteration on SSGs converges in only exponential time).

  35. Dante in Purgatory • Purgatory has 7 terraces. • Dante enters Purgatory at terrace 1. [Figure: terraces numbered 1 to 7.]

  36. Dante in Purgatory • While in Purgatory, once a second, Dante must play Matching Pennies with Lucifer. [Figure: terraces numbered 1 to 7.]

  37-48. Dante in Purgatory • If Dante wins, he proceeds to the next terrace. [Figure: animation frames over terraces numbered 1 to 7.]

  49-50. Dante in Purgatory • If Dante wins Matching Pennies at terrace 7, he wins the game of Purgatory. [Figure: terraces numbered 1 to 7.]
