1 / 36

360 likes | 489 Views

Uri Zwick Tel Aviv University. Simple Stochastic Games Mean Payoff Games Parity Games. TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.: A A A A. Simple Stochastic Games. Mean Payoff Games. Parity Games. Randomized subexponential algorithm for SSG.

Download Presentation
## Uri Zwick Tel Aviv University

**An Image/Link below is provided (as is) to download presentation**
Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.
Content is provided to you AS IS for your information and personal use only.
Download presentation by click this link.
While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.
During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

**Uri ZwickTel Aviv University**Simple Stochastic GamesMean Payoff GamesParity Games TexPoint fonts used in EMF. Read the TexPoint manual before you delete this box.: AAAA**Simple Stochastic Games**Mean Payoff Games Parity Games Randomizedsubexponential algorithm for SSG Deterministicsubexponential algorithm for PG**Simple Stochastic Games**Mean Payoff Games Parity Games**R**R R R A simple Simple Stochastic Game**min-sink**MAX-sink Simple Stochastic game (SSGs)Reachability version[Condon (1992)] R min MAX RAND Two Players: MAX and min Objective:MAX/min the probability of getting to the MAX-sink**Simple Stochastic games (SSGs)Strategies**A generalstrategy may be randomized and history dependent A positional strategy is deterministicand history independent Positionalstrategy for MAX: choice of an outgoing edge from each MAX vertex**Simple Stochastic games (SSGs)Values**Every vertex i in the game has a valuevi general positional general positional Both players have positionaloptimal strategies There are strategies that are optimal for every starting position**Simple Stochastic game (SSGs)[Condon (1992)]**Terminating binary games The outdegrees of all non-sinks are 2 All probabilities are ½. The game terminates with prob. 1 Easy reduction from general gamesto terminating binary games**“Solving” terminating binary SSGs**The values vi of the vertices of a game are the unique solution of the following equations: The values are rational numbersrequiring only a linear number of bits Corollary: Decision version in NP co-NP**Value iteration (for binary SSGs)**Iterate the operator: Converges to the unique solution But, may require an exponentialnumber of iterations to get close**Simple Stochastic game (SSGs)Payoff version[Shapley (1953)]**R min MAX RAND Limiting average version Discounted version**Markov Decision Processes (MDPs)**R min MAX RAND Theorem:[Epenoux (1964)] Values and optimal strategies of a MDP can be found by solving an LP**SSG NP co-NP – Another proof**Deciding whether the value of a game isat least (at most) v is in NP co-NP To show that value v,guess an optimal strategy for MAX Find an optimal counter-strategy for min by solving the resulting MDP. Is the problem in P ?**Mean Payoff Games (MPGs)[Ehrenfeucht, Mycielski (1979)]**R min MAX RAND Non-terminating version Discounted version ReachabilitySSGs PayoffSSGs MPGs Pseudo-polynomial algorithm (PZ’96)**Mean Payoff Games (MPGs)[Ehrenfeucht, Mycielski (1979)]**Again, both players have optimal positional strategies. Value(σ,) – average of cycle formed**Selecting the second largest element with only four storage**locations [PZ’96]**Parity Games (PGs) A simple example**Priorities 2 3 2 1 4 1 EVEN wins if largest priorityseen infinitely often is even**8**3 ODD EVEN Parity Games (PGs) EVEN wins if largest priorityseen infinitely often is even Equivalent to many interesting problemsin automata and verification: Non-emptyness of -tree automata modal -calculus model checking**8**3 ODD EVEN Parity Games (PGs) Mean Payoff Games (MPGs) [Stirling (1993)] [Puri (1995)] Replace priority k by payoff (n)k Move payoffs to outgoing edges**Switches**…**Strategy/Policy Iteration**Start with some strategy σ (of MAX) While there are improving switches, perform some of them As each step is strictly improving and as there is a finite number of strategies, the algorithm must end with an optimal strategy SSG PLS (Polynomial Local Search)**Strategy/Policy IterationComplexity?**Performing only one switch at a time may lead to exponentially many improvements,even for MDPs [Condon (1992)] What happens if we perform all profitable switches[Hoffman-Karp (1966)] ??? Not known to be polynomialO(2n/n) [Mansour-Singh (1999)] No non-linear examples2n-O(1) [Madani (2002)]**A randomized subexponential algorithm for simple stochastic**games**Arandomizedsubexponentialalgorithm for binary SSGs[Ludwig**(1995)][Kalai (1992)] [Matousek-Sharir-Welzl (1992)] Start with an arbitrary strategy for MAX Choose a random vertex iVMAX Find the optimal strategy ’ for MAX in the gamein which the only outgoing edge of i is (i,(i)) If switching ’ at i is not profitable, then ’ is optimal Otherwise, let (’)i and repeat**Arandomizedsubexponentialalgorithm for binary SSGs[Ludwig**(1995)][Kalai (1992)] [Matousek-Sharir-Welzl (1992)] MAX vertices All correct ! Would never be switched ! There is a hidden order of MAX vertices under which the optimal strategy returned by the first recursive call correctly fixes the strategy of MAX at vertices 1,2,…,i**The hidden order**ui(σ)- the maximum sum of values of a strategy of MAX that agrees with σ on i**The hidden order**Order the vertices such that Positions 1,..,iwere switchedand would neverbe switched again**SSGs are LP-type problems[Halman (2002)]**General (non-binary) SSGs can be solved in time Independently observed by[Björklund-Sandberg-Vorobyov (2005)] AUSO – Acyclic Unique Sink Orientations**SSGs GPLCP[Gärtner-Rüst**(2005)][Björklund-Svensson-Vorobyov (2005)] GPLCPGeneralized Linear ComplementaryProblem with a P-matrix**A deterministic subexponential algorithm for parity games**Mike PatersonMarcin JurdzinskiUri Zwick**Parity Games (PGs) A simple example**Priorities 2 3 2 1 4 1 EVEN wins if largest priorityseen infinitely often is even**8**3 ODD EVEN Parity Games (PGs) Mean Payoff Games (MPGs) [Stirling (1993)] [Puri (1995)] Replace priority k by payoff (n)k Move payoffs to outgoing edges**Exponential algorithm for PGs[McNaughton (1993)] [Zielonka**(1998)] Vertices of highest priority(even) Firstrecursivecall Vertices from whichEVEN can force thegame to enter A Lemma: (i) (ii)**Exponential algorithm for PGs[McNaughton (1993)] [Zielonka**(1998)] Second recursivecall In the worst case, both recursive calls are on games of size n1**Deterministic subexponential alg for PGsJurdzinski,**Paterson, Z (2006) Idea:Look for small dominions! Second recursivecall Dominions of size s can be found in O(ns) time Dominion Dominion: A (small) set from which one of the players can win without the play ever leaving this set**Open problems**• Polynomial algorithms? • Is the Policy Improvement algorithm polynomial? • Faster subexponential algorithmsfor parity games? • Deterministic subexponential algorithmsfor MPGs and SSGs? • Faster pseudo-polynomial algorithmsfor MPGs?

More Related