Uri Zwick, Tel Aviv University

Uri Zwick - PowerPoint PPT Presentation


Presentation Transcript
Uri Zwick
Tel Aviv University

Simple Stochastic Games
Mean Payoff Games
Parity Games



Simple Stochastic Games

Mean Payoff Games

Parity Games

Randomized subexponential algorithm for SSGs

Deterministic subexponential algorithm for PGs



Simple Stochastic Games

Mean Payoff Games

Parity Games


A simple Simple Stochastic Game

[Figure: example game graph with four vertices labeled R]


Simple Stochastic Games (SSGs): Reachability version [Condon (1992)]

[Figure; labels: min-sink, MAX-sink, R, min, MAX, RAND]

Two players: MAX and min

Objective: MAX maximizes and min minimizes the probability of getting to the MAX-sink


Simple Stochastic Games (SSGs): Strategies

A general strategy may be randomized and history dependent

A positional strategy is deterministic and history independent

Positional strategy for MAX: choice of an outgoing edge from each MAX vertex


Simple Stochastic Games (SSGs): Values

Every vertex i in the game has a value v_i

[Equation; labels: general, positional, general, positional]

Both players have positional optimal strategies

There are strategies that are optimal for every starting position


Simple Stochastic Games (SSGs) [Condon (1992)]

Terminating binary games

The outdegrees of all non-sinks are 2

All probabilities are ½.

The game terminates with prob. 1

Easy reduction from general games to terminating binary games


“Solving” terminating binary SSGs

The values v_i of the vertices of a game are the unique solution of the following equations:

The values are rational numbers requiring only a linear number of bits

Corollary: Decision version in NP ∩ co-NP
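A sketch of the standard system of equations for a terminating binary SSG in the reachability version. The successor names j and k are notation introduced here, and the exact form shown on the slide is an assumption:

```latex
v_i =
\begin{cases}
\max\{v_j, v_k\} & \text{if } i \text{ is a MAX vertex with successors } j, k,\\
\min\{v_j, v_k\} & \text{if } i \text{ is a min vertex with successors } j, k,\\
\tfrac{1}{2}\,(v_j + v_k) & \text{if } i \text{ is a RAND vertex with successors } j, k,\\
1 & \text{if } i \text{ is the MAX-sink},\\
0 & \text{if } i \text{ is the min-sink}.
\end{cases}
```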


Value iteration (for binary SSGs)

Iterate the operator:

Converges to the unique solution

But may require an exponential number of iterations to get close
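A minimal sketch of value iteration on a tiny terminating binary SSG. The game, the vertex names and the dictionary encoding are all hypothetical, and the operator is assumed to be the usual one (max at MAX vertices, min at min vertices, average at RAND vertices, sinks fixed at 1 and 0):

```python
from fractions import Fraction

# Hypothetical encoding: vertex -> (kind, successor, successor).
GAME = {
    "m": ("min", "a", "max_sink"),
    "a": ("MAX", "r", "b"),
    "r": ("RAND", "a", "min_sink"),
    "b": ("RAND", "max_sink", "min_sink"),
}
SINK_VALUES = {"max_sink": Fraction(1), "min_sink": Fraction(0)}


def apply_operator(game, values):
    """One application of the operator: max, min, or average of the successors."""
    new_values = dict(SINK_VALUES)
    for vertex, (kind, left, right) in game.items():
        x, y = values[left], values[right]
        if kind == "MAX":
            new_values[vertex] = max(x, y)
        elif kind == "min":
            new_values[vertex] = min(x, y)
        else:  # RAND: both successors have probability 1/2
            new_values[vertex] = (x + y) / 2
    return new_values


def value_iteration(game, rounds):
    """Start from 0 at every non-sink vertex and iterate the operator."""
    values = dict(SINK_VALUES)
    values.update({vertex: Fraction(0) for vertex in game})
    for _ in range(rounds):
        values = apply_operator(game, values)
    return values


values = value_iteration(GAME, 10)
print({vertex: str(values[vertex]) for vertex in GAME})
```

In this small example the iteration reaches the fixed point exactly after a few rounds; in general it only converges in the limit.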


Simple Stochastic Games (SSGs): Payoff version [Shapley (1953)]

[Figure; labels: R, min, MAX, RAND]

Limiting average version

Discounted version


Markov Decision Processes (MDPs)

[Figure; labels: R, min, MAX, RAND]

Theorem: [Epenoux (1964)]

Values and optimal strategies of an MDP can be found by solving an LP


SSG ∈ NP ∩ co-NP: Another proof

Deciding whether the value of a game is at least (at most) v is in NP ∩ co-NP

To show that value ≥ v, guess an optimal strategy σ for MAX

Find an optimal counter-strategy τ for min by solving the resulting MDP.

Is the problem in P?


Mean Payoff Games (MPGs) [Ehrenfeucht, Mycielski (1979)]

[Figure; labels: R, min, MAX, RAND]

Non-terminating version

Discounted version

Reachability SSGs

Payoff SSGs

MPGs

Pseudo-polynomial algorithm (PZ’96)


Mean Payoff Games (MPGs) [Ehrenfeucht, Mycielski (1979)]

Again, both players have optimal positional strategies.

Value(σ,τ): average of the cycle formed
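A minimal sketch of how the value of a pair of positional strategies can be computed: once both players fix positional strategies, every vertex has a single chosen successor, so the play from any start vertex runs into a cycle, and the value is the average payoff on that cycle. The names `choice` and `weight` and the example graph are hypothetical:

```python
from fractions import Fraction


def cycle_value(start, choice, weight):
    """Average edge payoff on the cycle reached from `start`.

    choice[v] is the successor picked at v by the positional strategy of
    the player who owns v; weight[(u, v)] is the payoff of edge (u, v).
    """
    first_visit = {}
    path = []
    v = start
    while v not in first_visit:  # follow the play until a vertex repeats
        first_visit[v] = len(path)
        path.append(v)
        v = choice[v]
    cycle = path[first_visit[v]:]  # the part of the path that repeats forever
    total = sum(weight[(u, choice[u])] for u in cycle)
    return Fraction(total, len(cycle))


# Hypothetical example: a -> b -> c -> b -> c -> ...
choice = {"a": "b", "b": "c", "c": "b"}
weight = {("a", "b"): 5, ("b", "c"): 3, ("c", "b"): -2}
print(cycle_value("a", choice, weight))
```

The payoff of the initial edge (a, b) does not affect the result, since only the cycle is traversed infinitely often.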



Parity Games (PGs): A simple example

[Figure: example game with priorities 2, 3, 2, 1, 4, 1]

EVEN wins if the largest priority seen infinitely often is even


Parity Games (PGs)

[Figure; labels: 8, 3, ODD, EVEN]

EVEN wins if the largest priority seen infinitely often is even

Equivalent to many interesting problems in automata and verification:

Non-emptiness of ω-tree automata

modal μ-calculus model checking


Parity Games (PGs) to Mean Payoff Games (MPGs) [Stirling (1993)] [Puri (1995)]

[Figure; labels: 8, 3, ODD, EVEN]

Replace priority k by payoff (−n)^k

Move payoffs to outgoing edges
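A minimal sketch of the two reduction steps, assuming n is the number of vertices and that EVEN takes the role of the maximizing player (both are assumptions; the example graph and all names are hypothetical). On any simple cycle the largest priority dominates the sum of payoffs, so the sign of the cycle mean matches the parity of that priority:

```python
from fractions import Fraction

# Hypothetical parity game: vertex priorities and a list of edges.
PRIORITY = {"a": 2, "b": 3, "c": 1, "d": 4}
EDGES = [("a", "b"), ("b", "a"), ("b", "c"), ("c", "d"), ("d", "c")]


def priorities_to_payoffs(priority, edges):
    """Replace priority k by payoff (-n)**k and put it on the outgoing edges."""
    n = len(priority)
    return {(u, v): (-n) ** priority[u] for (u, v) in edges}


def cycle_mean(cycle, payoff):
    """Mean payoff of a cycle given as a list of vertices."""
    steps = len(cycle)
    total = sum(payoff[(cycle[i], cycle[(i + 1) % steps])] for i in range(steps))
    return Fraction(total, steps)


def even_wins_cycle(cycle, priority):
    """EVEN wins a play that ends in `cycle` if its largest priority is even."""
    return max(priority[v] for v in cycle) % 2 == 0


payoff = priorities_to_payoffs(PRIORITY, EDGES)
for cycle in (["a", "b"], ["c", "d"]):
    print(cycle, cycle_mean(cycle, payoff), even_wins_cycle(cycle, PRIORITY))
```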


Switches


Strategy/Policy Iteration

Start with some strategy σ (of MAX)

While there are improving switches, perform some of them

As each step is strictly improving and as there is a finite number of strategies, the algorithm must end with an optimal strategy

SSG ∈ PLS (Polynomial Local Search)
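A minimal sketch of strategy iteration, restricted for brevity to terminating binary games with only MAX and RAND vertices (this one-player restriction is an assumption made here; with min vertices, evaluating σ would also require an optimal counter-strategy for min). A switch at a MAX vertex is treated as improving when the other successor has a strictly larger value under the current strategy. The game and all names are hypothetical:

```python
from fractions import Fraction

# Hypothetical encoding: vertex -> (kind, successor, successor).
GAME = {
    "a": ("MAX", "r", "b"),
    "r": ("RAND", "a", "min_sink"),
    "b": ("RAND", "max_sink", "min_sink"),
}
SINK_VALUES = {"max_sink": Fraction(1), "min_sink": Fraction(0)}


def evaluate(game, strategy):
    """Exact vertex values under a fixed MAX strategy (Gauss-Jordan elimination)."""
    names = list(game)
    index = {v: i for i, v in enumerate(names)}
    n = len(names)
    rows = []
    for v in names:
        kind, left, right = game[v]
        row = [Fraction(0)] * (n + 1)  # last column is the right-hand side
        row[index[v]] += 1
        if kind == "MAX":
            moves = [(strategy[v], Fraction(1))]
        else:  # RAND: each successor with probability 1/2
            moves = [(left, Fraction(1, 2)), (right, Fraction(1, 2))]
        for u, p in moves:
            if u in SINK_VALUES:
                row[n] += p * SINK_VALUES[u]
            else:
                row[index[u]] -= p
        rows.append(row)
    for col in range(n):
        # A pivot exists because the game is assumed to be terminating.
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inverse = 1 / rows[col][col]
        rows[col] = [x * inverse for x in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    values = dict(SINK_VALUES)
    values.update({v: rows[index[v]][n] for v in names})
    return values


def strategy_iteration(game, strategy):
    """Perform all improving switches until none is left."""
    strategy = dict(strategy)
    while True:
        values = evaluate(game, strategy)
        improving = {}
        for v, (kind, left, right) in game.items():
            if kind != "MAX":
                continue
            other = right if strategy[v] == left else left
            if values[other] > values[v]:
                improving[v] = other
        if not improving:
            return strategy, values
        strategy.update(improving)


print(strategy_iteration(GAME, {"a": "r"}))
```

This variant performs every improving switch in each round; performing only some of them, as the slide allows, would change only the last line of the loop.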


Strategy/Policy Iteration: Complexity?

Performing only one switch at a time may lead to exponentially many improvements, even for MDPs [Condon (1992)]

What happens if we perform all profitable switches? [Hoffman-Karp (1966)]

???

Not known to be polynomial: O(2^n/n) [Mansour-Singh (1999)]

No non-linear examples: 2n − O(1) [Madani (2002)]


A randomized subexponential algorithm for simple stochastic games


A randomized subexponential algorithm for binary SSGs [Ludwig (1995)] [Kalai (1992)] [Matousek-Sharir-Welzl (1992)]

Start with an arbitrary strategy σ for MAX

Choose a random vertex i ∈ V_MAX

Find the optimal strategy σ′ for MAX in the game in which the only outgoing edge of i is (i, σ(i))

If switching σ′ at i is not profitable, then σ′ is optimal

Otherwise, let σ ← (σ′)^i, i.e. σ′ switched at i, and repeat
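The steps above can be sketched as a short recursion. This is a simplified illustration under loud assumptions: the game has only MAX and RAND vertices (so a fixed strategy can be evaluated by solving a linear system), the set `free` of MAX vertices that may still be switched is a device introduced here to express "the game in which the only outgoing edge of i is (i, σ(i))", and the example game and all names are hypothetical:

```python
import random
from fractions import Fraction

GAME = {  # hypothetical encoding: vertex -> (kind, successor, successor)
    "a": ("MAX", "r", "b"),
    "c": ("MAX", "a", "min_sink"),
    "r": ("RAND", "c", "min_sink"),
    "b": ("RAND", "max_sink", "min_sink"),
}
SINK_VALUES = {"max_sink": Fraction(1), "min_sink": Fraction(0)}


def evaluate(game, strategy):
    """Exact vertex values under a fixed MAX strategy (Gauss-Jordan elimination)."""
    names = list(game)
    index = {v: i for i, v in enumerate(names)}
    n = len(names)
    rows = []
    for v in names:
        kind, left, right = game[v]
        row = [Fraction(0)] * (n + 1)  # last column is the right-hand side
        row[index[v]] += 1
        if kind == "MAX":
            moves = [(strategy[v], Fraction(1))]
        else:  # RAND: each successor with probability 1/2
            moves = [(left, Fraction(1, 2)), (right, Fraction(1, 2))]
        for u, p in moves:
            if u in SINK_VALUES:
                row[n] += p * SINK_VALUES[u]
            else:
                row[index[u]] -= p
        rows.append(row)
    for col in range(n):
        # A pivot exists because the game is assumed to be terminating.
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inverse = 1 / rows[col][col]
        rows[col] = [x * inverse for x in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    values = dict(SINK_VALUES)
    values.update({v: rows[index[v]][n] for v in names})
    return values


def switched(game, strategy, i):
    """Copy of `strategy` with the choice at MAX vertex i switched."""
    _, left, right = game[i]
    result = dict(strategy)
    result[i] = right if strategy[i] == left else left
    return result


def ludwig(game, strategy, free, rng):
    """Optimal MAX strategy when only the vertices in `free` may be switched."""
    while free:
        i = rng.choice(sorted(free))                          # random vertex
        candidate = ludwig(game, strategy, free - {i}, rng)   # i stays fixed
        values = evaluate(game, candidate)
        alternative = switched(game, candidate, i)
        if values[alternative[i]] <= values[i]:
            return candidate            # switching at i is not profitable
        strategy = alternative          # switch at i and repeat
    return strategy


print(ludwig(GAME, {"a": "r", "c": "min_sink"}, {"a", "c"}, random.Random(0)))
```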


A randomized subexponential algorithm for binary SSGs [Ludwig (1995)] [Kalai (1992)] [Matousek-Sharir-Welzl (1992)]

[Figure; labels: MAX vertices, "All correct!", "Would never be switched!"]

There is a hidden order of MAX vertices under which the optimal strategy returned by the first recursive call correctly fixes the strategy of MAX at vertices 1,2,…,i


The hidden order

u_i(σ): the maximum sum of values of a strategy of MAX that agrees with σ on i


The hidden order

Order the vertices such that

Positions 1,…,i were switched and would never be switched again


SSGs are LP-type problems [Halman (2002)]

General (non-binary) SSGs can be solved in time

Independently observed by [Björklund-Sandberg-Vorobyov (2005)]

AUSO – Acyclic Unique Sink Orientations


SSGs GPLCP [Gärtner-Rüst (2005)] [Björklund-Svensson-Vorobyov (2005)]

GPLCP: Generalized Linear Complementarity Problem with a P-matrix


A deterministic subexponential algorithm for parity games

Mike Paterson, Marcin Jurdzinski, Uri Zwick


Parity Games (PGs): A simple example

[Figure: example game with priorities 2, 3, 2, 1, 4, 1]

EVEN wins if the largest priority seen infinitely often is even


Parity Games (PGs) to Mean Payoff Games (MPGs) [Stirling (1993)] [Puri (1995)]

[Figure; labels: 8, 3, ODD, EVEN]

Replace priority k by payoff (−n)^k

Move payoffs to outgoing edges


Exponential algorithm for PGs [McNaughton (1993)] [Zielonka (1998)]

Vertices of highest priority (even)

First recursive call

Vertices from which EVEN can force the game to enter A

Lemma: (i)

(ii)


Exponential algorithm for PGs [McNaughton (1993)] [Zielonka (1998)]

Second recursive call

In the worst case, both recursive calls are on games of size n−1
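A minimal sketch of the recursive scheme with its two recursive calls, written in the usual textbook form of the McNaughton/Zielonka algorithm rather than transcribed from the slides. Players are encoded as 0 (EVEN) and 1 (ODD), every vertex is assumed to have at least one successor, and the example game and all names are hypothetical:

```python
def attractor(vertices, edges, owner, player, target):
    """Vertices of the subgame from which `player` can force a visit to `target`."""
    attr = set(target)
    changed = True
    while changed:
        changed = False
        for v in vertices - attr:
            successors = [w for w in edges[v] if w in vertices]
            if owner[v] == player:
                forced = any(w in attr for w in successors)
            else:
                forced = all(w in attr for w in successors)
            if forced:
                attr.add(v)
                changed = True
    return attr


def zielonka(vertices, edges, owner, priority):
    """Return (winning set of EVEN, winning set of ODD) for the subgame."""
    if not vertices:
        return set(), set()
    top = max(priority[v] for v in vertices)
    player = top % 2          # the player who likes the highest priority
    opponent = 1 - player
    highest = {v for v in vertices if priority[v] == top}
    a = attractor(vertices, edges, owner, player, highest)
    win = list(zielonka(vertices - a, edges, owner, priority))      # first call
    if not win[opponent]:
        win[player], win[opponent] = set(vertices), set()
    else:
        b = attractor(vertices, edges, owner, opponent, win[opponent])
        win = list(zielonka(vertices - b, edges, owner, priority))  # second call
        win[opponent] = win[opponent] | b
    return win[0], win[1]


# Hypothetical game: EVEN (0) owns u and x, ODD (1) owns w.
OWNER = {"u": 0, "w": 1, "x": 0}
PRIORITY = {"u": 2, "w": 1, "x": 3}
EDGES = {"u": ["u", "w"], "w": ["w", "u"], "x": ["u", "w"]}

print(zielonka(set(OWNER), EDGES, OWNER, PRIORITY))
```

In the example EVEN wins from u by staying on its self-loop of priority 2 and from x by moving to u, while ODD wins from w by staying on its self-loop of priority 1.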


Deterministic subexponential alg for PGs [Jurdzinski, Paterson, Z (2006)]

Idea: Look for small dominions!

Second recursive call

Dominions of size s can be found in O(n^s) time

Dominion

Dominion: A (small) set from which one of the players can win without the play ever leaving this set


Open problems

  • Polynomial algorithms?

  • Is the Policy Improvement algorithm polynomial?

  • Faster subexponential algorithms for parity games?

  • Deterministic subexponential algorithms for MPGs and SSGs?

  • Faster pseudo-polynomial algorithms for MPGs?

