Polynomial Time Algorithms for Rank-Based Subclasses in Bilinear Games

Bilinear Games: Polynomial Time Algorithms for Rank-Based Subclasses. Ruta Mehta, Indian Institute of Technology, Bombay. Joint work with Jugal Garg and Albert X. Jiang.

A Game: Rock-Paper-Scissors

Rock-Paper-Scissors: A Play. Winner gets $1.

Rock-Paper-Scissors Payoffs

Bimatrix Game
• Strategy sets: $S_1 = \{R, P, C\}$, $S_2 = \{R, P, C\}$
• Payoff matrices: $A$ (player 1) and $B$ (player 2)
• Steady state: no player gains by unilateral deviation

Bimatrix Game
• Strategy sets: $S_1 = \{R, P, C\}$, $S_2 = \{R, P, C\}$
• Payoff matrices: $A$, $B$
• No steady state in pure strategies

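Mixed Play
• Strategy sets: $S_1 = \{R, P, C\}$, $S_2 = \{R, P, C\}$
• Mixed strategies: $\Delta_1 = \{r_1, p_1, c_1 \ge 0;\ r_1 + p_1 + c_1 = 1\}$, $\Delta_2 = \{r_2, p_2, c_2 \ge 0;\ r_2 + p_2 + c_2 = 1\}$
• Payoff matrices: $A$, $B$
• A steady state exists in mixed strategies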
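
The two slides above can be checked on a small example. The sketch below assumes the usual payoffs (win = +1, loss = -1, tie = 0), which the transcript does not spell out, and tests the steady-state condition directly: every strategy played with positive probability must be a best response. The helper names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# Assumed payoffs (not given in the slides): win = +1, loss = -1, tie = 0.
# Strategy order: Rock, Paper, sCissors (R, P, C). Rows: player 1, columns: player 2.
A = [[0, -1, 1],
     [1, 0, -1],
     [-1, 1, 0]]
B = [[-a for a in row] for row in A]  # zero-sum: B = -A


def row_payoffs(M, y):
    """(M y)_i: payoff of each pure row strategy against the mixed strategy y."""
    return [sum(M[i][j] * y[j] for j in range(len(y))) for i in range(len(M))]


def col_payoffs(M, x):
    """(x^T M)_j: payoff of each pure column strategy against the mixed strategy x."""
    return [sum(x[i] * M[i][j] for i in range(len(x))) for j in range(len(M[0]))]


def is_nash(A, B, x, y):
    """True iff every strategy in the support of x and y is a best response."""
    u1, u2 = row_payoffs(A, y), col_payoffs(B, x)
    ok1 = all(u1[i] == max(u1) for i in range(len(x)) if x[i] > 0)
    ok2 = all(u2[j] == max(u2) for j in range(len(y)) if y[j] > 0)
    return ok1 and ok2


def pure(k, n=3):
    """Pure strategy k written as a point of the simplex."""
    return [Fraction(int(i == k)) for i in range(n)]


pure_equilibria = [(i, j) for i, j in product(range(3), repeat=2)
                   if is_nash(A, B, pure(i), pure(j))]
uniform = [Fraction(1, 3)] * 3

print(pure_equilibria)                  # [] : no pure steady state
print(is_nash(A, B, uniform, uniform))  # True: uniform mixing is a steady state
```

Exact rationals (`Fraction`) are used so that the equality tests in `is_nash` are not affected by rounding.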

John Nash (1951)
• Finite game: finitely many players, each with finitely many strategies.
• Nash: every finite game has a steady state in mixed strategies, henceforth called a Nash equilibrium (NE).
• Proved using the Kakutani fixed point theorem: highly non-constructive.

Nash Equilibrium Computation
• Papadimitriou (JCSS’94): the class PPAD
  • Problems where existence is guaranteed, such as fixed points, Sperner’s Lemma, and Nash equilibrium.
• Chen and Deng (FOCS’06): computing a NE of a two-player game is PPAD-hard.
• Chen, Deng, and Teng (CDT, FOCS’06): even approximation is PPAD-hard.

Rank and Computation
• Kannan and Theobald (SODA’07):
  • Define the rank of $(A, B)$ as $\mathrm{rank}(A+B)$.
  • FPTAS for fixed-rank games.
• Polynomial time algorithms for exact Nash:
  • Dantzig (1963): zero-sum (rank-0) is equivalent to LP.
  • AGMS (STOC’11): rank-1 games.
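
To make the rank definition concrete, here is a minimal sketch that computes rank(A+B) by exact Gaussian elimination. The function names `matrix_rank` and `game_rank` and the example matrices are illustrative and do not come from the talk.

```python
from fractions import Fraction


def matrix_rank(M):
    """Rank via Gaussian elimination over exact rationals."""
    rows = [[Fraction(v) for v in row] for row in M]
    rank, n_cols = 0, len(rows[0])
    for col in range(n_cols):
        # Find a row at or below position `rank` with a nonzero entry in this column.
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for r in range(rank + 1, len(rows)):
            factor = rows[r][col] / rows[rank][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


def game_rank(A, B):
    """Rank of the game (A, B), defined as rank(A + B)."""
    return matrix_rank([[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)])


A = [[1, 2], [3, 4]]
zero_sum_B = [[-1, -2], [-3, -4]]   # B = -A, so A + B = 0
rank_one_B = [[2, -1], [3, -2]]     # A + B = [[3, 1], [6, 2]] = (1, 2)^T (3, 1)

print(game_rank(A, zero_sum_B))  # 0
print(game_rank(A, rank_one_B))  # 1
print(game_rank(A, A))           # 2
```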

Bilinear Games
Bimatrix games with polyhedral strategy sets.
• Two players: 1 and 2
• Polyhedral strategy sets: $X = \{x \mid Ex = e;\ x \ge 0\}$, $Y = \{y \mid Fy = f;\ y \ge 0\}$
• Payoff matrices: $A$, $B$
• Bilinear payoff: $(x, y)$ fetches $x^T A y$ to player 1 and $x^T B y$ to player 2.
Motivation: Koller et al. (STOC’94), for two-player extensive-form games with perfect recall.

Nash Equilibrium in Bilinear Games
• NE: no player gains by unilateral deviation.
• Existence: corollary of Glicksberg’s result.
• Symmetric game: $B = A^T$ and $Y = X$.
  • $(x, y)$ is a symmetric profile if $y = x$.
  • Existence of symmetric NE: an adaptation of Nash’s proof for symmetric bimatrix games.

Bilinear Games Contain:
• Bimatrix, polymatrix, Bayesian, etc.
• Bimatrix: $X = \Delta_1$, $Y = \Delta_2$
• Polymatrix:
  • $n$ players. Each pair plays a bimatrix game.
  • Player $i$: $S_i$ is the finite strategy set, $\Delta_i$ the mixed strategy set.
  • Goal of $i$: choose $x_i$ from $\Delta_i$ to maximize total payoff.
(Figure: players $i$ and $j$ joined by an edge labeled $A_{ij}$.)

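Polymatrix to Bilinear
• $M = |S_1| + \dots + |S_n|$. $X = \{(x_1, \dots, x_n) \mid x_i \in \Delta_i\}$, $Y = X$.
• $A$ is the $M \times M$ block matrix whose $(i, j)$ block is $A_{ij}$, and $B = A^T$.
• A symmetric NE of $(A, B)$ maps to a NE of the polymatrix game.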
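
The reduction can be sketched in a few lines. The code below stacks the pairwise matrices into one block matrix and reads off each player's payoff vector from the product with the concatenated profile. The zero diagonal blocks, the helper names, and the 3-player data are assumptions made for illustration.

```python
def build_block_matrix(sizes, pair_payoffs):
    """Assemble the M x M matrix whose (i, j) block is pair_payoffs[(i, j)].

    Blocks that are not listed (including the diagonal ones) stay zero;
    the zero diagonal is an assumption of this sketch.
    """
    offsets = [sum(sizes[:i]) for i in range(len(sizes))]
    total = sum(sizes)
    A = [[0] * total for _ in range(total)]
    for (i, j), block in pair_payoffs.items():
        for r, row in enumerate(block):
            for c, value in enumerate(row):
                A[offsets[i] + r][offsets[j] + c] = value
    return A, offsets


def mat_vec(A, v):
    """Matrix-vector product A v."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


# Hypothetical 3-player example with two pure strategies per player.
# pair_payoffs[(i, j)] is player i's payoff matrix against player j.
sizes = [2, 2, 2]
pair_payoffs = {
    (0, 1): [[1, 0], [0, 2]],
    (0, 2): [[3, 0], [0, 0]],
    (1, 0): [[0, 1], [1, 0]],
    (1, 2): [[0, 0], [0, 1]],
    (2, 0): [[1, 1], [0, 0]],
    (2, 1): [[2, 0], [0, 2]],
}
A, offsets = build_block_matrix(sizes, pair_payoffs)
B = [list(col) for col in zip(*A)]  # B = A^T

# Concatenated profile: player 0 plays strategy 0, player 1 strategy 1, player 2 strategy 0.
x = [1, 0, 0, 1, 1, 0]
payoffs = mat_vec(A, x)  # block i holds the total payoff of each pure strategy of player i
player0 = payoffs[offsets[0]:offsets[0] + sizes[0]]
print(player0)  # [3, 2]
```

Row block $i$ of $Ax$ equals $\sum_j A_{ij} x_j$, which is the total payoff vector that player $i$ maximizes over $\Delta_i$.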

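Best Response (Koller et al.)
• Fix a strategy $y$ of player 2.
• Player 1 solves an LP, shown with its dual:
  • Primal: max $x^T (Ay)$ s.t. $Ex = e$, $x \ge 0$
  • Dual: min $e^T p$ s.t. $p^T E \ge (Ay)^T$
• At optimal: $\exists p$ s.t. $A_i y \le p^T E^i$ for all $i$, and $x_i > 0 \Rightarrow A_i y = p^T E^i$ (here $A_i$ is row $i$ of $A$ and $E^i$ is column $i$ of $E$).
• Given $x \in X$, for player 2 we get, at optimal: $\exists q$ s.t. $x^T B^j \le q^T F^j$ for all $j$, and $y_j > 0 \Rightarrow q^T F^j = x^T B^j$ (here $B^j$ and $F^j$ are column $j$ of $B$ and $F$).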

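Best Response Polytopes (BRPs)
• $(x, y)$ is a NE iff there exist $p$ and $q$ such that
  • $p$: $Ay \le E^T p$; $x_i > 0 \Rightarrow A_i y = p^T E^i$
  • $q$: $x^T B \le q^T F$; $y_j > 0 \Rightarrow q^T F^j = x^T B^j$
• The inequalities, together with $y \in Y$ and $x \in X$, define the best response polytopes: $(y, p) \in P$ and $(x, q) \in Q$.
• On $P \times Q$: $x^T (Ay - E^T p) \le 0$ and $(x^T B - q^T F) y \le 0$.
• Adding the two and using $Ex = e$, $Fy = f$: $x^T (A+B) y - e^T p - f^T q \le 0$.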

Nash Equilibrium in BRPs
• NE iff $x^T (Ay - E^T p) = 0$ and $(x^T B - q^T F) y = 0$, i.e. $x^T (A+B) y - e^T p - f^T q = 0$.
• Assumption: $P$ and $Q$ are non-degenerate.
• $(u, v) \in P \times Q$ gives a NE $\Rightarrow$ $(u, v)$ is a vertex.

QP Formulation
max: $x^T (A+B) y - e^T p - f^T q$
s.t. $(y, p) \in P$, $(x, q) \in Q$
• Optimal value 0.
• Only vertex solutions.
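
A quick numerical check of the QP view, restricted to the bimatrix special case $X = \Delta_1$, $Y = \Delta_2$. There $E$ and $F$ are all-ones rows and $e = f = 1$, so $p$ and $q$ are scalars, the linear terms reduce to $p + q$, and the smallest feasible values are $p = \max_i (Ay)_i$ and $q = \max_j (x^T B)_j$. The function name and the 2x2 example game are assumptions for illustration, not taken from the talk.

```python
from fractions import Fraction


def qp_objective(A, B, x, y):
    """x^T (A + B) y - p - q, using the smallest feasible scalars p and q."""
    Ay = [sum(A[i][j] * y[j] for j in range(len(y))) for i in range(len(x))]
    xB = [sum(x[i] * B[i][j] for i in range(len(x))) for j in range(len(y))]
    p, q = max(Ay), max(xB)  # tightest p with Ay <= p*1 and q with x^T B <= q*1
    x_A_y = sum(x[i] * Ay[i] for i in range(len(x)))
    x_B_y = sum(xB[j] * y[j] for j in range(len(y)))
    return x_A_y + x_B_y - p - q


# Hypothetical coordination game with two pure NE and one mixed NE.
A = [[2, 0], [0, 1]]
B = [[1, 0], [0, 2]]

pure_ne = qp_objective(A, B, [1, 0], [1, 0])
not_ne = qp_objective(A, B, [1, 0], [0, 1])
mixed_ne = qp_objective(A, B,
                        [Fraction(2, 3), Fraction(1, 3)],
                        [Fraction(1, 3), Fraction(2, 3)])

print(pure_ne, not_ne, mixed_ne)  # 0 -2 0
```

The objective is never positive, and it reaches 0 exactly at the equilibria of this example, matching the "optimal value 0" bullet.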

Our Results
• Rank-1 games: $\mathrm{rank}(A+B) = 1$
  • Extend the Adsul et al. algorithm for exact NE.
• Fixed-rank games: $\mathrm{rank}(A+B) = k$
  • Extend the FPTAS of Kannan and Theobald.
• Rank of $A$ or $B$ is constant
  • Enumerate all NE in polynomial time.

Rank-1 Case
• Zero-sum, i.e. $\mathrm{rank}(A+B) = 0$: LP formulation (Charnes’53)
• If $\mathrm{rank}(A+B) = 1$, then $A + B = a b^T$
• The QP formulation becomes:
  max: $(x^T a)(b^T y) - e^T p - f^T q$
  s.t. $(y, p) \in P$, $(x, q) \in Q$
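
The factorization $A + B = a b^T$ is what turns the bilinear term into a product of two linear forms. The sketch below recovers one such pair $(a, b)$ from a rank-1 matrix and confirms the identity $x^T (A+B) y = (x^T a)(b^T y)$ on sample vectors. The function name and the data are illustrative, and the factors are only unique up to scaling.

```python
from fractions import Fraction


def rank_one_factors(C):
    """Return (a, b) with C = a b^T, assuming C has rank exactly 1."""
    # Pick any nonzero entry C[r][k]; row r then serves as b^T.
    r, k = next((i, j) for i, row in enumerate(C)
                for j, v in enumerate(row) if v != 0)
    b = [Fraction(v) for v in C[r]]
    a = [Fraction(row[k]) / b[k] for row in C]
    # Guard: fail loudly if the input was not rank 1 after all.
    assert all(a[i] * b[j] == C[i][j]
               for i in range(len(C)) for j in range(len(b))), "C is not rank 1"
    return a, b


C = [[3, 1], [6, 2]]  # plays the role of A + B
a, b = rank_one_factors(C)

x = [Fraction(1, 2), Fraction(1, 2)]
y = [Fraction(1, 4), Fraction(3, 4)]
bilinear = sum(x[i] * C[i][j] * y[j] for i in range(2) for j in range(2))
product_form = sum(xi * ai for xi, ai in zip(x, a)) * sum(bj * yj for bj, yj in zip(b, y))

print(a, b)                      # a = (1, 2), b = (3, 1)
print(bilinear == product_form)  # True
```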

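Rank-1 Case
• Replace $x^T a$ by a variable $z$ (recall $B = -A + a b^T$); $Q'$ denotes $Q$ after this substitution.
• The condition $x^T (A+B) y - e^T p - f^T q = 0$ becomes $z (b^T y) - e^T p - f^T q = 0$.
• $N$ = points of $P \times Q'$ with $z (b^T y) - e^T p - f^T q = 0$
  • Forms paths and cycles, since $z$ gives one degree of freedom.
• NE of $(A, B)$: points in the intersection of $N$ and $z - x^T a = 0$.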

Parameterized LP
LP($z$) = max: $z (b^T y) - e^T p - f^T q$ s.t. $(y, p) \in P$, $(x, z, q) \in Q'$
• Given any $c$, the optimal value of LP($c$) is 0.
• OPT($c$) lies on $N$: with $N(c)$ = {points of $N$ with $z = c$}, OPT($c$) = $N(c)$.
• $N$ is a single path on which $z$ is monotonic.

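Rank-1: The Algorithm
• NE: intersection of $N$ and the hyperplane $H$: $z - x^T a = 0$.
• Initialize $c_1 = a_{\min}$, $c_2 = a_{\max}$.
(Figure: the path $N$, the hyperplane $H$ with half-spaces $H^+$ and $H^-$, the points $N(c_1)$ and $N(c_2)$, and the NE where $N$ crosses $H$.)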

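Rank-1: Binary Search Algorithm
• NE of $(A, B)$: points in the intersection of $N$ and $H$.
• $c = (c_1 + c_2)/2$.
(Figure: the point $N(c)$ on the path $N$ between $N(c_1)$ and $N(c_2)$, shown with $H$, $H^+$, $H^-$, and the NE.)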

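Rank-1: Binary Search Algorithm
• NE of $(A, B)$: points in the intersection of $N$ and $H$.
• $c = (c_1 + c_2)/2$. If $N(c)$ is in $H^-$, then $c_1 = c$; else $c_2 = c$.
(Figure: the path $N$ and the hyperplane $H$ with the updated points $N(c_1)$ and $N(c_2)$.)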
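
The update rule can be sketched as a plain bisection loop. This is a numerical illustration only: `path_point` is a hypothetical oracle standing in for "solve LP(c) and return the x-part of N(c)", the tolerance `tol` replaces the exact termination argument based on the $1/d$ bound, and the toy oracle at the bottom is made up rather than derived from a game.

```python
def binary_search_on_z(path_point, a, c1, c2, tol=1e-9):
    """Bisect on c until N(c) is numerically on H: z - x^T a = 0.

    path_point(c) is a hypothetical oracle returning the x-part of N(c),
    for instance obtained by solving LP(c); it is not implemented here.
    """
    def side(c):
        x = path_point(c)
        # Value of z - x^T a at the point N(c), where z = c.
        return c - sum(xi * ai for xi, ai in zip(x, a))

    while c2 - c1 > tol:
        c = (c1 + c2) / 2
        if side(c) < 0:   # N(c) lies in H^-: raise the lower end
            c1 = c
        else:             # N(c) lies in H^+ or on H: lower the upper end
            c2 = c
    return (c1 + c2) / 2


# Toy stand-in for the oracle (an assumption, not an LP solve):
# x(c) moves linearly inside the simplex as c grows.
a = [0.0, 1.0]


def toy_path_point(c):
    return [0.5 + 0.25 * c, 0.5 - 0.25 * c]


z_star = binary_search_on_z(toy_path_point, a, c1=min(a), c2=max(a))
print(round(z_star, 6))  # 0.4
```

For the toy oracle, $z - x^T a = 1.25c - 0.5$, which is negative at $c_1 = 0$, positive at $c_2 = 1$, and zero at $c = 0.4$, so the loop mirrors the picture of $N$ crossing $H$ once.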

Analysis
• Terminates because:
  • $z$ is monotonic on $N$.
  • The increase in $z$ on each edge is lower bounded by $1/d$, where $d$ is polynomial-sized in the input.
• Time complexity:
  • Solve LP($c$) to get $N(c)$ in each pivot.
  • $\log(d) \cdot \log(a_{\max} - a_{\min})$ pivots.

Conclusions
• Bilinear games:
  • Bimatrix with polytopal strategy sets.
  • Fairly general: contains polymatrix, Bayesian, etc.
  • Polynomial time algorithms for rank-based subclasses.
• Open problems:
  • Designing a Lemke-Howson type algorithm.
  • Degree, index, stability concepts.
  • Computation of approximate equilibrium.

Thank You
