Computing Nash Equilibrium

Presenter: Yishay Mansour


Outline

  • Problem Definition

  • Notation

  • Last week: Zero-Sum game

  • This week:

    • Zero Sum: Online algorithm

    • General Sum Games

      • Multiple players – approximate Nash

      • 2 players – exact Nash


Model

  • Multiple players $N = \{1, \dots, n\}$

  • Strategy set

    • Player i has m actions $S_i = \{s_{i1}, \dots, s_{im}\}$

    • $S_i$ are the pure actions of player i

    • $S = \times_i S_i$

  • Payoff functions

    • Player i: $u_i : S \to \mathbb{R}$


Strategies

  • Pure strategies: actions

  • Mixed strategy

    • Player i: $p_i$ is a distribution over $S_i$

    • Game: $P = \times_i p_i$

      • Product distribution

  • Modified distribution

    • $P_{-i}$ = the distribution P restricted to all players except player i

    • $(q, P_{-i})$ = player i plays q, every other player j plays $p_j$


Notation

  • Average Payoff

    • Player i: $u_i(P) = E_{s \sim P}[u_i(s)] = \sum_{s \in S} P(s) u_i(s)$

    • $P(s) = \prod_i p_i(s_i)$

  • Nash Equilibrium

    • $P^*$ is a Nash equilibrium if, for every player i

    • and for any distribution $q_i$,

    • $u_i(q_i, P^*_{-i}) \le u_i(P^*)$

      • That is, each $p^*_i$ is a best response to $P^*_{-i}$
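The definitions above can be checked by brute force on small games. The following is a minimal sketch, assuming each payoff function is a Python callable on a tuple of action indices and each mixed strategy is a list of probabilities; the names `expected_payoff` and `is_nash` are illustrative and do not come from the slides. Because $u_i(q_i, P_{-i})$ is linear in $q_i$, it is enough to test deviations to pure actions.

```python
from itertools import product

def expected_payoff(payoff, dists):
    """u_i(P) = sum over pure profiles s of P(s) * u_i(s), with P(s) = prod_j p_j(s_j)."""
    total = 0.0
    for s in product(*(range(len(p)) for p in dists)):
        prob = 1.0
        for j, action in enumerate(s):
            prob *= dists[j][action]
        total += prob * payoff(s)
    return total

def is_nash(payoffs, dists, tol=1e-9):
    """True if no player gains more than tol by deviating to any pure action."""
    for i, u_i in enumerate(payoffs):
        base = expected_payoff(u_i, dists)
        for a in range(len(dists[i])):
            pure = [1.0 if b == a else 0.0 for b in range(len(dists[i]))]
            deviated = dists[:i] + [pure] + dists[i + 1:]
            if expected_payoff(u_i, deviated) > base + tol:
                return False
    return True

# Matching pennies (an illustrative game): player 1 wants to match, player 2 to mismatch.
u1 = lambda s: 1 if s[0] == s[1] else -1
u2 = lambda s: -u1(s)
uniform = [[0.5, 0.5], [0.5, 0.5]]
both_first = [[1.0, 0.0], [1.0, 0.0]]
print(is_nash([u1, u2], uniform), is_nash([u1, u2], both_first))
```

The enumeration is exponential in the number of players, so this is only meant for toy examples.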


Two player games

  • Payoff matrices (A,B)

    • m rows and n columns

    • player 1 has m actions, player 2 has n actions

  • Strategies p and q

  • Payoffs: $u_1(p,q) = p A q^T$ and $u_2(p,q) = p B q^T$

  • Zero sum game

    • $A = -B$


Online learning

  • Playing with an unknown payoff matrix

  • Online algorithm:

    • At each step, selects an action

      • can be stochastic or fractional

    • Observes all possible payoffs

    • Updates its parameters

  • Goal: achieve the value of the game

    • The payoff matrix of the “game” is defined only at the end


Online learning - Algorithm

  • Notations:

    • Opponent distribution $Q_t$

    • Our distribution $P_t$

    • Observed cost $M(i, Q_t)$ for each action i

      • That is, the vector $M Q_t$; the expected cost is $M(P_t, Q_t) = P_t M Q_t$

      • costs lie in [0,1]

    • Goal: minimize cost

  • Algorithm: Exponential weights

    • Action i has weight proportional to $b^{L(i,t)}$

    • $L(i,t)$ = cumulative loss of action i up to time t


Online algorithm: Notations

  • Formally:

    • The total number of steps T is known

    • Parameter b, with $0 < b < 1$

    • $w_{t+1}(i) = w_t(i) \, b^{M(i, Q_t)}$

    • $Z_t = \sum_i w_t(i)$

    • $P_{t+1}(i) = w_{t+1}(i) / Z_{t+1}$

    • Initially, $P_1(i) > 0$ for every i
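The update rule above is short enough to write out directly. This is a minimal sketch, assuming the observed cost vector $(M(i, Q_t))_i$ is handed to the learner at every step and that the initial weights are uniform; the function names are illustrative.

```python
def exp_weights_step(weights, costs, b):
    """One update w_{t+1}(i) = w_t(i) * b**M(i, Q_t); costs[i] = M(i, Q_t) in [0, 1]."""
    return [w * b ** c for w, c in zip(weights, costs)]

def normalize(weights):
    """Turn weights into a distribution by dividing by their sum."""
    z = sum(weights)
    return [w / z for w in weights]

def run(cost_vectors, b, n):
    """Return the distributions P_1, ..., P_{T+1} for a given sequence of cost vectors."""
    weights = [1.0] * n              # uniform start, so P_1(i) > 0 for every i
    dists = [normalize(weights)]
    for costs in cost_vectors:
        weights = exp_weights_step(weights, costs, b)
        dists.append(normalize(weights))
    return dists

# Action 0 always costs 0 and action 1 always costs 1, so weight shifts to action 0.
dists = run([[0, 1]] * 5, b=0.5, n=2)
print(dists[-1])
```

A smaller b reacts faster to observed costs; a b close to 1 changes the distribution slowly.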


Online algorithm: Theorem

  • Theorem

    • For any matrix M with entries in [0,1]

    • Any sequence of distributions $Q_1, \dots, Q_T$

    • The algorithm generates $P_1, \dots, P_T$

    • $RE(A \| B) = E_{x \sim A}[\ln(A(x)/B(x))]$


Relative Entropy

  • For any two distributions A and B

  • $RE(A \| B) = E_{x \sim A}[\ln(A(x)/B(x))]$

    • can be infinite

      • when $B(x) = 0$ and $A(x) \neq 0$

    • Always non-negative

      • log is concave

      • $\sum_i a_i \log b_i \le \log \sum_i a_i b_i$ (for $a_i \ge 0$ with $\sum_i a_i = 1$)

      • $-RE(A \| B) = \sum_x A(x) \ln (B(x)/A(x)) \le \ln \sum_x A(x) B(x)/A(x) = 0$
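The three properties listed above (the definition, the infinite case, and non-negativity) can be seen on small examples. A minimal sketch, assuming distributions are given as equal-length lists of probabilities; `relative_entropy` is an illustrative name.

```python
import math

def relative_entropy(A, B):
    """RE(A || B) = sum_x A(x) * ln(A(x) / B(x)), with the convention 0 * ln(0 / b) = 0."""
    total = 0.0
    for a, b in zip(A, B):
        if a == 0:
            continue            # terms with A(x) = 0 contribute nothing
        if b == 0:
            return math.inf     # B(x) = 0 while A(x) != 0
        total += a * math.log(a / b)
    return total

print(relative_entropy([0.5, 0.5], [0.5, 0.5]))   # identical distributions
print(relative_entropy([1.0, 0.0], [0.5, 0.5]))   # point mass against uniform
print(relative_entropy([0.5, 0.5], [1.0, 0.0]))   # infinite case
```

Note that the quantity is not symmetric: swapping the arguments in the last two calls changes the result.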


Online algorithm: Analysis

  • Lemma

    • For any mixed strategy P

  • Corollary


Online Algorithm: Optimization

  • $b = 1/(1 + \sqrt{2 (\ln n)/T})$

    • additional loss per step

    • $O(\sqrt{(\ln n)/T})$

  • Zero sum game:

    • Average loss: at most v, the value of the game, plus

    • an additional loss of $O(\sqrt{(\ln n)/T})$
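To see the tuned parameter in action, the sketch below runs exponential weights against an adversary that best-responds to the current distribution at every step. It assumes a cost matrix with entries in [0,1], a uniform starting distribution, and a matching-pennies cost matrix chosen purely for illustration (its value is 1/2); `tuned_b` and `average_loss` are hypothetical helper names.

```python
import math

def tuned_b(n, T):
    """b = 1 / (1 + sqrt(2 ln n / T)) for n actions and T steps."""
    return 1.0 / (1.0 + math.sqrt(2.0 * math.log(n) / T))

def average_loss(M, T):
    """Average cost of exponential weights when the opponent best-responds each step."""
    n = len(M)
    b = tuned_b(n, T)
    w = [1.0] * n
    total = 0.0
    for _ in range(T):
        z = sum(w)
        P = [x / z for x in w]
        # Expected cost of every pure column against P; the adversary picks the worst one.
        col_costs = [sum(P[i] * M[i][j] for i in range(n)) for j in range(len(M[0]))]
        j = max(range(len(col_costs)), key=col_costs.__getitem__)
        total += col_costs[j]
        w = [w[i] * b ** M[i][j] for i in range(n)]
    return total / T

M = [[0, 1], [1, 0]]          # cost to the row player; value of the game is 1/2
for T in (100, 1000, 10000):
    print(T, average_loss(M, T) - 0.5, math.sqrt(math.log(2) / T))
```

The printed gap between the average loss and the value shrinks as T grows, in line with the $O(\sqrt{(\ln n)/T})$ rate on the slide.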



Two players General sum games

  • Input matrices (A,B)

  • No unique value

  • Computational issues:

    • find some Nash,

    • find all Nash

      • Can be exponentially many

      • e.g. the identity matrix (A = B = I)

  • Example 2xN


Computational Complexity

  • Complexity of finding a sample equilibrium is unknown

    • “…no proof of NP-completeness seems possible” (Papadimitriou, 94)

  • Finding equilibria with certain properties is NP-hard

    • e.g., max-payoff, max-support

  • (Even) for symmetric 2-player games:

    • ∃ NE with expected social welfare at least k?

    • ∃ NE with least payoff at least k?

    • ∃ Pareto-optimal NE?

    • ∃ NE with player 1 expected utility of at least k?

    • ∃ multiple NE?

    • ∃ NE where player 1 plays (or does not play) a particular strategy?

(Gilboa & Zemel; Conitzer & Sandholm)


Two players General sum games

  • Player 1 best response:

    • As for zero sum:

    • Fix strategy q of player 2

    • maximize $p (A q^T)$ such that $\sum_j p_j = 1$ and $p_j \ge 0$

    • dual LP: minimize u such that $u \ge A q^T$ (componentwise)

    • Strong duality: $p (A q^T) = u = p \cdot (u \mathbf{1})$, using $\sum_j p_j = 1$

      • $p (u - A q^T) = 0$

      • complementary system

  • Player 2: $(v - p B) q^T = 0$


Nash: Linear Complementary System

  • Find distributions p and q and values u and v

    • $u \ge A q^T$

    • $v \ge p B$

    • $p (u - A q^T) = 0$

    • $(v - p B) q^T = 0$

    • $\sum_j p_j = 1$ and $p_j \ge 0$

    • $\sum_j q_j = 1$ and $q_j \ge 0$
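The complementarity conditions can be verified numerically for a candidate pair (p, q). The sketch below is an illustration only: the 3x2 payoff matrices are a hypothetical example, not matrices taken from the slides, and `complementarity` is an assumed helper name. It takes u and v as the smallest values satisfying the two inequalities and returns the two products that must vanish at equilibrium.

```python
from fractions import Fraction as F

def complementarity(A, B, p, q):
    """Return (u, v, p(u - Aq^T), (v - pB)q^T) with u = max_i (Aq^T)_i and v = max_j (pB)_j."""
    m, n = len(A), len(A[0])
    Aq = [sum(A[i][j] * q[j] for j in range(n)) for i in range(m)]
    pB = [sum(p[i] * B[i][j] for i in range(m)) for j in range(n)]
    u, v = max(Aq), max(pB)          # smallest u, v with u >= Aq^T and v >= pB
    slack_p = sum(p[i] * (u - Aq[i]) for i in range(m))
    slack_q = sum(q[j] * (v - pB[j]) for j in range(n))
    return u, v, slack_p, slack_q

# Hypothetical 3x2 game used only for illustration.
A = [[0, 6], [2, 5], [3, 3]]
B = [[1, 0], [0, 2], [4, 3]]

equilibrium = complementarity(A, B, [F(2, 3), F(1, 3), F(0)], [F(1, 3), F(2, 3)])
not_equilibrium = complementarity(A, B, [F(1), F(0), F(0)], [F(1), F(0)])
print(equilibrium, not_equilibrium)
```

Exact fractions are used so that a slack of zero really is zero rather than a rounding artifact.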


Two players General sum games

  • Assume the supports of the strategies are known.

    • p has support $S_p$ and q has support $S_q$

    • Can formulate the Nash as LP:
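Once the supports are fixed, the equilibrium conditions become linear. The sketch below is one way to exploit that, under the simplifying assumption that both supports have the same size, so that the indifference conditions form a square linear system that can be solved exactly; the general case is a linear feasibility problem. All function names are illustrative, and the identity-matrix game is used as the test case.

```python
from fractions import Fraction
from itertools import combinations

def solve(M, rhs):
    """Gauss-Jordan elimination over the rationals; returns None if M is singular."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(b)] for row, b in zip(M, rhs)]
    for c in range(n):
        piv = next((r for r in range(c, n) if aug[r][c] != 0), None)
        if piv is None:
            return None
        aug[c], aug[piv] = aug[piv], aug[c]
        scale = aug[c][c]
        aug[c] = [x / scale for x in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[c])]
    return [row[-1] for row in aug]

def nash_for_supports(A, B, Sp, Sq):
    """Equilibrium with supports exactly (Sp, Sq), or None; assumes len(Sp) == len(Sq)."""
    m, n, k = len(A), len(A[0]), len(Sp)
    # Rows in Sp must all earn u against q; columns in Sq must all earn v against p.
    sol_q = solve([[A[i][j] for j in Sq] + [-1] for i in Sp] + [[1] * k + [0]], [0] * k + [1])
    sol_p = solve([[B[i][j] for i in Sp] + [-1] for j in Sq] + [[1] * k + [0]], [0] * k + [1])
    if sol_q is None or sol_p is None:
        return None
    p, q = [Fraction(0)] * m, [Fraction(0)] * n
    for i, val in zip(Sp, sol_p):
        p[i] = val
    for j, val in zip(Sq, sol_q):
        q[j] = val
    u, v = sol_q[-1], sol_p[-1]
    if any(p[i] <= 0 for i in Sp) or any(q[j] <= 0 for j in Sq):
        return None
    # No action outside the support may do strictly better.
    if any(sum(A[i][j] * q[j] for j in range(n)) > u for i in range(m)):
        return None
    if any(sum(p[i] * B[i][j] for i in range(m)) > v for j in range(n)):
        return None
    return p, q

def all_nash(A, B):
    """Enumerate equal-size support pairs and collect the equilibria found."""
    m, n = len(A), len(A[0])
    found = []
    for k in range(1, min(m, n) + 1):
        for Sp in combinations(range(m), k):
            for Sq in combinations(range(n), k):
                eq = nash_for_supports(A, B, Sp, Sq)
                if eq is not None:
                    found.append(eq)
    return found

I2 = [[1, 0], [0, 1]]
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(len(all_nash(I2, I2)), len(all_nash(I3, I3)))
```

The enumeration over supports is exponential, which is consistent with the remark that a game can have exponentially many equilibria.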


Approximate Nash

  • Assume we are given Nash strategies (p,q)

  • Show that there exists a pair of strategies that is:

    • of small support

    • epsilon-Nash

  • Brute force search

    • enumerate all small supports!

    • Each one requires only polynomial time

  • Proof!
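One standard way to argue existence is by sampling: draw k actions independently from each equilibrium strategy and play uniformly over the drawn multiset. The sketch below illustrates that idea and is not necessarily the proof given in the lecture; the game (matching pennies), the sample size k, the seed, and the helper names are all assumptions made for the example.

```python
import random
from collections import Counter

def empirical_strategy(p, k, rng):
    """Sample k actions i.i.d. from p and return the uniform distribution over the multiset."""
    draws = rng.choices(range(len(p)), weights=p, k=k)
    counts = Counter(draws)
    return [counts[i] / k for i in range(len(p))]

def epsilon_of(A, B, p, q):
    """Largest gain any player can obtain by a unilateral deviation to a pure action."""
    m, n = len(A), len(A[0])
    row_pay = [sum(A[i][j] * q[j] for j in range(n)) for i in range(m)]
    col_pay = [sum(p[i] * B[i][j] for i in range(m)) for j in range(n)]
    u1 = sum(p[i] * row_pay[i] for i in range(m))
    u2 = sum(q[j] * col_pay[j] for j in range(n))
    return max(max(row_pay) - u1, max(col_pay) - u2)

A = [[1, -1], [-1, 1]]
B = [[-1, 1], [1, -1]]
p = q = [0.5, 0.5]                      # exact equilibrium of matching pennies

rng = random.Random(0)                  # fixed seed so the run is repeatable
p_hat = empirical_strategy(p, 200, rng)
q_hat = empirical_strategy(q, 200, rng)
print(epsilon_of(A, B, p, q), epsilon_of(A, B, p_hat, q_hat))
```

The sampled strategies have support of size at most k, and a larger k gives a smaller epsilon with high probability.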


Lemke & Howson

  • Define labeling

  • For strategy p (player 1):

    • Label i: if $p_i = 0$, where i is an action of player 1

    • Label j: if action j (player 2) is a best response to p

      • $b_j p \ge b_k p$ for every action k of player 2 ($b_j$ is column j of B)

  • Similarly for strategy q (player 2):

    • Label j: if $q_j = 0$, where j is an action of player 2

    • Label i: if action i (player 1) is a best response to q

      • $a_i q \ge a_k q$ for every action k of player 1 ($a_i$ is row i of A)


LM algo

  • A strategy pair (p,q) is a Nash equilibrium if and only if:

    • every label k is a label of p or of q (or both)

  • Proof!

  • Example
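The labeling and the "completely labeled" test are easy to compute for a given pair. The following sketch assumes 0-based labels (0..m-1 for player 1's actions, m..m+n-1 for player 2's actions) and uses a hypothetical 3x2 game that is not taken from the slides; `labels` and `completely_labeled` are illustrative names.

```python
from fractions import Fraction as F

def labels(A, B, p, q):
    """Return (labels of p, labels of q) using 0-based labels 0..m+n-1."""
    m, n = len(A), len(A[0])
    Aq = [sum(A[i][j] * q[j] for j in range(n)) for i in range(m)]
    pB = [sum(p[i] * B[i][j] for i in range(m)) for j in range(n)]
    # p gets its own unplayed actions plus player 2's best responses to p.
    labels_p = {i for i in range(m) if p[i] == 0} | {m + j for j in range(n) if pB[j] == max(pB)}
    # q gets its own unplayed actions plus player 1's best responses to q.
    labels_q = {m + j for j in range(n) if q[j] == 0} | {i for i in range(m) if Aq[i] == max(Aq)}
    return labels_p, labels_q

def completely_labeled(A, B, p, q):
    """True when every label appears on p or on q, i.e. (p, q) is a Nash equilibrium."""
    lp, lq = labels(A, B, p, q)
    return lp | lq == set(range(len(A) + len(A[0])))

# Hypothetical 3x2 game used only for illustration.
A = [[0, 6], [2, 5], [3, 3]]
B = [[1, 0], [0, 2], [4, 3]]

p_eq, q_eq = [F(2, 3), F(1, 3), F(0)], [F(1, 3), F(2, 3)]
p_bad, q_bad = [F(1), F(0), F(0)], [F(1), F(0)]
print(labels(A, B, p_eq, q_eq), completely_labeled(A, B, p_eq, q_eq))
print(labels(A, B, p_bad, q_bad), completely_labeled(A, B, p_bad, q_bad))
```

In the second pair label 0 is missing: player 1 plays action 0 although it is not a best response to q.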


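Lemke-Howson: Example

[Figure: strategy simplices G1 (player 1; corners (1,0,0), (0,1,0), (0,0,1); actions a1, a2, a3) and G2 (player 2; corners (1,0), (0,1); actions a4, a5), with regions labeled 1 to 5. Marked points: (2/3,1/3,0) and (0,1/3,2/3) in G1; (2/3,1/3) and (1/3,2/3) in G2. Payoff matrices U1 and U2 are not included in the transcript.]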

LM: non-degenerate

  • A two-player game is non-degenerate if

    • every strategy (p or q) with support of size k

    • has at most k pure best responses

  • Many equivalent definitions

  • Theorem: For a non-degenerate game

    • finite number of p with m labels

    • finite number of q with n labels


LM: Graphs

  • Consider distributions where:

    • player 1 has m labels

    • player 2 has n labels

  • Graph (per player):

    • join nodes that share all but 1 label

  • Product graph:

    • nodes are pairs of nodes (p,q)

    • edges: if (p,p’) is an edge then (p,q)-(p’,q) is an edge, and similarly for q


LM

  • completely labeled node:

    • node that has m+n labels

    • Nash!

  • node: k-almost completely labeled

    • has all labels except label k

  • edge: k-almost completely labeled

    • all labels on both sides except label k

  • artificial node: (0,0)


LM: Paths

  • Any Nash Eq.

    • connected to exactly one vertex which is

    • k-almost completely labeled

  • Any k-almost completely labeled node

    • has two neighbors in the graph

  • Follows from the non-degeneracy!


LM: algo

  • start at (0,0)

  • drop label k

  • follow a path

  • end of the path is a Nash
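The four steps above can be carried out with complementary pivoting on two tableaux, one per player. The sketch below is a compact illustration under several assumptions: the game is non-degenerate (no ties in the ratio test), labels are 0-based (0..m-1 for player 1, m..m+n-1 for player 2), payoffs are shifted to be positive (which does not change the equilibria), and the 3x2 game is a hypothetical example rather than the one in the slides.

```python
from fractions import Fraction

def pivot(tableau, basis, entering):
    """Bring the variable with label `entering` into the basis; return the label that leaves."""
    rows = [r for r in range(len(tableau)) if tableau[r][entering] > 0]
    # Minimum-ratio test: the first constraint to become tight along the edge.
    r = min(rows, key=lambda i: tableau[i][-1] / tableau[i][entering])
    scale = tableau[r][entering]
    tableau[r] = [x / scale for x in tableau[r]]
    for i in range(len(tableau)):
        if i != r and tableau[i][entering] != 0:
            f = tableau[i][entering]
            tableau[i] = [a - f * c for a, c in zip(tableau[i], tableau[r])]
    leaving, basis[r] = basis[r], entering
    return leaving

def lemke_howson(A, B, k=0):
    """Follow the path from the artificial node (0,0) after dropping label k."""
    m, n = len(A), len(A[0])
    shift = 1 - min(min(map(min, A)), min(map(min, B)))
    A = [[Fraction(a) + shift for a in row] for row in A]
    B = [[Fraction(b) + shift for b in row] for row in B]
    one, zero = Fraction(1), Fraction(0)
    # Player 1: B^T x + r = 1. Columns: x (labels 0..m-1), slacks r (labels m..m+n-1), rhs.
    P = [[B[i][j] for i in range(m)] + [one if c == j else zero for c in range(n)] + [one]
         for j in range(n)]
    # Player 2: s + A y = 1. Columns: slacks s (labels 0..m-1), y (labels m..m+n-1), rhs.
    Q = [[one if c == i else zero for c in range(m)] + A[i] + [one] for i in range(m)]
    sides = [(P, list(range(m, m + n))), (Q, list(range(m)))]
    turn = 0 if k < m else 1          # the tableau in which label k is currently non-basic
    label = k
    while True:
        label = pivot(sides[turn][0], sides[turn][1], label)
        if label == k:                # the dropped label was picked up again: completely labeled
            break
        turn = 1 - turn               # the duplicate label is dropped in the other tableau
    x, y = [zero] * m, [zero] * n
    for row, lab in zip(*sides[0]):
        if lab < m:
            x[lab] = row[-1]
    for row, lab in zip(*sides[1]):
        if lab >= m:
            y[lab - m] = row[-1]
    return [v / sum(x) for v in x], [v / sum(y) for v in y]

# Hypothetical 3x2 game used only for illustration.
A = [[0, 6], [2, 5], [3, 3]]
B = [[1, 0], [0, 2], [4, 3]]
p, q = lemke_howson(A, B, k=0)
print(p, q)
```

Different choices of the dropped label k may lead to different equilibria, and some equilibria may not be reachable from the artificial node at all.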


Lemke-Howson: Algorithm

[Figure: the diagrams G1 and G2 from the example, showing the path followed by the algorithm over successive slides.]


Lemke-Howson: Other Equilibria

[Figure: the diagrams G1 and G2 from the example, marking the other equilibria.]


LM: Theorem

  • Consider a non-degenerate game

  • Graph consists of disjoint paths and cycles

  • End points of paths are Nash

    • or (0,0)

  • Number of Nash is odd.


LM: Sketch of Proof

  • Deleting a label k

    • making the support larger, or

    • making the best-response (BR) set smaller

  • Smaller BR

    • solve for the smaller BR

    • subtract from the distribution until one component is zero

  • Larger support

    • unique solution (since non-degenerate)

