Computing Nash Equilibrium

Presenter: Yishay Mansour

Outline
  • Problem Definition
  • Notation
  • Last week: Zero-Sum game
  • This week:
    • Zero Sum: Online algorithm
    • General Sum Games
      • Multiple players – approximate Nash
      • 2 players – exact Nash
Model
  • Multiple players $N = \{1, \dots, n\}$
  • Strategy set
    • Player $i$ has $m$ actions $S_i = \{s_{i1}, \dots, s_{im}\}$
    • $S_i$ are the pure actions of player $i$
    • $S = \times_i S_i$
  • Payoff functions
    • Player $i$: $u_i : S \to \mathbb{R}$
Strategies
  • Pure strategies: actions
  • Mixed strategy
    • Player $i$: $p_i$ is a distribution over $S_i$
    • Game: $P = \times_i p_i$
      • Product distribution
  • Modified distribution
    • $P_{-i}$ = the distribution $P$ over all players except player $i$
    • $(q, P_{-i})$ = player $i$ plays $q$, every other player $j$ plays $p_j$
Notations
  • Average Payoff
    • Player $i$: $u_i(P) = E_{s \sim P}[u_i(s)] = \sum_s P(s)\, u_i(s)$
    • $P(s) = \prod_i p_i(s_i)$
  • Nash Equilibrium
    • $P^*$ is a Nash equilibrium if for every player $i$
    • and for any distribution $q_i$
    • $u_i(q_i, P^*_{-i}) \le u_i(P^*)$
      • Best Response
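
The definitions above translate directly into a brute-force check. The following is a minimal sketch, not part of the original slides: payoffs are assumed to be given as Python callables `u_i(s)` on pure profiles `s`, and the names `expected_payoff` and `is_nash` are hypothetical. Since $u_i(q_i, P_{-i})$ is linear in $q_i$, testing deviations to pure actions is enough.

```python
from itertools import product

def expected_payoff(u_i, mixed):
    """u_i(P) = sum over pure profiles s of P(s) * u_i(s), with P(s) = prod_j p_j(s_j)."""
    total = 0.0
    for s in product(*(range(len(p)) for p in mixed)):
        prob = 1.0
        for j, action in enumerate(s):
            prob *= mixed[j][action]
        total += prob * u_i(s)
    return total

def is_nash(payoffs, mixed, tol=1e-9):
    """True if no player gains more than tol by deviating to any pure action."""
    for i, u_i in enumerate(payoffs):
        base = expected_payoff(u_i, mixed)
        for a in range(len(mixed[i])):
            pure = [1.0 if b == a else 0.0 for b in range(len(mixed[i]))]
            deviated = mixed[:i] + [pure] + mixed[i + 1:]
            if expected_payoff(u_i, deviated) > base + tol:
                return False
    return True

# Illustrative game (an assumption, not from the slides): matching pennies.
def u1(s):
    return 1.0 if s[0] == s[1] else -1.0

def u2(s):
    return -u1(s)

uniform = [[0.5, 0.5], [0.5, 0.5]]
both_heads = [[1.0, 0.0], [1.0, 0.0]]
print(is_nash([u1, u2], uniform), is_nash([u1, u2], both_heads))
```

The enumeration visits every pure profile, so its cost grows exponentially with the number of players; it is meant only to make the notation concrete.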
Two player games
  • Payoff matrices $(A, B)$
    • $m$ rows and $n$ columns
    • player 1 has $m$ actions, player 2 has $n$ actions
  • strategies $p$ and $q$
  • Payoffs: $u_1(p, q) = p A q^T$ and $u_2(p, q) = p B q^T$
  • Zero sum game
    • $A = -B$
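
As a small illustration of the bilinear payoffs, the sketch below evaluates $p A q^T$ and $p B q^T$ with plain lists. The helper name `bilinear` and the matching-pennies matrices are assumptions chosen for the example.

```python
def bilinear(p, M, q):
    """Return p M q^T for a row distribution p, matrix M and column distribution q."""
    return sum(p[i] * M[i][j] * q[j] for i in range(len(p)) for j in range(len(q)))

# Zero-sum example (illustrative): B = -A.
A = [[1, -1], [-1, 1]]
B = [[-a for a in row] for row in A]

p = [0.5, 0.5]
q = [0.25, 0.75]
u1 = bilinear(p, A, q)
u2 = bilinear(p, B, q)
print(u1, u2)
```

In a zero-sum game the two payoffs always cancel, which is what `A = -B` expresses.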
Online learning
  • Playing with unknown payoff matrix
  • Online algorithm:
    • at each step selects an action.
      • can be stochastic or fractional
    • Observes all possible payoffs
    • Updates its parameters
  • Goal: Achieve the value of the game
    • Payoff matrix of the “game” is defined only at the end
Online learning - Algorithm
  • Notations:
    • Opponent distribution $Q_t$
    • Our distribution $P_t$
    • Observed cost $M(i, Q_t)$
      • Strictly, $M(i, Q_t) = (M Q_t)_i$, and $M(P_t, Q_t) = P_t M Q_t$
      • costs lie in $[0, 1]$
    • Goal: minimize cost
  • Algorithm: Exponential weights
    • Action $i$ has weight proportional to $b^{L(i,t)}$
    • $L(i, t)$ = loss of action $i$ until time $t$
Online algorithm: Notations
  • Formally:
    • Number of total steps $T$ is known
    • parameter: $b$, with $0 < b < 1$
    • $w_{t+1}(i) = w_t(i)\, b^{M(i, Q_t)}$
    • $Z_{t+1} = \sum_i w_{t+1}(i)$
    • $P_{t+1}(i) = w_{t+1}(i) / Z_{t+1}$
    • Initially, $P_1(i) > 0$ for every $i$
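
One step of the update can be sketched as follows. This is an illustration under assumptions: `exp_weights_step` and `row_costs` are hypothetical names, and the weights are normalized by the sum of the updated weights so that the result is a distribution.

```python
def row_costs(M, Q):
    """Cost M(i, Q) of each of our actions i against the opponent distribution Q."""
    return [sum(M[i][j] * Q[j] for j in range(len(Q))) for i in range(len(M))]

def exp_weights_step(w, costs, b):
    """Multiply each weight by b**cost, then normalize to get the next distribution."""
    new_w = [w_i * b ** c_i for w_i, c_i in zip(w, costs)]
    z = sum(new_w)
    return new_w, [w_i / z for w_i in new_w]

# Illustrative run: two actions, opponent plays column 0, where action 0 costs 1.
M = [[1, 0], [0, 1]]
w, P = exp_weights_step([1.0, 1.0], row_costs(M, [1.0, 0.0]), b=0.5)
print(w, P)
```

After the step, the action that just incurred cost 1 has half the weight of the other, so it is played with probability 1/3.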
Online algorithm: Theorem
  • Theorem
    • For any matrix $M$ with entries in $[0, 1]$
    • Any sequence of distributions $Q_1, \dots, Q_T$
    • The algorithm generates $P_1, \dots, P_T$
    • $RE(A \| B) = E_{x \sim A}[\ln(A(x) / B(x))]$
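
The inequality of the theorem is not preserved in this transcript. For reference only, the standard guarantee for this multiplicative update, in the form proved by Freund and Schapire, is given below. We assume, without confirmation from the slides, that this is the bound the slide displayed; the constants $a_b$ and $c_b$ are notation from that analysis, not from the slides.

```latex
\sum_{t=1}^{T} M(P_t, Q_t)
  \;\le\;
  \min_{P} \Bigl[ a_b \sum_{t=1}^{T} M(P, Q_t) + c_b \, RE(P \,\|\, P_1) \Bigr],
\qquad
a_b = \frac{\ln(1/b)}{1 - b},
\quad
c_b = \frac{1}{1 - b}.
```

With $P_1$ uniform over $n$ actions, $RE(P \| P_1) \le \ln n$ for every $P$, which is where the $\ln n$ in the later slides comes from.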
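Relative Entropy
  • For any two distributions $A$ and $B$
  • $RE(A \| B) = E_{x \sim A}[\ln(A(x) / B(x))]$
    • can be infinite
      • $B(x) = 0$ and $A(x) \ne 0$
    • Always non-negative
      • log is concave
      • $\sum_i a_i \log b_i \le \log \sum_i a_i b_i$ for weights $a_i \ge 0$ summing to 1
      • $-RE(A \| B) = \sum_x A(x) \ln(B(x) / A(x)) \le \ln \sum_x A(x)\, B(x) / A(x) = \ln \sum_x B(x) \le 0$, where the sums run over $x$ with $A(x) > 0$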
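
A direct implementation of the definition, as a sketch: the function name `relative_entropy` is an assumption, and distributions are plain lists over the same index set. Terms with $A(x) = 0$ contribute nothing, and the value is infinite as soon as $B(x) = 0$ while $A(x) > 0$.

```python
import math

def relative_entropy(A, B):
    """RE(A || B) = sum_x A(x) * ln(A(x) / B(x)), in nats."""
    total = 0.0
    for a, b in zip(A, B):
        if a == 0:
            continue          # 0 * ln(0 / b) is taken to be 0
        if b == 0:
            return math.inf   # A puts mass where B has none
        total += a * math.log(a / b)
    return total

print(relative_entropy([1.0, 0.0], [0.5, 0.5]))   # equals ln 2
print(relative_entropy([0.5, 0.5], [1.0, 0.0]))   # infinite
```

The two calls also show that relative entropy is not symmetric in its arguments.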
Online algorithm: Analysis
  • Lemma
    • For any mixed strategy P
  • Corollary
Online Algorithm: Optimization
  • $b = 1 / (1 + \sqrt{2 (\ln n) / T})$, where $n$ is the number of actions of the algorithm
    • additional loss
    • $O(\sqrt{(\ln n) / T})$
  • Zero sum game:
    • Average loss: $v$
    • additional loss $O(\sqrt{(\ln n) / T})$
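
The effect of this choice of $b$ can be checked numerically. The sketch below is an illustration under assumptions: the opponent is taken to best-respond at every step, the cost matrix is matching pennies with value $v = 1/2$, and the explicit constant $\sqrt{2 \ln n / T} + (\ln n)/T$ is the one from Freund and Schapire's analysis rather than from the slides. All function names are hypothetical.

```python
import math

def tuned_b(n, T):
    """The parameter from the slide: b = 1 / (1 + sqrt(2 ln n / T))."""
    return 1.0 / (1.0 + math.sqrt(2.0 * math.log(n) / T))

def play_against_best_response(M, T):
    """Run exponential weights for T steps against a cost-maximizing opponent.

    Returns (our average cost, average cost of the best fixed action in hindsight).
    """
    n, cols = len(M), len(M[0])
    b = tuned_b(n, T)
    w = [1.0] * n
    total = 0.0
    cumulative = [0.0] * n
    for _ in range(T):
        z = sum(w)
        P = [w_i / z for w_i in w]
        # Opponent picks the pure column that maximizes our expected cost.
        j = max(range(cols), key=lambda c: sum(P[i] * M[i][c] for i in range(n)))
        total += sum(P[i] * M[i][j] for i in range(n))
        for i in range(n):
            cumulative[i] += M[i][j]
            w[i] *= b ** M[i][j]
    return total / T, min(cumulative) / T

M = [[1, 0], [0, 1]]          # cost matrix with game value 1/2
T = 1000
average, best_fixed = play_against_best_response(M, T)
delta = math.sqrt(2.0 * math.log(2) / T) + math.log(2) / T
print(average, best_fixed, delta)
```

Because the opponent best-responds, the average cost cannot fall below the value of the game, and the regret bound keeps it within `delta` of the best fixed action.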
Two players General sum games
  • Input matrices (A,B)
  • No unique value
  • Computational issues:
    • find some Nash,
    • all Nash
      • Can be exponentially many
      • e.g., identity matrix
  • Example 2xN
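
To illustrate the "exponentially many" remark, the sketch below takes both payoff matrices to be the $n \times n$ identity (our reading of the "identity matrix" bullet, stated here as an assumption) and verifies that the uniform distribution on any non-empty subset of actions, played by both players, is a Nash equilibrium. That gives $2^n - 1$ equilibria. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def is_nash(p, q, A, B):
    """Exact test: each player's payoff equals the payoff of a best pure response."""
    m, n = len(A), len(A[0])
    row_pay = [sum(A[i][j] * q[j] for j in range(n)) for i in range(m)]
    col_pay = [sum(p[i] * B[i][j] for i in range(m)) for j in range(n)]
    u = sum(p[i] * row_pay[i] for i in range(m))
    v = sum(q[j] * col_pay[j] for j in range(n))
    return u == max(row_pay) and v == max(col_pay)

def count_uniform_equilibria(n):
    """Count supports S such that (uniform on S, uniform on S) is Nash for A = B = I."""
    identity = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    count = 0
    for k in range(1, n + 1):
        for support in combinations(range(n), k):
            p = [Fraction(1, k) if i in support else Fraction(0) for i in range(n)]
            if is_nash(p, p, identity, identity):
                count += 1
    return count

print(count_uniform_equilibria(3), count_uniform_equilibria(4))
```

Exact `Fraction` arithmetic is used so that the equality tests are not affected by rounding.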
Computational Complexity
  • Complexity of finding a sample equilibrium is unknown
    • “…no proof of NP-completeness seems possible” (Papadimitriou, 94)
  • Finding equilibria with certain properties is NP-hard
    • e.g., max-payoff, max-support
  • (Even) for symmetric 2-player games:
    • $\exists$ NE with expected social welfare at least $k$?
    • $\exists$ NE with least payoff at least $k$?
    • $\exists$ Pareto-optimal NE?
    • $\exists$ NE with player 1 expected utility of at least $k$?
    • $\exists$ multiple NE?
    • $\exists$ NE where player 1 plays (or not) a particular strategy?

Gilboa & Zemel; Conitzer & Sandholm


Two players General sum games
  • player 1 best response:
    • Like for zero sum:
    • Fix strategy $q$ of player 2
    • maximize $p (A q^T)$ such that $\sum_j p_j = 1$ and $p_j \ge 0$
    • dual LP: minimize $u$ such that $u \ge A q^T$ (in every coordinate)
    • Strong Duality: $p (A q^T) = u = \sum_j p_j u$
      • $p\,(u - A q^T) = 0$, reading $u$ as the vector with every entry equal to $u$
      • complementary system
  • Player 2: $(v - p B)\, q^T = 0$
Nash: Linear Complementary System
  • Find distributions $p$ and $q$ and values $u$ and $v$
    • $u \ge A q^T$
    • $v \ge p B$
    • $p\,(u - A q^T) = 0$
    • $(v - p B)\, q^T = 0$
    • $\sum_j p_j = 1$ and $p_j \ge 0$
    • $\sum_j q_j = 1$ and $q_j \ge 0$
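
The conditions of the system can be checked mechanically for a candidate $(p, q, u, v)$. The sketch below does so with exact arithmetic. The function name `satisfies_lcp` and the 3x2 game are assumptions for illustration; the game is not taken from the slides.

```python
from fractions import Fraction as F

def satisfies_lcp(p, q, u, v, A, B):
    """Check feasibility, the two inequalities and both complementarity conditions."""
    m, n = len(A), len(A[0])
    Aq = [sum(A[i][j] * q[j] for j in range(n)) for i in range(m)]   # A q^T
    pB = [sum(p[i] * B[i][j] for i in range(m)) for j in range(n)]   # p B
    feasible = (sum(p) == 1 and sum(q) == 1
                and all(x >= 0 for x in p) and all(y >= 0 for y in q))
    bounded = all(u >= x for x in Aq) and all(v >= y for y in pB)
    comp_p = sum(p[i] * (u - Aq[i]) for i in range(m)) == 0
    comp_q = sum(q[j] * (v - pB[j]) for j in range(n)) == 0
    return feasible and bounded and comp_p and comp_q

# Illustrative 3x2 game (assumed for this example).
A = [[0, 6], [2, 5], [3, 3]]
B = [[1, 0], [0, 2], [4, 3]]
p = [F(2, 3), F(1, 3), F(0)]
q = [F(1, 3), F(2, 3)]
print(satisfies_lcp(p, q, 4, F(2, 3), A, B))
```

Here rows 1 and 2 both earn 4 against `q` and both columns earn 2/3 against `p`, so every action in a support attains the corresponding value.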
Two players General sum games
  • Assume the support of strategies known.
    • p has support Sp and q has support Sq
    • Can formulate the Nash as LP:
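
The program itself is not preserved in this transcript. As a stand-in, the sketch below handles the special case of equal-size supports, where the indifference conditions form a square linear system. It swaps the LP for exact Gauss-Jordan elimination, which is a simplification of ours. The names `solve` and `nash_for_supports` and the 3x2 game are assumptions.

```python
from fractions import Fraction

def solve(M, rhs):
    """Gauss-Jordan elimination over Fractions; returns None if M is singular."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for c in range(n):
        piv = next((r for r in range(c, n) if aug[r][c] != 0), None)
        if piv is None:
            return None
        aug[c], aug[piv] = aug[piv], aug[c]
        d = aug[c][c]
        aug[c] = [x / d for x in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[-1] for row in aug]

def nash_for_supports(A, B, Sp, Sq):
    """Return (p, q, u, v) supported inside Sp and Sq, or None if there is none."""
    m, n, k = len(A), len(A[0]), len(Sp)
    if k != len(Sq):
        return None  # this sketch only covers equal-size supports
    rhs = [0] * k + [1]
    # Unknowns (q_j for j in Sq, then u): every row in Sp earns u, and q sums to 1.
    sol_q = solve([[A[i][j] for j in Sq] + [-1] for i in Sp] + [[1] * k + [0]], rhs)
    # Unknowns (p_i for i in Sp, then v): every column in Sq earns v, and p sums to 1.
    sol_p = solve([[B[i][j] for i in Sp] + [-1] for j in Sq] + [[1] * k + [0]], rhs)
    if sol_q is None or sol_p is None:
        return None
    p, q = [Fraction(0)] * m, [Fraction(0)] * n
    for i, x in zip(Sp, sol_p):
        p[i] = x
    for j, y in zip(Sq, sol_q):
        q[j] = y
    u, v = sol_q[k], sol_p[k]
    if min(p) < 0 or min(q) < 0:
        return None
    # No action outside the supports may do better than u or v.
    if any(sum(A[i][j] * q[j] for j in range(n)) > u for i in range(m)):
        return None
    if any(sum(p[i] * B[i][j] for i in range(m)) > v for j in range(n)):
        return None
    return p, q, u, v

A = [[0, 6], [2, 5], [3, 3]]   # illustrative game, not from the slides
B = [[1, 0], [0, 2], [4, 3]]
print(nash_for_supports(A, B, [0, 1], [0, 1]))
```

Trying every pair of supports with this routine gives a simple, exponential-time way to enumerate equilibria of small non-degenerate games.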
Approximate Nash
  • Assume we are given Nash
    • strategies (p,q)
  • Show that there exists:
    • small support
    • epsilon-Nash
  • Brute force search
    • enumerate all small supports!
    • Each one requires only polynomial time
  • Proof!
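
The brute-force search can be sketched as follows. As a simplifying assumption of ours, "small support" is modelled by strategies that are uniform over a multiset of $k$ actions; the names `k_uniform`, `is_eps_nash` and `find_eps_nash` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations_with_replacement

def k_uniform(num_actions, k):
    """Yield every distribution that is uniform over a multiset of k actions."""
    for multiset in combinations_with_replacement(range(num_actions), k):
        yield [Fraction(multiset.count(a), k) for a in range(num_actions)]

def is_eps_nash(p, q, A, B, eps):
    """True if no pure deviation improves either player's payoff by more than eps."""
    m, n = len(A), len(A[0])
    row_pay = [sum(A[i][j] * q[j] for j in range(n)) for i in range(m)]
    col_pay = [sum(p[i] * B[i][j] for i in range(m)) for j in range(n)]
    u = sum(p[i] * row_pay[i] for i in range(m))
    v = sum(q[j] * col_pay[j] for j in range(n))
    return max(row_pay) <= u + eps and max(col_pay) <= v + eps

def find_eps_nash(A, B, k, eps):
    """Return the first k-uniform pair (p, q) that is an eps-Nash, or None."""
    for p in k_uniform(len(A), k):
        for q in k_uniform(len(A[0]), k):
            if is_eps_nash(p, q, A, B, eps):
                return p, q
    return None

# Illustrative game: matching pennies, which has no pure equilibrium.
A = [[1, -1], [-1, 1]]
B = [[-1, 1], [1, -1]]
print(find_eps_nash(A, B, 1, 0), find_eps_nash(A, B, 2, 0))
```

For a fixed $k$ the number of multisets is polynomial in the number of actions, and each candidate pair is checked in polynomial time.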
Nash: Linear Complementary System
  • Find distributions $p$ and $q$ and values $u$ and $v$
    • $u \ge A q^T$
    • $v \ge p B$
    • $p\,(u - A q^T) = 0$
    • $(v - p B)\, q^T = 0$
    • $\sum_j p_j = 1$ and $p_j \ge 0$
    • $\sum_j q_j = 1$ and $q_j \ge 0$
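Lemke & Howson
  • Define labeling
  • For strategy $p$ (player 1):
    • Label $i$: if $p_i = 0$, where $i$ is an action of player 1
    • Label $j$: if action $j$ (player 2) is a best response to $p$
      • $b_j p \ge b_k p$ for every action $k$ of player 2, where $b_j$ is column $j$ of $B$
  • Similar for player 2
    • Label $j$: if $q_j = 0$, where $j$ is an action of player 2
    • Label $i$: if action $i$ (player 1) is a best response to $q$
      • $a_i q \ge a_k q$ for every action $k$ of player 1, where $a_i$ is row $i$ of $A$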
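
The labeling can be computed directly from the payoff matrices. In the sketch below, labels $1, \dots, m$ stand for player 1's actions and $m+1, \dots, m+n$ for player 2's actions. That numbering, the function names and the 3x2 game are assumptions made for illustration. A pair $(p, q)$ is a Nash equilibrium exactly when the two label sets together contain every label.

```python
from fractions import Fraction as F

def labels_of_p(p, B):
    """Unplayed actions of player 1, plus player 2's pure best responses to p."""
    m, n = len(B), len(B[0])
    labels = {i + 1 for i in range(m) if p[i] == 0}
    pay = [sum(p[i] * B[i][j] for i in range(m)) for j in range(n)]
    return labels | {m + j + 1 for j in range(n) if pay[j] == max(pay)}

def labels_of_q(q, A):
    """Unplayed actions of player 2, plus player 1's pure best responses to q."""
    m, n = len(A), len(A[0])
    labels = {m + j + 1 for j in range(n) if q[j] == 0}
    pay = [sum(A[i][j] * q[j] for j in range(n)) for i in range(m)]
    return labels | {i + 1 for i in range(m) if pay[i] == max(pay)}

def completely_labeled(p, q, A, B):
    """True if every label 1..m+n is a label of p or of q."""
    m, n = len(A), len(A[0])
    return labels_of_p(p, B) | labels_of_q(q, A) == set(range(1, m + n + 1))

# Illustrative 3x2 game (assumed; the slides' payoff matrices are not preserved).
A = [[0, 6], [2, 5], [3, 3]]
B = [[1, 0], [0, 2], [4, 3]]
p = (F(2, 3), F(1, 3), F(0))
q = (F(1, 3), F(2, 3))
print(labels_of_p(p, B), labels_of_q(q, A), completely_labeled(p, q, A, B))
```

In this example `p` carries labels 3, 4 and 5 and `q` carries labels 1 and 2, so the pair is completely labeled.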
LM algo
  • strategy (p,q) is Nash if and only if:
    • Each label k is either a label of p or q (or both)
  • Proof!
  • Example
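Lemke-Howson: Example

[Figure: label diagrams G1 (player 1, with points (1,0,0), (0,1,0), (0,0,1), (2/3,1/3,0) and (0,1/3,2/3)) and G2 (player 2, with points (1,0), (0,1), (2/3,1/3) and (1/3,2/3)); actions a1 to a5 and regions labeled 1 to 5. The payoff matrices U1 and U2 are not preserved in this transcript.]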

LM: non-degenerate
  • A two-player game is non-degenerate if
  • every strategy ($p$ or $q$)
    • with support of size $k$
  • has at most $k$ pure best responses
  • Many equivalent definitions
  • Theorem: For a non-degenerate game
    • finite number of $p$ with $m$ labels
    • finite number of $q$ with $n$ labels
LM: Graphs
  • Consider distributions where:
    • player 1 has m labels
    • player 2 has n labels
  • Graph (per player):
    • join nodes that share all but 1 label
  • Product graph:
    • nodes are pair of nodes (p,q)
    • edges: if (p,p’) an edge then (p,q)-(p’,q) edge
LM
  • completely labeled node:
    • node that has all $m+n$ labels
    • Nash!
  • node: k-almost completely labeled
    • has all labels except label k
  • edge: k-almost completely labeled
    • all labels on both sides except label k
  • artificial node: (0,0)
LM : Paths
  • Any Nash Eq.
    • connected to exactly one vertex which is
    • k-almost completely labeled
  • Any k-almost completely labeled node
    • has two neighbors in the graph
  • Follows from the non-degeneracy!
LM: algo
  • start at (0,0)
  • drop label k
  • follow a path
  • end of the path is a Nash
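
The path-following steps can be sketched on explicit label graphs. The example below is an illustration under assumptions: the vertex labels are hard-coded for an assumed 3x2 game with $A = [[0,6],[2,5],[3,3]]$ and $B = [[1,0],[0,2],[4,3]]$, which is not taken from the slides. Labels 1 to 3 are player 1's actions and labels 4 and 5 are player 2's. The names `neighbor` and `lemke_howson` are hypothetical, and the unique-neighbor step relies on non-degeneracy.

```python
from fractions import Fraction as F

# Vertex -> label set, for strategies of player 1 with 3 labels (G1)
# and strategies of player 2 with 2 labels (G2). The all-zero vertices are artificial.
G1 = {
    (F(0), F(0), F(0)): {1, 2, 3},
    (F(1), F(0), F(0)): {2, 3, 4},
    (F(0), F(1), F(0)): {1, 3, 5},
    (F(0), F(0), F(1)): {1, 2, 4},
    (F(2, 3), F(1, 3), F(0)): {3, 4, 5},
    (F(0), F(1, 3), F(2, 3)): {1, 4, 5},
}
G2 = {
    (F(0), F(0)): {4, 5},
    (F(1), F(0)): {3, 5},
    (F(0), F(1)): {1, 4},
    (F(2, 3), F(1, 3)): {2, 3},
    (F(1, 3), F(2, 3)): {1, 2},
}

def neighbor(graph, vertex, drop):
    """Move to the unique other vertex sharing all labels of `vertex` except `drop`."""
    keep = graph[vertex] - {drop}
    (other,) = [w for w, labels in graph.items() if w != vertex and keep <= labels]
    picked = (graph[other] - keep).pop()   # the one label gained by the move
    return other, picked

def lemke_howson(G1, G2, k):
    """Start at the artificial pair, drop label k, follow the path to a Nash pair."""
    p = next(v for v in G1 if not any(v))
    q = next(v for v in G2 if not any(v))
    move_p = k in G1[p]        # which player currently holds the label to drop
    drop = k
    while True:
        if move_p:
            p, picked = neighbor(G1, p, drop)
        else:
            q, picked = neighbor(G2, q, drop)
        if picked == k:        # the missing label is back: completely labeled
            return p, q
        drop, move_p = picked, not move_p   # picked is now duplicated; drop it on the other side

print(lemke_howson(G1, G2, 2))
```

Dropping label 2 visits (0,1,0), then (0,1), then (2/3,1/3,0), and stops at (1/3,2/3), where label 2 reappears. Different dropped labels can lead to different equilibria.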
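Lemke-Howson: Algorithm

[Figure: the G1 and G2 label diagrams of the example, with points (1,0,0), (0,1,0), (0,0,1), (2/3,1/3,0), (0,1/3,2/3) for player 1 and (1,0), (0,1), (2/3,1/3), (1/3,2/3) for player 2; actions a1 to a5 and labels 1 to 5. The path drawn on the slide is not preserved in this transcript.]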

Lemke-Howson: Other Equilibria

[Figure: the same G1 and G2 label diagrams, with actions a1 to a5 and labels 1 to 5. The paths drawn on the slide are not preserved in this transcript.]

LM: Theorem
  • Consider a non-degenerate game
  • Graph consists of disjoint paths and cycles
  • End points of paths are Nash
    • or (0,0)
  • Number of Nash is odd.
LM: Sketch of Proof
  • Deleting a label k
    • making support larger
    • making the best-response set smaller
  • Smaller best-response set
    • solve for the smaller best-response set
    • subtract from the distribution until one component is zero
  • Larger support
    • unique solution (since non-degenerate)