
# Computing Nash Equilibrium - PowerPoint PPT Presentation

Computing Nash Equilibrium. Presenter: Yishay Mansour. Outline. Problem Definition Notation Last week: Zero-Sum game This week: Zero Sum: Online algorithm General Sum Games Multiple players – approximate Nash 2 players – exact Nash. Model. Multiple players N={1, ... , n}


Presentation Transcript

### Computing Nash Equilibrium

Presenter: Yishay Mansour

• Problem Definition

• Notation

• Last week: Zero-Sum game

• This week:

• Zero Sum: Online algorithm

• General Sum Games

• Multiple players – approximate Nash

• 2 players – exact Nash

• Multiple players N={1, ... , n}

• Strategy set

• Player i has m actions $S_i = \{s_{i1}, \dots, s_{im}\}$

• $S_i$ are the pure actions of player i

• $S = \times_i S_i$

• Payoff functions

• Player i: $u_i : S \to \mathbb{R}$

• Pure strategies: actions

• Mixed strategy

• Player i: $p_i$, a distribution over $S_i$

• Game: $P = \prod_i p_i$

• Product distribution

• Modified distribution

• $P_{-i}$ = the distribution P for all players except player i

• $(q, P_{-i})$ = player i plays q, every other player j plays $p_j$

• Average Payoff

• Player i: $u_i(P) = E_{s \sim P}[u_i(s)] = \sum_s P(s)\, u_i(s)$

• $P(s) = \prod_i p_i(s_i)$
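The average payoff can be computed directly from its definition by summing over all pure profiles. The following is a minimal sketch, not taken from the slides: the function name `expected_payoff`, the `match` payoff, and the example strategies are illustrative assumptions.

```python
from itertools import product

def expected_payoff(payoff, strategies):
    """Return sum_s P(s) * payoff(s) under the product distribution P.

    payoff: function from a pure profile (tuple of action indices) to a number.
    strategies: one list of probabilities per player (assumed to sum to 1).
    """
    total = 0.0
    for profile in product(*(range(len(p)) for p in strategies)):
        prob = 1.0
        for player, action in enumerate(profile):
            prob *= strategies[player][action]   # P(s) = prod_i p_i(s_i)
        total += prob * payoff(profile)
    return total

# Hypothetical 2-player payoff: 1 when both players pick the same action.
def match(profile):
    return 1.0 if profile[0] == profile[1] else 0.0

value = expected_payoff(match, [[0.5, 0.5], [0.25, 0.75]])
```

The loop visits every pure profile, so the cost grows as m^n; this is fine for small examples only.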

• Nash Equilibrium

• $P^*$ is a Nash equilibrium if, for every player i

• and for any distribution $q_i$

• $u_i(q_i, P^*_{-i}) \le u_i(P^*)$

• Best Response

• Payoff matrices (A,B)

• m rows and n columns

• player 1 has m actions, player 2 has n actions

• strategies p and q

• Payoffs: $u_1(p,q) = p A q^t$ and $u_2(p,q) = p B q^t$

• Zero sum game

• $A = -B$
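For two players the payoffs are bilinear, so the Nash condition only needs to be checked against pure deviations: a mixed deviation is an average of pure ones and cannot do better than the best of them. The sketch below assumes exact arithmetic with `Fraction`; the helper names and the matching-pennies matrices are illustrative choices, not part of the slides.

```python
from fractions import Fraction as F

def payoff(p, M, q):
    """Bilinear payoff p M q^t for a row strategy p and a column strategy q."""
    return sum(p[i] * M[i][j] * q[j]
               for i in range(len(p)) for j in range(len(q)))

def unit(k, size):
    """Pure strategy that plays action k with probability 1."""
    return [F(int(i == k)) for i in range(size)]

def is_nash(A, B, p, q):
    """True if no pure deviation improves either player's payoff."""
    m, n = len(A), len(A[0])
    u1, u2 = payoff(p, A, q), payoff(p, B, q)
    row_ok = all(payoff(unit(i, m), A, q) <= u1 for i in range(m))
    col_ok = all(payoff(p, B, unit(j, n)) <= u2 for j in range(n))
    return row_ok and col_ok

# Matching pennies as a zero-sum example (A = -B); chosen for illustration.
A = [[1, -1], [-1, 1]]
B = [[-a for a in row] for row in A]
half = [F(1, 2), F(1, 2)]

at_equilibrium = is_nash(A, B, half, half)            # both players mix evenly
off_equilibrium = is_nash(A, B, [F(1), F(0)], half)   # player 1 plays row 1 only
zero_sum = payoff(half, A, half) + payoff(half, B, half) == 0
```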

• Playing with unknown payoff matrix

• Online algorithm:

• at each step selects an action.

• can be stochastic or fractional

• Observes all possible payoffs

• Updates its parameters

• Goal: Achieve the value of the game

• Payoff matrix of the “game” is defined at the end

• Notations:

• Opponent distribution $Q_t$

• Our distribution $P_t$

• Observed cost $M(i, Q_t)$

• More precisely, this is entry i of $M Q_t$, and $M(P_t, Q_t) = P_t M Q_t$

• cost in [0,1]

• Goal: minimize cost

• Algorithm: Exponential weights

• Action i has weight proportional to $b^{L(i,t)}$

• $L(i,t)$ = loss of action i until time t

• Formally:

• Number of total steps T is known

• parameter: b, with $0 < b < 1$

• $w_{t+1}(i) = w_t(i)\, b^{M(i,Q_t)}$

• $Z_{t+1} = \sum_i w_{t+1}(i)$

• $P_{t+1}(i) = w_{t+1}(i) / Z_{t+1}$

• Initially, $P_1(i) > 0$ for every i
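The update rule can be sketched in a few lines. This is an illustrative implementation under stated assumptions: the weights start uniform (one way to satisfy $P_1(i) > 0$), the opponent's sequence is given up front, and the function name `exponential_weights` and the 2×2 cost matrix are made up for the example.

```python
import math

def exponential_weights(M, opponent_plays, b):
    """Run the exponential-weights update against a sequence Q_1..Q_T.

    M: cost matrix with entries in [0, 1]; rows are our actions.
    opponent_plays: list of opponent distributions over the columns.
    b: parameter with 0 < b < 1.
    Returns our distributions P_1..P_T and the total expected cost.
    """
    n = len(M)
    w = [1.0] * n                      # assumption: uniform initial weights
    history, total = [], 0.0
    for Q in opponent_plays:
        Z = sum(w)
        P = [wi / Z for wi in w]       # normalize weights into a distribution
        history.append(P)
        # Observed cost of each action, M(i, Q_t).
        costs = [sum(M[i][j] * Q[j] for j in range(len(Q))) for i in range(n)]
        total += sum(P[i] * costs[i] for i in range(n))   # M(P_t, Q_t)
        w = [w[i] * b ** costs[i] for i in range(n)]      # multiplicative update
    return history, total

# Illustrative run: the opponent always plays column 0, so action 0 costs 0.
M = [[0.0, 1.0], [1.0, 0.0]]
T = 50
b = 1 / (1 + math.sqrt(2 * math.log(len(M)) / T))
history, total = exponential_weights(M, [[1.0, 0.0]] * T, b)
```

In this run the weight of the costly action shrinks by a factor b each step, so the distribution concentrates on the zero-cost action and the total cost stays bounded by 1/(1-b).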

• Theorem

• For any matrix M with entries in [0,1]

• Any sequence of distributions $Q_1, \dots, Q_T$

• The algorithm generates $P_1, \dots, P_T$

• $RE(A\|B) = E_{x \sim A}[\ln(A(x)/B(x))]$

• For any two distributions A and B

• $RE(A\|B) = E_{x \sim A}[\ln(A(x)/B(x))]$

• can be infinite

• $B(x) = 0$ and $A(x) \ne 0$

• Always non-negative

• log is concave

• $\sum_i a_i \log b_i \le \log \sum_i a_i b_i$

• $\sum_x A(x) \ln (B(x)/A(x)) \le \ln \sum_x A(x)\, B(x)/A(x) = 0$
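A small numeric sketch of the relative entropy, covering the infinite case and non-negativity noted above. The function name and the sample distributions are illustrative; terms with $A(x) = 0$ are treated as contributing 0, the usual convention.

```python
import math

def relative_entropy(A, B):
    """RE(A||B) = sum_x A(x) ln(A(x)/B(x)) for two distributions given as lists."""
    total = 0.0
    for a, b in zip(A, B):
        if a == 0:
            continue            # convention: 0 * ln(0/b) = 0
        if b == 0:
            return math.inf     # B(x) = 0 while A(x) != 0
        total += a * math.log(a / b)
    return total

same = relative_entropy([0.5, 0.5], [0.5, 0.5])      # identical distributions
point = relative_entropy([1.0, 0.0], [0.5, 0.5])     # point mass vs uniform
infinite = relative_entropy([0.5, 0.5], [1.0, 0.0])  # B misses part of A's support
```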

• Lemma

• For any mixed strategy P

• Corollary

• $b = 1/(1 + \sqrt{2 (\ln n)/T})$

• $O(\sqrt{(\ln n)/T})$

• Zero sum game:

• Average Loss: v

• additional loss $O(\sqrt{(\ln n)/T})$

• Input matrices (A,B)

• No unique value

• Computational issues:

• find some Nash,

• all Nash

• Can be exponentially many

• identity matrix

• Example 2xN
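One reading of the identity-matrix remark is the game A = B = I with n actions per player: for every non-empty set of actions, both players mixing uniformly over that set is an equilibrium, which gives at least $2^n - 1$ equilibria. The sketch below checks this by brute force; the interpretation and all names are assumptions, not taken from the slides.

```python
from fractions import Fraction
from itertools import combinations

def uniform_on(support, n):
    """Uniform distribution over the given set of actions."""
    return [Fraction(1, len(support)) if i in support else Fraction(0)
            for i in range(n)]

def is_nash_identity(p, q):
    """Nash check when A = B = identity: row i earns q[i], column j earns p[j]."""
    value = sum(pi * qi for pi, qi in zip(p, q))   # both players earn p q^t
    return max(q) <= value and max(p) <= value

def count_uniform_equilibria(n):
    """Count supports whose uniform strategy pair is a Nash equilibrium."""
    count = 0
    for size in range(1, n + 1):
        for support in combinations(range(n), size):
            mix = uniform_on(support, n)
            if is_nash_identity(mix, mix):
                count += 1
    return count

counts = [count_uniform_equilibria(n) for n in range(1, 6)]
```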

• Complexity of finding a sample equilibrium is unknown

• “…no proof of NP-completeness seems possible” (Papadimitriou, 94)

• Finding equilibria with certain properties is NP-hard

• e.g., max-payoff, max-support

• (Even) for symmetric 2-player games:

• ∃ NE with expected social welfare at least k?

• ∃ NE with least payoff at least k?

• ∃ Pareto-optimal NE?

• ∃ NE with player 1 EU of at least k?

• ∃ multiple NE?

• ∃ NE where player 1 plays (or not) a particular strategy?

(Gilboa & Zemel; Conitzer & Sandholm)

• player 1 best response:

• Like for zero sum:

• Fix strategy q of player 2

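• maximize $p (A q^t)$ such that $\sum_j p_j = 1$ and $p_j \ge 0$

• dual LP: minimize u such that $u \ge A q^t$ (in every coordinate)

• Strong Duality: $p (A q^t) = u = p \cdot u$, reading u as the constant vector (u, ..., u)

• $p (u - A q^t) = 0$

• complementary system

• Player 2: $(v - pB)\, q^t = 0$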

• Find distributions p and q and values u and v

• $u \ge A q^t$

• $v \ge p B$

• $p (u - A q^t) = 0$

• $(v - pB)\, q^t = 0$

• $\sum_j p_j = 1$ and $p_j \ge 0$

• $\sum_j q_j = 1$ and $q_j \ge 0$
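The complementarity conditions can be evaluated numerically for a candidate pair (p, q): take the smallest feasible u and v, then measure the two products, which vanish exactly at an equilibrium. The sketch below is illustrative; the 3×2 payoff matrices are an assumed example (not recovered from the slides) and `complementarity_gaps` is a hypothetical helper name.

```python
from fractions import Fraction as F

def complementarity_gaps(A, B, p, q):
    """Return (p (u - A q^t), (v - p B) q^t) for the smallest feasible u and v."""
    m, n = len(A), len(A[0])
    Aq = [sum(A[i][j] * q[j] for j in range(n)) for i in range(m)]   # A q^t
    pB = [sum(p[i] * B[i][j] for i in range(m)) for j in range(n)]   # p B
    u, v = max(Aq), max(pB)      # smallest u, v with u >= A q^t and v >= p B
    gap1 = sum(p[i] * (u - Aq[i]) for i in range(m))
    gap2 = sum(q[j] * (v - pB[j]) for j in range(n))
    return gap1, gap2

# Assumed example game: player 1 has 3 actions, player 2 has 2.
A = [[0, 6], [2, 5], [3, 3]]
B = [[1, 0], [0, 2], [4, 3]]

nash_gaps = complementarity_gaps(A, B, [F(2, 3), F(1, 3), F(0)], [F(1, 3), F(2, 3)])
other_gaps = complementarity_gaps(A, B, [F(1), F(0), F(0)], [F(1), F(0)])
```

Both gaps are non-negative by construction, so a pair is an equilibrium exactly when both are zero.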

• Assume the supports of the strategies are known.

• p has support $S_p$ and q has support $S_q$

• Can formulate the Nash equilibrium as an LP:
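Once the supports are fixed, the equilibrium conditions become linear. The sketch below swaps the LP for a square linear system, which is enough when both supports have the same size: q must make every row in $S_p$ equally good, p must make every column in $S_q$ equally good, and no action outside the supports may do better. The example matrices and all function names are assumptions for illustration.

```python
from fractions import Fraction

def solve(rows, rhs):
    """Solve a square linear system exactly; return None if it is singular."""
    n = len(rows)
    aug = [[Fraction(x) for x in row] + [Fraction(b)] for row, b in zip(rows, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [x / aug[col][col] for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]

def indifferent_mix(payoffs, own_support, other_support, size):
    """Mix over other_support that gives every action in own_support the same value.

    payoffs[i][j] is the payoff of own action i against the other's action j.
    """
    k = len(other_support)
    rows = [[payoffs[i][j] for j in other_support] + [-1] for i in own_support]
    rows.append([1] * k + [0])                 # probabilities sum to 1
    sol = solve(rows, [0] * len(own_support) + [1])
    if sol is None or any(x < 0 for x in sol[:k]):
        return None
    mix = [Fraction(0)] * size
    for j, x in zip(other_support, sol[:k]):
        mix[j] = x
    return mix, sol[k]

def nash_for_supports(A, B, Sp, Sq):
    """Equilibrium with supports inside Sp and Sq (equal sizes), or None."""
    m, n = len(A), len(A[0])
    Bt = [[B[i][j] for i in range(m)] for j in range(n)]   # player 2's view
    res_q = indifferent_mix(A, Sp, Sq, n)
    res_p = indifferent_mix(Bt, Sq, Sp, m)
    if res_q is None or res_p is None:
        return None
    (q, u), (p, v) = res_q, res_p
    # No action, inside or outside the support, may beat the common value.
    if any(sum(A[i][j] * q[j] for j in range(n)) > u for i in range(m)):
        return None
    if any(sum(p[i] * B[i][j] for i in range(m)) > v for j in range(n)):
        return None
    return p, q

# Assumed example game with 0-indexed actions.
A = [[0, 6], [2, 5], [3, 3]]
B = [[1, 0], [0, 2], [4, 3]]
found = nash_for_supports(A, B, [1, 2], [0, 1])
missing = nash_for_supports(A, B, [0, 2], [0, 1])
```

Looping this over all support pairs of equal size gives a simple, exponential-time way to list equilibria of small non-degenerate games.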

• Assume we are given Nash

• strategies (p,q)

• Show that there exists:

• small support

• epsilon-Nash

• Brute force search

• enumerate all small supports!

• Each one requires only poly. time

• Proof!
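The brute-force search can be sketched by restricting both players to strategies that are uniform over a multiset of k actions and testing each pair for the epsilon-Nash condition. This is a minimal illustration under assumptions: the multiset form of "small support", the function names, and the matching-pennies example are not taken from the slides.

```python
from fractions import Fraction
from itertools import combinations_with_replacement

def uniform_multiset(multiset, size):
    """Mixed strategy that plays each action in proportion to its multiplicity."""
    mix = [Fraction(0)] * size
    for a in multiset:
        mix[a] += Fraction(1, len(multiset))
    return mix

def is_eps_nash(A, B, p, q, eps):
    """True if no pure deviation gains more than eps for either player."""
    m, n = len(A), len(A[0])
    row_pay = [sum(A[i][j] * q[j] for j in range(n)) for i in range(m)]
    col_pay = [sum(p[i] * B[i][j] for i in range(m)) for j in range(n)]
    u1 = sum(p[i] * row_pay[i] for i in range(m))
    u2 = sum(q[j] * col_pay[j] for j in range(n))
    return max(row_pay) <= u1 + eps and max(col_pay) <= u2 + eps

def search_small_support(A, B, k, eps):
    """Enumerate all pairs of size-k multisets; return the first eps-Nash pair."""
    m, n = len(A), len(A[0])
    for rows in combinations_with_replacement(range(m), k):
        p = uniform_multiset(rows, m)
        for cols in combinations_with_replacement(range(n), k):
            q = uniform_multiset(cols, n)
            if is_eps_nash(A, B, p, q, eps):
                return p, q
    return None

# Matching pennies, chosen for illustration: no pure equilibrium exists.
A = [[1, -1], [-1, 1]]
B = [[-1, 1], [1, -1]]
with_k1 = search_small_support(A, B, 1, 0)
with_k2 = search_small_support(A, B, 2, 0)
```

Each candidate pair is checked in polynomial time; the number of candidates grows roughly as n^k per player, so the search is only practical for small k.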

• Find distributions p and q and values u and v

• $u \ge A q^t$

• $v \ge p B$

• $p (u - A q^t) = 0$

• $(v - pB)\, q^t = 0$

• $\sum_j p_j = 1$ and $p_j \ge 0$

• $\sum_j q_j = 1$ and $q_j \ge 0$

• Define labeling

• For strategy p (player 1):

• Label i: if $p_i = 0$, where i is an action of player 1

• Label j: if action j (player 2) is a best response to p

• $b_j p \ge b_k p$ for every action k of player 2

• Similar for player 2

• Label j: if $q_j = 0$, where j is an action of player 2

• Label i: if action i (player 1) is a best response to q

• $a_i q \ge a_k q$ for every action k of player 1

• strategy (p,q) is Nash if and only if:

• Each label k is either a label of p or q (or both)

• Proof!
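The labeling can be computed directly from the definitions. In the sketch below, player 1's actions carry labels 1..m and player 2's actions carry labels m+1..m+n; this numbering, the function names, and the 3×2 payoff matrices are illustrative assumptions.

```python
from fractions import Fraction as F

def labels_of_p(p, B):
    """Labels of p: player 1's unplayed actions and player 2's best responses."""
    m, n = len(B), len(B[0])
    col_pay = [sum(p[i] * B[i][j] for i in range(m)) for j in range(n)]
    best = max(col_pay)
    own = {i + 1 for i in range(m) if p[i] == 0}
    responses = {m + j + 1 for j in range(n) if col_pay[j] == best}
    return own | responses

def labels_of_q(q, A):
    """Labels of q: player 2's unplayed actions and player 1's best responses."""
    m, n = len(A), len(A[0])
    row_pay = [sum(A[i][j] * q[j] for j in range(n)) for i in range(m)]
    best = max(row_pay)
    own = {m + j + 1 for j in range(n) if q[j] == 0}
    responses = {i + 1 for i in range(m) if row_pay[i] == best}
    return own | responses

def completely_labeled(p, q, A, B):
    """True if every label 1..m+n appears on p or on q."""
    m, n = len(A), len(A[0])
    return labels_of_p(p, B) | labels_of_q(q, A) == set(range(1, m + n + 1))

# Assumed example game: labels 1-3 for player 1, labels 4-5 for player 2.
A = [[0, 6], [2, 5], [3, 3]]
B = [[1, 0], [0, 2], [4, 3]]

p_star, q_star = [F(0), F(1, 3), F(2, 3)], [F(2, 3), F(1, 3)]
p_labels = labels_of_p(p_star, B)
q_labels = labels_of_q(q_star, A)
nash_pair = completely_labeled(p_star, q_star, A, B)
non_nash_pair = completely_labeled([F(1), F(0), F(0)], [F(1), F(0)], A, B)
```

Exact fractions are used so that ties between best responses are detected reliably; with floats the equality test would need a tolerance.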

• Example

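[Figure, shown on two slides: labeled diagrams G1 and G2. G1 covers player 1's mixed strategies, with vertices (1,0,0), (0,1,0), (0,0,1) and marked points (0,1/3,2/3) and (2/3,1/3,0). G2 covers player 2's mixed strategies between (1,0) and (0,1), with marked points (2/3,1/3) and (1/3,2/3). Regions and points carry labels 1-5 for actions a1-a5. The payoff matrices U1 and U2 were not captured in the transcript.]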

• A two-player game is non-degenerate if

• given any strategy (p or q)

• with support of size k

• there are at most k pure best responses

• Many equivalent definitions

• Theorem: For a non-degenerate game

• finite number of p with m labels

• finite number of q with n labels

• Consider distributions where:

• player 1 has m labels

• player 2 has n labels

• Graph (per player):

• join nodes that share all but 1 label

• Product graph:

• nodes are pairs of nodes (p,q)

• edges: if (p,p’) is an edge then (p,q)-(p’,q) is an edge

• completely labeled node:

• node that has m+n labels

• Nash!

• node: k-almost completely labeled

• has all labels except label k

• edge: k-almost completely labeled

• all labels on both sides except label k

• artificial node: (0,0)

• Any Nash Eq.

• is connected to exactly one vertex which is

• k-almost completely labeled

• Any k-almost completely labeled node

• has two neighbors in the graph

• Follows from the non-degeneracy!

• start at (0,0)

• drop label k

• follow a path

• end of the path is a Nash

[Figures, four slides: the G1 and G2 diagrams from the example repeated with the same vertices, labels 1-5 and actions a1-a5, stepping through the path; the highlighted path steps were not captured in the transcript.]

• Consider a non-degenerate game

• Graph consists of disjoint paths and cycles

• End points of paths are Nash

• or (0,0)

• Number of Nash is odd.

• Deleting a label k

• making support larger

• making BR smaller

• Smaller BR

• solve for the smaller BR

• subtract from dist. until one component is zero

• Larger support

• unique solution (since non-degenerate)