
Recent progress in computing approximate Nash equilibria



  1. Recent progress in computing approximate Nash equilibria Paul W. Goldberg Dept. of Computer Science University of Liverpool

  2. Nash equilibrium 2 players, each with a set of n pure strategies • For each pair (i,j), a payoff is specified for each player: R(i,j) for player 1 and C(i,j) for player 2. • These payoffs can be placed into two n×n matrices R and C. • We want probability distributions x and y over the players’ strategies such that their expected payoffs cannot be increased by either player changing his distribution: xᵀRy ≥ (x′)ᵀRy for all distributions x′ over player 1’s strategies, and xᵀCy ≥ xᵀCy′ for all distributions y′ over player 2’s strategies.

  3. Nash equilibrium Rock-paper-scissors. Row player’s payoffs (the game is zero-sum, so the column player’s payoffs are the negatives):
           R    P    S
      R    0   -1    1
      P    1    0   -1
      S   -1    1    0
     Both players mix (1/3, 1/3, 1/3) in the unique Nash equilibrium.

  4. Nash equilibrium A variant of rock-paper-scissors in which one of the payoffs is changed from 1 to 2. The equilibrium is no longer uniform: one player’s mixture becomes (1/3, 5/12, 1/4). (Thanks to Rahul Savani’s on-line NE program.)

  5. Computing Nash equilibria • Some pre-history: Nash equilibria are “hard” to compute exactly • But, there are notions of approximate NE… (ε-Nash equilibrium) • So, for what values of ε can we compute approximate NE? • (obvious analogy with approximation algorithms for NP-complete problems)

  6. ε-Nash equilibrium • exact NE: “no incentive to deviate” • ε-NE: gain of at most ε when you deviate • let x and y denote the row and column players’ mixed strategies; let eᵢ be the vector with a 1 in component i and zeros elsewhere. • For all i, xᵀRy ≥ eᵢᵀRy − ε. • For all j, xᵀCy ≥ xᵀCeⱼ − ε. • Assume payoffs are re-scaled into [0,1].
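
As a concrete reading of these conditions (my own illustration, not from the slides), here is a small NumPy sketch that computes the smallest ε for which a given pair (x, y) is an ε-NE; the function name epsilon_of is mine. On rock-paper-scissors with the uniform mix it returns 0.

```python
import numpy as np

def epsilon_of(R, C, x, y):
    """Smallest eps for which (x, y) is an eps-NE of the bimatrix game (R, C):
    the larger of the two players' best gains from deviating to a pure strategy."""
    row_regret = np.max(R @ y) - x @ R @ y   # best pure deviation for player 1
    col_regret = np.max(x @ C) - x @ C @ y   # best pure deviation for player 2
    return max(row_regret, col_regret)

# Rock-paper-scissors, payoffs rescaled into [0, 1]; the uniform mix is an exact NE.
R = (np.array([[0, -1, 1], [1, 0, -1], [-1, 1, 0]]) + 1) / 2
C = 1 - R                        # zero-sum game after rescaling
u = np.ones(3) / 3
print(epsilon_of(R, C, u, u))    # 0.0
```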

  7. A simple algorithm [Daskalakis, Mehta and Papadimitriou, WINE 2006] [Example 3×3 bimatrix game (R, C) shown on the slide.] ●1 Player 1 chooses an arbitrary pure strategy i; gives it probability ½

  8. A simple algorithm [Daskalakis, Mehta and Papadimitriou, WINE 2006] [Same example game.] ●2 Player 2 chooses a best response j to i; gives it probability 1

  9. A simple algorithm [Daskalakis, Mehta and Papadimitriou, WINE 2006] [Same example game.] ●3 Player 1 chooses a best response k to j; gives it probability ½
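
A minimal sketch of the three steps just described, assuming payoffs rescaled into [0,1] (the function name dmp_half_approx is mine):

```python
import numpy as np

def dmp_half_approx(R, C, i=0):
    """Sketch of the 1/2-approximation above; payoffs assumed rescaled into [0, 1]."""
    n, m = R.shape
    j = int(np.argmax(C[i]))     # step 2: player 2's best response to the pure strategy i
    k = int(np.argmax(R[:, j]))  # step 3: player 1's best response to j
    x = np.zeros(n)
    x[i] += 0.5                  # step 1: probability 1/2 on the arbitrary strategy i
    x[k] += 0.5                  # step 3: probability 1/2 on the best response k
    y = np.zeros(m)
    y[j] = 1.0                   # step 2: player 2 plays j with probability 1
    return x, y
```

Player 1’s regret is at most ½ because half of his probability sits on a best response to j, and player 2’s regret is at most ½ because half of player 1’s probability sits on i, to which j is a best response.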

  10. Can we improve this algorithm? (i.e. is there an “incremental” improvement?) e.g. Player 1 did not choose a “good” strategy to begin with…

  11. No! [Feder, Nazerzadeh and Saberi, EC 2007]: To get a better approximation than ½, strategies need support of size Θ(log n), where n is the number of strategies. Proof: ●1 Consider a zero-sum win-lose n×n game chosen uniformly at random. [Slide shows a random 0/1 payoff matrix.]

  12. Proof (continued): ●2 If player 1 uses the uniform distribution (probability 1/n on each row), he gets payoff about ½, whatever player 2 does…

  13. Proof (continued): ●3 If player 1 uses only one pure strategy, player 2’s best response leaves him with nothing!

  14. Proof (continued): ●4 Indeed, if player 1 mixes just 2 strategies, with high probability player 2 has a response that leaves player 1 with nothing…

  15. Proof (continued): Similarly for any constant-sized support, and indeed for any support of size less than (say) log(n)/2, in general.
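
A quick empirical illustration of the two extremes (my own, not part of the slides): in a uniformly random win-lose zero-sum game, the uniform mix guarantees the row player about ½, while any single pure strategy can be held to payoff 0 by the column player.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
R = rng.integers(0, 2, size=(n, n)).astype(float)  # row player's 0/1 payoffs, chosen u.a.r.

uniform = np.ones(n) / n
print(np.min(uniform @ R))     # about 0.5: the uniform mix against player 2's best reply
print(np.max(R.min(axis=1)))   # 0 with high probability: every pure row has a column paying it 0
```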

  16. How big a support do you need? • O(log n) is also an upper bound (for any constant ε) [Althöfer 1994; Lipton, Markakis and Mehta, EC 2003 (extended the result from 2-player to multi-player games)] ● Define an “empirical NE” as follows: draw N samples from x and y; replace x, y with the resulting empirical distributions.

  17. Example If N=100, an empirical NE for rock-paper-scissors might look like this: the row player mixes (0.36, 0.29, 0.35) and the column player mixes (0.27, 0.30, 0.43), instead of the exact (1/3, 1/3, 1/3) each.

  18. From player 1’s perspective, suppose player 2 replaces y with an empirical distribution ŷ based on N = O(log(n)/ε²) samples. With high probability, every pure strategy i gets about the same payoff as before: eᵢᵀRŷ = eᵢᵀRy + O(ε). ŷ has support of size O(log(n)/ε²), so if we do the same thing with x we get the desired result.
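
A quick numerical check of this sampling step (my own sketch; the helper empirical_copy is not from the slides): replacing y by an empirical copy changes every pure strategy’s payoff eᵢᵀRy by only a small amount.

```python
import numpy as np

def empirical_copy(p, N, rng):
    """Draw N samples from the distribution p and return the empirical distribution."""
    counts = np.bincount(rng.choice(len(p), size=N, p=p), minlength=len(p))
    return counts / N

rng = np.random.default_rng(1)
n = 50
R = rng.random((n, n))              # a random game, payoffs in [0, 1]
y = rng.dirichlet(np.ones(n))       # some mixed strategy for player 2
y_hat = empirical_copy(y, N=400, rng=rng)
print(np.max(np.abs(R @ y - R @ y_hat)))   # small: every row's payoff is nearly unchanged
```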

  19. Support enumeration Note that it follows that for any ε we can find an ε-NE in time n^O(log n), by enumerating all small supports. (This was pointed out in the Lipton et al. paper; another context where support enumeration “works” is on randomly-generated games [Bárány, Vempala and Vetta, FOCS ’05].)
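
A sketch of that enumeration (my own, with the 12/ε² constant taken from the Lipton–Markakis–Mehta analysis): enumerate all multisets of k pure strategies for each player and test the pair of uniform distributions over them.

```python
import itertools
import numpy as np

def support_enumeration(R, C, eps):
    """Quasi-polynomial-time sketch: by Lipton-Markakis-Mehta, some eps-NE has both
    players uniform over a multiset of k = O(log(n)/eps^2) pure strategies (payoffs
    in [0,1]), so enumerating all such multisets is guaranteed to find one."""
    n = R.shape[0]
    k = int(np.ceil(12 * np.log(n) / eps ** 2))   # constant taken from the LMM analysis

    def k_uniform_mixes(m):
        # all distributions that put multiples of 1/k on m pure strategies
        for multiset in itertools.combinations_with_replacement(range(m), k):
            yield np.bincount(multiset, minlength=m) / k

    for x in k_uniform_mixes(R.shape[0]):
        for y in k_uniform_mixes(R.shape[1]):
            row_regret = np.max(R @ y) - x @ R @ y
            col_regret = np.max(x @ C) - x @ C @ y
            if max(row_regret, col_regret) <= eps:
                return x, y
```

The number of multisets is n^O(log(n)/ε²), hence the quasi-polynomial running time; the sketch is purely illustrative and hopeless in practice.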

  20. Breaking the ε=½ Barrier [Bosse, Byrka and Markakis, WINE 07] Recall player 1’s initial strategy i may be poor, but now we know that an alternative pure strategy won’t necessarily help. Original game is (R,C); solve the zero-sum game (R−C, C−R); let x₀ and y₀ be player 1’s and player 2’s strategies in the solution. Let α be a parameter of the algorithm; if x₀ and y₀ are an α-NE, use them, else continue…

  21. Breaking the ε=½ Barrier [Bosse, Byrka and Markakis, WINE 07] Let j be player 2’s best response to x₀; player 2 uses pure strategy j. (BTW, assume player 2’s regret is at least player 1’s.) Let k be player 1’s pure best response to j; player 1 uses a mixture of x₀ and k. The mixture coefficient of k is (1−r)/(2−r), where r is player 1’s regret in the solution to the zero-sum game.

  22. Breaking the ε=½ Barrier [Bosse, Byrka and Markakis, WINE 07] The optimal choice of α is (3−√5)/2 = 0.382… Comment: Why does this work? When player 2 changes his mind (from using y₀) he is to some extent helping player 1; y₀ arose from a game where player 2 tries to hurt player 1 as well as help himself. In the paper, they tweak the algorithm to reduce the ε-value down to 0.364. In fact, a previous paper obtained 0.384+ζ…
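
A hedged end-to-end sketch of the scheme described on the last three slides, using SciPy’s LP solver for the zero-sum step; the helper names maxmin and bbm are mine, and the role swap used when player 1’s regret exceeds player 2’s is omitted for brevity.

```python
import numpy as np
from scipy.optimize import linprog

def maxmin(D):
    """Optimal (maxmin) strategy for the row player of the zero-sum game whose
    row-player payoff matrix is D, via the standard LP."""
    n, m = D.shape
    c = np.r_[np.zeros(n), -1.0]                   # variables x_1..x_n, v; minimize -v
    A_ub = np.hstack([-D.T, np.ones((m, 1))])      # for every column j: v - (x^T D)_j <= 0
    A_eq = np.r_[np.ones(n), 0.0].reshape(1, -1)   # x sums to 1
    res = linprog(c, A_ub=A_ub, b_ub=np.zeros(m), A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * n + [(None, None)])
    return res.x[:n]

def bbm(R, C, alpha=(3 - np.sqrt(5)) / 2):
    """Sketch of the Bosse-Byrka-Markakis scheme as described above; the role swap
    used when player 1's regret exceeds player 2's is omitted."""
    x0 = maxmin(R - C)                  # player 1's strategy in the zero-sum game (R-C, C-R)
    y0 = maxmin((C - R).T)              # player 2's strategy in the same game
    r1 = np.max(R @ y0) - x0 @ R @ y0   # player 1's regret at (x0, y0)
    r2 = np.max(x0 @ C) - x0 @ C @ y0   # player 2's regret at (x0, y0)
    if max(r1, r2) <= alpha:
        return x0, y0
    j = int(np.argmax(x0 @ C))          # player 2's best response to x0, played as a pure strategy
    k = int(np.argmax(R[:, j]))         # player 1's best response to j
    w = (1 - r1) / (2 - r1)             # mixture weight on k
    x = (1 - w) * x0
    x[k] += w
    y = np.zeros(C.shape[1])
    y[j] = 1.0
    return x, y
```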

  23. 0.384+ζ approximation [Daskalakis, Mehta and Papadimitriou, EC 2007] General idea: construct an LP that is satisfied by approximate solutions (x,y) to the game (R,C). Suppose (x*,y*) is a NE with payoffs v₁, v₂ to players 1 and 2 respectively. Suppose (x,y) is an empirical NE for N = 4/ζ². We can assume we have been given v₁, v₂, (x,y). ●(1): check that xᵀRy ≈ v₁ (and similarly for the column player)

  24. 0.384+ζ approximation (x*,y*) is a NE with payoffs v₁, v₂ to players 1 and 2 respectively. (x,y) is an empirical NE for N = 4/ζ². ●(1): check that xᵀRy ≈ v₁ (and similarly for the column player) ●(2): Find (x′,y′) that satisfy xᵀRy′ ≥ v₁ − 3ζ/2; for all i, eᵢᵀRy′ ≤ v₁ + ζ/2; x′ᵀRy ≥ v₁ − 3ζ/2; plus a similar set of constraints for the C matrix.

  25. 0.384+ζ approximation ●(1): check that xᵀRy ≈ v₁ (and similarly for the column player) ●(2): Find (x′,y′) that satisfy xᵀRy′ ≥ v₁ − 3ζ/2; for all i, eᵢᵀRy′ ≤ v₁ + ζ/2; x′ᵀRy ≥ v₁ − 3ζ/2; plus a similar set of constraints for the C matrix. ●(3): If max(v₁,v₂) ≥ ⅓ then return a certain mixture of x with x′ and of y with y′; else return (x′,y′).

  26. [Steps (1)–(3) as above.] If v₁ and v₂ are both < ⅓, these constraints ensure that there is not too much to gain by defecting to any pure strategy.

  27. [Steps (1)–(3) as above.] If v₁ (say) is at least ⅓, these constraints ensure that the mixture distribution has a good performance.
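
To make step (2) concrete, here is a hedged sketch of the feasibility LP using SciPy’s linprog. The slides do not spell out the “similar set of constraints for the C matrix”, so the three C constraints below are my assumed mirror images of the R constraints; the function name step2_lp is mine.

```python
import numpy as np
from scipy.optimize import linprog

def step2_lp(R, C, x_bar, y_bar, v1, v2, zeta):
    """Feasibility LP for step (2) above. The three constraints involving C are an
    assumption: they mirror the R constraints with the players' roles swapped."""
    n, m = R.shape
    lo1, hi1 = v1 - 1.5 * zeta, v1 + 0.5 * zeta
    lo2, hi2 = v2 - 1.5 * zeta, v2 + 0.5 * zeta
    zx, zy = np.zeros(n), np.zeros(m)
    # variables: x' (n entries) followed by y' (m entries); rows below encode "A z <= b"
    A = [np.r_[zx, -(x_bar @ R)],      # x_bar^T R y' >= v1 - 3*zeta/2
         np.r_[-(R @ y_bar), zy],      # x'^T R y_bar >= v1 - 3*zeta/2
         np.r_[zx, -(x_bar @ C)],      # x_bar^T C y' >= v2 - 3*zeta/2   (assumed mirror)
         np.r_[-(C @ y_bar), zy]]      # x'^T C y_bar >= v2 - 3*zeta/2   (assumed mirror)
    b = [-lo1, -lo1, -lo2, -lo2]
    for i in range(n):                 # e_i^T R y' <= v1 + zeta/2, for every row i
        A.append(np.r_[zx, R[i]])
        b.append(hi1)
    for j in range(m):                 # x'^T C e_j <= v2 + zeta/2, for every column j (assumed mirror)
        A.append(np.r_[C[:, j], zy])
        b.append(hi2)
    A_eq = np.zeros((2, n + m))        # x' and y' must each be probability distributions
    A_eq[0, :n] = 1
    A_eq[1, n:] = 1
    res = linprog(np.zeros(n + m), A_ub=np.array(A), b_ub=np.array(b),
                  A_eq=A_eq, b_eq=[1.0, 1.0], bounds=[(0, None)] * (n + m))
    return (res.x[:n], res.x[n:]) if res.success else None
```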

  28. Conclusions • The algorithms are not randomized, but the analysis often uses randomness • plenty of open problems…
