
Computing Best-Response Strategies in Infinite Games of Incomplete Information

Computing Best-Response Strategies in Infinite Games of Incomplete Information. Daniel Reeves and Michael Wellman, University of Michigan. Definitions: Infinite Game = infinite action spaces; Incomplete Information = payoffs depend on information that is private to the players.



Presentation Transcript

  1. Computing Best-Response Strategies in Infinite Games of Incomplete Information • Daniel Reeves and Michael Wellman • University of Michigan

  2. Definitions • Infinite Game = infinite action spaces • Incomplete Information = payoffs depend on information that is private to the players • Type = a player’s private information • One-shot Game = players each choose a single action simultaneously and then immediately receive a payoff • Strategy = a mapping from type to action • Best-Response Strategy = optimal strategy given known strategies of the other players • Nash Equilibrium = profile of strategies such that each strategy is a best response to the others • Bayes-Nash Equilibrium = generalization of NE to the case of incomplete information, for expected-utility maximizing players

  3. Finite Game Approximations • Finite game solvers: Gambit, Gala, Gametracer • Why not discretize? Introduces qualitative differences; computationally intractable

  4. Our Class of Games • 2-player, one-shot, infinite games of incomplete information • Piecewise uniform type distributions • Payoff functions of the form:

  5. Games in our Class • Other games: War of Attrition; incomplete-information versions of Cournot and Bertrand games

  6. Piecewise Linear Strategies • Specified by the vectors c, m, b
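One way to picture such a strategy is as a small lookup structure. The sketch below assumes (the slide does not say) that c holds the interior boundaries between pieces, m the slopes, and b the intercepts; the class name `PiecewiseLinearStrategy` and the value v = 2.4 are hypothetical, chosen only to encode the supply-chain candidate strategy from slide 11.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List


@dataclass
class PiecewiseLinearStrategy:
    """Maps a type t to the action m[k] * t + b[k] on the k-th piece.

    Assumed reading of the slide's vectors (not confirmed by the source):
    c = interior piece boundaries, m = slopes, b = intercepts.
    """
    c: List[float]  # K-1 increasing boundaries between the K pieces
    m: List[float]  # K slopes
    b: List[float]  # K intercepts

    def __post_init__(self) -> None:
        if not (len(self.m) == len(self.b) == len(self.c) + 1):
            raise ValueError("need K slopes, K intercepts and K-1 boundaries")

    def __call__(self, t: float) -> float:
        # A type exactly on a boundary uses the piece to its right.
        k = bisect_right(self.c, t)
        return self.m[k] * t + self.b[k]


# Supply-chain candidate strategy (slide 11) for an illustrative v = 2.4:
# bid 2/3 v - 1/2 below the cost threshold 2/3 v - 1, cost/2 + v/3 above it.
v = 2.4
candidate = PiecewiseLinearStrategy(
    c=[2 * v / 3 - 1],
    m=[0.0, 0.5],
    b=[2 * v / 3 - 0.5, v / 3],
)
print(candidate(0.3), candidate(0.8))
```

Storing only boundaries, slopes, and intercepts keeps evaluation at O(log K) per type, which is the kind of compact representation Theorem 1 relies on when it bounds the number of piece boundaries.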

  7. Existence and Computation of Piecewise Linear Best Responses • Theorem 1: Given a payoff function with I regions as above, an opponent type distribution with cdf F that is piecewise uniform with J pieces, and a piecewise linear strategy function with K pieces, the best response is itself a piecewise linear function with no more than 2(I-1)(J+K-2) piece boundaries.

  8. The Proof • For arbitrary own type t, and opponent type a random variable T, find own action a maximizing $E_T[u(t, a, T, s(T))]$ • (Numerical maximization not applicable due to parameter t) • Above works out to be a piecewise polynomial in a (parameterized by t) • For given t, finding optimal a is straightforward • Remains to find partitioning of type space such that within each type range, optimal action is a linear function of t • This can be done in polynomial time
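A small worked instance may make the proof steps concrete. It is only an illustration under stated assumptions: a first-price auction payoff of t − a when the bid a beats the opponent's bid and 0 otherwise (ties ignored), an opponent type T ~ U[0,1], and a hypothetical linear opponent strategy s(T) = mT with slope m > 0.

```latex
% Assumed FPSB payoff: u = t - a if the bid a beats the opponent's bid, else 0.
% Hypothetical opponent strategy: s(T) = mT, m > 0, T ~ U[0,1].
\[
E_T[u(t,a,T,s(T))] = (t-a)\,P(mT < a) =
\begin{cases}
(t-a)\,\dfrac{a}{m} & 0 \le a \le m \\[4pt]
t-a & a > m
\end{cases}
\]
% Maximizing each polynomial piece in a, then comparing, partitions the type space:
\[
a^*(t) =
\begin{cases}
t/2 & t \le 2m \\
m & t > 2m
\end{cases}
\]
```

The expected payoff is piecewise polynomial in a, and the optimal action is linear in t on each side of the single boundary t = 2m, which is the structure the proof describes.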

  9. Example: First-Price Sealed Bid Auction (FPSB) • Types (valuations) drawn from U[0,1] • Payoff function: • Known Bayes-Nash equilibrium (McAfee & McMillan, 1987): a(t)=t/2 • Found in as few as one iteration from a variety of seed strategies
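The "one iteration" claim can be checked numerically with a rough sketch. It assumes the standard first-price payoff (t − a on winning, 0 otherwise, ties ignored) and a linear seed strategy slope·T for the opponent; the function names are illustrative, and the grid search is a stand-in for the exact piecewise-linear computation, not the authors' algorithm.

```python
def win_probability(bid: float, slope: float) -> float:
    """P(slope * T < bid) for an opponent type T ~ U[0,1] and slope > 0."""
    return min(max(bid / slope, 0.0), 1.0)


def expected_payoff(t: float, bid: float, slope: float) -> float:
    """Assumed FPSB payoff: (t - bid) when winning, 0 otherwise."""
    return (t - bid) * win_probability(bid, slope)


def best_response(t: float, slope: float, steps: int = 10_000) -> float:
    """Grid search over bids in [0, 1] for a single own type t."""
    bids = (i / steps for i in range(steps + 1))
    return max(bids, key=lambda bid: expected_payoff(t, bid, slope))


# Seeds: truthful bidding (slope 1) and the known equilibrium (slope 1/2).
for seed_slope in (1.0, 0.5):
    for t in (0.2, 0.5, 0.8):
        print(seed_slope, t, best_response(t, seed_slope))
```

Against either seed the numerically best bid sits at about t/2 for each type tried, consistent with a(t) = t/2 being reached in a single best-response step from truthful bidding.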

  10. Example: Supply-chain Game • Producers’ Costs U[0,1] • Consumer’s Valuation v in [1.5,3] (known) • Payoff function: bid – cost if bid + bid2 <= v; 0 otherwise • (Diagram: Consumer, Producer 1, Producer 2)

  11. Proving a Bayes-Nash Equilibrium • Candidate Strategy: bid = 2/3 v – 1/2 if cost < 2/3 v – 1; cost/2 + v/3 otherwise • Compute best response…

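  12. Computing Best Response • Expected payoff of bid b at own cost c, when the opponent (cost c_2, bid b_2) plays the candidate strategy:

      \[
      \begin{aligned}
      EP(b) &= (b-c)\,P(b+b_2 \le v) \\
            &= (b-c)\Big[\,P\big(c_2 \le \tfrac{2}{3}v-1\big)\,P\big(b+\tfrac{2}{3}v-\tfrac{1}{2} \le v \,\big|\, c_2 \le \tfrac{2}{3}v-1\big) \\
            &\qquad\quad +\, P\big(c_2 > \tfrac{2}{3}v-1\big)\,P\big(b+\tfrac{c_2}{2}+\tfrac{v}{3} \le v \,\big|\, c_2 > \tfrac{2}{3}v-1\big)\Big] \\
            &= (b-c)\Big[\big(\tfrac{2}{3}v-1\big)\,P\big(b \le \tfrac{v}{3}+\tfrac{1}{2}\big) + P\big(\tfrac{2}{3}v-1 < c_2 < \tfrac{4}{3}v-2b\big)\Big]
      \end{aligned}
      \]

      • Case 1: $b \le \tfrac{2}{3}v-\tfrac{1}{2}$

      \[
      EP(b) = (b-c)\Big[\big(\tfrac{2}{3}v-1\big)\cdot 1 + \big(2-\tfrac{2}{3}v\big)\Big] = b-c
      \;\Longrightarrow\; b^* = \tfrac{2}{3}v-\tfrac{1}{2}
      \;\Longrightarrow\; EP_1(b^*) = \tfrac{2}{3}v-\tfrac{1}{2}-c
      \]

      • Case 2: $\tfrac{2}{3}v-\tfrac{1}{2} < b < \tfrac{v}{3}+\tfrac{1}{2}$

      \[
      EP(b) = (b-c)\Big[\big(\tfrac{2}{3}v-1\big) + \big(\tfrac{2}{3}v-2b+1\big)\Big] = (b-c)\big(\tfrac{4}{3}v-2b\big)
      \;\Longrightarrow\; b^* = \tfrac{c}{2}+\tfrac{v}{3}
      \;\Longrightarrow\; EP_2(b^*) = \tfrac{(3c-2v)^2}{18}
      \]

      • Case 3: $b > \tfrac{v}{3}+\tfrac{1}{2} \;\Longrightarrow\; EP_3(b) = 0$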

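  13. Computing Best Response (2) • The Case 2 maximizer b* = c/2 + v/3 lies inside the Case 2 range iff c > 2/3 v – 1, and there EP2(b*) >= EP1(b*) • For c < 2/3 v – 1, EP(b) is decreasing throughout Case 2, so the Case 1 bid b* = 2/3 v – 1/2 is optimal • Therefore, the best response is: 2/3 v – 1/2 if c < 2/3 v – 1; c/2 + v/3 otherwise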
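The case analysis can be sanity-checked numerically. The sketch below is an illustration rather than the authors' exact method: it approximates the opponent's cost distribution U[0,1] by a midpoint grid, assumes the opponent plays the candidate strategy, and grid-searches the own bid; v = 2.4 and all function names are hypothetical choices.

```python
from bisect import bisect_right


def candidate_bid(cost: float, v: float) -> float:
    """Candidate strategy from slide 11."""
    threshold = 2 * v / 3 - 1
    return 2 * v / 3 - 0.5 if cost < threshold else cost / 2 + v / 3


def numeric_best_response(cost: float, v: float,
                          n_types: int = 10_000, step: float = 0.001) -> float:
    """Grid-search the bid maximizing (bid - cost) * P(bid + b2 <= v)."""
    # Opponent costs on a midpoint grid of [0, 1]; sorted bids allow fast counting.
    opponent_bids = sorted(
        candidate_bid((j + 0.5) / n_types, v) for j in range(n_types)
    )

    def expected_payoff(bid: float) -> float:
        # Number of opponent bids b2 with bid + b2 <= v.
        accepted = bisect_right(opponent_bids, v - bid)
        return (bid - cost) * accepted / n_types

    bids = (i * step for i in range(int(v / step) + 1))
    return max(bids, key=expected_payoff)


v = 2.4  # illustrative consumer valuation in [1.5, 3]
for cost in (0.3, 0.8):  # one cost on each side of the threshold 2/3 v - 1 = 0.6
    print(cost, numeric_best_response(cost, v), candidate_bid(cost, v))
```

If the candidate strategy is a Bayes-Nash equilibrium, the two printed bids in each row should agree up to the grid resolution, for a cost below the threshold and for one above it.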

  14. Example: Bargaining Game • (aka, sealed-bid k-double auction) • Buyer and seller place bids, transaction happens iff they overlap • Transaction price is some linear combination of the bids • Known equilibrium (Chatterjee & Samuelson, 1983) for seller (1) and buyer (2): • Found in several iterations from truthful bidding

  15. Provision Point Mechanism • (aka, Public Good or Voluntary Participation game) • 2 agents want to jointly acquire a good costing C • Mechanism: simultaneously offer contributions; buy iff sum > C and split the excess (sum – C) evenly • Nash: a(t) = 2/3 t + C/4 – 1/6
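The stated strategy can be checked against itself with a short sketch. Several things here are assumptions not given on the slide: types are taken as U[0,1], C = 1 is an arbitrary illustrative cost, and the payoff is taken to be the type minus the net payment (own offer less half the excess) when the good is bought, and 0 otherwise. The expected payoff is integrated in closed form over the opponent's type and the own offer is grid-searched.

```python
def equilibrium_offer(t: float, C: float) -> float:
    """Strategy stated on the slide: 2/3 t + C/4 - 1/6."""
    return 2 * t / 3 + C / 4 - 1 / 6


def expected_payoff(t: float, offer: float, C: float) -> float:
    """Expected payoff against an opponent playing equilibrium_offer, T ~ U[0,1] (assumed)."""
    slope, intercept = 2 / 3, C / 4 - 1 / 6
    # Lowest opponent type whose offer brings the sum up to C, clipped to [0, 1].
    t_star = min(max((C - offer - intercept) / slope, 0.0), 1.0)
    # Payoff when buying: t - offer + (offer + slope*T + intercept - C) / 2,
    # integrated over T in [t_star, 1].
    constant = t - (offer + C - intercept) / 2
    return (1 - t_star) * constant + slope * (1 - t_star ** 2) / 4


def best_response(t: float, C: float, step: float = 0.001) -> float:
    """Grid search over offers in [0, C] for a single own type t."""
    offers = (i * step for i in range(int(C / step) + 1))
    return max(offers, key=lambda offer: expected_payoff(t, offer, C))


C = 1.0  # illustrative cost of the good
for t in (0.6, 0.8):
    print(t, best_response(t, C), equilibrium_offer(t, C))
```

Only types high enough to trade with positive probability are tried; for very low types every offer that never triggers a purchase yields the same payoff of 0, so the numerical argmax is not unique there.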

  16. Shared-Good Auction • New mechanism, similar to the divorce-settlement game; undoes provision-point • Agents place bids for a good they currently share, valuations ~U[A,B] • High bidder gets the good and pays half its bid to the low bidder in compensation

  17. Equilibrium in Shared-Good Auction • Found in one iteration from truthful bidding (for any specific [A,B])

  18. Vicious Vickrey Auction • Generalization of a Vickrey Auction (Brandt & Weiss, 2001) to allow for disutility from opponent’s utility (e.g., business competitors) • Brandt & Weiss consider only the complete information version

  19. Equilibrium in Vicious Vickrey • a(t) = (k+t)/(k+1) • Reduces to truthful bidding for the standard Vickrey Auction (k=0) • Iterated best-response solver finds this equilibrium (for specific values of k) within several iterations from a variety of seed strategies

  20. Conclusions • First algorithm for finding best-response strategies in a broad class of infinite games of incomplete information • Confirms known equilibria (e.g., FPSB), confirms equilibria we derive here (Supply-Chain game), discovers new equilibria (Shared-good auction, Vicious Vickrey) • Goal: characterize the class of games for which iterated best-response converges
