
Evolution and Repeated Games

D. Fudenberg (Harvard) and E. Maskin (IAS, Princeton). The theory of repeated games is a central model for explaining how self-interested agents can cooperate, and it is used in economics, biology, political science and other fields. But the theory has a serious flaw: it does not make sharp predictions.




  1. Evolution and Repeated Games D. Fudenberg (Harvard) E. Maskin (IAS, Princeton)

  2. Theory of repeated games important • central model for explaining how self-interested agents can cooperate • used in economics, biology, political science and other fields

  3. But theory has a serious flaw: • although cooperative behavior possible, so is uncooperative behavior (and everything in between) • theory doesn’t favor one behavior over another • theory doesn’t make sharp predictions

  4. Evolution (biological or cultural) can promote efficiency • might hope that uncooperative behavior will be “weeded out” • this view expressed in Axelrod (1984)

  5. Basic idea: • Start with population of repeated game strategy Always D • Consider small group of mutants using Conditional C (play C until someone plays D, thereafter play D) • Conditional C does essentially the same against Always D as Always D does • Conditional C does much better against Conditional C than Always D does • Thus Conditional C will invade Always D • uncooperative behavior driven out
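
The invasion argument on this slide can be checked numerically. The following is a minimal sketch, not the authors' computation: it assumes Prisoner's Dilemma payoffs of 2 for (C, C), 0 for (D, D), 3 for defecting on a cooperator and −1 for being defected on (numbers inferred from the arithmetic on slide 31), no mistakes, and a plain per-period average instead of discounting. The function names are illustrative.

```python
# Stage-game payoffs as (row player, column player).
# ASSUMPTION: these numbers are inferred from the arithmetic on slide 31.
PAYOFF = {("C", "C"): (2, 2), ("C", "D"): (-1, 3),
          ("D", "C"): (3, -1), ("D", "D"): (0, 0)}

def always_d(my_hist, other_hist):
    return "D"

def conditional_c(my_hist, other_hist):
    # Play C until someone has played D, thereafter play D.
    return "D" if "D" in my_hist or "D" in other_hist else "C"

def average_payoff(s1, s2, periods=1000):
    """Average per-period payoff earned by s1 when matched with s2 (no mistakes)."""
    h1, h2, total = [], [], 0
    for _ in range(periods):
        a1, a2 = s1(h1, h2), s2(h2, h1)
        total += PAYOFF[(a1, a2)][0]
        h1.append(a1)
        h2.append(a2)
    return total / periods

if __name__ == "__main__":
    strategies = {"Always D": always_d, "Conditional C": conditional_c}
    for n1, s1 in strategies.items():
        for n2, s2 in strategies.items():
            print(f"{n1} vs {n2}: {average_payoff(s1, s2):.3f}")
```

Under these assumptions Conditional C loses only the first period against Always D, while against itself it earns the cooperative payoff every period, which is the comparison the slide relies on.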

  6. But consider ALT: alternate between C and D until pattern broken, thereafter play D • ALT can’t be invaded by some other strategy • other strategy would have to alternate or else would do much worse against ALT than ALT does • Thus ALT is “evolutionarily stable” • But ALT is quite inefficient (average payoff 1)
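
The "average payoff 1" figure and the fate of a non-alternating strategy can be reproduced with a short sketch. It assumes the same Prisoner's Dilemma payoffs as the arithmetic on slide 31 (2, 0, 3, −1), that ALT starts with C, and that there are no mistakes; `always_c` is just one illustrative non-alternating strategy.

```python
# ASSUMPTION: payoffs inferred from the arithmetic on slide 31.
PAYOFF = {("C", "C"): (2, 2), ("C", "D"): (-1, 3),
          ("D", "C"): (3, -1), ("D", "D"): (0, 0)}

def alt(my_hist, other_hist):
    # Intended pattern: C in even periods (0-based), D in odd periods.
    # If any past period departed from that joint pattern, play D forever.
    for t, (a, b) in enumerate(zip(my_hist, other_hist)):
        expected = "C" if t % 2 == 0 else "D"
        if a != expected or b != expected:
            return "D"
    return "C" if len(my_hist) % 2 == 0 else "D"

def always_c(my_hist, other_hist):
    return "C"

def average_payoff(s1, s2, periods=200):
    """Average per-period payoff earned by s1 when matched with s2 (no mistakes)."""
    h1, h2, total = [], [], 0
    for _ in range(periods):
        a1, a2 = s1(h1, h2), s2(h2, h1)
        total += PAYOFF[(a1, a2)][0]
        h1.append(a1)
        h2.append(a2)
    return total / periods

if __name__ == "__main__":
    print("ALT vs ALT:", average_payoff(alt, alt))
    print("Always C vs ALT:", average_payoff(always_c, alt))
```

ALT against itself alternates between (C, C) and (D, D), hence the average of 1; a strategy that refuses to alternate breaks the pattern in the second period and is punished with D from then on.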

  7. Still, ALT highly inflexible • relies on perfect alternation • if pattern broken, get D forever • What if there is a (small) probability of mistake in execution?
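
A single execution error is enough to show the inflexibility. The sketch below runs two ALT players and flips player 1's action in one chosen period; the function name `run_alt` and the convention that ALT starts with C are assumptions made for illustration.

```python
def run_alt(periods, mistake_at=None):
    """Two ALT players; player 1's action is flipped in period `mistake_at`."""
    history, broken = [], False
    for t in range(periods):
        expected = "C" if t % 2 == 0 else "D"   # the alternating pattern
        intended = "D" if broken else expected  # D forever once pattern is broken
        a1 = a2 = intended
        if t == mistake_at:
            a1 = "D" if a1 == "C" else "C"      # execution mistake by player 1
        if a1 != expected or a2 != expected:
            broken = True
        history.append((a1, a2))
    return history

if __name__ == "__main__":
    print(run_alt(6))                # perfect alternation
    print(run_alt(6, mistake_at=2))  # one mistake, then (D, D) in every later period
```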

  8. Consider mutant strategy identical to ALT except if (by mistake) alternating pattern broken • signals “intention” to cooperate by playing C in following period • if other strategy plays C too, • if other strategy plays D,

  9. Main results in paper (for 2-player symmetric repeated games) (1) If s evolutionarily stable and • discount rate r small (future important) • mistake probability p small (but p > 0) then s (almost) “efficient” (2) If payoffs (v, v) “efficient”, then exists ES strategy s (almost) attaining (v, v) provided • r small • p small relative to r • generalizes Fudenberg-Maskin (1990), in which r = p = 0

  10. Finite symmetric 2–player game • if • normalize payoffs so that

  11. strongly efficient if

  12. Repeated game: g repeated infinitely many times • period t history • H = set of all histories • repeated game strategy • assume finitely complex (playable by finite computer) • in each period, probability p that i makes mistake • chooses (equal probabilities for all actions) • mistakes independent across players
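
The mistake model on this slide can be sketched as follows. This reads "equal probabilities for all actions" as a uniform draw over the whole action set, including the intended action, so the chance of actually deviating is p·(n − 1)/n with n actions; that reading, and the function names, are assumptions.

```python
import random

def realized_action(intended, actions, p, rng):
    """With probability p the player errs and picks uniformly among all actions."""
    if rng.random() < p:
        return rng.choice(actions)  # may coincide with the intended action
    return intended

def play_period(intended1, intended2, actions, p, rng):
    # Mistakes are drawn independently for the two players.
    return (realized_action(intended1, actions, p, rng),
            realized_action(intended2, actions, p, rng))

if __name__ == "__main__":
    rng = random.Random(0)
    draws = [play_period("C", "C", ["C", "D"], 0.2, rng) for _ in range(10_000)]
    share = sum(a1 == "D" for a1, _ in draws) / len(draws)
    print(f"share of periods in which player 1 deviates: {share:.3f}")
```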

  13. informally, s evolutionarily stable (ES), if no mutant can invade population with big proportion s and small proportion • formally, s is ES w.r.t. if for all and all • evolutionary stability • expressed statically here • but can be given precise dynamic meaning
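
The symbols of the formal condition did not survive in this transcript. For orientation only, a standard way to write such a condition compares the two strategies' payoffs in the mixed population. The notation below (U for the repeated-game payoff, ε for the mutant share, ε̄ for its upper bound, s' for the mutant) is assumed here, and whether the inequality is weak or strict varies by author.

```latex
% Sketch of a standard evolutionary-stability condition (notation assumed):
% s is ES w.r.t. \bar{\varepsilon} if, for all s' and all
% \varepsilon \in (0, \bar{\varepsilon}),
(1-\varepsilon)\,U(s,s) + \varepsilon\,U(s,s')
  \;\ge\;
(1-\varepsilon)\,U(s',s) + \varepsilon\,U(s',s')
```

The left side is what an incumbent earns on average when a share ε of the population plays s'; the right side is what a mutant earns in the same population.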

  14. population of • suppose time measured in “epochs” T = 1, 2, . . . • strategy state in epoch T • most players in population use • group of mutants (of size a) plays s' • a drawn randomly from • s' drawn randomly from finitely complex strategies • M random drawings of pairs of players • each pair plays repeated game • = strategy with highest average score

  15. Theorem 1: For any exists such that, for all there exists such that, for all (i) if s not ES, (ii) if

  16. Let Theorem 2: Given such that, for all if s is ES w.r.t. then

  17. Proof: Suppose • will construct mutant s' that can invade • let • if s = ALT, = any history for which alternating pattern broken

  18. Construct s' so that • if h not a continuation of • after , strategy s' • “signals” willingness to cooperate by playing differently from s for 1 period (assume s is pure strategy) • if other player responds positively, plays strongly efficiently thereafter • if not, plays according to s thereafter • after • responds positively if other strategy has signaled, and thereafter plays strongly efficiently • plays according to s otherwise

  19. because is already worst history, s' loses for only 1 period by signaling (small loss if r small) • if p small, probability that s' “misreads” other player’s intention is small • hence, s' does nearly as well against s as s does against itself (even after ) • s' does very well against itself (strong efficiency), after

  20. remains to check how well s does against s' • by definition of • Ignoring effect of p, Also, after deviation by s', punishment started again, and so Hence • so s does appreciably worse against s' than s' does against s'

  21. Summing up, we have: • s is not ES

  22. Theorem 2 implies for Prisoner’s Dilemma that, for any • doesn’t rule out punishments of arbitrary (finite) length

  23. Consider strategy s with “cooperative” and “punishment” phases • in cooperative phase, play C • stay in cooperative phase until one player plays D, in which case go to punishment phase • in punishment phase, play D • stay in punishment phase for m periods (and then go back to cooperative phase) unless at some point some player chooses C, in which case restart punishment • For any m,
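
The two-phase strategy described on this slide is a small state machine, sketched below. The state is the number of punishment periods still to serve (0 means the cooperative phase). Mistakes are injected by hand as (period, player) pairs; that interface and the function names are illustrative assumptions.

```python
def next_phase(remaining, a1, a2, m):
    """remaining = punishment periods still to serve (0 = cooperative phase)."""
    if remaining == 0:
        # Cooperative phase: any D triggers an m-period punishment.
        return m if "D" in (a1, a2) else 0
    # Punishment phase: any C restarts the punishment, otherwise count down.
    return m if "C" in (a1, a2) else remaining - 1

def run(periods, m, mistakes=()):
    """Both players use s; `mistakes` holds (period, player) pairs whose action is flipped."""
    remaining, history = 0, []
    for t in range(periods):
        intended = "C" if remaining == 0 else "D"
        actions = [intended, intended]
        for player in (0, 1):
            if (t, player) in mistakes:
                actions[player] = "D" if intended == "C" else "C"
        history.append(tuple(actions))
        remaining = next_phase(remaining, actions[0], actions[1], m)
    return history

if __name__ == "__main__":
    print(run(8, m=3, mistakes={(1, 0)}))          # one mistake, 3-period punishment
    print(run(8, m=3, mistakes={(1, 0), (3, 1)}))  # a C during punishment restarts it
```

With m = 3, a mistaken D in period 1 is followed by three periods of (D, D) and then a return to (C, C); a mistaken C during the punishment sets the counter back to m.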

  24. Can sharpen Theorem 2 for Prisoner’s Dilemma: Given , there exist such that, for all if s is ES w.r.t. then it cannot entail a punishment lasting more than periods Proof: very similar to that of Theorem 2

  25. For r and p too big, ES strategy s may not be “efficient” • if • if fully cooperative strategies in Prisoner’s Dilemma generate payoffs

  26. Theorem 3: Let For all for all for all

  27. Proof: Construct s so that • along equilibrium path of (s, s), payoffs are (approximately) (v, v) • punishments are nearly strongly efficient • deviating player (say 1) minimaxed long enough to wipe out gain • thereafter go to strongly efficient point • overall payoffs after deviation: • if r and p small (s, s) is a subgame perfect equilibrium

  28. In Prisoner’s Dilemma, consider s that • plays C the first period • thereafter, plays C if and only if either both players played C previous period or neither did • strategy s • is efficient • entails punishments that are as short as possible • is modification of Tit-for-Tat (C the first period; thereafter, do what other player did previous period) • Tit-for-Tat not ES • if mistake (D, C) occurs then get wave of alternating punishments: (C, D), (D, C), (C, D), ... until another mistake made
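
The contrast between Tit-for-Tat and the modified strategy after a single mistake can be traced with a short sketch. Both strategies depend only on the previous period, so each is written as a function of the two last actions; the `run` helper and its hand-injected mistake are illustrative assumptions.

```python
def tit_for_tat(my_last, other_last):
    # C the first period; thereafter do what the other player did last period.
    return "C" if other_last is None else other_last

def modified_tft(my_last, other_last):
    # C the first period; thereafter C if and only if both players
    # took the same action in the previous period.
    if my_last is None:
        return "C"
    return "C" if my_last == other_last else "D"

def run(strategy, periods, mistake_at):
    """Self-play; player 1's action is flipped by mistake in period `mistake_at`."""
    last1 = last2 = None
    history = []
    for t in range(periods):
        a1, a2 = strategy(last1, last2), strategy(last2, last1)
        if t == mistake_at:
            a1 = "D" if a1 == "C" else "C"
        history.append((a1, a2))
        last1, last2 = a1, a2
    return history

if __name__ == "__main__":
    print("Tit-for-Tat: ", run(tit_for_tat, 6, mistake_at=1))
    print("Modified TFT:", run(modified_tft, 6, mistake_at=1))
```

Tit-for-Tat echoes the mistake back and forth indefinitely, whereas the modified strategy serves one period of (D, D) and then returns to cooperation.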

  29. Let s = play d as long as in all past periods • both players played d or • neither played d • if single player deviates from d • henceforth, that player plays b • other player plays a • s is ES even though inefficient • any attempt to improve on efficiency punished forever • can’t invade during punishment, because punishment efficient

  30. Consider potential invader s' • For any h, s' cannot do better against s than s does against itself, since (s, s) equilibrium • hence, for all h, and so • For s' to invade, need • Claim: implies h' involves deviation from equilibrium path of (s, s) • only other possibility: • s' different from s on equilibrium path • then s' punished by • violates • we thus have • Hence, from rhs of

  31. For Theorem 3 to hold, p must be small relative to r • consider modified Tit-for-Tat against itself (play C if and only if both players took same action last period) • with every mistake, there is an expected loss of 2 – (½ · 3 + ½ · (−1)) = 1 the first period and 2 – 0 = 2 the second period • so overall the expected loss from mistakes is approximately • By contrast, a mutant strategy that signals, etc. and doesn’t punish at all against itself loses only about • so if r is small enough relative to p, mutant can invade
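
The per-mistake loss quoted on this slide can be reproduced directly. The sketch assumes the numbers 2, 3, −1 and 0 in the slide are the usual Prisoner's Dilemma payoffs for (C, C), defecting on a cooperator, being defected on, and (D, D), and that a given player is equally likely to be the one who erred; it ignores discounting between the two periods.

```python
from fractions import Fraction

# ASSUMPTION: mapping of the slide's numbers to action profiles (row player's payoff).
PAYOFF = {("C", "C"): 2, ("C", "D"): -1, ("D", "C"): 3, ("D", "D"): 0}

half = Fraction(1, 2)

# Period of the mistake: the realized profile is (D, C) or (C, D) with equal
# probability from one player's point of view, instead of (C, C).
first_period_loss = PAYOFF[("C", "C")] - (half * PAYOFF[("D", "C")]
                                          + half * PAYOFF[("C", "D")])

# Next period: the actions differed, so modified Tit-for-Tat plays (D, D).
second_period_loss = PAYOFF[("C", "C")] - PAYOFF[("D", "D")]

if __name__ == "__main__":
    print("loss in the period of the mistake:", first_period_loss)
    print("loss in the following period:", second_period_loss)
    print("undiscounted loss per mistake:", first_period_loss + second_period_loss)
```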
