Understanding the Grim Trigger Strategy in Repeated Games

Game Theory, Episode 6: The Grim Trigger

Agenda
• Main ideas
• Key terms
• Basic strategy for solving a repeated game
• Example problem
• Set-up
• Mathematical detour (infinite series)
• Solution and implications
• Conclusion

Main Ideas
• Cooperation is possible in a repeated game when it is repeated indefinitely (or infinitely), and when players place sufficient value on future payoffs
  • Sometimes we talk about the “shadow of the future” being sufficiently long
  • Equivalent to saying that the discount rate is sufficiently small (that is, the discount factor δ is sufficiently close to 1)
• Typically requires a strategy based on the credible threat of punishment in response to defection
• Reintroduces the notion of continuation value
• Typically allows for a multiplicity of strategies
• Reputation can matter in a repeated game

Key Terms
• Grim Trigger: begin by cooperating, with defection rewarded by the punishment of never-ending pain
• Continuation value: present value of a payoff stream
• Discount factor: degree to which the future is valued, δ, bounded by [0, 1]; may think of it as the probability you’re in the game next round
• History: past actions taken by players
• Subgame-Perfect Equilibrium: an equilibrium that specifies a Nash-equilibrium strategy in every subgame; particularly appropriate for repeated games

Basic Strategy
• Read the problem, noting the strategic setting
• Identify the question
  • Usually, “Find conditions sufficient to sustain a certain stable pattern of behavior; describe the equilibrium.”
• FIRST: Guess at what the equilibrium might be
  • Usually a grim trigger strategy, where cheating is rewarded with never-ending punishment
• THEN: Check to make sure that this strategy fulfills the equilibrium conditions
  • That it is a best response for players to perform the prescribed behavioral pattern
• Write down the entire set of strategies and sufficient conditions
  • Usually the condition is a bound on the discount factor δ

Example Problem
• Consider infinite repetition of the Prisoner’s Dilemma:
       C        D
  C    3, 3     0, 10
  D    10, 0    1, 1
Each player’s payoff in the infinitely repeated game is the discounted sum of its payoffs in each period (i.e., the standard case we considered in class). Assume that the players have a common discount factor δ, where 0 < δ < 1.
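To make the set-up concrete, here is a minimal Python sketch of the stage game and of the grim trigger written as a function of the opponent's history. The names `PAYOFFS`, `grim_trigger`, `always_defect`, `play`, and `discounted_payoffs` are illustrative and do not come from the slides. The horizon is cut off after a finite number of rounds, so the sums only approximate the infinitely repeated game.

```python
# Stage-game payoffs: (player 1 action, player 2 action) -> (payoff 1, payoff 2).
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 10),
    ("D", "C"): (10, 0),
    ("D", "D"): (1, 1),
}

def grim_trigger(opponent_history):
    """Cooperate until the opponent has defected once, then defect forever."""
    return "D" if "D" in opponent_history else "C"

def always_defect(opponent_history):
    """Hypothetical deviating strategy, used only for comparison."""
    return "D"

def play(strategy1, strategy2, rounds):
    """Return both players' action histories after `rounds` simultaneous moves."""
    history1, history2 = [], []
    for _ in range(rounds):
        action1 = strategy1(history2)  # each player sees only the opponent's past
        action2 = strategy2(history1)
        history1.append(action1)
        history2.append(action2)
    return history1, history2

def discounted_payoffs(history1, history2, delta):
    """Discounted sum of stage payoffs over the recorded (finite) history."""
    total1 = total2 = 0.0
    for t, actions in enumerate(zip(history1, history2)):
        payoff1, payoff2 = PAYOFFS[actions]
        total1 += delta ** t * payoff1
        total2 += delta ** t * payoff2
    return total1, total2

h1, h2 = play(grim_trigger, always_defect, rounds=4)
print(h1, h2)
print(discounted_payoffs(h1, h2, delta=0.5))
```

Against a permanent defector, the grim trigger cooperates once and then punishes in every later round, which is the pattern the slides analyze.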

For what values of δ is (C, C) sustainable?
              Player 2
              C        D
Player 1  C   3, 3     0, 10
          D   10, 0    1, 1
It’s round 1. Take the role of player 2. You know that the grim trigger is a potential solution. Assume player 1 begins by cooperating. Your move:
If player 1 is cooperating, and you defect, you get: 10 + δ*1 + δ^2*1 + δ^3*1 + …
If player 1 is cooperating, and you cooperate, you get: 3 + δ*3 + δ^2*3 + δ^3*3 + …

Mathematical Detour
If player 1 is cooperating, and you defect, you get:
PD = 10 + δ*1 + δ^2*1 + δ^3*1 + …
(once you defect, you always defect: it’s a best response given that the other player always defects)
If player 1 is cooperating, and you cooperate, you get:
PC = 3 + δ*3 + δ^2*3 + δ^3*3 + …
Notice the summation of an infinite series? You might think that this sum would be incalculable. But no. We have ways of dealing with this… it won’t hurt a bit…
Call the infinite series PC, like so: PC = 3 + 3δ + 3δ^2 + 3δ^3 + …
Now multiply PC by δ, like so: δ*PC = 3δ + 3δ^2 + 3δ^3 + 3δ^4 + …
Now subtract one from the other: PC - δ*PC = 3
Factor out the PC: PC*(1 - δ) = 3
Divide by (1 - δ): PC = 3 / (1 - δ)
Therefore, 3 + δ*3 + δ^2*3 + … = 3 / (1 - δ)

Mathematical Detour (continued)
If player 1 is cooperating, and you defect, you get:
PD = 10 + δ*1 + δ^2*1 + δ^3*1 + …
Move the 10 to the left-hand side to isolate the infinite series: PD - 10 = δ + δ^2 + δ^3 + …
Now multiply PD - 10 by δ, like so: δ*(PD - 10) = δ^2 + δ^3 + δ^4 + …
Now subtract one from the other: PD - 10 - δ*(PD - 10) = δ
Simplify: PD - 10 - δ*PD + 10δ = δ
Get everything with PD to one side: PD - δ*PD = δ - 10δ + 10
Simplify: PD - δ*PD = 10 - 9δ
Factor out the PD: PD*(1 - δ) = 10 - 9δ
Divide by (1 - δ): PD = (10 - 9δ) / (1 - δ)
Therefore, 10 + δ*1 + δ^2*1 + … = (10 - 9δ) / (1 - δ)
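As a sanity check on the two closed forms, the following sketch compares them with a long truncated sum. The function names and the choice δ = 0.9 are assumptions made for illustration; the truncation at 5,000 periods is arbitrary and simply makes the neglected tail negligible.

```python
def truncated_sum(first, rest, delta, periods=5000):
    """Payoff `first` today, then `rest` in each later period, summed over `periods` periods."""
    total = float(first)
    for t in range(1, periods):
        total += delta ** t * rest
    return total

def cooperate_value(delta):
    """Closed form for 3 + 3δ + 3δ^2 + …"""
    return 3 / (1 - delta)

def defect_value(delta):
    """Closed form for 10 + δ + δ^2 + …"""
    return (10 - 9 * delta) / (1 - delta)

delta = 0.9  # illustrative value, not taken from the slides
print(truncated_sum(3, 3, delta), cooperate_value(delta))
print(truncated_sum(10, 1, delta), defect_value(delta))
```

The two numbers on each printed line should agree up to floating-point rounding.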

Define the Equilibrium Strategy
If player 1 is cooperating, and you defect, you get:
PD = 10 + δ*1 + δ^2*1 + δ^3*1 + … = (10 - 9δ) / (1 - δ)
If player 1 is cooperating, and you cooperate, you get:
PC = 3 + δ*3 + δ^2*3 + δ^3*3 + … = 3 / (1 - δ)
These are the continuation values of these strategies.
Cooperation is sustained if the value of cooperation is at least the value of defection:
3 / (1 - δ) ≥ (10 - 9δ) / (1 - δ), or
3 ≥ 10 - 9δ, or
9δ ≥ 7.
If δ ≥ 7/9, then cooperation can be sustained.
Strategy is: Play C as long as opponent played C in all previous rounds; if opponent played D in a previous round, then play D from now on.
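The threshold can be checked with exact rational arithmetic. The sketch below also writes the condition for general payoffs, using the hypothetical parameter names `reward` (mutual cooperation, 3 here), `temptation` (defecting on a cooperator, 10 here), and `punishment` (mutual defection, 1 here). That general formula is a standard rearrangement of the same inequality and is not stated in the slides.

```python
from fractions import Fraction

def grim_trigger_threshold(reward, temptation, punishment):
    """Smallest δ at which cooperating forever is at least as good as defecting once and being punished."""
    return Fraction(temptation - reward, temptation - punishment)

def cooperation_is_sustainable(delta, reward=3, temptation=10, punishment=1):
    """Compare the two continuation values directly."""
    cooperate = Fraction(reward) / (1 - delta)
    defect = temptation + delta * Fraction(punishment) / (1 - delta)
    return cooperate >= defect

print(grim_trigger_threshold(3, 10, 1))
print(cooperation_is_sustainable(Fraction(7, 9)))
print(cooperation_is_sustainable(Fraction(3, 4)))
```

Using `Fraction` avoids floating-point rounding exactly at the boundary δ = 7/9, where the two values are equal.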

Can we sustain (C,D) in odd periods and (D,C) in even periods? Let’s get funky with it…
              Player 2
              C        D
Player 1  C   3, 3     0, 10
          D   10, 0    1, 1
State the strategies:
Player 1: Play C in odd periods as long as opponent has played C in all previous even periods; play D in odd periods if opponent has ever played D in an even period. Play D in all even periods.
Player 2: Play C in even periods as long as opponent has played C in all previous odd periods; play D in even periods if opponent has ever played D in an odd period. Play D in all odd periods.

Can we sustain (C,D) in odd periods and (D,C) in even periods? Let’s get funky with it…
              Player 2
              C        D
Player 1  C   3, 3     0, 10
          D   10, 0    1, 1
Player 1’s payoffs would be
P1 = 10 + δ*0 + δ^2*10 + δ^3*0 + … = 10/(1 - δ^2),
given that we begin in an even period (alternatively, these are player 2’s payoffs if we start in an odd period).
Player 2’s payoffs would be
P2 = 0 + δ*10 + δ^2*0 + δ^3*10 + … = δ[10/(1 - δ^2)],
given that we begin in an even period (alternatively, these are player 1’s payoffs if we start in an odd period).
These are the continuation values of these strategies.
Cheating is defined as playing D when the strategy prescribes C. Given a grim trigger, Player 2’s payoffs for cheating:
P2, cheating = 1 + δ*1 + δ^2*1 + δ^3*1 + … = 1/(1 - δ)
Player 2 will play C in even periods as long as
δ[10/(1 - δ^2)] ≥ 1/(1 - δ)
10δ/[(1 - δ)(1 + δ)] ≥ 1/(1 - δ)
10δ/(1 + δ) ≥ 1
10δ ≥ 1 + δ
9δ ≥ 1
δ ≥ 1/9

Can we sustain (C,D) in odd periods and (D,C) in even periods? Let’s get funky with it…
              Player 2
              C        D
Player 1  C   3, 3     0, 10
          D   10, 0    1, 1
What about the other player, the one who begins with 10? As before, cheating is defined as playing D when the strategy prescribes C. This player defects as prescribed today, so the first chance to cheat comes next period. Given a grim trigger,
Player 1’s payoffs (alternating): P1, alternating = 10 + δ*0 + δ^2*10 + δ^3*0 + … = 10/(1 - δ^2)
Player 1’s payoffs (cheating): P1, cheating = 10 + δ*1 + δ^2*1 + δ^3*1 + … = 10 + δ/(1 - δ)
Player 1 will play C in odd periods as long as
10/(1 - δ^2) ≥ 10 + δ/(1 - δ)
10/[(1 + δ)(1 - δ)] ≥ δ/(1 - δ) + (10 - 10δ)/(1 - δ)
10/(1 + δ) ≥ δ + (10 - 10δ)
10 ≥ (10 - 9δ)(1 + δ)
10 ≥ 10 + 10δ - 9δ - 9δ^2
0 ≥ δ - 9δ^2
9δ^2 - δ ≥ 0
9δ ≥ 1 (dividing by δ > 0)
δ ≥ 1/9, just like before.
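Both no-deviation conditions for the alternating pattern can be checked together. In this sketch the function names and the boolean flag `defects_today` are hypothetical. The deviation considered is the one from the slides: play D in every period against a grim trigger.

```python
from fractions import Fraction

def alternating_value(delta, defects_today):
    """Value of following the alternating path from the current period on."""
    stream = Fraction(10) / (1 - delta ** 2)  # 10 now, 0 next period, 10 after that, ...
    return stream if defects_today else delta * stream

def deviation_value(delta, defects_today):
    """Value of playing D in every period against a grim trigger."""
    punishment = Fraction(1) / (1 - delta)  # 1 in every period
    if defects_today:
        # Collect the prescribed 10 today, then refuse to cooperate from next period on.
        return 10 + delta * punishment
    return punishment

def alternation_is_sustainable(delta):
    """True when neither role gains by switching to permanent defection."""
    return all(
        alternating_value(delta, role) >= deviation_value(delta, role)
        for role in (True, False)
    )

for delta in (Fraction(1, 10), Fraction(1, 9), Fraction(1, 2)):
    print(delta, alternation_is_sustainable(delta))
```

At δ = 1/9 both inequalities hold with equality, which matches the common threshold derived for the two players.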

One Last Question: When will alternating be preferred to fully cooperative equilibria?
              Player 2
              C        D
Player 1  C   3, 3     0, 10
          D   10, 0    1, 1
Recall that when they fully cooperate, BOTH players get 3/(1 - δ).
In the alternating equilibrium, a player gets 10/(1 - δ^2) or δ[10/(1 - δ^2)], depending on whether that player defects first or cooperates first.
For both players to prefer alternating equilibrium strategies, both 10/(1 - δ^2) and δ[10/(1 - δ^2)] must be greater than 3/(1 - δ).
Since the discount factor is a number between zero and one, 10/(1 - δ^2) > δ[10/(1 - δ^2)], and so if δ[10/(1 - δ^2)] > 3/(1 - δ), then 10/(1 - δ^2) > 3/(1 - δ) as well. So, for both players to prefer alternating equilibrium strategies, δ[10/(1 - δ^2)] > 3/(1 - δ) is sufficient.

One Last Question (continued)
              Player 2
              C        D
Player 1  C   3, 3     0, 10
          D   10, 0    1, 1
For both players to prefer alternating equilibrium strategies, δ[10/(1 - δ^2)] ≥ 3/(1 - δ) is sufficient.
δ[10/(1 - δ^2)] ≥ 3/(1 - δ)
10δ/[(1 + δ)(1 - δ)] ≥ 3/(1 - δ)
10δ/(1 + δ) ≥ 3
10δ ≥ 3 + 3δ
7δ ≥ 3, so δ ≥ 3/7
Note that the player who begins by defecting in the alternating eqm will always prefer this eqm to the cooperative equilibrium: 10/(1 - δ^2) > 3/(1 - δ) becomes 10 > 3 + 3δ, or δ < 7/3, which is true for all allowable δ.
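The comparison between the two equilibria can be scanned over a grid of discount factors. The function names and the grid of hundredths are assumptions chosen for illustration; the sketch compares payoffs only and does not re-check that either pattern is sustainable.

```python
from fractions import Fraction

def full_cooperation(delta):
    """Each player's value when (C, C) is played forever."""
    return Fraction(3) / (1 - delta)

def alternating_defect_first(delta):
    """Value for the player who collects 10 in the first period."""
    return Fraction(10) / (1 - delta ** 2)

def alternating_cooperate_first(delta):
    """Value for the player who gets 0 first and 10 one period later."""
    return delta * alternating_defect_first(delta)

grid = [Fraction(k, 100) for k in range(1, 100)]  # δ = 0.01, 0.02, ..., 0.99

# Discount factors at which even the player who cooperates first prefers alternating.
both_prefer = [d for d in grid if alternating_cooperate_first(d) >= full_cooperation(d)]
print(min(both_prefer))

# The player who defects first prefers alternating at every δ on the grid.
print(all(alternating_defect_first(d) > full_cooperation(d) for d in grid))
```

The smallest grid point that passes is the first hundredth at or above 3/7, in line with the bound derived on the slide.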

Do you believe in the Grim Trigger?
• Don’t underestimate the power of beliefs
• Beliefs do all the work
• We saw this in Myerson & Weber
• Nash Equilibrium (and SPE) rest on beliefs
• These beliefs provide support for strategies played in equilibrium
• The Grim Trigger hinges upon beliefs
• Belief in enforcement is what makes compliance a best response
