Introduction to Game Theory Lecture 3: Mixed Strategy Nash Equilibrium
Review • Ideal world: • Beliefs are derived from past experience of playing the game • Experience is sufficient to know how opponents will play • A player does not know the action of her particular opponent, but knows how a “typical” opponent behaves • Nash Equilibrium – None of the players can be better off by changing her action when all other players stick to their actions • Bulletproof against unilateral deviations
Review • Best response – Set of a player's actions that give her the highest payoff against a given combination of all other players’ actions. • Best response correspondence – Assignment of best responses to all possible combinations of all other players’ actions. • Nash Equilibrium – Action profile that lies in the intersection of the best response correspondences of all players.
Example of Driver vs. Pedestrian • Game • Best response functions: • BD(Walk)={Stop}, BD(Wait)={Go} • BP(Stop)={Walk}, BP(Go)={Wait} • Two Nash Equilibria:{Walk, Stop} and {Wait, Go}
Review • Non-existence of Nash Equilibrium (in pure strategies) • Penalty: Goalie vs. Kicker • No intersection of best response correspondences
Today • Mixed strategy Nash equilibrium in games of 2 players with 2 actions each • Players’ choice of action may vary • We observe two distinct actions played by one type of player. • Different members of the population choose different actions. • Customers of Vodafone, T-mobile and O2 • One player “flips a coin” and chooses an action based on the result. • A rabbit runs in a zigzag when hunted by an eagle. • How do players face such opponents? Preferences in a world of uncertainty. • Experiment – Beauty Contest game
Mixed Strategies - Motivation • Rabbit vs. Hunter • The forest is on the right side and the meadow on the left. Rabbit can dodge to the left or to the right (running straight is dominated, for example). • Hunter wants to target the running rabbit. • Rabbit prefers to run in a different direction than Hunter targets, and wants to get to the forest. • Hunter prefers to match Rabbit's direction. • Rabbit: If Hunter knows that I want to dodge right, he targets right. It is better to dodge left. • Rabbit: Hunter is equally smart and knows this, so I would do better to dodge right. • Rabbit: Hunter knows all of that, … Whatever I do, Hunter gets me.
Mixed Strategies - Motivation • No Nash equilibrium in pure strategies • Rabbit: If I choose my action at random, Hunter cannot match my decision and I get a chance to live. • Dodge left with probability p and right with probability 1-p. • Hunter's thinking process is analogous, so Hunter randomizes as well.
Mixed Strategy • Player chooses a probability distribution (p1,p2,p3,…) over the set of his available actions. • Previous lecture: the choice was one of the actions. • Today: the choice is a probability distribution! • In the case of Rabbit – single actions ‘Left’ and ‘Right’ • With probability p1=p he plays ‘Left’ and with probability p2=1-p he plays ‘Right’. • Probabilities over all available actions must sum to 1! • A mixed strategy that puts probability 1 on a single action is called a pure strategy, e.g. p=0, where Rabbit always plays ‘Right’.
Preferences Over Uncertain Payoff • How to express Hunter's preferences over his actions Left and Right (lotteries) when the payoff is not certain? • If Rabbit sets p=0.7, Hunter faces two lotteries: • Lottery ‘Left’: u=1 with prob. 0.7 and u=-2 with prob. 0.3 • Lottery ‘Right’: u=-1 with prob. 0.7 and u=1 with prob. 0.3 • Von Neumann and Morgenstern preferences: Preferences regarding lotteries can be represented by the expected value of the payoff function. • Each player prefers the lottery with the higher expected value. • Hunter targets ‘Left’: 1*0.7-2*0.3=0.1 > -0.4=-1*0.7+1*0.3
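The comparison of the two lotteries can be checked with a few lines of Python. This is only a sketch: the function name `expected_payoff` and the use of exact fractions are choices made here for illustration, while the probabilities and payoffs are the ones on the slide.

```python
from fractions import Fraction

def expected_payoff(probabilities, payoffs):
    """Expected value of a lottery: sum of probability * payoff."""
    return sum(p * u for p, u in zip(probabilities, payoffs))

# Rabbit dodges Left with probability 0.7 and Right with probability 0.3.
rabbit_mix = [Fraction(7, 10), Fraction(3, 10)]

# Hunter's payoffs in each lottery, listed as (Rabbit goes Left, Rabbit goes Right).
hunter_left = expected_payoff(rabbit_mix, [1, -2])
hunter_right = expected_payoff(rabbit_mix, [-1, 1])

print(float(hunter_left), float(hunter_right))  # 0.1 -0.4
```

Because Hunter's expected payoff from ‘Left’ is higher, a Hunter with vNM preferences targets Left against this particular mix.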
Preferences - Comparison • Lecture 2 – payoff has only ordinal meaning • Left is preferred to Right under either representation: u(Left)=2, u(Right)=0 or u(Left)=10000, u(Right)=-1 • Von Neumann and Morgenstern preferences: • More than ordinal meaning • Order of expected values of lotteries matters • Represented by the same kind of table as in previous lectures.
Preferences - Comparison • Both tables represent the same game with ordinal preferences • If Rabbit plays p=0.55, Hunter's expected payoffs under the first table (U1) and the second table (U2) are: • U1(Left)=1*0.55-2*0.45=-0.35 < -0.1=-1*0.55+1*0.45=U1(Right) • U2(Left)=2*0.55-8*0.45=-2.5 > -2.95=-7*0.55+2*0.45=U2(Right) • Different vNM preferences
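The point that two ordinally equivalent tables can encode different vNM preferences can be illustrated as follows. This is a sketch under one assumption: the payoff numbers are Hunter's payoffs as read off the two expected-value expressions on the slide (the tables themselves are not reproduced here), and the names `TABLE_1`, `TABLE_2` and `preferred_action` are hypothetical.

```python
from fractions import Fraction

# Hunter's payoffs per action, as (payoff if Rabbit goes Left, payoff if Rabbit goes Right).
# Assumption: values taken from the expected-value expressions for U1 and U2.
TABLE_1 = {"Left": (1, -2), "Right": (-1, 1)}
TABLE_2 = {"Left": (2, -8), "Right": (-7, 2)}

def preferred_action(p_left, table):
    """Return Hunter's action with the highest expected payoff, plus all expected payoffs."""
    expected = {
        action: p_left * u_left + (1 - p_left) * u_right
        for action, (u_left, u_right) in table.items()
    }
    return max(expected, key=expected.get), expected

p = Fraction(55, 100)  # Rabbit dodges Left with probability 0.55
choice_1, values_1 = preferred_action(p, TABLE_1)
choice_2, values_2 = preferred_action(p, TABLE_2)
print(choice_1, choice_2)  # Right Left
```

Both tables rank the four outcomes in the same order, yet Hunter's preferred action against the same mix differs, which is what "different vNM preferences" means here.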
Static Game with Mixed Strategies • Set of players • For each player, set of actions • For each player, preferences regarding lotteries over action profiles that can be represented by the expected value of the payoff function over action profiles.
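Mixed Strategy - Notation • $a_i$ – action of the $i$th player • $a$ – action profile (set of all players’ actions) • $a_{-i} = (a_1, a_2, a_3, \dots, a_{i-1}, a_{i+1}, \dots, a_{N-1}, a_N)$ – combination of opponents’ actions • $\sigma_i = (p_1, p_2, p_3, \dots)$ – mixed strategy of player $i$ • $\sigma$ – mixed strategy profile (set of all players’ mixed strategies) • $\sigma_{-i} = (\sigma_1, \sigma_2, \sigma_3, \dots, \sigma_{i-1}, \sigma_{i+1}, \dots, \sigma_{N-1}, \sigma_N)$ – combination of opponents’ mixed strategies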
Mixed Strategy NE – Definition • An MSNE is a mixed strategy profile $\sigma^*$ such that no player $i$ has an available mixed strategy $\sigma_i$ that she would prefer to play, given that the other players stick to their mixed strategies. • I.e., an MSNE is bulletproof against unilateral deviations in the space of mixed strategies: $EU_i(\sigma^*) \ge EU_i(\sigma_i, \sigma^*_{-i})$ for every mixed strategy $\sigma_i$ of every player $i$. • If p=0.3 and q=0.2 is an MSNE of the game Rabbit vs. Hunter, then Rabbit cannot do better by choosing a different p. • Neither can Hunter by choosing a different q.
MSNE Deviation – Example • Rabbit vs. Hunter • $\sigma = ((0.1, 0.9), (0.1, 0.9))$ is not an MSNE • Rabbit's expected payoff: 0.1(-1*0.1+1*0.9)+0.9(2*0.1-1*0.9)=0.08-0.63=-0.55 • Rabbit plays Left: -1*0.1+1*0.9=0.8 • Rabbit does better by deviating to the pure strategy (1,0) • How to find an MSNE? Check all possibilities one by one?
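The deviation check can be reproduced numerically. The sketch below assumes Rabbit's payoff matrix as implied by the expressions for u(Left) and u(Right) used in this lecture; the names `RABBIT_PAYOFF` and `expected_payoff` are introduced here for illustration only.

```python
from fractions import Fraction

# Rabbit's payoffs indexed [rabbit action][hunter action], with 0 = Left and 1 = Right.
# Assumption: values read off u(Left) = -1*q + 1*(1-q) and u(Right) = 2*q - 1*(1-q).
RABBIT_PAYOFF = [[-1, 1], [2, -1]]

def expected_payoff(payoff, row_mix, col_mix):
    """Expected payoff when both players randomize independently."""
    return sum(
        row_mix[i] * col_mix[j] * payoff[i][j]
        for i in range(2)
        for j in range(2)
    )

tenth = Fraction(1, 10)
rabbit_mix = (tenth, 1 - tenth)   # Rabbit: Left with 0.1, Right with 0.9
hunter_mix = (tenth, 1 - tenth)   # Hunter: Left with 0.1, Right with 0.9

mixed_value = expected_payoff(RABBIT_PAYOFF, rabbit_mix, hunter_mix)
left_value = expected_payoff(RABBIT_PAYOFF, (1, 0), hunter_mix)
print(float(mixed_value), float(left_value))
```

Since the pure strategy Left gives Rabbit a strictly higher expected payoff than the mix (0.1, 0.9), the profile cannot be an MSNE.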
Best Response Correspondence • If Hunter targets Left with probability q, what is Rabbit's best response? • u(Left)=-1*q+1*(1-q) • u(Right)=2*q-1*(1-q) • If q is such that -1*q+1*(1-q)>2*q-1*(1-q), then Rabbit’s best response is to play the pure strategy (1,0) - dodge Left. • If q is such that -1*q+1*(1-q)<2*q-1*(1-q), then Rabbit’s best response is to play the pure strategy (0,1) - dodge Right.
Best Response Correspondence • If q is such that -1*q+1*(1-q)=2*q-1*(1-q), i.e. q=2/5, then Rabbit is indifferent between dodging Left and dodging Right. • Rabbit is then indifferent among all probability distributions over his actions ‘Left’ and ‘Right’. • Any mixed strategy is a best response for Rabbit. • In terms of p, Rabbit's probability of dodging Left: • BR(q)={1} if q<2/5 • BR(q)={0} if q>2/5 • BR(q)=[0,1] if q=2/5
Best Response Correspondence • Similarly for Hunter: • Hunter's best response is Left if p is such that 1*p-2*(1-p)>-1*p+1*(1-p). • Hunter's best response is Right if p is such that 1*p-2*(1-p)<-1*p+1*(1-p). • Hunter is indifferent if p is such that 1*p-2*(1-p)=-1*p+1*(1-p), i.e. p=3/5. • In terms of q, Hunter's probability of targeting Left: • BH(p)={1} if p>3/5 • BH(p)={0} if p<3/5 • BH(p)=[0,1] if p=3/5
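Rabbit's best response correspondence can be written as a small function. This is a sketch: the function name `rabbit_best_response` and the convention of returning an interval (lowest p, highest p) are assumptions made for the illustration; the payoff expressions are the ones from the slides.

```python
from fractions import Fraction

def rabbit_best_response(q):
    """Best-response set for p (Rabbit's probability of Left), returned as (low, high).

    q is the probability that Hunter targets Left.
    """
    u_left = -1 * q + 1 * (1 - q)
    u_right = 2 * q - 1 * (1 - q)
    if u_left > u_right:
        return (1, 1)   # always dodge Left
    if u_left < u_right:
        return (0, 0)   # always dodge Right
    return (0, 1)       # indifferent: every p in [0, 1] is a best response

for q in (Fraction(1, 5), Fraction(2, 5), Fraction(1, 2)):
    print(q, rabbit_best_response(q))
```

Exact fractions are used so that the indifference case at q=2/5 is detected without floating-point rounding issues.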
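MSNE - Example • $\sigma^*$ is an MSNE if and only if, for every player $i$, $\sigma^*_i$ is a best response to $\sigma^*_{-i}$. • [Figure: best response correspondences of Rabbit (p, horizontal axis) and Hunter (q, vertical axis); they intersect at the MSNE p=3/5, q=2/5]
MSNE - Example • Adding a mixed strategy to the payoff table • Other mixed strategies are missing • Does not necessarily lead to a solution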
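For a 2x2 game, the mixed equilibrium in which both players properly randomize can be found by solving the two indifference conditions. The following is a minimal sketch, not a general equilibrium solver: the function name `interior_msne_2x2` is hypothetical, and the payoff matrices are the ones implied by the expected-payoff expressions used in this lecture.

```python
from fractions import Fraction

def interior_msne_2x2(row_payoff, col_payoff):
    """Solve the indifference conditions of a 2x2 game.

    Payoffs are indexed [row action][column action].
    Returns (p, q), the probabilities of the first action for the row and
    column player, or None if no equilibrium with 0 < p, q < 1 exists.
    """
    a, b = row_payoff, col_payoff
    q_den = a[0][0] - a[0][1] - a[1][0] + a[1][1]
    p_den = b[0][0] - b[0][1] - b[1][0] + b[1][1]
    if q_den == 0 or p_den == 0:
        return None  # one player is never indifferent (or always is)
    q = Fraction(a[1][1] - a[0][1], q_den)  # makes the row player indifferent
    p = Fraction(b[1][1] - b[1][0], p_den)  # makes the column player indifferent
    if 0 < p < 1 and 0 < q < 1:
        return p, q
    return None

# Rabbit is the row player, Hunter the column player; action 0 = Left, 1 = Right.
RABBIT = [[-1, 1], [2, -1]]
HUNTER = [[1, -1], [-2, 1]]
print(interior_msne_2x2(RABBIT, HUNTER))  # (Fraction(3, 5), Fraction(2, 5))
```

Note that each player's equilibrium probability is pinned down by the opponent's payoffs: q makes Rabbit indifferent, and p makes Hunter indifferent.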
MSNE – Battle of Sexes • Battle of Sexes • Two NE in pure strategies: ((1,0),(1,0)) and ((0,1),(0,1)) • Any other equilibria in mixed strategies?
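Battle of Sexes – Best Resp. Corr. • Pure strategies NE: ((1,0),(1,0)) and ((0,1),(0,1)) • Additional one in mixed strategies: • Player 1 is indifferent when 2*q+0*(1-q)=0*q+1*(1-q), so q*=1/3 • Player 2 is indifferent when 1*p+0*(1-p)=0*p+2*(1-p), so p*=2/3 • [Figure: best response correspondences of Player 1 (p, horizontal axis) and Player 2 (q, vertical axis); they intersect at the two pure strategies NE and at the MSNE p=2/3, q=1/3]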
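One way to double-check the mixed equilibrium is to verify that each player is indifferent between her two pure actions when the opponent plays the equilibrium mix. The sketch below assumes the payoff matrices implied by the two indifference equations; the names `P1`, `P2`, `payoff_p1` and `payoff_p2` are illustrative only.

```python
from fractions import Fraction

# Payoffs indexed [player 1 action][player 2 action].
# Assumption: values read off the indifference equations of the two players.
P1 = [[2, 0], [0, 1]]
P2 = [[1, 0], [0, 2]]

p_star = Fraction(2, 3)  # Player 1's probability of her first action
q_star = Fraction(1, 3)  # Player 2's probability of her first action

def payoff_p1(action, q):
    """Player 1's expected payoff from a pure action against Player 2's mix q."""
    return q * P1[action][0] + (1 - q) * P1[action][1]

def payoff_p2(action, p):
    """Player 2's expected payoff from a pure action against Player 1's mix p."""
    return p * P2[0][action] + (1 - p) * P2[1][action]

print(payoff_p1(0, q_star), payoff_p1(1, q_star))  # 2/3 2/3
print(payoff_p2(0, p_star), payoff_p2(1, p_star))  # 2/3 2/3
```

Because both pure actions give the same expected payoff, every mix is a best response for each player, so neither can gain by deviating from (p*, q*).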
MSNE – Prisoners’ Dilemma • Prisoners Dilemma • One NE in pure strategies ((1,0),(1,0)) • Any equilibrium in mixed strategies?
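MSNE – Prisoners’ Dilemma • Player 1 indifferent: 1*q+3*(1-q)=0*q+2*(1-q) reduces to 3=2, a contradiction • Player 2 indifferent: 1*p+3*(1-p)=0*p+2*(1-p) reduces to 3=2, a contradiction • No player is ever indifferent, so there is no MSNE other than the pure strategies NE • [Figure: best response correspondences of Player 1 (p, horizontal axis) and Player 2 (q, vertical axis); they intersect only at the pure strategies NE p=1, q=1]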
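The contradiction 3=2 reflects that the first action is strictly better whatever the opponent does. A short sketch makes this visible; it assumes the payoffs implied by the indifference equation, and the names `OWN_PAYOFF` and `payoff_gap` are hypothetical.

```python
from fractions import Fraction

# A player's own payoff indexed [own action][opponent action].
# Assumption: values read off 1*q + 3*(1-q) and 0*q + 2*(1-q).
OWN_PAYOFF = [[1, 3], [0, 2]]

def payoff_gap(q):
    """Expected payoff of the first action minus that of the second action,
    when the opponent plays her first action with probability q."""
    first = q * OWN_PAYOFF[0][0] + (1 - q) * OWN_PAYOFF[0][1]
    second = q * OWN_PAYOFF[1][0] + (1 - q) * OWN_PAYOFF[1][1]
    return first - second

gaps = [payoff_gap(Fraction(k, 10)) for k in range(11)]
print(gaps)  # the gap equals 1 for every q on the grid
```

The gap never reaches zero, so the player is never indifferent and her best response is always the pure first action.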
Summary • von Neumann and Morgenstern preferences • Expected utility function – more than just ordinal meaning • Mixed strategy – Player chooses probability distribution over strategies • Rabbit runs zigzag • MSNE – bulletproof against unilateral deviations in mixed strategies
Experiment • Beauty Contest game • Each player submits a decimal number from the interval [0,100], including 0 and 100. • The player who submits the number closest to 2/3 of the average wins. • Example: • If the numbers submitted are a,b,c,d,e, then the number closest to (2/3)*(a+b+c+d+e)/5 wins. • The experiment consists of several rounds. • After each round the average is announced, as well as the optimal number (2/3 of the average).
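One round of the game can be scored as in the following sketch. The function name `beauty_contest_winners` and the tie rule (all equally close players win) are assumptions made here; the slides do not specify how ties are handled.

```python
from fractions import Fraction

def beauty_contest_winners(submissions):
    """Return (target, winners) for one round.

    target is 2/3 of the average submission; winners is the list of indices
    of the submissions closest to the target (ties are all kept).
    """
    numbers = [Fraction(str(x)) for x in submissions]
    if any(x < 0 or x > 100 for x in numbers):
        raise ValueError("submissions must lie in [0, 100]")
    target = Fraction(2, 3) * sum(numbers) / len(numbers)
    best_distance = min(abs(x - target) for x in numbers)
    winners = [i for i, x in enumerate(numbers) if abs(x - target) == best_distance]
    return target, winners

target, winners = beauty_contest_winners([10, 20, 30, 40, 50])
print(float(target), winners)  # 20.0 [1]
```

In this example the average is 30, so the target is 20 and the player who submitted 20 wins.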
Empirics Source: Camerer (1997), Progress in Behavioral Game Theory, Journal of Economic Perspectives