Belief Learning in an Unstable Infinite Game
Paul J. Healy, CMU

Presentation Transcript
Issue #1: Infinite Games
  • Typical Learning Model:
    • Finite set of strategies
    • Strategies get weight based on ‘fitness’
    • Bells & Whistles: experimentation, spillovers…
  • Many important games have infinite strategies
    • Duopoly, PG, bargaining, auctions, war of attrition…
  • Quality of fit sensitive to grid size?
  • Models don’t use strategy space structure
Previous Work
  • Grid size on fit quality:
    • Arifovic & Ledyard
      • Groves-Ledyard mechanisms
      • Convergence failure of RL with |S| = 51
  • Strategy space structure:
    • Roth & Erev AER ’99
  • Quality-of-fit/error measures
    • What’s the right metric space?
      • Closeness in probs. or closeness in strategies?
Issue #2: Unstable Game
  • Usually predicting convergence rates
    • Example: p–beauty contests
  • Instability:
    • Toughest test for learning models
    • Most statistical power
Previous Work
  • Chen & Tang ‘98
    • Walker mechanism & unstable Groves-Ledyard
    • Reinforcement > Fictitious Play > Equilibrium
  • Healy ’06
    • 5 PG mechanisms, predicting convergence or not
  • Feltovich ’00
    • Unstable finite Bayesian game
    • Fit varies by game, error measure
Issue #3: Belief Learning
  • If subjects are forming beliefs, measure them!
  • Method 1: Direct elicitation
    • Incentivized guesses about opponents’ strategies s_{-i}
  • Method 2: Inferred from payoff table usage
    • Tracking payoff ‘lookups’ may inform our models
Previous Work
  • Nyarko & Schotter ‘02
    • Subjects BR to stated beliefs
    • Stated beliefs not too accurate
  • Costa-Gomes, Crawford & Broseta ’01
    • Mouselab to identify types
    • How players solve games, not learning
This Paper
  • Pick an unstable infinite game
  • Give subjects a calculator tool & track usage
  • Elicit beliefs in some sessions
  • Fit models to data in standard way
  • Study formation of “beliefs”
    • “Beliefs” inferred from calculator tool usage
    • “Beliefs” from direct elicitation
The Game
  • Walker’s PG mechanism for 3 players
  • Added a ‘punishment’ parameter
Parameters & Equilibrium
  • v_i(y) = b_i y − a_i y^2 + c_i
  • Pareto optimum: y = 7.5
  • Unique PSNE: s_i* = 2.5
  • Punishment parameter γ = 0.1
  • Purpose: not too wild; payoffs rarely negative
  • Guessing payoff: 10 − |g_L − s_L|/4 − |g_R − s_R|/4
  • Game payoffs: Pr(payoff < 50) = 8.9%, Pr(payoff > 100) = 71%
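The elicitation payoff is simple enough to check directly; a minimal sketch of the rule exactly as stated on the slide:

```python
def guess_payoff(g_left, g_right, s_left, s_right):
    """Belief-elicitation payoff from the slide: 10 minus a quarter of
    the absolute guessing error on each neighbor's strategy."""
    return 10 - abs(g_left - s_left) / 4 - abs(g_right - s_right) / 4

# Perfect guesses earn the full 10 points; each unit of error costs 0.25.
print(guess_payoff(2.5, 2.5, 2.5, 2.5))  # 10.0
```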

Properties of the Game
  • Best response:
  • BR Dynamics: unstable
    • One eigenvalue is +2
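The instability claim can be illustrated with a one-dimensional stand-in for the 3-player best-response dynamics: with an eigenvalue (here, a slope) of +2, deviations from the equilibrium s* = 2.5 double every period. The linear form is an assumption for illustration, not the mechanism’s actual best-response function:

```python
def iterate_br(s0, s_star=2.5, slope=2.0, steps=5):
    """Iterate a linear best-response map s' = s* + slope * (s - s*).
    With |slope| > 1, as with the +2 eigenvalue reported on the slide,
    deviations from equilibrium grow geometrically."""
    s, path = s0, [s0]
    for _ in range(steps):
        s = s_star + slope * (s - s_star)
        path.append(s)
    return path

# Starting 0.1 above equilibrium, the deviation doubles each period.
print(iterate_br(2.6))
```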
Design
  • PEEL Lab, U. Pittsburgh
  • All Sessions
    • 3 player groups, 50 periods
    • Same group, ID#s for all periods
    • Payoffs etc. common information
    • No explicit public good framing
    • Calculator always available
    • 5 minute ‘warm-up’ with calculator
  • Sessions 1-6
    • Guess s_L and s_R.
  • Sessions 7-13
    • Baseline: no guesses.
Does Elicitation Affect Choice?
  • Total Variation:
    • No significant difference (p=0.745)
  • No. of Strategy Switches:
    • No significant difference (p=0.405)
  • Autocorrelation (predictability):
    • Slightly more without elicitation
  • Total Earnings per Session:
    • No significant difference (p=1)
  • Missed Periods:
    • Elicited: 9/300 (3%) vs. Not: 3/350 (0.8%)
Does Play Converge?

[Figures: average |s_i − s_i*| per period and average |y − y^o| per period]

Accuracy of Beliefs
  • Guesses get better in time

[Figure: average ||guess − s_{-i}(t)|| per period, shown for elicited guesses and for calculator inputs]

Model 1: Parametric EWA
  • δ : weight on strategy actually played
  • φ : decay rate of past attractions
  • ρ : decay rate of past experience
  • A(0): initial attractions
  • N(0): initial experience
  • λ : response sensitivity to attractions
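For concreteness, a sketch of the standard parametric EWA update (the Camerer-Ho form) using the parameters listed above; the paper’s exact implementation may differ:

```python
import math

def ewa_update(A, N, payoffs, played, delta, phi, rho):
    """One period of the EWA attraction update.
    A: current attractions; N: experience weight;
    payoffs[j]: (hypothetical) payoff strategy j would have earned
    against realized opponent play; played: index actually chosen."""
    N_new = rho * N + 1
    A_new = [(phi * N * a + (delta + (1 - delta) * (j == played)) * pi) / N_new
             for j, (a, pi) in enumerate(zip(A, payoffs))]
    return A_new, N_new

def logit_probs(A, lam):
    """Logistic choice probabilities with sensitivity lambda."""
    m = max(lam * a for a in A)          # subtract max for numerical stability
    exps = [math.exp(lam * a - m) for a in A]
    z = sum(exps)
    return [e / z for e in exps]
```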
Model 1’: Self-Tuning EWA
  • N(0) = 1
  • Replace δ and φ with deterministic functions:
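The deterministic replacements in self-tuning EWA are, roughly, a change-detecting decay φ and an attention-based δ (following Ho, Camerer & Chong); a sketch under that assumption, not Healy’s exact code:

```python
def self_tuning_phi(cum_freq, recent_freq):
    """phi(t) = 1 - S(t)/2, where the surprise index S(t) is the squared
    distance between cumulative and recent opponent-play frequencies."""
    S = sum((h - r) ** 2 for h, r in zip(cum_freq, recent_freq))
    return 1 - S / 2

def self_tuning_delta(payoffs, realized_payoff):
    """delta_j = 1 iff strategy j would have paid at least what the
    player actually earned; otherwise 0."""
    return [1.0 if pi >= realized_payoff else 0.0 for pi in payoffs]
```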
STEWA: Setup
  • Only remaining parameters: λ and A(0)
    • λ will be estimated
    • 5 minutes of ‘calculator time’ gives A(0)
      • Average payoff from calculator trials:
STEWA: Fit
  • Likelihoods are ‘zero’ for all λ
    • Guess: Lots of near misses in predictions
  • Alternative Measure: Quad. Scoring Rule
    • Best fit: λ = 0.04 (previous studies: λ>4)
    • Suggests attractions are very concentrated
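The slide does not show the exact scoring formula; one common quadratic-scoring convention, used here only as an illustration, measures the squared distance between the predicted distribution and the realized strategy:

```python
def quadratic_score(probs, realized):
    """Quadratic scoring loss sum_j (p_j - 1{j == realized})^2 for a
    predicted distribution over a strategy grid; lower is better.
    The normalization used on the slide may differ."""
    return sum((p - (j == realized)) ** 2 for j, p in enumerate(probs))
```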
STEWA: Adjustment Attempts
  • The problem: near misses in strategy space, not in time

  • Suggests: alter δ (weight on hypotheticals)
    • original specification: QSR* = 1.193 @ λ*=0.04
    • δ = 0.7 (p-beauty est.): QSR* = 1.056 @ λ*=0.03
    • δ = 1 (belief model): QSR* = 1.082 @ λ*=0.175
    • δ(k,t) = % of B.R. payoff: QSR* = 1.077 @ λ*=0.06
  • Altering φ:
    • 1/8 weight on surprises: QSR* = 1.228 @ λ*=0.04
STEWA: Other Modifications
  • Equal initial attractions: worse
  • Smoothing
    • Takes advantage of strategy space structure
      • λ spreads probability across strategies evenly
      • Smoothing spreads probability to nearby strategies
    • Smoothed Attractions
    • Smoothed Probabilities
    • But… No Improvement in QSR* or λ* !
  • Tentative Conclusion:
    • STEWA: not broken, or can’t be fixed…
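One way to implement the smoothing idea (probability mass spread to neighboring strategies on the grid) is a small convolution kernel; the kernel weights here are an assumption for illustration:

```python
def smooth_probs(probs, kernel=(0.25, 0.5, 0.25)):
    """Spread each strategy's probability onto its grid neighbors;
    mass that would fall off the grid is clipped back to the edge,
    so total probability is preserved."""
    n, half = len(probs), len(kernel) // 2
    out = [0.0] * n
    for j, p in enumerate(probs):
        for k, w in enumerate(kernel):
            idx = min(max(j + k - half, 0), n - 1)
            out[idx] += w * p
    return out
```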
Other Standard Models
  • Nash Equilibrium
  • Uniform Mixed Strategy (‘Random’)
  • Logistic Cournot BR
  • Deterministic Cournot BR
  • Logistic Fictitious Play
  • Deterministic Fictitious Play
  • k-Period BR
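Several of the belief-based models above differ only in how past opponent play is averaged; a decaying average nests the two extremes (decay = 1 gives fictitious play, decay = 0 gives Cournot best reply to last period). A sketch of that averaging step:

```python
def belief(history, decay=1.0):
    """Belief about the opponent's next strategy as a decaying average
    of past play, weighting the most recent period most heavily."""
    num = den = 0.0
    w = 1.0
    for s in reversed(history):
        num += w * s
        den += w
        w *= decay
        if w == 0:
            break
    return num / den
```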
“New” Models
  • Best respond to stated beliefs (Sessions 1-6 only)
  • Best respond to calculator entries
    • Issue: how to aggregate calculator usage?
    • Decaying average of input
  • Reinforcement based on calculator payoffs
    • Decaying average of payoffs
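One way to make “decaying average of payoffs” concrete: turn a subject’s calculator trials into per-strategy attractions, weighting recent trials more heavily. The decay value is an illustrative assumption:

```python
def calculator_attractions(trials, decay=0.9):
    """Aggregate calculator usage into per-strategy attractions.
    trials: list of (strategy, payoff) pairs in chronological order;
    each trial's payoff gets a weight that decays with its age."""
    num, den = {}, {}
    w = 1.0
    for strategy, payoff in reversed(trials):
        num[strategy] = num.get(strategy, 0.0) + w * payoff
        den[strategy] = den.get(strategy, 0.0) + w
        w *= decay
    return {s: num[s] / den[s] for s in num}
```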
Model Comparisons

* Estimates on the grid of integers {-10, -9, …, 9, 10}

In-sample = periods 1-35; out-of-sample = periods 36-end

The “Take-Homes”
  • Methodological issues
    • Infinite strategy space
    • Convergence vs. Instability
    • Right notion of error
  • Self-Tuning EWA fits best.
  • Surprisingly, guesses & calculator input don’t seem to add any predictive power.