slide1 n.
Download
Skip this Video
Loading SlideShow in 5 Seconds..
Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) April 6, 2006 Rotterdam PowerPoint Presentation
Download Presentation
Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) April 6, 2006 Rotterdam

Loading in 2 Seconds...

play fullscreen
1 / 36

Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) April 6, 2006 Rotterdam - PowerPoint PPT Presentation


  • 105 Views
  • Uploaded on

Adapting de Finetti's Proper Scoring Rules for Measuring Bayesian Subjective Probabilities when Those Probabilities Are not Bayesian. Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) April 6, 2006 Rotterdam.

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans) April 6, 2006 Rotterdam' - tacy


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
slide1

Adapting de Finetti's Proper Scoring Rules for Measuring Bayesian Subjective Probabilities when Those Probabilities Are not Bayesian

Peter P. Wakker (& Gijs van de Kuilen, Theo Offerman, Joep Sonnemans)

April 6, 2006Rotterdam

Topic: How to measure your subjective probability p of event E ("rain tomorrow")?

Such that also acceptable to frequentists?

slide2

2

Method 1. Ask directly (introspection).

Problem. No clear empirical meaning; no incentive(-compatibility).

Economists don't like such things.

slide3

3

Method 2. Reveal (binary) preferences.

€0.30  (€1 under E)  €0.20 

0.30 > p > 0.20 (if EV = expected value)

Problem. Much work. Get only inequalities.

(And: EV may be violated.)

Method 3. Reveal indifferences.

€0.25 ~ (€1 under E) 

p = 0.25 (if EV …)

Problem. Indifferences are hard to observe (BDM …).

(And: EV may be violated.)

slide4

4

Method 4. de Finetti's book making. Not explained here.

Problem. Not implementable in practice.

(And: EV may be violated.)

slide5

5

Method 5.Proper scoring rules.

Choose 0  r  1 as you like

(called reported probability).

Next, you receive

E Ec  1 – (1– r)2 1 – r2

EV: Take probability p of E;

subjective if must be. Then maximize EV.

Optimization of r:

slide6

6

EV = p(1 – (1– r)2) + (1 – p)(1 – r2)1st order condition:

2p(1– r) – (1 – p)2r = 0

2p(1– r) = (1 – p)2r

r = p!

Wow!

Avoids all problems mentioned above, except

Problem. EV may be violated …

Proper scoring rules tractable & widely used:

See Hanson (Nature, 2002), Prelec (Science 2005).

In accounting (Wright 1988), Bayesian statistics (Savage 1971), business (Stael von Holstein 1972), education (Echternacht 1972), medicine (Spiegelhalter 1986), psychology (Liberman & Tversky 1993; McClelland & Bolger 1994).

slide7

7

Before analyzing nonEV descriptively, a:

Theoretical Example [EV].Urn K ("known") with 100 balls:

25 G(reen), 25 R(ed), 25 S(ilver), 25 Y(ellow). One ball is drawn randomly.

E: ball is not red; E = {G,S,Y}.

p = 0.75.

Under expected value,

optimal rE is 0.75.

rG + rS + rY = 0.25 + 0.25 + 0.25 = 0.75 = rE:

r satisfies additivity; as probabilities should!

Example reanalyzed later.

slide8

8

1

R(p)

rEV

rEU

0.69

0.61

rnonEU

0.75

0.50

rnonEUA

nonEU

EU

0.25

EV

0

1

0

0.25

0.50

0.75

p

reward: if E true if not E true EV

 rEV=0.75

0.94 0.44 0.8125

 rEU=0.69

go to p. 11, Example EU

0.91 0.52 0.8094

go to p. 20, Example nonEUA

go to p. 15, Example nonEU

 rnonEU=0.61

0.85 0.63 0.7920

next p.

 rnonEUA=0.52

0.77 0.73 0.7596

Reported probability R(p) = rE as function of true probability p, under:

(a) expected value (EV);

(b) expected utility with U(x) = x0.5 (EU);

(c) nonexpected utility for known probabilities, with U(x) = x0.5 and with w(p) as common;

rnonEUA refers to nonexpected utility for unknown probabilities ("Ambiguity").

slide9

9

Deviation 1[Utility curvature].

Bernoulli (1738): risk aversion! U concave

(if EU …).

Now optimize

pU(1 – (1– r)2) + (1 – p)U(1 – r2)

slide10

(1–p)

p +

U´(1–r2)

U´(1–r2)

U´(1–(1–r)2)

U´(1–(1–r)2)

r

p =

(1–r)

r +

10

p

r =

Explicit expression:

EV: r is additive.

EU: r is nonadditive (U nonsymmetric about 0.5).

Gives critical test of EV versus EU

slide11

go to p. 8, with figure of R(p)

11

Theoretical Example continued [Expected Utility].

(urn K, 25 G, 25 R, 25 S, 25 Y).

E: {G,S,Y}; p = 0.75.

EV: rEV = 0.75.

Expected utility, U(x) = x:

rEU = 0.69.

rG+rS+rY = 0.31+0.31+0.31 = 0.93 > 0.69 = rE:

additivity violated!

Such data prove that expected value cannot hold.

Example reanalyzed later.

slide12

12

Deviation 2 from EV: nonexpected utility (Allais 1953, Machina 1982, Kahneman & Tversky 1979, Quiggin 1982, Schmeidler 1989, Gilboa 1987, Gilboa & Schmeidler 1989, Gul 1991, Tversky & Kahneman 1992, etc.)

Fortwo-gainprospects,all theories areas follows:

For r  0.5, nonEU(r) =

w(p)U(1 – (1–r)2) + (1–w(p))U(1–r2).

r < 0.5, symmetry; soit!

Different treatment of highest and lowest outcome: "Rank-dependence."

slide13

.51

1/3

2/3

13

1

w(p)

1

0

1/3

p

Figure.The common weighting fuction w.

w(p) = exp(–(–ln(p))) for  = 0.65.

w(1/3)  1/3;

w(2/3)  .51

slide14

w(p)

r =

(1–w(p))

w(p) +

U´(1–r2)

U´(1–r2)

U´(1–(1–r)2)

U´(1–(1–r)2)

Explicit expression:

)

r

(

p =

w–1

(1–r)

r +

14

Now

slide15

go to p. 8, with figure of R(p)

15

Examplecontinued[nonEU].

(urn K, 25 G, 25 R, 25 S, 25 Y).

E: {G,S,Y}; p = 0.75.

EV: rEV = 0.75.

EU: rEU = 0.69.

Nonexpected utility, U(x) = x,

w(p) = exp(–(–ln(p))0.65).

rnonEU = 0.61.

rG+rS+rY = 0.39+0.39+0.39 = 1.17 > 0.61 = rE:

additivity is strongly violated!

Deviations from EV and Bayesianism so far were at level of behavior; not of beliefs.

Now for something different; more fundamental.

slide16

16

3rd violation of EV: Ambiguity (unknown probabilities)

Preparation for theoretical example.

Random draw from additional urn A.

100 balls, ? Ga, ? Ra, ? Sa, ? Ya.

Unknown composition ("Ambiguous").

Ea: {Ga,Sa,Ya}; p = ?.

How deal with unknown probabilities?

Have to give up Bayesian beliefs.

slide17

>

<

Probabilistic soph. Bayesian beliefs!

17

1st Proposal (trying to maintain Bayesian beliefs)

Assign "subjective" probabilities to events.

Then behave as if known probs (may be nonEU)

("probabilistic sophistication," Machina & Schmeidler 1992; they did it for normative, not as we do now descriptively).

By symmetry then

P(G) = P(R) = P(S) = P(Y) = ¼.

Treated same as known urn K.

Empirically violated (Ellsberg)!

slide18

18

Instead of additive beliefs p = P(E), nonadditive beliefs B(E), with

B(E)  B(G) + B(S) + B(Y).

(Dempster&Shafer, Tversky&Koehler, etc.)

All currently existing decision models:

For r  0.5, nonEU(r) =

w(B(E))U(1–(1–r)2) + (1–w(B(E)))U(1–r2).

(Commonly written with W(E) for w(B(E)), so

B(E) = w–1(W(E)); matter of notation.)

slide19

w(B(E))

rE =

(1–w(B(E)))

w(B(E)) +

U´(1–r2)

U´(1–r2)

U´(1–(1–r)2)

U´(1–(1–r)2)

Explicit expression:

r

)

(

w–1

B(E) =

(1–r)

r +

19

slide20

go to p. 8, with figure of R(p)

20

Examplecontinued[Ambiguity, nonEUA].

(urn A, 100 balls, ? G, ? R, ? S, ? Y).

E: {G,S,Y}; p = ?

rEV = 0.75.

rEU = 0.69.

rnonEU = 0.61 (under plausible assumptions).

Typically,w(B(E))<<w(B(E)) (E from known urn).

rnonEUA is, say, 0.52.

rG+rS+rY = 0.48+0.48+0.48 = 1.44 > 0.52 = rE:

additivity is very extremely violated!

r's are close to always saying fifty-fifty.

Belief component B(E) = 0.62.

slide21

21

B(E): ambiguity attitude /=/ beliefs??

Before entering that debate, first:

How measure B(E)?

Our contribution: through proper scoring rules with "risk correction."

Earlier proposals for measuring B:

slide22

22

Proposal 1 (common in decision theory).

Measure U,W, and w from behavior.

Derive B(E) = w–1(W(E)) from it.

Problem: Much and difficult work!!!

slide23

23

Proposal 2 (common in decision analysis of the 1960s, and in modern experimental economics).

Measure "canonical probabilities":

For ambiguous event Ea,

find objective prob. p such that

(€100 if Ea) ~ (€100 with prob. p). Then (algebra …)

B(E) = p.

Problem: measuring indifferences is difficult.

slide24

24

Proposal 3 (common in proper scoring rules): Calibration …

Problem: Need many repeated observations.

slide25

25

Our proposal: Take the best of all worlds!

Get B(E) = w–1(W(E)) without measuring U,W, and w from decision making.

Get canonical probability without measuring indifferences, or BDM.

Calibrate without needing long repeated observations.

Do all that with no more than simple proper-scoring-rule questions.

slide26

)

(

w–1

p =

)

(

U´(1–r2)

U´(1–r2)

w–1

B(E) =

U´(1–(1–r)2)

U´(1–(1–r)2)

r

r

(1–r)

(1–r)

r +

r +

26

We reconsider explicit expressions:

Corollary.p = B(E) if related to the same r!!

slide27

27

Because p = B(E) if related to same:

We simply measure the R(p) curves, and use their inverses:

B(E) = R–1(rE).

Appling R–1 is called risk correction.

Directly implementable empirically.

We did so in an experiment, and found plausible results.

slide28

28

Experimental Test of Our Correction Method

slide29

29

Method

Participants. N = 93 students.

Procedure. Computarized in lab. Groups of 15/16 each. 4 practice questions.

slide30

30

Stimuli 1. First we did proper scoring rule for unknown probabilities. 72 in total.

For each stock two small intervals, and, third, their union. Thus, we test for additivity.

slide31

31

Stimuli 2. Known probabilities:

Two 10-sided dies thrown.

Yield random nr. between 01 and 100.

Event E: nr.  75 (etc.).

Done for all probabilities j/20.

Motivating subjects. Real incentives.

Two treatments.

1. All-pay. Points paid for all questions.

6 points = €1.

Average earning €15.05.

2. One-pay (random-lottery system).

One question, randomly selected afterwards, played for real. 1 point = €20. Average earning: €15.30.

slide32

32

Results

slide34

0.7

0.6

0.5

treatment

one

0.4

0.3

0.2

treatment

all

0.1

0

34

F(

ρ

)

1

0.9

Individual corrections.

0.8

ρ

-2.0

-1.5

-1.0

-0.5

0.0

0.5

1.0

1.5

slide35

35

uncorrected

uncorrected

Figure 9.1. Empirical density of additivity bias for the two treatments

Fig. b. Treatment t=ALL

Fig. a. Treatment t=ONE

160

160

140

140

corrected

120

120

100

100

80

80

60

60

corrected

40

40

20

20

0

0

0.2

0.6

0.4

0

0.2

0.4

0.6

0.6

0.4

0.2

0

0.2

0.4

0.6

For each interval [, ] of length 0.05 around , we counted the number of additivity biases in the interval, aggregated over 32 stocks and 89 individuals, for both treatments. With risk-correction, there were @ > 60 additivity biases between 0.375 and 0.425 in the treatment t=ONE, and without risk-correction there were @<100 such; etc.

slide36

36

Summary and Conclusion

Modern decision theories show that proper scoring rules are heavily biased.

We present ways to correct for these biases (get right beliefs even if EV violated).

Experiment shows that correction improves quality and reduces deviations from ("rational"?) Bayesian beliefs.

Correction does not remove all deviations from Bayesian beliefs. Beliefs seem to be genuinely nonadditive/nonBayesian/sensitive-to-ambiguity.