An answer and a question limits combining 2 results significance does 2 give 2
This presentation is the property of its rightful owner.
Sponsored Links
1 / 16

An Answer and a Question Limits: Combining 2 results Significance: Does  2 give  2 ? PowerPoint PPT Presentation


  • 45 Views
  • Uploaded on
  • Presentation posted in: General

An Answer and a Question Limits: Combining 2 results Significance: Does  2 give  2 ?. Roger Barlow BIRS meeting July 2006. Revisit s+b. Calculator (used in BaBar) based on Cousins and Highland: frequentist for s, Bayesian integration for  and b

Download Presentation

An Answer and a Question Limits: Combining 2 results Significance: Does  2 give  2 ?

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


An answer and a question limits combining 2 results significance does 2 give 2

An Answer and a QuestionLimits: Combining 2 resultsSignificance: Does 2 give 2?

Roger Barlow

BIRS meeting

July 2006


Revisit s b

Revisit s+b

  • Calculator (used in BaBar) based on Cousins and Highland: frequentist for s, Bayesian integration for  and b

  • See http://www.slac.stanford.edu/~barlow/java/statistics2.html and C.P.C. 149 (2002) 97

  • 3 different priors (uniform in  ,1/ , ln  )


Combining limits

Combining Limits?

With 2 measurements

x=1.1  0.1 and x=1.2  0.1

the combination is obvious

With 2 measurements

x<1.1 @ 90% CL and x<1.2 @ 90% CL

all we can say is x<1.1 @ 90% CL


Frequentist problem

Frequentist problem

Given N1 events with effcy 1 , background b1

N2 events with effcy 2 , background b2

(Could be 2 experiments, or 2 channels in same experiment)

For significance need to calculate, given source strength s, probability of result {N1 ,N2 } or less.


What does or less mean

What does “Or less” mean?

  • Is (3,4) larger or smaller than (2,5) ?

More

??

N2

Less

??

N1


Constraint

Constraint

If 1 = 2 and b1=b2 then N1+N2 is sufficient. So cannot just take lower left quadrant as ‘less’.

(And the example given yesterday is trivial)


Suggestion

Suggestion

  • Could estimate s by maximising log (Poisson) likelihood

    -(i s +bi) + Ni ln (i s +bi)

    Hence

     Ni i /( i s +bi) - i =0

  • Order results by the value of s they give from solving this

  • Easier than it looks. For a given {Ni } this quantity is monotonic decreasing with s. Solve once to get sdata , explore s space generating many {Ni } : sign of  Ni i /( i sdata +bi) - i tells you whether this estimated s is greater or less than sdata


Message

Message

  • This is implemented in the code – ‘Add experiment’ button (up to 10)

  • Comments as to whether this is useful are welcome


Significance

Significance

Analysis looking for bumps

Pure background gives 2old of 60 for 37 dof (Prob 1%).

Not good but not totally impossible

Fit to background+bump (4 new parameters) gives better 2new of 28

Question: Is this significant?

Answer: Yes

Question: How much?

Answer:

Significance is(2 new- 2old )

= (60-28)=5.65

Schematic only!!

No reference to any experimental data, real or fictitious

Puzzle. How does a 3 sigma discrepancy become a 5 sigma discovery?


Justification

Justification?

  • ‘We always do it this way’

  • ‘Belle does it this way’

  • ‘CLEO does it this way’


Possible justification

Possible Justification

Likelihood Ratio Test

a.k.a. Maximum Likelihood Ratio Test

If M1 and M2 are models with max. likelihoods L1 and L2 for the data, then 2ln(L2 / L1) is distributed as a 2 with N1 - N2 degrees of freedom

Provided that

  • M2 contains M1 

  • Ns are large 

  • Errors are Gaussian 

  • Models are linear 


Does it matter

Does it matter?

  • Investigate with toy MC

  • Generate with Uniform distribution in 100 bins, <events/bin>=100. 100 is large and Poisson is reasonably Gaussian

  • Fit with

    • Uniform distribution (99 dof)

    • Linear distribution (98 dof)

    • Cubic (96 dof):a0+a1 x + a2 x2 + a3 x3

    • Flat+Gaussian (96 dof): a0+a1 exp(-0.5(x- a2)2/a32)

      Cubic is linear: Gaussian is not linear in a2and a3


One experiment

One ‘experiment’

Flat +Gauss

Flat

linear

Cubic


Calculate 2 probabilities of differences in models

Calculate 2 probabilities of differences in models

Compare linear and uniform models. 1 dof. Probability flat Method OK

Compare flat+gaussian and uniform models. 3 dof. Probability very unflat

Method invalid

Peak at low P corresponds to large 2 i.e. false claims of significant signal

Compare cubic and uniform models. 3 dof. Probability flat Method OK


Not all parameters are equally useful

Not all parameters are equally useful

If 2 models have the same number of parameters and both contain the true model, one can give better results than the other.This tells us nothing about the data

Shows 2 for flat+gauss v. cubic

Same number of parameters

Flat+gauss tends to be lower

Conclude: 2 does not give 2?


But surely

But surely…

  • In the large N limit, ln L

    is parabolic in fitted

    parameters.

  • Model 2 contains Model 1

    with a2=0 etc. So expect ln L

    to increase by equivalent of 3

    in chi squared.

    Question. What is wrong with this argument? Asymptopic? Different probability? Or is it right and the previous analysis is wrong?


  • Login