Loading in 5 sec....

An Answer and a Question Limits: Combining 2 results Significance: Does 2 give 2 ?PowerPoint Presentation

An Answer and a Question Limits: Combining 2 results Significance: Does 2 give 2 ?

Download Presentation

An Answer and a Question Limits: Combining 2 results Significance: Does 2 give 2 ?

Loading in 2 Seconds...

- 60 Views
- Uploaded on
- Presentation posted in: General

An Answer and a Question Limits: Combining 2 results Significance: Does 2 give 2 ?

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

An Answer and a QuestionLimits: Combining 2 resultsSignificance: Does 2 give 2?

Roger Barlow

BIRS meeting

July 2006

- Calculator (used in BaBar) based on Cousins and Highland: frequentist for s, Bayesian integration for and b
- See http://www.slac.stanford.edu/~barlow/java/statistics2.html and C.P.C. 149 (2002) 97
- 3 different priors (uniform in ,1/ , ln )

With 2 measurements

x=1.1 0.1 and x=1.2 0.1

the combination is obvious

With 2 measurements

x<1.1 @ 90% CL and x<1.2 @ 90% CL

all we can say is x<1.1 @ 90% CL

Given N1 events with effcy 1 , background b1

N2 events with effcy 2 , background b2

(Could be 2 experiments, or 2 channels in same experiment)

For significance need to calculate, given source strength s, probability of result {N1 ,N2 } or less.

- Is (3,4) larger or smaller than (2,5) ?

More

??

N2

Less

??

N1

If 1 = 2 and b1=b2 then N1+N2 is sufficient. So cannot just take lower left quadrant as ‘less’.

(And the example given yesterday is trivial)

- Could estimate s by maximising log (Poisson) likelihood
-(i s +bi) + Ni ln (i s +bi)

Hence

Ni i /( i s +bi) - i =0

- Order results by the value of s they give from solving this
- Easier than it looks. For a given {Ni } this quantity is monotonic decreasing with s. Solve once to get sdata , explore s space generating many {Ni } : sign of Ni i /( i sdata +bi) - i tells you whether this estimated s is greater or less than sdata

- This is implemented in the code – ‘Add experiment’ button (up to 10)
- Comments as to whether this is useful are welcome

Analysis looking for bumps

Pure background gives 2old of 60 for 37 dof (Prob 1%).

Not good but not totally impossible

Fit to background+bump (4 new parameters) gives better 2new of 28

Question: Is this significant?

Answer: Yes

Question: How much?

Answer:

Significance is(2 new- 2old )

= (60-28)=5.65

Schematic only!!

No reference to any experimental data, real or fictitious

Puzzle. How does a 3 sigma discrepancy become a 5 sigma discovery?

- ‘We always do it this way’
- ‘Belle does it this way’
- ‘CLEO does it this way’

Likelihood Ratio Test

a.k.a. Maximum Likelihood Ratio Test

If M1 and M2 are models with max. likelihoods L1 and L2 for the data, then 2ln(L2 / L1) is distributed as a 2 with N1 - N2 degrees of freedom

Provided that

- M2 contains M1
- Ns are large
- Errors are Gaussian
- Models are linear

- Investigate with toy MC
- Generate with Uniform distribution in 100 bins, <events/bin>=100. 100 is large and Poisson is reasonably Gaussian
- Fit with
- Uniform distribution (99 dof)
- Linear distribution (98 dof)
- Cubic (96 dof):a0+a1 x + a2 x2 + a3 x3
- Flat+Gaussian (96 dof): a0+a1 exp(-0.5(x- a2)2/a32)
Cubic is linear: Gaussian is not linear in a2and a3

Flat +Gauss

Flat

linear

Cubic

Compare linear and uniform models. 1 dof. Probability flat Method OK

Compare flat+gaussian and uniform models. 3 dof. Probability very unflat

Method invalid

Peak at low P corresponds to large 2 i.e. false claims of significant signal

Compare cubic and uniform models. 3 dof. Probability flat Method OK

If 2 models have the same number of parameters and both contain the true model, one can give better results than the other.This tells us nothing about the data

Shows 2 for flat+gauss v. cubic

Same number of parameters

Flat+gauss tends to be lower

Conclude: 2 does not give 2?

- In the large N limit, ln L
is parabolic in fitted

parameters.

- Model 2 contains Model 1
with a2=0 etc. So expect ln L

to increase by equivalent of 3

in chi squared.

Question. What is wrong with this argument? Asymptopic? Different probability? Or is it right and the previous analysis is wrong?