
Statistics

  • We collect a sample of data; what do we do with it?

    • Estimate parameters (possibly of some model)

    • Test whether a particular theory is consistent with our data (hypothesis testing)

  • Statistics is a set of tools that allows us to achieve these goals



Statistics

  • Preliminaries



Statistics

  • Some common estimators are the sample mean, x̄ = (1/N) Σ xi, and the sample variance, s² = (1/(N−1)) Σ (xi − x̄)²
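
A minimal numerical sketch of these two estimators (the sample values are hypothetical):

    #include <cstdio>
    #include <vector>

    int main() {
        std::vector<double> x = {4.8, 5.1, 5.3, 4.9, 5.0};   // hypothetical sample
        double mean = 0.0;
        for (double xi : x) mean += xi;
        mean /= x.size();                                    // sample mean
        double s2 = 0.0;
        for (double xi : x) s2 += (xi - mean) * (xi - mean);
        s2 /= (x.size() - 1);                                // unbiased sample variance
        std::printf("mean = %f, variance = %f\n", mean, s2);
    }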


χ² Distribution

  • A common situation is that you have a set of measurements xi and you know the true value xit of each

    • How good are our measurements?

  • Similarly you may be comparing a histogram of data with another that contains expectation values under some hypothesis

    • How well do the data agree with this hypothesis?

  • Or if parameters of a function were estimated using the method of least squares, a minimum value of χ² was obtained

    • How good was the fit?


χ² Distribution

  • Assuming

    • The measurements are independent of each other

    • The measurements come from a Gaussian distribution

  • One can use the “goodness-of-fit” statistic χ² = Σi (xi − xit)² / σi² to answer these questions

    • In the case of Poisson-distributed numbers, σi² = xit, and this is called Pearson’s χ² statistic
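
A short sketch of the χ² sum, with hypothetical measurements, true values, and Gaussian errors:

    #include <cstdio>
    #include <vector>

    int main() {
        std::vector<double> x   = {1.2, 2.9, 4.1};   // measurements (hypothetical)
        std::vector<double> xt  = {1.0, 3.0, 4.0};   // true values
        std::vector<double> sig = {0.2, 0.3, 0.2};   // Gaussian errors
        double chi2 = 0.0;
        for (size_t i = 0; i < x.size(); ++i) {
            double r = (x[i] - xt[i]) / sig[i];      // normalized residual
            chi2 += r * r;
        }
        std::printf("chi2 = %f for %zu terms\n", chi2, x.size());
    }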


χ² Distribution

  • The chi-square pdf for n degrees of freedom is f(z; n) = z^(n/2−1) e^(−z/2) / (2^(n/2) Γ(n/2)) for z ≥ 0, with mean n and variance 2n


χ² Distribution


χ² Distribution

  • The integrals (or cumulative distributions) between arbitrary points for both the Gaussian and χ² distributions cannot be evaluated analytically and must be looked up

    • What is the probability of getting a χ² > 10 with 4 degrees of freedom?

    • This number tells you the probability that random fluctuations (chance fluctuations) in the data would give a value of χ² > 10
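
In practice one rarely consults printed tables; ROOT's TMath::Prob, for example, returns this upper-tail probability directly (here for the χ² > 10, 4 dof case above):

    #include "TMath.h"
    #include <cstdio>

    void chi2_pvalue() {
        double p = TMath::Prob(10.0, 4);   // P(chi2 > 10 | 4 dof), about 0.04
        std::printf("p = %g\n", p);
    }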


χ² Distribution

  • Note the p-value is defined as p = ∫ from χ²obs to ∞ of f(z; n) dz, the probability of obtaining a χ² at least as large as the one observed

  • We’ll come back to p-values in a moment


χ² Distribution

  • 1 − cumulative χ² distribution


χ² Distribution

  • Often one uses the reduced χ² = χ²/n, where n is the number of degrees of freedom; for a good fit the reduced χ² should be close to 1



Hypothesis Testing

  • Hypothesis tests provide a rule for accepting or rejecting hypotheses depending on the outcome of a measurement



Hypothesis Testing

  • Normally we define regions in x-space where the data are compatible with H and where they are not



Hypothesis Testing

  • Let’s say there is just one hypothesis H

  • We can define some test statistic t whose value in some way reflects the level of agreement between the data and the hypothesis

  • We can quantify the goodness-of-fit by specifying a p-value given an observed tobs in the experiment

    • Assumes t is defined such that large values correspond to poor agreement with the hypothesis

    • g(t|H) is the pdf for t



Hypothesis Testing

  • Notes

    • p is not the significance level of the test

    • p is not the confidence level of a confidence interval

    • p is not the probability that H is true

      • That’s Bayesian speak

    • p is the probability, under the assumption of H, of obtaining data (x or t(x)) having equal or lesser compatibility with H as xobs



Hypothesis Testing

  • Flip coins

    • Hypothesis H: the coin is fair (random), so ph = pt = 0.5

    • We could take t = |nh − N/2|

  • Toss coin N=20 times and observe nh=17

  • Is H false?

    • Don’t know

    • We can say that the probability, assuming H, of a result at least this extreme (nh ≥ 17 or nh ≤ 3) is 0.0026

    • p is the probability of observing this result “by chance”
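
A sketch of the calculation, using t = |nh − N/2| so both extreme tails (nh ≥ 17 and nh ≤ 3) count:

    #include <cmath>
    #include <cstdio>

    int main() {
        const int N = 20;
        double p = 0.0;
        for (int k = 17; k <= N; ++k) {
            // binomial coefficient C(N, k) via lgamma to avoid overflow
            double logC = std::lgamma(N + 1.0) - std::lgamma(k + 1.0)
                        - std::lgamma(N - k + 1.0);
            p += std::exp(logC) * std::pow(0.5, N);
        }
        p *= 2.0;                          // add the symmetric tail nh <= 3
        std::printf("p = %f\n", p);        // about 0.0026, as quoted above
    }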



Kolmogorov-Smirnov (K-S) Test

  • The K-S test is an alternative to the χ² test when the data sample is small

  • It is also more powerful than the χ² test since it does not rely on bins – though one commonly uses it that way

    • A common use is to quantify how well data and Monte Carlo distributions agree

  • The distribution of the K-S statistic itself does not depend on the underlying cumulative distribution function being tested



K-S Test

  • Data – Monte Carlo comparison



K-S Test

  • The K-S test is based on the empirical cumulative distribution function (ECDF) Fn(x)

    • For n ordered data points yi, Fn(x) = (number of yi ≤ x) / n

  • This is a step function that increases by 1/n at the value of each ordered data point



K-S Test

  • The K-S statistic is given by D = maxx |Fn(x) − F(x)|

  • If D > some critical value obtained from tables, the hypothesis (that the data and theory distributions agree) is rejected
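
A minimal sketch of D for a small hypothetical sample, tested here against a standard normal CDF (any hypothesized F(x) could be substituted):

    #include <algorithm>
    #include <cmath>
    #include <cstdio>
    #include <vector>

    double F(double x) {                   // hypothesized CDF: standard normal
        return 0.5 * (1.0 + std::erf(x / std::sqrt(2.0)));
    }

    int main() {
        std::vector<double> y = {-1.1, -0.3, 0.2, 0.7, 1.5};  // hypothetical data
        std::sort(y.begin(), y.end());
        const double n = y.size();
        double D = 0.0;
        for (size_t i = 0; i < y.size(); ++i) {
            // compare F(y_i) with the ECDF just below and just above the step
            D = std::max(D, std::fabs((i + 1) / n - F(y[i])));
            D = std::max(D, std::fabs(i / n - F(y[i])));
        }
        std::printf("D = %f\n", D);
    }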



K-S Test



Statistics

  • Suppose N independent measurements xi are drawn from a pdf f(x; θ)

  • We want to estimate the parameters θ

    • The most important method for doing this is the method of maximum likelihood

    • A related method is the method of least squares
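
As a small illustration of maximum likelihood: for an exponential pdf f(t; τ) = (1/τ) e^(−t/τ), setting d(ln L)/dτ = 0 gives τ̂ equal to the sample mean (the decay times below are hypothetical):

    #include <cstdio>
    #include <vector>

    int main() {
        std::vector<double> t = {0.7, 2.1, 1.3, 0.4, 3.0};  // hypothetical decay times
        double sum = 0.0;
        for (double ti : t) sum += ti;
        double tauHat = sum / t.size();    // ML estimate: the sample mean
        std::printf("tau_hat = %f\n", tauHat);
    }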



Hypothesis Testing

  • Example

    • Properties of some selected events

    • Hypothesis H: these are top quark events

  • Working in x-space is hard so usually one constructs a test statistic t instead whose value reflects the compatibility between the data vector x and H

    • Low t – data more compatible with H

    • High t – data less compatible with H

  • Since f(x|H) is known, g(t|H) can be determined



Hypothesis Testing

  • Notes

    • p is not the significance level of the test

    • p is not the confidence level of a confidence interval

    • p is not the probability that H is true

      • That’s Bayesian speak

    • p is the probability, under the assumption of H, of obtaining data (x or t(x)) having equal or lesser compatibility with H as xobs

  • Since p is a function of the random variable x, p itself is a random variable

    • If H is true, p is uniform in [0,1]

    • If H is not true, p is peaked closer to 0



Hypothesis Testing

  • Suppose we observe nobs = ns + nb events

    • ns, nb are Poisson r.v.’s with means νs, νb

    • nobs = ns + nb is a Poisson r.v. with mean ν = νs + νb



Hypothesis Testing

  • Suppose νb = 0.5 and we observe nobs = 5

    • Publish/NY Times headline or not?

  • Often we take H to be the null hypothesis – assume the observation is a random fluctuation of the background

    • Assume νs = 0

    • p is then the probability of observing 5 or more events resulting from chance fluctuations of the background
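
This is just the Poisson upper tail, p = P(n ≥ 5 | ν = 0.5); a quick sketch:

    #include <cmath>
    #include <cstdio>

    int main() {
        const double nu = 0.5;
        double term = std::exp(-nu), cum = 0.0;    // term starts at P(0)
        for (int k = 0; k < 5; ++k) {
            cum += term;                           // accumulate P(0)..P(4)
            term *= nu / (k + 1);                  // P(k) -> P(k+1)
        }
        std::printf("p = %g\n", 1.0 - cum);        // about 1.7e-4
    }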



Hypothesis Testing

  • Another problem: instead of counting events, say we measure some variable x

    • Publish/NY Times headline or not?



Hypothesis Testing

  • Again take H to be the null hypothesis – assume the observation is a random fluctuation of the background

    • Assume νs = 0

  • Again p is the probability of observing 11 or more events resulting from chance fluctuations of the background

    • How did we know where to look / how to bin?

    • Is the observed width consistent with the resolution in x?

    • Would a slightly different analysis still show a peak?

    • What about the fact that the bins on either side of the peak are low?



Least Squares

  • Another approach is to compare a histogram with a hypothesis that provides expectation values

    • In this case we’d compare a vector of Poisson distributed numbers (the histogram) with their expectation values νi = E[ni]

    • This is called Pearson’s χ² statistic

    • If the νi are not too small (e.g. νi > 5) then the observed χ² will follow the chi-square pdf for N dof

      • Or more generally for N minus the number of fitted parameters

      • Same will hold true for N independent measurements yi that are Gaussian distributed



Least Squares

  • We can calculate the p-value as p = ∫ from χ²obs to ∞ of f(z; ndof) dz

  • In our example



Least Squares

  • In our example, though, we have many bins with a small number of counts, or zero

  • We can still use Pearson’s test but we need to determine the pdf f(χ²) by Monte Carlo

    • Generate ni from a Poisson distribution with mean νi in each bin

    • Compute χ² and record it in a histogram

    • Repeat a large number of times (see next slide)
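
A sketch of this toy Monte Carlo (the bin means and the observed χ² are hypothetical):

    #include <cstdio>
    #include <random>
    #include <vector>

    int main() {
        std::vector<double> nu = {0.5, 1.2, 3.0, 1.1, 0.4};  // hypothetical bin means
        const double chi2obs = 9.0;                          // hypothetical observed chi2
        std::mt19937 rng(12345);
        const int nToys = 100000;
        int nAbove = 0;
        for (int t = 0; t < nToys; ++t) {
            double chi2 = 0.0;
            for (double nui : nu) {
                std::poisson_distribution<int> pois(nui);
                int n = pois(rng);
                chi2 += (n - nui) * (n - nui) / nui;         // Pearson's statistic
            }
            if (chi2 >= chi2obs) ++nAbove;                   // tail of f(chi2)
        }
        std::printf("p = %f\n", double(nAbove) / nToys);
    }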



Least Squares

  • Using the modified pdf would give p=0.11 rather than p=0.073

    • In either case, we won’t publish



K-S Test

  • Usage in ROOT

    • TFile *data = TFile::Open("data.root");   // file names here are hypothetical

    • TFile *MC = TFile::Open("MC.root");

    • TH1F *jet_pt = (TH1F*)data->Get("h_jet_pt");

    • TH1F *MCjet_pt = (TH1F*)MC->Get("h_jet_pt");

    • Double_t KS = MCjet_pt->KolmogorovTest(jet_pt);

  • Notes

    • The returned value is the probability of the test

      • << 1 means the two histograms are not compatible

    • The returned value is not the maximum KS distance though you can return this with option “M”

  • Also available in the statistical toolbox in MATLAB



Limiting Cases

Binomial → Poisson (N → ∞, p → 0, ν = Np fixed) → Gaussian (large ν)



Nobel Prize or IgNobel Prize?

  • CDF result



Kaplan-Meier Curve

  • A patient is treated for a disease. What is the probability of an individual surviving or remaining disease-free?

    • Usually patients will be followed for various lengths of time after treatment

    • Some will survive or remain disease-free while others will not. Some will leave the study.

    • A nonparametric estimate can be obtained using

      • Kaplan-Meier curve

      • Life table

      • Survival curve




Kaplan-Meier Curve

  • Calculate a conditional probability

    • S(tN) = P(t1) × P(t2) × P(t3) × … × P(tN)

      • The survival function S(t) is related to the empirical distribution function F(t) by S(t) = 1 − F(t)

    • We can write this as S(t) = Π over tk ≤ t of (1 − dk/nk), where dk is the number of deaths at time tk and nk is the number at risk just before tk
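
A minimal sketch of this product-limit calculation (the event table is hypothetical):

    #include <cstdio>
    #include <vector>

    struct Step { double t; int d; int n; };   // time, deaths at t, at risk before t

    int main() {
        std::vector<Step> steps = {{2, 1, 10}, {5, 2, 8}, {9, 1, 5}};  // hypothetical
        double S = 1.0;
        for (const Step& s : steps) {
            S *= 1.0 - double(s.d) / s.n;      // conditional survival through t_k
            std::printf("S(%g) = %f\n", s.t, S);
        }
    }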




Kaplan-Meier Curve



Kaplan-Meier Curve

  • The square root of the variance of S(t) can be calculated using Greenwood’s formula, Var[S(t)] = S(t)² Σ over tk ≤ t of dk / (nk(nk − dk))

  • Assuming the pk follow a Gaussian (normal) distribution, the 95% CL will be S(t) ± 1.96 √Var[S(t)]




Gaussian Confidence Interval



Gaussian Confidence Interval



Gaussian Distribution

  • Some useful properties of the Gaussian distribution are

    • P(x in range μ ± σ) = 0.683

    • P(x in range μ ± 2σ) = 0.9545

    • P(x in range μ ± 3σ) = 0.9973

    • P(x outside range μ ± 3σ) = 0.0027

    • P(x outside range μ ± 5σ) = 5.7×10⁻⁷

    • P(x in range μ ± 0.6745σ) = 0.5
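
These coverages follow from P(|x − μ| ≤ nσ) = erf(n/√2), which is easy to verify:

    #include <cmath>
    #include <cstdio>

    int main() {
        for (int n = 1; n <= 5; ++n)           // 0.683, 0.9545, 0.9973, ...
            std::printf("P(|x-mu| < %d sigma) = %.7f\n",
                        n, std::erf(n / std::sqrt(2.0)));
    }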



Gaussian Distribution



Confidence Intervals

  • Suppose you have a bag of black and white marbles and wish to determine the fraction f that are white. How confident are you of the initial composition? How does your confidence change after extracting n black balls?

  • Suppose you are tested for a disease. The test is 100% accurate if you have the disease. The test gives a 0.2% false-positive rate if you do not. The test comes back positive. What is the probability that you have the disease?



Confidence Intervals

  • Suppose you are searching for the Higgs and have a well-known expected background of 3 events. What 90% confidence limit can you set on the Higgs cross section

    • if you observe 0 events?

    • if you observe 3 events?

    • if you observe 10 events?

  • The ability to set confidence limits (or claim discovery) is an important part of frontier physics

  • How to do this the “correct” way is somewhat/very controversial



Confidence Intervals

  • Questions

    • What is the mass of the top quark?

    • What is the mass of the tau neutrino?

    • What is the mass of the Higgs?

  • Answers

    • Mt = 172.5 ± 2.3 GeV

    • Mv < 18.2 MeV

    • MH > 114.3 GeV

  • More correct answers

    • Mt = 172.5 ± 2.3 GeV with CL = 0.683

    • 0 < Mv < 18.2 MeV with CL = 0.95

    • Infinity > MH > 114.3 GeV with CL = 0.95



Confidence Interval

  • A confidence interval reflects the statistical precision of the experiment and quantifies the reliability of a measurement

  • For a sufficiently large data sample, the mean and standard deviation of the mean provide a good interval

    • What if the pdf isn’t Gaussian?

    • What if there are physical boundaries?

    • What if the data sample is small?

  • Here we run into problems



Confidence Interval

  • A dog has a 50% probability of being within 100 m of its master

    • You observe the dog, what can you say about its master?

      • With 50% probability, the master is within 100m of the dog

      • But this assumes

        • The master can be anywhere around the dog

        • The dog has no preferred direction of travel



Confidence Intervals

  • Neyman’s construction

    • Consider a pdf f(x;θ) = P(x|θ)

    • For each value of θ, we construct a horizontal line segment [x1, x2] such that P(x ∈ [x1, x2] | θ) = 1 − α

    • The union of such intervals for all values of θ is called the confidence belt



Confidence Intervals

  • Neyman’s construction

    • After performing an experiment to measure x, a vertical line is drawn through the experimentally measured value x0

    • The confidence interval for θ is the set of all values of θ for which the corresponding line segment [x1, x2] is intercepted by the vertical line
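
For a Gaussian measurement with known σ the construction can be done in closed form; a sketch for a central 95% belt (the measured value is hypothetical):

    #include <cstdio>

    int main() {
        // For x ~ N(theta, sigma), each theta gets the segment
        // [theta - z*sigma, theta + z*sigma] with z = 1.96 for 1 - alpha = 0.95.
        // Inverting the belt at the measured x0 gives the interval below.
        const double sigma = 1.0, z = 1.96;
        const double x0 = 2.3;                 // hypothetical measurement
        std::printf("95%% CL: [%f, %f]\n", x0 - z * sigma, x0 + z * sigma);
    }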



Confidence Intervals



Confidence Interval

  • Notes

    • The coverage condition is not unique

      • P(x < x1 | θ) = P(x > x2 | θ) = α/2

        • Called central confidence intervals

      • P(x < x1 | θ) = α

        • Called upper confidence limits

      • P(x > x2 | θ) = α

        • Called lower confidence limits




Confidence Intervals



Confidence Intervals



Confidence Interval

  • Experiment X uses a fit to extract the neutrino mass

    • Mv = -4 ± 2 eV

    • ⇒ P(Mv < 0 eV) = 0.98?



Confidence Interval

  • What is probability?

    • Frequentist approach

      • Developed by Venn, Fisher, Neyman, von Mises

      • The relative frequency with which something happens

      • number of successes / number of trials

        • Venn limit (as the number of trials n → ∞)

      • Assumes successes occurred in the past and will occur in the future with the same probability

    • Statement S: “It will rain tomorrow in Tucson”, with P(S) = 0.01

      • The relative frequency with which it rains on Mondays in April is 0.01



Confidence Interval

  • What is probability?

    • Bayesian approach

      • Developed by Bayes, Laplace, Gauss, Jeffreys, de Finetti

      • The degree of belief or confidence of a statement or measurement

      • Closer to what is used in everyday life

        • Is the Standard Model correct?

      • Similar to betting odds

      • Not “scientific”?

    • Statement S: “It will rain tomorrow in Tucson”, with P(S) = 0.01

      • The plausibility of the statement is 0.01 (i.e., the same as if I were to draw a white ball out of a container of 100 balls, 1 of which is white)



Confidence Interval

  • Usually

    • Confidence interval == frequentist confidence interval

    • Credible interval == Bayesian posterior probability interval

      • But you’ll also hear Bayesian confidence interval

  • Probability

    • P = 1 − α

      • α = 0.05 ⇒ P = 95%



Confidence Interval

  • Suppose you wish to determine a parameter θ whose true value θt is unknown

  • Assume we make a single measurement of an observable x whose pdf P(x|θ) depends on θ

    • Recall this is the probability of obtaining x given θ

  • Say we measure x0, then we obtain P(x0|θ)

  • Frequentist

    • Makes statements about P(x|θ)

  • Bayesian

    • Makes statements about P(θt|x0)

    • P(θt|x0) = P(x0|θt) P(θt) / P(x0)

  • We’ll stick with the frequentist approach for the moment



Confidence Interval

  • (Frequentist) confidence intervals are constructed to include the true value of the parameter (θt) with a probability of 1 − α

    • In fact this is true for any value of θ

  • A confidence interval [θ1, θ2] is a member of a set, such that the set has the property that P(θ ∈ [θ1, θ2]) = 1 − α

    • Perform an ensemble of experiments with fixed θ

    • The interval [θ1, θ2] will vary and cover the fixed value θ in a fraction 1 − α of the experiments

  • Presumably when we make a measurement we are selecting it at random from the ensemble that contains the true value of θ, θt

  • Note we haven’t said anything about the probability of θt being in the interval [θ1,θ2] as a Bayesian would



Confidence Interval

  • If P(θ ∈ [θ1, θ2]) = 1 − α is true, we say the intervals “cover” θ at the stated confidence

  • If there are values of θ for which P(θ ∈ [θ1, θ2]) < 1 − α, we say the intervals “undercover” for that θ

  • If there are values of θ for which P(θ ∈ [θ1, θ2]) > 1 − α, we say the intervals “overcover” for that θ

  • Undercoverage is bad

  • Overcoverage is conservative




Confidence Intervals

  • These confidence intervals have a confidence level = 1 − α

  • By construction, P(θ ∈ [θ1, θ2]) ≥ 1 − α is satisfied for all θ, including θt

  • Another method is to consider a test of the hypothesis that the parameter’s true value is θ

  • If the variables are discrete, by convention one constructs the confidence belt by requiring P(x1 ≤ x ≤ x2 | θ) ≥ 1 − α



Examples

  • Data consisting of a single random variable x that follows a Gaussian distribution

  • Counting experiments



Poisson Confidence Interval

  • We previously mentioned that the number of events produced in a reaction with cross section σ and fixed luminosity L follows a Poisson distribution with mean ν = σ∫L dt

    • P(n; ν) = e^(−ν) ν^n / n!

    • If the variables are discrete, by convention one constructs the confidence belt by requiring P(x1 ≤ x ≤ x2 | θ) ≥ 1 − α

  • Example: Measuring the Higgs production cross section assuming no background
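
A sketch of the belt inversion for a counting experiment: the 1 − α upper limit νup solves P(n ≤ nobs | νup) = α, found here by bisection (for nobs = 0 and α = 0.1 this reproduces the ν < 2.3 limit quoted below):

    #include <cmath>
    #include <cstdio>

    double poissonCDF(int n, double nu) {      // P(k <= n | nu)
        double term = std::exp(-nu), cum = 0.0;
        for (int k = 0; k <= n; ++k) { cum += term; term *= nu / (k + 1); }
        return cum;
    }

    int main() {
        const int nObs = 0;
        const double alpha = 0.10;
        double lo = 0.0, hi = 50.0;            // CDF is monotone in nu: bisect
        for (int i = 0; i < 60; ++i) {
            double mid = 0.5 * (lo + hi);
            if (poissonCDF(nObs, mid) > alpha) lo = mid; else hi = mid;
        }
        std::printf("nu_up = %f at 90%% CL\n", lo);   // about 2.30 for n = 0
    }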



Poisson Confidence Interval



Poisson Confidence Interval

Poisson Distribution


Poisson Confidence Interval

  • Assume signal s and background b



Poisson Confidence Interval



Confidence Intervals

  • Sometimes, though, confidence intervals

    • Are empty

    • Reduce in size when the background estimate increases

    • Are smaller for a poorer experiment

    • Exclude parameters for which the experiment is insensitive

  • Example

    • We know that P(n = 0 | ν = 2.3) = 0.1

    • ν < 2.3 @ 90% CL

    • If the number of background events b is 3, then since ν = s + b, the number of signal events s < −0.7 at 90% CL?



Confidence Intervals



Confidence Intervals

