
Statistics

This presentation provides an overview of statistics, discussing the use of data sampling, parameter estimation, and hypothesis testing. It also explains the χ² distribution, p-values, and the Kolmogorov-Smirnov test.

Presentation Transcript


  1. Statistics • We collect a sample of data, what do we do with it? • Estimate parameters (possibly of some model) • Test whether a particular theory is consistent with our data (hypothesis testing) • Statistics is a set of tools that allows us to achieve these goals

  2. Statistics • Preliminaries

  3. Statistics • Some common estimators are those for the sample mean and variance (written out below)
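
The transcript omits the slide's formulas; the standard estimators (a safe reconstruction) are

  \hat{\mu} = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad \widehat{\sigma^2} = \frac{1}{N-1}\sum_{i=1}^{N} \left(x_i - \hat{\mu}\right)^2

The factor 1/(N−1) rather than 1/N makes the variance estimator unbiased.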

  4. χ² Distribution • A common situation is that you have a set of measurements x_i and you know the true value x_i^t of each • How good are our measurements? • Similarly you may be comparing a histogram of data with another that contains expectation values under some hypothesis • How well do the data agree with this hypothesis? • Or if parameters of a function were estimated using the method of least squares, a minimum value of χ² was obtained • How good was the fit?

  5. χ² Distribution • Assuming • The measurements are independent of each other • The measurements come from a Gaussian distribution • One can use the “goodness-of-fit” statistic χ² (written out below) to answer these questions • In the case of Poisson distributed numbers, σ_i² = x_i^t, this is called Pearson’s χ² statistic
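
The statistic itself is not reproduced in the transcript; the standard form, consistent with the definitions above, is

  \chi^2 = \sum_{i=1}^{N} \frac{\left(x_i - x_i^t\right)^2}{\sigma_i^2}

and with Poisson-distributed counts one substitutes \sigma_i^2 = x_i^t, giving Pearson's χ².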

  6. χ² Distribution • Chi-square distribution
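
The slide presumably showed the pdf; the standard chi-square density for n degrees of freedom is

  f(z; n) = \frac{1}{2^{n/2}\,\Gamma(n/2)}\, z^{n/2 - 1} e^{-z/2}, \quad z \ge 0, \qquad E[z] = n, \quad V[z] = 2n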

  7. χ² Distribution

  8. χ² Distribution • The integrals (or cumulative distributions) between arbitrary points for both the Gaussian and χ² distributions cannot be evaluated analytically and must be looked up • What is the probability of getting a χ² > 10 with 4 degrees of freedom? • This number tells you the probability that random (chance) fluctuations in the data would give a value of χ² > 10
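
For even numbers of degrees of freedom the tail probability actually has a closed form; for n = 4 (a check, not in the transcript):

  P(\chi^2 > 10) = e^{-10/2}\left(1 + \tfrac{10}{2}\right) = 6\,e^{-5} \approx 0.040

so a fluctuation this large occurs by chance about 4% of the time.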

  9. χ² Distribution • Note the p-value is defined as the upper-tail integral written out below • We’ll come back to p-values in a moment
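
The defining integral, dropped from the transcript, is the upper tail of the chi-square pdf f(z; n) given above:

  p = \int_{\chi^2_{\rm obs}}^{\infty} f(z; n)\, dz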

  10. χ² Distribution • 1 − cumulative χ² distribution

  11. χ² Distribution • Often one uses the reduced χ² = χ²/n, i.e. the χ² per degree of freedom; values near 1 indicate a reasonable fit

  12. Hypothesis Testing • Hypothesis tests provide a rule for accepting or rejecting hypotheses depending on the outcome of a measurement

  13. Hypothesis Testing • Normally we define regions in x-space where the data are compatible with H and where they are not

  14. Hypothesis Testing • Let’s say there is just one hypothesis H • We can define some test statistic t whose value in some way reflects the level of agreement between the data and the hypothesis • We can quantify the goodness-of-fit by specifying a p-value given an observed t_obs in the experiment (see below) • This assumes t is defined such that large values correspond to poor agreement with the hypothesis • g is the pdf for t
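
With g(t; H) the pdf of the statistic under H, the p-value (the formula the transcript drops) is the upper tail beyond the observed value:

  p = \int_{t_{\rm obs}}^{\infty} g(t; H)\, dt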

  15. Hypothesis Testing • Notes • p is not the significance level of the test • p is not the confidence level of a confidence interval • p is not the probability that H is true • That’s Bayesian speak • p is the probability, under the assumption of H, of obtaining data (x or t(x)) having equal or lesser compatibility with H than x_obs

  16. Hypothesis Testing • Flip coins • Hypothesis H is that the coin is fair, so p_h = p_t = 0.5 • We could take t = |n_h − N/2| • Toss the coin N = 20 times and observe n_h = 17 • Is H false? • Don’t know • We can say that the probability of a deviation this large (n_h ≥ 17 or n_h ≤ 3, i.e. t ≥ 7) assuming H is 0.0026 • p is the probability of observing this result “by chance”
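
A check of that number (the arithmetic is not in the transcript): under H, n_h is binomial with N = 20 and p = 0.5, so

  p = P(t \ge 7) = 2\sum_{n=17}^{20} \binom{20}{n}\left(\tfrac{1}{2}\right)^{20} = \frac{2\,(1140 + 190 + 20 + 1)}{2^{20}} \approx 0.0026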

  17. Kolmogorov-Smirnov (K-S) Test • The K-S test is an alternative to the χ² test when the data sample is small • It is also more powerful than the χ² test since it does not rely on bins – though one commonly uses it that way • A common use is to quantify how well data and Monte Carlo distributions agree • The distribution of the test statistic also does not depend on the underlying cumulative distribution function being tested (the test is distribution-free)

  18. K-S Test • Data – Monte Carlo comparison

  19. K-S Test • The K-S test is based on the empirical cumulative distribution function (ECDF) F_n(x), defined below • For n ordered data points y_i • This is a step function that increases by 1/n at the value of each ordered data point
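
The definition itself is missing from the transcript; the standard form is

  F_n(x) = \frac{1}{n}\sum_{i=1}^{n} I(y_i \le x)

where I(·) is 1 when its argument is true and 0 otherwise.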

  20. K-S Test • The K-S statistic D is the maximum distance between the two cumulative distributions (see below) • If D > some critical value obtained from tables, the hypothesis (that the data and theory distributions agree) is rejected
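
Written out (the transcript omits the formula), with F(x) the hypothesized CDF:

  D = \sup_x \left| F_n(x) - F(x) \right|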

  21. K-S Test

  22. Statistics • Suppose N independent measurements x_i are drawn from a pdf f(x;θ) • We want to estimate the parameters θ • The most important method for doing this is the method of maximum likelihood • A closely related method is the method of least squares
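
The likelihood function, not reproduced in the transcript, is the standard product over the measurements; maximum likelihood chooses θ to maximize it (equivalently, its logarithm):

  L(\theta) = \prod_{i=1}^{N} f(x_i;\theta), \qquad \hat{\theta} = \arg\max_{\theta}\, \ln L(\theta)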

  23. Hypothesis Testing • Example • Properties of some selected events • Hypothesis H is that these are top quark events • Working in x-space is hard, so usually one constructs a test statistic t instead, whose value reflects the compatibility between the data vector x and H • Low t – data more compatible with H • High t – data less compatible with H • Since f(x;H) is known, g(t;H) can be determined

  24. Hypothesis Testing • Notes • p is not the significance level of the test • p is not the confidence level of a confidence interval • p is not the probability that H is true • That’s Bayesian speak • p is the probability, under the assumption of H, of obtaining data (x or t(x)) having equal or lesser compatibility with H than x_obs • Since p is a function of the r.v. x, p itself is an r.v. • If H is true, p is uniform in [0,1] • If H is not true, p is peaked closer to 0

  25. Hypothesis Testing • Suppose we observe n_obs = n_s + n_b events • n_s, n_b are Poisson r.v.’s with means ν_s, ν_b • n_obs = n_s + n_b is then a Poisson r.v. with mean ν = ν_s + ν_b
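
The Poisson probability underlying this (standard, not reproduced in the transcript) is

  P(n;\nu) = \frac{\nu^{n} e^{-\nu}}{n!}

and a sum of independent Poisson variables is itself Poisson with the summed mean.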

  26. Hypothesis Testing • Suppose ν_b = 0.5 and we observe n_obs = 5 • Publish/NY Times headline or not? • Often we take H to be the null hypothesis – assume it’s a random fluctuation of the background • Assume ν_s = 0 • The p-value is then the probability of observing 5 or more events resulting from chance fluctuations of the background (computed below)
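
Filling in the arithmetic (a check, not in the transcript):

  p = P(n \ge 5;\, \nu_b = 0.5) = 1 - \sum_{n=0}^{4} \frac{0.5^{n} e^{-0.5}}{n!} \approx 1.7 \times 10^{-4}

small enough that a pure background fluctuation looks unlikely, though see the cautions on the following slides.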

  27. Hypothesis Testing • Another problem: instead of counting events, say we measure some variable x • Publish/NY Times headline or not?

  28. Hypothesis Testing • Again take H to be the null hypothesis – assume it’s a random fluctuation of the background • Assume ν_s = 0 • Again p is the probability of observing 11 or more events resulting from chance fluctuations of the background • How did we know where to look / how to bin? • Is the observed width consistent with the resolution in x? • Would a slightly different analysis still show a peak? • What about the fact that the bins on either side of the peak are low?

  29. Least Squares • Another approach is to compare a histogram with a hypothesis that provides expectation values • In this case we’d compare a vector of Poisson distributed numbers (the histogram) with their expectation values ν_i = E[n_i] • This is called Pearson’s statistic (see below) • If the ν_i are not too small (e.g. ν_i > 5) then the observed χ² will follow the chi-square pdf for N dof • Or more generally N − m dof, where m is the number of fitted parameters • The same holds true for N independent measurements y_i that are Gaussian distributed
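
Pearson's statistic for the histogram comparison (the formula is dropped in the transcript) is

  \chi^2 = \sum_{i=1}^{N} \frac{(n_i - \nu_i)^2}{\nu_i}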

  30. Least Squares • We can calculate the p-value as the upper-tail χ² probability, as before • In our example this gives p = 0.073 (see slide 32)

  31. Least Squares • In our example, though, we have many bins with a small number of counts or 0 • We can still use Pearson’s test, but we need to determine the pdf f(χ²) by Monte Carlo • Generate n_i from a Poisson with mean ν_i in each bin • Compute χ² and record it in a histogram • Repeat a large number of times (see next slide; a code sketch follows)
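
A minimal sketch of that Monte Carlo in plain C++ (only the procedure is from the slide; the bin expectations nu, the trial count, and the seed are hypothetical placeholders, and the original analysis presumably worked in ROOT):

  // Determine the pdf f(chi^2) by Monte Carlo when bins have few counts.
  #include <cstdio>
  #include <random>
  #include <vector>

  int main() {
      // Hypothetical expectation values nu_i, one per histogram bin.
      const std::vector<double> nu = {0.4, 1.1, 2.3, 5.0, 2.1, 0.8};
      const int ntrials = 100000;
      std::mt19937 rng(12345);              // arbitrary seed

      std::vector<int> hist(50, 0);         // histogram of chi^2, bin width 1
      for (int trial = 0; trial < ntrials; ++trial) {
          double chi2 = 0.0;
          for (double nui : nu) {
              std::poisson_distribution<int> pois(nui);
              int n = pois(rng);                    // fluctuate the bin content
              chi2 += (n - nui) * (n - nui) / nui;  // Pearson's chi^2 term
          }
          int bin = static_cast<int>(chi2);
          if (bin < (int)hist.size()) ++hist[bin];
      }
      // The normalized contents of hist approximate f(chi^2); the p-value for
      // an observed chi2_obs is the fraction of trials falling above it.
      for (size_t i = 0; i < hist.size(); ++i)
          std::printf("chi2 in [%zu,%zu): %.4f\n", i, i + 1, (double)hist[i] / ntrials);
      return 0;
  }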

  32. Least Squares • Using the modified pdf would give p=0.11 rather than p=0.073 • In either case, we won’t publish

  33. K-S Test • Usage in ROOT (file names here are placeholders) • TFile *data = TFile::Open("data.root"); • TFile *MC = TFile::Open("MC.root"); • TH1F *jet_pt = (TH1F*)data->Get("h_jet_pt"); • TH1F *MCjet_pt = (TH1F*)MC->Get("h_jet_pt"); • Double_t KS = MCjet_pt->KolmogorovTest(jet_pt); • Notes • The returned value is the probability of the test • << 1 means the two histograms are not compatible • The returned value is not the maximum K-S distance, though you can obtain that with option “M” • Also available in the statistical toolbox in MATLAB

  34. Limiting Cases • Binomial → Poisson → Gaussian
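
The standard limits behind that chain (a reconstruction; the slide's plots are not in the transcript):

  \mathrm{Binomial}(N, p) \to \mathrm{Poisson}(\nu = Np) \ \text{as } N \to \infty \text{ with } Np \text{ fixed}
  \mathrm{Poisson}(\nu) \to \mathrm{Gaussian}(\mu = \nu,\ \sigma^2 = \nu) \ \text{as } \nu \to \infty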

  35. Nobel Prize or IgNobel Prize? • CDF result

  36. Kaplan-Meier Curve • A patient is treated for a disease. What is the probability of an individual surviving or remaining disease-free? • Usually patients will be followed for various lengths of time after treatment • Some will survive or remain disease-free while others will not. Some will leave the study. • A nonparametric estimate can be obtained using a • Kaplan-Meier curve • Life table • Survival curve

  37. Kaplan-Meier Curve • Calculate a conditional probability • S(t_N) = P(t_1) × P(t_2) × P(t_3) × … × P(t_N) • The survival function S(t) is the complement of the empirical distribution function, S(t) = 1 − F(t) • We can write this as the product-limit form below
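
The standard Kaplan-Meier (product-limit) estimator, which the transcript drops, is

  \hat{S}(t) = \prod_{k:\, t_k \le t} \left(1 - \frac{d_k}{n_k}\right)

where d_k is the number of events (deaths or relapses) at time t_k and n_k the number still at risk just before t_k; censored patients leave the risk set without contributing an event.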

  38. Kaplan-Meier Curve

  39. Kaplan-Meier Curve • The square root of the variance of S(t) can be calculated as shown below • Assuming the p_k follow a Gaussian (normal) distribution, the 95% CL follows from the usual Gaussian interval
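
The usual choice here is Greenwood's formula (a standard reconstruction; the slide's own expression is not preserved in the transcript):

  \widehat{\mathrm{Var}}\!\left[\hat{S}(t)\right] = \hat{S}(t)^2 \sum_{k:\, t_k \le t} \frac{d_k}{n_k (n_k - d_k)}

with the 95% interval then \hat{S}(t) \pm 1.96\,\sqrt{\widehat{\mathrm{Var}}[\hat{S}(t)]}.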

  40. Gaussian Confidence Interval

  41. Gaussian Confidence Interval

  42. Gaussian Distribution • Some useful properties of the Gaussian distribution are • P(x in range μ±σ) = 0.683 • P(x in range μ±2σ) = 0.9545 • P(x in range μ±3σ) = 0.9973 • P(x outside range μ±3σ) = 0.0027 • P(x outside range μ±5σ) = 5.7×10⁻⁷ • P(x in range μ±0.6745σ) = 0.50

  43. Gaussian Distribution

  44. Confidence Intervals • Suppose you have a bag of black and white marbles and wish to determine the fraction f that are white. How confident are you of the initial composition? How does your confidence change after extracting n black balls? • Suppose you are tested for a disease. The test is 100% accurate if you have the disease. The test gives 0.2% false positives if you do not. The test comes back positive. What is the probability that you have the disease? (a worked version follows)
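
The disease question has no answer without a prior prevalence, which is the point; as an illustration (the prevalence is an assumed number, not from the slides), take 1 in 1000:

  P(D \mid +) = \frac{P(+\mid D)\,P(D)}{P(+\mid D)\,P(D) + P(+\mid \bar{D})\,P(\bar{D})} = \frac{1 \times 0.001}{1 \times 0.001 + 0.002 \times 0.999} \approx 0.33

so even a positive result from this very accurate test leaves only about a one-in-three chance of disease.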

  45. Confidence Intervals • Suppose you are searching for the Higgs and have a well-known expected background of 3 events. What 90% confidence limit can you set on the Higgs cross section • if you observe 0 events? • if you observe 3 events? • if you observe 10 events? • The ability to set confidence limits (or claim discovery) is an important part of frontier physics • How to do this the “correct” way is somewhat/very controversial

  46. Confidence Intervals • Questions • What is the mass of the top quark? • What is the mass of the tau neutrino? • What is the mass of the Higgs? • Answers • M_t = 172.5 ± 2.3 GeV • M_ν < 18.2 MeV • M_H > 114.3 GeV • More correct answers • M_t = 172.5 ± 2.3 GeV with CL = 0.683 • 0 < M_ν < 18.2 MeV with CL = 0.95 • ∞ > M_H > 114.3 GeV with CL = 0.95

  47. Confidence Interval • A confidence interval reflects the statistical precision of the experiment and quantifies the reliability of a measurement • For a sufficiently large data sample, the mean and standard deviation of the mean provide a good interval • What if the pdf isn’t Gaussian? • What if there are physical boundaries? • What if the data sample is small? • Here we run into problems

  48. Confidence Interval • A dog has a 50% probability of being within 100m of its master • You observe the dog; what can you say about its master? • With 50% probability, the master is within 100m of the dog • But this assumes • The master can be anywhere around the dog • The dog has no preferred direction of travel

  49. Confidence Intervals • Neyman’s construction • Consider a pdf f(x;θ) = P(x|θ) • For each value of θ, we construct a horizontal line segment [x1, x2] such that P(x ∈ [x1, x2] | θ) = 1 − α • The union of such intervals for all values of θ is called the confidence belt

  50. Confidence Intervals • Neyman’s construction • After performing an experiment to measure x, a vertical line is drawn through the experimentally measured value x0 • The confidence interval for θ is the set of all values of θ for which the corresponding line segment [x1, x2] is intercepted by the vertical line
