Statistics - PowerPoint PPT Presentation

Presentation Transcript
Statistics
  • We collect a sample of data; what do we do with it?
    • Estimate parameters (possibly of some model)
    • Test whether a particular theory is consistent with our data (hypothesis testing)
  • Statistics is a set of tools that allows us to achieve these goals
Statistics
  • Preliminaries
Statistics
  • Some common estimators are the sample mean, x̄ = (1/N) Σ xi, and the sample variance, s² = (1/(N−1)) Σ (xi − x̄)²
χ² Distribution
  • A common situation is that you have a set of measurements xi and you know the true value xit of each
    • How good are our measurements?
  • Similarly, you may be comparing a histogram of data with a histogram of expectation values under some hypothesis
    • How well do the data agree with this hypothesis?
  • Or, if the parameters of a function were estimated using the method of least squares, a minimum value of χ² was obtained
    • How good was the fit?
χ² Distribution
  • Assuming
    • The measurements are independent of each other
    • The measurements come from a Gaussian distribution
  • One can use the “goodness-of-fit” statistic χ² = Σi (xi − xit)² / σi² to answer these questions
      • In the case of Poisson distributed numbers, σi² = xit, this is called Pearson’s χ² statistic
χ² Distribution
  • Chi-square distribution: for n degrees of freedom, f(z; n) = z^(n/2 − 1) e^(−z/2) / (2^(n/2) Γ(n/2)), with mean n and variance 2n
χ² Distribution
  • The integrals (or cumulative distributions) between arbitrary points for both the Gaussian and χ² distributions cannot be evaluated analytically and must be looked up
    • What is the probability of getting a χ² > 10 with 4 degrees of freedom?
    • This number tells you the probability that random (chance) fluctuations in the data would give a value of χ² > 10
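As a concrete check, this tail probability can be computed directly in ROOT (a minimal sketch; TMath::Prob returns the upper-tail probability of the χ² distribution):

    // chi2_pvalue.C -- run inside ROOT:  root -l chi2_pvalue.C
    #include "TMath.h"
    #include <cstdio>

    void chi2_pvalue()
    {
      double chi2 = 10.0;   // observed chi-square
      int    ndf  = 4;      // degrees of freedom
      // TMath::Prob integrates the chi-square pdf from chi2 to infinity
      double p = TMath::Prob(chi2, ndf);
      printf("P(chi2 > %.1f | ndf = %d) = %.4f\n", chi2, ndf, p);   // ~0.040
    }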
χ² Distribution
  • Note the p-value is defined as p = ∫ from χ²obs to ∞ of f(z; n) dz, the probability of obtaining a χ² as large as or larger than the one observed
  • We’ll come back to p-values in a moment
χ² Distribution
  • (Plot: 1 − cumulative χ² distribution)
χ² Distribution
  • Often one uses the reduced χ² = χ²/n, where n is the number of degrees of freedom
Hypothesis Testing
  • Hypothesis tests provide a rule for accepting or rejecting hypotheses depending on the outcome of a measurement
Hypothesis Testing
  • Normally we define regions in x-space where the data are compatible with H and regions where they are not
Hypothesis Testing
  • Let’s say there is just one hypothesis H
  • We can define some test statistic t whose value in some way reflects the level of agreement between the data and the hypothesis
  • We can quantify the goodness-of-fit by specifying a p-value given the observed value tobs in the experiment, p = ∫ from tobs to ∞ of g(t; H) dt
    • Assumes t is defined such that large values correspond to poor agreement with the hypothesis
    • g is the pdf for t under H
Hypothesis Testing
  • Notes
    • p is not the significance level of the test
    • p is not the confidence level of a confidence interval
    • p is not the probability that H is true
      • That’s Bayesian speak
    • p is the probability, under the assumption of H, of obtaining data (x or t(x)) with equal or lesser compatibility with H than the observed xobs
Hypothesis Testing
  • Flip coins
    • Hypothesis H: the coin is fair, so ph = pt = 0.5
    • We could take the test statistic t = |nh − N/2|
  • Toss the coin N = 20 times and observe nh = 17
  • Is H false?
    • Don’t know
    • We can say that the probability, assuming H, of a result at least as extreme as the one observed (t ≥ 7, i.e. 17 or more heads or 17 or more tails) is 0.0026
    • p is the probability of observing this result “by chance”
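A minimal sketch of that calculation in plain C++ (no external libraries): with t = |nh − N/2|, the p-value sums the binomial probabilities of every outcome at least as extreme as the observed one, i.e. 17 or more heads or 17 or more tails, which reproduces the 0.0026 quoted above.

    // coin_pvalue.cpp : p-value for t = |nh - N/2| >= 7 with a fair coin, N = 20
    #include <cstdio>
    #include <cmath>

    int main()
    {
      const int N = 20, nh_obs = 17;
      const double t_obs = std::fabs(nh_obs - N / 2.0);
      double p = 0.0;
      for (int k = 0; k <= N; ++k) {
        if (std::fabs(k - N / 2.0) < t_obs) continue;   // skip outcomes less extreme than observed
        // binomial coefficient C(N,k) via lgamma to avoid overflow
        double logC = std::lgamma(N + 1.0) - std::lgamma(k + 1.0) - std::lgamma(N - k + 1.0);
        p += std::exp(logC + N * std::log(0.5));        // C(N,k) * 0.5^N
      }
      std::printf("p = %.4f\n", p);                      // prints ~0.0026
      return 0;
    }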
Kolmogorov-Smirnov (K-S) Test
  • The K-S test is an alternative to the χ² test when the data sample is small
  • It is also more powerful than the χ² test since it does not rely on bins – though one commonly uses it that way
    • A common use is to quantify how well data and Monte Carlo distributions agree
  • The distribution of the K-S statistic also does not depend on the underlying cumulative distribution function being tested (the test is distribution-free)
K-S Test
  • (Plot: data – Monte Carlo comparison)
K-S Test
  • The K-S test is based on the empirical cumulative distribution function (ECDF) Fn(x)
    • For n ordered data points yi, Fn(x) = (number of yi ≤ x) / n
  • This is a step function that increases by 1/n at the value of each ordered data point
K-S Test
  • The K-S statistic is given by D = max over x of |Fn(x) − F(x)|, the largest distance between the ECDF and the hypothesized cumulative distribution F(x)
  • If D exceeds a critical value obtained from tables, the hypothesis (that the data and theory distributions agree) is rejected
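A minimal one-sample sketch in ROOT-flavoured C++ (the data values and the hypothesized CDF F(x) = x on [0,1] are made up for illustration; the p-value uses the asymptotic Kolmogorov distribution provided by TMath::KolmogorovProb):

    // ks_test.C -- run inside ROOT
    #include "TMath.h"
    #include <algorithm>
    #include <cmath>
    #include <cstdio>

    void ks_test()
    {
      double y[] = {0.05, 0.12, 0.31, 0.44, 0.57, 0.62, 0.78, 0.85, 0.91, 0.97};  // toy data
      const int n = sizeof(y) / sizeof(y[0]);
      std::sort(y, y + n);
      double D = 0.0;
      for (int i = 0; i < n; ++i) {
        double F = y[i];                               // hypothesized CDF evaluated at y[i]
        D = std::max(D, std::max((i + 1.0) / n - F,    // ECDF just above y[i]
                                 F - (double)i / n));  // ECDF just below y[i]
      }
      double p = TMath::KolmogorovProb(std::sqrt((double)n) * D);  // asymptotic p-value
      printf("D = %.3f, p ~ %.3f\n", D, p);
    }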
Statistics
  • Suppose N independent measurements xi are drawn from a pdf f(x; θ)
  • We want to estimate the parameters θ
    • The most important method for doing this is the method of maximum likelihood
    • A related method is the method of least squares
Hypothesis Testing
  • Example
    • Properties of some selected events
    • Hypothesis H: these are top quark events
  • Working in x-space is hard so usually one constructs a test statistic t instead whose value reflects the compatibility between the data vector x and H
    • Low t – data more compatible with H
    • High t – data less compatible with H
  • Since f(x,H) is known, g(t,H) can be determined
Hypothesis Testing
  • Notes
    • p is not the significance level of the test
    • p is not the confidence level of a confidence interval
    • p is not the probability that H is true
      • That’s Bayesian speak
    • p is the probability, under the assumption of H, of obtaining data (x or t(x)) with equal or lesser compatibility with H than the observed xobs
  • Since p is a function of the r.v. x, p itself is an r.v.
    • If H is true, p is uniform in [0,1]
    • If H is not true, p is peaked closer to 0
Hypothesis Testing
  • Suppose we observe nobs=ns+nb events
    • ns, nb are Poisson r.v.’s with means νs, νb
    • nobs = ns + nb is a Poisson r.v. with mean ν = νs + νb
Hypothesis Testing
  • Suppose νb = 0.5 and we observe nobs = 5
    • Publish/NY Times headline or not?
  • Often we take H to be the null hypothesis – assume it’s a random fluctuation of the background
    • Assume νs = 0
    • This is the probability of observing 5 or more events resulting from chance fluctuations of the background
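A minimal sketch of this number in ROOT (TMath::Poisson is the Poisson probability mass function): the p-value is the probability of observing 5 or more events when only background with mean νb = 0.5 is present.

    // poisson_pvalue.C -- run inside ROOT
    #include "TMath.h"
    #include <cstdio>

    void poisson_pvalue()
    {
      const double nu_b = 0.5;   // expected background
      const int    nobs = 5;     // observed events
      double p = 1.0;
      for (int k = 0; k < nobs; ++k)
        p -= TMath::Poisson(k, nu_b);   // subtract P(0), P(1), ..., P(4)
      printf("P(n >= %d | nu = %.1f) = %.2e\n", nobs, nu_b, p);   // ~1.7e-4
    }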
Hypothesis Testing
  • Another problem: instead of counting events, say we measure some variable x
    • Publish/NY Times headline or not?
Hypothesis Testing
  • Again take H to be the null hypothesis – assume it’s random fluctuation of background
    • Assume νs = 0
  • Again p is the probability of observing 11 or more events resulting from chance fluctuations of the background
    • How did we know where to look / how to bin?
    • Is the observed width consistent with the resolution in x?
    • Would a slightly different analysis still show a peak?
    • What about the fact that the bins on either side of the peak are low?
Least Squares
  • Another approach is to compare a histogram with a hypothesis that provides expectation values
    • In this case we’d compare a vector of Poisson distributed numbers (the histogram) with their expectation values νi = E[ni]
    • This is called Pearson’s χ² statistic
    • If the νi are not too small (e.g. νi > 5) then the observed χ² will follow the chi-square pdf for N degrees of freedom
      • Or more generally for N minus the number of fitted parameters
      • Same will hold true for N independent measurements yi that are Gaussian distributed
Least Squares
  • We can calculate the p-value as p = ∫ from χ²obs to ∞ of f(z; N) dz
  • In our example the standard chi-square pdf gives p = 0.073 (see below)
Least Squares
  • In our example, though, we have many bins with a small number of counts or none at all
  • We can still use Pearson’s test but we need to determine the pdf f(χ²) by Monte Carlo
    • Generate ni from a Poisson distribution with mean νi in each bin
    • Compute χ² and record it in a histogram
    • Repeat a large number of times (see next slide)
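A sketch of that toy Monte Carlo in ROOT (the per-bin expectations nu[] and the observed χ² used here are placeholders, not the numbers from the example in the slides):

    // chi2_toys.C -- run inside ROOT
    #include "TH1F.h"
    #include "TRandom3.h"
    #include <cstdio>

    void chi2_toys()
    {
      const double nu[] = {0.3, 1.2, 4.5, 7.0, 4.5, 1.2, 0.3};   // hypothetical expectations per bin
      const int nbins = sizeof(nu) / sizeof(nu[0]);
      const double chi2_obs = 12.0;                              // hypothetical observed chi2
      const int ntoys = 100000;

      TRandom3 rng(0);
      TH1F hchi2("hchi2", "toy #chi^{2};#chi^{2};toys", 100, 0., 50.);
      int nworse = 0;
      for (int itoy = 0; itoy < ntoys; ++itoy) {
        double chi2 = 0.0;
        for (int i = 0; i < nbins; ++i) {
          int n = rng.Poisson(nu[i]);                            // fluctuate each bin
          chi2 += (n - nu[i]) * (n - nu[i]) / nu[i];             // Pearson's chi2 term
        }
        hchi2.Fill(chi2);
        if (chi2 >= chi2_obs) ++nworse;
      }
      printf("p = %.3f\n", double(nworse) / ntoys);              // fraction of toys >= chi2_obs
    }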
Least Squares
  • Using the modified pdf would give p=0.11 rather than p=0.073
    • In either case, we won’t publish
K-S Test
  • Usage in ROOT
    • TFile* data = TFile::Open("data.root");   // file names are placeholders
    • TFile* MC   = TFile::Open("MC.root");
    • TH1F* jet_pt   = (TH1F*) data->Get("h_jet_pt");
    • TH1F* MCjet_pt = (TH1F*) MC->Get("h_jet_pt");
    • Double_t KS = MCjet_pt->KolmogorovTest(jet_pt);   // returns the K-S probability
  • Notes
    • The returned value is the probability of the test
      • << 1 means the two histograms are not compatible
    • The returned value is not the maximum K-S distance, though you can retrieve that with option “M”
  • Also available in the Statistics Toolbox in MATLAB
Limiting Cases

  • Binomial(N, p) → Poisson with mean ν = Np in the limit N → ∞, p → 0 with Np fixed
  • Poisson(ν) → Gaussian with μ = ν, σ² = ν for large ν
  • Binomial(N, p) → Gaussian with μ = Np, σ² = Np(1−p) for large N

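These limits are easy to check numerically; a small ROOT sketch comparing the three distributions at a few points (the parameter choices N = 200, p = 0.025 are only for illustration, and the Gaussian limit is still fairly rough at ν = 5):

    // limits_check.C -- run inside ROOT
    #include "TMath.h"
    #include <cstdio>

    void limits_check()
    {
      const int    N = 200;      // many trials ...
      const double p = 0.025;    // ... small success probability
      const double nu = N * p;   // = 5, the fixed mean
      for (int k = 2; k <= 8; ++k) {
        double binom = TMath::Binomial(N, k) * TMath::Power(p, k) * TMath::Power(1 - p, N - k);
        double poiss = TMath::Poisson(k, nu);
        double gauss = TMath::Gaus(k, nu, TMath::Sqrt(nu), kTRUE);   // normalized Gaussian pdf
        printf("k=%d  binomial=%.4f  poisson=%.4f  gaussian=%.4f\n", k, binom, poiss, gauss);
      }
    }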
Kaplan-Meier Curve
  • A patient is treated for a disease. What is the probability of an individual surviving or remaining disease-free?
    • Usually patients will be followed for various lengths of time after treatment
    • Some will survive or remain disease-free while others will not. Some will leave the study.
    • A nonparametric estimate can be obtained using
      • Kaplan-Meier curve
      • Life table
      • Survival curve


Kaplan-Meier Curve
  • Calculate a conditional probability of surviving each interval
    • S(tN) = P(t1) × P(t2) × P(t3) × … × P(tN)
      • The survival function S(t) is the complement of the cumulative distribution function, S(t) = 1 − F(t)
    • We can write this as the product-limit estimate S(t) = Π over tk ≤ t of (nk − dk)/nk, where nk is the number at risk and dk the number of deaths at time tk


Kaplan-Meier Curve
  • The square root of the variance of S(t) can be calculated as well (the conventional choice is Greenwood’s formula, σ²[S(t)] = S(t)² Σ dk / [nk(nk − dk)])
  • Assuming the pk follow a Gaussian (normal) distribution, the 95% CL band is then S(t) ± 1.96 σ[S(t)]
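A minimal sketch of the product-limit calculation in plain C++, using made-up life-table numbers (nk at risk and dk deaths at each ordered event time tk); the error band uses Greenwood's formula, which is assumed here as the conventional variance estimate:

    // kaplan_meier.cpp : Kaplan-Meier estimate with Greenwood errors (toy numbers)
    #include <cstdio>
    #include <cmath>

    int main()
    {
      // hypothetical event table: time, number at risk n_k, number of deaths d_k
      const double t[] = { 2,  5,  8, 12, 20};
      const int    n[] = {20, 18, 15, 11,  7};
      const int    d[] = { 1,  2,  1,  2,  1};
      const int nsteps = sizeof(t) / sizeof(t[0]);

      double S = 1.0, gw = 0.0;   // survival estimate and Greenwood sum
      for (int k = 0; k < nsteps; ++k) {
        S  *= double(n[k] - d[k]) / n[k];                    // product-limit update
        gw += double(d[k]) / (double(n[k]) * (n[k] - d[k])); // Greenwood's formula term
        double err = S * std::sqrt(gw);                      // sigma of S(t)
        std::printf("t=%4.0f  S=%.3f  95%% CL: [%.3f, %.3f]\n",
                    t[k], S, S - 1.96 * err, S + 1.96 * err);
      }
      return 0;
    }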


Gaussian Distribution
  • Some useful properties of the Gaussian distribution are
    • P(x in range μ ± σ) = 0.683
    • P(x in range μ ± 2σ) = 0.9545
    • P(x in range μ ± 3σ) = 0.9973
    • P(x outside range μ ± 3σ) = 0.0027
    • P(x outside range μ ± 5σ) = 5.7×10^-7
    • P(x in range μ ± 0.6745σ) = 0.5
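These coverage numbers follow from the Gaussian cumulative distribution, P(|x − μ| ≤ kσ) = erf(k/√2); a quick check in plain C++:

    // gauss_coverage.cpp : P(|x - mu| <= k*sigma) for a Gaussian
    #include <cstdio>
    #include <cmath>

    int main()
    {
      const double k[] = {0.6745, 1.0, 2.0, 3.0, 5.0};
      for (double ki : k) {
        double inside = std::erf(ki / std::sqrt(2.0));   // central probability
        std::printf("k = %.4f : inside = %.7f, outside = %.2e\n", ki, inside, 1.0 - inside);
      }
      return 0;
    }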
Confidence Intervals
  • Suppose you have a bag of black and white marbles and wish to determine the fraction f that are white. How confident are you of the initial composition? How does your confidence change after extracting n black balls?
  • Suppose you are tested for a disease. The test is 100% accurate if you have the disease. The test gives 0.2% false positive if you do not. The test comes back positive. What is the probability that you have the disease?
Confidence Intervals
  • Suppose you are searching for the Higgs and have a well-known expected background of 3 events. What 90% confidence limit can you set on the Higgs cross section
    • if you observe 0 events?
    • if you observe 3 events?
    • if you observe 10 events?
  • The ability to set confidence limits (or claim discovery) is an important part of frontier physics
  • How to do this the “correct” way is somewhat/very controversial
Confidence Intervals
  • Questions
    • What is the mass of the top quark?
    • What is the mass of the tau neutrino?
    • What is the mass of the Higgs?
  • Answers
    • Mt = 172.5 ± 2.3 GeV
    • Mν < 18.2 MeV
    • MH > 114.3 GeV
  • More correct answers
    • Mt = 172.5 ± 2.3 GeV with CL = 0.683
    • 0 < Mν < 18.2 MeV with CL = 0.95
    • Infinity > MH > 114.3 GeV with CL = 0.95
Confidence Interval
  • A confidence interval reflects the statistical precision of the experiment and quantifies the reliability of a measurement
  • For a sufficiently large data sample, the mean and the standard deviation of the mean provide a good interval
    • What if the pdf isn’t Gaussian?
    • What if there are physical boundaries?
    • What if the data sample is small?
  • Here we run into problems
Confidence Interval
  • A dog has a 50% probability of being 100m from its master
    • You observe the dog, what can you say about its master?
      • With 50% probability, the master is within 100m of the dog
      • But this assumes
        • The master can be anywhere around the dog
        • The dog has no preferred direction of travel
Confidence Intervals
  • Neyman’s construction
    • Consider a pdf f(x;θ) = P(x|θ)
    • For each value of θ, we construct a horizontal line segment [x1,x2] such that P(x ∈ [x1,x2] | θ) = 1 − α
    • The union of such intervals for all values of θ is called the confidence belt
Confidence Intervals
  • Neyman’s construction
    • After performing an experiment to measure x, a vertical line is drawn through the experimentally measured value x0
    • The confidence interval for θ is the set of all values of θ for which the corresponding line segment [x1,x2] is intercepted by the vertical line
Confidence Interval
  • Notes
    • The coverage condition is not unique
      • P(x<x1|θ) = P(x>x2|θ) = α/2
        • Called central confidence intervals
      • P(x<x1|θ) = α
        • Called upper confidence limits
      • P(x>x2|θ) = α
        • Called lower confidence limits
Poisson Confidence Interval
  • We previously mentioned that the number of events produced in a reaction with cross section σ and fixed luminosity L follows a Poisson distribution with mean ν = σ∫L dt
    • P(n; ν) = e^(−ν) ν^n / n!
    • If the variables are discrete, by convention one constructs the confidence belt by requiring P(x1 ≤ x ≤ x2 | θ) ≥ 1 − α
  • Example: Measuring the Higgs production cross section assuming no background
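A minimal sketch of the classical (Neyman) Poisson upper limit in ROOT: scan ν upward until the probability of observing nobs or fewer events drops to α. With no background this gives ν < 2.30 at 90% CL for nobs = 0, and corresponding limits for the nobs = 3 and nobs = 10 cases asked about earlier.

    // poisson_upper_limit.C -- run inside ROOT
    #include "TMath.h"
    #include <cstdio>

    void poisson_upper_limit()
    {
      const double alpha = 0.10;               // 90% CL
      const int nobs[] = {0, 3, 10};
      for (int n : nobs) {
        double nu = 0.0;
        // increase nu until P(k <= n | nu) falls below alpha
        while (true) {
          double cum = 0.0;
          for (int k = 0; k <= n; ++k) cum += TMath::Poisson(k, nu);
          if (cum <= alpha) break;
          nu += 0.001;
        }
        printf("n = %2d : nu < %.2f at 90%% CL\n", n, nu);   // n = 0 gives ~2.30
      }
    }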
Poisson Confidence Interval

(Plot: Poisson distribution)
Poisson Confidence Interval
  • Assume signal s and background b
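For the signal-plus-background case, ROOT ships a Feldman-Cousins implementation in the class TFeldmanCousins; a hedged usage sketch (assuming the standard interface, with nobs = 5 and b = 0.5 as placeholder inputs):

    // fc_limit.C -- run inside ROOT (TFeldmanCousins lives in libPhysics)
    #include "TFeldmanCousins.h"
    #include <cstdio>

    void fc_limit()
    {
      TFeldmanCousins fc(0.90);                       // 90% CL unified intervals
      double nobs = 5.0, b = 0.5;                     // placeholder inputs
      double sUp  = fc.CalculateUpperLimit(nobs, b);  // upper limit on the signal mean s
      double sLow = fc.GetLowerLimit();               // lower limit from the same calculation
      printf("s in [%.2f, %.2f] at 90%% CL\n", sLow, sUp);
    }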
Confidence Intervals
  • Sometimes, though, confidence intervals
    • Are empty
    • Reduce in size when the background estimate increases
    • Are smaller for a poorer experiment
    • Exclude parameters for which the experiment is insensitive
  • Example
    • We know that P(n=0 | ν=2.3) = 0.1
    • ν < 2.3 @ 90% CL
    • If the number of background events b is 3, then since ν = s + b, is the number of signal events s < −0.7 at 90% CL?
Confidence Interval
  • Experiment X uses a fit to extract the neutrino mass
    • Mν = −4 ± 2 eV
    • => P(Mν < 0 eV) = 0.98?
Confidence Interval
  • What is probability?
    • Frequentist approach
      • Developed by Venn, Fisher, Neyman, von Mises
      • The relative frequency with which something happens
      • number of successes / number of trials
        • Venn limit (n trials to infinity)
      • Assumes success appeared in the past and will occur in the future with the same probability
    • Example: statement S = “It will rain tomorrow in Tucson”, with P(S) = 0.01
      • The relative frequency with which it rains on Mondays in April is 0.01
Confidence Interval
  • What is probability?
    • Bayesian approach
      • Developed by Bayes, Laplace, Gauss, Jeffreys, de Finetti
      • The degree of belief or confidence in a statement or measurement
      • Closer to what is used in everyday life
        • Is the Standard Model correct?
      • Similar to betting odds
      • Not “scientific”?
    • Example: statement S = “It will rain tomorrow in Tucson”, with P(S) = 0.01
      • The plausibility of the statement is 0.01 (i.e. the same as if I were to draw a white ball out of a container of 100 balls, 1 of which is white)
Confidence Interval
  • Usually
    • Confidence interval == frequentist confidence interval
    • Credible interval == Bayesian posterior probability interval
      • But you’ll also hear Bayesian confidence interval
  • Probability
    • P = 1 − α
      • α = 0.05 => P = 95%
Confidence Interval
  • Suppose you wish to determine a parameter θ whose true value θt is unknown
  • Assume we make a single measurement of an observable x whose pdf P(x|θ) depends on θ
    • Recall this is the probability of obtaining x given θ
  • Say we measure x0, then we obtain P(x0|θ)
  • Frequentist
    • Makes statements about P(x|θ)
  • Bayesian
    • Makes statements about P(θt|x0)
    • P(θt|x0) = P(x0|θt) P(θt) / P(x0)
  • We’ll stick with the frequentist approach for the moment
Confidence Interval
  • (Frequentist) confidence intervals are constructed to include the true value of the parameter (θt) with a probability of 1-α
    • In fact this is true for any value of θ
  • A confidence interval [θ1,θ2] is a member of a set, such that the set has the property that P(θ ∈ [θ1,θ2]) = 1 − α
    • Perform an ensemble of experiments with fixed θ
    • The interval [θ1,θ2] will vary from experiment to experiment and will cover the fixed value θ in a fraction 1 − α of the experiments
  • Presumably when we make a measurement we are selecting it at random from the ensemble that contains the true value of θ, θt
  • Note we haven’t said anything about the probability of θt being in the interval [θ1,θ2] as a Bayesian would
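The ensemble statement above can be checked with a toy study; a sketch assuming Gaussian measurements of a known true mean: simulate many experiments, build the usual mean ± 1.96 σ/√n interval in each, and count how often the interval covers the true value.

    // coverage_toy.C -- run inside ROOT
    #include "TRandom3.h"
    #include <cstdio>
    #include <cmath>

    void coverage_toy()
    {
      const double mu_true = 3.0, sigma = 1.0;   // fixed true parameter and known resolution
      const int nmeas = 10, nexp = 20000;
      TRandom3 rng(0);
      int ncover = 0;
      for (int e = 0; e < nexp; ++e) {
        double sum = 0.0;
        for (int i = 0; i < nmeas; ++i) sum += rng.Gaus(mu_true, sigma);
        double mean = sum / nmeas;
        double half = 1.96 * sigma / std::sqrt((double)nmeas);   // 95% central interval half-width
        if (std::fabs(mean - mu_true) < half) ++ncover;          // does the interval cover mu_true?
      }
      printf("coverage = %.3f (expect ~0.95)\n", double(ncover) / nexp);
    }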
Confidence Interval
  • If P(θ ∈ [θ1,θ2]) = 1 − α is true we say the intervals “cover” θ at the stated confidence
  • If there are values of θ for which P(θ ∈ [θ1,θ2]) < 1 − α we say the intervals “undercover” for that θ
  • If there are values of θ for which P(θ ∈ [θ1,θ2]) > 1 − α we say the intervals “overcover” for that θ
  • Undercoverage is bad
  • Overcoverage is conservative
Confidence Intervals
  • Neyman’s construction
    • Consider a pdf f(x;θ) = P(x|θ)
    • For each value of θ, we construct a horizontal line segment [x1,x2] such that P(x ∈ [x1,x2] | θ) = 1 − α
    • The union of such intervals for all values of θ is called the confidence belt
Confidence Intervals
  • Neyman’s construction
    • After performing an experiment to measure x, a vertical line is drawn through the experimentally measured value x0
    • The confidence interval for θ is the set of all values of θ for which the corresponding line segment [x1,x2] is intercepted by the vertical line
Confidence Interval
  • Notes
    • The coverage condition is not unique
      • P(x<x1|θ) = P(x>x2|θ) = α/2
        • Called central confidence intervals
      • P(x<x1|θ) = α
        • Called upper confidence limits
      • P(x>x2|θ) = α
        • Called lower confidence limits
Confidence Intervals
  • These confidence intervals have a confidence level = 1 − α
  • By construction, P(θ ∈ [θ1,θ2]) ≥ 1 − α is satisfied for all θ including θt
  • Another method is to consider a test of the hypothesis that the parameter’s true value is θ
  • If the variables are discrete, by convention one constructs the confidence belt by requiring P(x1 ≤ x ≤ x2 | θ) ≥ 1 − α
Examples
  • Data consisting of a single random variable x that follows a Gaussian distribution
  • Counting experiments
Poisson Confidence Interval
  • We previously mentioned that the number of events produced in a reaction with cross section σ and fixed luminosity L follows a Poisson distribution with mean ν = σ∫L dt
    • P(n; ν) = e^(−ν) ν^n / n!
    • If the variables are discrete, by convention one constructs the confidence belt by requiring P(x1 ≤ x ≤ x2 | θ) ≥ 1 − α
  • Example: Measuring the Higgs production cross section assuming no background
Poisson Confidence Interval

(Plot: Poisson distribution)
Confidence Intervals
  • Sometimes, though, confidence intervals
    • Are empty
    • Reduce in size when the background estimate increases
    • Are smaller for a poorer experiment
    • Exclude parameters for which the experiment is insensitive
  • Example
    • We know that P(n=0 | ν=2.3) = 0.1
    • ν < 2.3 @ 90% CL
    • If the number of background events b is 3, then since ν = s + b, is the number of signal events s < −0.7 at 90% CL?