
Charles Plager UCLA LJ+MET Meeting March 23, 2008








1. “Throwing PEs” and More
Charles Plager, UCLA
LJ+MET Meeting, March 23, 2008
LJ+MET, March 23, 2009

2. “PE? What’s a PE?”
• A pseudo-experiment (PE) is the process of generating a set of random numbers that simulates the quantities needed to perform an analysis.
• This can be as simple as generating a number of events for a counting experiment.
• It can be as complicated as picking a subset of “events” needed to perform correlated analyses.
• In Charles-speak:
  • PE) one pseudo-experiment.
  • PEs) many pseudo-experiments.
  • Throw PEs) generate many pseudo-experiments.
“Why throw PEs?”
• Throwing and “fitting” PEs is a very useful way of:
  • debugging an analysis,
  • checking correlations between analyses,
  • measuring the sensitivity of a given analysis.

3. Simple Counting Experiment PE
• You expect Nback background events and Nsig signal events.
• For each PE, your number of observed events is:
  nObs = random.Poisson(Nback) + random.Poisson(Nsig)
       = random.Poisson(Nback + Nsig)
• Your analysis is a simple counting experiment with no systematic uncertainties.
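The counting-experiment PE above can be sketched in plain Python (a hedged stand-in, not the slides' ROOT code: the `poisson` helper uses Knuth's multiplicative method since the standard library has no Poisson generator, and all names are illustrative):

```python
# Minimal counting-experiment PE sketch (plain Python, illustrative names).
import math
import random

def poisson(lam, rng=random):
    """Draw one Poisson-distributed count (Knuth's multiplicative method)."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def throw_counting_pe(n_back, n_sig, rng=random):
    """One PE: fluctuate background and signal expectations independently.
    Equivalently Poisson(n_back + n_sig), since sums of Poissons are Poisson."""
    return poisson(n_back, rng) + poisson(n_sig, rng)

random.seed(1)
draws = [throw_counting_pe(5.0, 10.0) for _ in range(20000)]
print(sum(draws) / len(draws))  # close to the total expectation, 15
```

Averaged over many PEs, the observed count reproduces the total expectation, while individual PEs fluctuate with Poisson width.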

4. PEs from Templates
• Same idea as “Simple Counting PEs,” except that we can’t merge signal and background:
  • NobsSig = random.Poisson(Nsig)
  • NobsBack = random.Poisson(Nback)
• For each signal event, a value is drawn from the signal template (and likewise for each background event from the background template).
• Fit the resulting PE as if it were data.
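A template PE along these lines might look as follows in plain Python. This is a hedged sketch: the `(bin_centers, bin_contents)` template representation, the toy templates, and all names are illustrative assumptions, not the author's code.

```python
# Template-PE sketch: fluctuate each component's yield, then draw each
# event's value from that component's binned template (illustrative names).
import math
import random

def poisson(lam, rng=random):
    """Knuth's method; stands in for the slide's random.Poisson."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def throw_template_pe(sig_template, back_template, n_sig, n_back, rng=random):
    """Templates are (bin_centers, bin_contents) pairs; returns the PE's values."""
    values = []
    for (centers, contents), mean in ((sig_template, n_sig),
                                      (back_template, n_back)):
        n_obs = poisson(mean, rng)  # fluctuate this component's yield
        values.extend(rng.choices(centers, weights=contents, k=n_obs))
    rng.shuffle(values)  # the fit should not know which events are signal
    return values

sig = ([1.0, 2.0, 3.0], [1, 4, 1])   # toy signal template
back = ([1.0, 2.0, 3.0], [5, 2, 1])  # toy background template
random.seed(2)
pe_values = throw_template_pe(sig, back, n_sig=8.0, n_back=4.0)
```

The resulting list of values is then fit exactly as data would be.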

5. Aside: PEs with a Fixed Number of Events
• It may sometimes happen that one wants to generate PEs with a fixed number of events, Ntotal.
  • E.g., one wants to match the total number of events in a particular data sample.
• If you are assuming, for example, a small background Nback and a large signal Nsig, you cannot simply:
  • Poisson fluctuate the background, NobsBack = Poisson(Nback), and then
  • set NobsSig = Ntotal − NobsBack.
• You need to fluctuate both:
  • NobsBack = Poisson(Nback),
  • NobsSig = Poisson(Nsig),
  • and keep the PE only if NobsBack + NobsSig == Ntotal.
• Note: if NobsBack + NobsSig > Ntotal, you can instead:
  • pick a random event (signal or background) to get rid of, and
  • repeat until NobsBack + NobsSig == Ntotal.
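The keep-only-if-it-matches recipe can be sketched as a simple rejection loop (plain Python; the function name, `max_tries` guard, and toy numbers are illustrative assumptions):

```python
# Fixed-total PE sketch: fluctuate both components and reject PEs whose
# total does not equal n_total (illustrative names).
import math
import random

def poisson(lam, rng=random):
    """Knuth's multiplicative method."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def throw_fixed_total_pe(n_back, n_sig, n_total, rng=random, max_tries=200000):
    """Fluctuate both expectations; keep only PEs with exactly n_total events."""
    for _ in range(max_tries):
        obs_back = poisson(n_back, rng)
        obs_sig = poisson(n_sig, rng)
        if obs_back + obs_sig == n_total:
            return obs_back, obs_sig
    raise RuntimeError("no accepted PE; are n_back + n_sig and n_total compatible?")

random.seed(3)
obs_back, obs_sig = throw_fixed_total_pe(n_back=3.0, n_sig=12.0, n_total=15)
print(obs_back + obs_sig)  # always 15, by construction
```

Note the rejection can be slow if n_total sits far from n_back + n_sig, which is why the slide's discard-and-repeat variant can be attractive.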

6. PEs with Systematic Uncertainties
• “Blur, then throw”:
• You have a whole bunch of estimates described by Gaussian uncertainties: x ± σx.
• For each PE, generate a value for each estimate from the corresponding Gaussian distribution.
• Make sure all estimates are physical; if not, repeat the previous step.
• Given this set of estimates, generate the PE as in the previous steps.
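"Blur, then throw" can be sketched as follows (plain Python; the non-negativity cut as the "physical" requirement and all names are illustrative assumptions):

```python
# "Blur, then throw" sketch: smear the expectations by their Gaussian
# uncertainties, re-drawing until physical, then throw a counting PE.
import math
import random

def poisson(lam, rng=random):
    """Knuth's multiplicative method."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def blur_then_throw(n_back, sigma_back, n_sig, sigma_sig, rng=random):
    """One PE with systematics: blur each estimate, then fluctuate the counts."""
    while True:
        b = rng.gauss(n_back, sigma_back)
        s = rng.gauss(n_sig, sigma_sig)
        if b >= 0.0 and s >= 0.0:  # "make sure all estimates are physical"
            break                  # otherwise re-blur and try again
    return poisson(b, rng) + poisson(s, rng)

random.seed(4)
draws = [blur_then_throw(5.0, 1.0, 10.0, 2.0) for _ in range(5000)]
```

The blur step widens the PE distribution beyond the pure Poisson width, which is exactly how the systematic uncertainty enters the ensemble.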

7. Correlated PEs
• “We have two analyses that use overlapping data samples. What do we do?”
• Instead of picking values from a template, we pick whole “events” from MC.
• All analyses then take what they need from each event.
• Using weighted MC is more complicated here:
  • pick events,
  • generate a random number to see if each event is kept,
  • repeat until the total number of events is reached in the sub-sample.
• Important: reweighting is a very powerful technique, and throwing PEs with it is a solved problem (see me).
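One possible reading of the pick/keep/repeat recipe for weighted MC is an accept-reject loop like the sketch below (plain Python; drawing without replacement, the weight/max-weight acceptance probability, and all names are illustrative assumptions, not the author's prescription):

```python
# Accept/reject picking of weighted MC "events" (illustrative sketch).
# Each analysis then reads what it needs from the same kept events,
# which is what preserves correlations between the analyses.
import random

def pick_weighted_events(events, weights, n_keep, rng=random):
    """Accept each candidate with probability weight / max_weight until the
    sub-sample holds n_keep events (drawn without replacement here)."""
    w_max = max(weights)
    pool = list(range(len(events)))  # indices of events not yet kept
    kept = []
    while len(kept) < n_keep:
        j = rng.randrange(len(pool))
        i = pool[j]
        if rng.random() < weights[i] / w_max:
            kept.append(events[i])
            pool.pop(j)
    return kept

random.seed(5)
mc_events = ["ev%d" % i for i in range(100)]
mc_weights = [0.5 + 0.5 * random.random() for _ in mc_events]
subsample = pick_weighted_events(mc_events, mc_weights, n_keep=20)
print(len(subsample))  # 20
```

Dividing by the maximum weight keeps every acceptance probability in [0, 1] while preserving the relative weights.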

8. Your Analysis Checkup
• “I’ve now thrown many PEs. What do I look at?”
• For each PE, take the results and treat them as you would data:
  • fit, etc.
• Look at different distributions:
  • uncertainties ⇒ sensitivity,
  • pull distributions (more later),
  • looking for coverage, bias, etc.
• Hint: store PE results in a TTree; it makes debugging problems much easier.

9. Pull Distributions
• A pull distribution is a very useful way to make sure your analysis machinery is doing what it is supposed to.
• It checks for biases as well as under- and over-coverage.
• For each PE, calculate:
  pull = (measured value − true value) / (measured uncertainty)
• The resulting distribution should be a unit Gaussian.
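The pull check can be sketched with a toy ensemble of Gaussian "measurements" (plain Python; the toy setup, true value, and names are illustrative assumptions):

```python
# Pull-distribution sketch: for an unbiased analysis with correct
# uncertainties, the pulls have mean ~ 0 and width ~ 1 (unit Gaussian).
import math
import random

def pull(measured, true, sigma):
    """pull = (measured - true) / measured uncertainty."""
    return (measured - true) / sigma

random.seed(1)
true_value, sigma = 10.0, 2.0
pulls = [pull(random.gauss(true_value, sigma), true_value, sigma)
         for _ in range(20000)]
mean = sum(pulls) / len(pulls)
width = math.sqrt(sum((p - mean) ** 2 for p in pulls) / len(pulls))
```

A shifted mean signals a bias; a width above (below) 1 signals under-coverage (over-coverage) of the quoted uncertainty.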

10. More with Pull Distributions
• If you measure asymmetric uncertainties, simply use the “right” one:
  • upper: measurement < true value,
  • lower: measurement > true value.
• If the parameter of interest has a Gaussian constraint in your fit, it is important to throw the PEs using the same distribution ⇒ throw consistently.
• Remember: checking the pull distribution is a necessary, but not sufficient, check.

11. Hints
• Small statistics can sometimes “mess with” pull distributions.
• If you aren’t already, try varying the true value of the variable when throwing the PEs:
  • e.g., when throwing pull distributions for top-pair cross sections, vary the theoretical cross section using the theory errors.
• Root lets you generate random numbers from a histogram:
  ⇒ value = myHistPtr->GetRandom()

12. Root and Random Number Generators
• For all modern versions of Root (≥ 5.18), gRandom is an instance of TRandom3.
  • This is a decent random number generator.
• If you are using an older version of Root, this may not be the default. Change it:
  delete gRandom;
  gRandom = new TRandom3;
• If you want to run PE jobs in parallel, make sure you set the jobs to have different random seeds:
  gRandom->SetSeed(seed); where seed is an integer not equal to 0.
• If you don’t set different seeds, your different jobs will not generate different PEs.
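The same seeding discipline looks like this outside ROOT (plain Python sketch; the `run_pe_job` name and the job-number-based seed scheme are hypothetical illustrations):

```python
# Parallel-job seeding sketch: each job gets its own nonzero seed derived
# from its job number, so different jobs throw different PEs while any
# single job stays reproducible.
import random

def run_pe_job(job_id, n_pes):
    """Each job owns a private generator seeded from its job number."""
    rng = random.Random(1000 + job_id)  # distinct, nonzero seed per job
    return [rng.gauss(0.0, 1.0) for _ in range(n_pes)]

print(run_pe_job(1, 3) != run_pe_job(2, 3))  # True: different seeds, different PEs
```

Re-running a job with the same job_id reproduces its PEs exactly, which is handy when chasing down a pathological pseudo-experiment.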

13. Warning
• The concept of ensembles (e.g., PEs) is at the heart of frequentist statistical thought.
• When I talk about how to deal with systematic uncertainty, I will effectively be talking about integrating over systematic priors ⇒ Bayesian statistics.
• Most of this will be a (reasonable) mix of the two.
• These methods will be sufficient for almost all of your needs, but they will not always lead to proper frequentist coverage.

14. Summary
• Throwing PEs is fun, easy, and important.
• Weights and reweighting are cool and powerful.
• Pull distributions should be standard operating procedure.
• Using these simple ideas can really pay off in the end!
