1 / 57

Quick & Simple Simulation in Excel with Clinical Trials Applications

Quick & Simple Simulation in Excel with Clinical Trials Applications. Presented to the Delaware Chapter of the American Statistical Association 20 October 2011 Dennis Sweitzer, Ph.D. www.Dennis-Sweitzer.com. Background. Occasional need for simulations Excel is convenient, but

arien
Download Presentation

Quick & Simple Simulation in Excel with Clinical Trials Applications

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Quick & SimpleSimulation in Excelwith Clinical Trials Applications Presented to the Delaware Chapter of the American Statistical Association 20 October 2011 Dennis Sweitzer, Ph.D. www.Dennis-Sweitzer.com

  2. Background • Occasional need for simulations • Excel is convenient, but • does not explicitly support simulations • Simulation usually requires VBA programming (so why not use R or SAS instead) • Or Add-in commercial programs (eg., @Risk) • Or some academic add-ins • Does have iterative calculations, Solver • Why not simulation?

  3. Simulate what? • Stochastic Models • Unknown parameters? Guestimate a distribution • Optimizing policy? Test each with simulations • Sensitivity Analysis • Variations in Inputs  Variations in Outputs • 2 parameters: use a table • >2 parameters: simulate & compare variation

  4. Excel: Pros MEGO Common Language / Common Tools • Most people understand Excel • Many tools available in Excel Transparency: Modeling assumptions can be: Specified -- Graphed -- Debated What you see is what you get!

  5. Excel Cons Slower than in SAS, S+, R, etc Lacks some statistical/probability functions • Latest versions are a little better • Still need to add some VBA code • Known bugs in statistical routines (often fixed) Tradeoffs: • Quicker modifications vs slower execution

  6. Simple Solution: Data Tables Excel Data Tables • Creates a table of values of a function • (ie, Random Variables) • Leftmost column is used as an argument • (which is ignored in a simulation) • Data Table repeats calculations for each row • (Each row is an iteration of the simulation)

  7. 1. Create Simulation Create Random Variables using Inverse Probability Method: For Random Variable X with distribution function F(x), F(x): Â→ [0,1] If Random Uniform UÎ [0,1] X = F-1(U) (Excel: U=Rand() )

  8. 2. Align Random Variables • Calculations can be anywhere in Spreadsheet • Reference the Variables in a row • Is best to label variables in same way

  9. 3. Select Data Table • Select table region • 1st row is Rand Vars • 1st column is not used (can label iterations) • From toolbar: • Data>Data Table

  10. 4. Create Simulation Table • Column input cell = Upper left hand corner of table • Row input cell = ignore • OK  Populates the table • (may have to manually recalcule)

  11. 5. Execute Simulation Iterative development • Simulation can be changed • Add reporting variables • Recalculate to rerun • (no need to use Data Table again, unless expanding) • Hint: debug with short table, expand for final run

  12. The End (of the key concepts)

  13. But still more…. • Why use inverse probability distributions (instead of random variables)? • When not to use a spreadsheet for simulation? • Tools: • Macros to set up a simulation • VBA functions for common simulation distributions

  14. Inverse Probability Function • Most systems directly generate random variables with the desired distribution • Why use Inverse Probability Functions? • Which are (probably) slower? Personal opinion • Testing & Debugging • Verification  Calculates correctly • Validation  Calculations answer Problem • Sensitivity  Input vs Output variability

  15. As Mapping function ⟼ F-1 U Probability Distribution: F(x): Â→ [0,1] Random Uniform: UÎ (0,1] Inverse PDF: X = F-1(U) For Continuous (or monotone) F-1 Small changes in u∈U  small changes inF-1 (u)

  16. Mapping 2 Random Uniform Var As input to Deterministic Function

  17. Mapping Random numbers in (should) Map to outputs in

  18. Example #1 Simple model, function of 2 RV Saving {Ui}: • Verify • Replicate • Quantify A Max value looks high. Is it a bug? If not, how often? Saved random U[0,1] For each iteration Check u∈U[0,1] That generated high value u=0.983…  random high  Rarely happens

  19. Example #1 (Sensitivity) Sort by U1, U2 Sensitive to U1 Insensitive to U2

  20. Spreadsheet limitations • Only simple data structures are available • Rows & columns, no lists & trees • Discrete event simulations • Complex algorithms: difficult • Eg, While or for loops • Can improvise (cumbersome, slow, buggy) • Speed: slow • Data Storage: what-you-see-is-all-you-get

  21. Tools: Excel Simulation Template • Adds some missing random functions • Adds some set-up macros

  22. Macro SimulateSampler To start a new simulation when you don't remember the names & parameters of common random variables used in simulation: • Run the Macro SimulationSample • Copy, delete, and edit as needed. • Make sure all random values are referenced in the first row of the data table at the bottom.

  23. Macro SimulationSampler • Creates a simulation with each of common simulation functions

  24. Macro SimulationSampler ……… • Sets up header row for data table • Sets up a place for statistics

  25. Macro Simulate • Highlight the row of random variables • (1st row of simulation table) • Run macro "Simulate” • Prompts for which will ask for the number of simulation iterations, • The default number of iterations is 100 • Debug & develop (manually recalculate) • Final run with >1000 iterations • Visual Basic code is computationally intensive,

  26. Macro Simulate

  27. Note bene • Run Simulate right after SimulationSampler • Risk of “Ref!” error • SimTemplate,Plot,Sampler contains • The sampler • A distribution plot of all random variable • Crude, but handy for quick comparisons • Ready to edit

  28. SimTemplate,Plot,Sampler • Crude Distribution plot of ALL variables • Uses Percentile Ranksto save space • Good for Continuous Var. • Bad for Discrete Var. • Copy • Delete unwanted variables • Make it pretty

  29. Excel Random Variables • Rand() --Random Uniform [0,1] • NormSInv() – Inverse Standard Normal Distribution • CriticalBinomial() – Inverse Binomial Distribution • LogNormInv() - Inverse Log Normal Distribution • Caveat: parameters are mean, SD after the Log transformation

  30. Erlang Distribution How long do you wait until you get a predetermined number of arrivals? • Interarrival times are distributed IID exponential • Erlang is Gamma with integer parameter

  31. Beta Distribution • Can use as • Distribution of a Binomial probability • Range = [0,1] • Generic bounded hump (vs Normal as generic unbounded hump)

  32. Example#2, Problem Client: “Here’s our plan….” • Simple spreadsheet calculation • But only the expected value, • but not variability

  33. Example #2, Simulation • Time to 100th patient • Patients arrive IID Exponential Summary Statistics of Simulated values (below) Interpretation: under the assumptions, 90% of simulations required more than 4.4 months

  34. Added VBA Functions Inverse Functions Needed for Simulation • Poisson, Negative Binomial Interpolation from Table • Interpolate: 1 or 2 dimensional interpolation Convenience • Beta with Mean, SD as parameters • Beta with Hi, Low, and Mode used for parameters (often used for PERT/CPM charts) • Log Normal with mean, SD as parameters

  35. Missing Statistical Functions • InvPoisson :: Poisson Distribution • InvPascal :: Integer valued Negative Binomial • (how many failures before k successes) Negative Binomial is continuous valued distribution; discrete version is often denoted Pascal distribution

  36. Example#3, Patients to Screen Expected Enrollment rate = 75% ± 5% ~ Beta Distribution # Screen Failures ~ Negative Binomial (Pascal) • Depends on Enrollment Rate

  37. Beta Distribution (2) For Convenience • Beta distribution given Mean, SD • Beta distribution given Mean, SD, upper, lower bounds • Beta distribution given Mode, Upper, Lower bounds • Sometimes used for PERT/Critical Path Analysis • 3 estimates for tasks: Optimistic, Pessimistic, Most Likely • Beta distributed time for each task • Assumes SD = 1/6 of the interval [low, high]

  38. Simulation from a Table Simulate arbitrary distribution: • Top Row: values in [0,1] • Bottom Row: Quantiles • Result: interpolated value of U from table Or a function: y=f(x) • X is found in top row, y is interpolated from bottom row

  39. Table Simulation Uses • Polygonal distributions (like Triangular) • Survival curve (for time to event) • Est. K-M curve from data, simulate rest of trial • Arbitrary empirical distributions • Distribution from observations

  40. Simulation from a 2-dimensional table • Here: • Rows are quartiles of a random function • Left column is value of a parameter • A family of distributions which vary with the parameter • Parameter y=75% (can be random) • Generate random numbers from the interpolated distribution.

  41. Example #4: Interim Review • After 2 months, review randomization rates • Continue to Randomize to 100 patients • How long?

  42. Example#4: Interim Review (Simulation) Y= # Patients at 2 mos ~ Poisson Time to Randomize (100-Y) additional pts ~ Erlang (Gamma) 80% CI:; (2.5, 3.7) months

  43. Clinical Trials Applications • Simulations for planning • Prototyping larger simulation • Checking assumptions/validation

  44. Why Simulate? Expected Trial Performance • Usually not of interest -- already done w/o simulation Variability of Trial Performance • Important for Risk Management: “What’s the earliest, the latest, the most, the least, etc” • 80% CIs Structural Problems • Interactions of parameters may doom the trial before it even starts! (eg, mean (max{ X, Y} ) vs max{ mean(X), mean(Y) } )

  45. Prototyping Prototyping: • Toy simulation with hands-on teamwork • Development model • Get team buy-in on assumptions • Processing speed not important • Rapid modifications are important Ideal? • Develop a prototype in an 1 hour meeting • Check for errors later • Run large simulations later for precise estimates

  46. Checking planning assumptions • H0 = Simulation assumptions • Observed: a value X • {xi} = corresponding values in simulation • Rank of X in {xi} ≈ p-value Stored Values: Use Function Percent Rank Descriptive Statistics: Use Frequency Count Use to: • Test assumptions, validate model, +?? • If an observed value of X is rare in the simulation, question assumptions!

  47. Checking Assumptions (2) Example: • A trial is designed based on a non-trivial simulation. • The model predicts a completion rate of 65% with 95% C.I.= (55%, 75%) • 4 months into the trial, a 50% completion rate is observed. • How significant is this discrepancy? Resimulate: • {xi} = simulated completion rates (1/iteration) • Rank of observed 50% in {xi} ≈ p-value • “How likely is the observation, under the modeled assumptions?”

  48. Example #5: Simulating a 30 patient trial • Each patient is a random vari • Each patient is a random variable • Survival times are interpolated • Estimated survival curves have confidence intervals • All 30 patients in an iteration use the same random conf.level • Conf. Level is updated each iteration

  49. Example #5: Testing Assumptions Statistics on the patients are the simulation random variables • Assume the trial was carried out • 70% of patients complete Q: is this consistent with the simulations? A: Yes, but… Only 6.1% of simulations had >70% completion

  50. Macro Management VBA Editor: Alt-F11 (or find the menu) • Some versions of Excel • Copy Module between sheets • Copy code from .xls sheet & insert into VBA editor • Open & save as new sheet

More Related