Power (and Precision): Effective Research Design Planning for Grant Proposals & More

Walt Stroup, Ph.D., Professor & Chair, Department of Statistics, University of Nebraska, Lincoln


  1. Power (and Precision): Effective Research Design Planning for Grant Proposals & More. Walt Stroup, Ph.D., Professor & Chair, Department of Statistics, University of Nebraska, Lincoln. SSP Core Facility

  2. Outline for Talk
     • What is “Power Analysis”? Why should I do it?
     • Essential Background
     • A Word about Software
     • Decisions that Affect Power – several examples
     • Latest Thinking
     • Final Thoughts

  3. Power and Precision Defined
     • Precision, a.k.a. “Margin of Error”
       • In most cases, the standard error of the relevant estimate
     • Power
       • Prob { reject H0 given H0 false }
       • Prob { research hypothesis statistically significant }
     • Power analysis
       • essentially, “If I do the study this way, power = ?”
     • Sample size estimation
       • How many observations are required to achieve a given power?

  4. What’s involved in Power Analysis
     • WHAT IT’S NOT:
       • “Painting by numbers...”
     • IF IT’S DONE RIGHT, power analysis should be
       • a comprehensive conversation to plan the study
       • a “dress rehearsal” for the statistical analysis once the data are collected

  5. Why do a Power Analysis?
     • For NIH Grant Proposal
       • because it’s required
     • For many other grant proposals
       • because it gives you a competitive edge
     • Other reasons
       • practical: increases chance of success; reduces “we don’t have time to do it right, but lots of time to do it over” syndrome
       • ethical

  6. Ethical???
     • Last Ph.D. in U.S. Senate
     • Irritant to doctrinaire left and right
     • Keynote address to 1997 American Stat. Assoc.
     “... we can continue to make policy based on ‘data-free ideology’ or we can inform policy where possible by competent inquiry...”
     late U.S. Senator Daniel Patrick Moynihan

  7. Ethical
     • Results of your study may affect policy
     • Well-conceived research means
       • better information
       • greater chance of sound decisions
     • Poorly-conceived research
       • lost opportunity
       • deprives policy-makers of information that might have been useful
       • or worse: bad information misinforms or misleads the public

  8. What affects Power & Precision?
     • A short statistics lesson
       • What goes into computing test statistics
       • What test statistics are supposed to tell us
       • A bit about the distribution of test statistics
       • Central and non-central t, F, and chi-square (mostly F)

  9. What goes into a test statistic?
     • Research hypothesis – motivation for study
     • Assumed not true unless data show compelling evidence otherwise
     • Research hypothesis: HA; opposite: H0

  10. What goes into a test statistic?
     • Visualize using F
       • But same basic principles for t, chi-square, etc.
     • F is the ratio of variation attributable to the factor under study vs. variation attributable to noise
     • Ingredients labeled on the slide’s formula: N of obs; effect size; variance of noise (i.e. among obs)

  11. When H0 True – i.e. no trt effect

  12. When H0 false (i.e. Research HA true)

  13. What affects Power?
     • N of obs
     • effect size
     • variance of noise (i.e. among obs)
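
The three ingredients on this slide can be tied together in one expression. As a sketch only, assuming a balanced one-way layout with t treatments and n observations per treatment (the symbols n, t, μ, σ², ν and λ are notation chosen here, not taken from the slide), the noncentrality parameter of the F test and the resulting power are:

```latex
\lambda \;=\; \frac{n \sum_{i=1}^{t} (\mu_i - \bar{\mu})^2}{\sigma^2},
\qquad
\text{Power} \;=\; \Pr\{\, F(t-1,\ \nu,\ \lambda) > F_{\text{crit}} \,\}
```

Here n is the N of obs per treatment, the differences μ_i − μ̄ carry the effect size, and σ² is the variance of noise. More observations or a larger effect raise λ and hence power; more noise lowers it. For the CRD example later in the talk (n = 5, means 100, 94, 90, σ² = 100) this gives λ ≈ 2.533, the value reported for the trt test there.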

  14. What should be in a conversation about Power?
     • Effect size: what is the minimum that matters?
     • Variance: how much “noise” in the response variable (range? distribution? count? pct?)
     • Practical Constraints
     • Design: same N can produce varying Power
     • (Slide graphic repeats the three ingredients: N of obs; effect size; variance of noise, i.e. among obs)

  15. About Software (part I)
     • Canned Software
       • lots of it
       • Xiang and Zhou working on report
       • “painting by numbers”
     • Simulation
       • most accurate; not constrained by canned scenarios
       • you can see what will happen if you actually do this...
     • “Exemplary data set” + modeling software
       • nearly as accurate as simulation
       • “dress rehearsal” for actual analysis
       • MIXED, GLIMMIX, NLMIXED: if you can model it you can do power analysis

  16. Design Decisions – Some Examples
     • Main Idea: For the same amount of effort, or $$$, or # observations, power and precision can be quite different
     • Power analysis objective: Work smarter, not harder
     • Simple example – design of regression study
       • From STAT 412 exercise

  17. Treatment Design Exercise
     • Class was asked to predict Bounce Height of a basketball from Drop Height and to see if the relationship changes depending on floor surface
     • Decision: What drop heights to use???

  18. Objectives and Operating Definitions
     • Recall objective: does the drop height : bounce height relationship change with floor surface?
     • Operating definition (shown as a formula on the slide; not captured in the transcript)

  19. Consequences of Drop Height Decisions
     • Should we use fewer drop heights & more obs per drop height, or vice versa?
     • Table from Stat 412 Avery archive (table not captured in the transcript)
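
One way to explore this question before collecting data is to compare how precisely each candidate design would estimate the slope. The sketch below assumes a straight-line relationship and constant noise variance, and it uses hypothetical drop heights, since the slide's table is not in the transcript. Under those assumptions the slope's standard error is the noise SD times 1/sqrt(Sxx), where Sxx is the sum of squared deviations of the drop heights from their mean.

```python
import math

def slope_se_factor(drop_heights):
    """Return sqrt(1/Sxx): the slope's standard error per unit of noise SD."""
    mean = sum(drop_heights) / len(drop_heights)
    sxx = sum((x - mean) ** 2 for x in drop_heights)
    return math.sqrt(1.0 / sxx)

# Hypothetical candidate designs, each with 12 drops (heights in feet).
designs = {
    "2 heights x 6 drops": [2.0] * 6 + [6.0] * 6,
    "3 heights x 4 drops": [2.0] * 4 + [4.0] * 4 + [6.0] * 4,
    "6 heights x 2 drops": [h for h in (2.0, 2.8, 3.6, 4.4, 5.2, 6.0)
                            for _ in range(2)],
}

se_factors = {name: slope_se_factor(x) for name, x in designs.items()}
for name, factor in se_factors.items():
    print(f"{name}: SE(slope) = {factor:.3f} x sigma")
```

With the same 12 drops, concentrating observations at the two extreme heights gives the smallest slope standard error, but it leaves no way to check whether the relationship really is a straight line. Adding intermediate heights trades some precision for that check.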

  20. Simulation
     • CRD example: 3 treatments, 5 reps / treatment
     • Suspected effect size: 6-10% relative to control, whose mean is known to be ~ 100
     • Standard deviation: 10 considered “reasonable”
     • Simulate 1000 experiments
     • Reject H0: equal trt means 228 times
       • power = 0.228 at alpha=0.05
     • Ctl mean ranked correctly 820 times
       • (intermediate mean ranked correctly 589 times)
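
The simulation described on this slide can be sketched outside SAS as well. The version below is a minimal illustration, not the talk's own program: it assumes normally distributed errors, uses the treatment means 100, 94 and 90 from the talk's later SAS example, and takes the 5% critical value of F(2, 12) from the table shown there. The function names and the seed are arbitrary choices.

```python
import random

def f_statistic(groups):
    """One-way ANOVA F statistic for a list of equally sized samples."""
    k = len(groups)
    n = len(groups[0])
    means = [sum(g) / n for g in groups]
    grand = sum(means) / k
    ms_trt = n * sum((m - grand) ** 2 for m in means) / (k - 1)
    ss_err = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g)
    return ms_trt / (ss_err / (k * (n - 1)))

def simulate_power(mus, sd, reps, f_crit, n_sim, seed=1):
    """Fraction of simulated experiments in which H0: equal means is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sim):
        groups = [[rng.gauss(mu, sd) for _ in range(reps)] for mu in mus]
        if f_statistic(groups) > f_crit:
            rejections += 1
    return rejections / n_sim

F_CRIT = 3.88529  # 5% critical value of F(2, 12), as reported in the talk
power = simulate_power([100, 94, 90], sd=10, reps=5, f_crit=F_CRIT, n_sim=20000)
print(f"simulated power = {power:.3f}")
```

Using more simulated experiments than the slide's 1000 only reduces Monte Carlo noise; the estimate should land near the 0.22 to 0.23 range quoted in the talk.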

  21. “Exemplary Data”
     • Many software packages for power & sample size
       • e.g. SAS PROC POWER
       • for FIXED effect models only
     • “Exemplary Data” more general
       • Especially (but not only) when “Mixed Model Issues”
         • random effects
         • split-plot structure
         • errors potentially correlated: longitudinal or spatial data
         • any other non-standard model structure
     • Methods use PROC MIXED or GLIMMIX
       • adapted from Stroup (2002, JABES)
       • Chapter 12, SAS for Mixed Models (Littell et al., 2006)

  22. “Exemplary Data” - Computing Power using SAS
     • create a data set like the proposed design
     • run PROC GLIMMIX (or MIXED) with the variance fixed
     • noncentrality parameter λ = (F computed by GLIMMIX) × rank(K) [or chi-sq with GLM]
     • use GLIMMIX to compute λ
     • critical F (Fcrit) is the value s.t. P{F(rank(K), υ, 0) > Fcrit} = α [or chi-square]
     • Power = P{F[rank(K), υ, λ] > Fcrit}
     • SAS functions can compute Fcrit & Power

  23. Compute Power with GLIMMIX – CRD example

         /* step 1 - create data set with same structure as proposed design;
            use MU (expected mean) instead of observed Y_ij values */
         /* this example shows power for 5, 10, and 15 e.u. per trt */
         data crdpwrx1;
           input trt mu;
           do n=5 to 15 by 5;
             do eu=1 to n;
               output;
             end;
           end;
         cards;
         1 100
         2 94
         3 90
         ;

  24. Compute Power with GLIMMIX – CRD example

         /* step 2 - use PROC GLIMMIX to compute non-centrality parameters
            for ANOVA tests & contrasts;
            ODS statements output them to new data sets */
         proc sort data=crdpwrx1;
           by n;
         proc glimmix data=crdpwrx1;
           by n;
           class trt;
           model mu=trt;
           parms (100)/hold=1;   /* hold the residual variance at 100 (SD = 10) */
           contrast 'et1 v et2' trt 0 1 -1;
           contrast 'c vs et' trt 2 -1 -1;
           ods output tests3=b;
           ods output contrasts=c;
         run;

  25.
         /* step 3: combine ANOVA & contrast n-c parameter data sets;
            use SAS functions PROBF and FINV to compute power */
         data power;
           set b c;
           alpha=0.05;
           ncparm=numdf*fvalue;
           fcrit=finv(1-alpha,numdf,dendf,0);
           power=1-probf(fcrit,numdf,dendf,ncparm);
         proc print;
         run;

     Note the close agreement of simulated power (0.228) and “exemplary data” power (0.224).

         Obs  Effect  Label      DF  DenDF  alpha  nc       fcrit    power
         1    trt                2   12     0.05   2.53333  3.88529  0.22361
         2            et1 v et2  1   12     0.05   0.40000  4.74723  0.08980
         3            c vs et    1   12     0.05   2.13333  4.74723  0.26978
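
The FINV and PROBF step can be cross-checked without SAS. The sketch below is an illustration under a stated restriction rather than a general-purpose routine: it writes the noncentral F distribution as a Poisson mixture of central F distributions and uses a finite-sum form of the regularized incomplete beta function that is valid only when the denominator df is even (12 here). All function names are hypothetical, and the noncentrality values are copied from the table above.

```python
import math

def inc_beta_int_b(x, a, b):
    """Regularized incomplete beta I_x(a, b) for a positive integer b."""
    total, term = 0.0, 1.0
    for k in range(b):
        total += term
        term *= (a + k) / (k + 1) * (1.0 - x)
    return x ** a * total

def f_sf(f, numdf, dendf, nc=0.0, terms=200):
    """P{F(numdf, dendf, nc) > f}; this shortcut needs an even dendf."""
    if dendf % 2:
        raise ValueError("this sketch only handles even denominator df")
    x = numdf * f / (numdf * f + dendf)
    weight = math.exp(-nc / 2.0)  # Poisson(nc/2) probability of j = 0
    cdf = 0.0
    for j in range(terms):
        cdf += weight * inc_beta_int_b(x, numdf / 2.0 + j, dendf // 2)
        weight *= (nc / 2.0) / (j + 1)
    return 1.0 - cdf

def f_crit(alpha, numdf, dendf):
    """Bisection for the value whose central upper-tail probability is alpha."""
    lo, hi = 0.0, 1000.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if f_sf(mid, numdf, dendf) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def power(numdf, dendf, nc, alpha=0.05):
    """Power of the F test for a given noncentrality parameter."""
    return f_sf(f_crit(alpha, numdf, dendf), numdf, dendf, nc)

# Rows of the n = 5 table: (label, numdf, dendf, noncentrality)
for label, numdf, dendf, nc in [("trt", 2, 12, 2.53333),
                                ("et1 v et2", 1, 12, 0.40000),
                                ("c vs et", 1, 12, 2.13333)]:
    print(f"{label:10s} fcrit={f_crit(0.05, numdf, dendf):.5f} "
          f"power={power(numdf, dendf, nc):.5f}")
```

The printed fcrit and power columns should agree with the SAS table up to rounding. For odd denominator df a general incomplete beta routine would be needed in place of the finite sum.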

  26. More Advanced Example
     • Plots in 8 x 3 grid
     • Main variation along 8 “rows”
     • 3 x 2 treatment design
     • Alternative designs
       • randomized complete block (4 blocks, size 6)
       • incomplete block (8 blocks, size 3)
       • split plot
     • RCBD “easy” but ignores natural variation

  27. Picture the 8 x 3 Grid
     • Gradient (figure label)
     • e.g. 8 schools, gradient is “SES”, 3 classrooms each

  28. SAS Programs to Compare 8 x 3 Design

         data a;
           input bloc trtmnt @@;
           do s_plot=1 to 3;
             input dose @@;
             mu=trtmnt*(0*(dose=1)+4*(dose=2)+8*(dose=3));
             output;
           end;
         cards;
         1 1 1 2 3
         1 2 1 2 3
         2 1 1 2 3
         2 2 1 2 3
         3 1 1 2 3
         3 2 1 2 3
         4 1 1 2 3
         4 2 1 2 3
         ;

     Split-Plot

         proc glimmix data=a noprofile;
           class bloc trtmnt dose;
           model mu=bloc trtmnt|dose;
           random trtmnt/subject=bloc;
           parms (4) (6) / hold=1,2;
           lsmeans trtmnt*dose / diff;
           contrast 'trt x lin' trtmnt*dose 1 0 -1 -1 0 1;
           ods output diffs=b;
           ods output contrasts=c;
         run;

  29. 8 x 3 – Incomplete Block

         data a;
           input bloc @@;
           do eu=1 to 3;
             input trtmnt dose @@;
             mu=trtmnt*(0*(dose=1)+4*(dose=2)+8*(dose=3));
             output;
           end;
         cards;
         1 1 1 1 2 1 3
         2 1 1 1 2 2 2
         3 1 1 1 3 2 3
         4 1 1 2 1 2 2
         5 1 2 1 3 2 2
         6 1 2 2 1 2 3
         7 1 3 2 1 2 3
         8 2 1 2 2 2 3
         ;
         proc glimmix data=a noprofile;
           class bloc trtmnt dose;
           model mu=trtmnt|dose;
           random intercept / subject=bloc;
           parms (4) (6) / hold=1,2;
           lsmeans trtmnt*dose / diff;
           contrast 'trt x lin' trtmnt*dose 1 0 -1 -1 0 1;
           ods output diffs=b;
           ods output contrasts=c;
         run;

  30. 8 x 3 Example - RCBD

         data a;
           input trtmnt dose @@;
           do bloc=1 to 4;
             mu=trtmnt*(0*(dose=1)+4*(dose=2)+8*(dose=3));
             output;
           end;
         cards;
         1 1 1 2 1 3 2 1 2 2 2 3
         ;
         proc glimmix data=a noprofile;
           class bloc trtmnt dose;
           model mu=bloc trtmnt|dose;
           parms (10) / hold=1;
           lsmeans trtmnt*dose / diff;
           contrast 'trt x lin' trtmnt*dose 1 0 -1 -1 0 1;
           ods output diffs=b;
           ods output contrasts=c;
         run;

  31. How did designs compare?
     • Suppose the main objective is to compare regression over 3 levels of dose: do they differ by treatment? (similar to basketball experiment)
     • Operating definition is thus H0: dose regression coefficients equal
     • Power for Randomized Block: 0.66
     • Power for Incomplete Block: 0.85
     • Power for Split-Plot: 0.85
     • Same # observations – you can work smarter

  32. But what if I don’t know Trt Effect Size or Variance?
     • “How can I do a power analysis? If I knew the effect size and the variance I wouldn’t have to do the study.”
     • What trt effect size is NOT: it is NOT the effect size you are going to observe
     • It is somewhere between
       • what current knowledge suggests is a reasonable expectation
       • the minimum difference that would be considered “important” or “meaningful”

  33. And Variance??
     • Know thy relevant background / Do thy homework
     • Literature search: what have others working with similar subjects reported as variance?
     • Pilot study
     • Educated guess (these give a standard deviation; square it for a variance)
       • range you’d expect 95% of likely obs to fall in? divide it by 4
       • most extreme values you can plausibly imagine? divide that range by 6
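
The two rules of thumb rest on the normal distribution: about 95% of observations fall within 2 standard deviations of the mean, and nearly all fall within 3. A minimal sketch, using hypothetical ranges for a response whose mean is about 100 (the function name and the numbers are illustrative only):

```python
def sd_from_range(low, high, kind="95%"):
    """Rough standard deviation guess from a plausible range of observations."""
    divisor = {"95%": 4.0, "extreme": 6.0}[kind]
    return (high - low) / divisor

# Hypothetical ranges for a response whose mean is about 100.
sd_95 = sd_from_range(80, 120, kind="95%")            # middle 95% of observations
sd_extreme = sd_from_range(70, 130, kind="extreme")   # most extreme plausible values
variance_guess = sd_95 ** 2                           # what a PARMS statement would hold
print(sd_95, sd_extreme, variance_guess)
```

Both hypothetical ranges point to a standard deviation of 10, that is a variance of 100. When the two rules disagree noticeably, running the power analysis at both values shows how sensitive the plan is to the guess.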

  34. Hierarchical Linear Models
     • From Bovaird (10-27-2006) seminar
       • 2 treatments
       • 20 classrooms / trt
       • 25 students / classroom
       • 4 years
       • reasonable ideas of classroom(trt), student(classroom*trt), and within-student variances, as well as effect size
     • Implement via exemplary data + GLIMMIX

  35. Categorical Data?
     • Example: Binary data
       • “Standard” has success probability of 0.25
       • “New & Improved” hopes to increase it to 0.30
     • Have N subjects at each of L locations
     • For sake of argument, suppose we have
       • 900 subjects / location
       • 10 locations

  36. Power for GLMs
     • 2 treatments
     • P{favorable outcome}
       • for trt 1 p=0.30; for trt 2 p=0.25
     • power if n1=300; n2=600

         /* exemplary data */
         data a;
           input trt y n;
         datalines;
         1 90 300
         2 150 600
         ;
         proc glimmix;
           class trt;
           model y/n=trt / chisq;
           ods output tests3=pwr;
         run;
         data power;
           set pwr;
           alpha=0.05;
           ncparm=numdf*chisq;
           crit=cinv(1-alpha,numdf,0);
           power=1-probchi(crit,numdf,ncparm);
         proc print;
         run;
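
For this two-group case the calculation can be approximated by hand. The sketch below assumes that the chi-square GLIMMIX reports is the Wald statistic for the log odds ratio evaluated at the exemplary proportions, and it uses the fact that a 1-df noncentral chi-square is the square of a shifted standard normal. The function names are hypothetical.

```python
import math
from statistics import NormalDist

def wald_chisq_logit(p1, n1, p2, n2):
    """Wald chi-square (1 df) for the log odds ratio of two binomial groups."""
    log_or = math.log(p1 / (1 - p1)) - math.log(p2 / (1 - p2))
    var = 1 / (n1 * p1 * (1 - p1)) + 1 / (n2 * p2 * (1 - p2))
    return log_or ** 2 / var

def chisq1_power(ncparm, alpha=0.05):
    """P{noncentral chi-square(1 df, ncparm) > central critical value}."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)  # square root of the 1-df critical value
    shift = math.sqrt(ncparm)
    return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)

ncparm = wald_chisq_logit(0.30, 300, 0.25, 600)
power = chisq1_power(ncparm)
print(f"ncparm = {ncparm:.3f}, power = {power:.3f}")
```

Under these assumptions the noncentrality comes out near 2.55 and the power near 0.36. The slide does not report the SAS result for this case, so treat the figure as a plausibility check rather than a reproduction.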

  37. Power for GLMM
     • Same trt and sample size per location as before
     • 10 locations
     • Var(Location)=0.25; Var(Trt*Loc)=0.125
       • Variance Components: variation in log(OddsRatio)
     • Power?

         data a;
           input trt y n;
           do loc=1 to 10;
             output;
           end;
         datalines;
         1 90 300
         2 150 600
         ;
         proc glimmix data=a initglm;
           class trt loc;
           model y/n = trt / oddsratio;
           random intercept trt / subject=loc;
           random _residual_;
           parms (0.25) (0.125) (1) / hold=1,2,3;
           ods output tests3=pwr;
         run;

  38. GLMM Power Analysis Results
     • Gives you expected Conf Limits for # Locations & N / Loc contemplated
     • Gives you the power of the test of TRT effect on prob(favorable)

  39. GLMM Power: Impact of Sample Size?
     • N of subjects per trt per location?
     • N of Locations?
     • Three cases
       • n=300/600, 10 loc
       • n=600/1200, 10 loc
       • n=300/600, 20 loc

         /* n=300/600, 10 loc */
         data a;
           input trt y n;
           do loc=1 to 10;
             output;
           end;
         datalines;
         1 90 300
         2 150 600
         ;

         /* n=600/1200, 10 loc */
         data a;
           input trt y n;
           do loc=1 to 10;
             output;
           end;
         datalines;
         1 180 600
         2 300 1200
         ;

         /* n=300/600, 20 loc */
         data a;
           input trt y n;
           do loc=1 to 20;
             output;
           end;
         datalines;
         1 90 300
         2 150 600
         ;

  40. GLMM Power: Impact of Sample Size?
     • Recall, for 10 locations, N=300/600: CI for OddsRatio was (0.884, 1.871); Power was 0.274
     • For 10 locations, N=600/1200: N alone has almost no impact
     • For 20 locations, N=300/600

  41. Recent developments
     • Continue binary example
     • Power analysis shows: what do you do?

  42. More Information
     • Consider studies directed toward improving the success rate, similar to that proposed in the study
     • Lit search yields 95 such studies
     • 29 have reported statistically significant gains of p1-p2>0.05 (or, alternatively, significant odds ratios of [(30/70)/(25/75)] ≈ 1.29 or greater)
     • If this holds, the “prior” prob(desired effect size), Pr{DES}, is approx 29/95 ≈ 0.3

  43. An Intro Stat Result
     • Real Pr{type I error} is more like 0.23 than 0.10!!!
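
The "intro stat result" is presumably Bayes' rule applied to the outcome of a significance test; the transcript does not show the slide's own calculation. The sketch below takes the prior of 0.3 from the literature search and the nominal alpha of 0.10 from this slide. The power of 0.80 is an assumption made here, chosen because it gives a value close to the slide's 0.23.

```python
def prob_false_positive(alpha, power, prior_des):
    """Pr{H0 true | test is significant}, by Bayes' rule."""
    false_pos = alpha * (1 - prior_des)   # H0 true and test rejects
    true_pos = power * prior_des          # desired effect present and test rejects
    return false_pos / (false_pos + true_pos)

# prior_des = 0.3 and alpha = 0.10 come from the talk; power = 0.80 is assumed.
real_error = prob_false_positive(alpha=0.10, power=0.80, prior_des=0.3)
print(round(real_error, 3))

# Same calculation with the power of 0.274 found for the GLMM example.
low_power_error = prob_false_positive(alpha=0.10, power=0.274, prior_des=0.3)
print(round(low_power_error, 3))
```

The first value is about 0.23. With the much lower power of the 10-location GLMM design, nearly half of all significant results would be false positives under the same prior.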

  44. Returning to All Scenarios
     • NOTE the dramatic impact of alpha-level when “prior” Pr { DES } is relatively low
     • POWER role increases as Pr { DES } increases

  45. Closing Comments
     • In case it’s not obvious
       • I’m not a fan of “painting by numbers”
     • Role of power analysis misunderstood & underappreciated
     • MOST of ALL it is an opportunity to explore and rehearse study design & planned analysis
     • Engage statistician as a participating member of research team
     • Give it the TIME it REQUIRES

  46. Thanks ... for coming
