# How big should my study be? The science and art of choosing your sample size

##### Presentation Transcript

1. How big should my study be? The science and art of choosing your sample size • Mark Pletcher • Designing Clinical Research • Summer 2013

2. Choosing sample size • A fundamental decision • A critical determinant of statistical power • A critical determinant of feasibility

3. Choosing sample size • “Nothing focuses the mind like a sample size calculation” • Mike Kohn

4. Choosing sample size • Ingredients for a sample size calculation • “Focusing the mind” on measurements, etc • Tools for making the calculation • Tables in the book, Stata, online calculators • Examples • What drives sample size? • Modifying study design to reduce sample size • Getting to a final answer for your study • Round peg/square hole? MAKE IT FIT! • Unknown assumptions? GUESS! • Persuasive writing and justification

5. Example 1 • Alcohol and atrial fibrillation incidence As an example, we might wish to assess alcohol as a predictor of incident atrial fibrillation. Assuming 20% of the cohort will drink 2 or more alcoholic beverages daily, we estimate that 2920 participants (584 drinking 2+/day) with full data and longitudinal follow-up over 5 years would provide 90% power to detect a 5% difference (15% vs. 10% in controls) in the incidence of AF using a two-tailed alpha of 0.05.

7. Example 1 (boiled down…) • If … [assumptions] • Then … a sample size of 2920 will give us a 90% chance of ending up with a “statistically significant” result

8. Example 1 (boiled down…) • If … [assumptions] • Then … a sample size of 2920 will give us a 90% chance of ending up with a “statistically significant” result • What are the key assumptions?
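The "90% chance" claim can be checked by brute force. The sketch below simulates many hypothetical cohorts with the sizes and probabilities quoted in the example paragraph (584 exposed at 15%, 2336 unexposed at 10%) and counts how often a two-sided test comes out significant. The function name, the seed, and the use of an uncorrected two-proportion z-test (equivalent to a chi-squared test without continuity correction) are assumptions made for illustration, not part of the slides.

```python
import math
import random
from statistics import NormalDist


def simulate_power(n_exposed=584, n_unexposed=2336,
                   p_exposed=0.15, p_unexposed=0.10,
                   alpha=0.05, reps=1000, seed=2013):
    """Estimate power as the fraction of simulated studies with p < alpha."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided threshold
    significant = 0
    for _ in range(reps):
        # Draw the number of AF cases in each group
        cases_e = sum(rng.random() < p_exposed for _ in range(n_exposed))
        cases_u = sum(rng.random() < p_unexposed for _ in range(n_unexposed))
        # Pooled two-proportion z-test (no continuity correction)
        pooled = (cases_e + cases_u) / (n_exposed + n_unexposed)
        se = math.sqrt(pooled * (1 - pooled) * (1 / n_exposed + 1 / n_unexposed))
        z = (cases_e / n_exposed - cases_u / n_unexposed) / se
        if abs(z) > z_crit:
            significant += 1
    return significant / reps


estimated_power = simulate_power()
print(f"Estimated power: {estimated_power:.2f}")
```

The estimate should land near 0.9, with some simulation noise. If the true incidences differ from the assumed 15% and 10%, the figure changes, which is the point of spelling out the assumptions.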

9. Key assumptions • Assumptions (aka “ingredients”) • Testable hypothesis • Clear measurements • Usually phrased as a “null” hypothesis • Planned statistical test • Assumption about variability of measurements • An effect size • “Alpha” error (1-sided or 2-sided) threshold

10. Key assumptions • Assumptions (aka “ingredients”) • Testable hypothesis “Does alcohol cause atrial fibrillation?”

11. Key assumptions • Assumptions (aka “ingredients”) • Testable hypothesis “Does alcohol cause atrial fibrillation?” Too vague!

12. Key assumptions • Assumptions (aka “ingredients”) • Testable hypothesis “Does alcohol cause atrial fibrillation?” “Is drinking 2+ drinks/day (vs. drinking less) associated with incident atrial fibrillation at 5 years in adults over age 65?”

13. Key assumptions • Assumptions (aka “ingredients”) • Testable hypothesis “Does alcohol cause atrial fibrillation?” “Is drinking 2+ drinks/day (vs. drinking less) associated with incident atrial fibrillation at 5 years in adults over age 65?” Better, but not phrased as a “null” hypothesis

14. Key assumptions • Assumptions (aka “ingredients”) • Testable hypothesis “Does alcohol cause atrial fibrillation?” “Is drinking 2+ drinks/day (vs. drinking less) associated with incident atrial fibrillation at 5 years in adults over age 65?” “H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and incident atrial fibrillation at 5 years in adults over age 65”

15. The Null Hypothesis… • Why do we need a NULL hypothesis?

16. The Null Hypothesis… • Why do we need a NULL hypothesis? • Theoretically speaking, we can only DISPROVE something (or say it’s unlikely), we can never PROVE something* • So we state a NULL hypothesis, and then say that it is very unlikely to be true “H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and incident atrial fibrillation at 5 years in adults over age 65” *Karl Popper, The Logic of Scientific Discovery, 1934

17. Key assumptions • Assumptions (aka “ingredients”) • Testable hypothesis • Clear measurements • Usually phrased as a “null” hypothesis • Planned statistical test • Assumption about variability of measurements • An effect size • “Alpha” error (1-sided or 2-sided) threshold

18. Key assumptions • Assumptions (aka “ingredients”) • Planned statistical test

    | Predictor | Outcome: dichotomous | Outcome: continuous |
    |---|---|---|
    | Dichotomous | chi-squared | t-test |
    | Continuous | t-test | correlation |

19. Key assumptions • Assumptions (aka “ingredients”) • Planned statistical test

    | Predictor | Outcome: dichotomous | Outcome: continuous |
    |---|---|---|
    | Dichotomous | chi-squared | t-test |
    | Continuous | t-test | correlation |

    Need to know your variable types!
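The predictor/outcome grid is small enough to write down as a lookup. A minimal sketch, where the names `TEST_BY_TYPES` and `planned_test` are made up for illustration and the mapping is exactly the four cells from the slide:

```python
# (predictor type, outcome type) -> planned statistical test, as on the slide
TEST_BY_TYPES = {
    ("dichotomous", "dichotomous"): "chi-squared",
    ("dichotomous", "continuous"): "t-test",
    ("continuous", "dichotomous"): "t-test",
    ("continuous", "continuous"): "correlation",
}


def planned_test(predictor_type: str, outcome_type: str) -> str:
    """Return the test used for the sample size calculation."""
    key = (predictor_type.lower(), outcome_type.lower())
    if key not in TEST_BY_TYPES:
        raise ValueError(f"Unsupported variable types: {key}")
    return TEST_BY_TYPES[key]


# Heavy drinking (yes/no) vs. incident afib at 5 years (yes/no)
print(planned_test("dichotomous", "dichotomous"))
```

Anything outside the grid (a 4-level categorical predictor, a time-to-event outcome) raises an error here, which mirrors the advice to simplify variables before calculating.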

20. Key assumptions • Assumptions (aka “ingredients”) • Planned statistical test • Dichotomous variables have only 2 values: male vs. female; dead vs. alive; hypertension vs. no hypertension; smoker vs. non-smoker

21. Key assumptions • Assumptions (aka “ingredients”) • Planned statistical test • Continuous variables have many values: blood pressure; age; quality of life; waist circumference

22. Key assumptions • Assumptions (aka “ingredients”) • Planned statistical test What kind of variable is alcohol use?

23. Key assumptions • Assumptions (aka “ingredients”) • Planned statistical test • What kind of variable is alcohol use? Drinks/day; drinker vs. non-drinker; heavy (2+) vs. light drinker (<2 drinks/day); non-drinker vs. occasional vs. regular vs. heavy.

24. Key assumptions • Assumptions (aka “ingredients”) • Planned statistical test • What kind of variable is alcohol use? Drinks/day; drinker vs. non-drinker; heavy (2+) vs. light drinker (<2 drinks/day); non-drinker vs. occasional vs. regular vs. heavy. Not normally distributed?

25. Key assumptions • Assumptions (aka “ingredients”) • Planned statistical test • What kind of variable is alcohol use? Drinks/day; drinker vs. non-drinker; heavy (2+) vs. light drinker (<2 drinks/day); non-drinker vs. occasional vs. regular vs. heavy. 4-level categorical variable?

26. Key assumptions • Assumptions (aka “ingredients”) • Planned statistical test • What kind of variable is alcohol use? Drinks/day; drinker vs. non-drinker; heavy (2+) vs. light drinker (<2 drinks/day); non-drinker vs. occasional vs. regular vs. heavy. Easy! For the purposes of sample size calculation, you may want to dichotomize…
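Dichotomizing a continuous measure is a one-line transformation. A small sketch using the slide's cutoff of 2+ drinks/day; the `drinks_per_day` values are invented for illustration and the function name is hypothetical:

```python
def is_heavy_drinker(drinks_per_day: float, cutoff: float = 2.0) -> bool:
    """Heavy = at or above the cutoff (2+ drinks/day on the slide)."""
    return drinks_per_day >= cutoff


# Made-up self-reported intake for seven participants
drinks_per_day = [0, 0.5, 1, 2, 3.5, 0, 4]

heavy = [is_heavy_drinker(d) for d in drinks_per_day]
proportion_heavy = sum(heavy) / len(heavy)

print(heavy)
print(f"Proportion heavy drinkers: {proportion_heavy:.2f}")
```

The resulting proportion plays the role of the "20% of the cohort" assumption in the example paragraph: it determines how the total sample splits between the two predictor groups.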

27. Key assumptions • Assumptions (aka “ingredients”) • Planned statistical test • What kind of variable is atrial fibrillation? Person with vs. without afib; frequency of episodes; beats/minute; years to onset of afib (“time to event”); proportion onset of afib at 5 years.

28. Key assumptions • Assumptions (aka “ingredients”) • Planned statistical test • What kind of variable is atrial fibrillation? Person with vs. without afib; frequency of episodes; beats/minute; years to onset of afib (“time to event”); proportion onset of afib at 5 years. Normally distributed?

29. Key assumptions • Assumptions (aka “ingredients”) • Planned statistical test • What kind of variable is atrial fibrillation? Person with vs. without afib; frequency of episodes; beats/minute; years to onset of afib (“time to event”); proportion onset of afib at 5 years. “Survival analysis”

30. Key assumptions • Assumptions (aka “ingredients”) • Planned statistical test • What kind of variable is atrial fibrillation? Person with vs. without afib; frequency of episodes; beats/minute; years to onset of afib (“time to event”); proportion onset of afib at 5 years. Dichotomous (easy)

31. Key assumptions • Assumptions (aka “ingredients”) • Planned statistical test

    | Predictor | Outcome: dichotomous | Outcome: continuous |
    |---|---|---|
    | Dichotomous | chi-squared | t-test |
    | Continuous | t-test | correlation |

    “H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and incident atrial fibrillation at 5 years in adults over age 65”

32. Key assumptions • Assumptions (aka “ingredients”) • Testable hypothesis • Clear measurements • Usually phrased as a “null” hypothesis • Planned statistical test • Assumption about variability of measurements • An effect size • “Alpha” error (1-sided or 2-sided) threshold

33. Key assumptions • Assumptions (aka “ingredients”) • Variability and effect size for chi-squared test Probability of outcome in each predictor group P1 = 10% P2 = 15%

34. Key assumptions • Assumptions (aka “ingredients”) • Variability and effect size for chi-squared test Probability of outcome in each predictor group P1 = 10% (prob afib at 5 years if <2 drinks) P2 = 15% (prob afib at 5 years if 2+ drinks)

35. Key assumptions • Assumptions (aka “ingredients”) • Variability and effect size for chi-squared test Probability of outcome in each predictor group P1 = 10% (prob afib at 5 years if <2 drinks) P2 = 15% (prob afib at 5 years if 2+ drinks) Effect size clearly delineated: Risk difference = 5%; relative risk = 1.5
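The two effect measures on this slide follow directly from the pair of probabilities. A short sketch (the function name is illustrative):

```python
def effect_sizes(p1: float, p2: float) -> tuple[float, float]:
    """Return (risk difference, relative risk) for P2 compared with P1."""
    return p2 - p1, p2 / p1


p1 = 0.10  # prob afib at 5 years if <2 drinks/day
p2 = 0.15  # prob afib at 5 years if 2+ drinks/day

risk_difference, relative_risk = effect_sizes(p1, p2)
print(f"Risk difference = {risk_difference:.0%}, relative risk = {relative_risk:.1f}")
```

The mapping only works in one direction: many different (P1, P2) pairs share a risk difference of 5%, and many share a relative risk of 1.5.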

36. Key assumptions • Assumptions (aka “ingredients”) • Variability and effect size for chi-squared test Probability of outcome in each predictor group P1 = 10% (prob afib at 5 years if <2 drinks) P2 = 15% (prob afib at 5 years if 2+ drinks) Variability is “embedded”…varies with P1…
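For a dichotomous outcome the variability needs no separate guess, because it follows from the probability itself. A brief sketch of that relationship, using standard binomial results that are not taken from the slides ($n$ is a per-group size):

$$
\operatorname{Var}(\text{outcome}) = P(1-P), \qquad
\operatorname{SE}(\hat P_2 - \hat P_1) = \sqrt{\frac{P_1(1-P_1)}{n} + \frac{P_2(1-P_2)}{n}}
$$

With $P_1 = 0.10$ the variance term is $0.09$; with $P_1 = 0.20$ it is $0.16$. The same 5% difference therefore becomes harder to detect as $P_1$ moves toward 50%.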

37. Key assumptions • Assumptions (aka “ingredients”) • Variability and effect size for chi-squared test Probability of outcome in each predictor group P1 = 10% (prob afib at 5 years if <2 drinks) P2 = 15% (prob afib at 5 years if 2+ drinks) Bottom line: Giving both probabilities is clear and unambiguous (…wait for counter-examples)

38. Key assumptions • Assumptions (aka “ingredients”) • Testable hypothesis • Clear measurements • Usually phrased as a “null” hypothesis • Planned statistical test • Assumption about variability of measurements • An effect size • “Alpha” error (1-sided or 2-sided) threshold

39. Key assumptions • Assumptions (aka “ingredients”) • “Alpha” error (1-sided or 2-sided) threshold Standard p-value threshold: 0.05 (“Type I error” rate = “alpha”)

40. Key assumptions • Assumptions (aka “ingredients”) • “Alpha” error (1-sided or 2-sided) threshold Standard p-value threshold: 0.05 (“Type I error” rate = “alpha”) Standard choice: 2-sided test

41. Key assumptions • Assumptions (aka “ingredients”) • “Alpha” error (1-sided or 2-sided) threshold Standard p-value threshold: 0.05 (“Type I error” rate = “alpha”) Standard choice: 2-sided test • Unless you have no interest in a large effect in the direction opposite to the one you expect, choose 2-sided: it is almost always the clear, safe choice

42. Key assumptions • Assumptions (aka “ingredients”) • “Alpha” error (1-sided or 2-sided) threshold Standard p-value threshold: 0.05 (“Type I error” rate = “alpha”) Standard choice: 2-sided test • Power = 1 − “beta” error (so 90% power = 10% beta error)
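Alpha and beta enter a sample size formula as standard normal quantiles. The sketch below converts the slide's thresholds into those values with Python's standard library; the variable names are illustrative.

```python
from statistics import NormalDist

alpha = 0.05  # Type I error threshold
beta = 0.10   # Type II error rate
power = 1 - beta

std_normal = NormalDist()  # mean 0, standard deviation 1

z_alpha_two_sided = std_normal.inv_cdf(1 - alpha / 2)  # alpha split across both tails
z_alpha_one_sided = std_normal.inv_cdf(1 - alpha)      # all of alpha in one tail
z_beta = std_normal.inv_cdf(power)

print(f"power = {power:.0%}")
print(f"z (2-sided alpha) = {z_alpha_two_sided:.3f}")
print(f"z (1-sided alpha) = {z_alpha_one_sided:.3f}")
print(f"z (beta)          = {z_beta:.3f}")
```

The 1-sided quantile is smaller than the 2-sided one, which is why a 1-sided test needs fewer participants for the same power, and why choosing it only to shrink the study is hard to defend.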

43. Example 1 • H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and incident atrial fibrillation at 5 years in adults over age 65 • 2 dichotomous variables → chi-squared test • P1 = 10% • P2 = 15% • 2-sided alpha = 0.05, beta = 0.10

44. Example 1 • H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and incident atrial fibrillation at 5 years in adults over age 65 • 2 dichotomous variables → chi-squared test • P1 = 10% • P2 = 15% • 2-sided alpha = 0.05, beta = 0.10 • Go to page 75 of DCR (4th edition)…

45. Example 1 • H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and incident atrial fibrillation at 5 years in adults over age 65 • 2 dichotomous variables → chi-squared test • P1 = 10% • P2 = 15% • 2-sided alpha = 0.05, beta = 0.10 • Go to page 75 of DCR (4th edition)… • Sample size = 958 PER GROUP = 1916 total
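The tabulated figure can be approximated by hand. This is a sketch that assumes the table rests on the usual normal approximation for comparing two proportions with a continuity correction, which is an inference from the numbers rather than something the slides state. Here $Q = 1 - P$ and $\bar P = (P_1 + P_2)/2$:

$$
n = \frac{\left[z_{1-\alpha/2}\sqrt{2\bar P\bar Q} + z_{1-\beta}\sqrt{P_1Q_1 + P_2Q_2}\right]^2}{(P_2 - P_1)^2},
\qquad
n' = \frac{n}{4}\left[1 + \sqrt{1 + \frac{4}{n\,|P_2 - P_1|}}\right]^2
$$

Plugging in $P_1 = 0.10$, $P_2 = 0.15$, $z_{1-\alpha/2} \approx 1.96$ and $z_{1-\beta} \approx 1.28$ gives $n \approx 917$ and $n' \approx 957$ per group, within a participant or so of the 958 read from the table.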

46. Example 1 • H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and incident atrial fibrillation at 5 years in adults over age 65 • 2 dichotomous variables → chi-squared test • P1 = 15% • P2 = 20% (risk diff = 5%) • 2-sided alpha = 0.05, beta = 0.10 • Go to page 86 of DCR (3rd edition)… • Sample size = 1252 x 2 = 2504 total

47. Example 1 • H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and incident atrial fibrillation at 5 years in adults over age 65 • 2 dichotomous variables → chi-squared test • P1 = 20% • P2 = 25% (risk diff = 5%) • 2-sided alpha = 0.05, beta = 0.10 • Go to page 86 of DCR (3rd edition)… • Sample size = 1504 x 2 = 3008 total

48. Example 1 • H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and incident atrial fibrillation at 5 years in adults over age 65 • 2 dichotomous variables → chi-squared test • P1 = 20% • P2 = 30% (RR = 1.5) • 2-sided alpha = 0.05, beta = 0.10 • Go to page 86 of DCR (3rd edition)… • Sample size = 412 x 2 = 824 total

49. Example 1 • H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and incident atrial fibrillation at 5 years in adults over age 65 • 2 dichotomous variables → chi-squared test • P1 = 20% • P2 = 30% (RR = 1.5) • 2-sided alpha = 0.05, beta = 0.10 • Go to page 86 of DCR (3rd edition)… • Sample size = 412 x 2 = 824 total • Not enough to specify an effect size of “5%” or “RR = 1.5”; you need to give both probabilities
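The four scenarios can be recomputed side by side to show how strongly the answer depends on the pair of probabilities. The sketch below uses a continuity-corrected normal approximation for two equal groups. That formula is an assumption about how such tables are typically built, and the function name `n_per_group` is illustrative, so expect agreement with the slides to within a participant or two rather than exactly.

```python
import math
from statistics import NormalDist


def n_per_group(p1: float, p2: float, alpha: float = 0.05, power: float = 0.90) -> int:
    """Per-group size for comparing two proportions (equal groups, 2-sided alpha)."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    diff = abs(p2 - p1)
    # Uncorrected normal-approximation sample size
    n = (z_a * math.sqrt(2 * p_bar * (1 - p_bar))
         + z_b * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2 / diff ** 2
    # Continuity correction
    n_corrected = n / 4 * (1 + math.sqrt(1 + 4 / (n * diff))) ** 2
    return math.ceil(n_corrected)


# (P1, P2) pairs from the slides, with the per-group sizes quoted there
slide_values = {(0.10, 0.15): 958, (0.15, 0.20): 1252, (0.20, 0.25): 1504, (0.20, 0.30): 412}

computed = {pair: n_per_group(*pair) for pair in slide_values}
for (p1, p2), n in computed.items():
    print(f"P1={p1:.2f} P2={p2:.2f}  RD={p2 - p1:.2f}  RR={p2 / p1:.2f}  "
          f"n/group={n} (slide: {slide_values[(p1, p2)]})")
```

Three of the rows share a 5% risk difference yet need quite different sample sizes, and the two rows with RR = 1.5 differ by more than a factor of two.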

50. Back to our paragraph… As an example, we might wish to assess alcohol as a predictor of incident atrial fibrillation. Assuming 20% of the cohort will drink 2 or more alcoholic beverages daily, we estimate that 2920 participants (584 drinking 2+/day) with full data and longitudinal follow-up over 5 years would provide 90% power to detect a 5% difference (15% vs. 10% in controls) in the incidence of AF using a two-tailed alpha of 0.05.
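The paragraph's 2920 is larger than the 1916 from the equal-groups table because only 20% of the cohort is expected to be in the heavy-drinking group. A sketch of the unequal-groups calculation follows, again assuming a continuity-corrected normal approximation (the slides do not say which formula produced the figure); the function and parameter names are illustrative.

```python
import math
from statistics import NormalDist


def n_smaller_group(p_exposed: float, p_unexposed: float, unexposed_per_exposed: float,
                    alpha: float = 0.05, power: float = 0.90) -> int:
    """Size of the exposed group when the unexposed group is `ratio` times larger."""
    r = unexposed_per_exposed
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    diff = abs(p_exposed - p_unexposed)
    # Outcome probability averaged over the whole cohort
    p_bar = (p_exposed + r * p_unexposed) / (1 + r)
    n = (z_a * math.sqrt((1 + 1 / r) * p_bar * (1 - p_bar))
         + z_b * math.sqrt(p_exposed * (1 - p_exposed)
                           + p_unexposed * (1 - p_unexposed) / r)) ** 2 / diff ** 2
    # Continuity correction for unequal groups
    n_corrected = n / 4 * (1 + math.sqrt(1 + 2 * (r + 1) / (n * r * diff))) ** 2
    return math.ceil(n_corrected)


ratio = 4  # 80% drinking <2/day vs. 20% drinking 2+/day
n_exposed = n_smaller_group(p_exposed=0.15, p_unexposed=0.10, unexposed_per_exposed=ratio)
n_total = n_exposed * (1 + ratio)

print(f"Exposed (2+/day): {n_exposed}")
print(f"Total cohort:     {n_total}")
```

Under these assumptions the result lands on or very near the 584 exposed and 2920 total quoted in the paragraph. The extra participants relative to the equal-groups design are the price of an uneven split.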