How big should my study be? The science and art of choosing your sample size


  1. How big should my study be? The science and art of choosing your sample size. Mark Pletcher, Designing Clinical Research, Summer 2013

  2. Choosing sample size • A fundamental decision • A critical determinant of statistical power • A critical determinant of feasibility

  3. Choosing sample size • “Nothing focuses the mind like a sample size calculation” • Mike Kohn

  4. Choosing sample size • Ingredients for a sample size calculation • “Focusing the mind” on measurements, etc • Tools for making the calculation • Tables in the book, Stata, online calculators • Examples • What drives sample size? • Modifying study design to reduce sample size • Getting to a final answer for your study • Round peg/square hole? MAKE IT FIT! • Unknown assumptions? GUESS! • Persuasive writing and justification

  5. Example 1 • Alcohol and atrial fibrillation incidence As an example, we might wish to assess alcohol as a predictor of incident atrial fibrillation. Assuming 20% of the cohort will drink 2 or more alcoholic beverages daily, we estimate that 2920 participants (584 drinking 2+/day) with full data and longitudinal follow-up over 5 years would provide 90% power to detect a 5% difference (15% vs. 10% in controls) in the incidence of AF using a two-tailed alpha of 0.05.


  7. Example 1 (boiled down…) • If … [assumptions] • Then … a sample size of 2920 will give us a 90% chance of ending up with a “statistically significant” result

  8. Example 1 (boiled down…) • If … [assumptions] • Then … a sample size of 2920 will give us a 90% chance of ending up with a “statistically significant” result • What are the key assumptions?

  9. Key assumptions • Assumptions (aka “ingredients”) • Testable hypothesis • Clear measurements • Usually phrased as a “null” hypothesis • Planned statistical test • Assumption about variability of measurements • An effect size • “Alpha” error (1-sided or 2-sided) threshold

  10. Key assumptions • Assumptions (aka “ingredients”) • Testable hypothesis “Does alcohol cause atrial fibrillation?”

  11. Key assumptions • Assumptions (aka “ingredients”) • Testable hypothesis “Does alcohol cause atrial fibrillation?” Too vague!

  12. Key assumptions • Assumptions (aka “ingredients”) • Testable hypothesis “Does alcohol cause atrial fibrillation?” “Is drinking 2+ drinks/day (vs. drinking less) associated with incident atrial fibrillation at 5 years in adults over age 65?”

  13. Key assumptions • Assumptions (aka “ingredients”) • Testable hypothesis “Does alcohol cause atrial fibrillation?” “Is drinking 2+ drinks/day (vs. drinking less) associated with incident atrial fibrillation at 5 years in adults over age 65?” Better, but not phrased as a “null” hypothesis

  14. Key assumptions • Assumptions (aka “ingredients”) • Testable hypothesis “Does alcohol cause atrial fibrillation?” “Is drinking 2+ drinks/day (vs. drinking less) associated with incident atrial fibrillation at 5 years in adults over age 65?” “H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and incident atrial fibrillation at 5 years in adults over age 65”

  15. The Null Hypothesis… • Why do we need a NULL hypothesis?

  16. The Null Hypothesis… • Why do we need a NULL hypothesis? • Theoretically speaking, we can only DISPROVE something (or say it’s unlikely), we can never PROVE something* • So we state a NULL hypothesis, and then say that it is very unlikely to be true “H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and incident atrial fibrillation at 5 years in adults over age 65” *Karl Popper, The Logic of Scientific Discovery, 1934

  17. Key assumptions • Assumptions (aka “ingredients”) • Testable hypothesis • Clear measurements • Usually phrased as a “null” hypothesis • Planned statistical test • Assumption about variability of measurements • An effect size • “Alpha” error (1-sided or 2-sided) threshold

  18. Key assumptions • Assumptions (aka “ingredients”) • Planned statistical test
      PREDICTOR     | OUTCOME: Dichotomous | OUTCOME: Continuous
      Dichotomous   | chi-squared          | t-test
      Continuous    | t-test               | correlation

  19. Key assumptions • Assumptions (aka “ingredients”) • Planned statistical test
      PREDICTOR     | OUTCOME: Dichotomous | OUTCOME: Continuous
      Dichotomous   | chi-squared          | t-test
      Continuous    | t-test               | correlation
      Need to know your variable types!

  20. Key assumptions • Assumptions (aka “ingredients”) • Planned statistical test Dichotomous variables have only 2 values: male vs. female; dead vs. alive; hypertension vs. no hypertension; smoker vs. non-smoker

  21. Key assumptions • Assumptions (aka “ingredients”) • Planned statistical test Continuous variables have many values: blood pressure; age; quality of life; waist circumference

  22. Key assumptions • Assumptions (aka “ingredients”) • Planned statistical test What kind of variable is alcohol use?

  23. Key assumptions • Assumptions (aka “ingredients”) • Planned statistical test What kind of variable is alcohol use? Drinks/day; drinker vs. non-drinker; heavy (2+) vs. light drinker (<2 drinks/day); non-drinker vs. occasional vs. regular vs. heavy

  24. Key assumptions • Assumptions (aka “ingredients”) • Planned statistical test What kind of variable is alcohol use? Drinks/day; drinker vs. non-drinker; heavy (2+) vs. light drinker (<2 drinks/day); non-drinker vs. occasional vs. regular vs. heavy. Not normally distributed?

  25. Key assumptions • Assumptions (aka “ingredients”) • Planned statistical test What kind of variable is alcohol use? Drinks/day; drinker vs. non-drinker; heavy (2+) vs. light drinker (<2 drinks/day); non-drinker vs. occasional vs. regular vs. heavy. 4-level categorical variable?

  26. Key assumptions • Assumptions (aka “ingredients”) • Planned statistical test What kind of variable is alcohol use? Drinks/day; drinker vs. non-drinker; heavy (2+) vs. light drinker (<2 drinks/day); non-drinker vs. occasional vs. regular vs. heavy. Easy! For the purposes of sample size calculation, you may want to dichotomize…

  27. Key assumptions • Assumptions (aka “ingredients”) • Planned statistical test What kind of variable is atrial fibrillation? Person with vs. without afib; frequency of episodes; beats/minute; years to onset of afib (“time to event”); proportion with onset of afib at 5 years

  28. Key assumptions • Assumptions (aka “ingredients”) • Planned statistical test What kind of variable is atrial fibrillation? Person with vs. without afib; frequency of episodes; beats/minute; years to onset of afib (“time to event”); proportion with onset of afib at 5 years. Normally distributed?

  29. Key assumptions • Assumptions (aka “ingredients”) • Planned statistical test What kind of variable is atrial fibrillation? Person with vs. without afib; frequency of episodes; beats/minute; years to onset of afib (“time to event”); proportion with onset of afib at 5 years. “Survival analysis”

  30. Key assumptions • Assumptions (aka “ingredients”) • Planned statistical test What kind of variable is atrial fibrillation? Person with vs. without afib; frequency of episodes; beats/minute; years to onset of afib (“time to event”); proportion with onset of afib at 5 years. Dichotomous (easy)

  31. Key assumptions • Assumptions (aka “ingredients”) • Planned statistical test
      PREDICTOR     | OUTCOME: Dichotomous | OUTCOME: Continuous
      Dichotomous   | chi-squared          | t-test
      Continuous    | t-test               | correlation
      “H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and incident atrial fibrillation at 5 years in adults over age 65”
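The test-selection table on this slide can be read as a simple lookup from variable types to a planned test. The sketch below is only an illustration of that lookup; the names `TEST_TABLE` and `choose_test` are hypothetical and do not come from the slides.

```python
# Hypothetical helper: encodes the predictor/outcome table from the slide.
TEST_TABLE = {
    ("dichotomous", "dichotomous"): "chi-squared",
    ("dichotomous", "continuous"): "t-test",
    ("continuous", "dichotomous"): "t-test",
    ("continuous", "continuous"): "correlation",
}


def choose_test(predictor: str, outcome: str) -> str:
    """Return the planned statistical test for a (predictor, outcome) type pair."""
    key = (predictor.lower(), outcome.lower())
    if key not in TEST_TABLE:
        raise ValueError(f"Unsupported variable types: {key}")
    return TEST_TABLE[key]


# Example 1: heavy vs. light drinking (dichotomous) and
# afib vs. no afib at 5 years (dichotomous).
print(choose_test("dichotomous", "dichotomous"))
```

For the alcohol and atrial fibrillation example, both variables are dichotomized, so the lookup lands on the chi-squared cell.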

  32. Key assumptions • Assumptions (aka “ingredients”) • Testable hypothesis • Clear measurements • Usually phrased as a “null” hypothesis • Planned statistical test • Assumption about variability of measurements • An effect size • “Alpha” error (1-sided or 2-sided) threshold

  33. Key assumptions • Assumptions (aka “ingredients”) • Variability and effect size for chi-squared test Probability of outcome in each predictor group P1 = 10% P2 = 15%

  34. Key assumptions • Assumptions (aka “ingredients”) • Variability and effect size for chi-squared test Probability of outcome in each predictor group P1 = 10% (prob afib at 5 years if <2 drinks) P2 = 15% (prob afib at 5 years if 2+ drinks)

  35. Key assumptions • Assumptions (aka “ingredients”) • Variability and effect size for chi-squared test Probability of outcome in each predictor group P1 = 10% (prob afib at 5 years if <2 drinks) P2 = 15% (prob afib at 5 years if 2+ drinks) Effect size clearly delineated: Risk difference = 5%; relative risk = 1.5

  36. Key assumptions • Assumptions (aka “ingredients”) • Variability and effect size for chi-squared test Probability of outcome in each predictor group P1 = 10% (prob afib at 5 years if <2 drinks) P2 = 15% (prob afib at 5 years if 2+ drinks) Variability is “embedded”: for a dichotomous outcome the variance is P(1 - P), so it varies with P1…

  37. Key assumptions • Assumptions (aka “ingredients”) • Variability and effect size for chi-squared test Probability of outcome in each predictor group P1 = 10% (prob afib at 5 years if <2 drinks) P2 = 15% (prob afib at 5 years if 2+ drinks) Bottom line: Giving both probabilities is clear and unambiguous (…wait for counter-examples)
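To make the point about the two probabilities concrete, here is a minimal sketch that derives the risk difference, the relative risk, and the per-group variance from P1 and P2. The function name `describe_effect` is hypothetical; the variance formula p × (1 - p) is the standard one for a yes/no outcome.

```python
def describe_effect(p1: float, p2: float) -> dict:
    """Summarize two outcome probabilities (p1: <2 drinks/day, p2: 2+ drinks/day)."""
    return {
        "risk_difference": p2 - p1,
        "relative_risk": p2 / p1,
        # Variance of a single yes/no outcome with probability p is p * (1 - p),
        # so the variability is fixed once the probability is given.
        "variance_group1": p1 * (1 - p1),
        "variance_group2": p2 * (1 - p2),
    }


summary = describe_effect(0.10, 0.15)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

Both the effect size and the variability fall out of the pair (P1, P2), which is why stating both probabilities leaves nothing ambiguous.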

  38. Key assumptions • Assumptions (aka “ingredients”) • Testable hypothesis • Clear measurements • Usually phrased as a “null” hypothesis • Planned statistical test • Assumption about variability of measurements • An effect size • “Alpha” error (1-sided or 2-sided) threshold

  39. Key assumptions • Assumptions (aka “ingredients”) • “Alpha” error (1-sided or 2-sided) threshold Standard p-value threshold: 0.05 (“Type I error” rate = “alpha”)

  40. Key assumptions • Assumptions (aka “ingredients”) • “Alpha” error (1-sided or 2-sided) threshold Standard p-value threshold: 0.05 (“Type I error” rate = “alpha”) Standard choice: 2-sided test

  41. Key assumptions • Assumptions (aka “ingredients”) • “Alpha” error (1-sided or 2-sided) threshold Standard p-value threshold: 0.05 (“Type I error” rate = “alpha”) Standard choice: 2-sided test Unless you have no interest in a large effect in the direction opposite to the one you expect, choose 2-sided: it is almost always the clear, safe choice

  42. Key assumptions • Assumptions (aka “ingredients”) • “Alpha” error (1-sided or 2-sided) threshold Standard p-value threshold: 0.05 (“Type I error” rate = “alpha”) Standard choice: 2-sided test Power = 1- “beta” error (so 90% power = 10% beta error)
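In normal-approximation sample size formulas, alpha and power usually enter as standard normal quantiles. The sketch below shows that conversion using Python's standard library; the function name `z_values` is hypothetical, and the slides themselves do not state which approximation they rely on.

```python
from statistics import NormalDist


def z_values(alpha: float = 0.05, power: float = 0.90, two_sided: bool = True):
    """Convert alpha and power into standard normal quantiles."""
    std_normal = NormalDist()
    # A 2-sided test splits alpha across both tails.
    tail = alpha / 2 if two_sided else alpha
    z_alpha = std_normal.inv_cdf(1 - tail)
    # Power = 1 - beta, so 90% power corresponds to a 10% beta error.
    beta = 1 - power
    z_beta = std_normal.inv_cdf(power)
    return z_alpha, z_beta, beta


z_alpha, z_beta, beta = z_values()
print(f"2-sided alpha=0.05: z_alpha={z_alpha:.3f}; power=0.90: z_beta={z_beta:.3f}, beta={beta:.2f}")

z_alpha_1s, _, _ = z_values(two_sided=False)
print(f"1-sided alpha=0.05: z_alpha={z_alpha_1s:.3f}")
```

The 1-sided quantile is smaller than the 2-sided one, which is why a 1-sided test needs fewer participants for the same alpha.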

  43. Example 1 • H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and incident atrial fibrillation at 5 years in adults over age 65 • 2 dichotomous variables → chi-squared test • P1 = 10% • P2 = 15% • 2-sided alpha = 0.05, beta = 0.10

  44. Example 1 • H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and incident atrial fibrillation at 5 years in adults over age 65 • 2 dichotomous variables → chi-squared test • P1 = 10% • P2 = 15% • 2-sided alpha = 0.05, beta = 0.10 Go to page 75 of DCR (4th edition)…

  45. Example 1 • H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and incident atrial fibrillation at 5 years in adults over age 65 • 2 dichotomous variables → chi-squared test • P1 = 10% • P2 = 15% • 2-sided alpha = 0.05, beta = 0.10 Go to page 75 of DCR (4th edition)… Sample size = 958 PER GROUP = 1916 total
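For readers who want to see where a number of this size comes from, one common normal-approximation formula for comparing two proportions with equal group sizes is sketched below, together with a continuity correction. That the DCR table uses exactly this formula is an assumption; the slides only cite the table.

```latex
% Uncorrected per-group sample size (equal groups), with \bar{P} = (P_1 + P_2)/2
\[
n = \frac{\left[ z_{1-\alpha/2}\sqrt{2\bar{P}(1-\bar{P})}
      + z_{1-\beta}\sqrt{P_1(1-P_1) + P_2(1-P_2)} \right]^2}{(P_1 - P_2)^2}
\]

% Continuity-corrected per-group sample size
\[
n_c = \frac{n}{4}\left[ 1 + \sqrt{1 + \frac{4}{n\,\lvert P_1 - P_2 \rvert}} \right]^2
\]

% Worked values for P_1 = 0.10, P_2 = 0.15, z_{1-\alpha/2} \approx 1.960, z_{1-\beta} \approx 1.282
\[
n \approx \frac{(1.960 \times 0.4677 + 1.282 \times 0.4664)^2}{0.05^2} \approx 917,
\qquad n_c \approx 957
\]
```

The corrected value of about 957 per group is within one participant of the 958 read from the table, which is consistent with the table including a continuity correction.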

  46. Example 1 • H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and incident atrial fibrillation at 5 years in adults over age 65 • 2 dichotomous variables → chi-squared test • P1 = 15% • P2 = 20% (risk diff = 5%) • 2-sided alpha = 0.05, beta = 0.10 Go to page 86 of DCR (3rd edition)… Sample size = 1252 x 2 = 2504 total

  47. Example 1 • H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and incident atrial fibrillation at 5 years in adults over age 65 • 2 dichotomous variables → chi-squared test • P1 = 20% • P2 = 25% (risk diff = 5%) • 2-sided alpha = 0.05, beta = 0.10 Go to page 86 of DCR (3rd edition)… Sample size = 1504 x 2 = 3008 total

  48. Example 1 • H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and incident atrial fibrillation at 5 years in adults over age 65 • 2 dichotomous variables → chi-squared test • P1 = 20% • P2 = 30% (RR = 1.5) • 2-sided alpha = 0.05, beta = 0.10 Go to page 86 of DCR (3rd edition)… Sample size = 412 x 2 = 824 total

  49. Example 1 • H0: There is no association between drinking 2+ drinks/day (vs. drinking less) and incident atrial fibrillation at 5 years in adults over age 65 • 2 dichotomous variables → chi-squared test • P1 = 20% • P2 = 30% (RR = 1.5) • 2-sided alpha = 0.05, beta = 0.10 Go to page 86 of DCR (3rd edition)… Sample size = 412 x 2 = 824 total. It is not enough to specify an effect size of “5%” or “RR = 1.5”: you need to give both probabilities
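The four scenarios on these slides can be compared side by side with a short script. This is a sketch under stated assumptions: it uses a normal-approximation formula with a continuity correction, which may not be exactly what the DCR tables use, and the function name `n_per_group` is hypothetical. The tabulated values are the ones quoted on the slides.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group(p1: float, p2: float, alpha: float = 0.05, power: float = 0.90) -> int:
    """Per-group sample size for comparing two proportions (equal groups, 2-sided)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    diff = abs(p2 - p1)
    p_bar = (p1 + p2) / 2
    # Uncorrected normal-approximation sample size.
    n = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
         + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2 / diff ** 2
    # Continuity correction (assumed here; it inflates n slightly).
    n_cc = n / 4 * (1 + sqrt(1 + 4 / (n * diff))) ** 2
    return ceil(n_cc)


# (P1, P2, per-group value quoted on the slides)
scenarios = [
    (0.10, 0.15, 958),   # risk diff 5%, RR 1.5
    (0.15, 0.20, 1252),  # risk diff 5%
    (0.20, 0.25, 1504),  # risk diff 5%
    (0.20, 0.30, 412),   # RR 1.5
]

for p1, p2, tabulated in scenarios:
    computed = n_per_group(p1, p2)
    print(f"P1={p1:.2f} P2={p2:.2f} computed={computed} tabulated={tabulated}")
```

The same 5% risk difference, or the same relative risk of 1.5, gives quite different sample sizes depending on the baseline probability, which is the reason for reporting both P1 and P2.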

  50. Back to our paragraph… As an example, we might wish to assess alcohol as a predictor of incident atrial fibrillation. Assuming 20% of the cohort will drink 2 or more alcoholic beverages daily, we estimate that 2920 participants (584 drinking 2+/day) with full data and longitudinal follow-up over 5 years would provide 90% power to detect a 5% difference (15% vs. 10% in controls) in the incidence of AF using a two-tailed alpha of 0.05.
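The paragraph's total of 2920 differs from the equal-groups table value because only 20% of the cohort is expected to be exposed, that is, 4 light drinkers for every heavy drinker. The slides so far do not say how 2920 was computed. The sketch below shows one calculation that happens to reproduce it: a normal-approximation formula for unequal group sizes with a continuity correction. The function name `cohort_size` and the choice of formula are assumptions, not the author's stated method.

```python
from math import ceil, sqrt
from statistics import NormalDist


def cohort_size(p_unexposed: float, p_exposed: float, unexposed_per_exposed: int,
                alpha: float = 0.05, power: float = 0.90):
    """Return (exposed group size, total cohort size) for unequal groups."""
    r = unexposed_per_exposed
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    diff = abs(p_exposed - p_unexposed)
    # Pooled outcome probability, weighted by group size.
    p_bar = (p_exposed + r * p_unexposed) / (1 + r)
    # Uncorrected size of the (smaller) exposed group.
    m = (z_alpha * sqrt((r + 1) * p_bar * (1 - p_bar))
         + z_beta * sqrt(r * p_exposed * (1 - p_exposed)
                         + p_unexposed * (1 - p_unexposed))) ** 2 / (r * diff ** 2)
    # Continuity correction for unequal groups (assumed).
    m_cc = m / 4 * (1 + sqrt(1 + 2 * (r + 1) / (m * r * diff))) ** 2
    n_exposed = ceil(m_cc)
    return n_exposed, n_exposed * (1 + r)


# 10% afib if <2 drinks/day, 15% if 2+ drinks/day, 20% of cohort exposed (4:1).
n_exposed, n_total = cohort_size(0.10, 0.15, unexposed_per_exposed=4)
print(f"exposed: {n_exposed}, total: {n_total}")
```

Under these assumptions the sketch gives 584 heavy drinkers and 2920 participants in total, matching the paragraph. Unequal groups cost participants: the total is well above the 1916 needed with two equal groups.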