Sample Size and Power Calculations

1. Sample Size and Power Calculations Marcia A. Ciol 04/09/08

2. What resources do I need? How long will it take to conduct the study? I need 50 participants in my study. About 5 individuals per year will be enrolled. Therefore, it will take 10 years to finish the study. How much money do I need? I will follow a cohort of 500 individuals. A lab test that costs US$100 will be conducted for each person. Therefore, I will need US$50,000 just for lab tests.

3. Am I going to reach my objective? I have 2 years to finish my thesis, of which one year is for data collection. I think I can get data on 50 people in that year. Is 50 a sufficient number of people to test my hypothesis with the significance level I want?

4. Why calculate sample size and power? To show that, under certain conditions, the hypothesis test has a good chance of showing a desired difference (if it exists). To show the funding agency that the study has a reasonable chance of obtaining a conclusive result. To show that the necessary resources (human, monetary, time) will be minimized and well utilized.

5. What do I need to know to calculate sample size? Most important: sample size calculation is an educated guess. It is appropriate for studies involving hypothesis testing. There is no magic involved; only statistical and mathematical logic and some algebra. Researchers need to know something about what they are measuring and how it varies in the population of interest.

6. Factors related to the sample size: Population factor (cannot be controlled by researcher). Characteristics of the study design. Quantities related to the research question (defined by the researcher).

There are many factors that are intertwined in the calculation of sample sizes or power of a study. Some factors depend on the design of the study, others on the investigator's choices, and others on the data themselves. The first consideration is the type of response variable. The study design and the response variable will determine the type of statistical method used in the data analysis. For example, if the data are continuous, and two groups are being compared for their means, a t-test may be the appropriate statistical method of analysis. The t-test defines the formula for the sample size or power calculation. The second set of factors depends on the investigator's choices. The investigator needs to define the acceptable levels of significance and power of the study. The sample size may be more a consideration of availability and/or resources than what is necessary to achieve a certain power. The third factor is the variation of the data in the population of interest. Intuitively, the higher the variation of the data (this may include measures of variance and correlation among observations), the larger the sample size will have to be to give us enough confidence that we have a good estimate of the mean, for example. The last five items in the slide above are related to each other in the formula defined by the specific statistical test used in the study. If one knows four of those values, the fifth will be determined. Therefore, some values will have to either come from previous studies and/or knowledge, or will have to be assumed.
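The idea that four known quantities determine the fifth can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the five quantities are taken here to be the significance level, the power, the difference to detect, the standard deviation, and the sample size per group; the normal approximation to the two-sample t-test is used; and the function name `detectable_difference` and all numeric inputs are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def detectable_difference(n_per_group, sd, alpha=0.05, power=0.80):
    """Smallest difference in means detectable with the given inputs.

    Assumes two equal-sized groups, a common standard deviation, a
    two-sided test, and the normal approximation to the t-test.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = z.inv_cdf(power)          # quantile matching the target power
    return (z_alpha + z_power) * sd * sqrt(2.0 / n_per_group)


# Hypothetical inputs: 16 people per group, SD of 15, alpha 0.05, power 0.80.
print(round(detectable_difference(16, 15.0), 2))
```

Fixing a different set of four values (for example the difference instead of the sample size) and solving the same relation for the remaining one gives the other calculations discussed in this presentation.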

7. Where do we get this knowledge? Previous published studies. Pilot studies. If information is lacking, there is no good way to calculate the sample size!

8. Population factor Variance of the measure (outcome) within the population

11. Study Design

12. Quantities related to the research question (defined by the researcher)

13. Quantities related to the research question (defined by the researcher)

14. Example: test of difference of means in two populations. Researcher fixes probabilities of type I and II errors. Prob(type I error) = Prob(reject $H_0$ when $H_0$ is true) = $\alpha$. Smaller error → greater precision → need more information → need larger sample size. Prob(type II error) = Prob(don't reject $H_0$ when $H_0$ is false) = $\beta$. Power = $1 - \beta$. More power → smaller error → need larger sample size.
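The link between power, significance level, and sample size can be illustrated with a short Python sketch. It assumes a two-sided comparison of two equal-sized groups with a common standard deviation and uses the normal approximation rather than the exact t distribution; the function name `power_two_sample` and the numbers (a difference of 10 with a standard deviation of 15) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power_two_sample(n_per_group, delta, sd, alpha=0.05):
    """Approximate power of a two-sided, two-sample comparison of means.

    Normal approximation: the test statistic is treated as normal with
    mean delta / (sd * sqrt(2 / n)) and unit variance.
    """
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = delta / (sd * sqrt(2.0 / n_per_group))
    # Probability of rejecting in the upper tail plus the lower tail.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Power rises with sample size for a fixed difference and alpha.
for n in (5, 10, 20, 40):
    print(n, round(power_two_sample(n, delta=10.0, sd=15.0), 3))
```

Under this approximation, lowering the significance level for a fixed sample size lowers the power, which is why a smaller type I error calls for a larger sample.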

15. Example: test of difference of means in two populations. The equation for sample size is derived from the equation for the statistical test. In a t-test the equation for the test is $t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$. The derived equation for sample size is (equation not preserved in this transcript).
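For reference, a commonly used form of the sample size equation for this setting is given below. It is an assumption about which formula is meant, not a reproduction of the original slide: it takes equal group sizes $n_1 = n_2 = n$, a common standard deviation $\sigma$ in place of $s_1$ and $s_2$, a two-sided test, and the normal approximation to the t distribution.

```latex
n \;=\; \frac{2\,\sigma^{2}\,\bigl(z_{1-\alpha/2} + z_{1-\beta}\bigr)^{2}}{(\mu_1 - \mu_2)^{2}}
\qquad \text{(participants per group)}
```

Here $z_{p}$ denotes the $p$-th quantile of the standard normal distribution. Because the t distribution has heavier tails, exact t-based calculations typically give a slightly larger $n$.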

16. Using PASS: t-test example Question: does exercise help to decrease body weight? Study design: participants will be randomized into two groups (exercise and control) Outcome: change in weight Want to detect: a change of at least 15 pounds Known: from past studies, the standard deviation varies between 10 and 15 pounds.
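The exercise example can be approximated by hand with a small Python sketch. This is not what PASS computes internally; it uses the normal approximation for two equal-sized groups, and the significance level (0.05) and power (0.80) are assumed values that the slide does not state. The function name `n_per_group` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sd, alpha=0.05, power=0.80):
    """Approximate participants per group to detect a difference `delta`.

    Two-sided test, equal group sizes, common standard deviation,
    normal approximation to the two-sample t-test.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(2.0 * sd ** 2 * (z_alpha + z_power) ** 2 / delta ** 2)


# Difference of 15 pounds; standard deviation anywhere from 10 to 15 pounds.
for sd in (10.0, 12.5, 15.0):
    print(f"SD = {sd:4.1f} lb -> about {n_per_group(15.0, sd)} per group")
```

Running the calculation across the plausible range of standard deviations shows how sensitive the answer is to that input. A t-based calculation typically returns slightly larger numbers than this approximation, especially when the groups are small.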

28. Other Types of Hypothesis Tests Different methods of data analysis require different input for sample size calculations

29. Cox Regression (Survival analysis)

30. Logistic Regression

31. Repeated measures

32. Simple designs may not require complex calculations Read chapter 2 of Statistical Rules of Thumb, by Gerald van Belle (2002, John Wiley and Sons) Using specialized software is useful if many calculations will be performed

33. Important to remember: Pilot studies do not need sample size calculation! There is no point in doing power analysis after the study is done. Sample size is an educated guess, and it works only if: the study samples come from the same or similar populations as the pilot study populations; the population of interest is not changing over time; the difference or association being studied exists.

34. How about Effect Size? Most common definition: $E = \frac{\mu_1 - \mu_2}{s_{\text{pooled}}}$. If we change the value of E, how do we know what we changed in the formula?
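The ambiguity raised by this question can be made concrete with a short Python sketch. It assumes the usual pooled standard deviation formula, which the slide does not spell out, and the function names and numbers are hypothetical. Two quite different scenarios yield the same effect size, so a change in E alone does not say whether the difference or the variability changed.

```python
from math import sqrt


def pooled_sd(s1, n1, s2, n2):
    """Pooled standard deviation of two groups (usual weighted formula)."""
    return sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))


def effect_size(mean1, mean2, s_pooled):
    """Standardized difference between two means."""
    return (mean1 - mean2) / s_pooled


# Scenario A: difference of 15 with SD 15. Scenario B: difference of 10 with SD 10.
scenario_a = effect_size(15.0, 0.0, pooled_sd(15.0, 20, 15.0, 20))
scenario_b = effect_size(10.0, 0.0, pooled_sd(10.0, 20, 10.0, 20))
print(scenario_a, scenario_b)
```

Specifying the difference and the standard deviation separately keeps the clinical meaning of the target difference visible.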

35. Some situations I have encountered. Question: "How many more people do I need to enroll in the study (already in progress) to show statistical significance?" Answer: It depends... If the two populations have the same mean, increasing the sample size will not help! Since when is the objective of a study to find a statistically significant result?

36. Some situations I have encountered. Researcher is interested in outcome A, which differs very little for two treatments. Sample size needed is around 3000! Researcher changes the outcome to B, where the sample size is smaller. B does not answer the researcher's question, and he needs to accept that his new treatment is not really different (clinically speaking) from the existing treatment.

37. Some situations I have encountered. Researcher is interested in comparing two groups regarding prediction of outcome A by using a regression analysis (using several variables). He uses the only available formula from his statistics book (for a t-test). Wrong! He should find software that can calculate the sample size appropriately.

38. Summary: Define the research question well. Consider study design, type of response variable, and type of data analysis. Decide on the type of difference or change you want to detect (make sure it answers your research question). Choose $\alpha$ and $\beta$. Use the appropriate equation for the sample size calculation.
