Sample Size Calculation

Asst. Prof. Dr. ขวัญเกศ กนิษฐานนท์, Faculty of Veterinary Medicine, Khon Kaen University

Why • To complete a research proposal • To reduce unnecessary expense (time, labor, money, materials) • To avoid useless research

Type of Experiment • 1. Survey • Observational Study • 2. Test a Hypothesis • Observational or Experimental Study

1. Sample Size for Prevalence Survey • For dichotomous data (only 2 outcomes: sick/not sick, male/female, dead/alive) • The study reports results as percentages • For example, a disease prevalence survey

Sample Size for Prevalence Survey • n = 4 x P x Q / L² • P = Estimated prevalence (percentage) • Q = 1 - P (100 - P when P is a percentage) • L = Allowable error

Definition • P = Estimated prevalence (percentage) • Obtained from a pilot study, published papers, or experience • Q = 1 - P (100 - P when P is a percentage) • L = Allowable error • The percentage by which the estimate is allowed to deviate from the true value (set by the researcher; it should probably not exceed one fifth of P) • L, Q, and P are in the same unit

L: Allowable Error • Suppose the survey aims to estimate the true prevalence of a disease in a population • The estimate obtained from the survey will be within +/- L% of the true prevalence (with about 95% confidence, since the factor 4 in the formula approximates 1.96²)

Example • A survey is to estimate prevalence of influenza virus infection in school kids • Suppose the available evidence suggests that approximately 20% (P=20) of the children will have antibodies to the virus • Assume the investigator wants to estimate the prevalence within 6% of the true value (6% is called allowable error; L)

Example • The required sample size is • n = (4 x 20 x 80) / (6 x 6) = 177.78 • Thus approximately 180 kids would be needed for the survey • Note: the population size does not appear in the formula
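The arithmetic in this example can be checked with a short Python sketch. The function name `prevalence_sample_size` is hypothetical and introduced only for illustration; it assumes P and L are both given in percent, as in the example.

```python
import math


def prevalence_sample_size(p_percent: float, l_percent: float) -> float:
    """Sample size for a prevalence survey: n = 4 * P * Q / L^2.

    P and L are in percent, so Q = 100 - P (same unit as P and L).
    """
    if not 0 < p_percent < 100:
        raise ValueError("P must be between 0 and 100 percent")
    if l_percent <= 0:
        raise ValueError("L must be positive")
    q_percent = 100.0 - p_percent
    return 4.0 * p_percent * q_percent / l_percent ** 2


# Influenza example: P = 20%, L = 6%
n = prevalence_sample_size(20, 6)
print(round(n, 2))   # 177.78
print(math.ceil(n))  # 178, which the slide rounds up further to about 180
```

Rounding up with `math.ceil` gives the minimum whole number of subjects; rounding further to a convenient figure, as the slide does, is a practical choice rather than part of the formula.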

2. Sample Size for Estimation of the Mean • A Survey to find an average of a parameter (birth weight, antibody titre, blood pressure) • The study reports average of parameters • The parameter must be quantitative

Sample Size for Estimation of the Mean • n = 4 x S² / L² • S = Standard deviation of the parameter • L = Allowable error • S and L are in the same unit • The average found in the survey will be within +/- L of the true population mean

Example • Suppose an investigator has some evidence suggesting that the standard deviation of rat weight is about 455 g • He wishes to provide an estimate within 80 g of the true average (80 g is the allowable error, L)

Example • The required sample size is n = 4 x (455)² / (80)² = 129.39 • Thus approximately 130 rats would be needed.
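The same calculation as a minimal Python sketch. The function name `mean_sample_size` is hypothetical; it assumes S and L are given in the same unit (grams here).

```python
import math


def mean_sample_size(sd: float, allowable_error: float) -> float:
    """Sample size for estimating a mean: n = 4 * S^2 / L^2."""
    if sd <= 0 or allowable_error <= 0:
        raise ValueError("S and L must be positive")
    return 4.0 * sd ** 2 / allowable_error ** 2


# Rat weight example: S = 455 g, L = 80 g
n = mean_sample_size(455, 80)
print(round(n, 2))   # 129.39
print(math.ceil(n))  # 130
```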

3. Sample Size to Compare Percentages • A study to compare percentages of outcomes from different groups (incidence, conversion rate, cure rate, mortality rate, survival rate) • For chi-square analysis or logistic regression (one predictor)

Sample Size to Compare Percentages • Pc = Percentage in the control group • Qc = 1 - Pc • Pe = Percentage in the experimental group • Qe = 1 - Pe • Pick the 2 groups that you think will differ most

Sample Size to Compare Percentages • n = C x (Pc x Qc + Pe x Qe) / d² + 2/d + 2 (n is the number per group) • d = Difference between the two groups (must be positive) • C = Constant (see table on the next page) • Pc and Pe come from a pilot study or published papers and are entered as proportions (0 to 1)

C: Constant • When power is 80% and alpha = 0.05, C = 7.85 • Power = the ability to find a significant difference when the two groups are really different • The formula is for a two-sided difference
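The table of constants did not survive on this slide. As a hedged sketch, the following assumes C follows the common definition C = (z for 1 - alpha/2 + z for power)², which gives 7.85 at alpha = 0.05 and 80% power. The function name `power_constant` is hypothetical, and the slide's own table may list slightly different rounded values.

```python
from statistics import NormalDist


def power_constant(alpha: float = 0.05, power: float = 0.80) -> float:
    """Constant C for a two-sided test, assumed to be (z_{1-alpha/2} + z_{power})^2."""
    if not 0 < alpha < 1 or not 0 < power < 1:
        raise ValueError("alpha and power must be between 0 and 1")
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = standard_normal.inv_cdf(power)
    return (z_alpha + z_power) ** 2


print(round(power_constant(0.05, 0.80), 2))  # 7.85
print(round(power_constant(0.05, 0.90), 2))  # 10.51
```

Raising the power or lowering alpha increases C, and therefore the required sample size.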

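Example • Suppose we want to test whether a control group differs from a drug-treated group in the treatment of a certain disease. The outcome measured in the experiment is the survival rate in each group • Pc = 0.25, Pe = 0.65, then d = 0.4; choose alpha = 0.05 and power = 80%, so C = 7.85

Example • n = 7.85 x (0.25 x 0.75 + 0.65 x 0.35) / (0.4)² + 2/0.4 + 2 = 27.36 • Use 28 animals in each group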

Example 2 • The research question is whether smokers have a greater incidence of skin cancer than nonsmokers • A review of previous literature suggests that the incidence of skin cancer is about 0.2 in nonsmokers • At alpha=0.05, and power=80%, how many smokers and nonsmokers will need to be studied to determine whether skin cancer incidence is at least 0.3 in smokers?

Example 2 • Null Hypothesis: The incidence of skin cancer does not differ between smokers and nonsmokers • Alternative Hypothesis: The incidence of skin cancer differs between smokers and nonsmokers • (Note that this is a two-tailed hypothesis)

Example 2 • Pe = 0.3, Pc = 0.2, d = 0.1, C = 7.85 • n = 7.85 x (0.2 x 0.8 + 0.3 x 0.7) / (0.1)² + 2/0.1 + 2 = 312.45 • Use 313 persons in each group
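Both worked examples for comparing percentages can be reproduced with the sketch below. It assumes the per-group formula n = C x (Pc x Qc + Pe x Qe) / d² + 2/d + 2 with C = 7.85 (alpha = 0.05, power = 80%), which matches the two results quoted above (27.36 and 312.45). The function name `compare_percentages_n` is hypothetical, and proportions are entered on a 0 to 1 scale.

```python
import math


def compare_percentages_n(p_control: float, p_experimental: float,
                          c: float = 7.85) -> float:
    """Per-group sample size: n = C * (Pc*Qc + Pe*Qe) / d^2 + 2/d + 2."""
    for p in (p_control, p_experimental):
        if not 0 < p < 1:
            raise ValueError("proportions must be between 0 and 1")
    d = abs(p_experimental - p_control)  # the difference must be positive
    if d == 0:
        raise ValueError("the two proportions must differ")
    spread = p_control * (1 - p_control) + p_experimental * (1 - p_experimental)
    return c * spread / d ** 2 + 2 / d + 2


# Example 1: survival rate, Pc = 0.25, Pe = 0.65
n1 = compare_percentages_n(0.25, 0.65)
print(round(n1, 2), math.ceil(n1))  # 27.36 28

# Example 2: skin cancer incidence, Pc = 0.2, Pe = 0.3
n2 = compare_percentages_n(0.2, 0.3)
print(round(n2, 2), math.ceil(n2))  # 312.45 313
```

The contrast between the two examples shows how strongly the required n depends on d: shrinking the difference from 0.4 to 0.1 raises n from 28 to 313 per group.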

4. Sample Size to Compare Means • Hypothesis: Compare means of different groups • The parameters are quantitative (birth weight, blood pressure) • Select the 2 groups that you think will differ most (such as a control and a treatment group) • For t-test, ANOVA

Sample Size to Compare Means • n = 1 + 2 x C x (S/d)² (n is the number per group) • S = Standard deviation of the variable • d = Difference between the 2 groups • C = Constant (from previous table)

Example • The research question is to compare the efficacy of metaproterenol and theophylline in the treatment of asthma • The outcome variable is FEV1 (forced expiratory volume in 1 second) 1 hour after treatment • A previous study has reported that the mean FEV1 in persons with treated asthma was 2.0 litres, with a standard deviation of 1.0 litre • The investigator would like to be able to detect a difference of 10% or more in mean FEV1 between the two treatment groups

Example • Null Hypothesis : Mean FEV1 is the same in asthmatics treated with theophylline as in those treated with metaproterenol • Alternative Hypothesis : Mean FEV1 is different between asthmatic patients treated with theophylline and those treated with metaproterenol • (This is a two-tailed hypothesis)

Example • S = 1 litre • d = 10% of 2 litres = 0.2 litre • C = 7.85 (alpha = 0.05, power = 80%) • n = 1 + 2 x 7.85 x (1/0.2)² = 393.5 • Then use 394 patients in each group
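The FEV1 result can be reproduced with the sketch below. It assumes the per-group formula n = 1 + 2 x C x (S/d)² with C = 7.85 (alpha = 0.05, power = 80%), which matches the 393.5 quoted above. The function name `compare_means_n` is hypothetical.

```python
import math


def compare_means_n(sd: float, difference: float, c: float = 7.85) -> float:
    """Per-group sample size for comparing two means: n = 1 + 2 * C * (S/d)^2."""
    if sd <= 0 or difference <= 0:
        raise ValueError("S and d must be positive")
    return 1 + 2 * c * (sd / difference) ** 2


# FEV1 example: S = 1.0 litre, d = 10% of 2.0 litres = 0.2 litre
n = compare_means_n(sd=1.0, difference=0.2)
print(round(n, 1))   # 393.5
print(math.ceil(n))  # 394
```

Only the ratio S/d matters, so S and d can be in any unit as long as both use the same one.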

5. Paired Study • Pre-test/Post-test • Before/After treatment • Paired t-test analysis • More powerful than unpaired study

Example • From a pilot study, the average blood pressures before and after treatment are estimated to be 120 and 80, respectively, so d = 40 • S = 38 • n = 8.84 • Then use 9 patients (each measured before and after treatment)

What Affects Sample Size • Want a small n? • 1. Prevalence study • Large L • P far from 50% (n is maximized at P = 50%) • 2. Mean study • Small standard deviation • Large L

What Affects Sample Size • 3. Comparing percentages • Large d • 4. Comparing means (paired and unpaired) • Small standard deviation • Large d • A paired study uses fewer samples
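The effect of P and L on a prevalence survey can be checked numerically. This sketch reuses n = 4 x P x Q / L² from the prevalence example; the function name `prevalence_n` and the chosen values of P and L are illustrative assumptions, not figures from the slides.

```python
def prevalence_n(p_percent: float, l_percent: float) -> float:
    """n = 4 * P * Q / L^2 with P, Q = 100 - P, and L all in percent."""
    return 4.0 * p_percent * (100.0 - p_percent) / l_percent ** 2


# n peaks at P = 50% for a fixed allowable error (L = 5% here)
for p in (10, 30, 50, 70, 90):
    print(p, prevalence_n(p, 5))

# Doubling L cuts the required n to one quarter
print(prevalence_n(50, 5), prevalence_n(50, 10))  # 400.0 100.0
```

When no prior estimate of prevalence is available, using P = 50% therefore gives the most conservative (largest) sample size.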

The End
