## Biostat 200 Lecture 6


**Recap**

• We calculate confidence intervals to give the most plausible values for the population mean or proportion.
• We conduct hypothesis tests of a mean or a proportion to draw a conclusion about how our sample mean or proportion compares with some hypothesized value for the population mean or proportion.
• You can use 95% confidence intervals to reach the same conclusions as hypothesis tests: if the null value of the mean lies outside the 95% confidence interval, that is equivalent to rejecting the null of a two-sided test at significance level 0.05.

(Pagano and Gauvreau, Chapter 10)

**Recap**

• We draw these conclusions based on what we observed in our sample -- we will never know the true population mean or proportion.
• If the data are very different from the hypothesized mean or proportion, we reject the null.
• Example: Phase I vaccine trial -- does the candidate vaccine meet minimum thresholds for safety and efficacy?
• Statistical significance can be driven by n, and does not equal clinical or biological significance.
• On the other hand, a small study might yield a suggestive but not statistically significant result that deserves a larger follow-up study.

(Pagano and Gauvreau, Chapter 10)

**Types of error**

• Type I error: α = significance level of the test = P(reject H0 | H0 is true).
• Incorrectly rejecting the null.
• We take a sample from the population, calculate the statistics, and make inference about the true population. If we did this repeatedly, we would incorrectly reject the null 5% of the time that it is true, if α is set to 0.05.
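The "5% of the time" claim above can be checked by simulation. This sketch is not part of the lecture: it assumes Python with numpy and scipy, draws many samples from a population where H0 really is true, and counts how often a two-sided z-test at α = 0.05 rejects.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
mu0, sigma, n, alpha = 11.4, 2.0, 9, 0.05  # illustrative values
reps = 100_000

# Draw many samples from a population where H0 is TRUE (mean really is mu0),
# compute the z statistic for each, and count how often we (wrongly) reject.
samples = rng.normal(mu0, sigma, size=(reps, n))
z = (samples.mean(axis=1) - mu0) / (sigma / np.sqrt(n))
reject = np.abs(z) > stats.norm.ppf(1 - alpha / 2)

print(round(reject.mean(), 3))  # close to alpha = 0.05
```

The empirical rejection rate hovers around 0.05: the Type I error rate is exactly the significance level we chose.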
(Pagano and Gauvreau, Chapter 10)

**Types of error**

• Type II error: β = P(do not reject H0 | H0 is false).
• Incorrectly failing to reject the null.
• This happens when the test statistic is not large enough, even if the underlying distribution is different.

**Types of error**

• Remember, H0 is a statement about the population and is either true or false.
• We take a sample and use the information in the sample to try to determine the answer.
• Whether we can make a Type I error or a Type II error depends on whether H0 is true or false.
• We set α, the chance of a Type I error, and we can design our study to minimize the chance of a Type II error.

**Chance of a Type II error**

• β is the chance of failing to reject the null if the alternative is true.
• If the alternative is very different from the null, the chance of a Type II error is low.
• If the alternative is not very different from the null, the chance of a Type II error is high.
• The chance of a Type II error is lower if the SD is smaller. This is relevant because the SD for the distribution of a sample mean is σ/√n, so increasing n decreases the SD of the sampling distribution.

**Finding β, P(Type II error)**

• Example: mean age of walking
• H0: μ ≤ 11.4 months (μ0)
• Alternative hypothesis HA: μ > 11.4 months
• Known SD σ = 2
• Significance level α = 0.05
• Sample size n = 9
• We will reject the null if the z statistic (assuming σ known) is > 1.645, i.e., if X̄ > 1.645·σ/√n + μ0.
• For our example, the null will be rejected if X̄ > 13.9.

**But if the true mean is really 16, what is the probability that the null will not be rejected?**

• That is, what is the probability of a Type II error?
• The null will be rejected if the sample mean is > 13.9, and not rejected if it is ≤ 13.9.
• What is the probability of getting a sample mean of ≤ 13.9 if the true mean is 16 in a sample with n = 9?
• P(Z < (13.9 − 16)/(2/√9)) = P(Z < −3.15) = 0.0008
• So if the true mean is 16 and the sample size is 9, the probability of incorrectly failing to reject the null is 0.0008.

**Note that this depended on:**

• The hypothesized alternative population mean (16):
  • What is the probability of failing to reject the null if the true population mean is 15?
    P(Z < (13.9 − 15)/(2/√9)) = P(Z < −1.65) = 0.05
  • What is the probability of failing to reject the null if the true population mean is 14?
    P(Z < (13.9 − 14)/(2/√9)) = P(Z < −0.15) = 0.44
• The sample size n:
  • What is the probability of failing to reject the null if the true population mean is 14 and n is 100?
    P(Z < (13.9 − 14)/(2/√100)) = P(Z < −0.5) = 0.31
  • What is the probability of failing to reject the null if the true population mean is 14 and n is 1000?
    P(Z < (13.9 − 14)/(2/√1000)) = P(Z < −1.581) = 0.06

**Power**

• The power of a test is the probability that you will reject the null hypothesis when it is not true; power = 1 − β.
• Written P(reject H0 | HA).
• You can construct a power curve by plotting power versus different alternative hypotheses.
• You can also construct a power curve by plotting power versus different sample sizes (n's). This curve will allow you to see what gains you might have versus the cost of adding participants.
• Power curves are not linear -- you get to a point of diminishing returns.

**The power of a statistical test is lower for alternative values that are closer to the null value (the chance of a Type II error is higher) and higher for more extreme alternative values.**
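The β values above can be reproduced numerically. This is a supplementary sketch, not part of the lecture: it uses Python with scipy (the lecture itself uses Stata), and takes the rejection cutoff 13.9 and σ = 2 directly from the example.

```python
from math import sqrt
from scipy.stats import norm

def type2_error(true_mean, n, cutoff=13.9, sigma=2.0):
    """P(fail to reject) = P(sample mean <= cutoff | true mean), sigma known."""
    return norm.cdf((cutoff - true_mean) / (sigma / sqrt(n)))

# beta shrinks (power grows) as the alternative moves away from the null...
print(round(type2_error(16, 9), 4))    # 0.0008
print(round(type2_error(15, 9), 2))    # 0.05
print(round(type2_error(14, 9), 2))    # 0.44
# ...and as n grows
print(round(type2_error(14, 100), 2))  # 0.31
print(round(type2_error(14, 1000), 2)) # 0.06
```

Evaluating `1 - type2_error(...)` over a grid of alternative means (or of sample sizes) is exactly how the power curves described above are drawn.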
**Power and sample size**

• The power of a statistical test can be increased by increasing n.
• You can fix α = 0.05 and β = 0.20 (for 80% power) and determine n for various alternative hypotheses.
• In practice, you often have n fixed by cost. Then you can calculate how big the alternative has to be to reject the null with 80% probability, assuming the alternative is true.
• The difference between this alternative and the null is called the minimum detectable difference.
• In epidemiology, when estimating an odds ratio, it is called the minimum detectable odds ratio.

**Comparison of two means: paired t-test**

• Paired samples, continuous variables.
• Two determinations on the same person -- e.g. before and after an intervention.
• Matched samples -- measurements on pairs of persons similar in some characteristic, e.g. identical twins (matching is on genetics).
• Matching or pairing is performed to control for extraneous factors.
• Each person or pair has 2 data points, and we calculate the difference for each. Then we can use our one-sample methods to test hypotheses about the value of the difference.

(Pagano and Gauvreau, Chapter 11)

**Two independent samples vs. paired data** (schematic)

**Comparison of two means: paired t-test**

• Step 1: The hypothesis (two sided)
• Generically:
  H0: μ1 − μ2 = δ
  HA: μ1 − μ2 ≠ δ
• Often δ = 0 (no difference), so:
  H0: μ1 − μ2 = 0, i.e. H0: μ1 = μ2
  HA: μ1 − μ2 ≠ 0, i.e. HA: μ1 ≠ μ2

**Comparison of two means: paired t-test**

• Step 1: The hypothesis (one sided)
• Generically:
  H0: μ1 − μ2 ≥ δ  or  H0: μ1 − μ2 ≤ δ
  HA: μ1 − μ2 < δ      HA: μ1 − μ2 > δ
• Often δ = 0 (no difference), so:
  H0: μ1 ≥ μ2  or  H0: μ1 ≤ μ2
  HA: μ1 < μ2      HA: μ1 > μ2

**Comparison of two means: paired t-test**

• Step 2: Determine the compatibility with the null hypothesis. With d̄ the mean and s_d the standard deviation of the n within-pair differences, the test statistic is

  t = (d̄ − δ) / (s_d/√n), with n − 1 degrees of freedom

**Comparison of two means: paired t-test**

• Step 3: Reject or fail to reject the null.
• Is the p-value (the probability of observing a difference as large or larger, under the null hypothesis) greater than or less than the significance level α?

**Paired t-test example**

```
. list

     +-----------+
     | wt1   wt2 |
     |-----------|
  1. |  55    58 |
  2. |  90    87 |
  3. |  48    55 |
  4. |  76    65 |
  5. |  64    77 |
     |-----------|
  6. |  84    92 |
  7. |  56    57 |
  8. |  81    76 |
  9. |  92    93 |
 10. |  65    72 |
     +-----------+

. gen wtdiff = wt1 - wt2

. summ wtdiff, detail

                           wtdiff
-------------------------------------------------------------
      Percentiles      Smallest
 1%          -13            -13
 5%          -13             -8
10%        -10.5             -7       Obs                  10
25%           -7             -7       Sum of Wgt.          10

50%           -2                      Mean               -2.1
                        Largest       Std. Dev.      7.093816
75%            3             -1
90%            8              3       Variance       50.32222
95%           11              5       Skewness       .3296938
99%           11             11       Kurtosis       2.396264

. di -2.1/(7.094/sqrt(10))
-.93611264

. di 2*ttail(9,.936113)
.37365191

. ttest wt1 == wt2

Paired t test
------------------------------------------------------------------------------
Variable |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
     wt1 |      10        71.1    4.933896    15.60235    59.93875    82.26125
     wt2 |      10        73.2    4.535784    14.34341    62.93934    83.46066
---------+--------------------------------------------------------------------
    diff |      10        -2.1    2.243262    7.093816    -7.17461     2.97461
------------------------------------------------------------------------------
     mean(diff) = mean(wt1 - wt2)                                 t =  -0.9361
 Ho: mean(diff) = 0                              degrees of freedom =        9

 Ha: mean(diff) < 0           Ha: mean(diff) != 0           Ha: mean(diff) > 0
 Pr(T < t) = 0.1868         Pr(|T| > |t|) = 0.3736          Pr(T > t) = 0.8132
```

**Comparison of two means: t-test**

• The goal is to compare means from two independent samples, i.e. two different populations.
• E.g. vaccine versus placebo group
• E.g. women with adequate versus inadequate micronutrient levels

(Pagano and Gauvreau, Chapter 11)

**Comparison of two means: t-test**

• Two-sided hypothesis
  H0: μ1 = μ2, i.e. μ1 − μ2 = 0
  HA: μ1 ≠ μ2, i.e. μ1 − μ2 ≠ 0
• One-sided hypothesis
  H0: μ1 ≥ μ2, i.e. μ1 − μ2 ≥ 0
  HA: μ1 < μ2
• One-sided hypothesis
  H0: μ1 ≤ μ2
  HA: μ1 > μ2

(Pagano and Gauvreau, Chapter 11)

**Comparison of two means: t-test**

• Even though the null and alternative hypotheses are the same as for the paired t-test, the test is different; it is wrong to use a paired t-test with independent samples, and vice versa.

(Pagano and Gauvreau, Chapter 11)

**Comparison of two means: t-test**

• By the CLT, X̄1 − X̄2 is normally distributed with mean μ1 − μ2 and standard deviation √(σ1²/n1 + σ2²/n2).
• In one version of the t-test, we assume that the population standard deviations are equal, so σ1 = σ2 = σ.
• Substituting, the standard deviation for the distribution of the difference of two sample means is σ·√(1/n1 + 1/n2).
• So we can calculate a z-score for the difference in the means and compare it to the standard normal distribution.
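As a cross-check on the paired analysis above (not part of the lecture, which uses Stata): a paired t-test is just a one-sample t-test on the differences, and `scipy.stats.ttest_rel` in Python reproduces the Stata result.

```python
from scipy import stats

# Weights from the lecture's paired example
wt1 = [55, 90, 48, 76, 64, 84, 56, 81, 92, 65]
wt2 = [58, 87, 55, 65, 77, 92, 57, 76, 93, 72]

# Paired t-test: equivalent to a one-sample t-test on wt1[i] - wt2[i]
res = stats.ttest_rel(wt1, wt2)
print(round(res.statistic, 3))  # -0.936, matching Stata's t = -0.9361
print(round(res.pvalue, 3))     # 0.374, matching Pr(|T| > |t|) = 0.3736
```

With p = 0.37 ≥ α = 0.05, we fail to reject the null of no mean weight difference, exactly as in the Stata output.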
The test statistic is

  z = ((x̄1 − x̄2) − (μ1 − μ2)) / (σ·√(1/n1 + 1/n2))

(Pagano and Gauvreau, Chapter 11)

**Comparison of two means: t-test**

• If the σ's are unknown (pretty much always), we substitute sample standard deviations, s, and compare the test statistic to the t distribution.
• The t-test statistic is

  t = ((x̄1 − x̄2) − (μ1 − μ2)) / (s_p·√(1/n1 + 1/n2))

• The pooled variance is a weighted average of the individual sample variances:

  s_p² = ((n1 − 1)s1² + (n2 − 1)s2²) / (n1 + n2 − 2)

• The degrees of freedom for the test are n1 + n2 − 2.

(Pagano and Gauvreau, Chapter 11)

**Comparison of two means: t-test**

• As in our other hypothesis tests, compare the t statistic to the t distribution to determine the probability of obtaining a mean difference as large or larger than the observed difference.
• Reject the null if the probability, the p-value, is less than α, the significance level.
• Fail to reject the null if p ≥ α.

**Comparison of two means, example**

• Study of the non-pneumatic anti-shock garment (NASG) (Miller et al.)
• Two groups: the pre-intervention group received usual treatment; the intervention group received the NASG.
• Comparison of hemorrhaging in the two groups.
• Null hypothesis: hemorrhaging is the same in the two groups. H0: μ1 = μ2; HA: μ1 ≠ μ2
• The data (external blood loss):
  Pre-intervention group (n = 83): mean = 340.4, SD = 248.2
  Intervention group (n = 83): mean = 73.5, SD = 93.9

(Pagano and Gauvreau, Chapter 11)

**Calculating by hand**

• First calculate the pooled variance:
  s_p² = (82·248.2² + 82·93.9²)/(83 + 83 − 2) = 35210.2
• tstat = (340.4 − 73.5)/√(35210.2·(2/83)) = 9.16
• df = 83 + 83 − 2 = 164

```
. di 2*ttail(164,9.16)
2.041e-16
```

**Comparison of two means, example**

```
* ttesti n1 mean1 sd1 n2 mean2 sd2
. ttesti 83 340.4 248.2 83 73.5 93.9

Two-sample t test with equal variances
------------------------------------------------------------------------------
         |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |      83       340.4    27.24349       248.2     286.204     394.596
       y |      83        73.5    10.30686        93.9    52.99636    94.00364
---------+--------------------------------------------------------------------
combined |     166      206.95    17.85377    230.0297    171.6987    242.2013
---------+--------------------------------------------------------------------
    diff |               266.9    29.12798                209.3858    324.4142
------------------------------------------------------------------------------
    diff = mean(x) - mean(y)                                      t =   9.1630
 Ho: diff = 0                                    degrees of freedom =      164

 Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 1.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 0.0000
```

(Pagano and Gauvreau, Chapter 11)

**Confidence interval for the difference in means**

• You can calculate a 95% confidence interval for the difference in the means.
• If the confidence interval for the difference does not include 0, then you can reject the null hypothesis of no difference.
• This is NOT equivalent to calculating separate confidence intervals for each mean and determining whether they overlap.

**Comparison of two means: t-test**

• The t-test above assumes equal variances in the two underlying populations, i.e. that both samples come from populations with variance σ².
• If we do not assume equal variances we use a slightly different test statistic: the variances are not assumed equal, so you do not use a pooled estimate, and there is another formula for the degrees of freedom.
• Often the two different t-tests yield the same answer, but it's important to check if you don't have good reason to assume equal variances.

(Pagano and Gauvreau, Chapter 11)

**Unequal variances**

The t statistic is

  t = ((x̄1 − x̄2) − (μ1 − μ2)) / √(s1²/n1 + s2²/n2)

and the approximate (Satterthwaite) degrees of freedom are

  df = (s1²/n1 + s2²/n2)² / [ (s1²/n1)²/(n1 − 1) + (s2²/n2)²/(n2 − 1) ]

rounded to an integer to get the degrees of freedom (rounding down is the conservative choice).

**Comparison of two means, example**

```
. ttesti 83 340.4 248.2 83 73.5 93.9, unequal

Two-sample t test with unequal variances
------------------------------------------------------------------------------
         |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
       x |      83       340.4    27.24349       248.2     286.204     394.596
       y |      83        73.5    10.30686        93.9    52.99636    94.00364
---------+--------------------------------------------------------------------
combined |     166      206.95    17.85377    230.0297    171.6987    242.2013
---------+--------------------------------------------------------------------
    diff |               266.9    29.12798                209.1446    324.6554
------------------------------------------------------------------------------
    diff = mean(x) - mean(y)                                      t =   9.1630
 Ho: diff = 0                     Satterthwaite's degrees of freedom =  105.002

 Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 1.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 0.0000
```

• Note that the t statistic stayed the same. This is because the sample sizes in the two groups are equal; when the sample sizes are not equal, this will not be the case.
• The degrees of freedom are decreased, so when the sample sizes are equal in the two groups this is a more conservative test.

**Test of the means of independent samples**

• When you have the data in Stata with the different groups in different variables (columns), use:
  ttest var1 == var2, unpaired
  or ttest var1 == var2, unpaired unequal
• More often, you will have the data all in one variable and the grouping in another variable. Then use:
  ttest var, by(groupvar)
  or ttest var, by(groupvar) unequal

**Testing whether BMI in our class data set differs by sex**

Null hypothesis: BMI of females = BMI of males

```
. ttest bmi, by(sex)

Two-sample t test with equal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
    male |     292    22.84854    .2370304    4.050377    22.38203    23.31506
  female |     241    24.76146    .1948815    3.025374    24.37757    25.14536
---------+--------------------------------------------------------------------
combined |     533    23.71348    .1621327    3.743124    23.39499    24.03198
---------+--------------------------------------------------------------------
    diff |           -1.912917    .3153224                -2.53235   -1.293485
------------------------------------------------------------------------------
    diff = mean(male) - mean(female)                              t =  -6.0665
 Ho: diff = 0                                    degrees of freedom =      531

 Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000

. ttest bmi, by(sex) unequal

Two-sample t test with unequal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
    male |     292    22.84854    .2370304    4.050377    22.38203    23.31506
  female |     241    24.76146    .1948815    3.025374    24.37757    25.14536
---------+--------------------------------------------------------------------
combined |     533    23.71348    .1621327    3.743124    23.39499    24.03198
---------+--------------------------------------------------------------------
    diff |           -1.912917    .3068586               -2.515736   -1.310099
------------------------------------------------------------------------------
    diff = mean(male) - mean(female)                              t =  -6.2339
 Ho: diff = 0                     Satterthwaite's degrees of freedom =  525.975

 Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000
```

**Confidence interval for the difference of two means from independent samples, when unequal variances are assumed**

  (x̄1 − x̄2) ± t* · √(s1²/n1 + s2²/n2)

where t* is the critical value from the t distribution with the Satterthwaite degrees of freedom.

**Comparison of two proportions**

• Similar to comparing two means.
• Null hypothesis about two proportions, p1 and p2: H0: p1 = p2; HA: p1 ≠ p2
• If n1 and n2 are sufficiently large, the difference between the two proportions follows a normal distribution. So we can use the z statistic to find the probability of observing a difference as large as we do, under the null hypothesis of no difference.

(Pagano and Gauvreau, Chapter 14)

**Comparison of two proportions**

• Example: HIV prevalence among the men and women testing at Mulago Hospital
• N = 466 males; 107 (23.0%) tested HIV+
• N = 467 females; 162 (34.7%) tested HIV+
• Null hypothesis: the HIV prevalence in males and females is the same, H0: p1 = p2.
• The z statistic is calculated with the pooled proportion p̂:
  p̂ = (107 + 162)/(466 + 467) = 0.288
  zstat = (0.230 − 0.347)/√(0.288·(1 − 0.288)·(1/466 + 1/467)) = −0.117/0.030 = −3.95
• 2·P(Z < −3.95) < 0.001

(Pagano and Gauvreau, Chapter 14)

**Comparison of two proportions**

```
* prtesti n1 p1 n2 p2
. prtesti 466 .23 467 .347

Two-sample test of proportion               x: Number of obs      =        466
                                            y: Number of obs      =        467
------------------------------------------------------------------------------
    Variable |       Mean   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           x |        .23   .0194947                      .1917911    .2682089
           y |       .347   .0220274                      .3038271    .3901729
-------------+----------------------------------------------------------------
        diff |      -.117   .0294151                     -.1746525   -.0593475
             |  under Ho:   .0296673   -3.94   0.000
------------------------------------------------------------------------------
        diff = prop(x) - prop(y)                                  z =  -3.9437
    Ho: diff = 0

    Ha: diff < 0              Ha: diff != 0                 Ha: diff > 0
 Pr(Z < z) = 0.0000      Pr(|Z| < |z|) = 0.0001          Pr(Z > z) = 1.0000
```

(Pagano and Gauvreau, Chapter 14)

**Comparison of two proportions**

```
. prtest hiv, by(sex)

Two-sample test of proportion               0: Number of obs      =        466
                                            1: Number of obs      =        467
------------------------------------------------------------------------------
    Variable |       Mean   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
           0 |   .2296137   .0194832                      .1914274    .2678001
           1 |   .3468951   .0220258                      .3037253    .3900649
-------------+----------------------------------------------------------------
        diff |  -.1172813   .0294063                     -.1749167    -.059646
             |  under Ho:   .0296598   -3.95   0.000
------------------------------------------------------------------------------
        diff = prop(0) - prop(1)                                  z =  -3.9542
    Ho: diff = 0

    Ha: diff < 0              Ha: diff != 0                 Ha: diff > 0
 Pr(Z < z) = 0.0000      Pr(|Z| < |z|) = 0.0001          Pr(Z > z) = 1.0000
```

(Pagano and Gauvreau, Chapter 14)

**Comparison of several means: analysis of variance**

• A different test is needed when we want to test for differences among more than two independent samples.
• This test is called the analysis of variance, or ANOVA.
• Null hypothesis: all means are equal, H0: μ1 = μ2 = μ3 = …
• The alternative HA is that at least one of the means differs from the others.

(Pagano and Gauvreau, Chapter 12)

**Testing each pair of means would increase the chance of making a Type I error (rejecting the null when it is true)**

• Say the null of no difference is true, and with three groups we run all 3 pairwise tests.
• Then P(Type I error) = P(reject H0 | H0 is true) = α for each test.
• P(no Type I error on 1 test) = 1 − α
• P(no Type I error on all 3 tests) = (1 − α)³
• If α = 0.05, then (1 − α)³ = 0.857
• P(at least 1 Type I error on 3 tests) = 1 − 0.857 = 0.143
• So there is a reasonable chance (0.143) that at least one null will be rejected even if it is true.
• This is known as the multiple comparison problem.

**Comparison of several means: analysis of variance**

• Why is it called analysis of variance? The test compares the between-group variability (how different the group means are from the overall mean) to the within-group variability.

(Pagano and Gauvreau, Chapter 12)
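The multiple-comparison arithmetic above can be sketched in a few lines. This supplementary Python snippet (not from the lecture) assumes, as the slide's calculation does, that the pairwise tests are independent.

```python
from itertools import combinations

alpha, k_groups = 0.05, 3
# Number of pairwise comparisons among k groups: here 3 choose 2 = 3 tests
n_tests = len(list(combinations(range(k_groups), 2)))

# Family-wise chance of at least one false rejection across independent tests
fwer = 1 - (1 - alpha) ** n_tests
print(n_tests, round(fwer, 3))  # 3 0.143
```

The family-wise error rate grows quickly with the number of groups (e.g. 5 groups give 10 pairwise tests), which is why a single ANOVA F-test is preferred to many pairwise t-tests.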