Comparing Two Population Means
There are two general research strategies that can be used to compare the two populations of interest:
Between-subjects, or independent samples, design
We will focus on the independent samples case first.
OBSERVATIONAL STUDY or EXPERIMENT?
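Observational Comparative Study
Population 1 (mean $\mu_1$, standard deviation $\sigma_1$): sample size = $n_1$. Calculate: $\bar{x}_1$ and $s_1$.
Population 2 (mean $\mu_2$, standard deviation $\sigma_2$): sample size = $n_2$. Calculate: $\bar{x}_2$ and $s_2$.
Compare: $\mu_1$ vs. $\mu_2$
To make inferences use:
Hypothesis test, CI for difference in means, effect size (d)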
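Comparative Experiment
Population: randomly assign subjects to one of two treatments.
Treatment 1 (mean $\mu_1$, standard deviation $\sigma_1$): sample size = $n_1$. Calculate: $\bar{x}_1$ and $s_1$.
Treatment 2 (mean $\mu_2$, standard deviation $\sigma_2$): sample size = $n_2$. Calculate: $\bar{x}_2$ and $s_2$.
Compare: $\mu_1$ vs. $\mu_2$
To make inferences use:
Hypothesis test, CI for difference in means, effect size (d)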
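In general, we can always compare two means by seeing how their difference $(\mu_1 - \mu_2)$ compares to 0:
$\mu_1 = \mu_2$ is equivalent to $(\mu_1 - \mu_2) = 0$
$\mu_1 > \mu_2$ is equivalent to $(\mu_1 - \mu_2) > 0$
$\mu_1 < \mu_2$ is equivalent to $(\mu_1 - \mu_2) < 0$
Note: If we wanted to establish that one mean was, e.g., at least 10 units larger than the other, we could replace 0 in these statements by 10. In general, to establish a difference of at least D units, we replace 0 by D.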
For testing equality
Ho: $\mu_1 = \mu_2$ or $(\mu_1 - \mu_2) = 0$
The possible alternatives are:
HA: $\mu_1 > \mu_2$ or $(\mu_1 - \mu_2) > 0$ (upper-tail)
HA: $\mu_1 < \mu_2$ or $(\mu_1 - \mu_2) < 0$ (lower-tail)
HA: $\mu_1 \neq \mu_2$ or $(\mu_1 - \mu_2) \neq 0$ (two-tailed)
Recall that our research hypothesis is that females have a higher mean body temperature than males; therefore we have:
$\mu_F$ = mean body temperature for females
$\mu_M$ = mean body temperature for males
Ho: $\mu_F = \mu_M$ or equivalently $(\mu_F - \mu_M) = 0$
HA: $\mu_F > \mu_M$ or equivalently $(\mu_F - \mu_M) > 0$
The basic form of the two-sample t-test statistic is
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{SE(\bar{x}_1 - \bar{x}_2)}$$
which, provided the test's assumptions are satisfied, has an approximate t-distribution (for the df, see 3 below).
The standard error of the difference in sample means is calculated in two different ways, depending on whether or not we assume the population variances are equal, i.e., can we assume $\sigma_1^2 = \sigma_2^2$?
Yes: use the pooled t-test.
No: use Welch’s t-test.
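If the population variances are assumed equal (pooled t-test), estimate the standard error of the difference using the common pooled variance:
$$SE(\bar{x}_1 - \bar{x}_2) = \sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$
where the estimate of the common variance ($\sigma^2$) is
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$
Then the sampling distribution is a t-distribution with $n_1 + n_2 - 2$ degrees of freedom (df).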
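The pooled calculation can be sketched in a few lines of Python. This is a minimal illustration using the standard pooled-variance formula; the function name `pooled_se` and the summary statistics passed to it are hypothetical, chosen only to show the arithmetic.

```python
import math

def pooled_se(s1, n1, s2, n2):
    """Return (pooled variance, SE of the difference in means, df)."""
    df = n1 + n2 - 2
    # Weighted average of the two sample variances, with weights (n - 1).
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    # Standard error of the difference in sample means under equal variances.
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return sp2, se, df

# Hypothetical summary statistics, for illustration only.
sp2, se, df = pooled_se(s1=2.0, n1=10, s2=2.0, n2=10)
print(sp2, round(se, 4), df)
```

When both samples have the same standard deviation, the pooled variance is simply that common sample variance, which makes a convenient sanity check.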
Rule O’ Thumb: Assume variances are equal only if neither sample standard deviation is more than twice the other sample standard deviation.
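The rule of thumb is easy to express as a check. The sketch below uses hypothetical helper names (`variances_look_equal`, `choose_test`) and applies them to the two pairs of sample standard deviations that appear in this presentation (.74 and .70 for body temperature, 4.40 and .900 for gestational age).

```python
def variances_look_equal(s1, s2):
    """Rule of thumb: neither sample SD is more than twice the other."""
    return max(s1, s2) <= 2 * min(s1, s2)

def choose_test(s1, s2):
    """Pick the form of the two-sample t-test suggested by the rule of thumb."""
    return "pooled t-test" if variances_look_equal(s1, s2) else "Welch's t-test"

print(choose_test(0.74, 0.70))   # body temperature example
print(choose_test(4.40, 0.900))  # gestational age example
```

This is only a screening rule; when in doubt, Welch's t-test is the safer choice.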
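If the population variances are not assumed equal (Welch’s t-test), estimate the standard error of the difference as:
$$SE(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
Then the sampling distribution is an approximate t-distribution with a complicated formula for the df:
$$df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2/n_1)^2}{n_1 - 1} + \frac{(s_2^2/n_2)^2}{n_2 - 1}}$$
Always round the df down!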
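The unequal-variance calculation can be sketched as follows. The df formula used here is the standard Welch-Satterthwaite approximation, which is assumed to be the "complicated formula" referred to above; the function name `welch_se_df` and the input values are hypothetical.

```python
import math

def welch_se_df(s1, n1, s2, n2):
    """Return (SE of the difference in means, df rounded down)."""
    v1 = s1**2 / n1  # estimated variance of the first sample mean
    v2 = s2**2 / n2  # estimated variance of the second sample mean
    se = math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return se, math.floor(df)

# Hypothetical summary statistics, for illustration only.
se, df = welch_se_df(s1=4.0, n1=12, s2=1.0, n2=12)
print(round(se, 3), df)
```

Rounding the df down is the conservative choice when a printed t-table is used; software typically works with the fractional df directly.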
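To quantify the size of the effect, i.e. the difference in the population means, we use a confidence interval of the basic form
(estimate) ± (table value) × SE(estimate)
which for the difference in means is
$$(\bar{x}_1 - \bar{x}_2) \pm t^{*} \, SE(\bar{x}_1 - \bar{x}_2)$$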
STEP 1) State Hypotheses
$\mu_F$ = mean body temperature for females
$\mu_M$ = mean body temperature for males
Ho: $\mu_F = \mu_M$ or equivalently $(\mu_F - \mu_M) = 0$
HA: $\mu_F > \mu_M$ or equivalently $(\mu_F - \mu_M) > 0$
STEP 2) Determine Test Criteria
a) Choose $\alpha = .05$, as a Type I Error is of little consequence.
b) Use a two-sample t-test, either the pooled t-test or Welch’s t-test. To decide which form to use, we need to examine our data in terms of the equality of population variances. If uncertain, use Welch’s!
STEP 3) Collect Data and Compute Test Statistic
Take independent samples from the two populations and examine the resulting data.
Oneway Analysis > Means/Anova/Pooled t
Populations appear normally distributed and our sample sizes are “large” (> 30).
A few mild outliers in the sample for females.
Sample mean for females appears to be slightly larger than that for males.
Variation appears to be similar for both samples.
The sample standard deviation for females ($s_F$ = .74) is larger than that for males ($s_M$ = .70), although it is not twice as large; thus we assume the population variances are equal.
Because it seems reasonable to assume the population variances are approximately equal, we will use a pooled t-test.
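For the body temperature example
Assuming equal variances, the pooled estimate of the common variance is $s_p^2 = (.721)^2 \approx .520$.
So the standard error of the difference in sample means is $SE(\bar{x}_F - \bar{x}_M) = .127$.
Computing the test statistic gives
$$t = \frac{(98.39 - 98.10) - 0}{.127} = 2.28$$
[Figure: t-distribution centered at 0, showing the observed t = 2.28 and P-value = .0121]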
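As a quick arithmetic check, the test statistic can be recomputed from the summary values quoted in this example (sample means 98.39 and 98.10, standard error .127). The function name `two_sample_t` and its `null_diff` parameter are illustrative, not part of any particular software package.

```python
def two_sample_t(xbar1, xbar2, se, null_diff=0.0):
    """Basic two-sample t statistic: (observed difference - null difference) / SE."""
    return ((xbar1 - xbar2) - null_diff) / se

# Summary values from the body temperature example.
t = two_sample_t(98.39, 98.10, 0.127)
print(round(t, 2))  # 2.28
```

Setting `null_diff` to a value D other than 0 gives the statistic for testing whether the means differ by at least D units.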
STEP 4) Compute p-value
The p-value = .0121, which indicates that if the population means were in fact equal, we would have only a 1.21% chance of observing a difference in sample means this large or larger by chance variation alone.
STEP 5) Make Decision and Interpret
Because p-value < .05 we reject Ho and conclude that the mean normal body temperature for females is larger than that for males.
STEP 6) Quantify Significant Findings
Construct a 95% CI for $(\mu_F - \mu_M)$
$(98.39 - 98.10) \pm (1.98)(.127) = (.039^\circ, .541^\circ)$
We estimate that the normal mean body temperature for women is between .039°F and .541°F larger than the normal mean body temperature for men.
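The interval arithmetic above can be reproduced with a small helper. The function name `confidence_interval` is hypothetical; the inputs (difference in sample means, table value 1.98, standard error .127) are the ones used in this step.

```python
def confidence_interval(estimate, table_value, se):
    """Basic form: (estimate) +/- (table value) * SE(estimate)."""
    margin = table_value * se
    return estimate - margin, estimate + margin

# Body temperature example: difference in sample means, t table value, SE.
lower, upper = confidence_interval(98.39 - 98.10, 1.98, 0.127)
print(round(lower, 3), round(upper, 3))  # 0.039 0.541
```

Because the whole interval lies above 0, it agrees with the decision to reject Ho in favor of a higher mean for females.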
Effect Size (d)
$$d = \frac{|\bar{x}_F - \bar{x}_M|}{s_p} = \frac{.29}{.721} \approx .40$$
Thus the effect size is moderate at best, with the % overlap of the two body temperature distributions being around 72.6%. Furthermore, the absolute difference represented by the lower confidence limit (LCL = .039 deg F) hardly seems of any physiological importance.
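The effect size and the overlap figure can be checked numerically. The sketch below assumes that d is the absolute difference in sample means divided by the common standard deviation (.721), and that the 72.6% overlap corresponds to 100% minus Cohen's U1 nonoverlap measure for two normal distributions. Both function names are hypothetical.

```python
from statistics import NormalDist

def cohens_d(xbar1, xbar2, s_pooled):
    """Effect size: absolute difference in means in units of the common SD."""
    return abs(xbar1 - xbar2) / s_pooled

def percent_overlap(d):
    """Percent overlap, taken as 100% minus Cohen's U1 (assumed measure)."""
    u2 = NormalDist().cdf(abs(d) / 2)  # Cohen's U2
    u1 = (2 * u2 - 1) / u2             # Cohen's U1 (proportion of nonoverlap)
    return 100 * (1 - u1)

d = cohens_d(98.39, 98.10, 0.721)
print(round(d, 2))
print(round(percent_overlap(round(d, 1)), 1))
```

The overlap is evaluated at d rounded to one decimal place, as a printed effect-size table would do.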
Power is the probability of rejecting Ho when the difference in means is D.
Power is a function of:
Common standard deviation ($\sigma$) of both populations/groups
D = difference in population means which makes the alternative true
Common sample size ($n_1 = n_2 = n$)
Error Std. Dev = common standard deviation ($\sigma$) = .721 for the body temperature example.
Difference in Means = D = $|\mu_1 - \mu_2|$ = .29, which is the difference in the sample means for the body temperature example.
Alpha = P(Type I Error) = .05
For the body temperature example we have a power of $1 - \beta = .6240$.
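To see how power depends on these inputs, here is a rough sketch based on a normal approximation for a two-sided test. It is not the exact calculation that statistical software performs (which uses the noncentral t-distribution), so it will not reproduce the reported power exactly. The function name `approx_power` and the group sizes in the loop are hypothetical.

```python
import math
from statistics import NormalDist

def approx_power(sigma, diff, n, alpha=0.05):
    """Normal-approximation power of a two-sided two-sample test, n per group."""
    z = NormalDist()
    se = sigma * math.sqrt(2 / n)      # SE of the difference with equal n
    shift = abs(diff) / se             # true difference in SE units
    z_crit = z.inv_cdf(1 - alpha / 2)  # two-sided critical value
    # Probability the test statistic lands in either rejection region.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

# Sigma and D from the body temperature example; group sizes are hypothetical.
for n in (25, 50, 100):
    print(n, round(approx_power(0.721, 0.29, n), 3))
```

Power rises with the sample size and with D, and falls as the common standard deviation grows. When D = 0 the function returns alpha, as it should.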
We wish to compare the mean gestational age (in weeks) of babies born to women with preeclampsia during pregnancy vs. those who had normal pregnancies.
Data:
Preeclampsia: 38, 32, 42, 30, 38, 35, 32, 38, 39, 29, 29, 32
Normal: 40, 41, 38, 40, 40, 39, 39, 41, 41, 40, 40, 40
The sample standard deviations control the slopes of the reference lines; the line for preeclamptics is over 4 times steeper.
The SD for the preeclamptic group is over 4 times larger!
Formally testing equality of variances:
Test Statistic:
$F = (4.40)^2/(.900)^2 = 23.89$
i.e. the sample variance for the preeclamptic group is 23.89 times larger than the sample variance for the normal pregnancy group. The p-value comes from an F-distribution with numerator df = 11 and denominator df = 11.
We have strong evidence against equality of the variance of the response for these two populations of women (p < .0001); therefore we should not use a pooled t-test!
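The F-ratio can be recomputed directly from the gestational age data listed earlier. This sketch uses only the Python standard library, so it reports the ratio and its degrees of freedom but not the p-value; the variable names are illustrative.

```python
from statistics import variance

# Gestational ages (weeks) as listed in the data for this example.
preeclampsia = [38, 32, 42, 30, 38, 35, 32, 38, 39, 29, 29, 32]
normal = [40, 41, 38, 40, 40, 39, 39, 41, 41, 40, 40, 40]

var_pre = variance(preeclampsia)  # sample variance, divisor n - 1
var_norm = variance(normal)
f_ratio = var_pre / var_norm      # larger variance in the numerator
df_num = len(preeclampsia) - 1
df_den = len(normal) - 1
print(round(f_ratio, 2), df_num, df_den)  # 23.89 11 11
```

The ratio is computed from the unrounded sample variances.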
F-distribution
When H0 is true, the F-ratio ~ F(df1, df2).
For example, consider the F-distribution with 11 and 11 df.
Let’s say our observed value for F was F0 = 23.89. Then the P-value < .0001.
[Figure: density curve of the F-distribution with 11 and 11 df, plotted for F from 0 to 40]
Oneway Analysis > t Test (i.e. non-pooled t–Test)
We have strong evidence that the mean gestational age of babies born to preeclamptic mothers is lower than that for babies born to women with normal pregnancies (p = .0013). Furthermore, we estimate that the mean gestational age for babies born to preeclamptic mothers is between 2.59 and 8.24 weeks less than the mean gestational age for babies born to mothers with normal pregnancies.
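The pieces of the non-pooled analysis can be recomputed from the raw data as a check. This is a minimal standard-library sketch of a Welch-type calculation, assuming the Welch-Satterthwaite df approximation. It reports the difference in means, its standard error, the t statistic and the unrounded df; the p-value and the confidence limits require t-distribution tables or statistical software.

```python
import math
from statistics import mean, variance

# Gestational ages (weeks) as listed in the data for this example.
preeclampsia = [38, 32, 42, 30, 38, 35, 32, 38, 39, 29, 29, 32]
normal = [40, 41, 38, 40, 40, 39, 39, 41, 41, 40, 40, 40]

n1, n2 = len(preeclampsia), len(normal)
diff = mean(preeclampsia) - mean(normal)  # difference in sample means
v1 = variance(preeclampsia) / n1          # variance of first sample mean
v2 = variance(normal) / n2                # variance of second sample mean
se = math.sqrt(v1 + v2)                   # unpooled standard error
t_stat = diff / se
# Welch-Satterthwaite approximation to the degrees of freedom (unrounded).
df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
print(round(diff, 2), round(se, 2), round(t_stat, 2), round(df, 1))
```

The difference of about -5.4 weeks sits at the center of the reported interval (2.59 to 8.24 weeks lower), and rounding the df down gives 11 for table lookups.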