
Chapter 22 Comparing Two Proportions



- Comparisons between two percentages are much more common (and interesting) than questions about isolated percentages.
- Why? We often want to know how two groups differ, whether a treatment is better than a placebo control, or whether this year’s results are better than last year’s.

Suppose you are interested in whether men and women differ with regard to how often they wash their hands in public restrooms.

- In order to examine the difference between two proportions, we need another standard deviation formula…
- Recall that standard deviations don’t add, but variances do.

Two empty fields are used as parking lots for concerts and festivals. The number of vehicles that can park in Lot A has a mean of 219 and standard deviation of 13. Lot B can hold an average of 193 cars with a standard deviation of 11.

a. What is the expected difference in the number of vehicles parked in the two lots?

b. Find the standard deviation of that difference.
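
The parking-lot question can be checked numerically. The short Python sketch below assumes the two lots fill independently of each other (the rule that variances add depends on this); the variable names are illustrative only.

```python
import math

# Values given in the problem statement
mean_a, sd_a = 219, 13   # Lot A
mean_b, sd_b = 193, 11   # Lot B

# a. Expected values subtract directly
expected_diff = mean_a - mean_b

# b. Assuming independence, variances add (even for a difference),
#    so the SD of the difference is the square root of the summed variances
sd_diff = math.sqrt(sd_a**2 + sd_b**2)

print(expected_diff)       # 26
print(round(sd_diff, 2))   # 17.03
```

Note that the standard deviations themselves (13 and 11) are never added or subtracted; only their squares are combined.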

- Proportions observed in independent random samples are independent. Thus, we can add their variances. So…
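- The standard deviation (really “standard error”) of the difference between two sample proportions is
$SE(\hat{p}_1 - \hat{p}_2) = \sqrt{\dfrac{\hat{p}_1\hat{q}_1}{n_1} + \dfrac{\hat{p}_2\hat{q}_2}{n_2}}$, where $\hat{q} = 1 - \hat{p}$.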

- SRS (or RAT): EACH sample is an SRS from its own population (or the 2 experimental groups were randomly assigned to treatments)
- 10% Condition: n1 and n2 are each less than 10% of their respective populations
- Sample Size Condition (normality): Both groups are big enough that at least 5 successes and 5 failures have been observed in each.
- Independent Samples: The two groups we’re comparing must be independent of each other (i.e., drawn independently).

- We already know that for large enough samples, each of our proportions has an approximately Normal sampling distribution.
- The same is true of their difference.

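- When the conditions are met, we are ready to find the confidence interval for the difference of two proportions.
- The confidence interval is
$(\hat{p}_1 - \hat{p}_2) \pm z^* \times SE(\hat{p}_1 - \hat{p}_2)$
where
$SE(\hat{p}_1 - \hat{p}_2) = \sqrt{\dfrac{\hat{p}_1\hat{q}_1}{n_1} + \dfrac{\hat{p}_2\hat{q}_2}{n_2}}$ and $z^*$ is the critical value for the chosen confidence level.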

We are ___% confident that the true proportion of [p1 in context] is between ___% and ___% [more/less] than the proportion of [p2 in context].

Suppose you are interested in whether men and women differ with regard to how often they wash their hands in public restrooms. Researchers monitored the behavior of public restroom users at major venues such as Turner Field and Grand Central Station and found that 2393 out of 3206 men washed their hands and 2802 of 3130 women washed their hands. Create a 95% confidence interval to describe the difference.
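
One way to carry out this calculation is sketched below in Python, using only the standard library. The function name `two_prop_z_interval` and the choice to treat women as group 1 are assumptions made for illustration; the conditions (random samples, 10%, sample size, independence) still have to be checked separately.

```python
import math
from statistics import NormalDist

def two_prop_z_interval(x1, n1, x2, n2, conf=0.95):
    """Confidence interval for p1 - p2 from two independent samples."""
    p1, p2 = x1 / n1, x2 / n2
    # Unpooled standard error: the variances of the two sample proportions add
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    # Critical value z* for the requested confidence level
    z_star = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se

# Group 1 = women, group 2 = men (counts from the study described above)
low, high = two_prop_z_interval(2802, 3130, 2393, 3206)
print(f"{low:.3f} to {high:.3f}")
```

With these counts the interval runs from roughly 13 to 17 percentage points, so a sentence in the interpretation template would say that the proportion of women who wash their hands is between about 13% and 17% more than the proportion of men.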

Example

At Community Hospital, the burn center is experimenting with a new plasma compress treatment. A random sample of 316 patients with minor burns received the plasma compress treatment. Of these patients, it was found that 259 had no visible scars after treatment. Another random sample of 419 patients with minor burns received no plasma compress treatment. For this group, it was found that 94 had no visible scars after treatment. What is a 95% confidence interval of the difference in proportion of people who had no visible scars between the plasma compress treatment & control group?

pT: proportion of people who received the plasma compress treatment and had no visible scars

pN: proportion of people who did NOT receive the plasma compress treatment and had no visible scars

Since these are all burn patients (come from the same pop.), we can add 316 + 419 = 735.

If not the same – you MUST list separately.

- Have 2 independent, randomly assigned treatment groups
- 735 < 10% of all burn patients
- nTpT = 259, nTqT = 57, nNpN = 94, nNqN = 325 -> all > 5, so we can use the Normal model (all the p’s and q’s have hats)

2-Proportion Z-Interval

We are 95% confident that the true proportion of people who received the plasma compress treatment and had no visible scars is between 53.7% and 65.4% more than the proportion of those who didn’t receive the treatment.
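
The interval quoted above can be reproduced with a few lines of Python. This is a minimal sketch using the counts from the example; the variable names are illustrative, and the check on successes and failures mirrors the "all > 5" condition listed earlier.

```python
import math
from statistics import NormalDist

x_t, n_t = 259, 316   # treatment group: no visible scars, sample size
x_n, n_n = 94, 419    # no-treatment group: no visible scars, sample size

# Observed successes and failures in each group (sample size condition)
counts = [x_t, n_t - x_t, x_n, n_n - x_n]
condition_met = all(c > 5 for c in counts)

p_t, p_n = x_t / n_t, x_n / n_n
# Unpooled standard error of the difference in sample proportions
se = math.sqrt(p_t * (1 - p_t) / n_t + p_n * (1 - p_n) / n_n)
z_star = NormalDist().inv_cdf(0.975)   # critical value for 95% confidence

low = (p_t - p_n) - z_star * se
high = (p_t - p_n) + z_star * se
print(condition_met, round(low, 3), round(high, 3))
```

The endpoints agree with the 53.7% and 65.4% reported in the conclusion once they are rounded to one decimal place as percentages.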

Ch22 (page 433) #6

- In 1995, 24.8% of 550 white adults surveyed reported that they smoke cigarettes, while 25.7% of the 550 black adults surveyed were smokers.
- Create a 90% confidence interval for the difference in percentages of smokers among black and white American adults.
- Does this survey indicate a race-based difference in smoking among American adults?
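
A possible way to set up this exercise in Python is sketched below. Because the survey reports percentages rather than counts, the sketch assumes the numbers of smokers are recovered by rounding to whole people; taking the difference as black minus white is also an assumption made for illustration.

```python
import math
from statistics import NormalDist

n_white, n_black = 550, 550
# Recover whole-number counts from the reported percentages
x_white = round(0.248 * n_white)   # 136
x_black = round(0.257 * n_black)   # 141

p_white, p_black = x_white / n_white, x_black / n_black
diff = p_black - p_white

# Unpooled standard error and the 90% critical value
se = math.sqrt(p_white * (1 - p_white) / n_white
               + p_black * (1 - p_black) / n_black)
z_star = NormalDist().inv_cdf(0.95)   # 5% in each tail for 90% confidence

low, high = diff - z_star * se, diff + z_star * se
contains_zero = low < 0 < high
print(round(low, 3), round(high, 3), contains_zero)
```

Whether the interval contains 0 is what the last question turns on: an interval that straddles 0 gives no evidence of a difference between the two groups.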

- The typical hypothesis test for the difference in two proportions is the one of no difference.
- In symbols, H0: p1 – p2 = 0
- Or H0: p1 = p2

H0: p1 = p2 (equivalently, H0: p1 - p2 = 0)

Ha: p1 > p2 (equivalently, Ha: p1 - p2 > 0)

Ha: p1 < p2 (equivalently, Ha: p1 - p2 < 0)

Ha: p1 ≠ p2 (equivalently, Ha: p1 - p2 ≠ 0)

Be sure to define both p1 & p2!

- Remember that when you find the SD in a hypothesis test, you use the p from the H0
- Since we are hypothesizing that there is no difference between the two proportions, that means that p1 and p2 are the same, and so are their standard deviations.
- Since this is the case, we combine (pool) the counts to get one overall proportion.

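- The pooled proportion is
$\hat{p}_{pooled} = \dfrac{Success_1 + Success_2}{n_1 + n_2}$, where $Success_1 = n_1\hat{p}_1$ and $Success_2 = n_2\hat{p}_2$.
If the numbers of successes are not whole numbers, round them first. (This is the only time you should round values in the middle of a calculation.)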

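- We then put this pooled value into the formula, substituting it for both sample proportions in the standard error formula:
$SE_{pooled}(\hat{p}_1 - \hat{p}_2) = \sqrt{\dfrac{\hat{p}_{pooled}\hat{q}_{pooled}}{n_1} + \dfrac{\hat{p}_{pooled}\hat{q}_{pooled}}{n_2}}$, where $\hat{q}_{pooled} = 1 - \hat{p}_{pooled}$.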
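
The pooling step can be sketched in Python as follows. The helper name `pooled_se` and the counts passed to it are hypothetical, chosen only to show the mechanics; they do not come from any example in this chapter.

```python
import math

def pooled_se(x1, n1, x2, n2):
    """Pooled proportion and pooled standard error for a two-proportion z-test."""
    # Round success counts first in case they were recovered from percentages
    x1, x2 = round(x1), round(x2)
    # Combine the counts from both groups into one overall proportion
    p_pooled = (x1 + x2) / (n1 + n2)
    # Use the pooled value in place of both sample proportions
    se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    return p_pooled, se

# Hypothetical counts: 30 successes out of 100 and 50 successes out of 100
p_pooled, se = pooled_se(30, 100, 50, 100)
print(p_pooled, round(se, 4))
```

Factoring the pooled product out of the square root gives the same value as writing the pooled term once over each sample size.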

- [P] Define p1 and p2 (in words)
- [H] We are testing the hypothesis H0: p1 = p2
- [no difference between p1 and p2]
- Alternative hypothesis either
- HA: p1 > p2 or HA: p1 < p2 or HA: p1 ≠ p2
- [A] The assumptions and conditions for the two-proportion z-test are the same as for the two-proportion z-interval.
- [N] Name the test [2-proportion z-test]

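- [T] State the significance level (usually α = .05)
- Because we hypothesize that the proportions are equal, we pool them to find
$\hat{p}_{pooled} = \dfrac{Success_1 + Success_2}{n_1 + n_2}$
- We use the pooled value to estimate the standard error:
$SE_{pooled}(\hat{p}_1 - \hat{p}_2) = \sqrt{\dfrac{\hat{p}_{pooled}\hat{q}_{pooled}}{n_1} + \dfrac{\hat{p}_{pooled}\hat{q}_{pooled}}{n_2}}$

- Now we find the test statistic:
$z = \dfrac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{SE_{pooled}(\hat{p}_1 - \hat{p}_2)}$

Usually p1 – p2 = 0 (the hypothesized difference).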

- [O] Use the Normal model to obtain a P-value.
- [M] Make a decision
- Since the p-value ([state p-value]) is [less than / greater than] α ([state α]), I will [reject / fail to reject] the null hypothesis.
- [S] State a conclusion in context.
- There [is / is not] sufficient evidence to suggest that [state HA in words]

Example

A forest in Oregon has an infestation of spruce moths. In an effort to control the moth, one area has been regularly sprayed from airplanes. In this area, a random sample of 495 spruce trees showed that 81 had been killed by moths. A second nearby area receives no treatment. In this area, a random sample of 518 spruce trees showed that 92 had been killed by the moth. Do these data indicate that the proportion of spruce trees killed by the moth is different for these areas?

pt : the proportion of trees killed by moths in the treated area

pu : the proportion of trees killed by moths in the untreated area

- Conditions:
- Both samples of spruce trees are SRS and independently selected
- Since ntpt = 81, ntqt = 414, nupu = 92, nuqu = 426 are all > 5, we can use the Normal model (all these p’s and q’s have hats)
- Reasonable to assume that 1013 is less than 10% of all spruce trees.

H0: pt=pu

Ha: pt≠pu

2-proportion z-test

xt = 81, xu = 92

nt = 495, nu = 518

α = .05

P-value = 0.5547

Since the p-value (.5547) > α (.05), I fail to reject H0. There is not sufficient evidence to suggest that the proportion of spruce trees killed by the moth is different for the treated and untreated areas.
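
The test for the spruce moth data can be reproduced with the Python sketch below, which uses only the standard library. The variable names are illustrative, and the two-sided P-value is taken from the standard Normal model, as in the steps of the test.

```python
import math
from statistics import NormalDist

x_t, n_t = 81, 495   # treated area: trees killed, sample size
x_u, n_u = 92, 518   # untreated area: trees killed, sample size

p_t, p_u = x_t / n_t, x_u / n_u

# Under H0 the proportions are equal, so pool the counts
p_pooled = (x_t + x_u) / (n_t + n_u)
se_pooled = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n_t + 1 / n_u))

# Test statistic with a hypothesized difference of 0
z = (p_t - p_u) / se_pooled

# Two-sided P-value for Ha: pt != pu
p_value = 2 * NormalDist().cdf(-abs(z))

alpha = 0.05
reject = p_value < alpha
print(round(z, 2), round(p_value, 3), reject)
```

The test statistic comes out near -0.59 and the P-value near 0.555, in line with the 0.5547 reported above (small differences in the last digit can come from rounding in calculator output), so the decision to fail to reject H0 is unchanged.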