
Sampling


- You want to make a general statement about a large group of people (a population).
- The population size makes studying everyone impractical.
- You select a part of the group (a sample) to study. You measure numerical facts of interest for the sample (statistics) to estimate the corresponding facts about the population (parameters).
- You use statistical inference to generalize from the sample to the population.

Alf Landon (Republican)

Franklin Roosevelt (Democrat)

- To predict the winner, Literary Digest magazine mailed out 10 million questionnaires to addresses from telephone books and vehicle registrations.
- 2.4 million responded: 57% said they’d vote for Landon
- The election result: Roosevelt won 62%-38%.
- (Literary Digest soon went bankrupt)

- How was the LD sample different from the population of all voters?
- Consider what kinds of people had phones and cars in 1936, and which party those people tended to vote for.
- The LD sample systematically favored wealthier people, and wealthier people tended to vote Republican.
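
The effect of a skewed sampling frame can be sketched with a small simulation. Every number below (the share of wealthier voters, the party leanings, and the chance of appearing in a phone or car listing) is invented for illustration and is not historical data; `make_population` and `republican_share` are hypothetical helper names.

```python
import random

def make_population(rng, size=100_000, wealthy_share=0.25,
                    p_rep_wealthy=0.65, p_rep_other=0.30):
    """Return (is_wealthy, votes_republican) pairs. All rates are invented."""
    population = []
    for _ in range(size):
        wealthy = rng.random() < wealthy_share
        p_rep = p_rep_wealthy if wealthy else p_rep_other
        population.append((wealthy, rng.random() < p_rep))
    return population

def republican_share(people):
    """Fraction of the given people who vote Republican."""
    return sum(votes_rep for _, votes_rep in people) / len(people)

rng = random.Random(1936)
population = make_population(rng)

# Sampling frame built from phone books and car registrations.
# Assumption: it lists 90% of wealthier voters but only 20% of everyone else.
frame = [p for p in population if rng.random() < (0.90 if p[0] else 0.20)]

biased_sample = rng.sample(frame, 10_000)       # random, but from the skewed frame
fair_sample = rng.sample(population, 10_000)    # random from the whole population

print(f"population:          {republican_share(population):.3f}")
print(f"sample from frame:   {republican_share(biased_sample):.3f}")
print(f"sample from everyone:{republican_share(fair_sample):.3f}")
```

Under these assumed rates the population should come out near 39% Republican while the frame-based sample reports roughly 51%, so the poll calls the wrong winner even with 10,000 respondents.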

- Selection bias: a systematic tendency of the sampling procedure to exclude a portion of the population
- Example: choosing at random from a phone book excludes everyone who is not listed in it.

- Non-response bias: a tendency of survey respondents to be different from those who didn’t respond.
- Sometimes indicated by a large non-response rate
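
As a quick check on the Literary Digest figures quoted earlier, the non-response rate follows directly from the number of questionnaires mailed and returned. The variable names are illustrative.

```python
mailed = 10_000_000      # questionnaires sent out
responded = 2_400_000    # questionnaires returned

response_rate = responded / mailed
non_response_rate = 1 - response_rate

print(f"response rate:     {response_rate:.0%}")
print(f"non-response rate: {non_response_rate:.0%}")
```

Roughly three out of four people who received a questionnaire never answered it, which is the kind of non-response rate that should raise suspicion.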

- If a sampling procedure is biased, a larger sample size won’t help.
- Bias can’t always be detected by looking at data. You have to ask how the sample was chosen.
- So…did pollsters fix the bias issue?
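
A short simulation illustrates why a larger sample does not rescue a biased procedure. It assumes, purely for illustration, that 57% of the people the procedure can reach favor Landon (the figure the Literary Digest respondents reported), while his true share is taken as the 38% from the election result; `poll` is a hypothetical helper.

```python
import random

TRUE_SHARE = 0.38       # Landon's share in the actual election result
REACHED_SHARE = 0.57    # assumption: Landon's share among people the procedure reaches

def poll(rng, n, share):
    """Simulate n respondents, each favoring Landon with probability `share`."""
    return sum(rng.random() < share for _ in range(n)) / n

rng = random.Random(0)
errors = {}
for n in (100, 10_000, 1_000_000):
    estimate = poll(rng, n, REACHED_SHARE)
    errors[n] = estimate - TRUE_SHARE
    print(f"n={n:>9,}  estimate={estimate:.3f}  error={errors[n]:+.3f}")
```

As n grows the estimate settles down, but it settles on the share among the people reached, not the share in the population. The error stays near 19 points at every sample size.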

Thomas Dewey (Republican)

Harry Truman (Democrat)

- Three major polls covered the election. All used large sample sizes.
- These polls all used a different sampling method from Literary Digest's: quota sampling.

- Goal: Create a sample which faithfully represents the target population with respect to key characteristics.
- Implementation: Define categories of interest (e.g. residence, sex, age, race, income). Establish a fixed number of subjects (a quota) to interview overall and in each category. Interviewers select freely within categories.

- Example: A Gallup poll interviewer was required to interview 13 people:
  - 6 from suburbs, 7 from city
  - 7 men, 6 women
  - Of the men (and similarly for the women):
    - 3 under age 40, 4 over age 40
    - 1 black, 6 white
    - Of the white men, 1 paid over $44 monthly rent, 2 paid less than $18

- The Gallup poll's quotas seem to guarantee the sample will be like the voting population in every meaningful way, yet the polls predicted a Dewey victory and Truman won. What happened?
- The interviewers were free to select within categories, and this introduced bias.
- In 1948, Republicans (in each category) were marginally easier to reach for interviews because they tended to be wealthier and better educated, and were more likely to own telephones, have addresses, etc.
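
The mechanism can be sketched in a few lines: the quotas are met exactly, yet the estimate drifts because of who gets picked inside each category. The two categories, the party shares, and the 15% "ease of reach" advantage are all invented for illustration; `quota_sample` and `rep_share` are hypothetical helpers.

```python
import random

rng = random.Random(1948)

# Hypothetical electorate: two categories, evenly split overall (rates invented).
REP_SHARE = {"city": 0.45, "suburb": 0.55}
population = [
    {"category": cat, "party": "R" if rng.random() < share else "D"}
    for cat, share in REP_SHARE.items()
    for _ in range(50_000)
]

# Assumption: within every category, Republicans are 15% easier to reach.
EASE = {"R": 1.15, "D": 1.00}

def quota_sample(rng, population, quotas, ease):
    """Fill each category's quota, favoring whoever is easier to reach."""
    sample = []
    for category, quota in quotas.items():
        members = [p for p in population if p["category"] == category]
        weights = [ease[p["party"]] for p in members]
        # Weighted draws with replacement: a simplification of interviewer choice.
        sample.extend(rng.choices(members, weights=weights, k=quota))
    return sample

def rep_share(people):
    """Fraction of the given people who are Republican."""
    return sum(p["party"] == "R" for p in people) / len(people)

quotas = {"city": 5_000, "suburb": 5_000}   # mirrors the population's 50/50 split
sample = quota_sample(rng, population, quotas, EASE)

print(f"true Republican share: {rep_share(population):.3f}")
print(f"quota-sample estimate: {rep_share(sample):.3f}")
```

Under these assumptions the sample matches the population on category exactly, but overstates the Republican share by a few points, which is enough to flip the call in a close race.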

- The bias in quota sampling is generally unintentional on the part of interviewers.
- Before 1948, the Democratic majority was so large that this bias didn’t show up. In a close race, the bias was significant.
- Can we remove this bias from an otherwise sensible approach to sampling?

- Interviewers have no discretion at all as to whom they interview
- Sampling procedure intentionally involves chance variation.
- Investigators can compute the probability that any particular individual will be selected.
- Quota sampling fails these tests.

- Simple Random Sampling: Each individual is given a number. Numbers are drawn at random without replacement.
- Each person has an equal chance of being selected
- As sample size increases, the sample proportion for each characteristic tends toward the corresponding population proportion (Law of Averages)
- Still impractical for very large populations
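
A minimal sketch of simple random sampling using Python's standard `random.sample`, which draws without replacement and gives every individual the same chance of selection. The population of 100,000 and the 38% share are arbitrary choices for illustration.

```python
import random

rng = random.Random(42)

# Hypothetical population: 1 = has the characteristic, 0 = does not.
population = [1] * 38_000 + [0] * 62_000
true_proportion = sum(population) / len(population)

proportions = {}
for n in (100, 1_000, 10_000, 100_000):
    sample = rng.sample(population, n)   # draw n individuals without replacement
    proportions[n] = sum(sample) / n
    print(f"n={n:>7,}  sample proportion={proportions[n]:.3f}")

print(f"population proportion={true_proportion:.3f}")
```

The last draw takes the whole population, so its proportion matches the population proportion exactly; the smaller samples come close without having to reach everyone.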

- Cluster sampling:
- Divide population into “natural” groups.
- Randomly choose which groups to study.
- Randomly select individuals from the chosen groups.
- Can be done in stages, dividing each group into subgroups several times
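
The steps above can be sketched as a two-stage procedure: pick groups at random, then pick individuals at random inside the chosen groups. The towns, their sizes, and the function name `cluster_sample` are hypothetical; a real survey would use more stages and would usually weight groups by size.

```python
import random

def cluster_sample(rng, clusters, n_clusters, n_per_cluster):
    """Two-stage sample: random clusters first, then random members of each."""
    chosen = rng.sample(sorted(clusters), n_clusters)       # stage 1: groups
    sample = []
    for name in chosen:
        members = clusters[name]
        k = min(n_per_cluster, len(members))
        sample.extend(rng.sample(members, k))               # stage 2: individuals
    return chosen, sample

rng = random.Random(7)

# Hypothetical population: 40 towns with 50 to 200 residents each.
clusters = {
    f"town_{i}": [f"town_{i}/person_{j}" for j in range(rng.randint(50, 200))]
    for i in range(40)
}

chosen, sample = cluster_sample(rng, clusters, n_clusters=5, n_per_cluster=20)
print("towns chosen:", chosen)
print("people sampled:", len(sample))
```

Interviewers only need to travel to the 5 chosen towns rather than all 40, which is what makes the approach practical for large populations.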

- Post-1948 Gallup Poll sampling method

- A degree of bias is inevitable in any survey.
- Using probability introduces chance error (also called sampling error).
- Nonetheless, improvements are noticeable.
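
Chance error can be made concrete by repeating the same random poll many times and looking at how much the estimates vary. The sketch assumes a very large population in which exactly half favor each candidate, so each respondent is modeled as an independent draw; `sample_proportion` is a hypothetical helper.

```python
import random
import statistics

def sample_proportion(rng, n, p):
    """One simulated poll of n respondents from a population with share p."""
    return sum(rng.random() < p for _ in range(n)) / n

rng = random.Random(1948)
p = 0.5            # assumed population share
repeats = 500      # number of simulated polls per sample size

spreads = {}
for n in (100, 1_600):
    estimates = [sample_proportion(rng, n, p) for _ in range(repeats)]
    spreads[n] = statistics.pstdev(estimates)   # typical size of the chance error
    print(f"n={n:>5,}  mean estimate={statistics.mean(estimates):.3f}  "
          f"spread={spreads[n]:.4f}")
```

Under these assumptions the estimates center on the true share, and the spread should be close to sqrt(p(1-p)/n): about 0.05 for n = 100 and about 0.0125 for n = 1,600. Unlike bias, chance error does shrink as the sample grows.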