Chapter 12 Sample Surveys
Idea 1: Take a Sample • Examine a part of the whole. • [Figure: Population and Sample]
Idea 1: Take a Sample • Population • Group of people we want information from • Examples: • Registered voters in US • ISU undergraduates • Generally large • Impractical or too expensive to talk to everyone
Idea 1: Take a Sample • Sample • Smaller group of people from population • Examples: • 200 registered voters • 100 ISU undergrads • Group we get information from
Properties of a Sample • Would like the sample to be representative of the population. • This may not be possible, but at least we would like a sample that is not biased.
Idea 2: Select the Sample Randomly • Controls for factors that you know about in the data • Examples: Gender, Race, Religion, etc. • Controls for factors you don’t know about in the data • Allows you to make inferences about the population • The point of Statistics • Without random selection, your sample does not tell you anything reliable about the population • Select items for the sample at random to reduce the chance of getting a biased sample.
Idea 3: Sample Size Matters • Size of sample matters • Fraction of the population sampled is not important! • Want sample to be fairly large • Why not do a census? • Impractical • Expensive • Difficult to do • Populations are often dynamic • Can be more complex
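The claim that sample size matters while the fraction sampled does not can be checked with a small simulation. This is a minimal sketch under made-up assumptions: the population sizes, the 60% "yes" rate, the number of trials, and the function name `spread_of_sample_proportions` are all hypothetical and chosen only for illustration.

```python
import random
import statistics

def spread_of_sample_proportions(pop_size, n, true_p=0.6, trials=300, seed=10):
    """Standard deviation of the sample proportion over repeated SRSs."""
    rng = random.Random(seed)
    # Units numbered 0..yes_count-1 are the ones who would answer "yes".
    yes_count = round(pop_size * true_p)
    props = []
    for _ in range(trials):
        sample = rng.sample(range(pop_size), n)  # SRS without replacement
        props.append(sum(unit < yes_count for unit in sample) / n)
    return statistics.stdev(props)

results = {}
for pop_size in (10_000, 1_000_000):
    for n in (100, 400):
        results[(pop_size, n)] = spread_of_sample_proportions(pop_size, n)
        print(f"N={pop_size:>9,}  n={n:>3}  spread={results[(pop_size, n)]:.3f}")
```

Under these assumptions the spread should shrink noticeably when n grows from 100 to 400, and should barely change when the population grows from 10,000 to 1,000,000 with n held fixed.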
Terminology • Information (what do we want to know?) • Examples: • Percent of Registered Voters that would vote for a candidate. • Mean age of ISU undergraduates • Population • Parameter • Percent of all registered voters that will vote for a candidate • Mean age of all ISU undergrads • Sample • Statistic • Percent of the sample that will vote for a candidate • Mean age of sample
Terminology • Population: All students at ISU. • Question: Are the hours the Parks Library is open convenient? • Population parameter: Proportion of all ISU students who would answer yes. • Sample: 400 ISU students. • Sample statistic: the proportion of the 400 students in the sample who say yes.
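The difference between a parameter and a statistic can be made concrete in a few lines. This is only a sketch: the population of 30,000 students and the 55% "yes" rate are invented for illustration, and in practice the parameter is unknown, which is the reason for sampling in the first place.

```python
import random

rng = random.Random(12)

# Hypothetical population: 30,000 students, 55% of whom would answer "yes".
population = [True] * 16_500 + [False] * 13_500

# Parameter: proportion of ALL students who would say yes (normally unknown).
parameter = sum(population) / len(population)

# Statistic: proportion of the 400 sampled students who say yes.
sample = rng.sample(population, 400)
statistic = sum(sample) / len(sample)

print(f"parameter (whole population) = {parameter:.3f}")
print(f"statistic (sample of 400)    = {statistic:.3f}")
```

The statistic is computed from the sample alone and serves as an estimate of the parameter.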
Parameters and Statistics • Most common parameters and statistics
How do we select the 400? • Put an ad in the ISU Daily with the question and ask students to drop off their answers. • Stand in front of the library and ask the first 400 students who come by.
Simple Random Sample • Want a representative sample but will settle for one that is not biased. • SRS – Each combination of 400 ISU students has the same chance of being the sample selected.
Simple Random Sample • Sampling Frame • A list of all students at ISU (the Registrar has such a list) • Use random numbers to select 400 students at random from this list.
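Drawing an SRS from a sampling frame might look like the following sketch. The frame here is a made-up list of 30,000 student IDs standing in for the Registrar's list, and `simple_random_sample` is a hypothetical helper name.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the frame, each subset equally likely."""
    rng = random.Random(seed)
    return rng.sample(frame, n)  # sampling without replacement

# Hypothetical sampling frame: one ID per enrolled student.
frame = [f"student_{i:05d}" for i in range(1, 30_001)]

chosen = simple_random_sample(frame, 400, seed=2024)
print(len(chosen), "students selected, for example:", chosen[:3])
```

Passing a seed only makes the draw reproducible; a different seed gives a different sample of 400.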
Simple Random Sample • If one were to do this more than once • Different random numbers will give different samples of 400 students. • We have introduced variability by sampling!
Stratified Random Sample • A large population is often made up of smaller homogeneous groups (strata) • Make sure each group is included in the sample • Usually in proportion to its share of the population • Divide population into groups • Take SRS from each group • Combine SRSs = Stratified Random Sample
Example – Stratified Sample • Population – 200 employees at a company; 120 are men and 80 are women • Opinions on policy of arrival of children • Sample 20 people • Stratify into men and women • Sample 12 men and 8 women
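The allocation in this example (12 men and 8 women) follows from sampling each stratum in proportion to its size. A minimal sketch, assuming made-up employee labels and a hypothetical helper `stratified_sample`:

```python
import random

def stratified_sample(strata, total_n, seed=None):
    """Take an SRS from each stratum, sized in proportion to the stratum."""
    rng = random.Random(seed)
    pop_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Proportional allocation; with awkward proportions the rounded
        # sizes may not add up exactly to total_n.
        n_stratum = round(total_n * len(members) / pop_size)
        sample[name] = rng.sample(members, n_stratum)
    return sample

# 200 employees: 120 men and 80 women (labels are placeholders).
strata = {
    "men": [f"M{i}" for i in range(120)],
    "women": [f"W{i}" for i in range(80)],
}

picked = stratified_sample(strata, total_n=20, seed=7)
print({name: len(members) for name, members in picked.items()})
# prints {'men': 12, 'women': 8}
```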
Cluster Sampling • Difficult to get sampling frame for large population • Sample groups (clusters) first • Then take SRS from each selected cluster • Combined SRSs = Cluster Sample
Example – Cluster Sample • Opinions of Catholic churchgoers in Boston • Cluster = Catholic churches • Take SRS of churches • Take SRS of members of selected churches
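The two stages of this example can be sketched as follows. The 30 churches, their membership lists, and the stage sizes (5 churches, 20 members each) are invented for illustration; `two_stage_cluster_sample` is a hypothetical helper.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: SRS of clusters. Stage 2: SRS of members within each."""
    rng = random.Random(seed)
    chosen_clusters = rng.sample(sorted(clusters), n_clusters)
    sample = {}
    for name in chosen_clusters:
        members = clusters[name]
        k = min(n_per_cluster, len(members))  # guard against small clusters
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical clusters: 30 churches with 50 to 340 members each.
churches = {
    f"church_{c}": [f"c{c}_member_{m}" for m in range(50 + 10 * c)]
    for c in range(30)
}

result = two_stage_cluster_sample(churches, n_clusters=5, n_per_cluster=20, seed=3)
for name, members in result.items():
    print(name, len(members))
```

Only the selected churches need a membership list, which is why this design helps when a full sampling frame is hard to obtain.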
Systematic Sampling • Use a system to select the sample • Every 10th person on an alphabetical list of students • OK if the order of the list is not going to be associated with the responses • Must start a systematic sample randomly (randomly choose where to start on the list)
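A systematic sample with a random start can be sketched like this. The list of 200 students is a placeholder, and `systematic_sample` is a hypothetical helper name.

```python
import random

def systematic_sample(ordered_list, step, seed=None):
    """Pick a random start in the first `step` positions, then every step-th unit."""
    rng = random.Random(seed)
    start = rng.randrange(step)  # random start: 0, 1, ..., step - 1
    return ordered_list[start::step]

# Hypothetical alphabetical list of 200 students.
students = [f"student_{i:03d}" for i in range(200)]

every_tenth = systematic_sample(students, step=10, seed=5)
print(len(every_tenth), "students selected, starting with", every_tenth[0])
```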
Sampling Variability • Take several samples from a population and compute a statistic (e.g., the mean) • These means will not be the same • This is the natural tendency of randomly drawn samples to vary from trial to trial • Sometimes called sampling error, but it is not an error; just a natural tendency
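Sampling variability is easy to see by drawing a few samples from the same population. In this sketch the population of 5,000 ages is made up, as are the sample size of 100 and the number of repetitions.

```python
import random

rng = random.Random(23)

# Hypothetical population: ages of 5,000 undergraduates (made-up values).
population = [rng.randint(18, 30) for _ in range(5_000)]
population_mean = sum(population) / len(population)

# Draw five separate SRSs of 100 and record each sample mean.
sample_means = []
for _ in range(5):
    sample = rng.sample(population, 100)
    sample_means.append(sum(sample) / len(sample))

print(f"population mean: {population_mean:.2f}")
for i, m in enumerate(sample_means, start=1):
    print(f"sample {i} mean: {m:.2f}")
```

Each sample mean should land near the population mean, but the five values will generally differ from one another.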
What Can Go Wrong? • Bias – any systematic failure of a sample to represent its population • Biased Samples • Voluntary Response Sampling • A large group of people are invited to respond, and those who do respond are counted • Problem: Not representative of the population - those with very strong opinions on the subject are most likely to respond. • This is called voluntary response bias
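Voluntary response bias can be illustrated with a toy simulation. Every number here is an assumption made up for the sketch: a population of 10,000 in which 20% strongly oppose some policy, and response rates of 50% for the opposed versus 5% for everyone else.

```python
import random

rng = random.Random(24)

# Hypothetical population: 20% strongly oppose a policy, 80% are satisfied.
population = ["oppose"] * 2_000 + ["satisfied"] * 8_000

# Assumed response rates to an open invitation: strong opinions respond more.
response_rate = {"oppose": 0.50, "satisfied": 0.05}

volunteers = [p for p in population if rng.random() < response_rate[p]]
srs = rng.sample(population, 400)

def share_opposed(group):
    """Fraction of the group that opposes the policy."""
    return sum(p == "oppose" for p in group) / len(group)

print(f"true share opposed:        {share_opposed(population):.2f}")
print(f"voluntary response sample: {share_opposed(volunteers):.2f}")
print(f"simple random sample:      {share_opposed(srs):.2f}")
```

Under these assumptions the voluntary response sample greatly overstates opposition, while the SRS stays close to the true share.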
What Can Go Wrong? • More Biased Samples • Convenience Sampling • This approach simply includes those at hand, or easily available • Problem: Not representative of population
Cautions about Samples • Undercoverage – Missing part of the population • Household Surveys • Phone Surveys • Avoid undercoverage by having an accurate and complete sampling frame • Non-response bias – People elect not to participate in survey.
Cautions about Samples • Response bias – People will lie • Illegal or unpopular behavior • Leading questions from interviewer • Faulty memory • Wording of questions • Confusing wording, e.g., use of double negatives • Leading questions
Inference about Population • Biased samples tell us nothing about the population • Good samples have sampling variability • Statistics will be different for each sample • Statistics will differ from the population parameters • These differences obey certain laws of probability, but only for random samples • Larger samples give more accurate results