Sampling: The Statistical Adventure Begins
Populations
Def:
Census
Sample
Which is better: a census or a sample?
Step 1: Define the Target Population
Must be very specific:
What is a user?
What demographics matter?
Are there geographic boundaries?
What is the relevant time period?
What is an element?
Where can you get a sampling frame?
List may not match the target population
Probability samples let us estimate _________
We can calculate a confidence interval
So, probability samples let us quantify how far the sample may be from the population; non-probability samples do not.
Number each unit in the sampling frame
Pick ___ units using a random numbers table
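The two steps above can be sketched in Python. This is only an illustration: the frame of 20 customers is hypothetical, and the standard library's random number generator stands in for the random numbers table.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Number each unit in the frame, then draw n distinct unit numbers at random."""
    rng = random.Random(seed)
    numbered = dict(enumerate(frame, start=1))   # unit number -> unit
    chosen = rng.sample(sorted(numbered), n)     # n distinct numbers, without replacement
    return [(number, numbered[number]) for number in sorted(chosen)]

# Hypothetical sampling frame of 20 customers (an assumption for illustration).
frame = [f"customer_{i:02d}" for i in range(1, 21)]
srs = simple_random_sample(frame, n=5, seed=1)
for number, unit in srs:
    print(number, unit)
```

Every unit in the frame has the same chance of selection, which is what makes this a probability sample.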
Table columns: Element, Attitude toward Motel 6
Decide on stratification variable
Homogeneity with respect to the dependent variable w/in the group
Divide population into a few mutually exclusive and exhaustive strata
Take an SRS from each stratum
Choose the sample from each stratum in the same proportion as the stratum has in the population
NOTE: Use when the variance is about equal across the strata
Table columns: Strata, proportion, proportion (n=200)
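A proportionate allocation for n=200 might be computed as below. The stratum names and population counts are hypothetical, since the original table values are not available here.

```python
def proportionate_allocation(stratum_sizes, n):
    """Give each stratum the same share of the sample that it has in the population."""
    total = sum(stratum_sizes.values())
    # Rounding can make the pieces miss n by a unit or so in general;
    # the hypothetical numbers below divide evenly.
    return {name: round(n * size / total) for name, size in stratum_sizes.items()}

# Hypothetical strata and population counts (assumptions for illustration).
stratum_sizes = {"A": 5000, "B": 3000, "C": 2000}
allocation = proportionate_allocation(stratum_sizes, n=200)
print(allocation)  # {'A': 100, 'B': 60, 'C': 40}
```

An SRS of the allocated size would then be taken within each stratum.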
Take a larger sample from the strata with ________ variance
What is variance?
Exercise: Develop two populations with 8 elements each.
Population 1: high variance, low mean
Population 2: low variance, high mean
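One possible pair of populations for this exercise is sketched below. The values are made up, and the population variance is used because each list is treated as a complete population rather than a sample.

```python
import statistics

# Hypothetical 8-element populations (one of many valid answers).
population_1 = [0, 0, 0, 0, 10, 10, 10, 10]      # spread out, centered low
population_2 = [19, 20, 20, 20, 20, 20, 20, 21]  # tightly packed, centered high

for name, values in (("Population 1", population_1), ("Population 2", population_2)):
    mean = statistics.mean(values)
    variance = statistics.pvariance(values)  # average squared deviation from the mean
    print(name, "mean =", mean, "variance =", variance)
```

The mean says where the values are centered; the variance says how far they spread around that center, and the two can move independently.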
Table columns: Strata, Variance, proportion, proportion
Make sure that you include certain subgroups
More precise, IF we use the right stratification variable
margin of error is ___________
sampling distribution is __________
confidence intervals are __________
What is the right variable?
Divide population into lots of heterogeneous clusters
Take an SRS of clusters
Single stage: sample all elements in the selected clusters
Multi-stage: take an SRS of elements in the selected clusters
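The single-stage and multi-stage variants can be sketched as follows. The city blocks and houses are hypothetical, and the function name and parameters are illustrative only.

```python
import random

def cluster_sample(clusters, n_clusters, n_per_cluster=None, seed=None):
    """Draw an SRS of clusters, then take all elements or an SRS within each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # SRS of cluster names
    elements = []
    for name in chosen:
        members = clusters[name]
        if n_per_cluster is None:
            elements.extend(members)                             # single stage
        else:
            elements.extend(rng.sample(members, n_per_cluster))  # multi-stage
    return chosen, elements

# Hypothetical frame: 8 city blocks with 6 houses each.
clusters = {f"block_{b}": [f"block_{b}_house_{h}" for h in range(1, 7)] for b in "ABCDEFGH"}

blocks_1, single_stage = cluster_sample(clusters, n_clusters=3, seed=2)
blocks_2, multi_stage = cluster_sample(clusters, n_clusters=3, n_per_cluster=2, seed=2)
print(len(single_stage), len(multi_stage))  # 18 6
```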
Likely to be the way the sampling frame is set up
Less precise: lacks statistical efficiency
Convenience or accidental sample: select subjects because they are the most convenient or readily available
A really large sample size does not, by itself, make a convenience sample representative
Elements selected because they can serve the research purpose--they are believed to be representative
Attempts to be representative by sampling characteristics in the same proportion as the population
Interviewer chooses sample
Are these representative? _____
Must take into consideration:
Discuss this in detail in the next chapter
Actually collect the data
Clean up the data (editing)
Put the data into the computer
σ² (sigma squared)
X̄ (x bar)
# of elements
X̄ = (sum of the sample elements) / (number of elements in sample)
Sample variance = s_x² = (sum of squared deviations around the mean) / (sample size minus 1)
The square root of the sample variance = s_x
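As a worked illustration of the sample mean, sample variance, and its square root, here is a small sketch on a made-up data set of eight values.

```python
import math

def sample_mean(xs):
    """Sum of the sample elements divided by the number of elements."""
    return sum(xs) / len(xs)

def sample_variance(xs):
    """Sum of squared deviations around the mean, divided by sample size minus 1."""
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def sample_std(xs):
    """Square root of the sample variance."""
    return math.sqrt(sample_variance(xs))

# Hypothetical sample (an assumption for illustration).
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(sample_mean(data))      # 5.0
print(sample_variance(data))  # 32 / 7, about 4.57
print(sample_std(data))       # about 2.14
```

The squared deviations here are 9, 1, 1, 1, 0, 0, 4, 16, which sum to 32; dividing by 8 - 1 = 7 gives the sample variance.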
Has a specific meaning
Think Chebychev’s Theorem
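Chebyshev's theorem says that, for any data set, at least 1 - 1/k² of the values lie within k standard deviations of the mean (for k > 1). A small check on made-up data, using the population standard deviation:

```python
import statistics

def share_within(data, k):
    """Fraction of values lying within k standard deviations of the mean."""
    mean = statistics.mean(data)
    sd = statistics.pstdev(data)  # population standard deviation
    inside = [x for x in data if abs(x - mean) <= k * sd]
    return len(inside) / len(data)

# Hypothetical data set (an assumption for illustration); mean 5, std dev 2.
data = [2, 4, 4, 4, 5, 5, 7, 9]
for k in (1.5, 2, 3):
    bound = 1 - 1 / k ** 2
    print(f"k={k}: observed {share_within(data, k):.3f}, Chebyshev bound {bound:.3f}")
```

The observed share is never below the bound, whatever the shape of the distribution.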
The difference between the population parameter and the sample statistic
We look at confidence intervals to estimate this but not until the next chapter
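A minimal simulation of sampling error, assuming it is measured as the sample statistic minus the population parameter and that the statistic of interest is the mean. The population of 1,000 ratings is randomly generated, so the numbers themselves carry no meaning.

```python
import random
import statistics

rng = random.Random(7)

# Hypothetical population: 1,000 ratings on a 1-10 scale.
population = [rng.randint(1, 10) for _ in range(1000)]
parameter = statistics.mean(population)   # population mean (usually unknown in practice)

sample = rng.sample(population, 50)        # SRS of 50 elements
statistic = statistics.mean(sample)        # sample mean

sampling_error = statistic - parameter
print("parameter:", parameter)
print("statistic:", statistic)
print("sampling error:", sampling_error)
```

In real research the parameter is unknown, so this difference cannot be computed directly and has to be estimated.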
(i.e., all other kinds of errors except for sampling error!)