Chapter 2

Sampling Design


How do we gather data?

  • Surveys

  • Opinion polls

  • Interviews

  • Studies

    • Observational

    • Retrospective (past)

    • Prospective (future)

  • Experiments


Population

the entire group of individuals that we want information about


Census

a complete count of the population




How good is a census?

Do frog fairy tale . . .

The answer is 83!


Why would we not use a census all the time?

Not accurate

Very expensive

Perhaps impossible

If using destructive sampling, you would destroy the population. Examples:

Breaking strength of soda bottles

Lifetime of flashlight batteries

Safety ratings for cars

Look at the U.S. census – it has a huge amount of error in it; plus it takes a long time to compile the data, making the data obsolete by the time we get it!

Since taking a census of any population takes time, censuses are VERY costly to do!

Suppose you wanted to know the average weight of the white-tail deer population in Texas – would it be feasible to do a census?


Sample

A part of the population that we actually examine in order to gather information

Use sample to generalize to population



Sampling design

refers to the method used to choose the sample from the population


Sampling frame

a list of every individual in the population


Jelly Blubber Activity

  • Select 5 Jelly Blubbers that you think are representative of the population of blubbers with regard to length.

  • Find the mean length of your sample


Simple Random Sample (SRS)

consists of n individuals from the population chosen in such a way that

  • every individual has an equal chance of being selected

  • every set of n individuals has an equal chance of being selected

Suppose we were to take an SRS of 100 PWSH students – put each student’s name in a hat. Then randomly select 100 names from the hat. Each student has the same chance to be selected!

Not only does each student have the same chance to be selected – but every possible group of 100 students has the same chance to be selected! Therefore, it has to be possible for all 100 students to be seniors in order for it to be an SRS!
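
The names-in-a-hat idea can be sketched in a few lines of Python. This is only an illustration: the roster of placeholder names and the helper name simple_random_sample are assumptions, not part of the slides. Python's random.sample draws without replacement, so every group of 100 names is equally likely.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Return n distinct individuals; every set of n is equally likely."""
    rng = random.Random(seed)  # seed is optional, only for repeatable draws
    return rng.sample(population, n)

# Hypothetical roster of placeholder names (about 2000 students).
students = [f"student_{i}" for i in range(1, 2001)]

srs = simple_random_sample(students, 100, seed=1)
print(len(srs), "students drawn")
```

Because nothing in the draw looks at grade level, a sample made up entirely of seniors is possible, just very unlikely.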


Stratified random sample

population is divided into homogeneous groups called strata

SRSs are pulled from each stratum

Homogeneous groups are groups that are alike based upon some characteristic of the group members.

Suppose we were to take a stratified random sample of 100 PWSH students. Since students are already divided by grade level, the grade levels can be our strata. Then randomly select 50 seniors and randomly select 50 juniors.
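
A minimal sketch of the same idea in Python, assuming a made-up roster of 1000 seniors and 1000 juniors (the stratum sizes and the helper name stratified_sample are illustrative assumptions). An SRS is drawn separately inside each stratum.

```python
import random

def stratified_sample(strata, sizes, seed=None):
    """Draw a separate SRS of sizes[name] individuals from each stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, sizes[name])
            for name, members in strata.items()}

# Hypothetical strata: grade levels with placeholder names.
strata = {
    "senior": [f"senior_{i}" for i in range(1, 1001)],
    "junior": [f"junior_{i}" for i in range(1, 1001)],
}

chosen = stratified_sample(strata, {"senior": 50, "junior": 50}, seed=1)
print({grade: len(names) for grade, names in chosen.items()})
```

Unlike an SRS of 100, this design can never produce a sample of all seniors, because the count from each stratum is fixed in advance.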


Systematic random sample

select sample by following a systematic approach

randomly select where to begin

Suppose we want to do a systematic random sample of PWSH students. Number a list of students.

(There are approximately 2000 students. If we want a sample of 100, then 2000/100 = 20.)

Select a number between 1 and 20 at random. That student will be the first student chosen; then choose every 20th student from there.
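
The steps above can be sketched in Python as follows. The numbered list of 2000 students is a stand-in for a real roster, and the helper name systematic_sample is an assumption made for illustration.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Pick a random start in 1..k, then take every k-th individual."""
    rng = random.Random(seed)
    k = len(frame) // n              # step size, e.g. 2000 // 100 = 20
    start = rng.randint(1, k)        # random starting position, 1..k inclusive
    positions = range(start, len(frame) + 1, k)
    return [frame[p - 1] for p in positions][:n]

# Hypothetical numbered list: student 1 through student 2000.
numbered_students = list(range(1, 2001))

picked = systematic_sample(numbered_students, 100, seed=1)
print(picked[:5])
```

Only the starting point is random; once it is chosen, the rest of the sample is fully determined.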



Cluster sample

based upon location

randomly pick a location & sample all there

Suppose we want to do a cluster sample of PWSH students. One way to do this would be to randomly select 10 classrooms during 2nd period. Sample all students in those rooms!
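
A short Python sketch of the classroom example, assuming a hypothetical school of 80 second-period rooms with 25 students each (these numbers and the helper name cluster_sample are illustrative assumptions). Whole rooms are chosen at random, and everyone in a chosen room is surveyed.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters and keep every individual in them."""
    rng = random.Random(seed)
    picked_names = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in picked_names}

# Hypothetical layout: 80 rooms, 25 placeholder students per room.
classrooms = {
    f"room_{r}": [f"room_{r}_student_{s}" for s in range(1, 26)]
    for r in range(1, 81)
}

surveyed = cluster_sample(classrooms, 10, seed=1)
print(sum(len(room) for room in surveyed.values()), "students surveyed")
```

Note that the random step applies to rooms, not to students, which is what separates this from a stratified sample.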



For the Jelly Blubber colony:

μ = 19.41


Multistage sample

select successively smaller groups within the population in stages

SRS used at each stage

To use a multistage approach to sampling PWSH students, we could first divide 2nd period classes by level (AP, Honors, Regular, etc.) and randomly select 4 second-period classes from each group. Then we could randomly select 5 students from each of those classes. The selection process is done in stages!
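
The two stages can be sketched in Python like this. The three levels with 10 classes of 25 students each are made-up numbers, and the helper name multistage_sample is an assumption for illustration.

```python
import random

def multistage_sample(classes_by_level, classes_per_level,
                      students_per_class, seed=None):
    """Stage 1: SRS of classes within each level. Stage 2: SRS of students."""
    rng = random.Random(seed)
    sample = []
    for level, classes in classes_by_level.items():
        for class_name in rng.sample(sorted(classes), classes_per_level):
            sample.extend(rng.sample(classes[class_name], students_per_class))
    return sample

# Hypothetical school: 3 levels, 10 classes per level, 25 students per class.
classes_by_level = {
    level: {
        f"{level}_class_{c}": [f"{level}_class_{c}_student_{s}"
                               for s in range(1, 26)]
        for c in range(1, 11)
    }
    for level in ("AP", "Honors", "Regular")
}

staged = multistage_sample(classes_by_level, 4, 5, seed=1)
print(len(staged), "students selected")
```

With 3 levels, 4 classes per level, and 5 students per class, the sample has 3 × 4 × 5 = 60 students.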


SRS

Advantages

  • Unbiased

  • Easy

Disadvantages

  • Large variance

  • May not be representative

  • Must have sampling frame (list of population)


Stratified

Advantages

  • More precise unbiased estimator than SRS

  • Less variability

  • Cost reduced if strata already exist

Disadvantages

  • Difficult to do if you must divide the population into strata

  • Formulas for SD & confidence intervals are more complicated

  • Need sampling frame


Systematic Random Sample

Advantages

  • Unbiased

  • Don’t need sampling frame

  • Ensure that the sample is spread across population

  • More efficient, cheaper, etc.

Disadvantages

  • Large variance

  • Can be confounded by trend or cycle

  • Formulas are complicated


Cluster Samples

Advantages

  • Unbiased

  • Cost is reduced

  • Sampling frame may not be available (not needed)

Disadvantages

  • Clusters may not be representative of population

  • Formulas are complicated


Identify the sampling design

1) The Educational Testing Service (ETS) needed a sample of colleges. ETS first divided all colleges into groups of similar types (small public, small private, etc.). Then they randomly selected 3 colleges from each group.

Stratified random sample


Identify the sampling design

2) A county commissioner wants to survey people in her district to determine their opinions on a particular law up for adoption. She decides to randomly select blocks in her district and then survey all who live on those blocks.

Cluster sampling


Identify the sampling design

3) A local restaurant manager wants to survey customers about the service they receive. Each night the manager randomly chooses a number between 1 & 10. He then gives a survey to that customer, and to every 10th customer after them, to fill it out before they leave.

Systematic random sampling


Random digit table

each entry is equally likely to be any of the 10 digits

digits are independent of each other

The following is part of the random digit table found on page 847 of your textbook:

Row 1   4 5 1 8 5 0 3 3 7 1

Row 2   4 2 5 5 8 0 4 5 7 0

Row 3   8 9 9 3 4 3 5 0 6 3

Numbers can be read across.

Numbers can be read vertically.

Numbers can be read diagonally.


Suppose your population consisted of these 20 people:

1) Aidan 6) Fred 11) Kathy 16) Paul

2) Bob 7) Gloria 12) Lori 17) Shawnie

3) Chico 8) Hannah 13) Matthew 18) Tracy

4) Doug 9) Israel 14) Nan 19) Uncle Sam

5) Edward 10) Jung 15) Opus 20) Vernon

Use the following random digits to select a sample of five from these people.

Row 1   4 5 1 8 0 5 1 3 7 1

Row 2   0 1 5 5 8 0 1 5 7 0

Row 3   8 9 9 3 4 3 5 0 6 3

We will need to use double-digit random numbers, ignoring any number greater than 20. Start with Row 1 and read across.

Row 1: 45 (ignore), 18 (Tracy), 05 (Edward), 13 (Matthew), 71 (ignore)

Row 2: 01 (Aidan), 55 (ignore), 80 (ignore), 15 (Opus)

Stop when five people are selected. So my sample would consist of:

Aidan, Edward, Matthew, Opus, and Tracy
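
The same selection can be traced in a short Python sketch. It reads the rows across in two-digit chunks, skips 00 and anything above 20, and stops at five people. Skipping repeated numbers is an added assumption (no repeat occurs in these rows), and the helper name select_from_digits is made up for illustration.

```python
def select_from_digits(rows, names, n):
    """Read two-digit numbers across the rows; keep valid, unused labels."""
    digits = "".join(rows)
    chosen = []
    for i in range(0, len(digits) - 1, 2):
        number = int(digits[i:i + 2])
        # Ignore 00, numbers above the population size, and repeats.
        if 1 <= number <= len(names) and number not in chosen:
            chosen.append(number)
            if len(chosen) == n:
                break
    return [names[k - 1] for k in chosen]

names = ["Aidan", "Bob", "Chico", "Doug", "Edward", "Fred", "Gloria",
         "Hannah", "Israel", "Jung", "Kathy", "Lori", "Matthew", "Nan",
         "Opus", "Paul", "Shawnie", "Tracy", "Uncle Sam", "Vernon"]
rows = ["4518051371", "0155801570", "8993435063"]

selected = select_from_digits(rows, names, 5)
print(selected)  # ['Tracy', 'Edward', 'Matthew', 'Aidan', 'Opus']
```

The people come out in the order their numbers appear in the table; sorted alphabetically, they are Aidan, Edward, Matthew, Opus, and Tracy.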


Bias

A systematic error in measuring the estimate

favors certain outcomes

Anything that causes the data to be wrong! It might be attributed to the researchers, the respondent, or to the sampling method!


Sources of Bias

things that can cause bias in your sample

cannot do anything with bad data


Voluntary response

People choose to respond

Usually only people with very strong opinions respond

An example would be the surveys in magazines that ask readers to mail in the survey. Other examples are call-in shows, American Idol, etc.

Remember, respondents select themselves to participate in the survey!

Remember – the way to determine voluntary response is:

Self-selection!!


Convenience sampling

Ask people who are easy to ask

Produces biased results

The data obtained by a convenience sample will be biased – however this method is often used for surveys & results reported in newspapers and magazines!

An example would be stopping friendly-looking people in the mall to survey. Another example is the surveys left on tables at restaurants - a convenient method!


Undercoverage

some groups of the population are left out of the sampling process

Suppose you take a sample by randomly selecting names from the phone book – some groups will not have the opportunity of being selected!

  • People with unlisted phone numbers – usually high-income families

  • People without phone numbers – usually low-income families

  • People with ONLY cell phones – usually young adults


Nonresponse

occurs when an individual chosen for the sample can’t be contacted or refuses to cooperate

telephone surveys: 70% nonresponse

Because of huge telemarketing efforts in the past few years, telephone surveys have a MAJOR problem with nonresponse!

People are chosen by the researchers, BUT refuse to participate.

NOT self-selected!

This is often confused with voluntary response!

One way to help with the problem of nonresponse is to make follow-up contact with the people who are not home when you first contact them.


Response bias

occurs when the behavior of the respondent or interviewer causes bias in the sample

wrong answers

Response bias occurs when for some reason (interviewer’s or respondent’s fault) you get incorrect answers.

Suppose we wanted to survey high school students on drug abuse and we used a uniformed police officer to interview each student in our sample – would we get honest answers?


Wording of the Questions

wording can influence the answers that are given

connotation of words

use of “big” words or technical words

– if surveying Podunk, TX, then you should avoid complex vocabulary.

– if surveying doctors, then use more complex, technical wording.

The level of vocabulary should be appropriate for the population you are surveying.

Questions must be worded as neutrally as possible to avoid influencing the response.


Source of Bias?

1) Before the presidential election of 1936, which pitted FDR against Republican Alf Landon, the magazine Literary Digest predicted that Landon would win the election in a 3-to-2 victory, based on a survey of 2.8 million people. George Gallup surveyed only 50,000 people and predicted that Roosevelt would win. The Digest’s survey came from magazine subscribers, car owners, telephone directories, etc.

Undercoverage – since the Digest’s survey comes from car owners, etc., the people selected were mostly from high-income families and thus mostly Republican! (other answers are possible)


2) Suppose that you want to estimate the total amount of money spent by students on textbooks each semester at SMU. You collect register receipts from students as they leave the bookstore during lunch one day.

Convenience sampling – easy way to collect data

or

Undercoverage – students who buy books from on-line bookstores are not included.


3) To find the average value of a home in Plano, one averages the price of homes that are listed for sale with a realtor.

Undercoverage – leaves out homes that are not for sale or homes that are listed with different realtors.

(other answers are possible)

