
STA 291 Fall 2009

STA 291 Fall 2009. Lecture 2, Dustin Lueker. Basic Terminology. Parameter: numerical characteristic of the population, calculated using the whole population. Statistic: numerical characteristic of the sample, calculated using the sample. Simple Random Sampling (SRS).

Presentation Transcript


  1. STA 291 Fall 2009 Lecture 2 Dustin Lueker

  2. Basic Terminology • Parameter • Numerical characteristic of the population • Calculated using the whole population • Statistic • Numerical characteristic of the sample • Calculated using the sample
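
To make the distinction concrete, here is a minimal Python sketch. The population of ages is made up for illustration, and the names `population`, `sample`, `parameter`, and `statistic` are hypothetical. The only point is that the parameter uses every unit, while the statistic uses just the sampled units.

```python
import random
from statistics import mean

# Hypothetical population: ages of 8 students (made-up values for illustration).
population = [18, 19, 19, 20, 21, 22, 23, 26]

# Parameter: numerical characteristic computed from the whole population.
parameter = mean(population)

# Statistic: the same characteristic computed from a sample only.
rng = random.Random(291)  # fixed seed so the sketch is reproducible
sample = rng.sample(population, 3)
statistic = mean(sample)

print(f"population mean (parameter): {parameter}")
print(f"sample mean (statistic):     {statistic:.2f}")
```

The parameter is fixed once the population is fixed; the statistic would change if a different sample were drawn.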

  3. Simple Random Sampling (SRS) • Each possible sample of a given size has the same probability of being selected • The sample size is usually denoted by n

  4. Example of SRS • Population of 4 students: Alf, Buford, Charlie, Dixie • Select an SRS of size n = 2 to ask them about their smoking habits • 6 possible samples of size 2 • A,B • A,C • A,D • B,C • B,D • C,D
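
The six samples can be listed mechanically. The sketch below uses Python's `itertools.combinations`; the variable names are hypothetical, and the student names come from the slide.

```python
from itertools import combinations

students = ["Alf", "Buford", "Charlie", "Dixie"]
n = 2

# Every unordered pair of distinct students is one possible sample.
samples = list(combinations(students, n))
for s in samples:
    print(s)

# Under simple random sampling, each sample has probability 1 / (number of samples).
probability = 1 / len(samples)
print(len(samples), probability)
```

With 4 students and n = 2 there are 6 pairs, so each sample should be selected with probability 1/6.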

  5. How to choose an SRS? • Each of the six possible samples has to have the same probability of being selected • How could we do this? • Roll a die • Random number generator
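
Both suggestions can be sketched in Python. This is an illustration under assumptions, not a prescribed procedure: the die is simulated with `random.randint`, the seed is arbitrary, and the variable names are hypothetical.

```python
import random
from itertools import combinations

students = ["Alf", "Buford", "Charlie", "Dixie"]
samples = list(combinations(students, 2))  # the six possible samples

rng = random.Random(2009)  # seed chosen arbitrarily for reproducibility

# Option 1: "roll a die" and use the face (1-6) to pick one of the six samples.
face = rng.randint(1, 6)
srs_from_die = samples[face - 1]

# Option 2: let the random number generator draw 2 students without replacement.
srs_from_rng = tuple(rng.sample(students, 2))

print(face, srs_from_die)
print(srs_from_rng)
```

The die works here only because there happen to be exactly six possible samples; drawing without replacement generalizes to any population and sample size.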

  6. Common Problems when Sampling • Convenience sample • Selecting subjects that are easily accessible to you • Volunteer sample • Selecting the first two subjects who volunteer to take the survey • What are the problems with these samples? • Proper representation of the population • Bias • Examples • Mall interview • Street corner interview

  7. Example • A survey of 300 random individuals conducted in Louisville revealed that President Obama had an approval rating of 67%. • Is 67% a statistic or a parameter? • The surveyors stated that only 67% of Kentuckians approved of President Obama. • What is the problem with this statement? • Why might the surveyors have chosen Louisville as their sampling location?

  8. Famous Example • 1936 presidential election: Alfred Landon vs. Franklin Roosevelt • Literary Digest sent out over 10 million questionnaires in the mail to predict the election outcome • What type of sample is this? • 2 million responses predicted a landslide victory for Alfred Landon • George Gallup used a much smaller random sample and predicted a clear victory for FDR • FDR won with 62% of the vote
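
A small simulation can illustrate one way a huge sample can mislead while a small random sample does well. Everything in this sketch is invented for illustration: the population size, the share of voters on a "mailing list", and the support rates are assumptions and are not historical figures from the 1936 election.

```python
import random

rng = random.Random(1936)  # fixed seed for reproducibility

# Hypothetical population (all numbers invented for illustration):
# 30% of voters are on a mailing list; list members support candidate A
# with probability 0.40, everyone else with probability 0.70.
population = []
for _ in range(200_000):
    on_list = rng.random() < 0.30
    supports_a = rng.random() < (0.40 if on_list else 0.70)
    population.append((on_list, supports_a))

def share_for_a(voters):
    """Proportion of the given voters who support candidate A."""
    return sum(supports for _, supports in voters) / len(voters)

true_share = share_for_a(population)

# Large but biased sample: 50,000 voters drawn only from the mailing list.
list_members = [v for v in population if v[0]]
biased_sample = rng.sample(list_members, 50_000)
biased_estimate = share_for_a(biased_sample)

# Small simple random sample: 1,000 voters drawn from the whole population.
random_sample = rng.sample(population, 1_000)
random_estimate = share_for_a(random_sample)

print(f"true share:          {true_share:.3f}")
print(f"biased, n = 50,000:  {biased_estimate:.3f}")
print(f"random, n = 1,000:   {random_estimate:.3f}")
```

In this made-up setting the large sample stays close to the list members' support rate rather than the population's, so extra sample size does not repair a biased selection mechanism.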

  9. Other Examples • TV, radio call-in polls • “Should the UN headquarters continue to be located in the United States?” • ABC poll with 186,000 callers: 67% no • Scientific random sample of 500: 28% no • Which sample is more trustworthy? • Would any of you call in to give your opinion? Why or why not?

  10. Other Examples • Another advantage of random samples • Inferential statistical methods can be applied to state that “the true percentage of all Americans who want the UN headquarters out of the United States is between 24% and 32%” • These methods cannot be applied to volunteer samples
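
The slide does not say how the 24% to 32% range was obtained. One plausible reconstruction, assuming a 95% confidence level and the standard large-sample interval for a proportion, uses the random sample of 500 with 28% answering "no". The critical value `z = 1.96` and the choice of method are assumptions, not stated in the lecture.

```python
from math import sqrt

n = 500        # size of the scientific random sample
p_hat = 0.28   # sample proportion answering "no"
z = 1.96       # assumed: critical value for 95% confidence

# Large-sample standard error of a sample proportion.
standard_error = sqrt(p_hat * (1 - p_hat) / n)
margin = z * standard_error
lower, upper = p_hat - margin, p_hat + margin

print(f"{lower:.3f} to {upper:.3f}")
```

Under these assumptions the endpoints round to 24% and 32%, matching the quoted statement. The calculation relies on random sampling, which is why it cannot be carried over to the call-in poll.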

  11. Don’t Trust Bad Samples • Whenever you see results from a poll, check whether they come from a random sample • Preferably, it should be stated • Who sponsored and conducted the poll? • How were the questions worded? • How was the sample selected? • How large was it? • If not, the results may not be trustworthy

  12. Question Wording • Kalton et al. (1978), England • Two groups get questions with slightly different wording • Group 1 • “Are you in favor of giving special priority to buses in the rush hour or not?” • Group 2 • “Are you in favor of giving special priority to buses in the rush hour or should cars have just as much priority as buses?”

  13. Question Wording • Result: Proportion of people saying that priority should be given to buses.

  14. Question Order • Two questions asked in different order during the Cold War • (1) “Do you think the U.S. should let Russian newspaper reporters come here and send back whatever they want?” • (2) “Do you think Russia should let American newspaper reporters come in and send back whatever they want?” • When question (1) was asked first, 36% answered “Yes” • When question (2) was asked first, 73% answered “Yes” to question (1)

  15. ‘Flavors’ of Statistics • Descriptive Statistics • Summarizing the information in a collection of data • Inferential Statistics • Using information from a sample to make conclusions/predictions about the population

  16. Example • 71% of individuals surveyed believed that the Kentucky Football team will return to a bowl game in 2009 • Is 71% an example of descriptive or inferential statistics? • From the same sample it is concluded that at least 85% of Kentucky Football fans approve of Coach Brooks’ job here at UK • Is 85% an example of descriptive or inferential statistics?

  17. Qualitative Variables • Nominal • Gender, nationality, hair color, state of residence • Nominal variables have a scale of unordered categories • It does not make sense to say, for example, that green hair is greater/higher/better than orange hair • Ordinal • Disease status, company rating, grade in STA 291 • Ordinal variables have a scale of ordered categories; they are often treated in a quantitative manner (A = 4.0, B = 3.0, etc.) • One unit can have more of a certain property than does another unit
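
The difference between the two scales shows up in which operations are meaningful. The sketch below is illustrative only: the data are hypothetical, and the grade points below B (C, D, E) are an assumed continuation of the slide's "A = 4.0, B = 3.0, etc."

```python
from collections import Counter
from statistics import mean

# Nominal: unordered categories; counting is meaningful, ranking is not.
hair_colors = ["green", "orange", "green"]  # hypothetical observations
most_common_color = Counter(hair_colors).most_common(1)[0][0]

# Ordinal: ordered categories, mapped to grade points.
# Values for C, D, and E are assumed extensions of "A = 4.0, B = 3.0, etc."
grade_points = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0, "E": 0.0}
grades = ["B", "A", "C", "A"]  # hypothetical grades

# Ordering makes sense for ordinal data: sort from best to worst.
ranked = sorted(grades, key=grade_points.get, reverse=True)

# Treating the ordinal scale quantitatively: a grade point average.
gpa = mean(grade_points[g] for g in grades)

print(most_common_color, ranked, gpa)
```

Averaging grade points assumes the steps between grades are equally spaced, which is a modeling choice rather than a property of the ordinal scale itself.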

  18. Quantitative Variables • Quantitative • Age, income, height • Quantitative variables are measured numerically, that is, for each subject a number is observed • The scale for quantitative variables is called an interval scale

  19. Example • A survey of Kentucky Football fans obtained the following information • Age • Whether they preferred the new blue helmet or the old white helmet • The number of games they think the team will win in 2009 • How they felt the UK vs. U of L game would turn out • U of L in a blowout • U of L in a close game • UK in a close game • UK in a blowout • Are these qualitative or quantitative variables and what is the scale for each?

  20. Discrete and Continuous • A variable is discrete if it can take on a finite number of values • Gender • Favorite MLB team • Qualitative variables are discrete • Continuous variables can take an infinite continuum of possible real number values • Time spent studying for STA 291 per day • 27 minutes • 27.487 minutes • 27.48682 minutes • Can be subdivided into more precise values • Therefore continuous

  21. Observational Study • An observational study observes individuals and measures variables of interest but does not attempt to influence the responses • Purpose of an observational study is to describe/compare groups or situations • Example: Select a sample of men and women and ask whether he/she has taken aspirin regularly over the past 2 years, and whether he/she has suffered a heart attack over the same period

  22. Experiment • An experiment deliberately imposes some treatment on individuals in order to observe their responses • Purpose of an experiment is to study whether the treatment causes a change in the response • Example: Randomly select men and women and divide the sample into two groups. One group would take aspirin daily, the other would not. After 2 years, determine for each group the proportion of people who had suffered a heart attack.
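
The slide says only that the sample is divided into two groups. A common way to do that, assumed here rather than stated in the lecture, is to let chance decide each subject's group. The subject IDs, group names, and seed below are hypothetical.

```python
import random

rng = random.Random(22)  # fixed seed for reproducibility

# Hypothetical subject IDs standing in for the selected men and women.
subjects = [f"subject_{i:02d}" for i in range(1, 21)]

# Shuffle a copy, then split it in half: chance alone decides who gets the treatment.
shuffled = subjects[:]
rng.shuffle(shuffled)
half = len(shuffled) // 2
aspirin_group = shuffled[:half]
control_group = shuffled[half:]

print("aspirin:", sorted(aspirin_group))
print("control:", sorted(control_group))
```

Assigning groups by chance keeps the experimenter's or the subjects' preferences from determining who takes aspirin, which is what separates this design from the observational version of the same question.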

  23. Which is Preferred? • Observational Studies • Passive data collection • We observe, record, or measure, but don’t interfere • Experiments • Active data production • Actively intervene by imposing some treatment in order to see what happens • Experiments are preferable if they are possible • We can control more of the conditions and be more confident that our data isn’t tainted
