Preliminary Chapter

Statistics: The science of learning from data

Chapters 1-4: Tools and strategies for organizing, describing, analyzing data

Chapter 5: How to produce data

Chapters 6-9: Probability: The study of chance behavior

Chapters 10-15: Testing claims/Computing estimates

1. What individuals do the data describe? How many individuals are there?

2. How many variables are there? How is each variable defined? In what units is it measured? (Pounds? Kilograms?)

3. Why was the data gathered? (From a sample or from a population?)

Every set of data comes with background information to help us understand the data!

Data Production

Where can we find good data?

Library

Internet

www.nces.ed.gov (National Center for Education Statistics website)

www.fedstats.gov (good source for projects)

Statistical offices of foreign countries (www.statcan.ca, www.inegi.gob.mx)

Is this good data?
• Suppose you want to find out whether your classmates prefer cheeseburgers from McDonald's or Burger King. You decide to ask 50 people under the age of 20 which fast-food restaurant they prefer. To save time and energy, you conduct your survey at the McDonald’s closest to campus. Is there a problem with this?
Does drinking at least five carbonated sodas a week improve a student’s GPA?
• Observation: Compare the GPAs of a sample of students who drink more than five sodas a week with those of students who drink fewer.
• Experiment: From a random group of students, require some to drink more than five sodas per week, and require the rest to drink fewer. After a couple of years, compare their GPAs.

In 1976, Shere Hite published The Hite Report on Female Sexuality (Seven Stories Press, New York, NY, 2004). The conclusions reported in her book were based on 3,000 surveys returned out of 100,000 distributed by women’s groups. The results suggested that women were highly critical of men. In what ways might the author’s findings have been biased?

• Six W's and an H: Who, What, Why, How, Where, When, By Whom?
• Who – is being studied
• What – are the variables
• Why – was the data gathered
• How – was the data produced
• Where – was the data gathered
• When – was the data produced
• By Whom – who directed it, can we trust it?
Exploratory Data Analysis: Examining data in order to describe their main features. (What do you see?)

Two steps:

1) Examine the variables

2) Graph them.

Distribution of a Variable: what values the variable takes on and how often it takes these values.

The pattern of a variable is its distribution.

Dotplots

Number of goals scored by the US women’s soccer team in 34 games played in the 2004 season are:

3 0 2 7 8 2 4 3 5 1 1 4 5 3 1 1 3 3 3 2 1 2 2 2 4 3 5 6 1 5 5 1 1 5

What does this tell us about the performance of the US women’s team in 2004?
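As a rough sketch of how such a dotplot could be built, the Python below tallies the 34 values listed above and prints one dot per game next to each goal count. The helper name `text_dotplot` is made up for this illustration; only the data come from the notes.

```python
from collections import Counter

# Goals per game, copied from the list above (34 games).
goals = [3, 0, 2, 7, 8, 2, 4, 3, 5, 1, 1, 4, 5, 3, 1, 1, 3,
         3, 3, 2, 1, 2, 2, 2, 4, 3, 5, 6, 1, 5, 5, 1, 1, 5]

# The distribution: which values occur and how often each occurs.
counts = Counter(goals)


def text_dotplot(counts):
    """Return one line per value from min to max: the value, then one dot per game."""
    lines = []
    for value in range(min(counts), max(counts) + 1):
        lines.append(f"{value:>2} | " + "." * counts.get(value, 0))
    return "\n".join(lines)


print(text_dotplot(counts))
```

Reading the printed rows from top to bottom gives the same picture as a dotplot turned on its side: where the dots pile up, how spread out they are, and whether any games stand apart from the rest.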

Exploring Relationships between variables

Air travelers would like their flights to arrive on time. Airlines collect data about on-time arrivals and report them to the Department of Transportation. Here’s one month’s data for flights from several western cities for two airlines:

An association or comparison that holds for all of several groups can reverse direction when the data are combined to form a single group.
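The reversal described above can be checked with a few lines of arithmetic. The sketch below uses HYPOTHETICAL counts invented for this illustration (they are not the airline data from the lecture), with made-up labels "Airline A", "Airline B", "City X", and "City Y".

```python
# HYPOTHETICAL counts, invented for illustration only; these are NOT the
# airline data from the lecture. Each entry is (on-time flights, total flights).
flights = {
    "Airline A": {"City X": (95, 100), "City Y": (350, 500)},
    "Airline B": {"City X": (450, 500), "City Y": (60, 100)},
}


def city_rate(airline, city):
    """On-time rate for one airline at one city."""
    on_time, total = flights[airline][city]
    return on_time / total


def combined_rate(airline):
    """On-time rate for one airline with all cities pooled together."""
    on_time = sum(o for o, _ in flights[airline].values())
    total = sum(t for _, t in flights[airline].values())
    return on_time / total


for airline in flights:
    for city in flights[airline]:
        print(f"{airline}, {city}: {city_rate(airline, city):.1%}")
    print(f"{airline}, combined: {combined_rate(airline):.1%}")
```

With these made-up numbers, Airline A has the higher on-time rate in each city, yet Airline B has the higher rate once the cities are pooled, because most of Airline A's flights go through the city where delays are more common.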

Statistical Inference
• Population values (parameters) are fixed
• Sample values (statistics) vary from sample to sample.
• A sample value will not give us precise information about a population parameter (but if properly collected, it will provide us with reasonable bounds on a parameter).
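The three points above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population is a HYPOTHETICAL list of 10,000 made-up scores, and the sample size (50) and number of samples (20) are arbitrary choices for the example.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the run is repeatable

# HYPOTHETICAL population: 10,000 made-up scores between 0 and 100.
population = [rng.uniform(0, 100) for _ in range(10_000)]

# The parameter is a single fixed number describing the whole population.
parameter = mean(population)


def sample_means(population, n, repeats, rng):
    """Draw `repeats` simple random samples of size n; return each sample's mean."""
    return [mean(rng.sample(population, n)) for _ in range(repeats)]


# Each entry is a statistic: it changes from sample to sample.
sample_stats = sample_means(population, n=50, repeats=20, rng=rng)

print(f"population mean (parameter): {parameter:.2f}")
print(f"smallest sample mean: {min(sample_stats):.2f}")
print(f"largest sample mean:  {max(sample_stats):.2f}")
```

No single sample mean matches the parameter exactly, but the sample means cluster around it, which is what makes it possible to put reasonable bounds on a parameter from one properly collected sample.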
How unlikely must an event be before we conclude that it isn’t due to chance?

25%? 10%? 1%? 0.01?

Our willingness to declare an event “unlikely” is usually based on….
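One way to make the question above concrete is to compute how likely a surprising-looking result is when only chance is at work. The sketch below assumes a simple made-up scenario, 10 flips of a fair coin, and uses the binomial model; the function name `prob_at_least` is hypothetical.

```python
from math import comb


def prob_at_least(k, n, p=0.5):
    """P(at least k successes in n independent trials), each with success chance p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Assumed scenario: 10 flips of a fair coin. How surprising is a run of many heads?
for k in (7, 8, 9, 10):
    print(f"P(at least {k} heads in 10 fair flips) = {prob_at_least(k, 10):.4f}")
```

For example, at least 9 heads in 10 fair flips has probability 11/1024, or about 1.1%. Whether that is unlikely enough to doubt the coin depends on which cutoff from the list above we are willing to adopt.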

