Section 1-1 Review and Preview
Polls, studies, surveys, and other data-collecting tools gather data from a small part of a larger group so that we can learn something about the larger group. This is a common and important goal of statistics: learn about a large group by examining data from some of its members.
Data: collections of observations (such as measurements, genders, survey responses).
Statistics: the science of planning studies and experiments, obtaining data, and then organizing, summarizing, presenting, analyzing, interpreting, and drawing conclusions based on the data.
Population: the complete collection of all individuals (scores, people, measurements, and so on) to be studied; the collection is complete in the sense that it includes all of the individuals to be studied.
Census: the collection of data from every member of a population.
Sample: a subcollection of members selected from a population.
Sample data must be collected in an appropriate way, such as through a process of random selection.
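As a minimal sketch of what "a process of random selection" can look like in practice, the Python snippet below draws a simple random sample in which every member of the population is equally likely to be chosen. The population of ID numbers, the sample size, and the function name `simple_random_sample` are illustrative assumptions, not part of the text.

```python
import random

def simple_random_sample(population, k, seed=None):
    """Return k distinct members, each member equally likely to be chosen."""
    rng = random.Random(seed)  # the seed only makes the draw repeatable
    return rng.sample(population, k)

# Hypothetical population: ID numbers 1 through 1000 (an assumption for illustration).
population = list(range(1, 1001))

# The sample is the subcollection we would actually examine.
sample = simple_random_sample(population, k=25, seed=1)
print(sample)
```

Selecting without replacement (as `random.sample` does) guarantees that no member appears in the sample twice.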
If sample data are not collected in an appropriate way, the data may be so completely useless that no amount of statistical torturing can salvage them.
This section introduces basic principles of statistical thinking used throughout this book. Whether conducting statistical analysis of data that we have collected, or analyzing a statistical analysis done by someone else, we should not rely on blind acceptance of mathematical calculation. We should consider these factors:
Refer to the data in the table below. The x-values are weights (in pounds) of cars; the y-values are the corresponding highway fuel consumption amounts (in miles/gallon).
The x-values are matched with the y-values. It does not make sense to use the difference between each x-value and the y-value that is in the same column. The x-values are weights (in pounds) and the y-values are fuel consumption amounts (in mi/gal), so the differences are meaningless.
b) Given the context of the car measurement data, what issue can be addressed by conducting a statistical analysis of the values?
Is there a relationship or association between the weight of a car and its fuel consumption amount?
c) Comment on the source of the data if you are told that car manufacturers supplied the values. Is there an incentive for car manufacturers to report values that are not accurate?
Consumers know that some cars are more expensive to operate because they consume more fuel, so consumers may be inclined to purchase cars with better fuel efficiency. Car manufacturers can then profit by selling cars that appear to have high levels of fuel efficiency, so there would be an incentive to make the fuel consumption amounts appear to be as favorable as possible. In this case, the source of the data would be suspect with a potential for bias.
d) If we use statistical methods to conclude that there is a correlation (or relationship or association) between the weights of cars and the amounts of fuel consumption, can we conclude that adding weight to a car causes it to consume more fuel?
No. A conclusion of a correlation (or association) does not imply that one variable is the cause of the other.
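To make the idea of a correlation concrete, here is a rough sketch that computes the linear correlation coefficient for paired weight and fuel consumption values. The numbers are made up for illustration (they are not the values from the table referred to above), and `pearson_r` is a hypothetical helper name. A value of r near -1 or 1 indicates a strong linear association, but it says nothing about whether one variable causes the other.

```python
import math

def pearson_r(xs, ys):
    """Linear correlation coefficient for paired values xs[i], ys[i]."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired data (assumed values, NOT from the original table).
weights_lb = [2500, 3000, 3500, 4000, 4500]   # x: car weight in pounds
highway_mpg = [36, 33, 30, 27, 25]            # y: fuel consumption in mi/gal

r = pearson_r(weights_lb, highway_mpg)
print(f"r = {r:.3f}")  # negative: heavier cars pair with lower mi/gal here
```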
Form a conclusion about statistical significance. Do not make any formal calculation. Either use the results provided or make a subjective judgment about the results: In a study of the Ornish weight loss program, 40 subjects lost a mean of 3.3 lb after 12 months (based on data from “Comparison of the Atkins, Ornish, Weight Watchers, and Zone Diets for Weight Loss and Heart Disease Risk Reduction,” by Dansinger et al., Journal of the American Medical Association, Vol. 293, No. 1). Methods of statistics can be used to show that if this diet had no effect, the likelihood of getting these results is roughly 3 chances in 1000. Does the Ornish weight loss program have statistical significance? Does it have practical significance? Why or why not?
The Ornish weight loss program has statistical significance, because the results are very unlikely (about 3 chances in 1000) to occur by chance. It does not have practical significance, because the amount of weight lost (3.3 lb) is so small.
Form a conclusion about statistical significance. Do not make any formal calculation. Either use the results provided or make a subjective judgment about the results: One of Gregor Mendel’s famous hybridization experiments with peas yielded 580 offspring with 152 of those peas (or 26%) having yellow pods. According to Mendel’s theory, 25% of the offspring peas should have yellow pods. Do the results of the experiment differ from Mendel’s claimed rate of 25% by an amount that is statistically significant?
Determining whether the difference between Mendel’s actual results (26%) and the results predicted by his theory (25%) is statistically significant requires techniques presented in later chapters. Common sense suggests that a difference of 1 percentage point has no practical significance. In terms of counts, the theory predicts 0.25 × 580 = 145 peas with yellow pods, so the observed and expected results differ by 152 – 145 = 7. Common sense suggests that a discrepancy of 7 (relative to an expected 145 peas out of a total sample of 580) is within the natural fluctuation inherent in normal biological processes, and that the difference is not statistically significant.
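The arithmetic behind this answer can be checked with a short sketch. It is only one possible way to quantify "natural fluctuation", and it assumes each of the 580 peas independently has a 25% chance of a yellow pod (a binomial model that the text does not state explicitly). The variable names are illustrative.

```python
import math

n = 580          # offspring peas in the experiment
observed = 152   # peas with yellow pods
p = 0.25         # rate claimed by Mendel's theory

expected = n * p                      # 145.0 peas expected under the theory
difference = observed - expected      # 7.0 more than expected
observed_rate = observed / n          # about 0.262, reported as 26%

# Under the assumed binomial model, counts typically vary by about
# one standard deviation around the expected value.
sd = math.sqrt(n * p * (1 - p))
z = difference / sd                   # about 0.67 standard deviations

# Chance of 152 or more yellow-pod peas if the 25% rate is correct.
tail = sum(math.comb(n, k) * p**k * (1 - p)**(n - k)
           for k in range(observed, n + 1))

print(f"expected = {expected}, difference = {difference}")
print(f"z = {z:.2f}, P(X >= {observed}) = {tail:.3f}")
```

A discrepancy of well under one standard deviation is consistent with the common-sense judgment that the difference is not statistically significant.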
For Examples 4-7, use common sense to determine whether the given event is (a) impossible; (b) possible, but very unlikely; or (c) possible and likely.
Example 4: The Chicago Cubs win the 2012 World Series.
Example 5: The Chicago Cubs win the World Series within the next five years.
Possible, but very unlikely
Example 6: Halloween will fall on a Wednesday this year.
Example 7: All of the students in this class will post something on Facebook this week.
Possible and likely
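The Halloween question in Example 6 is not really a matter of chance: for any particular year the day of the week is fixed, so the event is either certain or impossible. A small sketch for checking a given year follows; the function name `halloween_weekday` is a hypothetical helper, and the year passed in is up to the reader.

```python
from datetime import date

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

def halloween_weekday(year):
    """Return the day of the week on which October 31 falls in the given year."""
    # date.weekday() returns 0 for Monday through 6 for Sunday.
    return DAY_NAMES[date(year, 10, 31).weekday()]

print(halloween_weekday(2012))  # Wednesday
```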