
# Statistics - PowerPoint PPT Presentation

Statistics. Chapter 10. 10.1 Organizing and Picturing Information. Line Plot: A line plot is a basic and intuitive visual representation of data.


## PowerPoint Slideshow about 'Statistics' - temple

Presentation Transcript

### Statistics

Chapter 10

10.1 Organizing and Picturing Information

Line Plot:

A line plot is a basic and intuitive visual representation of data.

[Figure: line plot. Horizontal axis: Student Test Scores, 10 to 100. Vertical axis: Frequency, 2 to 16.]

Example: 30 fourth graders took a science test and made the following scores. What can we conclude about the students’ performance?

22, 23, 14, 45, 39, 11, 9, 46, 22, 25, 6, 28, 33, 36, 16, 39, 49, 17, 22, 32, 34, 22, 18, 21, 27, 34, 26, 41, 28, 25

Step 1: Put data in ascending order.

6, 9, 11, 14, 16, 17, 18, 21, 22, 22, 22, 22, 23, 25, 25, 26, 27, 28, 28, 32, 33, 34, 34, 36, 39, 39, 41, 45, 46, 49

Step 2: Place one dot for each score.

[Figure: line plot. Horizontal axis: Science Test Scores, 5 to 50. Vertical axis: Frequency, 1 to 5.]
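The two steps of the line plot example can be sketched in a few lines of Python. This is only an illustration: the function name `line_plot_rows` is hypothetical, and the plot is turned on its side (one row per score, one `*` per occurrence) so that it prints as plain text.

```python
from collections import Counter

# Science test scores from the example.
scores = [22, 23, 14, 45, 39, 11, 9, 46, 22, 25, 6, 28, 33, 36, 16,
          39, 49, 17, 22, 32, 34, 22, 18, 21, 27, 34, 26, 41, 28, 25]

def line_plot_rows(data):
    """Return one text row per distinct value: the value, then one mark per occurrence."""
    counts = Counter(data)
    # Step 1: visit the values in ascending order.
    # Step 2: place one mark for each time the value occurs.
    return [f"{value:>3} | " + "*" * counts[value] for value in sorted(counts)]

rows = line_plot_rows(scores)
for row in rows:
    print(row)
```

The longest row belongs to 22, which occurs four times, and most of the marks cluster between 20 and 40.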

Stem and Leaf Plots

A stem and leaf plot organizes data by place value. Placed back to back, two such plots are an effective way to present two sets of data side by side for analysis.

For a two digit number the “stem” is the tens place, and the “leaf” is the ones place.

Stems are listed once in ascending order vertically. Leaves are placed in increasing order away from the stem, and may be repeated if necessary.

Example: Make a stem and leaf plot for the following children’s heights, in centimeters.

94, 105, 107, 108, 108, 120, 121, 122, 123

For three digit numbers, the stem is the hundreds and tens positions, and the leaf is the ones position.
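As a rough sketch, the heights example can be turned into a stem and leaf plot as follows. The function name `stem_and_leaf` is hypothetical, and listing stems that have no leaves (here, 11) is an assumed convention that keeps gaps in the data visible.

```python
from collections import defaultdict

# Children's heights in centimeters, from the example.
heights_cm = [94, 105, 107, 108, 108, 120, 121, 122, 123]

def stem_and_leaf(data):
    """Map each stem (all digits but the last) to its leaves (last digits) in increasing order."""
    leaves_by_stem = defaultdict(list)
    for value in sorted(data):
        stem, leaf = divmod(value, 10)  # 108 -> stem 10, leaf 8
        leaves_by_stem[stem].append(leaf)
    # Assumption: list every stem between the smallest and largest, even if it has no leaves.
    first, last = min(leaves_by_stem), max(leaves_by_stem)
    return {stem: leaves_by_stem.get(stem, []) for stem in range(first, last + 1)}

plot = stem_and_leaf(heights_cm)
for stem, leaves in plot.items():
    print(f"{stem:>2} | " + " ".join(str(leaf) for leaf in leaves))
```

The repeated height 108 shows up as two 8 leaves on the stem 10.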

Histograms

A bar graph used to graph frequency distributions of continuous variables is called a histogram.

The graph is similar to a bar graph, but no spaces are allowed between the bars.

Grouping Data Values into Classes

When there are many different values in a data set, we may group the data values into classes to better understand the information.

Typically we use between 8 and 12 classes, but there is no rule that dictates the number we must use. Choose wisely.
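A minimal sketch of grouping values into classes, using the science test scores from the line plot example. The class width of 10 and the starting point of 0 are assumptions chosen for readability. They give only five classes, so a narrower width would be needed to land in the 8 to 12 range. The function name `group_into_classes` is hypothetical, and the class labels assume whole-number data.

```python
scores = [22, 23, 14, 45, 39, 11, 9, 46, 22, 25, 6, 28, 33, 36, 16,
          39, 49, 17, 22, 32, 34, 22, 18, 21, 27, 34, 26, 41, 28, 25]

def group_into_classes(data, class_width, start):
    """Count how many values fall in each class, keyed by (lowest, highest) whole number."""
    classes = {}
    lower = start
    while lower <= max(data):
        upper = lower + class_width
        # A value belongs to the class when lower <= value < upper.
        classes[(lower, upper - 1)] = sum(lower <= x < upper for x in data)
        lower = upper
    return classes

classes = group_into_classes(scores, class_width=10, start=0)
for (low, high), frequency in classes.items():
    print(f"{low:>2}-{high:<2}  {frequency}")
```

The frequencies returned here are the bar heights a histogram of the same classes would show.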

Histograms

• Histograms can also be drawn with percentages, like the bar graph.

Bar Graphs

Bar graph: Specify the classes on the horizontal axis and the frequencies on the vertical axis.

Pictographs

A pictograph uses a picture or icon to symbolize the quantities being represented.

Pictorial embellishments are used to make the graph more visually appealing.

Scatterplots

Data that occurs in pairs, such as dates and temperatures, or the selling price of a home and its appraised value, can be plotted on a set of axes similar to an xy plane. Such a plot is called a scatterplot.

10.2 Analyzing Data

One of the ways to summarize data numerically is to calculate measures of center.

The measures we will use are the mean, median, mode and quartiles.

We will also be examining the Five Number Summary.

Mean

The arithmetic mean is what we usually refer to as the “average.”

To calculate the mean, we add up all the data points and divide by the number of data points.

Median

If we arrange a set of numbers in order, the median is the middle value in the list of numbers.

Case 1: Odd number of data points: The median is the data point in the middle position.

Case 2: Even number of data points: The median is the average of the two middle numbers and need not be a data point.

Mode

The mode is the most frequent data point in the set.

There can be more than one mode. If there are two modes, the data set is “bimodal”.
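The three measures can be checked with Python's standard `statistics` module. The sketch below applies them to the science test scores from the line plot example. The variable names are illustrative, and `multimode` assumes Python 3.8 or later.

```python
from statistics import mean, median, multimode

scores = [22, 23, 14, 45, 39, 11, 9, 46, 22, 25, 6, 28, 33, 36, 16,
          39, 49, 17, 22, 32, 34, 22, 18, 21, 27, 34, 26, 41, 28, 25]

average = mean(scores)     # sum of the data points divided by the number of data points
middle = median(scores)    # 30 values, so this averages the 15th and 16th in sorted order
modes = multimode(scores)  # list of every most frequent value; two entries would mean bimodal

print(average, middle, modes)
```

For these scores the mean is 27, the median is 25.5 (the average of 25 and 26, so not a data point), and the only mode is 22.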

The Five Number Summary

The median divides the data set into two halves. The set below the median is the lower half, and the set above the median is the upper half.

The median of the lower half is the first quartile, Q1. The median of the upper half is the third quartile, Q3.

The low data point, Q1, the median, Q3 and the high data point form the five number summary.
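The halves-and-medians procedure can be sketched as below, again on the science test scores. The function name `five_number_summary` is hypothetical. When the number of data points is odd, this sketch leaves the median out of both halves, which is an assumption, since conventions differ.

```python
from statistics import median

scores = [22, 23, 14, 45, 39, 11, 9, 46, 22, 25, 6, 28, 33, 36, 16,
          39, 49, 17, 22, 32, 34, 22, 18, 21, 27, 34, 26, 41, 28, 25]

def five_number_summary(data):
    """Return (low, Q1, median, Q3, high) using the median of each half."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]
    # Assumption: with an odd count, the middle value belongs to neither half.
    upper_half = ordered[half + 1:] if n % 2 else ordered[half:]
    return (ordered[0], median(lower_half), median(ordered),
            median(upper_half), ordered[-1])

summary = five_number_summary(scores)
print(summary)
```

For the 30 scores, each half has 15 values, so Q1 and Q3 are the 8th values of their halves: 21 and 34.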

Box and Whisker Plots

The graph of the five number summary is called a box and whisker plot.

[Figure: box and whisker plot marked, left to right, at min, Q1, med, Q3, max.]

Percentiles

If we were to divide the data into 100 equal parts, percentiles could be used to mark the dividing points in the data.

A number is in the nth percentile of some data if it is greater than or equal to n% of the data.
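Under that definition, the percentile of a score can be estimated by counting how much of the data it is greater than or equal to. The sketch below is one reading of the definition, not a standard library routine. The name `percentile_rank` is hypothetical, and other texts define percentiles slightly differently.

```python
scores = [22, 23, 14, 45, 39, 11, 9, 46, 22, 25, 6, 28, 33, 36, 16,
          39, 49, 17, 22, 32, 34, 22, 18, 21, 27, 34, 26, 41, 28, 25]

def percentile_rank(data, x):
    """Percent of the data values that x is greater than or equal to."""
    at_or_below = sum(value <= x for value in data)
    return 100 * at_or_below / len(data)

rank_of_34 = percentile_rank(scores, 34)
print(round(rank_of_34, 1))
```

A score of 34 is greater than or equal to 23 of the 30 scores, which puts it at roughly the 77th percentile.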

Measures of Dispersion

Definitions:

The range is the difference between the largest and the smallest data values in the set.

If $x$ is a data value in a set whose mean is $\bar{x}$, then $x - \bar{x}$ is called $x$’s deviation from the mean.

Standard Deviation

The standard deviation measures how far off the mean a data point is “on average”. Think of standard deviation as the “average deviation” of a data set.

Formula:
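One common form, written with $\bar{x}$ for the mean and $s$ for the standard deviation of $n$ data values $x_1, \dots, x_n$. Both the symbols and the divisor $n - 1$ are assumptions here, since some texts divide by $n$ instead:

```latex
s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}
```

Each term $(x_i - \bar{x})$ is a deviation from the mean. Squaring keeps positive and negative deviations from canceling.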

Definition:

The z-score, $z$, for a particular score, $x$, is $z = \dfrac{x - \bar{x}}{s}$, where $\bar{x}$ is the mean and $s$ is the standard deviation.

The z-score indicates how many standard deviations the number is away from the mean. Numbers above the mean = positive z-score. Numbers below the mean = negative z-score.
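A short sketch of the range, deviations, standard deviation and z-score for the science test scores from the line plot example. Both common divisors are computed, because which one a given text uses is an assumption. The variable and function names are illustrative.

```python
import math

scores = [22, 23, 14, 45, 39, 11, 9, 46, 22, 25, 6, 28, 33, 36, 16,
          39, 49, 17, 22, 32, 34, 22, 18, 21, 27, 34, 26, 41, 28, 25]

n = len(scores)
mean = sum(scores) / n
data_range = max(scores) - min(scores)      # largest value minus smallest value
deviations = [x - mean for x in scores]     # each value's deviation from the mean
sum_sq = sum(d * d for d in deviations)     # sum of squared deviations

sample_sd = math.sqrt(sum_sq / (n - 1))     # divisor n - 1 (sample convention)
population_sd = math.sqrt(sum_sq / n)       # divisor n (population convention)

def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

print(data_range, round(sample_sd, 2), round(z_score(49, mean, sample_sd), 2))
```

The deviations themselves always sum to zero, which is why they are squared before being averaged.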

Distributions

Definitions: A collection of numerical information is called data or a distribution. A set of data listed with their frequencies is called a frequency distribution.

When the percent of the time each item occurs in a frequency distribution is listed, we call the distribution a relative frequency distribution.

Bar graphs can be drawn using frequency distributions or relative frequency distributions.
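Both kinds of distribution can be built from raw data with a counter. The sketch below uses the science test scores from the line plot example. The function names are hypothetical.

```python
from collections import Counter

scores = [22, 23, 14, 45, 39, 11, 9, 46, 22, 25, 6, 28, 33, 36, 16,
          39, 49, 17, 22, 32, 34, 22, 18, 21, 27, 34, 26, 41, 28, 25]

def frequency_distribution(data):
    """Map each distinct value to the number of times it occurs, in ascending order."""
    return dict(sorted(Counter(data).items()))

def relative_frequency_distribution(data):
    """Map each distinct value to the percent of the data it accounts for."""
    n = len(data)
    return {value: 100 * count / n
            for value, count in frequency_distribution(data).items()}

freq = frequency_distribution(scores)
rel = relative_frequency_distribution(scores)
print(freq[22], round(rel[22], 1))
```

The frequencies add up to the number of data points, and the relative frequencies add up to 100%.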

Bar graph using Relative Frequency

Bar graph using Frequency Distribution

The Normal Distribution

When describing a set of data, statisticians often look to the shape of the data. One special shape that occurs frequently is a bell curve. The bell curve indicates that a distribution is “normal”.

Characteristics of Normal Distribution
• Bell shaped curve.
• Highest point of curve is at the mean.
• Mean = median = mode.
• Curve is symmetric about the mean.
• Total area under the curve is 1.
• Points of inflection lie 1 standard deviation away from the mean.
• About 68% of the data lies within one standard deviation of the mean, about 95% within two, and about 99.7% within three.

Normal Distribution

When discussing normal distributions, we assume we are dealing with an entire population rather than a sample. To indicate this, we change the symbols representing the mean and standard deviation.

Mean. Before: $\bar{x}$. Now: $\mu$.

Standard Deviation. Before: $s$. Now: $\sigma$.

Area Under the Normal Curve

Areas under the curve represent percentages (or probabilities) of values in a distribution.

To address this idea properly and generally, we need something called the standard normal distribution.

This distribution is also called a “z distribution.”

Standard Normal Distribution

68% of the data lies between z = -1 and z = 1

95% of the data lies between z = -2 and z = 2

99.7% of the data lies between z = -3 and z = 3
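These three percentages can be reproduced numerically. The sketch below computes areas under the standard normal curve from the error function in Python's `math` module. The helper names `standard_normal_cdf` and `area_between` are hypothetical.

```python
import math

def standard_normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def area_between(z_low, z_high):
    """Area under the standard normal curve between two z-scores."""
    return standard_normal_cdf(z_high) - standard_normal_cdf(z_low)

for k in (1, 2, 3):
    print(f"z = -{k} to z = {k}: {100 * area_between(-k, k):.1f}%")
```

The computed areas are approximately 68.3%, 95.4% and 99.7%, so the figures 68% and 95% are rounded values.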

Scaling and Axis Manipulation

To make the differences among bars of a histogram or bar chart more dramatic, the axes are often manipulated, either by changing the scales or omitting the scale values.

Line Graphs and Cropping

To manipulate a line graph, one could either compress the vertical axis scaling or extend the scaling, whichever fits the desired effect.

Circle Graphs

A circle graph can be misleading by not indicating the percent amounts, not having the correct central angle, or by illustrating the graph by “exploding” sectors.

Sampling

The entire group in question is called the population. The subset of the population that is actually questioned is called a sample.

Bias

A bias is a flaw in the sampling procedure that makes it more likely the sample will not represent the entire population.