SEVENTH EDITION and EXPANDED SEVENTH EDITION

Chapter 13


Statistics


13.1

Sampling Techniques


Statistics

  • Statistics is the art and science of gathering, analyzing, and making inferences from numerical information (data) obtained in an experiment.

  • Statistics is divided into two main branches.

    • Descriptive statistics is concerned with the collection, organization, and analysis of data.

    • Inferential statistics is concerned with making generalizations or predictions from the data collected.


Statisticians

  • A statistician’s interest lies in drawing conclusions about possible outcomes through observations of only a few particular events.

    • The population consists of all items or people of interest.

    • The sample includes some of the items in the population.

  • When a statistician draws a conclusion from a sample, there is always the possibility that the conclusion is incorrect.


Types of Sampling

  • A random sampling occurs if a sample is drawn in such a way that each time an item is selected, each item has an equal chance of being drawn.

  • When a sample is obtained by drawing every nth item on a list or production line, the sample is a systematic sample.

  • A cluster sample is referred to as an area sample because it is applied on a geographical basis.


Types of Sampling continued

  • Stratified sampling involves dividing the population by characteristics such as gender, race, religion, or income.

  • Convenience sampling uses data that is easily obtained and can be extremely biased.
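The five sampling techniques described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the `students` list, and the fixed seed are hypothetical and do not come from the slides.

```python
import random


def random_sample(population, k, rng):
    """Random sample: every item has an equal chance of being drawn."""
    return rng.sample(population, k)


def systematic_sample(population, n, start=0):
    """Systematic sample: take every nth item, beginning at index `start`."""
    return population[start::n]


def stratified_sample(population, key, k, rng):
    """Stratified sample: divide by a characteristic, then draw k items per stratum."""
    strata = {}
    for item in population:
        strata.setdefault(key(item), []).append(item)
    return {label: rng.sample(members, k) for label, members in strata.items()}


def cluster_sample(clusters, k, rng):
    """Cluster sample: pick k whole clusters at random and keep every member."""
    chosen = rng.sample(sorted(clusters), k)
    return {name: clusters[name] for name in chosen}


def convenience_sample(population, k):
    """Convenience sample: take the first k items that are easy to reach."""
    return population[:k]


rng = random.Random(13)  # fixed seed (an assumption) so the sketch is repeatable

# Hypothetical population: 10 students in each of grades 1 through 5.
students = [(f"student{i}", grade) for grade in range(1, 6) for i in range(10)]

print(random_sample(students, 5, rng))
print(systematic_sample(list(range(1, 31)), 6, start=5))  # every sixth car
print(stratified_sample(students, key=lambda s: s[1], k=3, rng=rng))
```

Passing an explicit `random.Random` instance keeps the random draws reproducible; in practice the seed would normally be left unset.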


Example: Identifying Sampling Techniques

  • a) A raffle ticket is drawn by a blindfolded person at a festival to win a grand prize.

  • b) Students at an elementary school are classified according to their present grade level. Then, a random sample of three students from each grade is chosen to represent their class.

  • c) Every sixth car on the highway is stopped for a vehicle inspection.


Example: Identifying Sampling Techniques continued

  • d) Voters are classified based on their polling location. A random sample of four polling locations is selected. All the voters from the selected locations are included in the sample.

  • e) The first 20 people entering a water park are asked if they are wearing sunscreen.

    Solution:

    a) Random
    b) Stratified
    c) Systematic
    d) Cluster
    e) Convenience


13.2

The Misuses of Statistics


Misuses of Statistics

  • Many individuals, businesses, and advertising firms misuse statistics to their own advantage.

  • When examining statistical information consider the following:

    • Was the sample used to gather the statistical data unbiased and of sufficient size?

    • Is the statistical statement ambiguous; that is, could it be interpreted in more than one way?


Example: Misleading Statistics

  • An advertisement says, “Fly Speedway Airlines and Save 20%.”

    Here there is not enough information given. The “Save 20%” could be off the original ticket price, off the ticket price when you buy two tickets, or off another airline’s ticket price.

  • A help-wanted ad reads, “Salesperson wanted for Ryan’s Furniture Store. Average Salary: $32,000.”

    The word “average” can be very misleading. If most of the salespeople earn $20,000 to $25,000 and the owner earns $76,000, this “average salary” is not a fair representation.


Charts and Graphs

  • Charts and graphs can also be misleading.

    • Even though the data is displayed correctly, adjusting the vertical scale of a graph can give a different impression.

    • A circle graph can be misleading if the parts of the graph do not add up to 100%.


Example: Misleading Graphs

While each graph presents identical information, the vertical scales have been altered.


13.3

Frequency Distributions


Example

  • The number of pets per family is recorded for 30 families surveyed. Construct a frequency distribution of the following data:

    0 0 0 0 0 0 1 1 1 1
    1 1 1 1 1 1 2 2 2 2
    2 2 2 2 3 3 3 3 4 4


Solution

    Number of Pets    Frequency
    0                 6
    1                 10
    2                 8
    3                 4
    4                 2
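The tally for the pets example can be reproduced with a short Python sketch. The variable and function names are illustrative assumptions; the data are the 30 values listed in the example.

```python
from collections import Counter

# Number of pets per family for the 30 families in the example.
pets = [0] * 6 + [1] * 10 + [2] * 8 + [3] * 4 + [4] * 2


def frequency_distribution(data):
    """Count how many times each observed value occurs."""
    return Counter(data)


frequency = frequency_distribution(pets)
for value in sorted(frequency):
    print(value, frequency[value])
```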


Rules for Data Grouped by Classes

  • The classes should be of the same “width.”

  • The classes should not overlap.

  • Each piece of data should belong to only one class.


Definitions

  • Midpoint of a class is found by adding the lower and upper class limits and dividing the sum by 2.


Example

  • The following set of data represents the distance, in miles, 15 randomly selected second grade students live from school.

    6.8 5.3 9.7 3.8 8.7
    0.5 5.9 0.8 5.7 1.3
    4.8 9.6 1.5 7.4 0.2

    Construct a frequency distribution with the first class 0 - 2.


Solution

First, rearrange the data from lowest to highest.

    0.2 0.5 0.8 1.3 1.5
    3.8 4.8 5.3 5.7 5.9
    6.8 7.4 8.7 9.6 9.7

    # of miles from school    Frequency
    0 - 2                     5
    2.1 - 4.1                 1
    4.2 - 6.2                 4
    6.3 - 8.3                 2
    8.4 - 10.4                3
    Total                     15
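Grouping data by classes can be sketched as follows, using the class limits from the solution. The helper names are hypothetical. The class limits are written out explicitly rather than generated, which sidesteps floating-point drift when stepping by 2.1.

```python
distances = [6.8, 5.3, 9.7, 3.8, 8.7, 0.5, 5.9, 0.8,
             5.7, 1.3, 4.8, 9.6, 1.5, 7.4, 0.2]

# (lower limit, upper limit) of each class; the classes do not overlap.
classes = [(0.0, 2.0), (2.1, 4.1), (4.2, 6.2), (6.3, 8.3), (8.4, 10.4)]


def grouped_frequencies(data, class_limits):
    """Count the pieces of data falling in each class (limits inclusive)."""
    counts = {limits: 0 for limits in class_limits}
    for value in data:
        for lower, upper in class_limits:
            if lower <= value <= upper:
                counts[(lower, upper)] += 1
                break
        else:
            # Each piece of data should belong to exactly one class.
            raise ValueError(f"{value} does not fall in any class")
    return counts


def midpoint(lower, upper):
    """Midpoint of a class: add the class limits and divide the sum by 2."""
    return (lower + upper) / 2


counts = grouped_frequencies(distances, classes)
for (lower, upper), count in counts.items():
    print(f"{lower} - {upper}: {count}")
```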


13.4

Statistical Graphs


Circle Graphs

  • Circle graphs (also known as pie charts) are often used to compare parts of one or more components of the whole to the whole.


Example

  • According to a recent hospital survey of 200 patients, the following table indicates how often hospitals used four different kinds of painkillers. Use the information to construct a circle graph illustrating the percent each painkiller was used.

    Painkiller       Number of Patients
    Aspirin          56
    Ibuprofen        104
    Acetaminophen    16
    Other            24
    Total            200


Solution

  • Determine the measure of the corresponding central angle.

    Painkiller       Number of Patients    Percent of Total     Measure of Central Angle
    Aspirin          56                    56/200 = 28%         0.28 × 360 = 100.8°
    Ibuprofen        104                   104/200 = 52%        0.52 × 360 = 187.2°
    Acetaminophen    16                    16/200 = 8%          0.08 × 360 = 28.8°
    Other            24                    24/200 = 12%         0.12 × 360 = 43.2°
    Total            200                   100%                 360°
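The percent and central-angle calculation for each sector can be checked with a short Python sketch. The function name and dictionary layout are assumptions made for illustration.

```python
patients = {"Aspirin": 56, "Ibuprofen": 104, "Acetaminophen": 16, "Other": 24}


def central_angles(counts):
    """Return {category: (fraction of total, central angle in degrees)}."""
    total = sum(counts.values())
    return {
        name: (count / total, count / total * 360)
        for name, count in counts.items()
    }


sectors = central_angles(patients)
for name, (fraction, angle) in sectors.items():
    print(f"{name}: {fraction:.0%} of total, {angle:.1f} degrees")
```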


Solution continued

  • Use a protractor to construct a circle graph and label it properly.


Histogram

  • A histogram is a graph with observed values on its horizontal scale and frequencies on its vertical scale.

  • Example: Construct a histogram of the frequency distribution.

    # of pets    Frequency
    0            6
    1            10
    2            8
    3            4
    4            2


Solution

    [Histogram: # of pets (0 through 4) on the horizontal scale and frequency on the vertical scale, with bars of height 6, 10, 8, 4, and 2.]



Stem-and-Leaf Display

  • A stem-and-leaf display is a tool that organizes and groups the data while allowing us to see the actual values that make up the data.

  • The left group of digits is called the stem.

  • The right group of digits is called the leaf.


Example

  • The table below indicates the number of miles 20 workers have to drive to work. Construct a stem-and-leaf display.

    12 18  3  8 12
    25 21  3 15  4
    17 27 43 21 16
    12 26 35 14  9


Solution

    Stem | Leaves
    0    | 3 3 4 8 9
    1    | 2 2 2 4 5 6 7 8
    2    | 1 1 5 6 7
    3    | 5
    4    | 3
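A stem-and-leaf display like the one in this solution can be built with a small Python sketch. It assumes non-negative whole numbers, with the tens digit as the stem and the units digit as the leaf. The function name is hypothetical.

```python
miles = [12, 18, 3, 8, 12, 25, 21, 3, 15, 4,
         17, 27, 43, 21, 16, 12, 26, 35, 14, 9]


def stem_and_leaf(data):
    """Group sorted values by stem (tens digit), collecting leaves (units digit)."""
    display = {}
    for value in sorted(data):
        stem, leaf = divmod(value, 10)
        display.setdefault(stem, []).append(leaf)
    return display


display = stem_and_leaf(miles)
for stem, leaves in display.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```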


13.5

Measures of Central Tendency


Definitions

  • An average is a number that is representative of a group of data.

  • The arithmetic mean, or simply the mean, is symbolized by \( \bar{x} \) or by the Greek letter mu, \( \mu \).


Mean

  • The mean, \( \bar{x} \), is the sum of the data divided by the number of pieces of data. The formula for calculating the mean is

    \[ \bar{x} = \dfrac{\sum x}{n} \]

    where \( \sum x \) represents the sum of all the data and n represents the number of pieces of data.


Example: Find the Mean

  • Find the mean amount of money parents spent on new school supplies and clothes if 5 parents randomly surveyed replied as follows: $327 $465 $672 $150 $230

    \[ \bar{x} = \dfrac{327 + 465 + 672 + 150 + 230}{5} = \dfrac{1844}{5} = 368.8 \]

    The mean is $368.80.


Median

  • The median is the value in the middle of a set of ranked data.

  • Example: Determine the median of $327 $465 $672 $150 $230.

    Rank the data from smallest to largest.

    $150 $230 $327 $465 $672

    The middle value, $327, is the median.


Example: Median (even data)

  • Determine the median of the following set of data: 8, 15, 9, 3, 4, 7, 11, 12, 6, 4.

    Rank the data:

    3 4 4 6 7 8 9 11 12 15

    There are 10 pieces of data, so the median lies halfway between the two middle pieces, 7 and 8. The median is (7 + 8)/2 = 7.5.


Mode

  • The mode is the piece of data that occurs most frequently.

  • Example: Determine the mode of the data set: 3, 4, 4, 6, 7, 8, 9, 11, 12, 15.

  • The mode is 4 since it occurs twice and the other values occur only once.


Midrange

  • The midrange is the value halfway between the lowest (L) and highest (H) values in a set of data: midrange = (L + H)/2.

  • Example: Find the midrange of the data set $327, $465, $672, $150, $230.

    Midrange = ($150 + $672)/2 = $411


Example

  • The weights of eight Labrador retrievers rounded to the nearest pound are 85, 92, 88, 75, 94, 88, 84, and 101. Determine the

    • a) mean b) median

    • c) mode d) midrange

    • e) rank the measures of central tendency from lowest to highest.


Example: dog weights 85, 92, 88, 75, 94, 88, 84, 101

  • Mean = (85 + 92 + 88 + 75 + 94 + 88 + 84 + 101)/8 = 707/8 = 88.375

  • Median: rank the data

    • 75, 84, 85, 88, 88, 92, 94, 101

    • The two middle values are both 88, so the median is 88.


Example: dog weights 85, 92, 88, 75, 94, 88, 84, 101

  • Mode: the number that occurs most frequently. The mode is 88.

  • Midrange = (L + H)/2

    = (75 + 101)/2 = 88

  • Rank the measures from lowest to highest

    median, mode, and midrange (each 88), then the mean (88.375)
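The four measures of central tendency for the dog weights can be verified with Python's standard `statistics` module. The `midrange` helper is not part of that module; it is a small function written here for illustration.

```python
import statistics

weights = [85, 92, 88, 75, 94, 88, 84, 101]


def midrange(data):
    """Value halfway between the lowest (L) and highest (H) values."""
    return (min(data) + max(data)) / 2


measures = {
    "mean": statistics.mean(weights),
    "median": statistics.median(weights),
    "mode": statistics.mode(weights),
    "midrange": midrange(weights),
}

# Rank the measures from lowest to highest.
for name, value in sorted(measures.items(), key=lambda item: item[1]):
    print(name, value)
```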


Measures of Position

  • Measures of position are often used to make comparisons.

  • Two measures of position are percentiles and quartiles.


To Find the Quartiles of a Set of Data

  • Order the data from smallest to largest.

  • Find the median, or 2nd quartile, of the set of data. If there are an odd number of pieces of data, the median is the middle value. If there are an even number of pieces of data, the median will be halfway between the two middle pieces of data.


To Find the Quartiles of a Set of Data continued

  • The first quartile, Q1, is the median of the lower half of the data; that is, Q1 is the median of the data less than Q2.

  • The third quartile, Q3, is the median of the upper half of the data; that is, Q3 is the median of the data greater than Q2.


Example: Quartiles

  • The weekly grocery bills for 23 families are as follows. Determine Q1, Q2, and Q3.

    170 210 270 270 280

    330 80 170 240 270

    225 225 215 310 50

    75 160 130 74 81

    95 172 190


Example: Quartiles continued

  • Order the data:

    50 74 75 80 81 95 130

    160 170 170 172 190 210 215

    225 225 240 270 270 270 280

    310 330

    Q2 is the median of the entire data set which is 190.

    Q1 is the median of the numbers from 50 to 172 which is 95.

    Q3 is the median of the numbers from 210 to 330 which is 270.
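The median-of-halves procedure can be sketched in Python as below. The function name is hypothetical. One assumption is made explicit: the halves are split by position, so with an odd number of values the middle value is left out of both halves. This is not `statistics.quantiles`, which uses a different interpolation method and can return different values.

```python
import statistics

bills = [170, 210, 270, 270, 280, 330, 80, 170, 240, 270, 225, 225,
         215, 310, 50, 75, 160, 130, 74, 81, 95, 172, 190]


def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(data)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # With an odd count, skip the middle value itself (it is Q2).
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return (statistics.median(lower),
            statistics.median(ordered),
            statistics.median(upper))


q1, q2, q3 = quartiles(bills)
print(q1, q2, q3)
```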


13.6

Measures of Dispersion


Measures of Dispersion

  • Measures of dispersion are used to indicate the spread of the data.

  • The range is the difference between the highest and lowest values; it indicates the total spread of the data.


Example: Range

  • Nine different employees were selected and the amount of their salary was recorded. Find the range of the salaries.

    $24,000 $32,000 $26,500

    $56,000 $48,000 $27,000

    $28,500 $34,500 $56,750

  • Range = $56,750 - $24,000 = $32,750


Standard Deviation

  • The standard deviation measures how much the data differ from the mean.


To Find the Standard Deviation of a Set of Data

  • 1. Find the mean of the set of data.

  • 2. Make a chart having three columns:

    • Data, Data - Mean, (Data - Mean)^2

  • 3. List the data vertically under the column marked Data.

  • 4. Subtract the mean from each piece of data and place the difference in the Data - Mean column.


To Find the Standard Deviation of a Set of Data continued

  • 5. Square the values obtained in the Data - Mean column and record these values in the (Data - Mean)^2 column.

  • 6. Determine the sum of the values in the (Data - Mean)^2 column.

  • 7. Divide the sum obtained in step 6 by n - 1, where n is the number of pieces of data.

  • 8. Determine the square root of the number obtained in step 7. This number is the standard deviation of the set of data.


Example

  • Find the standard deviation of the following prices of selected washing machines:

    $280, $217, $665, $684, $939, $299

    Find the mean.


Example continued, mean = 514

    Data    Data - Mean    (Data - Mean)^2
    217     -297           (-297)^2 = 88,209
    280     -234           54,756
    299     -215           46,225
    665     151            22,801
    684     170            28,900
    939     425            180,625
    Sum     0              421,516


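Example continued, mean = 514

  • Divide the sum, 421,516, by n - 1 = 5 and take the square root:

    \[ s = \sqrt{\dfrac{421{,}516}{5}} = \sqrt{84{,}303.2} \approx 290.35 \]

  • The standard deviation is $290.35.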
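The eight-step procedure can be traced in Python for the washing machine prices. The function name is an assumption made for illustration. The result is compared with `statistics.stdev`, which also divides by n - 1.

```python
import math
import statistics

prices = [280, 217, 665, 684, 939, 299]


def sample_standard_deviation(data):
    """Follow the chart method: deviations, squares, sum, divide by n - 1, root."""
    n = len(data)
    mean = sum(data) / n                              # step 1
    deviations = [x - mean for x in data]             # steps 2-4: Data - Mean
    squared = [d ** 2 for d in deviations]            # step 5: (Data - Mean)^2
    total = sum(squared)                              # step 6
    return math.sqrt(total / (n - 1))                 # steps 7-8


s = sample_standard_deviation(prices)
print(round(s, 2))
print(math.isclose(s, statistics.stdev(prices)))
```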


13.7

The Normal Curve


Types of Distributions

  • Rectangular distribution

  • J-shaped distribution

  • Bimodal

  • Skewed to right

  • Skewed to left

  • Normal


Normal Distribution

  • In a normal distribution, the mean, median, and mode all have the same value.

  • Z-scores determine how far, in terms of standard deviations, a given score is from the mean of the distribution:

    \[ z = \dfrac{\text{value of the piece of data} - \text{mean}}{\text{standard deviation}} \]


Example: z-scores

  • A normal distribution has a mean of 50 and a standard deviation of 5. Find z-scores for the following values.

  • a) 55 b) 60 c) 43

  • a) \( z_{55} = \dfrac{55 - 50}{5} = 1 \)

    A score of 55 is one standard deviation above the mean.


Example: z-scores continued

  • b) \( z_{60} = \dfrac{60 - 50}{5} = 2 \)

    A score of 60 is 2 standard deviations above the mean.

  • c) \( z_{43} = \dfrac{43 - 50}{5} = -1.4 \)

    A score of 43 is 1.4 standard deviations below the mean.
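The three z-scores can be checked with a one-line helper. The function name is hypothetical. It subtracts the mean from the value and divides by the standard deviation.

```python
def z_score(value, mean, std_dev):
    """Number of standard deviations that `value` lies from the mean."""
    return (value - mean) / std_dev


for value in (55, 60, 43):
    z = z_score(value, mean=50, std_dev=5)
    side = "above" if z > 0 else "below"
    print(f"{value}: z = {z}, {abs(z)} standard deviations {side} the mean")
```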


To Find the Percent of Data Between any Two Values

1. Draw a diagram of the normal curve, indicating the area or percent to be determined.

2. Use the formula to convert the given values to z-scores. Indicate these z-scores on the diagram.

3. Look up the percent that corresponds to each z-score in Table 13.7.


To Find the Percent of Data Between any Two Values continued

4.

  • a) When finding the percent of data between two z-scores on the opposite side of the mean (when one z-score is positive and the other is negative), you find the sum of the individual percents.

  • b) When finding the percent of data between two z-scores on the same side of the mean (when both z-scores are positive or both are negative), subtract the smaller percent from the larger percent.


To Find the Percent of Data Between any Two Values continued

  • c) When finding the percent of data to the right of a positive z-score or to the left of a negative z-score, subtract the percent of data between 0 and z from 50%.

  • d) When finding the percent of data to the left of a positive z-score or to the right of a negative z-score, add the percent of data between 0 and z to 50%.


Example

  • Assume that the waiting times for customers at a popular restaurant before being seated for lunch are normally distributed with a mean of 12 minutes and a standard deviation of 3 min.

  • a) Find the percent of customers who wait for at least 12 minutes before being seated.

  • b) Find the percent of customers who wait between 9 and 18 minutes before being seated.

  • c) Find the percent of customers who wait at least 17 minutes before being seated.

  • d) Find the percent of customers who wait less than 8 minutes before being seated.


Solution

  • a) wait for at least 12 minutes

    Since 12 minutes is the mean, half, or 50%, of customers wait at least 12 min before being seated.

  • b) between 9 and 18 minutes

    \( z_{9} = \dfrac{9 - 12}{3} = -1 \) and \( z_{18} = \dfrac{18 - 12}{3} = 2 \)

    Use Table 13.7, page 801.

    34.1% + 47.7% = 81.8%


Solution continued

  • c) at least 17 min

    \( z_{17} = \dfrac{17 - 12}{3} \approx 1.67 \)

    Use Table 13.7, page 801.

    45.3% is between the mean and 1.67.

    50% - 45.3% = 4.7%

    Thus, 4.7% of customers wait at least 17 minutes.

  • d) less than 8 min

    \( z_{8} = \dfrac{8 - 12}{3} \approx -1.33 \)

    Use Table 13.7, page 801.

    40.8% is between the mean and -1.33.

    50% - 40.8% = 9.2%

    Thus, 9.2% of customers wait less than 8 minutes.
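The table lookups for the waiting-time example can be cross-checked with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). This swaps the printed table for the exact cumulative distribution function, so the results differ from the table-based answers in the last digit (for example, about 4.8% rather than 4.7% for part c) because the table rounds z to two decimal places. The variable names are illustrative.

```python
from statistics import NormalDist

# Waiting times: mean 12 minutes, standard deviation 3 minutes.
waits = NormalDist(mu=12, sigma=3)

at_least_12 = 1 - waits.cdf(12)                   # part a
between_9_and_18 = waits.cdf(18) - waits.cdf(9)   # part b
at_least_17 = 1 - waits.cdf(17)                   # part c
less_than_8 = waits.cdf(8)                        # part d

for label, p in [("at least 12 min", at_least_12),
                 ("between 9 and 18 min", between_9_and_18),
                 ("at least 17 min", at_least_17),
                 ("less than 8 min", less_than_8)]:
    print(f"{label}: {p:.1%}")
```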


13.8

Linear Correlation and Regression


Linear Correlation

  • Linear correlation is used to determine whether there is a relationship between two quantities and, if so, how strong the relationship is.

    • The linear correlation coefficient, r, is a unitless measure that describes the strength of the linear relationship between two variables.

      • If the value is positive, as one variable increases, the other increases.

      • If the value is negative, as one variable increases, the other decreases.

      • The variable, r, will always be a value between –1 and 1 inclusive.


Scatter Diagrams

  • A visual aid used with correlation is the scatter diagram, a plot of points (bivariate data).

    • The independent variable, x, generally is a quantity that can be controlled.

    • The dependent variable, y, is the other variable.

  • The value of r is a measure of how far a set of points varies from a straight line.

    • The greater the spread, the weaker the correlation and the closer the r value is to 0.




Linear Correlation Coefficient

  • The formula to calculate the correlation coefficient (r) is as follows:

    \[ r = \dfrac{n(\sum xy) - (\sum x)(\sum y)}{\sqrt{n(\sum x^2) - (\sum x)^2}\,\sqrt{n(\sum y^2) - (\sum y)^2}} \]


Example: Words Per Minute versus Mistakes

There are five applicants applying for a job as a medical transcriptionist. The following shows the results of the applicants when asked to type a chart. Determine the correlation coefficient between the words per minute typed and the number of mistakes.

    Applicant    Words per Minute    Mistakes
    Ellen        24                  8
    George       67                  11
    Phillip      53                  12
    Kendra       41                  10
    Nancy        34                  9


Solution

  • We will call the words typed per minute, x, and the mistakes, y.

  • List the values of x and y and calculate the necessary sums.

    WPM (x)     Mistakes (y)    x^2              y^2           xy
    24          8               576              64            192
    67          11              4489             121           737
    53          12              2809             144           636
    41          10              1681             100           410
    34          9               1156             81            306
    Σx = 219    Σy = 50         Σx^2 = 10,711    Σy^2 = 510    Σxy = 2,281


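Solution continued

  • The n in the formula represents the number of pieces of data. Here n = 5.

    \[ r = \dfrac{5(2281) - (219)(50)}{\sqrt{5(10{,}711) - (219)^2}\,\sqrt{5(510) - (50)^2}} = \dfrac{455}{\sqrt{5594}\,\sqrt{50}} \approx 0.86 \]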


Solution continued

  • Since 0.86 is fairly close to 1, there is a fairly strong positive correlation.

  • This result implies that the more words typed per minute, the more mistakes made.
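The correlation coefficient for the typing data can be recomputed from the five sums with a short Python sketch. The function name is hypothetical. It uses the standard computational form r = (nΣxy - ΣxΣy) / (sqrt(nΣx² - (Σx)²) · sqrt(nΣy² - (Σy)²)).

```python
import math

wpm = [24, 67, 53, 41, 34]        # x: words per minute
mistakes = [8, 11, 12, 10, 9]     # y: number of mistakes


def correlation_coefficient(xs, ys):
    """Linear correlation coefficient r computed from the five sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    return numerator / denominator


r = correlation_coefficient(wpm, mistakes)
print(round(r, 2))
```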


Linear Regression

  • Linear regression is the process of determining the linear relationship between two variables.

  • The line of best fit (line of regression or the least squares line) is the line such that the sum of the squares of the vertical distances from the line to the data points is a minimum.


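The Line of Best Fit

  • Equation: \( y = mx + b \), where

    \[ m = \dfrac{n(\sum xy) - (\sum x)(\sum y)}{n(\sum x^2) - (\sum x)^2}, \qquad b = \dfrac{\sum y - m(\sum x)}{n} \]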


Example

  • Use the data in the previous example to find the equation of the line that relates the number of words per minute and the number of mistakes made while typing a chart.

  • Graph the equation of the line of best fit on a scatter diagram that illustrates the set of bivariate points.


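Solution

From the previous results, we know that

    \[ m = \dfrac{5(2281) - (219)(50)}{5(10{,}711) - (219)^2} = \dfrac{455}{5594} \approx 0.081 \]

Now we find the y-intercept, b.

    \[ b = \dfrac{50 - 0.081(219)}{5} = \dfrac{32.261}{5} \approx 6.452 \]

Therefore the line of best fit is y = 0.081x + 6.452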


Solution continued

  • To graph y = 0.081x + 6.452, plot at least two points and draw the graph.

    x     y
    10    7.262
    20    8.072
    30    8.882
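The slope and y-intercept can be recomputed in Python using the usual least squares formulas, m = (nΣxy - ΣxΣy) / (nΣx² - (Σx)²) and b = (Σy - mΣx) / n. The function name is an assumption. Note that the slides appear to round m to 0.081 before computing b. Carrying the unrounded slope through gives a slightly smaller intercept (about 6.44), so the sketch shows both.

```python
wpm = [24, 67, 53, 41, 34]        # x: words per minute
mistakes = [8, 11, 12, 10, 9]     # y: number of mistakes


def line_of_best_fit(xs, ys):
    """Return (m, b) of the least squares line y = m*x + b."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    m = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    b = (sum_y - m * sum_x) / n
    return m, b


m, b = line_of_best_fit(wpm, mistakes)
print(round(m, 3), round(b, 3))   # intercept from the unrounded slope

# Reproduce the slide's arithmetic: round m first, then compute b.
m_rounded = round(m, 3)
b_from_rounded = (sum(mistakes) - m_rounded * sum(wpm)) / len(wpm)
for x in (10, 20, 30):
    print(x, round(m_rounded * x + round(b_from_rounded, 3), 3))
```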


