Chapter 9

Statistics



Frequency Distributions; Measures of Central Tendency

  • Frequency Distributions

    • Three types of frequency distributions:

      • Categorical – primarily for nominal, ordinal level data (FYI)

      • Grouped – range of data is large

      • Ungrouped – range of data is small, single data values for each class (FYI)



Frequency Distributions; Measures of Central Tendency

  • Grouped Frequency Distributions

    • Step 1: Order data from smallest to largest

    • Step 2: Determine the number of classes (i.e., class intervals) using Sturges' Rule, $k = 1 + 3.322 \log_{10} n$, where n is the number of observations (data values). Always round k up to the next whole number.

      • Class intervals are contiguous, nonoverlapping intervals selected in such a way that they are mutually exclusive and exhaustive. That is, each and every value in the set of data can be placed in one, and only one, of the intervals.



Frequency Distributions; Measures of Central Tendency

  • Grouped Frequency Distributions

    • Step 3: Determine width of class intervals

      • Width (W) = Range (R) / k, that is, $W = R / k$

        where Range = largest value - smallest value

        and k is the number of classes from Sturges' Rule
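
Steps 2 and 3 can be sketched in a few lines of Python. This is only an illustration: the function names `sturges_classes` and `class_width` and the sample data are assumptions made for the example, not part of the slides.

```python
import math

def sturges_classes(n: int) -> int:
    """Number of classes k = 1 + 3.322 * log10(n), always rounded up."""
    return math.ceil(1 + 3.322 * math.log10(n))

def class_width(values, k: int) -> float:
    """Width W = Range / k, where Range = largest value - smallest value."""
    return (max(values) - min(values)) / k

# Illustrative data set (15 observations)
data = [8, 6, 3, 0, 0, 5, 9, 2, 1, 3, 7, 10, 0, 3, 6]
k = sturges_classes(len(data))   # 1 + 3.322 * log10(15) is about 4.91, rounded up
w = class_width(data, k)         # (10 - 0) / k
print(k, w)
```

In practice the width is often rounded to a convenient number before the class limits are written down; the slides do not specify a rounding rule, so none is applied here.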



Frequency Distributions; Measures of Central Tendency

  • Grouped Frequency Distributions

    • Step 4: Assign observations to class intervals

      • The count in each class interval represents the frequency for that interval.

      • The smallest observation serves as the first lower class limit (LCL). Add the ‘width minus one’ to the LCL to get UCL (upper class limit)

        • NOTE: Technically, class limits (e.g., 0-5, 6-11, 12-17 and so on) are not adjacent: there is a gap of one unit between each upper limit and the next lower limit.

          Class boundaries close that gap by extending each limit half a unit (e.g., -0.5 to 5.5, 5.5 to 11.5, 11.5 to 17.5 and so on). Boundaries are written this way for convenience but are understood to mean all values up to but not including the upper boundary.



Frequency Distributions; Measures of Central Tendency

  • Grouped Frequency Distributions

    • Step 5: Calculate cumulative & relative frequencies

      • Cumulative Frequency – For each class interval, add the number of observations from the first interval through that interval, inclusive.

      • Relative Frequency – Divide the number of observations in each class interval by the total number of observations.

      • Cumulative Relative Frequency – Same calculation as cumulative frequency, but using the relative frequencies.

        • A Frequency Distribution Table has these columns:

          Class Int. (LCL - UCL)   Freq.   Cum. Freq.   Rel. Freq.   Cum. Rel. Freq.
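
As a rough sketch of Steps 4 and 5, the function below builds the rows of such a table for integer data. The name `frequency_table`, the sample data, and the choice of a whole-number width of 3 with 4 classes are assumptions made purely for illustration.

```python
def frequency_table(values, first_lcl, width, k):
    """Return rows of (LCL, UCL, freq, cum_freq, rel_freq, cum_rel_freq).

    Assumes integer data and an integer class width.
    """
    n = len(values)
    rows, cum = [], 0
    for i in range(k):
        lcl = first_lcl + i * width
        ucl = lcl + width - 1                      # 'width minus one' rule from Step 4
        freq = sum(lcl <= v <= ucl for v in values)
        cum += freq                                # running total through this class
        rows.append((lcl, ucl, freq, cum, freq / n, cum / n))
    return rows

data = [8, 6, 3, 0, 0, 5, 9, 2, 1, 3, 7, 10, 0, 3, 6]
rows = frequency_table(data, first_lcl=min(data), width=3, k=4)
for lcl, ucl, f, cf, rf, crf in rows:
    print(f"{lcl:>2}-{ucl:<2}  {f:>2}  {cf:>2}  {rf:.3f}  {crf:.3f}")
```

The last row's cumulative frequency should equal n and its cumulative relative frequency should equal 1, which is a quick check that every observation landed in exactly one class.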



Frequency Distributions; Measures of Central Tendency

  • Measures of Central Tendency – the value(s) the data tends to center around

    • Arithmetic mean (average)

    • Mode

    • Median



Frequency Distributions; Measures of Central Tendency

  • Measures of Central Tendency

    • Arithmetic mean (sample mean or sample average) --“x-bar”

      • Ungrouped data (individual data values such as 5, 6, 10, 14, etc.)

        $$\bar{x} = \frac{\sum x_i}{n} = \frac{x_1 + x_2 + x_3 + \dots + x_n}{n}$$

        • where $x_i$ is each data value (observation) in the data set

        • where $n$ is the number of observations in the data set



Frequency Distributions; Measures of Central Tendency

  • Calculate the sample mean for ungrouped data:

    • Step 1: add all values in a data set

    • Step 2: divide the total by the number of values summed.



Frequency Distributions; Measures of Central Tendency

  • Example

    • 7.0  6.2  7.7  8.0  6.4  6.2  7.2  5.4  6.4  6.5  7.2  5.4

    • n = 12 (this is ungrouped data)

    • $\bar{x} = \frac{7.0+6.2+7.7+8.0+6.4+6.2+7.2+5.4+6.4+6.5+7.2+5.4}{12}$

    • $= \frac{79.6}{12}$

    • $= 6.63$
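
The two-step calculation for this example can be checked with a short Python sketch; the variable names are chosen for the illustration only.

```python
values = [7.0, 6.2, 7.7, 8.0, 6.4, 6.2, 7.2, 5.4, 6.4, 6.5, 7.2, 5.4]

total = sum(values)            # Step 1: add all values
mean = total / len(values)     # Step 2: divide by the number of values

print(round(total, 1), round(mean, 2))
```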



Frequency Distributions; Measures of Central Tendency

  • Grouped data (assumes each value (observation) falling within a given class interval is equal to the value of the midpoint of that interval)

    $$\bar{x} = \frac{\sum f_i x_i}{n}$$

    • where $x_i$ represents each class interval midpoint (class mark)* and $f_i$ is the corresponding class frequency

      *An easy way to determine the class mark is to add the upper class limit (boundary) to the lower class limit (boundary), then divide by 2.



Frequency Distributions; Measures of Central Tendency

  • Calculate the sample mean for grouped data:

    • Step 1: multiply each class mark by its corresponding frequency

    • Step 2: add the resulting products

    • Step 3: divide the total by the number of observations



Frequency Distributions; Measures of Central Tendency

  • Example

  • Class Limits   Frequency (fi)        Class Mark (xi)   xi fi
    90 - 98        6 (see note below)    94                564
    99 - 107       22                    103               2266
    108 - 116      43                    112               4816
    117 - 125      28                    121               3388
    126 - 134      9                     130               1170
    Total          108                                     12204

  • $\bar{x} = \frac{12204}{108} = 113$

    Note: Where did the number 6 come from? There are 6 data values (observations) in the data set that fall in the range 90-98 (inclusive).
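
A minimal sketch of the grouped-mean steps, using the class limits and frequencies from the table. The `classes` layout and the name `grouped_mean` are assumptions for the example.

```python
# ((lower class limit, upper class limit), frequency)
classes = [((90, 98), 6), ((99, 107), 22), ((108, 116), 43),
           ((117, 125), 28), ((126, 134), 9)]

def grouped_mean(classes):
    """Mean of grouped data: sum(f_i * x_i) / n, with x_i the class mark."""
    n = sum(f for _, f in classes)
    total = sum(((lo + hi) / 2) * f for (lo, hi), f in classes)  # class mark times frequency
    return total / n

print(grouped_mean(classes))
```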



Frequency Distributions; Measures of Central Tendency

  • Measures of Central Tendency

    • Mode – value that occurs most frequently

      • Ungrouped data

        • Step 1: identify the data value that occurs most frequently

          • Bimodal – two values tie for the highest frequency

          • No mode – all values are different (not the same as mode = 0)

      • Grouped data

        • Step 1: specify the modal class (i.e., the class interval containing the largest number of observations)



Frequency Distributions; Measures of Central Tendency

  • For ungrouped data (mode)

    • 7.0  6.2  7.7  8.0  6.4  6.2  7.2  5.4  6.4  6.5  7.2  5.4

  • There are four numbers that appear two times each:

  • 5.4  6.2  6.4  7.2. Therefore there are four modes.

  • The data set is quad-modal
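
One way to find every mode of an ungrouped data set is to count occurrences and keep all values tied for the highest count. The helper name `modes` is hypothetical, and returning an empty list for the "no mode" case is a design choice of this sketch.

```python
from collections import Counter

def modes(values):
    """Return every value tied for the highest frequency, sorted.

    Returns [] when all values are different (the 'no mode' case).
    """
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []
    return sorted(v for v, c in counts.items() if c == top)

data = [7.0, 6.2, 7.7, 8.0, 6.4, 6.2, 7.2, 5.4, 6.4, 6.5, 7.2, 5.4]
print(modes(data))
```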



Frequency Distributions; Measures of Central Tendency

  • For grouped data (modal class)

    • In the grouped-data example, the modal class is 108-116, the 3rd class (the class with the largest number of data values, 43)



Frequency Distributions; Measures of Central Tendency

  • Measures of Central Tendency

    • Median – The value above which half the values in a data set lie and below which the other half lie. (The middle value)

      • Ungrouped Data

        • Step 1: arrange the values in order of magnitude (smallest to largest)

        • Step 2: locate the middle value



Frequency Distributions; Measures of Central Tendency

  • For ungrouped data (median)

  • 5.4  5.4  6.2  6.2  6.4  6.4  6.5  7.0  7.2  7.2  7.7  8.0

    • There is an even number of values, so we must average the middle two values

  • $\frac{6.4 + 6.5}{2} = 6.45$
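
The same median can be reproduced with a small Python sketch that follows the two steps (sort, then locate the middle). The function name `median` is chosen for the example; Python's standard `statistics.median` behaves the same way.

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    ordered = sorted(values)          # Step 1: arrange from smallest to largest
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]           # odd n: the single middle value
    return (ordered[mid - 1] + ordered[mid]) / 2   # even n: average the middle two

data = [7.0, 6.2, 7.7, 8.0, 6.4, 6.2, 7.2, 5.4, 6.4, 6.5, 7.2, 5.4]
print(round(median(data), 2))
```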



Measures of Variation (Dispersion)

  • Range (R) (for ungrouped data only)

    • Ungrouped data

      • Step 1: Take the difference between the largest and smallest values in a data set. For example, a data set such as 5, 6, 10, 14 has a range of 9 because 14 (the largest value) minus 5 (the smallest value) is 9.



Measures of Variation (Dispersion)

  • Deviations from the Mean

    • Differences found by subtracting the mean from each number in a sample

      • Given 3, 5, 2, 6

        • The mean ($\bar{x}$) is 4

        • The deviations from the mean would be -1, 1, -2, 2
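
A short sketch of the same calculation; it also shows that deviations from the mean always sum to zero, which is why they are squared before being averaged into a variance.

```python
sample = [3, 5, 2, 6]
mean = sum(sample) / len(sample)
deviations = [x - mean for x in sample]   # subtract the mean from each value

print(mean, deviations, sum(deviations))
```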



Measures of Variation (Dispersion)

  • Variance ($s^2$) - an average of the squares of the deviations of the individual values from their mean.

    • Ungrouped data

      $$s^2 = \frac{\sum (x_i - \bar{x})^2}{n-1}$$



Measures of Variation (Dispersion)

  • Standard deviation (s)

    • Step 1: Calculate the sample standard deviation for grouped or ungrouped data by:

      • taking the square root of the variance



Measures of Variation (Dispersion)

  • Example

  • 8  6  3  0  0  5  9  2  1  3  7  10  0  3  6   (this is ungrouped data)

  • $\bar{x} = 4.2$, n = 15

  • (a) Range (R) = 10 – 0 = 10

  • (b) variance: $s^2 = \frac{(8-4.2)^2 + (6-4.2)^2 + (3-4.2)^2 + (0-4.2)^2 + (0-4.2)^2 + (5-4.2)^2 + (9-4.2)^2 + (2-4.2)^2 + (1-4.2)^2 + (3-4.2)^2 + (7-4.2)^2 + (10-4.2)^2 + (0-4.2)^2 + (3-4.2)^2 + (6-4.2)^2}{15-1}$

    $= \frac{158.40}{14}$

    $= 11.31$

  • (c) standard deviation (s) = $\sqrt{11.31}$ = 3.36
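
The three results in this example can be verified with a few lines of Python. The variable names are chosen for the illustration; the divisor n - 1 follows the sample variance formula used in these slides.

```python
import math

data = [8, 6, 3, 0, 0, 5, 9, 2, 1, 3, 7, 10, 0, 3, 6]
n = len(data)

data_range = max(data) - min(data)                 # (a) largest minus smallest
mean = sum(data) / n
sum_sq_dev = sum((x - mean) ** 2 for x in data)    # sum of squared deviations
variance = sum_sq_dev / (n - 1)                    # (b) sample variance
std_dev = math.sqrt(variance)                      # (c) square root of the variance

print(data_range, round(mean, 1), round(sum_sq_dev, 2),
      round(variance, 2), round(std_dev, 2))
```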



Measures of Variation (Dispersion)

  • Grouped data

    $$s^2 = \frac{n \sum x_i^2 f_i - \left(\sum x_i f_i\right)^2}{n(n-1)}$$

    • where $x_i$ represents each class boundary (or limit) midpoint (class mark)*

    • where $f_i$ represents each class frequency

      *An easy way to determine the class mark is to add the upper class limit (boundary) to the lower class limit (boundary), then divide by 2.



Measures of Variation (Dispersion)

  • Calculate the sample variance for grouped data:

    • Step 1: multiply each squared class mark by its corresponding frequency

    • Step 2: add the resulting products

    • Step 3: multiply the sum by n; call the result [A]

    • Step 4: multiply each class mark by its corresponding frequency

    • Step 5: add the resulting products

    • Step 6: square the sum; call the result [B]

    • Step 7: perform the subtraction [C] = [A] – [B]

    • Step 8: divide [C] by n(n-1)



Measures of Variation (Dispersion)

  • Example

  • Class limits   freq (fi)   xi    xi fi          xi² fi
    90 - 98        6           94    564 (94 × 6)   53,016 (94² × 6)
    99 - 107       22          103   2266           233,398
    108 - 116      43          112   4816           539,392
    117 - 125      28          121   3388           409,948
    126 - 134      9           130   1170           152,100
    Total          108               12204          1,387,854



Measures of Variation (Dispersion)

  • Refer to the formula for variance of grouped data below and see if you can fill in the formula using values from the table on the previous slide.

    $$s^2 = \frac{n \sum x_i^2 f_i - \left(\sum x_i f_i\right)^2}{n(n-1)}$$



Measures of Variation (Dispersion)

  • $s^2 = \frac{108(1{,}387{,}854) - (12{,}204)^2}{108(107)}$

  • $= \frac{149{,}888{,}232 - 148{,}937{,}616}{11{,}556}$

  • $= \frac{950{,}616}{11{,}556}$

  • $= 82.26$

  • Therefore $s = \sqrt{82.26} = 9.07$
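
The eight steps for grouped variance translate directly into code. The sketch below uses the class marks and frequencies from the example table; the variable names `a`, `b`, and `c` mirror the [A], [B], and [C] labels in the steps and are otherwise arbitrary.

```python
import math

# (class mark x_i, frequency f_i) from the example table
classes = [(94, 6), (103, 22), (112, 43), (121, 28), (130, 9)]

n = sum(f for _, f in classes)
sum_x2f = sum(x * x * f for x, f in classes)   # Steps 1-2: sum of x_i^2 * f_i
a = n * sum_x2f                                # Step 3: [A]
sum_xf = sum(x * f for x, f in classes)        # Steps 4-5: sum of x_i * f_i
b = sum_xf ** 2                                # Step 6: [B]
c = a - b                                      # Step 7: [C] = [A] - [B]
variance = c / (n * (n - 1))                   # Step 8
std_dev = math.sqrt(variance)

print(sum_x2f, c, round(variance, 2), round(std_dev, 2))
```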



The Normal Distribution

  • The Normal Distribution

    • Also known as the “bell-shaped” curve

      • Some statisticians say it is the most important distribution in statistics

      • Most popular distribution in statistics



The Normal Distribution

  • The normal density function is given by

    $$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

    • where $\pi \approx 3.142$ and $e \approx 2.718$



The Normal Distribution

  • Properties of the Normal Distribution

    - symmetrical about mean;

    - mean = median = mode

    - area under the curve = 1

    - each different pair of μ and σ specifies a different normal distribution, thus the normal distribution is really a family of distributions

    - a very important member of the family is the standard normal distribution



The Normal Distribution

  • The Standard Normal Distribution

    • has mean (μ) = 0

    • has standard deviation (σ) = 1

    • the normal density function reduces to

      $$f(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}$$



The Normal Distribution

  • The probability that z lies between any two points on the z-axis is determined by the area bounded by perpendiculars erected at each of the points, the curve, and the horizontal axis.

P(a <z< b)



The Normal Distribution

  • Generally we find the area under the curve for a continuous distribution via calculus, by integrating the density function between a and b:

    $$P(a < z < b) = \int_a^b \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}\, dz$$



The Normal Distribution

  • However, we don't have to integrate because we have a table that has calculated this area

    • See TABLE 1 of Appendix A-2



The Normal Distribution

  • Exercises 6-3 #7 p. 282

    • Find the area under the normal distribution curve between z = 0 and z = 0.56

    • So, we want P (0 < z < 0.56)

    • From the standard normal table we find that

      P (0 < z < 0.56) = 0.2123

where a = 0 and b = 0.56
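
As a cross-check on the table lookup, the same area can be computed in software. This sketch uses `NormalDist` from Python's standard `statistics` module (available in Python 3.8 and later) rather than the printed table; the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()                 # standard normal: mean 0, standard deviation 1
area = Z.cdf(0.56) - Z.cdf(0)    # P(0 < z < 0.56) as a difference of cumulative areas

print(round(area, 4))
```

Note that `cdf` returns the area to the left of a point, whereas the table used in these slides gives the area from 0 out to z, so a difference of two `cdf` values is needed.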



The Normal Distribution

  • Exercises 6-3 #16 p. 283

    • Find the area under the normal distribution curve between z = -0.87 and z = -0.21

    • So we want P(-0.87 < z < -0.21)

[Figure: standard normal curve with the area between a and b shaded, both to the left of 0, where a = -0.87 and b = -0.21]



The Normal Distribution

  • Exercises 6-3 #16 p. 283 con’t

  • The table gives an area of 0.3078 at z = 0.87 (the area is the same for negative or positive z, since the distribution is symmetrical). This area covers values of z from 0 out to -0.87. Since we don't want that entire area, we subtract the area from 0 out to -0.21, which is 0.0832 (the table value at z = 0.21).

    • So P(-0.87 < z < -0.21) = 0.3078 – 0.0832 = 0.2246
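
The subtraction can be checked with Python's standard `statistics.NormalDist` (Python 3.8+). This is only a cross-check, not the table method itself; because the software does not round intermediate table values, the result may differ from 0.2246 in the fourth decimal place.

```python
from statistics import NormalDist

Z = NormalDist()                       # standard normal
area = Z.cdf(-0.21) - Z.cdf(-0.87)     # P(-0.87 < z < -0.21)

# By symmetry, the mirror-image interval on the positive side has the same area
mirror = Z.cdf(0.87) - Z.cdf(0.21)

print(round(area, 3), round(mirror, 3))
```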



The Normal Distribution

  • Exercises 6-3 #25 p. 283

    • Find the area under the normal distribution curve to the right of z = 1.92 and to the left of

      z = -0.44

    • So we want P(z > 1.92) + P(z < -0.44) = 0.3574

      where a = -0.44 and b = 1.92



The Normal Distribution

  • Exercises 6-3 #25 p. 283 Con’t

  • Since the area at z = .44 is 0.1700 which is the area under the curve from 0 out to 0.44, the remaining area of interest has to be 0.5 – 0.1700 = 0.3300.

    AND

    Since the area at z = 1.92 is .4726 which is the area under the curve from 0 out to 1.92, the remaining area of interest has to be 0.5 – 0.4726 = 0.0274. So the combined areas of interest are

    0.3300 + 0.0274 = 0.3574
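
A quick software check of the two tail areas, again using `NormalDist` from Python's standard library (Python 3.8+) as a stand-in for the table; the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()                    # standard normal
right_tail = 1 - Z.cdf(1.92)        # P(z > 1.92)
left_tail = Z.cdf(-0.44)            # P(z < -0.44)
total = right_tail + left_tail      # the two tails do not overlap, so the areas add

print(round(left_tail, 4), round(right_tail, 4), round(total, 4))
```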



The Normal Distribution

  • Exercises 6-3 #45 z = ?

    • Given that the shaded area (the area to the right of z) is 0.8962, what would be the value of z?

      • z has to be equal to -1.26. Recall that one-half of the area under the curve is 0.5, so the area from 0 out to z is 0.8962 – 0.5000 = 0.3962. If we look in the body of the standard normal table for an area of 0.3962, we find that value at the intersection of the 13th row and 7th column, which corresponds to a z value of 1.26. Since z is located to the left of 0, it has to be negative; hence z = -1.26.
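
Working backward from an area to a z value is an inverse lookup. The sketch below does it with `NormalDist.inv_cdf` from Python's standard library (Python 3.8+), on the assumption stated in the exercise that the area of 0.8962 lies to the right of z.

```python
from statistics import NormalDist

Z = NormalDist()                   # standard normal
area_right = 0.8962                # area to the right of z
area_left = 1 - area_right         # inv_cdf expects the area to the LEFT of z
z = Z.inv_cdf(area_left)

print(round(z, 2))
```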



The Normal Distribution

  • Section 6-4

  • Applications of the Normal Distribution

    • To solve problems for a normally distributed variable with $\mu \neq 0$ or $\sigma \neq 1$, we MUST transform the variable to a standard normal variable; that is,

      P(x1 < X < x2) becomes P(z1 < Z < z2), which allows us to use the standard normal table.

    • Using $z = \frac{\text{value} - \text{mean}}{\text{standard deviation}} = \frac{x - \mu}{\sigma}$



The Normal Distribution

  • Example

    • A survey found that people keep their television sets an average of 4.8 years. The standard deviation is 0.89 year. If a person decides to buy a new TV set, find the probability that he or she has owned the set for the following amount of time. Assume the variable is normally distributed.

      • Less than 2.5 years

      • Between 3 and 4 years

      • More than 4.2 years

    • $\mu = 4.8$, $\sigma = 0.89$

  • (a) P(x < 2.5) becomes P(z < -2.58) because z = (2.5 – 4.8)/0.89 = -2.58

    The table area at z = 2.58 is 0.4951; therefore P(z < -2.58) = 0.5 – 0.4951 = 0.0049



The Normal Distribution

  • (b) P(3 < X < 4) becomes P(-2.02 < z < -0.9) because z = (3-4.8)/ .89 = -2.02 and z=(4-4.8)/.89 = -0.90

  • from the standard normal table at a z of 2.02 we get .4783 and at a z of .9 we get .3159 so the P(-2.02 < z < -0.9) = .4783 - .3159 = .1624



The Normal Distribution

  • (c) P (x > 4.2) becomes P(z > -0.67) because z = (4.2-4.8)/.89 = -0.67

  • from the standard normal table at z of .67 we get .2486 so the P(z > -0.67) = 0.2486 + 0.5 = 0.7486
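
All three parts of the TV example can be cross-checked in software. This sketch uses `NormalDist` from Python's standard library (Python 3.8+) with the stated mean and standard deviation; the helper `z_score` is a hypothetical name for the transformation formula. Because the software does not round z to two decimals before the lookup, its answers differ slightly from the table-based ones (by roughly 0.001 or less).

```python
from statistics import NormalDist

mu, sigma = 4.8, 0.89
tv = NormalDist(mu=mu, sigma=sigma)

def z_score(x):
    """Transform x to a standard normal value: z = (x - mu) / sigma."""
    return (x - mu) / sigma

p_a = tv.cdf(2.5)                 # (a) less than 2.5 years
p_b = tv.cdf(4) - tv.cdf(3)       # (b) between 3 and 4 years
p_c = 1 - tv.cdf(4.2)             # (c) more than 4.2 years

print(round(z_score(2.5), 2), round(z_score(3), 2),
      round(z_score(4), 2), round(z_score(4.2), 2))
print(round(p_a, 3), round(p_b, 3), round(p_c, 3))
```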



The Normal Distribution

  • Review Exercises #9

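    • Area (percentage) = 0.5

    • $\mu = 100$, $\sigma = 15$

  • The middle area of 0.5 lies between z = -0.67 and z = 0.67 (a table area of about 0.25 on each side of the mean). We can find the x values that correspond to these z values by using the same transformation equation.

  • Lower value:  -0.67 = (x – 100)/15, so 15(-0.67) = x – 100 and x = 89.95

    Upper value:   0.67 = (x – 100)/15, so 15(0.67) = x – 100 and x = 110.05

    Therefore the lowest and highest scores bound the range 89.95 < x < 110.05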
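
The same bounds can be approximated without the table by asking for the 25th and 75th percentiles directly. The sketch uses `NormalDist.inv_cdf` from Python's standard library (Python 3.8+); since it uses an unrounded z of about 0.6745 rather than 0.67, the bounds come out slightly wider than 89.95 and 110.05.

```python
from statistics import NormalDist

scores = NormalDist(mu=100, sigma=15)

low = scores.inv_cdf(0.25)    # 25% of the area lies below this score
high = scores.inv_cdf(0.75)   # 75% of the area lies below this score

print(round(low, 1), round(high, 1))
```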

