
Section 1.3



Presentation Transcript


  1. Section 1.3 The Normal Distribution

  2. Strategy for Exploring Data • Always plot your data • make a graph, usually a histogram or a stemplot • Look for the overall pattern and for major deviations, such as outliers • Remember: Shape, Center, and Spread • Calculate a numerical summary to briefly describe center and spread • Sometimes the overall pattern of a large number of observations is so regular that we can describe it by a smooth curve

  3. Density Curves • A density curve is a curve that: • 1) always sits on or above the horizontal axis • 2) has area exactly 1 underneath it • A density curve describes the overall pattern of a distribution and is a mathematical model for the distribution • a mathematical model is an idealized description • A mathematical model gives a compact picture of the overall pattern of the data but ignores minor irregularities as well as any outliers

  4. Density Curves • It is easier to work with a smooth curve than with a histogram • Remember, histograms depend on our choice of classes (i.e., bin width) • The area under the curve and above any range of values is the proportion of all observations that fall in that range • Density curves, like distributions, come in many shapes • Symmetric, skewed (left or right), etc.
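The two defining properties of a density curve, and the reading of area as proportion, can be checked numerically. The sketch below is only an illustration: the right-skewed curve f(x) = e^(-x) for x ≥ 0 and the helper name `area_under` are assumptions chosen for the example, not something from the slides.

```python
import math

def area_under(density, a, b, steps=100_000):
    """Approximate the area under `density` between a and b (trapezoid rule)."""
    width = (b - a) / steps
    total = 0.5 * (density(a) + density(b))
    for i in range(1, steps):
        total += density(a + i * width)
    return total * width

def skewed_density(x):
    # Hypothetical right-skewed density curve: never below the axis, total area 1
    return math.exp(-x) if x >= 0 else 0.0

# Total area under the curve (the tail beyond 50 is negligible)
total_area = area_under(skewed_density, 0.0, 50.0)

# Proportion of observations between 0 and 1 = area above that range
share_0_to_1 = area_under(skewed_density, 0.0, 1.0)

print(f"total area: {total_area:.4f}")
print(f"proportion between 0 and 1: {share_0_to_1:.4f}")
```

Unlike a histogram, the result does not depend on a choice of classes; the `steps` argument only controls how accurately the area is approximated.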

  5. Examples…

  6. Examples…

  7. Examples…

  8. The mean and median of a Density Curve • The median of a density curve is the equal-areas point • The point that divides the area under the curve in half • The mean of a density curve is the balance point • The point at which the curve would balance if made of solid material
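As a rough illustration of the equal-areas point and the balance point, the sketch below locates both on an assumed right-skewed density, f(x) = e^(-x) for x ≥ 0. The curve and the helper `integrate` are hypothetical choices for this example only.

```python
import math

def density(x):
    # Hypothetical right-skewed density curve, used only for illustration
    return math.exp(-x) if x >= 0 else 0.0

def integrate(func, a, b, steps=20_000):
    """Trapezoid-rule approximation of the area under func between a and b."""
    width = (b - a) / steps
    total = 0.5 * (func(a) + func(b))
    for i in range(1, steps):
        total += func(a + i * width)
    return total * width

# Median: the point with half of the area to its left, found by bisection
lo, hi = 0.0, 50.0
for _ in range(40):
    mid = (lo + hi) / 2
    if integrate(density, 0.0, mid) < 0.5:
        lo = mid
    else:
        hi = mid
median = (lo + hi) / 2

# Mean: the balance point, the area-weighted average of x
mean = integrate(lambda x: x * density(x), 0.0, 50.0)

print(f"median = {median:.3f}, mean = {mean:.3f}")
```

For this right-skewed curve the mean lands to the right of the median, pulled toward the long tail; for a symmetric curve the two would coincide.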

  9. Symmetric Density Curve

  10. Right Skewed Density Curve

  11. Left Skewed Density Curve

  12. Word of Caution • Because a density curve is an idealized description of the distribution of data, we need to distinguish between the mean and standard deviation of the density curve and the mean and standard deviation computed from the actual (sample) observations • Mean: x̄ for the actual observations, µ for the density curve • Standard deviation: s for the actual observations, σ for the density curve

  13. Normal Distributions • Normal curves are density curves that: • are symmetric • are single-peaked • are bell-shaped • describe Normal distributions • A Normal distribution is described by giving its mean µ and its standard deviation σ • The mean is equal to the median (which property makes this true?)

  14. Normal Distributions • Changing µ without changing σ moves the Normal curve along the horizontal axis without changing its spread (µ controls location) • The standard deviation σ controls the spread of a Normal curve • the larger σ is, the larger the spread of the curve • We abbreviate the Normal distribution with mean µ and standard deviation σ as N(µ, σ)

  15. Normal distributions • Normal distributions are good descriptions for some distributions of real data • examples: manufacturing fill rates, crop yields, etc. • Normal distributions are good approximations to the results of many kinds of chance outcomes • examples: tossing a coin 1,000 times • Many statistical inference procedures based on Normal distributions work well for other roughly symmetric distributions

  16. The 68 - 95 - 99.7 Rule • In the Normal distribution with mean µ and standard deviation σ: • 68% of the observations fall within 1σ of µ • 95% of the observations fall within 2σ of µ • 99.7% of the observations fall within 3σ of µ • By remembering these numbers, you can think about Normal distributions without constantly making detailed calculations

  17. The 68 – 95 – 99.7 Rule

  18. Example • The distribution of weights of 9 oz bags of potato chips is approximately Normal with mean µ = 9.12 oz and standard deviation σ = 0.15 oz • N(9.12, 0.15) • range for 68% of data: • 9.12 − 0.15 = 8.97 and 9.12 + 0.15 = 9.27 → (8.97, 9.27) • range for 95% of data: • 8.97 − 0.15 = 8.82 and 9.27 + 0.15 = 9.42 → (8.82, 9.42) • range for 99.7% of data: • 8.82 − 0.15 = 8.67 and 9.42 + 0.15 = 9.57 → (8.67, 9.57)
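The ranges in the chip example can be reproduced, and the 68 - 95 - 99.7 percentages checked against the exact Normal areas, with a short sketch. It assumes Python 3.8 or later for `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

# Chip weights from the example: N(9.12, 0.15)
chips = NormalDist(mu=9.12, sigma=0.15)

ranges = {}
shares = {}
for k in (1, 2, 3):
    low = chips.mean - k * chips.stdev
    high = chips.mean + k * chips.stdev
    ranges[k] = (round(low, 2), round(high, 2))
    # Exact area between low and high under the Normal curve
    shares[k] = chips.cdf(high) - chips.cdf(low)
    print(f"within {k} standard deviation(s): {ranges[k]} holds {shares[k]:.1%}")
```

The exact areas differ slightly from 68, 95, and 99.7 percent, which is why the rule is best treated as a convenient approximation.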

  19. Graph for Example

  20. Example Cont. • About what percent of bags weigh more than 9.12 ounces? We expect that 50% of the bags weigh more than 9.12 oz • About what percent of bags weigh more than 9.42 ounces? We expect that 2.5% of the bags weigh more than 9.42 oz • About what percent of bags weigh less than 8.67 ounces? We expect that 0.15% of the bags weigh less than 8.67 oz

  21. The z-score • If x is an observation from a distribution that has mean µ and standard deviation σ, the standardized value of x is often called a z-score • equation: z = (x − µ) / σ

  22. The z-score • A z-score tells us how many standard deviations the original observation falls away from the mean, and in what direction • observations larger than the mean are positive when standardized • observations smaller than the mean are negative when standardized

  23. Chip Example continued… • Standardized weight: z = (x − 9.12) / 0.15 • z-score for bag weighing 9.3 ounces: z = (9.3 − 9.12) / 0.15 = 1.2 • z-score for bag weighing 8.7 ounces: z = (8.7 − 9.12) / 0.15 = −2.8
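Standardizing is a one-line calculation: subtract the mean, then divide by the standard deviation. A minimal sketch for the two chip bags, where the function name `z_score` is an illustrative choice:

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean, with sign."""
    return (x - mu) / sigma

# Chip weights follow N(9.12, 0.15)
z_heavy = z_score(9.3, 9.12, 0.15)  # bag heavier than the mean: positive z
z_light = z_score(8.7, 9.12, 0.15)  # bag lighter than the mean: negative z

print(f"9.3 oz bag: z = {z_heavy:.2f}")
print(f"8.7 oz bag: z = {z_light:.2f}")
```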

  24. Standard Normal Distribution • Standardizing a variable that has a Normal distribution produces a new variable that has the standard Normal distribution • The standard Normal distribution is the Normal distribution N(0, 1) with mean 0 and standard deviation 1 • if a variable x has a Normal distribution N(µ, σ) with mean µ and standard deviation σ, then the variable z = (x − µ) / σ has the standard Normal distribution

  25. Standard Normal Dist. N(0,1)

  26. Normal Distribution Calculations • An area under a density curve is a proportion of the observations in a distribution • any question about what proportion of observations lies in some range of values can be answered by finding an area under the curve • because all Normal distributions are the same when we standardize, we can find areas under any Normal curve from a single table (Table A)

  27. The Standard Normal Table • Table A is a table of areas under the standard Normal curve • the table entry for each value z is the area under the curve to the left of z • example: the table entry for z = 2.56 is 0.9948, so the area to the right of z = 2.56 is 1 − 0.9948 = 0.0052

  28. Finding Normal Proportions • Step 1: state the problem in terms of the observed variable x • Step 2: standardize x to restate the problem in terms of a standard Normal variable z • Remember to draw a picture • Step 3: find the required area under the standard Normal curve using Table A and the fact that the total area under the curve is 1

  29. Chip Example continued… • What proportion of all 9-ounce bags of potato chips weighs less than 9.3 ounces? • N(9.12, 0.15) • standardized weight corresponding to 9.3 ounces: z = (9.3 − 9.12) / 0.15 = 1.20 • See Graph on page 62, figure 1.23(a) • area from Table A: 0.8849 (about 88.49%)

  30. Chip Example continued… • What proportion of all 9-ounce bags of potato chips weighs less than 8.7 ounces? • N(9.12, 0.15) • standardized weight corresponding to 8.7 ounces: z = (8.7 − 9.12) / 0.15 = −2.80 • area from Table A: 0.0026 (about 0.26%)
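Table A lookups can be cross-checked in software. The sketch below computes the area to the left of z from the error function in Python's standard `math` module; the helper names `phi` and `proportion_below` are illustrative, and the code stands in for the table rather than reproducing it.

```python
import math

def phi(z):
    """Area under the standard Normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def proportion_below(x, mu, sigma):
    """Standardize x, then look up the area to the left of its z-score."""
    z = (x - mu) / sigma
    return phi(z)

# Chip weights follow N(9.12, 0.15)
below_9_3 = proportion_below(9.3, 9.12, 0.15)
below_8_7 = proportion_below(8.7, 9.12, 0.15)

print(f"proportion below 9.3 oz: {below_9_3:.4f}")
print(f"proportion below 8.7 oz: {below_8_7:.4f}")
```

Both values agree with the Table A entries quoted in the chip example to four decimal places.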

  31. Example 1.20 • The annual rate of return on stock indexes is approximately Normal • Since 1945, the S&P’s 500-stock index has had a mean yearly return of about 12%, with a standard deviation of 16.5% • The market is down for the year if the return on the index is less than zero • In what proportion of years is the market down?

  32. Step 1: • State problem in terms of the observed variable x • The annual rate of return for the S&P 500 is our variable x, which has the N(12 ,16.5) distribution. We want to find the proportion of years with x < 0.

  33. Step 2: • Standardize x to restate the problem in terms of a standard Normal variable z • x < 0 becomes (x − 12) / 16.5 < (0 − 12) / 16.5, that is, z < −0.73 • Draw the picture!

  34. Step 3: • Find the required area under the standard Normal curve using Table A and the fact that the total area under the curve is 1 • Find the z-score to the first decimal place in the left-hand column labeled “z” • Follow that row to the right until you are under the column that equals the second decimal place of z. • This value is the proportion of all values from the distribution that are less than your observed z-score. z = -0.73 → Area = 0.2327

  35. Conclusion: • Interpret the result from Step 3 in terms of the original question of interest • The S&P 500 is down on an annual basis about 23.3% of the time. • By simply taking 100 % - 23.3 % = 76.7 %, we can also conclude that this stock index is up on an annual basis about 76.7% of the time.
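The stock-index calculation can be repeated in code. This sketch assumes Python 3.8 or later for `statistics.NormalDist`. It computes the area twice: once with z rounded to two decimals, as a table lookup requires, and once without rounding, to show how little the rounding matters.

```python
from statistics import NormalDist

# Annual returns (in percent) modeled as N(12, 16.5), as in Example 1.20
returns = NormalDist(mu=12.0, sigma=16.5)
standard = NormalDist()  # standard Normal distribution N(0, 1)

# Step 2: standardize the cutoff x = 0
z_exact = (0.0 - returns.mean) / returns.stdev
z_table = round(z_exact, 2)  # Table A only lists z to two decimal places

# Step 3: area to the left of z
p_table = standard.cdf(z_table)  # what the table lookup gives
p_exact = returns.cdf(0.0)       # same area without rounding z

# Proportion of years in which the market is up
p_up = 1.0 - p_exact

print(f"z = {z_table}, table-style area = {p_table:.4f}")
print(f"area without rounding z = {p_exact:.4f}, up years = {p_up:.4f}")
```

The two areas differ only in the third decimal place, so the conclusion of roughly 23% down years is not sensitive to the rounding.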

  36. Example 1.21 • What percent of years have annual rates of return between 12% and 50%?

  37. Step 1: • State problem in terms of the observed variable x • We want the proportion of years with 12 ≤ x ≤ 50

  38. Step 2: • Standardize x to restate the problem in terms of a standard Normal variable z • 12 ≤ x ≤ 50 becomes (12 − 12) / 16.5 ≤ (x − 12) / 16.5 ≤ (50 − 12) / 16.5, that is, 0 ≤ z ≤ 2.30 • Draw the picture!

  39. Step 3: • Find the required area under the standard Normal curve using Table A and the fact that the total area under the curve is 1 • the area between 0 and 2.30 is the area below 2.30 minus the area below 0. Use the picture to visualize! area between 0 and 2.30 = area below 2.30 – area below 0.00 = 0.9893 – 0.5000 = 0.4893

  40. Conclusion: • Interpret the result from Step 3 in terms of the original question of interest • About 49 % of years have annual rates of return between 12 % and 50 %
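The "area below the upper value minus area below the lower value" step generalizes to any interval. A minimal sketch, assuming Python 3.8 or later for `statistics.NormalDist`; `proportion_between` is a hypothetical helper name.

```python
from statistics import NormalDist

def proportion_between(dist, low, high):
    """Area under the curve of `dist` between low and high."""
    return dist.cdf(high) - dist.cdf(low)

# Annual returns (in percent) modeled as N(12, 16.5), as in Example 1.21
returns = NormalDist(mu=12.0, sigma=16.5)
share = proportion_between(returns, 12.0, 50.0)

print(f"proportion of years with returns between 12% and 50%: {share:.4f}")
```

Because the curve is continuous, it makes no difference whether the endpoints are included, so the same function answers both 12 ≤ x ≤ 50 and 12 < x < 50.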

  41. General Information • The proportion of observations with x < 0 is the same as the proportion with x ≤ 0 (property of continuous curves) • There is no area under the curve at an exact value • for example: the proportion of years with 0% return is 0, even if there is such a year in the actual data • Sometimes we encounter a value of z more extreme than those appearing in Table A • for practical purposes, we can act as if there is zero area outside the range of Table A

  42. “Backward” Normal Calculations • We may want to find the observed value with a given proportion of the observations above or below it • To do this: • find the given proportion in the inside of the table, read the corresponding z from the left column and top row, then “unstandardize” to get the observed value • general formula to unstandardize a z-score: x = µ + zσ

  43. Example 1.22 • Miles per gallon ratings of compact cars (2001 model year) follow approximately the N(25.7, 5.88) distribution • How many miles per gallon must a vehicle get to place in the top 10% of all 2001 model year compact cars?

  44. State the problem • We want to find the miles per gallon rating x with area 0.1 to its right under the Normal curve with mean µ = 25.7 and standard deviation σ = 5.88. (That’s the same as finding the miles per gallon rating x with area 0.9 to its left)

  45. Use the table • Look in the body of Table A for the entry closest to 0.9. • This is the entry corresponding to z = 1.28 • So z = 1.28 is the standardized value with area 0.9 to its left

  46. Unstandardize • Transform the solution from the z back to the original x scale. x = 25.7 + (1.28)(5.88) = 33.2

  47. Conclusion • Interpret the result in terms of the original question of interest • A compact car must receive a rating of at least 33.2 miles per gallon to place in the highest 10% of all 2001 model year compact cars.
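The backward calculation can also be done without a table. The sketch below assumes Python 3.8 or later, where `statistics.NormalDist` provides `inv_cdf`, which returns the value with a given area to its left. It shows both the two-step route (find z, then unstandardize) and the direct route.

```python
from statistics import NormalDist

# Miles per gallon ratings modeled as N(25.7, 5.88), as in Example 1.22
mpg = NormalDist(mu=25.7, sigma=5.88)

# Top 10% means area 0.9 to the left of the cutoff
z_cut = NormalDist().inv_cdf(0.90)       # z with area 0.9 to its left
x_cut = mpg.mean + z_cut * mpg.stdev     # unstandardize: x = mu + z * sigma

# Same answer in one step, working on the original scale
x_direct = mpg.inv_cdf(0.90)

print(f"z = {z_cut:.2f}, cutoff = {x_cut:.1f} miles per gallon")
```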
