What you’ll learn • Properties of the standard normal distribution • How to transform scores into standard normal scores (z-scores) • How to determine the proportion of observations above, below, and between two stated numbers in a normal distribution • How to calculate the point for a normally distributed variable for which a stated proportion of values lie either above or below • How to compare individuals from different distributions
Standard Normal Distribution • The Standard Normal Distribution (also known as the “z-distribution”) N( 0, 1)
Standardizing Scores • We find that all normal distributions are the same if we measure in units of σ from the mean μ. • We standardize a value x by computing its z-score: z = (x - μ)/σ
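As a minimal sketch of standardizing, the Python snippet below converts two values from the cholesterol example that follows into z-scores. The helper name `z_score` is an illustrative choice, not something from these slides; it assumes the usual definition z = (x - μ)/σ.

```python
from statistics import NormalDist


def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean (hypothetical helper)."""
    return (x - mu) / sigma


# 14-year-old boys: mu = 170 mg/dl, sigma = 30 mg/dl
z_low = z_score(160, 170, 30)    # a value below the mean gives a negative z
z_high = z_score(240, 170, 30)   # a value above the mean gives a positive z
print(round(z_low, 2), round(z_high, 2))

# The standard normal distribution N(0, 1) has mean 0 and standard deviation 1
standard = NormalDist()
print(standard.mean, standard.stdev)
```

Rounding to two decimals matches the precision of a standard normal table.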
Using the Standard Normal Distribution • The level of cholesterol in the blood is important because high cholesterol levels may increase the risk of heart disease. We know that the distribution of blood cholesterol levels in a large population of people of the same age and sex is roughly normal. For 14-year-old boys, the mean is μ = 170 mg/dl and the standard deviation is σ = 30 mg/dl. Levels above 240 mg/dl may require medical attention.
Steps to solving a “normal” distribution problem • Step 1: • Write the question as a probability statement. • Step 2: • Calculate a z-score. • Draw a picture and shade the region. • Step 3: • Find the appropriate area using a standard normal table. • Step 4: • Write the answer in the context of the problem.
Question with Area below • What percent of 14-year-old boys have less than 160 mg/dl of cholesterol? • Step 1 (probability statement) • P(X < 160) • Step 2: (z-score) • z = (160 - 170)/30 ≈ -0.33 • Since we want the percent of boys whose cholesterol is less than 160, we will find the percent of boys whose cholesterol is 0.33σ or more below the mean.
Step 3: (Area from Table A) • We can now use Table A to find the percent of observations below z = -0.33. (Remember that Table A always gives the area under the curve below a given value.)
Step 3 (cont.) • The area under the curve (the proportion of observations) below z = -0.33 is .3707. • Step 4: (Context) The percent of 14-year-old boys whose cholesterol level is less than 160 mg/dl is approximately 37.07%.
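The table lookup for an area below a value can be checked with a short Python sketch. It uses `NormalDist` from the standard library in place of Table A, which is a substitution made here for illustration and not part of the slides. The unrounded result differs slightly from the table answer because the table requires z to be rounded to two decimals.

```python
from statistics import NormalDist

mu, sigma = 170, 30                      # mg/dl, 14-year-old boys

# Step 2: z-score, rounded to two decimals as for a table lookup
z = round((160 - mu) / sigma, 2)

# Step 3: area below z under the standard normal curve
area_table = NormalDist().cdf(z)

# Same area without rounding z, computed on the original scale
area_exact = NormalDist(mu, sigma).cdf(160)

print(f"table-style: {area_table:.4f}, unrounded: {area_exact:.4f}")
```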
Question with Area above • What percent of 14-year-old boys have more than 240 mg/dl of cholesterol? • Step 1 (probability statement) • P(X > 240) • Step 2: (z-score) • z = (240 - 170)/30 ≈ 2.33 • Since we want the percent of boys whose cholesterol is greater than 240, we will find the percent of boys whose cholesterol is 2.33σ or more above the mean.
Step 3: (Area from Table A) • We can now use Table A to find the percent of observations below z = 2.33. (Below, because that’s what our table gives us.)
Step 3: Area (continued) • The value from the table is .9901. We need to remember that the table gives us the area below a value. Since the total area under the curve is 1, we find the area above by subtracting the table value from 1. So 1 - .9901 = .0099 • Step 4: (Context) • The percent of 14-year-old boys whose cholesterol level is more than 240 mg/dl is approximately 0.99%.
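The "1 minus the area below" step can be sketched in Python as follows. `NormalDist().cdf` stands in for Table A here, which is an assumption of this example and not the method shown on the slides.

```python
from statistics import NormalDist

z = 2.33                                 # z-score for 240 mg/dl, rounded as in the table

area_below = NormalDist().cdf(z)         # what the table reports
area_above = 1 - area_below              # total area is 1, so subtract to get the upper tail

print(f"below: {area_below:.4f}, above: {area_above:.4f}")
```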
Question between two values • What percent of 14-year-old boys have cholesterol levels between 170 mg/dl and 240 mg/dl? • Step 1 (probability statement) • P(170 < X < 240) • Step 2: (z-scores; we need to find z-scores for both endpoints) • z = (170 - 170)/30 = 0 and z = (240 - 170)/30 ≈ 2.33 • Since we want the percent of boys whose cholesterol is between 170 mg/dl and 240 mg/dl, we will find the percent of boys whose cholesterol is between 0σ and 2.33σ above the mean.
Step 3: (Area from Table A) • We can now use Table A to find the area below z = 2.33 and the area below z = 0.00. (Below, because that’s what our table gives us.)
Step 3: Area (continued) • The values from the table are .9901 for the z-score of 2.33 and .5000 for the z-score of 0. We need to remember that the table gives us the area below a value. We can take the area below 2.33 (.9901) and subtract the area below 0 (.5000) to get the area between. So: .9901 - .5000 = .4901 • Step 4: (Context) The percent of 14-year-old boys whose cholesterol is between 170 mg/dl and 240 mg/dl is approximately 49.01%.
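The subtraction of two "area below" values generalizes to any pair of z-scores. A minimal Python sketch, with `area_between` as a hypothetical helper name and `NormalDist().cdf` used in place of Table A:

```python
from statistics import NormalDist


def area_between(z_lo, z_hi):
    """Area under the standard normal curve between two z-scores (illustrative helper)."""
    std = NormalDist()
    # Area below the upper endpoint minus area below the lower endpoint
    return std.cdf(z_hi) - std.cdf(z_lo)


between = area_between(0.00, 2.33)       # 170 mg/dl to 240 mg/dl on the z scale
print(f"{between:.4f}")
```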
Finding the value of the variable when we know the percent above or below • What cholesterol level separates the top 10% of 14-year-old boys from the rest? • Step 1: Write a probability statement: P(X > x) = .10 • This statement says that we want to find the value x that separates the top 10% from the bottom 90% of the curve. Since our table gives the area below a value, we will find the z-score that corresponds to an area of 90%.
Step 2: Find the z-score from the table. Remember that the area is located on the “inside” of the table. Since the z-score that we are looking for is above the mean, we know the z-score will be positive. We’ll look for a value close to .9000. The closest value is .8997, so we will use a z-score of 1.28
Step 3: Using the z-score found, substitute the three known values into the standardizing formula z = (x - μ)/σ: 1.28 = (x - 170)/30 • Now, using algebra, solve the equation for x: x = 170 + 1.28(30) = 208.4 • Step 4: Write a statement back in context. • A 14-year-old boy’s cholesterol level must be at least 208.40 mg/dl to be in the top 10% of cholesterol levels.
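Working backward from an area to a value can be sketched in Python with `NormalDist.inv_cdf`, which plays the role of reading the table "inside out". Using it instead of Table A is an assumption of this example. The result is slightly larger than the table answer because the table's nearest entry (.8997) is not exactly .9000.

```python
from statistics import NormalDist

mu, sigma = 170, 30                      # mg/dl, 14-year-old boys

# Top 10% means 90% of the area lies below the cutoff
z = NormalDist().inv_cdf(0.90)

# Undo the standardization: x = mu + z * sigma
x = mu + z * sigma

# Same calculation with the table's z-score of 1.28
x_table = mu + 1.28 * sigma

print(f"z = {z:.4f}, x = {x:.2f}, table-based x = {x_table:.2f}")
```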
Comparing Individuals • One of the best reasons to standardize values (find their corresponding z-scores) is to be able to compare individuals from different distributions. • Consider again the three baseball players that we looked at earlier in the year: • Ty Cobb: .420 • Ted Williams: .406 • George Brett: .390 • How can we compare the batting averages of these players when they played in different eras under different conditions? Was Ty Cobb actually the best hitter of these three? Let’s find out.
Comparing Individuals (Cont.) • We know that batting averages are quite symmetric and reasonably normal with the following characteristics for each era:
Now, using that information, find the corresponding z-score for each player: Ty Cobb, Ted Williams, George Brett. • Now that we have standardized each score onto the standard normal curve, we can compare the scores of these three individuals. Since, in this case, a larger z-score indicates a better batting average relative to the player's era, it appears that Ted Williams is the best batter of these three: 4.26 > 4.15 > 4.07
Additional Resources • Practice of Statistics, Pg 83-97