Basic Quantitative Methods in the Social Sciences (AKA Intro Stats)

holly-moore

Presentation Transcript


  1. Basic Quantitative Methods in the Social Sciences (AKA Intro Stats) 02-250-01 Lecture 4

  2. A Quick Review • The entire area under the normal curve can be considered to be a proportion of 1.00 • A proportion of .50 lies to the left of the mean, and a proportion of .50 lies to the right of the mean

  3. Area Under the Normal Distribution and Z-Scores • [Figure: normal distribution with z-score points of reference]

  4. Properties of Area Under the Normal Distribution • Since the normal curve is a bell shape, the proportion of scores between whole z-scores is not equal • For example, .3413 of the scores lie between the z-scores of 0 (the mean) and 1 (or -1), while only .1359 of the scores lie between the z-scores of 1 and 2 (or -1 and -2)

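  5. Properties of Area Under the Normal Distribution • [Figure: normal curve with z-scores from -3 to +3 marked on the x-axis] • Area between z = 0 and z = ±1: .3413 on each side • Area between z = ±1 and z = ±2: .1359 on each side • Area between z = ±2 and z = ±3: .0215 on each side • Area beyond z = ±3: .0013 in each tail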

  6. Properties of Area Under the Normal Distribution • Z-scores* and the proportion under the curve between them: • -1 to +1: .6826 (.3413 + .3413) • -2 to +2: .9544 • -3 to +3: .9974 • -4 to +4: 1.0000 • *Z-scores are expressed in standard deviation units, i.e., a z-score of -1 represents one standard deviation below (to the left of) the mean
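
The proportions on the last three slides can be checked numerically. The following is a minimal sketch, assuming Python's standard-library `statistics.NormalDist` as a stand-in for the printed table; the helper name `area_between` is illustrative and not part of the lecture. Because the slide adds rounded table entries, the computed values can differ from the slide in the fourth decimal place.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
STD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def area_between(z_low: float, z_high: float) -> float:
    """Proportion of the area under the standard normal curve between two z-scores."""
    return STD_NORMAL.cdf(z_high) - STD_NORMAL.cdf(z_low)


# Areas between neighbouring whole z-scores are not equal (bell shape).
print(f"0 to 1: {area_between(0, 1):.4f}")
print(f"1 to 2: {area_between(1, 2):.4f}")

# Symmetric bands around the mean, as in the table on this slide.
for k in (1, 2, 3, 4):
    print(f"-{k} to +{k}: {area_between(-k, k):.4f}")
```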

  7. Normal Distribution Example • A study of 2500 University of Windsor students showed that the amount of sleep lost in the week prior to writing a statistics exam (in hours) was normally distributed with a mean of 7.79 and a standard deviation of 1.75 (don’t worry, this isn’t real data!) • This distribution is shown with the abscissa (x-axis) marked in raw score and z-score units:

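  8. Normal Distribution Example • [Figure: normal curve with areas .3413, .1359, .0215, and .0013 on each side of the mean, and the x-axis marked in both raw scores and z-scores] • X = 2.54, 4.29, 6.04, 7.79, 9.54, 11.29, 13.04 corresponds to Z = -3, -2, -1, 0, +1, +2, +3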

  9. Example cont. • We can see from this diagram that 34.13% of U of W students lost between 6.04 and 7.79 hours of sleep in the week prior to a stats test (between z=-1 and z=0) • 13.59% of students lost between 9.54 and 11.29 hours of sleep in that week (between z=+1 and z=+2) • 49.87% of students lost between 2.54 & 7.79 hours of sleep (between z=-3 and z=0) (.0215+.1359+.3413 = .4987 = 49.87%)
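
The raw-score axis in this example follows from the mean and standard deviation. Here is a small sketch, assuming the values 7.79 and 1.75 from the slide; the names `raw_score`, `axis`, and `sleep` are hypothetical. It rebuilds the axis labels and recomputes the 49.87% figure directly from the distribution.

```python
from statistics import NormalDist

MEAN, SD = 7.79, 1.75  # hours of sleep lost (made-up data from the slide)


def raw_score(z: float) -> float:
    """Raw score that sits z standard deviations from the mean."""
    return MEAN + z * SD


# Raw-score label under each whole z-score from -3 to +3.
axis = {z: round(raw_score(z), 2) for z in range(-3, 4)}
print(axis)

# Proportion of students between 2.54 and 7.79 hours (z = -3 to z = 0).
sleep = NormalDist(mu=MEAN, sigma=SD)
p_between = sleep.cdf(7.79) - sleep.cdf(2.54)
print(f"{p_between:.4f}")
```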

  10. Properties of Area Under the Normal Distribution • The symbol z_α is used to denote the z-score having area α (alpha) to its right under the normal curve • The proportion of area under the curve between the mean and a z-score can be found with the help of a table (Table E.10, Howell, p. 452) and a little math… • In this example, we want to know the area between the mean and z = 0.20: • Look under the column “mean to z” at z=0.20 • The proportion = 0.0793 • Therefore, .0793 (or almost 8%) is the proportion of data scores between the mean and the score that has a z score of 0.20

  11. Example cont. • This means that the area under the curve between the mean and z = 0.20 is 0.0793 • [Figure: normal curve with .0793 between z = 0 and z = 0.20, and .4207 beyond z = 0.20]

  12. Example cont. • Since half of the normal distribution has an area of .5000, we can determine the area beyond z = .20 by subtracting the area from the mean to z = .20 from .5000: • Area beyond z=.20 = .5000 - .0793 • Area beyond z=.20 = .4207 • (Note: If you look at the “smaller portion” in the table, you will see it’s .4207)

  13. Example cont. • Since the normal curve is symmetrical, the area between the mean and z = -.20 is equal to the area between the mean and z = +.20 • [Figure: normal curve with .0793 on each side of the mean between z = 0 and z = ±0.20, and .4207 in each portion beyond z = ±0.20]

  14. Normal Distribution Table • Table E.10 has 3 columns: • Mean to z • Larger portion • Smaller portion
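
The three columns are related by simple arithmetic: the larger portion is .5000 plus "mean to z", and the smaller portion is .5000 minus "mean to z". The sketch below rebuilds one table row from the normal cumulative distribution; `table_row` is an illustrative helper, not something from Howell or the lecture.

```python
from statistics import NormalDist


def table_row(z: float) -> dict:
    """Return the three table columns for a z-score, rounded to four decimals."""
    z = abs(z)  # the curve is symmetrical, so -z and +z share a row
    below = NormalDist().cdf(z)  # area to the left of +z
    return {
        "mean_to_z": round(below - 0.5, 4),
        "larger_portion": round(below, 4),
        "smaller_portion": round(1 - below, 4),
    }


row = table_row(0.20)
print(row)
```

Calling `table_row(-0.20)` returns the same row, which is the symmetry point made on slide 13.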

  15. Table: Mean to z

  16. Table: Larger Portion

  17. Table: Smaller Portion

  18. A Couple of Notes • 1) Always report proportions (area under the curve) to four decimal places. This means that if you report an area as a percentage, it will have two decimal places (e.g., .7943 = 79.43%) • 2) When using Table E.10, be careful not to confuse z=.20 with z=.02 (this is a common mistake) • 3) Remember that a negative z value has the same proportion under the curve as the positive z value because the normal distribution is symmetrical • 4) When working on z-score problems, it is highly recommended that you draw a normal distribution and plot the mean, x, and their corresponding z-scores

  19. Another Example! • We often want to know what the area between two scores is, as in this example: • Assume that the marks in this class are normally distributed with a mean of 69.5 and a standard deviation of 7.4. What proportion of students have marks between 50 and 80?

  20. Example: Area Between 2 Scores • 1) Calculate the z-scores for the X values (50 and 80): z = (50 - 69.5)/7.4 = -19.5/7.4 = -2.64 and z = (80 - 69.5)/7.4 = 10.5/7.4 = 1.42 • 2) Find the proportion between the mean and each z-score (consult Table E.10): for z = -2.64, the proportion between the mean and z is .4959; for z = 1.42, the proportion between the mean and z is .4222

  21. Example: Area Between 2 Scores • Third, add these proportions together to find your answer: .4959 + .4222 = .9181 • This means that 91.81% of students have Stats marks between 50 and 80
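
The three steps of this example can be sketched in code. This is only an illustration: it assumes `statistics.NormalDist` in place of Table E.10 and rounds each z-score to two decimals to mimic the table lookup, and the helper names `z_score` and `mean_to_z` are hypothetical.

```python
from statistics import NormalDist


def z_score(x: float, mean: float, sd: float) -> float:
    """Standardize a raw score, rounded to two decimals like a printed z table."""
    return round((x - mean) / sd, 2)


def mean_to_z(z: float) -> float:
    """Area between the mean (z = 0) and z; the sign of z does not matter."""
    return NormalDist().cdf(abs(z)) - 0.5


mean, sd = 69.5, 7.4  # class marks from the slide

# Step 1: z-scores for the two marks.
z_low = z_score(50, mean, sd)
z_high = z_score(80, mean, sd)

# Steps 2 and 3: the marks lie on opposite sides of the mean, so the areas add.
proportion = mean_to_z(z_low) + mean_to_z(z_high)
print(z_low, z_high, round(proportion, 4))
```

If both scores were on the same side of the mean, the smaller "mean to z" area would be subtracted from the larger one instead of added.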

  22. Smaller and Larger Portions • Smaller portion = proportion in the tail • Larger portion = proportion in the body • Using the same data (mean = 69.5 and standard deviation = 7.4) we can calculate areas using the Smaller and Larger Portions in the Normal Distribution table: • Find the proportion of students who have stats marks of less than 80.6 • z = (80.6-69.5)/7.4 = +1.5

  23. Larger Portion • Area below z = +1.5 = 0.9332 • This means that 93.32% of students had a mark of 80.6 or less in this class

  24. Smaller Portion • Find the proportion of students who have marks of 76.93 or better: • z = (76.93-69.5)/7.4 = 1.00 • Area in smaller portion = .1587 • This means that 15.87% of students in this class had a mark of 76.93 or better
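
Both lookups above can be reproduced with one helper. The sketch below assumes the class mean of 69.5 and standard deviation of 7.4 from the slides; `portions` is a hypothetical name, and the z-score is rounded to two decimals to match how the table is read.

```python
from statistics import NormalDist

MEAN, SD = 69.5, 7.4  # class marks from the slides


def portions(x: float) -> tuple:
    """Return (z, larger portion, smaller portion) for a raw mark x."""
    z = round((x - MEAN) / SD, 2)
    body = NormalDist().cdf(abs(z))  # larger portion: the body of the curve
    tail = 1 - body                  # smaller portion: the tail
    return z, body, tail


# Marks below 80.6: z is positive, so "below" is the larger portion.
z_a, larger_a, _ = portions(80.6)
print(z_a, round(larger_a, 4))

# Marks of 76.93 or better: z is positive, so "above" is the smaller portion.
z_b, _, smaller_b = portions(76.93)
print(z_b, round(smaller_b, 4))
```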

  25. Converting Back to X • Assume a mean of 30 and a standard deviation of 5: what raw scores correspond to z=-1.00 and z=+1.5?
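
The worked answer for this slide is not in the transcript. Rearranging z = (X - mean) / SD gives X = mean + z × SD, the same formula used to solve for X on slide 31, and a minimal sketch of it follows; the function name `raw_score` is illustrative.

```python
def raw_score(z: float, mean: float, sd: float) -> float:
    """Convert a z-score back to a raw score: X = mean + z * sd."""
    return mean + z * sd


x_low = raw_score(-1.00, mean=30, sd=5)
x_high = raw_score(1.5, mean=30, sd=5)
print(x_low, x_high)  # 25.0 37.5
```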

  26. Proportion • What proportion of scores lie between z=-1.00 and z=+1.50? • Area from mean to z=-1.00 = .3413 • Area from mean to z=+1.50 = .4332 • Add them together to get the proportion that lies between these two z-scores: .3413+.4332 = .7745

  27. Solving for the Number of Observations • In this example, if we know the sample size (e.g., n=212), we can calculate how many people lie between z=-1.00 and z=+1.50: • Area between z=-1.00 and z=+1.50 = .7745 (see the last slide) • Multiply the proportion by n: (.7745)(212) = 164.19 • Approximately 164 people
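
The same count can be obtained in code. This sketch assumes `statistics.NormalDist` instead of the table, so the unrounded product differs slightly from the slide's 164.19 while rounding to the same whole number; `expected_count` is a hypothetical helper.

```python
from statistics import NormalDist


def expected_count(z_low: float, z_high: float, n: int) -> float:
    """Expected number of observations out of n that fall between two z-scores."""
    std_normal = NormalDist()
    proportion = std_normal.cdf(z_high) - std_normal.cdf(z_low)
    return proportion * n


count = expected_count(-1.00, 1.50, n=212)
print(round(count))  # 164
```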

  28. And a Little More • Finally, we can find a z-score from the table if we know the proportion of scores (i.e., we can work backwards): • Suppose the birth weight of newborns (in pounds) is normally distributed with a mean of 7.73 and a standard deviation of 0.83 • What birth weight identifies the top (heaviest) 10% of newborns?

  29. Example cont. • Look at Table E.10 and find the z-score that identifies the top proportion of 0.1000: look in the smaller portion column (the tail) • [Figure: normal curve with the upper tail of area .1000 shaded and its cutoff marked z = ?]

  30. Example cont. • Looking in the smaller portion column, we find that • z=1.28 has an area of .1003 • z=1.29 has an area of .0985 • Which do we pick? • Pick the one that is closest to an area of .1000: this is z=1.28

  31. Example cont. • Now solve for X: X = (1.28)(0.83) + 7.73 = 1.06 + 7.73 = 8.79 So any weight equal to or greater than 8.79 pounds is in the top 10% of birth weights
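
Working backwards from a proportion to a score is an inverse lookup. The sketch below uses `NormalDist.inv_cdf` from Python's standard library as an assumed substitute for scanning the smaller portion column; it returns an unrounded z of about 1.28, so the cutoff agrees with the slide once rounded to two decimals.

```python
from statistics import NormalDist

mean, sd = 7.73, 0.83   # birth weights from the slide, in pounds
top_share = 0.10        # the heaviest 10% of newborns

# A tail of .10 on the right means .90 of the area lies to the left of the cutoff.
z_cut = NormalDist().inv_cdf(1 - top_share)

# Convert the z-score back to a raw score: X = mean + z * sd.
x_cut = mean + z_cut * sd
print(round(z_cut, 2), round(x_cut, 2))
```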

  32. Probability • Everything that can possibly happen has some likelihood of happening: probability is a measure of that likelihood • Probability: The quantitative expression of likelihood of occurrence

  33. Probability • Probability is a ratio of frequencies • The numerator (top) is the frequency of the outcome of interest • The denominator (bottom) is the frequency of all possible outcomes

  34. Coin Toss Example • If a fair* coin is tossed in the air, it can land on either heads or tails • This means a coin has 2 possible outcomes • If we want to know the probability of tossing a fair* coin and having it land on heads, we calculate as follows: *Note: fair means a normal coin, one that is not weighted differently

  35. Coin Toss • p = (frequency of the outcome of interest) / (frequency of all possible outcomes) • For a coin toss, this is 1/2 • The probability of the coin landing on heads is: p(heads) = ½, or p(heads) = .5

  36. Another Example • Suppose there are 90 students in a class, 59 of them are women and 31 are men • If one of the students is chosen at random, the probability of choosing a woman is: p(woman) = 59/90

  37. More Probability • If the entire class was women (e.g., there were no male students), the probability of choosing a woman would be 90/90 • If the entire class was men, the probability of choosing a woman would be 0/90

  38. More Probability • As a numerical value, probabilities can range from 0.00 to 1.00 • The numerator can range from a minimum of 0 to a maximum equal to the denominator

  39. Express Yourself! • Probability can be expressed as a fraction, e.g., p(woman) = 59/90 • Or as a decimal fraction: p(woman) = .6556 • Probabilities are not usually expressed as percentages (e.g., 65.56%), although they often are in popular media

  40. Probability cont. • Even if we do not know the actual observed frequencies (e.g., the number of women), probabilities can be determined theoretically • Without throwing a die, we can deduce the probability of landing on a 5

  41. Die Example cont. • We know the die has 6 sides - 6 possible outcomes • We are only interested in one side (the 5), so the probability of landing on a 5 is: p(5) = 1/6 = 0.1667
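
The coin, class, and die examples all use the same ratio of frequencies. A minimal sketch with Python's `fractions.Fraction` keeps the fraction form and the decimal form side by side; the function name `probability` and its argument names are illustrative.

```python
from fractions import Fraction


def probability(favourable: int, possible: int) -> Fraction:
    """Frequency of the outcome of interest over the frequency of all possible outcomes."""
    if possible <= 0 or not 0 <= favourable <= possible:
        raise ValueError("need 0 <= favourable <= possible and possible > 0")
    return Fraction(favourable, possible)


p_heads = probability(1, 2)    # fair coin landing on heads
p_woman = probability(59, 90)  # choosing a woman from the class of 90
p_five = probability(1, 6)     # fair die landing on 5

for label, p in (("heads", p_heads), ("woman", p_woman), ("five", p_five)):
    print(f"p({label}) = {p} = {float(p):.4f}")
```

The range check mirrors slide 38: the numerator can run from 0 up to the denominator, so every result lies between 0.00 and 1.00.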

  42. Probability and the Normal Distribution • The normal distribution can be thought of as a probability distribution. Here’s how: • We know (from Table E.10) the proportion of scores that fall above or below a given z score • If you were to randomly pick a score from a sample of scores, what is the probability that you would pick a score that has a corresponding z score of .40 or greater?

  43. Probability and the Normal Distribution • The proportion of scores above or below a given z score is the same as the probability of selecting a score above or below the z score • e.g., the probability of selecting a score from a normal distribution that has a z score of .40 or greater is .3446 (the area in the smaller portion of z = .40)

  44. Example #1 • Suppose people’s scores on a personality test are normally distributed with a mean of 50 and a population standard deviation of 10. • If you were to pick a person completely at random, what is the probability that you would pick someone with a score on this personality test that is higher than 60?

  45. Example #1 • Step #1: Write down what you know • Step #2: What do you want to find? • Step #3: Draw the normal distribution, write in the mean, standard deviation, and the X and shade the area you are looking for

  46. Example #1, Step #3 • [Figure: normal curve with the x-axis marked at X = 20, 30, 40, 50, 60, 70, 80]

  47. Example #1 • Step #4: Calculate z score(s): z = (60 - 50)/10 = 1.00 • Step #5: Use Table E.10 to find the probability of selecting a score in your shaded area • Here we want p(X > 60) or, equivalently, p(z > 1.00) • Look up the smaller portion of z=1.00

  48. Example #1 • Step #6: Interpret: • The probability of picking someone at random who has a personality test score of 60 or greater is .1587
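
Steps 4 to 6 of Example #1 can be sketched as follows, again assuming `statistics.NormalDist` in place of Table E.10. The helper `smaller_portion` is a hypothetical name; the same helper reproduces the z = .40 probability quoted on slide 43.

```python
from statistics import NormalDist


def smaller_portion(z: float) -> float:
    """Area in the tail beyond z, i.e. the probability of a z-score at least this extreme."""
    return 1 - NormalDist().cdf(abs(z))


# Step 1: what we know.
mean, sd = 50, 10
x = 60

# Step 4: calculate the z-score.
z = (x - mean) / sd

# Step 5: probability of a score above X is the tail area beyond z.
p_above_60 = smaller_portion(z)

# Step 6: interpret.
print(f"z = {z:.2f}, p(X > {x}) = {p_above_60:.4f}")
print(f"p(z >= .40) = {smaller_portion(0.40):.4f}")
```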

  49. Example #2 • Length of time spent waiting in line to buy tickets at the movies is normally distributed with a mean of 12 minutes and a population standard deviation of 3 minutes. • If you go to see a movie, what is the probability that you will wait in line to buy tickets for between 7.5 and 15 minutes?

  50. Example #2 • Step #1: Write down what you know • Step #2: What do you want to find? • Step #3: Draw the normal distribution, write in the mean, standard deviation, and both X scores and shade the area you are looking for
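
The transcript stops before the remaining steps of Example #2, so the following is only a sketch of how they could be carried out, not the lecturer's worked answer. It assumes `statistics.NormalDist` and works on the raw waiting times directly. By hand, the z-scores are (7.5 - 12)/3 = -1.50 and (15 - 12)/3 = +1.00, the same pair as on slide 26.

```python
from statistics import NormalDist

# Step 1: what we know (minutes spent waiting in line).
wait = NormalDist(mu=12, sigma=3)

# Step 2: what we want is p(7.5 < X < 15).
low, high = 7.5, 15

# Step 4: z-scores, shown for comparison with the table method.
z_low = (low - wait.mean) / wait.stdev
z_high = (high - wait.mean) / wait.stdev

# Step 5: area between the two raw scores.
p_between = wait.cdf(high) - wait.cdf(low)

print(z_low, z_high, round(p_between, 4))
```

The scores fall on opposite sides of the mean, so the table method would add the two "mean to z" areas, .4332 + .3413 = .7745.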
