Chapter 6: Probability Distributions

Section 6.1: How Can We Summarize Possible Outcomes and Their Probabilities?

Learning Objectives • Random variable • Probability distributions for discrete random variables • Mean of a probability distribution • Summarizing the spread of a probability distribution • Probability distribution for continuous random variables

Learning Objective 1: Randomness • The numerical values that a variable assumes are the result of some random phenomenon: • Selecting a random sample from a population or • Performing a randomized experiment

Learning Objective 1:Random Variable • A random variable is a numerical measurement of the outcome of a random phenomenon.

Learning Objective 1:Random Variable • Use letters near the end of the alphabet, such as x, to symbolize • Variables • A particular value of the random variable • Use a capital letter, such as X, to refer to the random variable itself. Example: Flip a coin three times • X=number of heads in the 3 flips; defines the random variable • x=2; represents a possible value of the random variable

Learning Objective 2:Probability Distribution • The probability distribution of a random variable specifies its possible values and their probabilities. Note: It is the randomness of the variable that allows us to specify probabilities for the outcomes

Learning Objective 2: Probability Distribution of a Discrete Random Variable • A discrete random variable X has separate values (such as 0, 1, 2, …) as its possible outcomes • Its probability distribution assigns a probability P(x) to each possible value x: • For each x, the probability P(x) falls between 0 and 1 • The sum of the probabilities for all the possible x values equals 1

Learning Objective 2:Example • What is the estimated probability of at least three home runs? P(3)+P(4)+P(5)=0.13+0.03+0.01=0.17

Learning Objective 3: The Mean of a Discrete Probability Distribution • The mean of a probability distribution for a discrete random variable is µ = Σ x P(x), where the sum is taken over all possible values of x. • The mean of a probability distribution is denoted by the parameter µ. • The mean is a weighted average; values of x that are more likely receive greater weight P(x).

Learning Objective 3: Expected Value of X • The mean of a probability distribution of a random variable X is also called the expected value of X. • The expected value reflects not what we’ll observe in a single observation, but rather what we expect for the average in a long run of observations. • It is not unusual for the expected value of a random variable to equal a number that is NOT a possible outcome.

Learning Objective 3: Example • Find the mean of this probability distribution. • The mean: µ = 0(0.23) + 1(0.38) + 2(0.22) + 3(0.13) + 4(0.03) + 5(0.01) = 1.38
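The weighted-average calculation above can be sketched in Python. This is only an illustration: the dictionary name `home_runs` and the helper `distribution_mean` are made up here, and the probabilities are the ones that appear in the home-run example.

```python
# Home-run distribution from the example: value x -> probability P(x)
home_runs = {0: 0.23, 1: 0.38, 2: 0.22, 3: 0.13, 4: 0.03, 5: 0.01}


def distribution_mean(dist):
    """Return the mean (expected value) of a discrete distribution: sum of x * P(x)."""
    return sum(x * p for x, p in dist.items())


# A valid probability distribution must have probabilities that sum to 1.
total_probability = sum(home_runs.values())

# Mean: each value is weighted by how likely it is.
mean = distribution_mean(home_runs)

# Probability of at least three home runs: P(3) + P(4) + P(5).
p_at_least_three = sum(p for x, p in home_runs.items() if x >= 3)

print(round(total_probability, 2))
print(round(mean, 2))
print(round(p_at_least_three, 2))
```

Note that the mean, 1.38, is not itself a possible number of home runs, which matches the remark about expected values.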

Learning Objective 4:The Standard Deviation of a Probability Distribution The standard deviation of a probability distribution, denoted by the parameter, σ, measures its spread. • Larger values of σ correspond to greater spread. • Roughly, σ describes how far the random variable falls, on the average, from the mean of its distribution
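The slides describe σ only in words. As a hedged sketch, the block below uses the standard textbook definition σ = sqrt(Σ (x − µ)² P(x)), which is an assumption added here rather than a formula stated in the slides, and applies it to the home-run distribution from the earlier example. The variable names are illustrative.

```python
import math

# Home-run distribution from the earlier example: value x -> probability P(x)
home_runs = {0: 0.23, 1: 0.38, 2: 0.22, 3: 0.13, 4: 0.03, 5: 0.01}

# Mean of the distribution (weighted average of the values).
mu = sum(x * p for x, p in home_runs.items())

# Variance: probability-weighted average of squared distances from the mean.
variance = sum((x - mu) ** 2 * p for x, p in home_runs.items())

# Standard deviation: square root of the variance, in the same units as x.
sigma = math.sqrt(variance)

print(round(mu, 2), round(sigma, 2))
```

Under that definition, σ comes out a little above 1, so a typical outcome falls roughly one home run away from the mean of 1.38.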

Learning Objective 5:Continuous Random Variable • A continuous random variable has an infinite continuum of possible values in an interval. • Examples are: time, age and size measures such as height and weight. • Continuous variables are measured in a discrete manner because of rounding.

Learning Objective 5:Probability Distribution of a Continuous Random Variable • A continuous random variable has possible values that form an interval. • Its probability distribution is specified by a curve. • Each interval has probability between 0 and 1. • The interval containing all possible values has probability equal to 1.

Section 6.2: How Can We Find Probabilities for Bell-Shaped Distributions?

Learning Objectives • Normal Distribution • 68-95-99.7 Rule for normal distributions • Z-Scores and the Standard Normal Distribution • The Standard Normal Table: Finding Probabilities • Using the TI-calculator: find probabilities

Learning Objectives • Using the Standard Normal Table in Reverse • Using the TI-calculator: find z-scores • Probabilities for Normally Distributed Random Variables • Percentiles for Normally Distributed Random Variables • Using Z-scores to Compare Distributions

Learning Objective 1: Normal Distribution The normal distribution is symmetric, bell-shaped and characterized by its mean µ and standard deviation σ. • The normal distribution is the most important distribution in statistics • Many variables have an approximately normal distribution • It approximates many discrete distributions well when there are a large number of possible outcomes • Many statistical methods use it even when the data are not bell shaped

Learning Objective 1: Normal Distribution • Normal distributions are • Bell shaped • Symmetric around the mean • The mean (µ) and the standard deviation (σ) completely describe the density curve • Increasing/decreasing µ moves the curve along the horizontal axis • Increasing/decreasing σ controls the spread of the curve

Learning Objective 1:Normal Distribution • Within what interval do almost all of the men’s heights fall? Women’s height?

Learning Objective 2:68-95-99.7 Rule for Any Normal Curve • 68% of the observations fall within one standard deviation of the mean • 95% of the observations fall within two standard deviations of the mean • 99.7% of the observations fall within three standard deviations of the mean

Learning Objective 2: Example: 68-95-99.7% Rule • Heights of adult women • can be approximated by a normal distribution • µ = 65 inches; σ = 3.5 inches • 68-95-99.7 Rule for women’s heights • 68% are between 61.5 and 68.5 inches [ µ ± σ = 65 ± 3.5 ] • 95% are between 58 and 72 inches [ µ ± 2σ = 65 ± 2(3.5) = 65 ± 7 ] • 99.7% are between 54.5 and 75.5 inches [ µ ± 3σ = 65 ± 3(3.5) = 65 ± 10.5 ]

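Learning Objective 2: Example: 68-95-99.7% Rule • What proportion of women are less than 69 inches tall? • 69 inches is roughly one standard deviation above the mean (µ + σ = 68.5) • By the 68-95-99.7 rule, 68% of heights fall within one standard deviation of the mean, leaving 16% in each tail • About 16% + 68% = 84% of women are shorter than this height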
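The percentages in the 68-95-99.7 rule are rounded. A quick way to see the unrounded values is Python's standard-library `statistics.NormalDist`; this is a sketch outside the course materials, and the helper name `within` is hypothetical. It uses the women's height parameters from the example (mean 65, standard deviation 3.5).

```python
from statistics import NormalDist

# Women's heights from the example: mean 65 inches, standard deviation 3.5 inches.
heights = NormalDist(mu=65, sigma=3.5)


def within(dist, k):
    """Probability that a value falls within k standard deviations of the mean."""
    upper = dist.mean + k * dist.stdev
    lower = dist.mean - k * dist.stdev
    return dist.cdf(upper) - dist.cdf(lower)


for k in (1, 2, 3):
    print(k, round(within(heights, k), 4))

# Proportion of women shorter than one standard deviation above the mean (68.5 inches).
below_68_5 = heights.cdf(68.5)
print(round(below_68_5, 4))
```

The three probabilities come out close to 68%, 95%, and 99.7%, and the proportion below 68.5 inches is close to the 84% obtained from the rule.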

Learning Objective 3:Z-Scores and the Standard Normal Distribution • The z-score for a value x of a random variable is the number of standard deviations that x falls from the mean • A negative (positive) z-score indicates that the value is below (above) the mean • z-scores can be used to calculate the probabilities of a normal random variable using the normal tables in the back of the book

Learning Objective 3:Z-Scores and the Standard Normal Distribution • A standard normal distribution has mean µ=0 and standard deviation σ=1 • When a random variable has a normal distribution and its values are converted to z-scores by subtracting the mean and dividing by the standard deviation, the z-scores have the standard normal distribution.
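Standardizing, as described above, means subtracting the mean and dividing by the standard deviation: z = (x − µ)/σ. The following minimal sketch (the function name `z_score` is illustrative) applies this to the women's height example and checks that a probability computed on the original scale matches the one computed on the standard normal scale.

```python
from statistics import NormalDist


def z_score(x, mu, sigma):
    """Number of standard deviations that x falls from the mean."""
    return (x - mu) / sigma


# Women's heights: mean 65 inches, standard deviation 3.5 inches.
z_tall = z_score(68.5, 65, 3.5)   # one standard deviation above the mean
z_short = z_score(58, 65, 3.5)    # two standard deviations below the mean
print(z_tall, z_short)

# The probability below x on the original scale equals the probability
# below z on the standard normal scale (mean 0, standard deviation 1).
p_original = NormalDist(65, 3.5).cdf(68.5)
p_standard = NormalDist(0, 1).cdf(z_tall)
print(round(p_original, 4), round(p_standard, 4))
```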

Learning Objective 4:Table A: Standard Normal Probabilities Table A enables us to find normal probabilities • It tabulates the normal cumulative probabilities falling below the point +z To use the table: • Find the corresponding z-score • Look up the closest standardized score (z) in the table. • First column gives z to the first decimal place • First row gives the second decimal place of z • The corresponding probability found in the body of the table gives the probability of falling below the z-score

Learning Objective 4: Example: Using Table A • Find the probability that a normal random variable takes a value less than 1.43 standard deviations above µ: P(z<1.43)=.9236 • TI Calculator: normalcdf(-1E99,1.43,0,1)=.9236

Learning Objective 4: Example: Using Table A • Find the probability that a normal random variable takes a value greater than 1.43 standard deviations above µ: P(z>1.43)=1-.9236=.0764 • TI Calculator: normalcdf(1.43,1E99,0,1)=.0764

Learning Objective 4: Example • Find the probability that a normal random variable assumes a value within 1.43 standard deviations of µ • Probability below 1.43σ = .9236 • Probability below -1.43σ = .0764 (1-.9236) • P(-1.43<z<1.43) = .9236-.0764 = .8472 • TI Calculator: normalcdf(-1.43,1.43,0,1)=.8473 (the last digit differs because the table values are rounded)

Learning Objective 5: Using the TI Calculator To calculate the cumulative probability • 2nd DISTR; 2:normalcdf(lower bound, upper bound, mean, sd) • Use -1E99 for negative infinity and 1E99 for positive infinity

Learning Objective 5: Find Probabilities Using TI Calculator • Find probability to the left of -1.64 • P(z<-1.64)=normalcdf(-1E99,-1.64,0,1)=.0505 • Find probability to the right of 1.56 • P(z>1.56)=normalcdf(1.56,1E99,0,1)=.0594 • Find probability between -.50 and 2.25 • P(-.5<z<2.25)=normalcdf(-.5,2.25,0,1)=.6792
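For readers without a TI calculator, a rough Python counterpart can be built on the standard library's `statistics.NormalDist`. This is an assumption-laden sketch, not part of the course materials: the helper name `normalcdf` is defined here only to mirror the calculator's argument order (lower bound, upper bound, mean, sd).

```python
from statistics import NormalDist


def normalcdf(lower, upper, mean=0.0, sd=1.0):
    """Probability that a normal variable falls between lower and upper."""
    dist = NormalDist(mean, sd)
    return dist.cdf(upper) - dist.cdf(lower)


# As on the calculator, -1e99 and 1e99 stand in for negative and positive infinity.
p_left = normalcdf(-1e99, -1.64)      # area to the left of z = -1.64
p_right = normalcdf(1.56, 1e99)       # area to the right of z = 1.56
p_between = normalcdf(-0.5, 2.25)     # area between z = -0.50 and z = 2.25

print(round(p_left, 4), round(p_right, 4), round(p_between, 4))
```

Results may differ from Table A answers in the last decimal place, because the table entries are rounded before they are subtracted.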

Learning Objective 6:How Can We Find the Value of z for a Certain Cumulative Probability? • To solve some of our problems, we will need to find the value of z that corresponds to a certain normal cumulative probability • To do so, we use Table A in reverse • Rather than finding z using the first column (value of z up to one decimal) and the first row (second decimal of z) • Find the probability in the body of the table • The z-score is given by the corresponding values in the first column and row

Learning Objective 6:How Can We Find the Value of z for a Certain Cumulative Probability? • Example: Find the value of z for a cumulative probability of 0.025. • Look up the cumulative probability of 0.025 in the body of Table A. • A cumulative probability of 0.025 corresponds to z = -1.96. • Thus, the probability that a normal random variable falls at least 1.96 standard deviations below the mean is 0.025.

Learning Objective 6:How Can We Find the Value of z for a Certain Cumulative Probability? • Example: Find the value of z for a cumulative probability of 0.975. • Look up the cumulative probability of 0.975 in the body of Table A. • A cumulative probability of 0.975 corresponds to z = 1.96. • Thus, the probability that a normal random variable takes a value no more than 1.96 standard deviations above the mean is 0.975.

Learning Objective 7:Using the TI Calculator to Find Z-Scores for a Given Probability • 2nd DISTR 3:invNorm; Enter • invNorm(percentile,mean,sd) • Percentile is the probability under the curve from negative infinity to the z-score • Enter

Learning Objective 7: Examples • The probability that a standard normal random variable assumes a value that is ≤ z is 0.975. What is z? invNorm(.975,0,1)=1.96 • The probability that a standard normal random variable assumes a value that is > z is 0.025. What is z? invNorm(1-.025,0,1)=1.96 • The probability that a standard normal random variable assumes a value that is ≥ z is 0.881. What is z? invNorm(1-.881,0,1)=-1.18 • The probability that a standard normal random variable assumes a value that is < z is 0.119. What is z? invNorm(.119,0,1)=-1.18

Learning Objective 7: Example • Find the z-score z such that the probability within z standard deviations of the mean is 0.50. • invNorm(.75,0,1)=.67 • invNorm(.25,0,1)=-.67 • Probability = P(-.67<Z<.67)=.5
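Going from a cumulative probability back to a z-score can also be sketched in Python with `statistics.NormalDist.inv_cdf`. The wrapper name `inv_norm` is hypothetical and simply mirrors the calculator's argument order (area to the left, mean, sd).

```python
from statistics import NormalDist


def inv_norm(area_to_left, mean=0.0, sd=1.0):
    """Value below which the given proportion of a normal distribution falls."""
    return NormalDist(mean, sd).inv_cdf(area_to_left)


# Area to the left is given directly.
z_975 = inv_norm(0.975)

# Area to the right is given: convert it to an area to the left first.
z_right_881 = inv_norm(1 - 0.881)

# Middle 50%: 25% lies in each tail, so use the 25th and 75th percentiles.
z_low, z_high = inv_norm(0.25), inv_norm(0.75)

print(round(z_975, 2), round(z_right_881, 2), round(z_low, 2), round(z_high, 2))
```

The key step in each case is restating the problem as an area to the left of z, since that is what both Table A and the inverse function expect.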

Learning Objective 8:Finding Probabilities for Normally Distributed Random Variables • State the problem in terms of the observed random variable X, i.e., P(X<x) • Standardize X to restate the problem in terms of a standard normal variable Z • Draw a picture to show the desired probability under the standard normal curve • Find the area under the standard normal curve using Table A

Learning Objective 8: P(X<x) • Adult systolic blood pressure is normally distributed with µ = 120 and σ = 20. What percentage of adults have systolic blood pressure less than 100? • P(X<100) = P(Z < (100-120)/20) = P(Z<-1.00) = .1587 • normalcdf(-1E99,100,120,20)=.1587 • 15.9% of adults have systolic blood pressure less than 100

Learning Objective 8: P(X>x) • Adult systolic blood pressure is normally distributed with µ = 120 and σ = 20. What percentage of adults have systolic blood pressure greater than 100? • P(X>100) = 1 - P(X<100) • P(X>100) = 1-.1587 = .8413 • normalcdf(100,1E99,120,20)=.8413 • 84.1% of adults have systolic blood pressure greater than 100

Learning Objective 8: P(X>x) • Adult systolic blood pressure is normally distributed with µ = 120 and σ = 20. What percentage of adults have systolic blood pressure greater than 133? • P(X>133) = 1 - P(X<133) • P(X>133) = 1-.7422 = .2578 • normalcdf(133,1E99,120,20)=.2578 • 25.8% of adults have systolic blood pressure greater than 133

Learning Objective 8: P(a<X<b) • Adult systolic blood pressure is normally distributed with µ = 120 and σ = 20. What percentage of adults have systolic blood pressure between 100 and 133? • P(100<X<133) = P(X<133) - P(X<100) = .7422-.1587 = .5835 • normalcdf(100,133,120,20)=.5835 • 58% of adults have systolic blood pressure between 100 and 133
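The three kinds of blood pressure questions (below a value, above a value, between two values) all reduce to cumulative probabilities. A hedged Python sketch using the standard library, with variable names chosen here for illustration:

```python
from statistics import NormalDist

# Adult systolic blood pressure from the examples: mean 120, standard deviation 20.
blood_pressure = NormalDist(mu=120, sigma=20)

# P(X < 100): cumulative probability directly.
p_below_100 = blood_pressure.cdf(100)

# P(X > 100) and P(X > 133): one minus the cumulative probability.
p_above_100 = 1 - blood_pressure.cdf(100)
p_above_133 = 1 - blood_pressure.cdf(133)

# P(100 < X < 133): difference of two cumulative probabilities.
p_between = blood_pressure.cdf(133) - blood_pressure.cdf(100)

for label, p in [("below 100", p_below_100), ("above 100", p_above_100),
                 ("above 133", p_above_133), ("between 100 and 133", p_between)]:
    print(f"{label}: {p:.4f}")
```

Passing the mean and standard deviation to `NormalDist` does the standardizing step implicitly, much like supplying them as the last two arguments on the calculator.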

Learning Objective 9: Find X Value Given Area to Left • Adult systolic blood pressure is normally distributed with µ = 120 and σ = 20. What is the 1st quartile? • P(X<x)=.25, find x: • Look up .25 in the body of Table A to find z = -0.67 • Solve the equation z = (x - µ)/σ to find x: x = µ + zσ = 120 + (-0.67)(20) = 106.6 • Check: P(X<106.6) = P(Z<-0.67) = 0.25 • TI Calculator: invNorm(.25,120,20)=106.5 (slightly lower than the table answer because the table rounds z to -0.67)

Learning Objective 9: Find X Value Given Area to Right • Adult systolic blood pressure is normally distributed with µ = 120 and σ = 20. 10% of adults have systolic blood pressure above what level? • P(X>x)=.10, find x. • P(X>x) = 1 - P(X<x) • Look up 1-0.1=0.9 in the body of Table A to find z = 1.28 • Solve the equation z = (x - µ)/σ to find x: x = µ + zσ = 120 + 1.28(20) = 145.6 • Check: P(X>145.6) = P(Z>1.28) = 0.10 • TI Calculator: invNorm(.9,120,20)=145.6
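Both percentile problems follow the same pattern: find z for the area to the left, then convert back with x = µ + zσ. The sketch below, with illustrative variable names, does this two ways in Python: once with the rounded table z-scores and once with the unrounded inverse from `statistics.NormalDist`.

```python
from statistics import NormalDist

mu, sigma = 120, 20  # systolic blood pressure parameters from the examples
blood_pressure = NormalDist(mu, sigma)

# Table route: z rounded to two decimals, then x = mu + z * sigma.
q1_from_table = mu + (-0.67) * sigma
top10_from_table = mu + 1.28 * sigma

# Direct route: inverse cumulative probability on the original scale.
q1_direct = blood_pressure.inv_cdf(0.25)         # 25% of values fall below this
top10_direct = blood_pressure.inv_cdf(1 - 0.10)  # 10% of values fall above this

print(round(q1_from_table, 1), round(q1_direct, 1))
print(round(top10_from_table, 1), round(top10_direct, 1))
```

The two routes can disagree slightly in the first decimal place, which reflects only the rounding of z in the table.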

Learning Objective 10: Using Z-scores to Compare Distributions Z-scores can be used to compare observations from different normal distributions • Example: • You score 650 on the SAT, which has µ = 500 and σ = 100, and 30 on the ACT, which has µ = 21.0 and σ = 4.7. On which test did you perform better? • Compare z-scores • SAT: z = (650 - 500)/100 = 1.50 • ACT: z = (30 - 21.0)/4.7 = 1.91 • Since your z-score is greater for the ACT, you performed better on this exam
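The comparison can be written out as a small Python sketch. The function name `z_score` is illustrative; the scores, means, and standard deviations are those given in the example.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x falls from the mean."""
    return (x - mu) / sigma


# SAT: score 650, mean 500, standard deviation 100.
z_sat = z_score(650, 500, 100)

# ACT: score 30, mean 21.0, standard deviation 4.7.
z_act = z_score(30, 21.0, 4.7)

# The larger z-score is the better performance relative to other test takers.
better_test = "ACT" if z_act > z_sat else "SAT"

print(round(z_sat, 2), round(z_act, 2), better_test)
```

Because each z-score is measured in standard deviations of its own distribution, the two tests can be compared even though their raw scales are very different.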

Section 6.3: How Can We Find Probabilities When Each Observation Has Two Possible Outcomes?

Learning Objectives • The Binomial Distribution • Conditions for a Binomial Distribution • Probabilities for a Binomial Distribution • Factorials • Examples using Binomial Distribution • Do the Binomial Conditions Apply? • Mean and Standard Deviation of the Binomial Distribution • Normal Approximation to the Binomial

Learning Objective 1:The Binomial Distribution • Each observation is binary: it has one of two possible outcomes. • Examples: • Accept, or decline an offer from a bank for a credit card. • Have, or do not have, health insurance. • Vote yes or no on a referendum.

Learning Objective 2: Conditions for the Binomial Distribution • Each of n trials has two possible outcomes: “success” or “failure”. • Each trial has the same probability of success, denoted by p. • The n trials are independent. • The binomial random variable X is the number of successes in the n trials.

Learning Objective 3: Probabilities for a Binomial Distribution • Denote the probability of success on a trial by p. • For n independent trials, the probability of x successes equals: P(x) = [n! / (x!(n - x)!)] p^x (1 - p)^(n - x), for x = 0, 1, 2, …, n
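As a minimal sketch, the standard binomial probability formula can be evaluated with Python's `math.comb`, which counts the ways to choose which x of the n trials are successes. The function name `binomial_probability` is illustrative, and the example assumes a fair coin (p = 0.5) for the three-flip setting used earlier in the chapter; the slides do not state the coin's probability.

```python
from math import comb


def binomial_probability(x, n, p):
    """Probability of exactly x successes in n independent trials with success probability p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


# Three coin flips, X = number of heads, assuming a fair coin (p = 0.5).
n, p = 3, 0.5
distribution = {x: binomial_probability(x, n, p) for x in range(n + 1)}

for x, prob in distribution.items():
    print(x, prob)

# The probabilities over all possible values of x should sum to 1.
total = sum(distribution.values())
print(total)
```

Under the fair-coin assumption, the probability of exactly two heads in three flips is 0.375, and the four probabilities sum to 1 as any probability distribution must.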
