Chapter 6: Probability Distributions




## PowerPoint Slideshow about ' Chapter 6: Probability Distributions' - nicholas-sherman

Presentation Transcript

### Chapter 6: Probability Distributions

Section 6.1: How Can We Summarize Possible Outcomes and Their Probabilities?

Learning Objectives
• Random variable
• Probability distributions for discrete random variables
• Mean of a probability distribution
• Summarizing the spread of a probability distribution
• Probability distribution for continuous random variables
Learning Objective 1: Randomness
• The numerical values that a variable assumes are the result of some random phenomenon:
• Selecting a random sample from a population

or

• Performing a randomized experiment
Learning Objective 1:Random Variable
• A random variable is a numerical measurement of the outcome of a random phenomenon.
Learning Objective 1:Random Variable
• Use letters near the end of the alphabet, such as x, to symbolize
• Variables
• A particular value of the random variable
• Use a capital letter, such as X, to refer to the random variable itself.

Example: Flip a coin three times

• X=number of heads in the 3 flips; defines the random variable
• x=2; represents a possible value of the random variable
Learning Objective 2:Probability Distribution
• The probability distribution of a random variable specifies its possible values and their probabilities.

Note: It is the randomness of the variable that allows us to specify probabilities for the outcomes

Learning Objective 2: Probability Distribution of a Discrete Random Variable
• A discrete random variable X has separate values (such as 0, 1, 2, …) as its possible outcomes
• Its probability distribution assigns a probability P(x) to each possible value x:
• For each x, the probability P(x) falls between 0 and 1
• The sum of the probabilities for all the possible x values equals 1
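Learning Objective 2: Example
• The probability distribution of X = number of home runs assigns P(0)=0.23, P(1)=0.38, P(2)=0.22, P(3)=0.13, P(4)=0.03, P(5)=0.01.
• What is the estimated probability of at least three home runs?

P(3)+P(4)+P(5)=0.13+0.03+0.01=0.17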

• The mean of a probability distribution for a discrete random variable is

$$\mu = \sum x \, P(x)$$

where the sum is taken over all possible values of x.

• The mean of a probability distribution is denoted by the parameter, µ.
• The mean is a weighted average; values of x that are more likely receive greater weight P(x)
Learning Objective 3: Expected Value of X
• The mean of a probability distribution of a random variable X is also called the expected value of X.
• The expected value reflects not what we’ll observe in a single observation, but rather what we expect for the average in a long run of observations.
• It is not unusual for the expected value of a random variable to equal a number that is NOT a possible outcome.
Learning Objective 3:Example
• Find the mean of this probability distribution.

The mean:

$$\mu = \sum x \, P(x) = 0(0.23) + 1(0.38) + 2(0.22) + 3(0.13) + 4(0.03) + 5(0.01) = 1.38$$
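The weighted-average calculation above can be checked with a few lines of Python. This is a minimal sketch: the dictionary `home_runs` holds the probabilities used in the worked mean, and the helper names (`is_valid_distribution`, `mean`, `prob_at_least`) are illustrative choices, not part of the slides.

```python
# Probabilities taken from the worked mean calculation above.
home_runs = {0: 0.23, 1: 0.38, 2: 0.22, 3: 0.13, 4: 0.03, 5: 0.01}


def is_valid_distribution(dist, tol=1e-9):
    """Check that each P(x) is in [0, 1] and that the probabilities sum to 1."""
    in_range = all(0.0 <= p <= 1.0 for p in dist.values())
    return in_range and abs(sum(dist.values()) - 1.0) < tol


def mean(dist):
    """Weighted average of the values: the sum of x * P(x)."""
    return sum(x * p for x, p in dist.items())


def prob_at_least(dist, k):
    """P(X >= k): add up P(x) for every value x that is at least k."""
    return sum(p for x, p in dist.items() if x >= k)


print(is_valid_distribution(home_runs))        # True
print(round(mean(home_runs), 2))               # 1.38
print(round(prob_at_least(home_runs, 3), 2))   # 0.17
```

Note that the mean, 1.38, is not itself a possible number of home runs, which illustrates the earlier point about expected values.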

The standard deviation of a probability distribution, denoted by the parameter, σ, measures its spread.

• Larger values of σ correspond to greater spread.
• Roughly, σ describes how far the random variable falls, on the average, from the mean of its distribution
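The slides do not show a formula for σ. As a hedged illustration, the sketch below uses the standard definition for a discrete random variable, σ = √(Σ (x − µ)² P(x)), applied to the home-run probabilities from the mean example. The function names are assumptions made for this example.

```python
import math

# Probabilities from the home-run mean example.
home_runs = {0: 0.23, 1: 0.38, 2: 0.22, 3: 0.13, 4: 0.03, 5: 0.01}


def mean(dist):
    """mu = sum of x * P(x)."""
    return sum(x * p for x, p in dist.items())


def std_dev(dist):
    """sigma = sqrt of the probability-weighted squared distance from mu."""
    mu = mean(dist)
    variance = sum((x - mu) ** 2 * p for x, p in dist.items())
    return math.sqrt(variance)


print(round(std_dev(home_runs), 2))  # 1.12
```

A value of about 1.12 says that the number of home runs typically falls roughly one home run away from the mean of 1.38.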
Learning Objective 5:Continuous Random Variable
• A continuous random variable has an infinite continuum of possible values in an interval.
• Examples are: time, age and size measures such as height and weight.
• Continuous variables are measured in a discrete manner because of rounding.
Learning Objective 5:Probability Distribution of a Continuous Random Variable
• A continuous random variable has possible values that form an interval.
• Its probability distribution is specified by a curve.
• Each interval has probability between 0 and 1.
• The interval containing all possible values has probability equal to 1.

### Chapter 6: Probability Distributions

Section 6.2: How Can We Find Probabilities for Bell-Shaped Distributions?

Learning Objectives
• Normal Distribution
• 68-95-99.7 Rule for normal distributions
• Z-Scores and the Standard Normal Distribution
• The Standard Normal Table: Finding Probabilities
• Using the TI-calculator: find probabilities
Learning Objectives
• Using the Standard Normal Table in Reverse
• Using the TI-calculator: find z-scores
• Probabilities for Normally Distributed Random Variables
• Percentiles for Normally Distributed Random Variables
• Using Z-scores to Compare Distributions
Learning Objective 1: Normal Distribution

The normal distribution is symmetric, bell-shaped and characterized by its mean µ and standard deviation σ.

• The normal distribution is the most important distribution in statistics
• Many distributions have an approximate normal distribution
• Approximates many discrete distributions well when there are a large number of possible outcomes
• Many statistical methods use it even when the data are not bell shaped
Learning Objective 1: Normal Distribution
• Normal distributions are
• Bell shaped
• Symmetric around the mean
• The mean (µ) and the standard deviation (σ) completely describe the density curve
• Increasing/decreasing µ moves the curve along the horizontal axis
• Increasing/decreasing σ controls the spread of the curve
Learning Objective 1:Normal Distribution
• Within what interval do almost all of the men’s heights fall? Women’s height?
Learning Objective 2:68-95-99.7 Rule for Any Normal Curve
• 68% of the observations fall within one standard deviation of the mean
• 95% of the observations fall within two standard deviations of the mean
• 99.7% of the observations fall within three standard deviations of the mean
Learning Objective 2: Example: 68-95-99.7 Rule
• Women’s heights can be approximated by a normal distribution
• µ = 65 inches; σ = 3.5 inches
• 68-95-99.7 Rule for women’s heights
• 68% are between 61.5 and 68.5 inches

[ µ ± σ = 65 ± 3.5 ]

• 95% are between 58 and 72 inches

[ µ ± 2σ = 65 ± 2(3.5) = 65 ± 7 ]

• 99.7% are between 54.5 and 75.5 inches

[ µ ± 3σ = 65 ± 3(3.5) = 65 ± 10.5 ]
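The three intervals for women's heights can be reproduced with Python's standard-library `statistics.NormalDist`. This is a small sketch assuming the slide's values µ = 65 and σ = 3.5; the helper `prob_within` is a name chosen here for illustration.

```python
from statistics import NormalDist

# Women's heights, as given on the slide: mean 65 inches, sd 3.5 inches.
heights = NormalDist(mu=65, sigma=3.5)


def prob_within(dist, k):
    """Probability of falling within k standard deviations of the mean."""
    lower = dist.mean - k * dist.stdev
    upper = dist.mean + k * dist.stdev
    return dist.cdf(upper) - dist.cdf(lower)


for k in (1, 2, 3):
    lower = heights.mean - k * heights.stdev
    upper = heights.mean + k * heights.stdev
    print(f"{lower} to {upper} inches: {prob_within(heights, k):.4f}")
```

The printed probabilities are close to 0.68, 0.95 and 0.997, which is where the rule's name comes from. The same proportions hold for any normal distribution, whatever its mean and standard deviation.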

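Learning Objective 2: Example: 68-95-99.7 Rule
• What proportion of women are less than 69 inches tall?
• By the 68-95-99.7 rule, 68% of heights fall within one standard deviation of the mean (between 61.5 and 68.5 inches), which leaves 16% in each tail.
• So about 16% + 68% = 84% of women are shorter than 68.5 inches, which is approximately the answer for 69 inches.

[Figure: normal curve of women's heights with the central 68% between z = -1 and z = +1, 16% in the lower tail, and 84% of the area below 68.5 inches]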
• The z-score for a value x of a random variable is the number of standard deviations that x falls from the mean
• A negative (positive) z-score indicates that the value is below (above) the mean
• z-scores can be used to calculate the probabilities of a normal random variable using the normal tables in the back of the book
• A standard normal distribution has mean µ=0 and standard deviation σ=1
• When a random variable has a normal distribution and its values are converted to z-scores by subtracting the mean and dividing by the standard deviation, the z-scores have the standard normal distribution.
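Standardizing can be written as z = (x − µ)/σ. The sketch below applies it to the earlier question about women shorter than 69 inches, assuming µ = 65 and σ = 3.5 from the heights example. `z_score` is an illustrative helper name, and `NormalDist()` with no arguments is the standard normal distribution.

```python
from statistics import NormalDist


def z_score(x, mu, sigma):
    """Number of standard deviations that x falls from the mean."""
    return (x - mu) / sigma


z = z_score(69, mu=65, sigma=3.5)

# Probability below z on the standard normal curve (mean 0, sd 1).
p_standard = NormalDist().cdf(z)

# The same probability computed directly on the original scale.
p_original = NormalDist(mu=65, sigma=3.5).cdf(69)

print(round(z, 2))            # 1.14
print(round(p_standard, 3))
print(round(p_original, 3))
```

Both probabilities agree, which is the point of the last bullet: converting to z-scores moves the problem onto the standard normal curve without changing the answer. The result is a little above the 84% that the 68-95-99.7 rule gives for 68.5 inches.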
Learning Objective 4:Table A: Standard Normal Probabilities

Table A enables us to find normal probabilities

• It tabulates the normal cumulative probabilities falling below the point +z

To use the table:

• Find the corresponding z-score
• Look up the closest standardized score (z) in the table.
• First column gives z to the first decimal place
• First row gives the second decimal place of z
• The corresponding probability found in the body of the table gives the probability of falling below the z-score
Learning Objective 4: Example: Using Table A
• Find the probability that a normal random variable takes a value less than 1.43 standard deviations above µ: P(z<1.43)=.9236

TI Calculator: normalcdf(-1E99,1.43,0,1)=.9236

Learning Objective 4: Example: Using Table A
• Find the probability that a normal random variable takes a value greater than 1.43 standard deviations above µ: P(z>1.43)=1-.9236=.0764

TI Calculator: normalcdf(1.43,1E99,0,1)=.0764

Learning Objective 4: Example
• Find the probability that a normal random variable assumes a value within 1.43 standard deviations of µ
• Probability below µ + 1.43σ = .9236
• Probability below µ - 1.43σ = 1-.9236 = .0764
• P(-1.43<z<1.43) = .9236-.0764 = .8472

TI Calculator: normalcdf(-1.43,1.43,0,1)=.8472

Learning Objective 5: Using the TI Calculator

To calculate the cumulative probability

• 2nd DISTR; 2:normalcdf(lower bound, upper bound, mean, sd)
• Use -1E99 for negative infinity and 1E99 for positive infinity
Learning Objective 5: Find Probabilities Using TI Calculator
• Find probability to the left of -1.64
• P(z<-1.64)=normalcdf(-1E99,-1.64,0,1)=.0505
• Find probability to the right of 1.56
• P(z>1.56)=normalcdf(1.56,1E99,0,1)=.0594
• Find probability between -.50 and 2.25
• P(-.5<z<2.25)=normalcdf(-.5,2.25,0,1)=.6793
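For readers without a TI calculator, the same areas can be computed in Python. The function below is a hypothetical stand-in that mimics the calculator's `normalcdf(lower, upper, mean, sd)` argument order using the standard library's `statistics.NormalDist`. It is a sketch, not the calculator's implementation.

```python
from statistics import NormalDist

# Stand-ins for infinity, mirroring the calculator convention of +/-1E99.
NEG_INF = -1e99
POS_INF = 1e99


def normalcdf(lower, upper, mean=0.0, sd=1.0):
    """Area under the normal curve between lower and upper."""
    dist = NormalDist(mean, sd)
    return dist.cdf(upper) - dist.cdf(lower)


print(normalcdf(NEG_INF, 1.43))    # area to the left of z = 1.43
print(normalcdf(1.43, POS_INF))    # area to the right of z = 1.43
print(normalcdf(-1.43, 1.43))      # area within 1.43 sd of the mean
print(normalcdf(NEG_INF, -1.64))   # area to the left of z = -1.64
print(normalcdf(1.56, POS_INF))    # area to the right of z = 1.56
print(normalcdf(-0.5, 2.25))       # area between z = -0.5 and z = 2.25
```

The results agree with the values on the slides to about three decimal places. Small differences in the fourth decimal can appear because the slide values come from subtracting rounded table entries.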
Learning Objective 6: How Can We Find the Value of z for a Certain Cumulative Probability?
• To solve some of our problems, we will need to find the value of z that corresponds to a certain normal cumulative probability
• To do so, we use Table A in reverse
• Rather than starting from z in the first column (value of z to one decimal place) and the first row (second decimal place of z), start from the probability
• Find the probability in the body of the table
• The z-score is given by the corresponding values in the first column and row
Learning Objective 6: How Can We Find the Value of z for a Certain Cumulative Probability?
• Example: Find the value of z for a cumulative probability of 0.025.
• Look up the cumulative probability of 0.025 in the body of Table A.
• A cumulative probability of 0.025 corresponds to z = -1.96.
• Thus, the probability that a normal random variable falls at least 1.96 standard deviations below the mean is 0.025.

Learning Objective 6: How Can We Find the Value of z for a Certain Cumulative Probability?
• Example: Find the value of z for a cumulative probability of 0.975.
• Look up the cumulative probability of 0.975 in the body of Table A.
• A cumulative probability of 0.975 corresponds to z = 1.96.
• Thus, the probability that a normal random variable takes a value no more than 1.96 standard deviations above the mean is 0.975.

Learning Objective 7:Using the TI Calculator to Find Z-Scores for a Given Probability
• 2nd DISTR 3:invNorm; Enter
• invNorm(percentile,mean,sd)
• Percentile is the probability under the curve from negative infinity to the z-score
• Enter
Learning Objective 7: Examples
• The probability that a standard normal random variable assumes a value that is ≤ z is 0.975. What is z? invNorm(.975,0,1)=1.96
• The probability that a standard normal random variable assumes a value that is > z is 0.025. What is z? invNorm(1-.025,0,1)=invNorm(.975,0,1)=1.96
• The probability that a standard normal random variable assumes a value that is ≥ z is 0.881. What is z? invNorm(1-.881,0,1)=-1.18
• The probability that a standard normal random variable assumes a value that is < z is 0.119. What is z? invNorm(.119,0,1)=-1.18

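Learning Objective 7: Example
• Find the z-score z such that the probability within z standard deviations of the mean is 0.50.
• A central probability of 0.50 leaves 0.25 in each tail, so the cumulative probabilities at -z and z are .25 and .75
• invNorm(.75,0,1)= .67
• invNorm(.25,0,1)= -.67
• Probability = P(-.67<Z<.67)=.5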
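The reverse lookup can also be sketched in Python with `NormalDist.inv_cdf`, which takes the area to the left and returns the matching value. The wrappers `inv_norm` and `central_z` are illustrative names, with `inv_norm` following the calculator's `invNorm(area, mean, sd)` argument order.

```python
from statistics import NormalDist


def inv_norm(area_left, mean=0.0, sd=1.0):
    """Value with the given cumulative probability (area to its left)."""
    return NormalDist(mean, sd).inv_cdf(area_left)


def central_z(central_area):
    """z such that P(-z < Z < z) equals central_area."""
    # Half of the leftover area sits in each tail, so the area to the
    # left of +z is central_area plus one tail.
    tail = (1.0 - central_area) / 2.0
    return inv_norm(central_area + tail)


print(round(inv_norm(0.975), 2))      # 1.96
print(round(inv_norm(0.025), 2))      # -1.96
print(round(inv_norm(1 - 0.881), 2))  # -1.18
print(round(central_z(0.50), 2))      # 0.67
```

When a problem gives the area to the right of z, convert it to an area to the left (one minus the given area) before calling the inverse, as in the third example.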
Learning Objective 8:Finding Probabilities for Normally Distributed Random Variables
• State the problem in terms of the observed random variable X, i.e., P(X<x)
• Standardize X to restate the problem in terms of a standard normal variable Z
• Draw a picture to show the desired probability under the standard normal curve
• Find the area under the standard normal curve using Table A
Learning Objective 8: P(X<x)
• Adult systolic blood pressure is normally distributed with µ = 120 and σ = 20. What percentage of adults have systolic blood pressure less than 100?
• P(X<100) = P(Z<(100-120)/20) = P(Z<-1.00) = .1587
• normalcdf(-1E99,100,120,20)=.1587
• 15.9% of adults have systolic blood pressure less than 100
Learning Objective 8: P(X>x)
• Adult systolic blood pressure is normally distributed with µ = 120 and σ = 20. What percentage of adults have systolic blood pressure greater than 100?
• P(X>100) = 1 - P(X<100)
• P(X>100) = 1-.1587 = .8413
• normalcdf(100,1E99,120,20)=.8413
• 84.1% of adults have systolic blood pressure greater than 100
Learning Objective 8: P(X>x)
• Adult systolic blood pressure is normally distributed with µ = 120 and σ = 20. What percentage of adults have systolic blood pressure greater than 133?
• P(X>133) = 1 - P(X<133)
• P(X>133) = 1-.7422 = .2578
• normalcdf(133,1E99,120,20)=.2578
• 25.8% of adults have systolic blood pressure greater than 133
Learning Objective 8: P(a<X<b)
• Adult systolic blood pressure is normally distributed with µ = 120 and σ = 20. What percentage of adults have systolic blood pressure between 100 and 133?
• P(100<X<133) = P(X<133)-P(X<100) = .7422-.1587 = .5835
• normalcdf(100,133,120,20)=.5835
• About 58% of adults have systolic blood pressure between 100 and 133
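The three blood-pressure calculations can be reproduced directly on the original scale. The sketch below assumes the slide's model, µ = 120 and σ = 20, and uses `statistics.NormalDist` in place of the calculator.

```python
from statistics import NormalDist

# Adult systolic blood pressure as modeled on the slides.
bp = NormalDist(mu=120, sigma=20)

p_below_100 = bp.cdf(100)                 # P(X < 100)
p_above_100 = 1 - bp.cdf(100)             # P(X > 100), by the complement
p_above_133 = 1 - bp.cdf(133)             # P(X > 133)
p_between = bp.cdf(133) - bp.cdf(100)     # P(100 < X < 133)

print(round(p_below_100, 4))  # 0.1587
print(round(p_above_100, 4))  # 0.8413
print(round(p_above_133, 4))  # 0.2578
print(round(p_between, 4))    # 0.5835
```

Every case reduces to cumulative probabilities: a right-tail area is one minus the area to the left, and an interval is the difference of two left-tail areas.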
Learning Objective 9: Find X Value Given Area to Left
• Adult systolic blood pressure is normally distributed with µ = 120 and σ = 20. What is the 1st quartile?
• P(X<x)=.25, find x:
• Look up .25 in the body of Table A to find z = -0.67
• Solve the equation z = (x - µ)/σ for x:

$$x = \mu + z\sigma = 120 + (-0.67)(20) = 106.6$$

• Check:
• P(X<106.6) = P(Z<-0.67) = 0.25
• TI Calculator: invNorm(.25,120,20) ≈ 106.5 (the calculator uses the unrounded z ≈ -0.6745)
Learning Objective 9: Find X Value Given Area to Right
• Adult systolic blood pressure is normally distributed with µ = 120 and σ = 20. 10% of adults have systolic blood pressure above what level?
• P(X>x)=.10, find x.
• P(X>x) = 1-P(X<x)
• Look up 1-0.1 = 0.9 in the body of Table A to find z = 1.28
• Solve the equation z = (x - µ)/σ for x:

$$x = \mu + z\sigma = 120 + (1.28)(20) = 145.6$$

• Check:
• P(X>145.6) = P(Z>1.28) = 0.10
• TI Calculator: invNorm(.9,120,20)=145.6
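Both percentile questions follow the same recipe: convert the stated area to an area to the left, find z, then compute x = µ + zσ. The sketch below shows that route next to the direct `inv_cdf` call, assuming µ = 120 and σ = 20 from the slides. The function name `value_from_left_area` is illustrative.

```python
from statistics import NormalDist

MU, SIGMA = 120, 20
bp = NormalDist(mu=MU, sigma=SIGMA)


def value_from_left_area(area_left, mu, sigma):
    """Find z on the standard normal curve, then unstandardize it."""
    z = NormalDist().inv_cdf(area_left)
    return mu + z * sigma


# First quartile: 25% of the area lies to the left.
q1 = value_from_left_area(0.25, MU, SIGMA)

# Top 10%: an area of 0.10 to the right means 0.90 to the left.
top10 = value_from_left_area(1 - 0.10, MU, SIGMA)

print(round(q1, 1))     # 106.5
print(round(top10, 1))  # 145.6

# The direct call on the original scale gives the same answers.
print(round(bp.inv_cdf(0.25), 1), round(bp.inv_cdf(0.90), 1))
```

The first quartile prints as 106.5 rather than 106.6 because the code uses the unrounded z of about -0.6745, while the table lookup rounds z to -0.67.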
Learning Objective 10: Using Z-scores to Compare Distributions

Z-scores can be used to compare observations from different normal distributions

• Example:
• You score 650 on the SAT, which has µ=500 and σ=100, and 30 on the ACT, which has µ=21.0 and σ=4.7. On which test did you perform better?
• Compare z-scores

$$\text{SAT: } z = \frac{650 - 500}{100} = 1.50 \qquad \text{ACT: } z = \frac{30 - 21.0}{4.7} = 1.91$$

• Since your z-score is greater for the ACT, you performed better on this exam
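The comparison can be checked numerically. This sketch uses the means and standard deviations stated in the example (SAT: 500 and 100, ACT: 21.0 and 4.7) and, as an extra step not on the slide, converts each z-score to a percentile under the normal model.

```python
from statistics import NormalDist


def z_score(x, mu, sigma):
    """Number of standard deviations that x falls from the mean."""
    return (x - mu) / sigma


z_sat = z_score(650, mu=500, sigma=100)
z_act = z_score(30, mu=21.0, sigma=4.7)

print(round(z_sat, 2))  # 1.5
print(round(z_act, 2))  # 1.91

# Proportion of test takers scoring below you on each test.
standard_normal = NormalDist()
print(round(standard_normal.cdf(z_sat), 3))
print(round(standard_normal.cdf(z_act), 3))

better = "ACT" if z_act > z_sat else "SAT"
print(better)  # ACT
```

Because z-scores are unit-free, the two results can be compared directly even though the tests use very different scales.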

### Chapter 6: Probability Distributions

Section 6.3: How Can We Find Probabilities When Each Observation Has Two Possible Outcomes?

Learning Objectives
• The Binomial Distribution
• Conditions for a Binomial Distribution
• Probabilities for a Binomial Distribution
• Factorials
• Examples using Binomial Distribution
• Do the Binomial Conditions Apply?
• Mean and Standard Deviation of the Binomial Distribution
• Normal Approximation to the Binomial
Learning Objective 1:The Binomial Distribution
• Each observation is binary: it has one of two possible outcomes.
• Examples:
• Accept, or decline an offer from a bank for a credit card.
• Have, or do not have, health insurance.
• Vote yes or no on a referendum.
Learning Objective 2: Conditions for the Binomial Distribution
• Each of n trials has two possible outcomes: “success” or “failure”.
• Each trial has the same probability of success, denoted by p.
• The n trials are independent.
• The binomial random variable X is the number of successes in the n trials.
Learning Objective 3: Probabilities for a Binomial Distribution
• Denote the probability of success on a trial by p.
• For n independent trials, the probability of x successes equals:

$$P(x) = \frac{n!}{x!\,(n-x)!}\, p^x (1-p)^{n-x}, \qquad x = 0, 1, 2, \ldots, n$$
Learning Objective 4:Factorials

Rules for factorials:

• n!=n*(n-1)*(n-2)…2*1
• 1!=1
• 0!=1

For example,

• 4!=4*3*2*1=24
Learning Objective 5:Example: Finding Binomial Probabilities
• John Doe claims to possess ESP.
• An experiment is conducted:
• A person in one room picks one of the integers 1, 2, 3, 4, 5 at random.
• In another room, John Doe identifies the number he believes was picked.
• Three trials are performed for the experiment.
• Doe got the correct answer twice.
Learning Objective 5:Example 1

If John Doe does not actually have ESP and is actually guessing the number, what is the probability that he’d make a correct guess on two of the three trials?

• The three ways John Doe could make two correct guesses in three trials are: SSF, SFS, and FSS.
• Each of these has probability: (0.2)²(0.8)=0.032.
• The total probability of two correct guesses is 3(0.032)=0.096.
Learning Objective 5:Example 1
• The probability of exactly 2 correct guesses is the binomial probability with n = 3 trials, x = 2 correct guesses and p = 0.2 probability of a correct guess.

TI Calculator: 2nd VARS (DISTR)

0:binompdf(n,p,x)

binompdf(3,.2,2)=0.096

Learning Objective 5:Binomial Example 2
• 1000 employees, 50% female
• None of the 10 employees chosen for management training were female.
• If the 10 employees were chosen at random, the probability that no females are chosen is:
• binompdf(10,.5,0)=9.765625E-4, or about 0.001
• It is very unlikely (about one chance in a thousand) that none of the 10 selected for management training would be female if the employees were chosen randomly
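Both binomial examples can be verified with the standard binomial formula, P(x) = [n! / (x!(n − x)!)] pˣ(1 − p)ⁿ⁻ˣ. The sketch below is a plain Python version. `binom_pmf` is an illustrative name that follows the calculator's `binompdf(n, p, x)` argument order, and `math.comb` supplies the factorial term.

```python
from math import comb


def binom_pmf(n, p, x):
    """Probability of exactly x successes in n independent trials."""
    # comb(n, x) counts the orderings of x successes among n trials;
    # each such ordering has probability p**x * (1 - p)**(n - x).
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# ESP example: 3 trials, chance 0.2 of a correct guess, 2 correct.
print(comb(3, 2))                      # 3 orderings: SSF, SFS, FSS
print(round(binom_pmf(3, 0.2, 2), 3))  # 0.096

# Management training example: 10 picks, p = 0.5, no females chosen.
print(binom_pmf(10, 0.5, 0))           # 0.0009765625

# Sanity check: the probabilities over x = 0..n add up to 1.
print(round(sum(binom_pmf(10, 0.5, x) for x in range(11)), 10))  # 1.0
```

The first result matches the count-the-orderings argument from the ESP example, three orderings each with probability 0.032.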
Learning Objective 6:Do the Binomial Conditions Apply?
• Before using the binomial distribution, check that its three conditions apply:
• Binary data (success or failure).
• The same probability of success for each trial (denoted by p).
• Independent trials.
• The data are binary (male, female).
• If employees are selected randomly, the probability of selecting a female on a given trial is 0.50.
• With random sampling of 10 employees from a large population, the outcome of one trial does not depend on the outcome of another trial
Learning Objective 7: Binomial Mean and Standard Deviation
• The binomial probability distribution for n trials with probability p of success on each trial has mean µ and standard deviation σ given by:

$$\mu = np, \qquad \sigma = \sqrt{np(1-p)}$$
Learning Objective 7: Example: Racial Profiling?
• Data:
• 262 police car stops in Philadelphia in 1997.
• 207 of the drivers stopped were African-American.
• In 1997, Philadelphia’s population was 42.2% African-American.
• Does the number of African-Americans stopped suggest possible bias, being higher than we would expect (other things being equal, such as the rate of violating traffic laws)?
Learning Objective 7:Example: Racial Profiling?
• Assume:
• 262 car stops represent n = 262 trials.
• Successive police car stops are independent.
• P(driver is African-American) is p = 0.422.
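• Calculate the mean and standard deviation of this binomial distribution:

$$\mu = np = 262(0.422) \approx 111, \qquad \sigma = \sqrt{np(1-p)} = \sqrt{262(0.422)(0.578)} \approx 8$$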
Learning Objective 7: Example: Racial Profiling?
• Recall: Empirical Rule
• When a distribution is bell-shaped, close to 100% of the observations fall within 3 standard deviations of the mean.
Learning Objective 7:Example: Racial Profiling?
• If there is no racial profiling, we would not be surprised if between about 87 and 135 of the 262 drivers stopped were African-American.
• The actual number stopped (207) is well above these values.
• The number of African-Americans stopped is too high, even taking into account random variation.
• Limitation of the analysis:
• Different people do different amounts of driving, so we don’t really know that 42.2% of the potential stops were African-American.
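The numbers in this example can be reproduced from the binomial mean and standard deviation, µ = np and σ = √(np(1 − p)). The sketch below uses the example's values (n = 262, p = 0.422, 207 observed). The helper name `binom_mean_sd` and the final z-score step are additions made here for illustration.

```python
import math


def binom_mean_sd(n, p):
    """Mean and standard deviation of a binomial distribution."""
    return n * p, math.sqrt(n * p * (1 - p))


n, p, observed = 262, 0.422, 207
mu, sigma = binom_mean_sd(n, p)

# Interval within 3 standard deviations of the mean.
lower, upper = mu - 3 * sigma, mu + 3 * sigma

# How many standard deviations above the mean the observed count falls.
z = (observed - mu) / sigma

print(round(mu, 1), round(sigma, 1))   # 110.6 8.0
print(round(lower), round(upper))      # 87 135
print(round(z, 1))
```

The observed count sits far more than 3 standard deviations above the mean, which is why random variation alone is not a plausible explanation under the stated assumptions.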
Learning Objective 8:Approximating the Binomial Distribution with the Normal Distribution
• The binomial distribution can be well approximated by the normal distribution when the expected number of successes, np, and the expected number of failures, n(1-p) are both at least 15.
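As a rough illustration of this guideline, the sketch below checks the "at least 15" condition and compares an exact binomial cumulative probability with its normal approximation. The cutoff of 120 stops is an arbitrary value chosen for the comparison, the helper names are hypothetical, and no continuity correction is applied.

```python
from math import comb, sqrt
from statistics import NormalDist


def normal_approx_ok(n, p, threshold=15):
    """True when expected successes and failures are both at least 15."""
    return n * p >= threshold and n * (1 - p) >= threshold


def binom_cdf(n, p, k):
    """Exact P(X <= k) for a binomial random variable."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k + 1))


n, p = 262, 0.422   # values from the police car stops example
print(normal_approx_ok(n, p))      # True
print(normal_approx_ok(3, 0.2))    # False: only 0.6 expected successes
print(normal_approx_ok(10, 0.5))   # False: only 5 expected successes

# Normal distribution with the same mean and standard deviation.
approx_dist = NormalDist(n * p, sqrt(n * p * (1 - p)))

k = 120  # arbitrary cutoff for the comparison
exact = binom_cdf(n, p, k)
approx = approx_dist.cdf(k)
print(round(exact, 3), round(approx, 3))
```

For the car-stop example the two probabilities are close, while the two small-sample examples (n = 3 and n = 10) fail the check and should be handled with the exact binomial formula.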