
Chapter 7



Presentation Transcript


  1. Chapter 7 Random Variables and Discrete Probability Distributions

  2. Random Variables… • A random variable is a function or rule that assigns a number to each outcome of an experiment. Basically it is just a symbol that represents the outcome of an experiment. • X = number of heads when the experiment is flipping a coin 20 times. • C = the daily change in a stock price. • R = the number of miles per gallon you get on your auto during a family vacation. • Y = the amount of medication in a blood pressure pill. • V = the speed of an auto registered on a radar detector used on I-20

  3. Two Types of Random Variables… • Discrete Random Variable – usually count data [number of …] • * one that takes on a countable number of values – this means you can sit down and list all possible outcomes without missing any, although it might take you an infinite amount of time. • X = sum of the values on a roll of two dice: X has to be 2, 3, 4, …, or 12. • Y = number of accidents on the UTA campus during a week: Y has to be 0, 1, 2, 3, …, up to some “real big number”. • Continuous Random Variable – usually measurement data [time, weight, distance, etc.] • * one that takes on an uncountable number of values – this means you can never list all possible outcomes even if you had an infinite amount of time. • X = time it takes you to drive home from class: X > 0; it might be 30.1 minutes measured to the nearest tenth, but in reality the actual time might be 30.10000001… minutes. • Exercise: try to list all possible numbers between 0 and 1.

  4. Probability Distributions… • A probability distribution (density function) is a table, formula, or graph that describes the values of a random variable and the probability associated with these values. • – Discrete Probability Distribution, (this chapter) • X = outcome of rolling one die • – Continuous Probability Distribution (Chapter 8)

  5. Discrete Probability Notation… • An upper-case letter will represent the name of the random variable, usually X. • Its lower-case counterpart, x, will represent the value of the random variable. • The probability that the random variable X will equal x is: • P(X = x) or more simply P(x) • X = number of heads in 10 flips of coin • P(X = 5) = P(5) = probability of 5 heads (x) in 10 flips

  6. Discrete Probability Distributions… • Probabilities, P(x), associated with discrete random variables have the following properties: • 0 ≤ P(x) ≤ 1 for every value x • Σ P(x) = 1, where the sum runs over all possible values x

  7. Developing Discrete Probability Distributions • Probability distributions can be estimated from relative frequencies. Consider the discrete (countable) number of televisions per household (X) from US survey data (Example 7.1)… • Each probability is a relative frequency: a count divided by the total, e.g. 1,218 ÷ 101,501 = 0.012 • e.g. P(X = 4) = P(4) = 0.076 = 7.6%

  8. Questions you might want answered • E.g. what is the probability that there is at least one television but no more than three in any given household? • P(1 ≤ X ≤ 3) = P(1) + P(2) + P(3) = .319 + .374 + .191 = .884

  9. Developing Discrete Probability Distributions • Techniques covered in the Probability Chapter can be used to develop probability distributions. For example, a mutual fund salesperson knows that there is a 20% chance of closing a sale on each call she makes. • What is the probability distribution of the number of sales if she plans to call three customers? • Random Variable = X = # Sales Made in 3 Attempts • Let S denote the event of closing a sale: P(S) = .20 • Thus SC (the complement of S) is not closing a sale, and P(SC) = .80 • It seems reasonable to assume that the sales are independent.

  10. Sample Space: List of all possible outcomes (in call order; SC = no sale)
      • S S S : (.2)*(.2)*(.2) = 0.008, so P(3) = .008
      • S S SC : (.2)*(.2)*(.8) = 0.032
      • S SC S : (.2)*(.8)*(.2) = 0.032
      • SC S S : (.8)*(.2)*(.2) = 0.032, so P(2) = .032 + .032 + .032 = .096 (Additive Law)
      • S SC SC : (.2)*(.8)*(.8) = 0.128
      • SC S SC : (.8)*(.2)*(.8) = 0.128
      • SC SC S : (.8)*(.8)*(.2) = 0.128, so P(1) = .128 + .128 + .128 = .384 (Additive Law)
      • SC SC SC : (.8)*(.8)*(.8) = 0.512, so P(0) = .512
      • NOTE: with subscripts for the call number, P(S1 S2 S3) = P(S1) * P(S2 | S1) * P(S3 | S1 S2) (“Mult. Rule”) = P(S1) * P(S2) * P(S3) (because the calls are independent) = (.2)*(.2)*(.2) = 0.008

  11. Another Approach: Tree Diagram • Developing a Probability Distribution… • [Tree diagram: three levels (Sales Call 1, Sales Call 2, Sales Call 3); every branch splits into S with P(S)=.2 and SC with P(SC)=.8. A path with two sales, probability (.2)(.2)(.8) = .032, is highlighted to illustrate P(X=2).]
      x    P(x)
      3    .2^3 = .008
      2    3(.032) = .096
      1    3(.128) = .384
      0    .8^3 = .512
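
The sample-space listing and the tree diagram can also be reproduced by brute-force enumeration. The following is a minimal Python sketch, not part of the original presentation; the function name `sales_distribution` and the constants are illustrative assumptions, while p = .2 and the three independent calls come from the example.

```python
from itertools import product

P_SALE = 0.2   # P(S): chance of closing a sale on one call (from the example)
N_CALLS = 3    # number of customers called


def sales_distribution(p=P_SALE, n=N_CALLS):
    """Return {x: P(X = x)} by enumerating every sequence of S / SC outcomes."""
    dist = {x: 0.0 for x in range(n + 1)}
    for outcome in product([True, False], repeat=n):  # True = sale, False = no sale
        prob = 1.0
        for sale in outcome:
            prob *= p if sale else (1 - p)  # multiplication rule, assuming independence
        dist[sum(outcome)] += prob          # addition rule: outcomes are mutually exclusive
    return dist


distribution = sales_distribution()
for x in sorted(distribution, reverse=True):
    print(x, round(distribution[x], 3))
```

The loop visits all 2^3 = 8 branches of the tree, so the printed values should match the x / P(x) table: .008, .096, .384, and .512.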

  12. Final Discrete Probability Distribution • The mean of a discrete random variable is the weighted average of all of its values. The weights are the probabilities. This parameter is also called the expected value of X and is represented by E(X): $\mu = E(X) = \sum_x x\,P(x)$ • The variance is $\sigma^2 = V(X) = \sum_x (x-\mu)^2\,P(x)$ • The standard deviation is $\sigma = \sqrt{\sigma^2}$

  13. Computing Mean, Variance, and Std. Dev. for Discrete Random Variable (X = # sales in 3 calls, with P(0) = .512, P(1) = .384, P(2) = .096, P(3) = .008) • Mean = 0*(.512) + 1*(.384) + 2*(.096) + 3*(.008) • = 0.6 • Variance = (0-0.6)²*(.512) + (1-0.6)²*(.384) • + (2-0.6)²*(.096) + (3-0.6)²*(.008) • = .1843 + .0614 + .1882 + .0461 = .48 • Std. Dev. = SQRT(.48) = .693 • We are as smart as the goddess of statistics now, since we know the true mean, variance, and standard deviation of the population.
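
As a cross-check on the arithmetic, the same three quantities can be computed directly from the definitions. This is a small Python sketch under the assumption that the distribution is the one from the sales-call example (P(0) = .512, P(1) = .384, P(2) = .096, P(3) = .008); the helper names `mean` and `variance` are illustrative.

```python
from math import sqrt

# Distribution of X = number of sales in 3 calls (from the sales-call example).
dist = {0: 0.512, 1: 0.384, 2: 0.096, 3: 0.008}


def mean(dist):
    """E(X): each value weighted by its probability."""
    return sum(x * p for x, p in dist.items())


def variance(dist):
    """V(X): probability-weighted squared deviations from the mean."""
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())


mu = mean(dist)
var = variance(dist)
sd = sqrt(var)
print(round(mu, 3), round(var, 3), round(sd, 3))  # 0.6 0.48 0.693
```

Note that the mean of 0.6 equals 3 × .2, which is what one would expect for three calls with a 20% chance of a sale on each.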

  14. Laws of Expected Value… “Useful to know” • 1. E(c) = c • * The expected value of a constant (c) is just the value of the constant. • 2. E(X + c) = E(X) + c • * The expected value of a random variable plus a constant is the expected value of the random variable plus the constant. • 3. E(cX) = cE(X) • * The expected value of a constant times a random variable is the constant times the expected value of the random variable.

  15. Laws of Expected Value… “Useful to know” • E(c1X1 + c2X2 + c3X3 + c4X4 + c5X5) • = c1E(X1) + c2E(X2) + c3E(X3) + c4E(X4) + c5E(X5) • Example: what is the expected weight of a surgical pack containing 5 components? [Maybe we could weigh the pack to determine whether one of the components is missing.] • True whether or not the random variables are independent; independence is needed only for the corresponding law of variance.

  16. Laws of Variance… • 1. V(c) = 0 • * The variance of a constant (c) is zero. • 2. V(X + c) = V(X) • * The variance of a random variable plus a constant is just the variance of the random variable. • 3. V(cX) = c²V(X) • * The variance of a constant times a random variable is the constant squared times the variance of the random variable.

  17. Example: You weigh all 30,000 students • Random Variable: X = student’s weight • Mean(X) = X-Bar = 160 lbs • Variance(X) = s² = 900 lbs² • StdDev(X) = s = 30 lbs • ************************************* • You now discover that the scales reported each student’s weight 5 lbs too heavy. The students’ real weights (Y) should have been Y = X – 5. What are the mean and variance of the students’ REAL weights? • Mean(Y) = Mean(X) – 5 = 160 – 5 = 155 lbs • Variance(Y) = Variance(X) = 900 lbs² • StdDev(Y) = SQRT(900) = 30 lbs

  18. Example: You measure the height of all 30,000 students • Random Variable: X = student’s height in “Feet” • Mean(X) = X-Bar = 5.8 feet • Variance(X) = s² = 0.09 feet² • StdDev(X) = s = 0.3 feet • ************************************* • You now discover that the President wanted the students’ heights in “Inches” and not “Feet”. The students’ heights in “Inches” (Y) should have been Y = 12*X. What are the mean and variance of the students’ heights in inches? • Mean(Y) = 12*Mean(X) = 12*5.8 = 69.6 inches • Variance(Y) = 12²*Variance(X) = 144*(.09) = 12.96 inches² • StdDev(Y) = SQRT(12.96) = 3.6 inches
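
The shift and scale laws used in the weight and height examples can be checked numerically on any small discrete distribution. The sketch below is an illustration only: the distribution `x_dist` and the helper `transform` are assumptions chosen for the demonstration, not data from the slides.

```python
def mean(dist):
    return sum(x * p for x, p in dist.items())


def variance(dist):
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())


def transform(dist, c=1.0, shift=0.0):
    """Distribution of Y = c*X + shift (c must be nonzero so values stay distinct)."""
    return {c * x + shift: p for x, p in dist.items()}


# Illustrative distribution (an assumption, not taken from the weight/height data).
x_dist = {0: 0.512, 1: 0.384, 2: 0.096, 3: 0.008}

shifted = transform(x_dist, shift=-5)  # Y = X - 5, as in the weight example
scaled = transform(x_dist, c=12)       # Y = 12*X, as in the feet-to-inches example

print(mean(shifted), mean(x_dist) - 5)               # E(X + c) = E(X) + c
print(variance(shifted), variance(x_dist))           # V(X + c) = V(X)
print(mean(scaled), 12 * mean(x_dist))               # E(cX) = c E(X)
print(variance(scaled), 12 ** 2 * variance(x_dist))  # V(cX) = c^2 V(X)
```

Each printed pair should agree up to floating-point rounding, which is the numerical counterpart of the laws stated above.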

  19. Laws… • We can derive laws of expected value and variance for the sum of two independent random variables as follows… • E(X + Y) = E(X) + E(Y) • V(X + Y) = V(X) + V(Y) • ************************************************************** • X = weight of right shoes: Mean(X) = .5 lbs and Var(X) = .0004 lbs² • Y = weight of left shoes: Mean(Y) = .5 lbs and Var(Y) = .0004 lbs² • ************************************************************** • What are the mean and variance of the weight of a “Pair” of shoes, P = X + Y? • E(P) = E(X + Y) = E(X) + E(Y) = .5 + .5 = 1.0 lbs • V(P) = V(X + Y) = V(X) + V(Y) = .0004 + .0004 = .0008 lbs² • NOTE: THIS ASSUMES THE WEIGHTS OF THE RIGHT AND LEFT SHOES ARE INDEPENDENT • *************************************************************** • Question: how could you determine the mean and variance of the weight of an automobile after you make all the parts but before you assemble the automobile?
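
To see the sum laws at work, one can build the distribution of X + Y from two independent discrete variables and compare its mean and variance with the sums of the individual ones. The sketch below uses two fair dice as a stand-in (an assumption for illustration; the shoe weights on the slide are not given as a full distribution), and the helper name `sum_of_independent` is hypothetical.

```python
from itertools import product


def mean(dist):
    return sum(x * p for x, p in dist.items())


def variance(dist):
    mu = mean(dist)
    return sum((x - mu) ** 2 * p for x, p in dist.items())


def sum_of_independent(dist_x, dist_y):
    """Distribution of X + Y when X and Y are independent."""
    total = {}
    for (x, px), (y, py) in product(dist_x.items(), dist_y.items()):
        # Independence lets us multiply the two marginal probabilities.
        total[x + y] = total.get(x + y, 0.0) + px * py
    return total


die = {face: 1 / 6 for face in range(1, 7)}  # one fair die (illustrative assumption)
two_dice = sum_of_independent(die, die)

print(mean(two_dice), mean(die) + mean(die))              # E(X + Y) = E(X) + E(Y)
print(variance(two_dice), variance(die) + variance(die))  # V(X + Y) = V(X) + V(Y)
```

The multiplication `px * py` is the step that relies on independence; without it the variance comparison would generally fail, while the comparison of means would still hold.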

  20. Binomial Distribution… 2 parameters [n and p] • The binomial distribution is the probability distribution that results from doing a “binomial experiment”. Binomial experiments have the following properties: • Fixed number of trials, represented as n. • Each trial has two possible outcomes, a “success” and a “failure”. • P(success)=p (and thus: P(failure)=1–p), for all trials. • The trials are independent, which means that the outcome of one trial does not affect the outcomes of any other trials.

  21. Success and Failure… • …are just labels for a binomial experiment; there is no value judgment implied. You may define either one of the 2 possible outcomes as “Success”. • For example, a coin flip will result in either heads or tails. If we define “heads” as success then necessarily “tails” is considered a failure (inasmuch as we are attempting to have the coin land heads up). • Other potential examples of binomial random variables: • A firecracker pops or fails to pop • A patient gets an infection during an operation or does not get an infection

  22. Binomial Random Variable… • The random variable of a binomial experiment is defined as the number of successes, X, in the n trials, where the probability of success on a single trial is p. • E.g. flip a fair coin 10 times… • 1) Fixed number of trials n=10 • 2) Each trial has two possible outcomes  {heads (success), tails (failure)} • 3) P(success)= 0.50; P(failure)=1–0.50 = 0.50  • 4) The trials are independent (i.e. the outcome of heads on the first flip will have no impact on subsequent coin flips). • Hence flipping a coin ten times is a binomial experiment since all conditions were met.

  23. Binomial Distribution [formula] • The binomial random variable (# of successes in n trials) can take on values 0, 1, 2, …, n. Thus, it’s a discrete random variable. • Once we know a random variable is binomial, we can calculate the probability associated with each value of the random variable from the binomial distribution: • $P(x) = \frac{n!}{x!\,(n-x)!}\,p^{x}(1-p)^{n-x}$ for x = 0, 1, 2, …, n, where x = # successes and n − x = # failures
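
The binomial formula translates almost directly into code. The following is a minimal Python sketch, assuming the standard form P(x) = C(n, x) p^x (1 − p)^(n − x); the function name `binomial_pmf` is illustrative, and `math.comb` supplies the n!/(x!(n − x)!) term.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for a binomial random variable with n trials and success probability p."""
    if not 0 <= x <= n:
        return 0.0  # values outside 0..n are impossible
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Three sales calls with a 20% chance of success on each call.
for x in range(4):
    print(x, round(binomial_pmf(x, 3, 0.2), 3))
```

With n = 3 and p = .2 this should give .512, .384, .096, and .008, the same distribution obtained by listing the sample space for the sales-call example.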

  24. Ways to Calculate Binomial Probabilities • Use the binomial distribution formula [not a good approach unless n is fairly small] • Use the binomial tables at the back of most stat books [not real good unless your specific value of “n” and “p” happen to be included in the tables] • Approximate the binomial probabilities from some other distributional form (normal) [no need to do this now that we have access to various statistical software that will do it for us] • Use Excel stat function “=BINOMDIST(x,n,p,false)” which will return the individual probability. Replace false with true and you will get the sum of the binomial probabilities from 0 up to x.

  25. Problem: Pat Statsdud… • Pat Statsdud failed to study for the next stat exam. Pat’s exam strategy is to rely on luck for the next quiz. The quiz consists of 10 multiple-choice questions (n=10). Each question has five possible answers, only one of which is correct (p=0.2). Pat plans to guess the answer to each question. • What is the probability that Pat gets no answers correct? • P(X=0) = P(0) = • What is the probability that Pat gets two answers correct? • P(X=2) = P(2) =

  26. Pat Statsdud… • n=10, and P(success) = .20 • What is the probability that Pat gets no answers correct? • I.e. # success, x, = 0; hence we want to know P(x=0) • $P(0) = \frac{10!}{0!\,10!}(.2)^{0}(.8)^{10} = .1074$ • Pat has about an 11% chance of getting no answers correct using the guessing strategy.

  27. Pat Statsdud… • n=10, and P(success) = .20 • What is the probability that Pat gets two answers correct? • I.e. # success, x, = 2; hence we want to know P(x=2) • $P(2) = \frac{10!}{2!\,8!}(.2)^{2}(.8)^{8} = 45(.04)(.1678) = .3020$ • Pat has about a 30% chance of getting exactly two answers correct using the guessing strategy.

  28. Cumulative Probability… • “Find the probability that Pat fails the quiz” • If a grade on the quiz is less than 50% (i.e. fewer than 5 questions correct out of 10), that’s considered a failed quiz. • P(fail quiz) = P(X ≤ 4) = P(0)+P(1)+P(2)+P(3)+P(4) • This is called a cumulative probability, that is, P(X ≤ x) • Note: Calculating all these individual probabilities would be tedious and time consuming; however, the binomial tables at the back of the book give you the cumulative probabilities [n=10, p=0.2, x=4]

  29. Pat Statsdud… • Calculate Individual Probabilities and Add Up! • P(X ≤ 4) = P(0) + P(1) + P(2) + P(3) + P(4) • We already know P(0) = .1074 and P(2) = .3020. Using the binomial formula to calculate the others: • P(1) = .2684, P(3) = .2013, and P(4) = .0881 • Hence P(X ≤ 4) = .1074 + .2684 + … + .0881 = .9672 • OR • Use binomial tables at back of book for n=10, p=0.2, and x=4 “Next Slide”

  30. Binomial Table… • “What is the probability that Pat fails the quiz”? • i.e. what is P(X ≤ 4), given P(success) = .20 and n=10 ? P(X ≤ 4) = .967

  31. Binomial Table… • “What is the probability that Pat gets no answers correct?” • i.e. what is P(X = 0), given P(success) = .20 and n=10 ? P(X = 0) = P(X ≤ 0) = .107

  32. Binomial Table… • “What is the probability that Pat gets two answers correct?” • i.e. what is P(X = 2), given P(success) = .20 and n=10 ? P(X = 2) = P(X≤2) – P(X≤1) = .678 – .376 = .302 remember, the table shows cumulative probabilities…

  33. =BINOMDIST() Excel Function… • There is a binomial distribution function in Excel that can also be used to calculate these probabilities. For example: • What is the probability that Pat gets two answers correct? • =BINOMDIST(2, 10, 0.2, FALSE), where the arguments are # successes, # trials, P(success), and cumulative (TRUE: cumulative prob.; FALSE: individual prob.) • P(X=2) = .3020

  34. =BINOMDIST() Excel Function… • There is a binomial distribution function in Excel that can also be used to calculate these probabilities. For example: • What is the probability that Pat fails the quiz? • =BINOMDIST(4, 10, 0.2, TRUE), where the arguments are # successes, # trials, P(success), and cumulative (i.e. P(X≤x)?) • P(X≤4) = .9672
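
For readers without Excel, the two Pat Statsdud answers can be reproduced in Python. The function `binom_dist` below is a hypothetical stand-in written for this illustration; it only mimics the argument order of BINOMDIST(x, n, p, cumulative) and is not an Excel or library API.

```python
from math import comb


def binom_dist(x, n, p, cumulative):
    """Hypothetical stand-in mirroring the arguments of Excel's BINOMDIST."""
    def pmf(k):
        return comb(n, k) * p**k * (1 - p) ** (n - k)

    if cumulative:
        return sum(pmf(k) for k in range(x + 1))  # P(X <= x)
    return pmf(x)                                 # P(X = x)


p_two = binom_dist(2, 10, 0.2, False)   # exactly two correct answers
p_fail = binom_dist(4, 10, 0.2, True)   # four or fewer correct answers
print(round(p_two, 4), round(p_fail, 4))  # 0.302 0.9672
```

The cumulative branch simply adds the individual probabilities from 0 up to x, which is the "calculate individual probabilities and add up" approach done by the machine.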

  35. Binomial Distribution… • As you might expect, statisticians have determined formulas for the mean, variance, and standard deviation of a binomial random variable. They are: $\mu = np$, $\sigma^2 = np(1-p)$, and $\sigma = \sqrt{np(1-p)}$ • Previous example: n=10, p=0.2 • μ = n*p = 10*0.2 = 2 • σ² = n*p*(1-p) = 10*0.2*0.8 = 1.6 • σ = SQRT(1.6) = 1.26
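
These shortcut formulas agree with the general definitions of mean and variance for a discrete distribution. The following Python sketch, written for illustration with n = 10 and p = 0.2 from the example, computes both from the full binomial distribution and compares them with n*p and n*p*(1-p).

```python
from math import comb, sqrt

n, p = 10, 0.2  # Pat Statsdud's quiz: 10 questions, 20% chance per guess

# Full distribution: P(X = x) for x = 0, 1, ..., n.
pmf = {x: comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)}

# Mean and variance from the definitions (probability-weighted sums).
mu = sum(x * prob for x, prob in pmf.items())
var = sum((x - mu) ** 2 * prob for x, prob in pmf.items())

print(round(mu, 4), n * p)              # both 2.0
print(round(var, 4), n * p * (1 - p))   # both approximately 1.6
print(round(sqrt(var), 2))              # 1.26
```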

  36. Poisson Distribution… 1 parameter [μ] • Named for Simeon Poisson, the Poisson distribution is a discrete probability distribution and refers to the number of events (a.k.a. successes) within a specific time period or region of space. For example: • The number of cars arriving at a service station in 1 hour. (The interval of time is 1 hour.) • The number of flaws in a bolt of cloth. (The specific region is a bolt of cloth.) • The number of accidents in 1 day on a particular stretch of highway. (The interval is defined by both time, 1 day, and space, the particular stretch of highway.)

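  37. Poisson Probability Distribution… • The probability that a Poisson random variable assumes a value of x is given by: $P(x) = \frac{e^{-\mu}\mu^{x}}{x!}$ for x = 0, 1, 2, … • Note: μ, the mean number of events in the interval, is the only parameter [tell me μ and I can calculate the probabilities] • and e is the base of the natural logarithm (approximately 2.71828). • FYI: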
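
The Poisson formula is short enough to code directly. This is a minimal Python sketch assuming the standard form P(x) = e^(−μ) μ^x / x!; the function name `poisson_pmf` is illustrative, and the value μ = 1.5 is the typo rate used in the textbook example that follows.

```python
from math import exp, factorial


def poisson_pmf(x, mu):
    """P(X = x) for a Poisson random variable with mean mu."""
    if x < 0:
        return 0.0  # counts cannot be negative
    return exp(-mu) * mu**x / factorial(x)


# Probability of zero typos in 100 pages when the mean is 1.5 typos per 100 pages.
print(round(poisson_pmf(0, 1.5), 4))  # 0.2231
```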

  38. Example 7.12… • The number of typographical errors in new editions of textbooks varies considerably from book to book. After some analysis, an instructor concludes that the number of errors is Poisson distributed with a mean of 1.5 typos per 100 pages. The instructor randomly selects 100 pages of a new book. What is the probability that there are no typos? • That is, what is P(X=0) given that μ = 1.5? • $P(0) = \frac{e^{-1.5}(1.5)^{0}}{0!} = e^{-1.5} = .2231$ • “There is about a 22% chance of finding zero errors”

  39. Poisson Distribution… • As mentioned on the Poisson experiment slide: • The probability of a success is proportional to the size of the interval • Thus, knowing an error rate of 1.5 typos per 100 pages, we can determine a mean value for a 400 page book as: • μ = 1.5(4) = 6 typos / 400 pages.

  40. Example 7.13… • For a 400 page book, what is the probability that there are no typos? • $P(X=0) = \frac{e^{-6}(6)^{0}}{0!} = e^{-6} = .0025$ • “there is a very small chance there are no typos”

  41. Example 7.13… • For a 400 page book, what is the probability that there are five or fewer typos? • P(X≤5) = P(0) + P(1) + … + P(5) • This is rather tedious to solve manually. A better alternative is to refer to Table 2 in Appendix B… • …k=5, μ=6, and P(X ≤ k) = .446 • “there is about a 45% chance there are 5 or fewer typos”
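
The cumulative Poisson probability can also be obtained by summing the individual terms in code instead of looking them up in a table. The sketch below is an illustration under the example's assumptions (1.5 typos per 100 pages, scaled to a 400 page book); the name `poisson_cdf` is hypothetical.

```python
from math import exp, factorial


def poisson_cdf(k, mu):
    """P(X <= k) for a Poisson random variable with mean mu."""
    return sum(exp(-mu) * mu**x / factorial(x) for x in range(k + 1))


mu_400 = 1.5 * 4  # mean scales with the interval: 6 typos per 400 pages

print(round(poisson_cdf(0, mu_400), 4))  # 0.0025  (no typos)
print(round(poisson_cdf(5, mu_400), 3))  # 0.446   (five or fewer typos)
```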

  42. Example 7.13… • …Excel is an even better alternative:

  43. Poisson Practice • The number of infections [X] in a hospital each week has been shown to follow a Poisson distribution with a mean of 3.0 infections per week. Calculate the following probabilities. • P(X = 0) = • P(X < 4) = • P(X > 9) = • If you found 9 infections next week, what would you say?
