
STAT 111 Introductory Statistics

STAT 111 Introductory Statistics. Lecture 7: More on Random Variables, Probability, and Sampling May 27, 2004. Today’s Topics. Finishing up mean and variance Conditional Probability Multiplication Rule and Independence Tree Diagrams Bayes’s Rule Sampling distributions


Presentation Transcript


  1. STAT 111 Introductory Statistics Lecture 7: More on Random Variables, Probability, and Sampling May 27, 2004

  2. Today’s Topics • Finishing up mean and variance • Conditional Probability • Multiplication Rule and Independence • Tree Diagrams • Bayes’s Rule • Sampling distributions • Binomial distribution for sample counts

  3. Recall: Rules for Means • Let X and Y be (not necessarily independent) random variables and let a and b be constants. • Rule 1 E( a + bX ) = a + b E(X) • Rule 2 E( a X + b Y ) = a E(X) + b E(Y) • If X and Y are independent, then E(XY) = E(X) E(Y)

  4. Recall: Rules for Variances • Let X and Y be random variables, and let a and b again be constants. • Rule 1 Var(a + bX) = b² Var(X) • Rule 2 If X and Y are independent, then Var(X ± Y) = Var(X) + Var(Y) • Rule 3 If X and Y have correlation ρ, then Var(X ± Y) = Var(X) + Var(Y) ± 2ρ σ_X σ_Y

  5. Example: Pick 3 Ticket • Suppose you buy a $1 Pick 3 ticket on each of two different days. The payoffs X and Y on the two tickets are independent. Let X + Y be the total payoff. Calculate • Expected value of the total payoff • Variance of the total payoff • Standard deviation of the total payoff

  6. Example: Heights of Women • The height of young women between 18 and 24 in America is approximately normally distributed with mean µ = 64.5 and s.d. σ = 2.5. • Two women are randomly chosen from this age group. • What are the mean and s.d. of the difference in their heights? • What is the probability that one is at least 5” taller than the other? • What is the IQR of heights in this age group?
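
The height questions can be checked numerically. The sketch below assumes the two heights are independent draws from the stated normal distribution, so their difference is also normal, with mean 0 and variance 2.5² + 2.5². It uses only Python's standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

MU, SIGMA = 64.5, 2.5  # mean and s.d. of height in inches, from the slide

# Difference D = X - Y of two independent heights:
# the means subtract and the variances add.
mean_diff = MU - MU
sd_diff = sqrt(SIGMA**2 + SIGMA**2)
diff = NormalDist(mean_diff, sd_diff)

# P(|D| >= 5): one woman is at least 5 inches taller than the other.
p_gap = 2 * (1 - diff.cdf(5))

# IQR of a single height: 75th percentile minus 25th percentile.
height = NormalDist(MU, SIGMA)
iqr = height.inv_cdf(0.75) - height.inv_cdf(0.25)

print(f"mean = {mean_diff}, sd = {sd_diff:.3f}")
print(f"P(|D| >= 5) = {p_gap:.4f}")
print(f"IQR = {iqr:.2f} inches")
```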

  7. Conditional Probability • The probability of an event can change if we know some other event has occurred. • The conditional probability of an event gives us the probability of one event under the condition that we know the outcome of another event. • Let A and B be any two events such that P(B) > 0. The conditional probability of A assuming that B has already occurred is written P(A | B): P(A | B) = P(A and B) / P(B)

  8. Example: Rolling Dice • Let A be the event that a 4 appears on a single roll of a fair 6-sided die, and let B be the event that an even number appears. • Find P(A | B) and P(B | A). • Suppose we add another (different-colored so we can distinguish between the two) die to the mix, and let C be the event that the sum of the two dice is greater than 8. • Find P(A | C) and P(C | A).
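
With equally likely outcomes, a conditional probability is a ratio of counts, so the dice questions can be answered by enumeration. The sketch below is one way to do this. It assumes that, once the second die is added, A means "the first die shows a 4"; the slide does not say which die A refers to. The helper `cond_prob` is a hypothetical name.

```python
from fractions import Fraction
from itertools import product

def cond_prob(event, given, outcomes):
    """P(event | given) when all outcomes are equally likely."""
    given_outcomes = [o for o in outcomes if given(o)]
    both = [o for o in given_outcomes if event(o)]
    return Fraction(len(both), len(given_outcomes))

# One die: A = "roll a 4", B = "roll an even number".
one_die = range(1, 7)
is_four = lambda r: r == 4
is_even = lambda r: r % 2 == 0
p_a_given_b = cond_prob(is_four, is_even, one_die)
p_b_given_a = cond_prob(is_even, is_four, one_die)

# Two dice: assume A = "first die shows 4", C = "sum is greater than 8".
two_dice = list(product(range(1, 7), repeat=2))
first_is_four = lambda d: d[0] == 4
sum_over_8 = lambda d: d[0] + d[1] > 8
p_a_given_c = cond_prob(first_is_four, sum_over_8, two_dice)
p_c_given_a = cond_prob(sum_over_8, first_is_four, two_dice)

print(p_a_given_b, p_b_given_a, p_a_given_c, p_c_given_a)
```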

  9. Example: Gender of Children • Suppose we have a family with two children. Assume all four possible outcomes ({older boy, younger boy},…) are equally likely. What is the probability that both are girls given that at least one is a girl? • Suppose instead that we ignored the age of the children and distinguished only three family types. How would this change the above probability?
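
A short enumeration makes the two-children question concrete. The sketch below lists the four ordered outcomes, then repeats the calculation with three unordered family types. The weights 1/4, 1/2, 1/4 for those types follow from the four equally likely ordered outcomes; treating the three types as equally likely is shown only as a contrast.

```python
from fractions import Fraction
from itertools import product

# Ordered outcomes (older, younger), each with probability 1/4.
families = list(product("BG", repeat=2))
at_least_one_girl = [f for f in families if "G" in f]
both_girls = [f for f in at_least_one_girl if f == ("G", "G")]
p_ordered = Fraction(len(both_girls), len(at_least_one_girl))

# Three unordered family types, weighted by how many ordered outcomes each covers.
types = {
    "two boys": Fraction(1, 4),
    "one of each": Fraction(1, 2),
    "two girls": Fraction(1, 4),
}
p_some_girl = types["one of each"] + types["two girls"]
p_unordered = types["two girls"] / p_some_girl

# Contrast: wrongly treating the three types as equally likely.
p_naive = Fraction(1, 3) / (Fraction(1, 3) + Fraction(1, 3))

print(p_ordered, p_unordered, p_naive)
```

Under these weights, ignoring birth order does not change the answer; only the mistaken equal-weight assumption does.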

  10. Example: Drawing Cards • Draw 2 cards off the top of a well-shuffled deck. • What is the probability that the second card is an Ace, given that the first card was an Ace? • On the other hand, consider only the first card for a minute. Suppose you do not see what the card is, and your friend tells you the card is a King. What is the probability that the card is a diamond?
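
Both card questions can be checked by brute-force counting over a standard 52-card deck. This is a minimal sketch; the rank and suit labels are illustrative.

```python
from fractions import Fraction
from itertools import permutations, product

ranks = ["A"] + [str(n) for n in range(2, 11)] + ["J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs

# All ordered pairs of distinct cards: (first card, second card).
draws = list(permutations(deck, 2))
first_ace = [d for d in draws if d[0][0] == "A"]
both_aces = [d for d in first_ace if d[1][0] == "A"]
p_second_ace_given_first = Fraction(len(both_aces), len(first_ace))

# Single card: compare P(diamond | King) with the unconditional P(diamond).
kings = [c for c in deck if c[0] == "K"]
p_diamond_given_king = Fraction(
    sum(c[1] == "diamonds" for c in kings), len(kings)
)
p_diamond = Fraction(sum(c[1] == "diamonds" for c in deck), len(deck))

print(p_second_ace_given_first, p_diamond_given_king, p_diamond)
```

Knowing the card is a King leaves the probability of a diamond unchanged, whereas knowing the first card was an Ace does change the probability for the second card.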

  11. Multiplication Rule • The probability that both event A and event B occur is given by P(A and B) = P(A) P(B | A) = P(B) P(A | B) • Here, P(A | B) and P(B | A) have the usual meaning of being conditional probabilities.

  12. Example: Home Security • House security experts estimate that an untrained house dog has a 70% probability of detecting an intruder – and, given detection, a 50% chance of scaring the intruder away. • What is the probability that Fido successfully thwarts a burglar? (The probability of a trained watchdog detecting and running off an intruder is estimated to be around 0.75)

  13. Example: Drawing Chips from an Urn • An urn contains 5 white chips and 4 blue chips. Two chips are drawn sequentially and without replacement. What is the probability of obtaining the sequence (W, B)? • The multiplication rule can be extended to higher-order intersections. For example, suppose we throw 3 red chips and 5 yellow chips into our urn. Five chips are drawn sequentially and without replacement. What is the probability of obtaining the sequence (W, R, W, B, Y)?
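
The extended multiplication rule amounts to multiplying one conditional probability per draw, updating the urn after each draw. The sketch below does this with exact fractions; `sequence_probability` is a hypothetical helper name.

```python
from fractions import Fraction

def sequence_probability(urn, sequence):
    """P(drawing `sequence` in order, without replacement)."""
    counts = dict(urn)  # copy so the caller's urn is not modified
    prob = Fraction(1)
    for color in sequence:
        total = sum(counts.values())
        prob *= Fraction(counts[color], total)  # P(this color | earlier draws)
        counts[color] -= 1                      # chip is not replaced
    return prob

# 5 white and 4 blue chips: sequence (W, B).
p_wb = sequence_probability({"W": 5, "B": 4}, "WB")

# After adding 3 red and 5 yellow chips: sequence (W, R, W, B, Y).
big_urn = {"W": 5, "B": 4, "R": 3, "Y": 5}
p_wrwby = sequence_probability(big_urn, "WRWBY")

print(p_wb, p_wrwby)
```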

  14. Independence • Recall that two events A and B are independent if knowing one occurs does not change the probability that the other occurs. • When two events are independent, we have that P(B | A) = P(B) and P(A | B) = P(A) • Recall our example about the probability of a single card draw being a diamond given that we are told it is a King.

  15. Many Independent Events • Suppose we have n independent events A1, A2, …, An. Then the multiplication rule is P(A1 and A2 and … and An) = P(A1) P(A2) ⋯ P(An)

  16. Example: Ten Rolls of a Die • Roll a die ten times. • What is the probability that we roll a 2 10 times? • What is the probability that we roll at least one 2?
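
For independent rolls, the probability that every roll shows a given face is a product of identical factors, and "at least one" is most easily found through its complement. A minimal sketch, assuming a fair six-sided die:

```python
from fractions import Fraction

p_two = Fraction(1, 6)  # P(rolling a 2) on one roll of a fair die (assumed)
n_rolls = 10

# All ten rolls show a 2: multiply ten independent probabilities.
p_all_twos = p_two ** n_rolls

# At least one 2: complement of "no 2 on any of the ten rolls".
p_at_least_one_two = 1 - (1 - p_two) ** n_rolls

print(p_all_twos, float(p_all_twos))
print(p_at_least_one_two, float(p_at_least_one_two))
```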

  17. Example: Height of Women • Randomly select 8 American young women aged 18 to 24. • What is the probability that all 8 women are more than 65 inches tall? • What is the probability at least one of the women is between 63 and 67 inches tall?

  18. Tree Diagrams • A tree diagram is often helpful for solving more elaborate calculations, and in particular, problems that have several stages. • In a tree diagram, each segment in the tree represents one stage of the problem. • Each complete branch shows a possible path. • Tree diagrams combine both the addition and multiplication rules.

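  19. Sample Tree Diagram • Stage 1 (first flip): H with probability 0.5, T with probability 0.5. • Stage 2 (second flip): after either first-flip outcome, H with probability 0.5 and T with probability 0.5. • Each Stage 1 branch is labeled with the probability of the outcome in Stage 1. • Each Stage 2 branch is labeled with the conditional probability of the outcome in Stage 2 given the outcome in Stage 1.

  20. Example: Tossing a Coin Twice • Tree: first flip (H or T), then second flip (H or T), giving the four paths HH, HT, TH, TT. • P(HH) = P(H)P(H) = (0.5)(0.5) = 0.25 • P(HT) = P(H)P(T) = (0.5)(0.5) = 0.25 • P(TH) = P(T)P(H) = (0.5)(0.5) = 0.25 • P(TT) = P(T)P(T) = (0.5)(0.5) = 0.25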

  21. Example: Dependent Coin Flips • The previous example has independent coin flips, but we can imagine situations where coin flips will be dependent. • Consider the following situation. We have two coins, one fair (P(H) = 0.5 = P(T)) and one biased (P(H) = 0.75, P(T) = 0.25). • Flip fair coin; if H, use biased coin next; otherwise, use fair coin again. • Flip biased coin after every H, fair coin after every T. • Coin flips now are no longer independent.
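
The two-coin scheme can be traced path by path, exactly as a tree diagram would. The sketch below computes the probability of each two-flip path and then compares P(second flip is H) with P(second flip is H | first flip is H). The names `FAIR`, `BIASED`, and `path_probability` are illustrative.

```python
from fractions import Fraction
from itertools import product

FAIR = {"H": Fraction(1, 2), "T": Fraction(1, 2)}
BIASED = {"H": Fraction(3, 4), "T": Fraction(1, 4)}

def path_probability(path):
    """Multiply branch probabilities along one path of the tree."""
    coin = FAIR  # the first flip always uses the fair coin
    prob = Fraction(1)
    for flip in path:
        prob *= coin[flip]
        coin = BIASED if flip == "H" else FAIR  # next coin depends on this flip
    return prob

paths = {"".join(p): path_probability(p) for p in product("HT", repeat=2)}

# Addition rule: sum the paths that end in H.
p_second_heads = paths["HH"] + paths["TH"]
p_second_heads_given_first_heads = paths["HH"] / (paths["HH"] + paths["HT"])

print(paths)
print(p_second_heads, p_second_heads_given_first_heads)
```

The two printed probabilities differ, which is what dependence between the flips means here.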

  22. Example: Chips in an Urn • As a slightly different example, consider an urn with 5 white chips and 4 blue chips. We draw three chips sequentially and without replacement. What is the probability of obtaining the sequence (W, B, B)? Calculate this using a tree diagram.

  23. Example: Lab Testing • A lab test can yield either a positive or negative result. For people with a particular disease, it will produce a positive result 90% of the time. But it will also produce a positive result in 0.1% of all healthy people. Suppose that 0.01% of the population actually has the disease. • What is the probability that an individual is healthy? • That a sick individual produces a negative test result? • That an individual is healthy and has a negative test result?

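  24. Tree Diagram for Lab Testing • Stage 1 (individual): ill with probability 0.0001, healthy with probability ?. • Stage 2 (test) given ill: + with probability 0.90, - with probability ?. • Stage 2 (test) given healthy: + with probability 0.001, - with probability ?. • Complete paths: ill and +, ill but -, healthy but +, healthy and -.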

  25. More on Lab Testing • We know from our initial conditions that if a person chosen randomly has the disease, we get a positive test with probability 0.90. • What we’re more interested in usually is diagnosing individuals with the disease. In other words, we want to know what the probability is that an individual has the disease given that his test result is positive.

  26. Bayes’s Rule • What we need is a method that allows us to use our known conditional probabilities to compute the conditional probabilities “in the other direction.” • The formula we use is called Bayes’s Rule and can be stated as follows: if A and B are any two events whose probabilities are not 0 or 1, then P(A | B) = P(B | A) P(A) / [P(B | A) P(A) + P(B | A^c) P(A^c)], where A^c denotes the complement of A.

  27. Derivation of Bayes’s Rule
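
One standard derivation, which may differ in presentation from the one shown in lecture, starts from the definition of conditional probability, applies the multiplication rule to the numerator, and splits the denominator over A and its complement A^c:

```latex
\begin{align*}
P(A \mid B)
  &= \frac{P(A \text{ and } B)}{P(B)}
     && \text{definition of conditional probability} \\
  &= \frac{P(B \mid A)\,P(A)}{P(B)}
     && \text{multiplication rule} \\
  &= \frac{P(B \mid A)\,P(A)}{P(B \mid A)\,P(A) + P(B \mid A^{c})\,P(A^{c})}
     && P(B) = P(B \text{ and } A) + P(B \text{ and } A^{c})
\end{align*}
```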

  28. Example: Lab Testing • In our example, A is the event that an individual is ill, and B is the event that the test result is positive. • So, let’s calculate the probability of an individual being ill given a positive test result.
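
The lab-testing numbers can be pushed through Bayes's Rule directly. The sketch below uses the figures from the lab-testing slides (0.01% prevalence, 90% positive rate among the ill, 0.1% positive rate among the healthy) and exact fractions; the variable names are illustrative.

```python
from fractions import Fraction

p_ill = Fraction(1, 10_000)              # 0.01% of the population is ill
p_pos_given_ill = Fraction(9, 10)        # positive result for 90% of ill people
p_pos_given_healthy = Fraction(1, 1000)  # positive result for 0.1% of healthy people

# The missing branch probabilities of the tree.
p_healthy = 1 - p_ill
p_neg_given_ill = 1 - p_pos_given_ill
p_healthy_and_neg = p_healthy * (1 - p_pos_given_healthy)

# Total probability of a positive test: sum over both Stage 1 branches.
p_pos = p_pos_given_ill * p_ill + p_pos_given_healthy * p_healthy

# Bayes's Rule: P(ill | +) = P(+ | ill) P(ill) / P(+).
p_ill_given_pos = p_pos_given_ill * p_ill / p_pos

print(p_ill_given_pos, round(float(p_ill_given_pos), 4))
```

Even with a positive result, the probability of illness stays below 10% because the disease is so rare that false positives among the healthy outnumber true positives.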

  29. Example: Coins and Urns • A biased coin, twice as likely to come up heads as it is tails, is tossed once. • If heads, draw chip from urn I, which contains 3 white chips and 4 red chips. • If tails, draw chip from urn II, which contains 6 white chips and 3 red chips. • Given that a white chip was drawn, what is the probability that the coin came up tails?
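
The coin-and-urn question is another direct application of Bayes's Rule. In the sketch below, "twice as likely to come up heads" is read as P(H) = 2/3 and P(T) = 1/3; the variable names are illustrative.

```python
from fractions import Fraction

# Heads is twice as likely as tails, so P(H) = 2/3 and P(T) = 1/3.
p_tails = Fraction(1, 3)
p_heads = 1 - p_tails

p_white_given_heads = Fraction(3, 3 + 4)  # urn I: 3 white, 4 red
p_white_given_tails = Fraction(6, 6 + 3)  # urn II: 6 white, 3 red

# Total probability of drawing a white chip.
p_white = p_white_given_heads * p_heads + p_white_given_tails * p_tails

# Bayes's Rule: P(T | white) = P(white | T) P(T) / P(white).
p_tails_given_white = p_white_given_tails * p_tails / p_white

print(p_white, p_tails_given_white)
```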

  30. Population and Sampling Distributions • The population distribution of a variable is the distribution of its values for all members of the population. • The population distribution is also the probability distribution of the variable when we choose one individual from the population at random.

  31. Population and Sampling Distributions • A statistic is any numeric measure that is used to describe the data we obtain. • If the data are obtained using random sampling, a statistic is a random variable, and hence its value varies from sample to sample. • The probability distribution of the statistic is known as its sampling distribution.

  32. Population and Sampling Distributions • The sampling distribution of a statistic depends not only on the population distribution, but also on the sample size and the method used to collect the data from the population. • A statistic can be used to estimate the parameter of the population.

  33. Example: Two Different Surveys • Two national surveys are planned to estimate the proportion p of people who are skeptical of news media. • First survey randomly selects 1000 people. • Second survey randomly selects 10,000 people. • Will the results be the same? • Are the survey results biased? • If these two surveys are repeatedly performed, which one will yield less variable estimates?
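
The variability question can be made quantitative with the standard result that, for a simple random sample from a large population, the sample proportion has standard deviation sqrt(p(1 - p)/n). The sketch below assumes that sampling model and uses a hypothetical value p = 0.4, which is not given in the lecture; the ratio of the two standard deviations does not depend on p.

```python
from math import sqrt

def sd_of_sample_proportion(p, n):
    """S.d. of the sample proportion under simple random sampling (assumed)."""
    return sqrt(p * (1 - p) / n)

p = 0.4  # hypothetical true proportion, for illustration only

sd_small = sd_of_sample_proportion(p, 1_000)   # first survey
sd_large = sd_of_sample_proportion(p, 10_000)  # second survey
ratio = sd_small / sd_large

print(f"n = 1000:  sd = {sd_small:.4f}")
print(f"n = 10000: sd = {sd_large:.4f}")
print(f"ratio = {ratio:.3f}")
```

Ten times the sample size shrinks the standard deviation by a factor of sqrt(10), about 3.16, not by a factor of 10.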

  34. The Binomial Setting • There are a fixed number n of trials. • The n trials are all independent. • Each trial has one of two possible outcomes, labeled “success” and “failure.” • The probability of success, p, remains the same for each trial.

  35. Example: Chips in an Urn • An urn contains 5 white chips and 4 red chips. Randomly draw 3 chips with replacement from this urn. • Let S represent the outcome where a red chip is drawn, and F the outcome where a white chip is drawn. • This is a binomial experiment with p = 4/9. • Would this still be a binomial experiment if the chips were drawn without replacement?

  36. The Binomial Distribution • We briefly mentioned the binomial distribution in passing previously. • The distribution of the count X of successes in the binomial setting is called the binomial distribution with parameters n and p, where • n is the number of trials • p is the probability of a success on any trial • The count X is a discrete random variable, typically abbreviated as X ~ B(n, p).

  37. The Binomial Distribution • Possible values of X are the whole numbers from 0 to n. • We know that the count of successes is a binomial random variable. • Is it true that the count of failures is also a binomial random variable? • If it is, what are the parameters of its distribution?

  38. The Binomial Distribution • If X ~ B(n, p), then P(X = k) = C(n, k) p^k (1 – p)^(n – k) for k = 0, 1, …, n, where C(n, k) = n! / (k! (n – k)!) • Examples: Let n = 3.

  39. Developing Binomial Probabilities for n = 3 • Tree: each of the three trials branches to S with probability p or F with probability 1 – p, giving eight complete paths. • P(SSS) = p³ • P(SSF) = p²(1 – p) • P(SFS) = p²(1 – p) • P(SFF) = p(1 – p)² • P(FSS) = p²(1 – p) • P(FSF) = p(1 – p)² • P(FFS) = p(1 – p)² • P(FFF) = (1 – p)³

  40. Binomial Probabilities for n = 3 • Let X be the number of successes in three trials. • X = 0: P(FFF) = (1 – p)³, so P(X = 0) = (1 – p)³ • X = 1: P(SFF) = P(FSF) = P(FFS) = p(1 – p)², so P(X = 1) = 3p(1 – p)² • X = 2: P(SSF) = P(SFS) = P(FSS) = p²(1 – p), so P(X = 2) = 3p²(1 – p) • X = 3: P(SSS) = p³, so P(X = 3) = p³

  41. Example: Rolling a Die • Roll a die 4 times, let X be the number of times the number 5 appears. • “Success” = get a roll of 5, so P(Success) = 1/6.

  42. Example: Rolling a Die • Find the probability that we get at least 2 rolls of 5.
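
The die example can be worked with the binomial probability P(X = k) = C(n, k) p^k (1 - p)^(n - k), here with n = 4 and p = 1/6. The sketch below computes the full distribution with exact fractions and then the probability of at least two 5s through the complement; `binomial_pmf` is a hypothetical helper name.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 4, Fraction(1, 6)  # four rolls, "success" = rolling a 5
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

# At least two 5s: complement of "zero or one 5".
p_at_least_two = 1 - pmf[0] - pmf[1]

for k, prob in enumerate(pmf):
    print(f"P(X = {k}) = {prob}")
print(f"P(X >= 2) = {p_at_least_two}")
```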

  43. Example: Flying • Suppose an airline operates a daily shuttle service from Altoona to Hoboken. • Two round-trip flights, one on a plane with two engines, the other on a plane with four engines. • Each engine on each plane fails independently with probability p. • Each plane arrives safely only if at least half of its engines remain in working order. • For what values of p would you prefer to fly in the two-engine plane?
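
One way to explore the airline question is to treat the number of failed engines on each plane as binomial and compare the two arrival probabilities over a grid of p values. This is a sketch under the slide's assumptions (independent failures, same p for every engine); `p_safe` is a hypothetical helper name.

```python
from fractions import Fraction
from math import comb

def p_safe(engines, p_fail):
    """P(at least half of the engines keep working)."""
    max_failures = engines // 2  # at most half of the engines may fail
    return sum(
        comb(engines, k) * p_fail**k * (1 - p_fail) ** (engines - k)
        for k in range(max_failures + 1)
    )

# Compare the planes on a grid of failure probabilities 0.05, 0.10, ..., 0.95.
grid = [Fraction(i, 20) for i in range(1, 20)]
prefer_two = [p for p in grid if p_safe(2, p) > p_safe(4, p)]

print("two-engine plane is safer for p in:", [float(p) for p in prefer_two])
print("tie at p = 1/3:", p_safe(2, Fraction(1, 3)) == p_safe(4, Fraction(1, 3)))
```

On this grid the two-engine plane comes out ahead only when p exceeds 1/3, which matches the algebra: 1 - p² > 1 - p⁴ - 4p³(1 - p) reduces to (3p - 1)(p - 1) < 0.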
