Class 04. Wunderdog and the Normal Distribution

Class 04. Wunderdog and the Normal Distribution EMBS Section 6.2

Class 03 Assignment • Answers are posted on the course website • My office hours are IN THE CLASSROOM. • 3:00 to 4:30 on class days • Or email me for an appointment: pfeiferp@virginia.edu • TA office hours • Sunday and Tuesday nights • MCoB 266 • 7:00 to 8:30 pm

What we learned last class • Hypothesis Testing • H0: She is guessing • Randomized double-blind experiment • Test statistic: number correct • Specify α=0.05 (level of significance) • Observe 7 correct • P(X≥7│H0) = 1-BINOMDIST(6,10,.5,true) = 0.17 • Since this p-value > α, the result is NOT statistically significant.

Case: Wunderdog Sports Picks

Wunderdog is just like LTT? • H0: He is guessing (p=.5, independent events) • Ha: He is skillful (p>.5) • Test statistic: Number correct = 87. • P( X≥87 │H0 ) = 1 – BINOMDIST(86,149,.5,true) • = 0.024 • Conclusion: Statistically significant at the α=0.05 level.
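The two p-values above can be reproduced outside Excel. The following is a minimal Python sketch, not part of the original slides; the function name binom_upper_tail is made up for illustration, and it simply sums the binomial pmf from k to n, which is what 1-BINOMDIST(k-1,n,p,true) computes.

```python
from math import comb

def binom_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), summed directly from the pmf."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k, n + 1))

alpha = 0.05  # level of significance used on the slides

# Last class: 7 correct out of 10 under H0 (p = 0.5)
p_value_ltt = binom_upper_tail(7, 10, 0.5)

# Wunderdog: 87 correct out of 149 under H0 (p = 0.5)
p_value_wunderdog = binom_upper_tail(87, 149, 0.5)

for label, p_value in [("LTT", p_value_ltt), ("Wunderdog", p_value_wunderdog)]:
    verdict = "significant" if p_value < alpha else "NOT significant"
    print(f"{label}: p-value = {p_value:.3f} -> {verdict} at alpha = {alpha}")
```

The same function covers both cases because only the observed count and n change; H0 fixes p at 0.5 in each.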

Wunderdog • X is number correct • X is binomial, n=149, p=0.5, if H0 is true. • Mean = E(X) = n*p = 74.5 • Variance = Var(X) = n*p*(1-p) = 37.25 • Standard deviation = 37.25^.5 = 6.1
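As a quick check of the arithmetic above, here is a small Python sketch (an illustration, not from the slides) that computes the binomial mean, variance, and standard deviation for n=149, p=0.5.

```python
from math import sqrt

n, p = 149, 0.5  # number of picks and success probability under H0

mean = n * p                # E(X) = n*p
variance = n * p * (1 - p)  # Var(X) = n*p*(1-p)
sd = sqrt(variance)         # standard deviation = square root of the variance

print(mean, variance, round(sd, 1))
```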

Binomial pmf with n=149, p=0.5 Each possible outcome x has a mass of probability calculated as BINOM.DIST(x,149,.5,false)

As n gets big, the binomial “looks like” the normal (bell-shaped curve) • So if n is big, we sometimes use the normal distribution to approximate the binomial. • X is actually binomial. • It would be better to use BINOMDIST • But the probabilities we calculate come out pretty much the same if we use the appropriate normal distribution.

Binomial distributions for n=149 • All three are “bell-shaped curves” • Panels: p=0.5, p=0.8, p=0.2

The Normal Distribution • X is continuous • Applies to LOTS of random variables • Parameters are the mean μ and the standard deviation σ. • Mean or E(X) = μ • Variance = σ² • Standard deviation = σ • Symmetric: mean = median = mode (all = μ)

EMBS Fig 6.4, p 249

To calculate probabilities • P(X=x) = 0 • P(X≤x) = NORMDIST(x,μ,σ,true) • P(X<x) = NORMDIST(x,μ,σ,true)

Just like the binomial, the normal is a FAMILY of distributions. The member of the Normal family we want to use is the one with the mean and standard deviation that match our binomial. Mean=E(X)=74.5 Standard deviation = 6.1 Normal with μ=74.5, σ=6.1

Binomial versus normal for P(X≥87) • Binomial (X is discrete): P(X≥87) = 1-BINOMDIST(86,149,.5,true) = 0.024, and P(X=87) = 0.008 • Normal (X is continuous): P(X≥87) = P(X>87) = 1-NORMDIST(87,74.5,6.1,true) = 0.020, and P(X=87) = 0
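To see the exact binomial answer next to the normal approximation, here is a hedged Python sketch. It is not from the slides: it uses statistics.NormalDist from the standard library as a stand-in for NORMDIST, and it uses the unrounded standard deviation rather than 6.1.

```python
from math import comb
from statistics import NormalDist

n, p, k = 149, 0.5, 87
mu = n * p                        # 74.5
sigma = (n * p * (1 - p)) ** 0.5  # about 6.1

# Exact: sum the binomial pmf from 87 up to 149
exact = sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k, n + 1))

# Approximation: upper tail of the normal with the matching mean and sd
approx = 1 - NormalDist(mu, sigma).cdf(k)

print(f"binomial: {exact:.3f}   normal: {approx:.3f}")
```

The two numbers are close but not identical, which is the point of the slide: the normal gets us "pretty much the same" answer, and the binomial remains the exact one.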

To calculate probabilities • P(X=x) = 0 • P(X≤x) = NORMDIST(x,μ,σ,true) • P(X<x) = NORMDIST(x,μ,σ,true) • P(X>x) = 1 – NORMDIST(x,μ,σ,true) • P(x1<X<x2) = NORMDIST(x2,μ,σ,true) – NORMDIST(x1,μ,σ,true) Example: Weights of CEOs are normally distributed with μ = 155 and σ = 25. What percentage of CEOs do we expect to weigh between 160 and 200? =NORMDIST(200,155,25,true)-NORMDIST(160,155,25,true) = 0.964 – 0.579 = 0.385, or about 38.5%.
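The CEO-weight example can be checked with a short Python sketch (an illustration, not part of the slides). Here NormalDist(...).cdf(x) plays the role of NORMDIST(x,μ,σ,true).

```python
from statistics import NormalDist

weights = NormalDist(mu=155, sigma=25)  # CEO weights from the example

# P(160 < X < 200) = P(X < 200) - P(X < 160)
p_below_200 = weights.cdf(200)
p_below_160 = weights.cdf(160)
p_between = p_below_200 - p_below_160

print(f"{p_below_200:.3f} - {p_below_160:.3f} = {p_between:.3f}")
```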

To go backwards from a p to an x • To find the x value such that P(X<x) = p, use =NORMINV(p,μ,σ) EMBS problem 21, page 260: A person must score in the top 2% of the population on an IQ test to qualify for membership in MENSA (U.S. Airways Attache, September 2000). If the population of IQ scores is normal with a mean of 100 and a standard deviation of 15, what score qualifies one for MENSA? We want the score, x, such that P(X<x) = 0.98. =NORMINV(.98,100,15) = 130.8
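The MENSA cutoff can be reproduced with a minimal Python sketch (not from the slides), where NormalDist(...).inv_cdf(p) stands in for NORMINV(p,μ,σ).

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)  # IQ scores from the problem

# Top 2% means 98% of the population scores below the cutoff
cutoff = iq.inv_cdf(0.98)

# Going forwards again should return the probability we started with
check = iq.cdf(cutoff)

print(round(cutoff, 1), round(check, 2))
```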

Fun facts about the normal distribution • Let X be normal with mean μ and standard deviation σ. • X ~ N(μ,σ) • If Y = a + b*X • Then Y will be normal with mean a+b*μ and standard deviation |b|*σ (the absolute value matters only when b is negative) • Y ~ N(a+b*μ,|b|*σ) • So if weight in pounds is normal, weight in kilograms will also be normal. • If temperature in degrees F is normal, temperature in degrees C will also be normal. • If I add 10 points to all exams, I add 10 points to the mean but do not change the standard deviation. • If I multiply all scores by 1.5, I multiply the mean and the standard deviation by 1.5.
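The exam examples can be sketched in Python as follows. This is an illustration only: the exam distribution N(70, 8) is a made-up assumption, and linear_transform is a hypothetical helper that just applies the rule Y = a + b*X.

```python
from statistics import NormalDist

def linear_transform(x, a, b):
    """Distribution of Y = a + b*X when X is normal."""
    return NormalDist(a + b * x.mean, abs(b) * x.stdev)

scores = NormalDist(mu=70, sigma=8)  # hypothetical exam scores, not from the slides

curved = linear_transform(scores, a=10, b=1)   # add 10 points to every exam
scaled = linear_transform(scores, a=0, b=1.5)  # multiply every score by 1.5

print(curved.mean, curved.stdev)  # mean shifts, standard deviation unchanged
print(scaled.mean, scaled.stdev)  # both multiplied by 1.5
```

Python's NormalDist also supports this arithmetic directly (scores + 10 and scores * 1.5), which gives the same two distributions.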

More Fun Facts • There are a multitude of normal distributions…one for each possible pair of μ and σ values. • But…they all follow the same “curve” and have identical properties so that, in that sense, there is only ONE normal distribution.

EMBS Fig 6.4, p 249

Before there was NORMDIST • We asked everyone to convert their probability question about x into a probability question about z, because then we needed only ONE table of normal probabilities: the one that applies to z. • z tells us where x is on its normal curve. • z is how far x is above/below the mean in units of standard deviation: z = (x-μ)/σ. • z is all we need to answer a probability question.

A changing world… • We can use =NORMDIST(x,μ,σ,true) to answer our probability questions. • We used to have to use =NORMSDIST([x-μ]/σ) • That is the standard normal distribution. • It uses z as the input. • We needed to calculate the z in order to answer probability questions.
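The old route through z and the direct route give the same probability. Here is a small Python sketch (not from the slides) using the Wunderdog numbers, where NormalDist() with no arguments is the standard normal and stands in for NORMSDIST.

```python
from statistics import NormalDist

mu, sigma, x = 74.5, 6.1, 87  # Wunderdog normal approximation

# Old way: convert x to z, then look z up on the standard normal
z = (x - mu) / sigma
via_z = NormalDist().cdf(z)

# New way: ask about x directly
direct = NormalDist(mu, sigma).cdf(x)

print(round(z, 2), round(via_z, 3), round(direct, 3))
```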

There is math and calculus behind all this… =NORMDIST(x1,μ,σ,true)
