# Exam

##### Presentation Transcript

1. Exam • Exam starts two weeks from today

2. Amusing Statistics • Use what you know about normal distributions to evaluate this finding: The study, published in Pediatrics, the journal of the American Academy of Pediatrics, found that among the 4,508 students in Grades 5-8 who participated, 36 per cent reported excellent school performance, 38 per cent reported good performance, 20 per cent said they were average performers, and 7 per cent said they performed below average.

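3. Review • The Z-test is used to compare the mean of a sample to the mean of a population: $Z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}}$ and $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$, where $\bar{X}$ is the sample mean, $\mu$ is the population mean, $\sigma$ is the population standard deviation, and $n$ is the sample size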

4. Review • The Z-score is normally distributed

5. Review • The Z-score is normally distributed • Thus the probability of obtaining any given Z-score by random sampling is given by the Z table

6. Review • We can likewise determine critical values for Z such that we would reject the null hypothesis if our computed Z-score exceeds these values • For alpha = .05: • Zcrit (one-tailed) = 1.64 • Zcrit (two-tailed) = 1.96
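The review above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `z_test` and the numbers in the usage lines are hypothetical, not from the slides, and the population standard deviation is assumed to be known. It uses the standard library's `statistics.NormalDist` in place of a printed Z table.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def z_test(sample_mean, pop_mean, pop_sd, n):
    """Return (Z, two-tailed p) for a sample mean against a known population.

    Hypothetical helper for illustration; assumes the population standard
    deviation is known and the sample mean is normally distributed.
    """
    standard_error = pop_sd / sqrt(n)
    z = (sample_mean - pop_mean) / standard_error
    p_two_tailed = 2 * (1 - STD_NORMAL.cdf(abs(z)))
    return z, p_two_tailed


# Critical values for alpha = .05, read from the inverse normal CDF
# instead of a printed Z table.
alpha = 0.05
z_crit_one_tailed = STD_NORMAL.inv_cdf(1 - alpha)
z_crit_two_tailed = STD_NORMAL.inv_cdf(1 - alpha / 2)

# Made-up numbers: sample mean 105 from n = 25, population mean 100, sd 15.
z, p = z_test(sample_mean=105, pop_mean=100, pop_sd=15, n=25)
print(f"Z = {z:.3f}, two-tailed p = {p:.4f}")
print(f"reject at alpha = .05 (two-tailed): {abs(z) > z_crit_two_tailed}")
```

The computed critical values should agree with the 1.64 and 1.96 quoted on the slide, up to rounding.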

7. Confidence Intervals • A related question you might ask: • Suppose you’ve measured a mean and computed a standard error of that mean • What is the range of values such that there is a 95% chance of the population mean falling within that range?

8. Confidence Intervals • There is a 2.5% chance that the population mean is actually more than 1.96 standard errors above the observed mean • [Figure: normal curve showing the central 95% region and the 2.5% tail beyond +1.96, labelled "True mean?"]

9. Confidence Intervals • There is a 2.5% chance that the population mean is actually more than 1.96 standard errors below the observed mean • [Figure: normal curve showing the central 95% region and the 2.5% tail beyond -1.96, labelled "True mean?"]

10. Confidence Intervals • Thus there is a 95% chance that the true population mean falls within + or - 1.96 standard errors from a sample mean

11. Confidence Intervals • Thus there is a 95% chance that the true population mean falls within + or - 1.96 standard errors from a sample mean • Likewise, there is a 95% chance that the true population mean falls within + or - 1.96 standard deviations from a single measurement

12. Confidence Intervals • This is called the 95% confidence interval…and it is very useful • It works like significance bounds…if the 95% C.I. doesn’t include the mean of a population you’re comparing your sample to, then your sample is significantly different from that population
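The link between the confidence interval and the Z-test can be written out explicitly. The notation is assumed for this sketch rather than taken from the slides: $\bar{X}$ is the sample mean, $\mu_0$ is the mean of the population being compared against, and $\sigma_{\bar{X}}$ is the standard error of the mean.

$$
\text{95\% C.I.} = \bar{X} \pm 1.96\,\sigma_{\bar{X}}
$$

$$
\mu_0 \notin \left[\bar{X} - 1.96\,\sigma_{\bar{X}},\ \bar{X} + 1.96\,\sigma_{\bar{X}}\right]
\iff
\left|\frac{\bar{X} - \mu_0}{\sigma_{\bar{X}}}\right| > 1.96
$$

In other words, checking whether the interval excludes $\mu_0$ is the same decision as a two-tailed Z-test at alpha = .05.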

13. Confidence Intervals • Consider an example: • You measure the concentration of mercury in your backyard to be .009 mg/kg • The concentration of mercury in the Earth’s crust is .007 mg/kg. Let’s pretend that, when measured at many sites around the globe, the standard deviation is known to be .002 mg/kg

14. Confidence Intervals • The 95% confidence interval for this mercury measurement is • .009 +/- (1.96 x .002) = .00508 to .01292

15. Confidence Intervals • This interval includes .007 mg/kg which, it turns out, is the mean concentration found in the earth’s crust in general • Thus you would conclude that your backyard isn’t artificially contaminated by mercury

16. Confidence Intervals • Imagine you take 25 samples from around Alberta and you found: • a mean of .009 mg/kg, with a standard error of .002 / √25 = .0004 mg/kg

17. Confidence Intervals • Imagine you take 25 samples from around Alberta and you found: • a mean of .009 mg/kg, with a standard error of .002 / √25 = .0004 mg/kg • .009 +/- (1.96 x .0004) = .008216 to .009784 • This interval doesn’t include the .007 mg/kg value for the earth’s crust so you would conclude that Alberta has an artificially elevated amount of mercury in the soil
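The two mercury examples can be checked with a short sketch. The helper name `confidence_interval` is hypothetical. The means, the .002 mg/kg standard deviation, and the .007 mg/kg crust value come from the slides. The code uses the exact normal quantile rather than the rounded 1.96, so the bounds may differ from the slide figures in the last decimal place.

```python
from math import sqrt
from statistics import NormalDist


def confidence_interval(mean, sd, n=1, level=0.95):
    """Return (low, high) bounds of a Z-based confidence interval.

    Hypothetical helper; assumes the standard deviation is known.
    """
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 for 95%
    margin = z_crit * sd / sqrt(n)
    return mean - margin, mean + margin


CRUST_MEAN = 0.007  # mg/kg, from the slides
KNOWN_SD = 0.002    # mg/kg, the "pretend" standard deviation from the slides

backyard = confidence_interval(0.009, KNOWN_SD, n=1)   # single measurement
alberta = confidence_interval(0.009, KNOWN_SD, n=25)   # mean of 25 samples

for label, (low, high) in [("backyard", backyard), ("Alberta", alberta)]:
    includes = low <= CRUST_MEAN <= high
    print(f"{label}: {low:.6f} to {high:.6f}, includes crust mean: {includes}")
```

The same observed value of .009 mg/kg leads to opposite conclusions because the interval shrinks by a factor of √25 = 5 when 25 samples are averaged.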

18. Power • we perform a Z-test and determine that the difference between the mean of our sample and the mean of the population is not due to chance with a p < .05

19. Power • we perform a Z-test and determine that the difference between the mean of our sample and the mean of the population is not due to chance with a p < .05 • we say that we have a significant result…

20. Power • we perform a Z-test and determine that the difference between the mean of our sample and the mean of the population is not due to chance with a p < .05 • we say that we have a significant result… • but what if p is > .05?

21. Power • What are the two reasons why p comes out greater than .05?

22. Power • What are the two reasons why p comes out greater than .05? • Your experiment lacked Statistical Power and you made a Type II Error • The null hypothesis really is true

23. Power • Two approaches: • The Hopelessly Jaded Grad Student Solution • The Wise and Well Adjusted Professor Procedure

24. Power 1. Hopelessly Jaded Grad Student Solution - conclude that your hypothesis was wrong and go directly to the grad student pub

25. Power - This is not the recommended course of action

26. Power 2. The Wise Professor Procedure - consider the several reasons why you might not have detected a significant effect

27. Power - recommended by wise professors the world over

28. Power • Why might p be greater than .05 ? • Recall that: $Z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}}$ and $\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$

29. Power • Why might p be greater than .05 ? 1. Small effect size: the sample mean $\bar{X}$ is quite close to the mean of the population • The effect doesn’t stand out from the variability in the data • You might be able to increase your effect size (e.g. with a larger dose or treatment)

30. Power • Why might p be greater than .05 ? 2. Noisy Data: the standard deviation $\sigma$, and therefore the standard error $\sigma_{\bar{X}}$, is quite large • A large denominator will swamp the small effect • Take greater care to reduce measurement errors

31. Power • Why might p be greater than .05 ? 3. Sample Size is Too Small: the standard error $\sigma_{\bar{X}}$ is quite large because $n$ is small • A large denominator will swamp the small effect • Run more subjects

32. Power • The solution in each case is more power:

33. Power • The solution in each case is more power: • Power is like sensitivity - the ability to detect small effects in noisy data

34. Power • The solution in each case is more power: • Power is like sensitivity - the ability to detect small effects in noisy data • It is the complement of the Type II Error rate: power = 1 - (probability of a Type II Error)

35. Power • The solution in each case is more power: • Power is like sensitivity - the ability to detect small effects in noisy data • It is the complement of the Type II Error rate: power = 1 - (probability of a Type II Error) • So that you know: there are equations for computing statistical power
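As one example of such an equation, the power of a two-tailed one-sample Z-test can be sketched as below. The function name `z_test_power` is hypothetical. The calculation assumes the true effect is known in advance, which in practice it is not. The usage lines borrow the mercury numbers and treat the observed .002 mg/kg excess as if it were the true effect, purely for illustration.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def z_test_power(effect, sd, n, alpha=0.05):
    """Power of a two-tailed one-sample Z-test under normality.

    effect: assumed true difference between the sample's mean and the
            population mean (hypothetical input, same units as sd)
    sd:     known population standard deviation
    n:      sample size
    """
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = effect / (sd / sqrt(n))  # true effect measured in standard errors
    # Probability that the observed Z lands beyond either critical value.
    upper = 1 - STD_NORMAL.cdf(z_crit - shift)
    lower = STD_NORMAL.cdf(-z_crit - shift)
    return upper + lower


# Assumed true excess of .002 mg/kg with a standard deviation of .002 mg/kg.
for n in (1, 4, 25):
    power = z_test_power(effect=0.002, sd=0.002, n=n)
    print(f"n = {n:2d}: power = {power:.3f}, Type II error rate = {1 - power:.3f}")
```

All three remedies from the preceding slides show up in the `shift` term: a larger effect, a smaller standard deviation, or a larger sample size each moves the true effect further from zero in standard-error units and so raises power.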

36. Power • An important point about power and the null hypothesis: • Failing to reject the null hypothesis DOES NOT PROVE it to be true!!!

37. Power • Consider an example: • How to prove that smoking does not cause cancer: • enroll 2 people who smoke infrequently and use an antique X-Ray camera to look for cancer • Compare the mean cancer rate in your group (which will probably be zero) to the cancer rate in the population (which won’t be) with a Z-test

38. Power • Consider an example: • If p came out greater than .05, you still wouldn’t believe that smoking doesn’t cause cancer

39. Power • Consider an example: • If p came out greater than .05, you still wouldn’t believe that smoking doesn’t cause cancer • You will, however, often encounter statements such as “The study failed to find…” misinterpreted as “The study proved no effect of…”

40. Experimental Design • We’ve been using examples in which a single sample is compared to a population

41. Experimental Design • We’ve been using examples in which a single sample is compared to a population • Often we employ more sophisticated designs

42. Experimental Design • We’ve been using examples in which a single sample is compared to a population • Often we employ more sophisticated designs • What are some different ways you could run an experiment?

43. Experimental Design • Compare one mean to some value • Often that value is zero

44. Experimental Design • Compare one mean to some value • Often that value is zero • Compare two means to each other

45. Experimental Design • There are two general categories of comparing two (or more) means with each other

46. Experimental Design 1. Repeated Measures - also called “within-subjects” comparison • The same subjects are given pre- and post- measurements • e.g. before and after taking a drug to lower blood pressure • Powerful because variability between subjects is factored out • Note that pre- and post- scores are linked - we say that they are dependent • Note also that you could have multiple tests

47. Experimental Design • Problems with Repeated-Measures design: • Temporal effect - subjects get better/worse over time • The act of measuring might preclude further measurement - e.g. measuring brain size via surgery • Practice effect - subjects improve with repeated exposure to a procedure

48. Experimental Design 2. Between-Subjects Design • Subjects are randomly assigned to treatment groups - e.g. drug and placebo • Measurements are assumed to be statistically independent

49. Experimental Design 2. Problems with Between-Subjects design • Can be less powerful because variability between two groups of different subjects can look like a treatment effect • Often needs more subjects
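The difference in power between the two designs can be illustrated with a small sketch. The blood pressure readings are made up for five hypothetical subjects, and the standard errors are computed from the sample itself rather than from a known population standard deviation, which departs from the Z-test setting used earlier.

```python
from math import sqrt
from statistics import mean, stdev, variance

# Made-up blood pressure readings (mmHg) for five hypothetical subjects.
before = [150, 135, 160, 142, 128]
after = [144, 130, 153, 137, 121]
n = len(before)

# Within-subjects view: work with each subject's own change, so stable
# differences between people cancel out.
differences = [b - a for b, a in zip(before, after)]
se_within = stdev(differences) / sqrt(n)

# Between-subjects view: pretend the two lists came from different people,
# so the spread between people stays in the standard error.
se_between = sqrt(variance(before) / n + variance(after) / n)

print(f"mean drop: {mean(differences):.1f} mmHg")
print(f"standard error, within-subjects:  {se_within:.2f}")
print(f"standard error, between-subjects: {se_between:.2f}")
```

With these illustrative numbers the same mean drop is judged against a much smaller standard error in the within-subjects view, which is the sense in which variability between subjects is "factored out".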

50. Experimental Design • We’ll need some statistical tests that can compare: • One sample mean to a fixed value • Two dependent sample means to each other (within-subject) • Two independent sample means to each other (between-subject)