## Exam


**Exam**

• The exam starts two weeks from today

**Amusing Statistics**

• Use what you know about normal distributions to evaluate this finding: "The study, published in Pediatrics, the journal of the American Academy of Pediatrics, found that among the 4,508 students in Grades 5-8 who participated, 36 per cent reported excellent school performance, 38 per cent reported good performance, 20 per cent said they were average performers, and 7 per cent said they performed below average."

**Review**

• The Z-test is used to compare the mean of a sample to the mean of a population: Z = (x̄ − μ) / (σ/√n)
• The Z-score is normally distributed, so the probability of obtaining any given Z-score by random sampling is given by the Z table
• We can likewise determine critical values for Z such that we would reject the null hypothesis if our computed Z-score exceeds these values
• For alpha = .05:
  • Zcrit (one-tailed) = 1.64
  • Zcrit (two-tailed) = 1.96

**Confidence Intervals**

• A related question you might ask: suppose you have measured a mean and computed the standard error of that mean. What is the range of values such that there is a 95% chance of the population mean falling within that range?
• There is a 2.5% chance that the population mean is actually more than 1.96 standard errors above the observed mean
• There is a 2.5% chance that the population mean is actually more than 1.96 standard errors below the observed mean

(Figure: normal curve centred on the observed mean, with 95% of the area between −1.96 and +1.96 standard errors and 2.5% in each tail.)
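These tail areas, and the Zcrit values from the review, can be checked numerically. A minimal sketch using only Python's standard library (`scipy.stats.norm` would give the same numbers; the `phi` helper is my own name, not from the lecture):

```python
import math

def phi(z):
    """Standard normal CDF, P(Z <= z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# One-tailed: the area beyond z = 1.64 is about .05
print(round(1.0 - phi(1.64), 3))

# Two-tailed: the combined area beyond |z| = 1.96 is about .05
print(round(2.0 * (1.0 - phi(1.96)), 3))

# Central region: about 95% of Z-scores fall between -1.96 and +1.96
print(round(phi(1.96) - phi(-1.96), 3))
```

The same function can be used to look up the probability of any computed Z-score, which is all a Z table does.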
• Thus there is a 95% chance that the true population mean falls within ±1.96 standard errors of a sample mean
• Likewise, there is a 95% chance that the true population mean falls within ±1.96 standard deviations of a single measurement
• This is called the 95% confidence interval… and it is very useful
• It works like significance bounds: if the 95% C.I. doesn't include the mean of the population you're comparing your sample to, then your sample is significantly different from that population
• Consider an example: you measure the concentration of mercury in your backyard to be .009 mg/kg
• The concentration of mercury in the Earth's crust is .007 mg/kg. Let's pretend that, when measured at many sites around the globe, the standard deviation is known to be .002 mg/kg
• The 95% confidence interval for this single mercury measurement is: .009 ± (1.96 × .002) = .00508 to .01292 mg/kg
• This interval includes .007 mg/kg, which, it turns out, is the mean concentration found in the Earth's crust in general. Thus you would conclude that your backyard isn't artificially contaminated by mercury
• Now imagine you take 25 samples from around Alberta and again find a mean of .009 mg/kg. The standard error is .002/√25 = .0004, so the interval is: .009 ± (1.96 × .0004) = .008216 to .009784 mg/kg
• This interval doesn't include the .007 mg/kg value for the Earth's crust, so you would conclude that Alberta has an artificially elevated amount of mercury in the soil
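Both mercury intervals can be reproduced in a few lines. A sketch assuming a known population standard deviation (the `ci95` helper is a hypothetical name, not from the lecture):

```python
import math

def ci95(mean, sigma, n=1):
    """95% confidence interval for a mean, given a known population SD.

    With n = 1 this is the interval around a single measurement
    (±1.96 standard deviations); with larger n it uses the standard
    error sigma / sqrt(n).
    """
    se = sigma / math.sqrt(n)
    return (mean - 1.96 * se, mean + 1.96 * se)

# Single backyard measurement: .00508 to .01292, which includes .007
print(ci95(0.009, 0.002))

# 25 Alberta samples: .008216 to .009784, which excludes .007
print(ci95(0.009, 0.002, n=25))
```

The decision rule is then just a containment check: if .007 mg/kg falls outside the interval, the sample differs significantly from the crustal value.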
**Power**

• We perform a Z-test and determine that the difference between the mean of our sample and the mean of the population is not due to chance, with p < .05
• We say that we have a significant result…
• But what if p is > .05?
• What are the two reasons why p comes out greater than .05?
  • Your experiment lacked statistical power and you made a Type II error
  • The null hypothesis really is true
• Two approaches:
  1. The Hopelessly Jaded Grad Student Solution: conclude that your hypothesis was wrong and go directly to the grad student pub (this is not the recommended course of action)
  2. The Wise and Well-Adjusted Professor Procedure: consider the several reasons why you might not have detected a significant effect (recommended by wise professors the world over)
• Why might p be greater than .05? Recall that Z = (x̄ − μ) / SE, where SE = σ/√n
  1. Small effect size: x̄ is quite close to μ, so the effect doesn't stand out from the variability in the data. You might be able to increase your effect size (e.g. with a larger dose or treatment)
  2. Noisy data: σ, and therefore SE, is quite large, and a large denominator will swamp a small effect. Take greater care to reduce measurement errors
  3. Sample size is too small: SE is quite large because n is small, and again a large denominator will swamp a small effect. Run more subjects
• The solution in each case is more power
• Power is like sensitivity: the ability to detect small effects in noisy data
• It is the opposite of the Type II error rate
• So that you know: there are equations for computing statistical power
• An important point about power and the null hypothesis: failing to reject the null hypothesis DOES NOT PROVE it to be true!!!
• Consider an example, a recipe for "proving" that smoking does not cause cancer:
  • Enroll 2 people who smoke infrequently and use an antique X-ray camera to look for cancer
  • Compare the mean cancer rate in your group (which will probably be zero) to the cancer rate in the population (which won't be) with a Z-test
• If p came out greater than .05, you still wouldn't believe that smoking doesn't cause cancer
• You will, however, often encounter statements such as "The study failed to find…" misinterpreted as "The study proved no effect of…"

**Experimental Design**

• We've been using examples in which a single sample is compared to a population
• Often we employ more sophisticated designs
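Before looking at designs, the power equations mentioned above can be made concrete. A sketch of power for a two-tailed one-sample Z-test at alpha = .05 (the function name and the example effect size are my own; this is the standard normal-approximation formula, not something derived in the lecture):

```python
import math

def phi(z):
    """Standard normal CDF, P(Z <= z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def ztest_power(effect_size, n, z_crit=1.96):
    """Power of a two-tailed one-sample Z-test.

    effect_size is the true difference in standard-deviation units,
    (mu_true - mu_null) / sigma. Power is the chance that the computed
    Z-score lands beyond one of the critical values.
    """
    shift = effect_size * math.sqrt(n)
    return (1.0 - phi(z_crit - shift)) + phi(-z_crit - shift)

# A half-standard-deviation effect, like the smoking example's tiny study:
print(ztest_power(0.5, n=2))    # very low power: a Type II error is likely
print(ztest_power(0.5, n=100))  # high power: the same effect is easily detected
```

This is why "the study failed to find an effect" means little when n is 2: the test had almost no chance of detecting the effect even if it is real.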
• What are some different ways you could run an experiment?
• Compare one mean to some value (often that value is zero)
• Compare two means to each other
• There are two general categories of design for comparing two (or more) means with each other:

1. Repeated-Measures Design (also called "within-subjects")
  • The same subjects are given pre- and post-measurements, e.g. before and after taking a drug to lower blood pressure
  • Powerful, because variability between subjects is factored out
  • Note that pre- and post-scores are linked: we say that they are dependent
  • Note also that you could have multiple tests
  • Problems with repeated-measures designs:
    • Temporal effects: subjects get better or worse over time
    • The act of measuring might preclude further measurement, e.g. measuring brain size via surgery
    • Practice effects: subjects improve with repeated exposure to a procedure

2. Between-Subjects Design
  • Subjects are randomly assigned to treatment groups, e.g. drug and placebo
  • Measurements are assumed to be statistically independent
  • Problems with between-subjects designs:
    • Can be less powerful, because variability between two groups of different subjects can look like a treatment effect
    • Often needs more subjects

• We'll need some statistical tests that can compare:
  • One sample mean to a fixed value
  • Two dependent sample means to each other (within-subject)
  • Two independent sample means to each other (between-subject)
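The power difference between the two designs can be seen in a small simulation. A sketch with made-up blood-pressure numbers (none are from the lecture): in a repeated-measures design the effect is judged against within-subject noise, while in a between-subjects design it must stand out from subject-to-subject variability.

```python
import random
import statistics as stats

random.seed(1)

# Hypothetical drug study: each subject has a stable baseline pressure,
# and the drug lowers pressure by about 5 units.
n = 30
baselines = [random.gauss(140, 15) for _ in range(n)]   # big between-subject differences
pre  = [b + random.gauss(0, 3) for b in baselines]      # measurement noise, SD = 3
post = [b - 5 + random.gauss(0, 3) for b in baselines]  # same subjects after the drug

# Within-subjects: each subject is their own control, so the baseline
# differences cancel out of the pre - post difference scores.
diffs = [a - b for a, b in zip(pre, post)]
print(round(stats.mean(diffs), 1), round(stats.stdev(diffs), 1))

# Between-subjects: two independent groups of different people.
placebo = [random.gauss(140, 15) for _ in range(n)]
drug    = [random.gauss(135, 15) for _ in range(n)]
print(round(stats.mean(placebo) - stats.mean(drug), 1))
print(round(stats.stdev(placebo), 1))  # the 5-unit effect must stand out from this
```

The difference scores vary only by the measurement noise, while the independent groups carry the full between-subject spread, which is why the paired design can detect the same effect with fewer subjects.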