1 / 74

Review of yesterday

Review of yesterday. Consider. Don’t sum the cars Don’t reject the actual hypothesis H 1 (when p<0.05 you reject the Null hypothesis H 0 ) Use simple variable names not Y/N or Å. Don’t use function names as variable names, e.g., length mean t. 16. 14. 12. 10. 8. 6. 4. Red ants.

belita
Download Presentation

Review of yesterday

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Review of yesterday

  2. Consider • Don’t sum the cars • Don’t reject the actual hypothesis H1(when p<0.05 you reject the Null hypothesis H0) • Use simple variable names not Y/N or Å. • Don’t use function names as variable names, e.g., • length • mean • t

  3. 16 14 12 10 8 6 4 Red ants Black ants Logistic regression 2  2 tables Categoric 1.0 Melica 0.8 0.6 Prob. of choosing Melica 0.4 0.2 0.0 Response variable Luzula 4.5 5.5 6.5 7.5 Ant size Regression Anova Continuous - - Seed size Continuous Categoric Explanatory variable

  4. You now can test a lot! • Most cases with: • 1 response • 1 explanatory • Today: • Summarize • Add some more technical issues • Tomorrow: • 2 explanatory variables

  5. Binomial test

  6. Binomial test binom.test(18,20) Exact binomial test data: 18 and 20 number of successes = 18, number of trials = 20, p-value = 0.0004025 alternative hypothesis: true probability of success is not equal to 0.5 95 percent confidence interval: 0.6830173 0.9876515 sample estimates: probability of success 0.9

  7. 2x2 Fisher Test

  8. 2x2 Fisher Test fisher.test(naildata) Fisher's Exact Test for Count Data data: naildata p-value = 0.5006

  9. Logistic regression

  10. Logistic regression mod.glm<-glm(Y.N~Length,binomial) anova(mod.glm,test=”Chi”) Df Deviance Resid. Df Resid. Dev P(>|Chi|) NULL 46 64.109 Length 1 1.855 45 62.254 0.173

  11. Regression Ash seeds

  12. Regression mod.reg<-lm(falltime~seedlength) summary(mod.reg) Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 1.55923 0.65656 2.375 0.024 * seedlength 0.02933 0.01905 1.540 0.13

  13. Critical values

  14. Risk of by chance only > 5 % 95% 70 60 50 40 Normal probability 30 2,5% 2,5% 20 10 0 Difference

  15. Risk by chance = 6.5 +6.5 = 13 % 70 60 50 40 Normal probability 30 6.5% 6.5% 20 10 0 Difference

  16. Regression mod.reg<-lm(falltime~seedlength) summary(mod.reg) Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 1.55923 0.65656 2.375 0.024 * seedlength 0.02933 0.01905 1.540 0.13

  17. Anova

  18. Anova mod.aov<-aov(twiglength~treeside) summary(mod.aov) Df Sum Sq Mean Sq F value Pr(>F) treeside 1 15.129 15.129 7.5222 0.009244 ** Residuals 38 76.427 2.011

  19. Anova

  20. Anova

  21. Confidence intervals • …shows how sure we are of a group mean. • The confidence interval will contain the ”true” mean in 95 % of the time. • The larger our sample size the more sure (= confident!) we are of our sample mean  the confidence interval decreases • And (of course…), the more variation within groups, the less sure we get  confidence interval increases

  22. Is there any difference? Seed size in plants

  23. p = 0,047

  24. Seed size

  25. p = 0,0047

  26. Flower size in Geranium

  27. p = 0,0037

  28. Flower size in Geranium

  29. p = 0,094

  30. Orchid seeds

  31. p = 0, 40

  32. Bat penises

  33. p < 0,0001

  34. bootstrap • to pull oneself up by one's bootstraps(idiomatic) • To begin an enterprise or recover from a setback without any outside help; to succeed only on one's own effort or abilities.We can't get a loan, so we'll just have to pull ourselves up by our bootstraps.

  35. shrimp-booting...

  36. Bootstrap for tests 120 80 No. boot-samples 60 40 20 0 -5 0 5 10 15 20 25 boot.difference

  37. Break?

  38. Anova mod.aov<-aov(twiglength~treeside) summary(mod.aov) Df Sum Sq Mean Sq F value Pr(>F) treeside 1 15.129 15.129 7.5222 0.009244 ** Residuals 38 76.427 2.011

  39. How does this F-test work? • The idea behind the F-test is to check whether the variation caused by the explanatory variable is larger than the random variation in each data point. • In this case the variation caused by the explanatory variable is the variation among groups. • The random variation is simply the variation that is NOT explained by the explanatory variable.

  40. How does this F-test work? • Under the hood. • A frog example. female male

  41. We measure 4 + 4 frogs 10 15 12 14 8 11 9 7

  42. Frog stripchart

  43. Frog stripchart

  44. Group means

  45. Explained by sex

  46. Random variation = residuals

  47. Is a significant amount explained by sex?

  48. The F-tests uses Sums of squares Sums of squares explained by sex= (group mean – total mean)2 for all points= the squares of the blue lines male - total = -2; female - total = 2 22 + 22 + 22 + 22 + 22 + 22 + 22 + 22 = 32

  49. The F-tests uses Sums of squares Sums of squares for resisduals= (y value - group mean)2 for all points= the squares of the red lines -0.752 + 2.252 + 0.252 + -1.752 -2.752 + 2.252 + -0.752 + 1.252 = 23.5

  50. Get the ”avarage” variation explained by sex and residuals Divide by degress of freedom Sex  # groups -1 [  32/1 = 32] Resdiduals  # frogs – sex df -1[  23.5 / 6 =3.9 ] The F-ratio is then 32/3.9 = 8.2

More Related