
BRIEF REVIEW OF STATISTICAL CONCEPTS AND METHODS



  1. BRIEF REVIEW OF STATISTICAL CONCEPTS AND METHODS

  2. Mathematical expectation The mean (x̄) of random variable x is x̄ = (Σ xi)/n, where n is the number of observations. The variance (s^2) is s^2 = Σ(xi - x̄)^2 / (n - 1).

  3. Mathematical expectation The standard deviation (s) is s = √(s^2). The coefficient of variation is CV = s/x̄ (often expressed as a percentage).
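A minimal Python sketch of these four summary statistics; the observations here are hypothetical:

```python
import math

x = [12.0, 15.0, 11.0, 14.0, 13.0]  # hypothetical observations
n = len(x)

mean = sum(x) / n                                        # x-bar = (sum of xi)/n
variance = sum((xi - mean) ** 2 for xi in x) / (n - 1)   # s^2, sample variance
sd = math.sqrt(variance)                                 # s = sqrt(s^2)
cv = sd / mean                                           # coefficient of variation

print(mean, variance, sd, cv)
```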

  4. Precision, bias, and accuracy

  5. Basic probability The probability of an event occurring is expressed as P(event). The probability of the event not occurring is 1 - P(event), or P(~event). If events are independent, the probability of events A and B both occurring is estimated as p(A) * p(B).

  6. Probability example The probability of catching a fish during a single sampling occasion is p(capture). The probability of catching it on all three sampling occasions is p(capture)*p(capture)*p(capture) = p(capture)^3, the probability of not catching it during any of the 3 occasions is (1 - p(capture))*(1 - p(capture))*(1 - p(capture)) = (1 - p(capture))^3, and the probability of catching it on at least 1 occasion is the complement of not catching it during any of the occasions: 1 - (1 - p(capture))^3.
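The same arithmetic as a Python sketch; the per-occasion capture probability below is hypothetical:

```python
p = 0.3                                  # hypothetical per-occasion capture probability

p_all_three = p ** 3                     # caught on all 3 occasions
p_never = (1 - p) ** 3                   # never caught in 3 occasions
p_at_least_once = 1 - (1 - p) ** 3       # complement of never being caught

print(p_all_three, p_never, p_at_least_once)  # 0.027, 0.343, 0.657
```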

  7. Models and fisheries management: “true” models
  • Fundamental assumption: there is no “true” model that generates biological data
  • Truth in biological sciences has essentially infinite dimension; hence, full reality cannot be revealed with finite samples
  • Biological systems are complex, with many small effects, interactions, individual heterogeneity, and environmental covariates
  • Greater amounts of data are required to model smaller effects
  • Thus all models are approximations of reality

  8. Models = hypotheses
  • Hypotheses are unproven theories: suppositions that are tentatively accepted to explain facts or as the basis for further investigation
  • Models are very explicit representations of hypotheses
  • Several models can represent a single hypothesis
  • Models are tools for evaluating hypotheses

  9. Models and hypotheses: example Hypothesis: shoal bass reproductive success is greater when there are more reproductively active adults
  Y = aN: the number of young is proportional to the number of adults
  Y = aN/(1+bN): the number of young increases with the number of adults until nesting areas are saturated
  Y = aN*e^(-bN): the number of young increases until the carrying capacity of nesting and rearing areas is reached

  10. [Figure: number of YOY plotted against number of shoal bass adults under the three candidate models Y = aN, Y = aN/(1+bN), and Y = aN*e^(-bN)]

  11. Tapering effect sizes
  • In biological systems there are often large, important effects, followed by smaller effects, and then yet smaller effects
  • These effects might be sequentially revealed as sample size increases, because information content increases
  • Rare events are yet more difficult to study (e.g., fire, flood, volcanism)
  [Figure: effect sizes tapering from big effects to small effects]

  12. Model selection
  • Determine what is the best explanation given the data
  • Determine what is the best model for predicting the response
  • Two approaches in fisheries/ecology: null hypothesis testing and information-theoretic approaches

  13. Null hypothesis testing
  • Develop an a priori hypothesis
  • Deduce testable predictions (i.e., models)
  • Carry out a suitable test (experiment)
  • Compare test results with predictions
  • Retain or reject the hypothesis

  14. Hypothesis testing example: density independence in lake sturgeon populations
  Hypothesis: lake sturgeon reproduction is density independent
  Prediction: there is no relation between adult density and age-0 density
  Model: Y = B0
  Test: measure age-0 density for various adult densities over time
  Compare: linear regression between age-0 and adult sturgeon densities gives P value = 0.1839; using a critical α-level = 0.05, we conclude there is no significant relationship
  Result: retain the hypothesis that lake sturgeon reproduction is density independent
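A sketch of this kind of test in Python using scipy.stats.linregress; the densities below are hypothetical stand-ins, so they will not reproduce the slide's P value of 0.1839:

```python
from scipy.stats import linregress

adult_density = [2.1, 3.4, 1.8, 4.0, 2.9, 3.6, 2.5]   # hypothetical adult densities
age0_density  = [0.4, 0.9, 0.3, 0.7, 0.8, 0.5, 0.6]   # hypothetical age-0 densities

fit = linregress(adult_density, age0_density)          # slope test vs. H0: slope = 0
print(fit.slope, fit.pvalue)

# Retain H0 (no relationship) when the P value exceeds the critical alpha-level.
alpha = 0.05
print("retain H0" if fit.pvalue > alpha else "reject H0")
```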

  15. Model selection based on P values
  • No theoretical basis for model selection
  • P values ~ precision of the estimate
  • P values are strongly dependent on sample size
  • P(the data (or more extreme data) | model) vs. L(model | the data)
  JUST SAY NO TO STATISTICAL SIGNIFICANCE TESTING

  16. If you really need a P value… MARK implements likelihood ratio tests (nested models only), e.g.:
  Full model: S = f(temperature, flow)
  Nested model: S = f(flow)
  H0: survival related to flow
  Ha: survival related to temperature and flow
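A minimal sketch of the likelihood ratio test itself in Python; the log likelihoods and parameter counts here are hypothetical, not MARK output:

```python
from scipy.stats import chi2

lnL_full, K_full = -120.5, 3      # e.g., S = f(temperature, flow); hypothetical fit
lnL_nested, K_nested = -123.8, 2  # e.g., S = f(flow); hypothetical fit

lr_stat = -2 * (lnL_nested - lnL_full)   # likelihood ratio statistic
df = K_full - K_nested                   # difference in number of parameters
p_value = chi2.sf(lr_stat, df)           # upper-tail chi-square probability

print(lr_stat, p_value)
```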

  17. Information theory If full reality cannot be included in a model, how do we tell how close we are to truth? The Kullback-Leibler distance, based on information theory, measures how much information is accounted for in a model. Entropy is synonymous with uncertainty.

  18. Information theory K-L distance (information) is represented by I(truth | model). It represents the information lost when the candidate model is used to approximate truth; thus SMALL values mean better fit. AIC is based on the concept of minimizing K-L distance. Akaike noticed that the maximum log likelihood, log(L(model or parameter estimate | the data)), was related to K-L distance.

  19. What is a maximum likelihood estimate? It is the set of parameter values that maximizes the value of the likelihood, given the data. Sums of squares in regression are also a measure of the relative fit of a model: SSE = Σ(deviations)^2
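A minimal sketch of "the parameter values that maximize the likelihood" via grid search, assuming a normal model with fixed spread; the data are hypothetical:

```python
import numpy as np
from scipy.stats import norm

data = np.array([4.8, 5.6, 5.1, 4.9, 5.4])   # hypothetical observations
candidates = np.linspace(3, 7, 401)          # candidate values for the mean

# log L(mu | data) for each candidate mu (sigma fixed at 1 for simplicity)
log_lik = [norm.logpdf(data, loc=mu, scale=1.0).sum() for mu in candidates]

mle = candidates[np.argmax(log_lik)]         # value that maximizes the likelihood
print(mle)                                   # lands at (about) the sample mean
```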

  20. The maximum log likelihood (and SSE) is a biased estimate of K-L distance. Akaike's contribution was that he showed that AIC = -2*ln(L(model | the data)) + 2*K. It is based on the principle of parsimony. [Figure: bias^2 vs. variance trade-off as the number of parameters increases from few to many] Heuristic interpretation: in AIC = -2*ln(likelihood) + 2*K, the first term measures model lack of fit and the second is a penalty for increasing model size (it enforces parsimony).

  21. AIC: small-sample bias adjustment If the ratio n/K is < 40, then use AICc: AICc = -2*ln(likelihood | data) + 2*K + (2*K*(K+1))/(n-K-1) As n gets big, the correction term (2*K*(K+1))/(n-K-1) has a fixed numerator over an ever larger denominator, so it goes to 0 and AICc = AIC.
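Both formulas translate directly into Python; the fit values passed in below are hypothetical:

```python
def aic(ln_likelihood, K):
    """AIC = -2*ln(L(model | data)) + 2*K."""
    return -2 * ln_likelihood + 2 * K

def aicc(ln_likelihood, K, n):
    """Small-sample form; use when n/K < 40."""
    return aic(ln_likelihood, K) + (2 * K * (K + 1)) / (n - K - 1)

# As n grows, the correction term shrinks and AICc approaches AIC.
print(aic(-120.5, 3), aicc(-120.5, 3, 25), aicc(-120.5, 3, 10000))
```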

  22. Model selection with AIC What is model selection? AIC by itself is relatively meaningless. Recall that we find the best model by comparing various models and examining their relative distance to the “truth”. We do this by calculating the difference between the best-fitting model (lowest AIC) and the other models. Model selection uncertainty: which model is the best? What if you collect data at the same spot next year, next week, or next door? AIC weights: long-run interpretation vs. Bayesian. A confidence set of models is analogous to confidence intervals.

  23. Where do we get AIC? From the fitted model output: K (the number of parameters) and -2ln(L(model | the data)). [Figure: annotated model output]

  24. Interpreting AIC Best model: the one with the lowest AICc. The difference between each model's AICc and the lowest AICc is that model's relative distance from truth.

  25. Interpreting AIC The AICc weight ranges from 0 to 1, with 1 = best model. It is interpreted as the relative likelihood that a model is the best, given the data and the other models in the set.

  26. Interpreting AIC The ratio of 2 weights is interpreted as the strength of evidence for one model over another. Here the best model is 0.86748/0.13056 = 6.64 times more likely to be the best model for estimating striped bass population size.
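A sketch that computes the AICc differences, Akaike weights, and an evidence ratio for a hypothetical three-model set; these AICc values are illustrative, not the striped bass results:

```python
import math

aicc_values = {"model_A": 210.4, "model_B": 214.2, "model_C": 219.8}  # hypothetical

best = min(aicc_values.values())
delta = {m: v - best for m, v in aicc_values.items()}     # distance from best model

raw = {m: math.exp(-d / 2) for m, d in delta.items()}
total = sum(raw.values())
weights = {m: r / total for m, r in raw.items()}          # sum to 1; largest = best

# Evidence ratio: how many times more likely model_A is to be best than model_B
print(weights["model_A"] / weights["model_B"])
```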

  27. Confidence model set Analogous to a confidence interval for a parameter estimate. Using a 1/8 (0.125) rule for weight of evidence, the confidence set includes the top two models (both model likelihoods > 0.125).

  28. Linear models review Y: response variable (dependent variable); X: predictor variable (independent variable). Y = b0 + b1*X + e, where b0 is the intercept, b1 is the slope (parameter) associated with X, and e is the residual error.

  29. Linear models review When Y is a probability, it is bounded by 0 and 1. Y = b0 + b1*X can produce values < 0 and > 1, so we need to transform or use a link function. For probabilities, the logit link is the most useful.

  30. Logit link h = ln(p / (1 - p)), where h is the log odds and p is the probability of an event.

  31. Log-linear models (logistic regression) h = b0 + b1*X, where h is the log odds, b0 is the intercept, and b1 is the slope (parameter) associated with X. The betas are on a logit scale, and the log odds need to be back-transformed.

  32. Back transformation: inverse logit link p = 1 / (1 + exp(-h)), where h is the log odds and p is the probability of an event.

  33. Back transformation example h = b0 + b1*X, with b0 = -2.5, b1 = 0.5, and X = 2.

  34. Back transformation example h = -2.5 + 0.5*2 = -1.5, so p = 1 / (1 + exp(1.5)) = 0.18, or 18%.
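The same back-transformation as a small Python sketch:

```python
import math

def inv_logit(eta):
    return 1 / (1 + math.exp(-eta))   # p = 1 / (1 + exp(-h))

eta = -2.5 + 0.5 * 2                  # b0 + b1*X = -1.5
print(inv_logit(eta))                 # 0.1824... ~ 0.18, or 18%
```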

  35. Interpreting beta estimates Betas are on a logit scale; to interpret them, calculate odds ratios using the exponential function: for b1 = 0.5, exp(0.5) = 1.65. Interpretation: for each 1-unit increase in X, the odds of the event occurring are 1.65 times greater. For example, for each 1-inch increase in length, a fish is 1.65 times more likely (in odds) to be caught.
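And the odds-ratio calculation itself, using the slide's b1:

```python
import math

b1 = 0.5
odds_ratio = math.exp(b1)    # exp(0.5) = 1.65
print(odds_ratio)            # each 1-unit increase in X multiplies the odds by ~1.65
```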

  36. Goodness-of-fit: ĉ (c-hat)

  37. Goodness-of-fit [Figure: MARK output]

  38. Overdispersion
  • Extra variability: missing covariates; heterogeneity in S, p, etc.
  Possible solutions:
  • Include additional covariates
  • Heterogeneity models
  • c-hat adjustment in MARK: quasi-AIC (QAIC) and adjusted variances and confidence intervals
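A sketch of the small-sample quasi-AIC adjustment (QAICc), following Burnham and Anderson's formulation; the log likelihood, sample size, and c-hat value below are hypothetical (c-hat would come from, e.g., MARK's bootstrap GOF):

```python
def qaicc(ln_likelihood, K, n, c_hat):
    # Deviance divided by c-hat, plus the usual parsimony penalty and the
    # small-sample correction; by convention K is often increased by 1 so
    # that c-hat itself counts as an estimated parameter.
    return (-2 * ln_likelihood) / c_hat + 2 * K + (2 * K * (K + 1)) / (n - K - 1)

print(qaicc(-120.5, K=4, n=25, c_hat=1.8))
```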

  39. Goodness of fit: bootstrap GOF, median c-hat, residual plots
