Presentation Transcript

1. General Linear Models; Generalized Linear Models • Hal Whitehead • BIOL4062/5062

2. Transformations • Analysis of Covariance • General Linear Models • Generalized Linear Models • Non-Linear Models

3. Common Transformations • Logarithmic: X’ = Log(X) (most common; morphometrics, allometry) • Square root: X’ = √X (counts, Poisson distributed); X’ = √(X+0.5) if counts include zeros • Arcsine-square root: X’ = arcsine(√X) (proportions, or percentages/100) • Box-Cox (general transformation)
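The three named transformations can be sketched in a few lines of Python. This is only an illustration: the function names are invented here, and the natural logarithm is an assumption, since the slide does not state a base.

```python
import math

def log_transform(x):
    """Logarithmic transform X' = log(X); natural log assumed, requires x > 0."""
    return math.log(x)

def sqrt_transform(count, has_zeros=False):
    """Square-root transform for counts; uses sqrt(X + 0.5) when the data include zeros."""
    return math.sqrt(count + 0.5) if has_zeros else math.sqrt(count)

def arcsine_sqrt_transform(proportion):
    """Arcsine-square-root transform X' = arcsine(sqrt(X)) for a proportion in [0, 1]."""
    return math.asin(math.sqrt(proportion))

# A percentage must be divided by 100 before the arcsine-square-root transform.
print(arcsine_sqrt_transform(25 / 100))
```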

4. Regression and ANOVA • Multiple regression: Y = β0 + β1·X1 + β2·X2 + β3·X3 + … + Error {X’s are continuous variables} • ANOVA: Y = γ0 + γ1 (Z1)+ γ2(Z2) + γ3(Z3) + … + Error {Z’s are categorical variables, defining groups}

5. Analysis of Covariance (mixture of ANOVA and regression) • Y = β0 + β1·X1 + β2·X2 + … + γ1(Z1) + γ2(Z2) + … + Error {X’s are continuous variables} {Z’s are categorical variables, defining groups} • Important assumption: Parallelism: β’s the same for all groups • Estimate β’s and γ’s using least squares

6. Analysis of Covariance • Data: • Catch rates of sperm whales (per whaling day) from logbooks of Yankee whalers off the Galapagos Islands, 1830-1850 • Questions: • Was there a significant change in catch rate over this period? • Was there a significant seasonal pattern?

7. Analysis of Covariance • Model: Catch (m,t) = β0 + β1·t + γ(m) + Error • t = 1830-1850 [continuous] • m = Jan-Feb, Mar-Apr, …, Nov-Dec [categorical]
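A least-squares fit of a model of this shape (one continuous predictor plus one categorical factor) might look like the sketch below. The data are made up for illustration and are not the whaling data; the helper names and the choice of the last level as the reference group (γ = 0) are assumptions. Small values of t are used because raw years such as 1830-1850 would make the normal equations poorly conditioned unless t were centred first.

```python
def solve_linear_system(a, b):
    """Solve a·x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def fit_ancova(t, group, y, levels):
    """Fit y = b0 + b1*t + gamma(group) by least squares (normal equations).

    The last entry of `levels` is the reference group, so its gamma is 0.
    Returns [b0, b1, gamma(levels[0]), gamma(levels[1]), ...].
    """
    rows = [[1.0, ti] + [1.0 if g == lev else 0.0 for lev in levels[:-1]]
            for ti, g in zip(t, group)]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)

# Made-up, noise-free data: y = 2 + 0.5*t, plus 1 for group "A".
t = [0, 1, 2, 3, 0, 1, 2, 3]
group = ["A", "A", "A", "A", "B", "B", "B", "B"]
y = [3.0, 3.5, 4.0, 4.5, 2.0, 2.5, 3.0, 3.5]
b0, b1, gamma_a = fit_ancova(t, group, y, levels=["A", "B"])
print(b0, b1, gamma_a)
```

Both groups share the single slope b1, which is the parallelism assumption; only the intercept shifts between groups.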

8. Analysis of Covariance • Model: Catch (m,t) = β0 + β1·t + γ(m) + Error • Parameter estimates:
β0 = 4.528 [constant]
β1 = -0.002 [change/yr]
γ(Jan-Feb) = 0.016
γ(Mar-Apr) = 0.013
γ(May-Jun) = -0.038
γ(Jul-Aug) = -0.020
γ(Sep-Oct) = 0.000
γ(Nov-Dec) = 0.000

9. Analysis of Covariance • Model: Catch (m,t) = β0 + β1·t + γ(m) + Error • Analysis of Variance Table:
Source   SS      df   MS      F-ratio   P
YEAR     0.014    1   0.014   3.653     0.061
MONTH    0.034    5   0.007   1.782     0.131
Error    0.220   57   0.004
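The columns of the table are linked by simple arithmetic: each mean square is SS/df, and each F-ratio is that mean square divided by the error mean square. The sketch below recomputes them from the rounded values in the table, so the F-ratios come out close to, but not exactly equal to, the reported 3.653 and 1.782 (which were presumably computed from unrounded sums of squares). The function name is illustrative.

```python
def anova_row(ss, df, ms_error):
    """Return (mean square, F-ratio) for one source of variation."""
    ms = ss / df
    return ms, ms / ms_error

ms_error = 0.220 / 57                      # error mean square
year_ms, year_f = anova_row(0.014, 1, ms_error)
month_ms, month_f = anova_row(0.034, 5, ms_error)

print(round(ms_error, 3), round(year_ms, 3), round(month_ms, 3))
print(year_f, month_f)
```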

10. Analysis of Covariance • Durbin-Watson D Statistic: 1.923 • First Order Autocorrelation: 0.034
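The Durbin-Watson statistic is the sum of squared differences between successive residuals divided by the sum of squared residuals. Values near 2 indicate little first-order autocorrelation. A common approximation is D ≈ 2(1 − r), where r is the first-order autocorrelation; with r = 0.034 this gives about 1.93, in line with the reported 1.923. The sketch below uses a made-up residual series, not the whaling residuals.

```python
def durbin_watson(residuals):
    """Durbin-Watson D: sum of squared successive differences over sum of squares."""
    numerator = sum((residuals[i] - residuals[i - 1]) ** 2
                    for i in range(1, len(residuals)))
    denominator = sum(e * e for e in residuals)
    return numerator / denominator

# Made-up residuals that alternate in sign (strong negative autocorrelation).
d_alternating = durbin_watson([1.0, -1.0, 1.0, -1.0])

# Approximation D ≈ 2(1 - r) applied to the autocorrelation reported on the slide.
d_approx = 2 * (1 - 0.034)
print(d_alternating, d_approx)
```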

11. General Linear Model: Analysis of Covariance plus Interactions • Y = β0 + β1·X1 + β2·X2 + … + γ1(Z1) + γ2(Z2) + … + β12·X1·X2 + … + γ12(Z1, Z2) + … + α12(Z1)·X1 + … + Error {X’s are continuous variables} {Z’s are categorical variables, defining groups}
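In design-matrix terms, an interaction between a categorical and a continuous variable is just an extra column: the group indicator multiplied by the continuous value. The sketch below builds one row for a hypothetical model with a single continuous X1 and a single two-level factor Z1; the function name and the 0/1 coding are assumptions made for illustration.

```python
def design_row(x1, in_second_group):
    """Columns: intercept, X1, Z1 indicator, and the Z1·X1 interaction."""
    z1 = 1.0 if in_second_group else 0.0
    return [1.0, x1, z1, z1 * x1]

print(design_row(2.0, in_second_group=False))
print(design_row(2.0, in_second_group=True))
```

The coefficient on the last column lets the slope on X1 differ between the two groups, which relaxes the parallelism assumption of plain analysis of covariance.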

12. Characteristics of General Linear Models • The response Y has a normal distribution with mean vector μ and variance σ². • A coefficient vector (b=[β’s, γ’s, α’s]) defines a linear combination of the predictors (X’s). • The model equates the two as: μ = X·b

13. General Linear Models • Coefficients (β’s, γ’s, α’s), and fit of model (σ² or r²) estimated using least squares • Subsets of predictor variables may be selected using stepwise methods, etc. • Beware: • Collinearity • Empty or nearly-empty cells (combinations of categorical variables with few units)

14. General Linear Model • Data: • Movements of sperm whales (displacement per 12-hr) off Galapagos Islands with year, clan, and shit rate • Questions: • Are movements of sperm whales affected by year, clan, shit rate or combinations of them?

15. General Linear Model • Potential X variables: • Year (Categorical: 1987 and 1989) • Clan (Categorical: ‘Plus-one’ and ‘Regular’) • Shit-rate (Continuous, Arcsine-Square root transform) • Year*Clan • Year*Shit-rate • Clan*Shit-rate

16. General Linear Model • X variables selected by stepwise selection (P-to-enter = 0.15 / P-to-remove = 0.15):
Backward           Forward
Year               Year
Clan               Clan
Shit-rate          Shit-rate
Year*Clan          Year*Clan
Year*Shit-rate     Year*Shit-rate
Clan*Shit-rate     Clan*Shit-rate

17. General Linear Model • Backward: Y = c + Clan + Year*Clan • Forward: Y = c + Shit-rate*Clan

18. General Linear Model: Why two “best models”? • Backward: Y = c + Clan + Year*Clan • Forward: Y = c + Shit-rate*Clan • [Figure panels: 1987, 1989]

19. General Linear Model: Which is “best”? • Backward: Y = c + Clan + Year*Clan • Forward: Y = c + Shit-rate*Clan • [Figure panels: 1987, 1989] • r²=0.347, 1 d.f. • r²=0.264, 2 d.f.

20. General Linear Models • The response Y has a normal distribution with mean vector μ and variance σ². • A coefficient vector (b=[β’s, γ’s, α’s]) defines a linear combination of the predictors (X’s). • The model equates the two as: μ = X·b

21. Generalized Linear Models • The response Y has a distribution that may be normal, binomial, Poisson, gamma, or inverse Gaussian, with parameters including a mean μ. • A coefficient vector (b=[β’s, γ’s, α’s]) defines a linear combination of the predictors (X’s). • A link function f defines the link between the two as: f(μ) = X·b
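A link function and its inverse can be written down directly. The sketch below shows three widely used links; the pairings in the comments (identity with normal, log with Poisson, logit with binomial) are the usual canonical choices rather than something stated on the slide, and the dictionary layout is just one convenient way to organise them.

```python
import math

# Each entry maps a link name to (f, f_inverse), where f(mu) = X·b.
LINKS = {
    "identity": (lambda mu: mu, lambda eta: eta),          # usual for normal
    "log": (math.log, math.exp),                           # usual for Poisson
    "logit": (lambda mu: math.log(mu / (1 - mu)),          # usual for binomial
              lambda eta: 1 / (1 + math.exp(-eta))),
}

def predict_mean(link_name, linear_predictor):
    """Convert a linear predictor X·b back to the mean mu via the inverse link."""
    _, inverse = LINKS[link_name]
    return inverse(linear_predictor)

print(predict_mean("logit", 0.0))   # a linear predictor of 0 maps to a mean of 0.5
```

With the identity link and a normal response, this reduces to the general linear model μ = X·b.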

22. Generalized Linear Models • Examine assumptions using residuals • Examine fit using “deviance”: • a generalization of the residual sum of squares • twice the difference between the log-likelihoods of the full model and the model in question • fits of different models can be compared • Related to AIC
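For a binomial response, the deviance can be computed directly from the observed counts and the fitted probabilities. The sketch below assumes the "full model" is the saturated model that reproduces each observed proportion exactly, which is the usual definition; the function name and the example numbers are illustrative.

```python
import math

def binomial_deviance(successes, trials, fitted_p):
    """Deviance = 2 * (log-likelihood of saturated model - log-likelihood of fitted model).

    Terms with a zero count contribute nothing (0 * log 0 is taken as 0).
    """
    total = 0.0
    for y, n, p in zip(successes, trials, fitted_p):
        if y > 0:
            total += y * math.log(y / (n * p))
        if y < n:
            total += (n - y) * math.log((n - y) / (n * (1 - p)))
    return 2 * total

# A fit that matches the observed proportion exactly has zero deviance.
print(binomial_deviance([5], [10], [0.5]))
# A poorer fit to made-up counts has positive deviance.
print(binomial_deviance([2, 8], [10, 10], [0.5, 0.5]))
```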

23. Generalized Linear Models: can fit non-linear relationships using ‘link functions’ and can consider non-normal errors • MATLAB: glmdemo

24. Proportion of sexually-mature animals at different weights MATLAB: glmdemo

25. Two problems with linear regression: 1) probabilities <0 and >1 2) clearly non-linear • MATLAB: glmdemo

26. Polynomial Regression better, but also: 1) probabilities <0 and >1 2) inflections are not real • MATLAB: glmdemo

27. Instead fit “logistic regression” using generalized linear model and binomial distribution • Y = 1/(1 + e^(β0+β1·X)) • MATLAB: glmdemo
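A logistic regression with one predictor can be fitted by Newton-Raphson on the binomial log-likelihood, which is one standard way such models are fitted. The sketch below is not the MATLAB demo: the data are made up, the function name is invented, and the code uses the usual logit convention p = 1/(1 + e^−(b0 + b1·x)). With the formula as written on the slide, the coefficients would simply have the opposite sign.

```python
import math

def fit_logistic(x, successes, trials, iterations=25):
    """Fit p = 1 / (1 + exp(-(b0 + b1*x))) to binomial counts by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi, ni in zip(x, successes, trials):
            p = 1 / (1 + math.exp(-(b0 + b1 * xi)))
            w = ni * p * (1 - p)          # binomial variance weight
            r = yi - ni * p               # observed minus expected count
            g0 += r
            g1 += r * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system H * step = gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Made-up data: number mature out of 10 animals in each of six weight classes.
weight_class = [1, 2, 3, 4, 5, 6]
mature = [1, 2, 4, 6, 8, 9]
sampled = [10, 10, 10, 10, 10, 10]
b0, b1 = fit_logistic(weight_class, mature, sampled)
fitted = [1 / (1 + math.exp(-(b0 + b1 * w))) for w in weight_class]
print(b0, b1)
```

Unlike a straight line or polynomial, the fitted values always lie strictly between 0 and 1.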

28. Compare two generalized linear models • Y = 1/(1 + e^(β0+β1·X)) • Y = 1/(1 + e^(β0+β1·X+β2·X·X)) • Difference in deviance = 0.70; P = 0.40 • MATLAB: glmdemo
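The P-value follows from comparing the difference in deviance with a chi-square distribution whose degrees of freedom equal the number of extra parameters, here 1 (the β2 term). For 1 degree of freedom the upper-tail probability has a closed form in terms of the complementary error function, so the reported value can be checked with the standard library alone. The function name is illustrative.

```python
import math

def chi_square_upper_tail_1df(x):
    """P(chi-square with 1 d.f. > x), using P(|Z| > sqrt(x)) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(x / 2))

p_value = chi_square_upper_tail_1df(0.70)
print(round(p_value, 2))
```

A P-value this large gives no evidence that the quadratic term improves the fit, so the simpler model would be retained.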

29. Examine assumptions using residuals MATLAB: glmdemo

30. Making predictions: MATLAB: glmdemo

31. Non-linear models, e.g. Y = c + EXP(β0 + β1·X) + E; Y = β0 + β1·X·[X>XK] + E • More general than generalized linear models • But harder to fit: • iterative process • may not converge • non-unique solution • harder to compare
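The threshold model Y = β0 + β1·X·[X>XK] + E shows why such fits are harder. For a fixed XK the model is linear in β0 and β1, but XK itself has to be searched for. The sketch below uses a simple grid search over candidate thresholds, which is one possible approach and not necessarily the one used in the course; the data are made up and the function names are invented.

```python
def fit_line(u, y):
    """Ordinary least squares for y = b0 + b1*u; returns (b0, b1, residual SS)."""
    n = len(u)
    mean_u, mean_y = sum(u) / n, sum(y) / n
    sxx = sum((ui - mean_u) ** 2 for ui in u)
    b1 = sum((ui - mean_u) * (yi - mean_y) for ui, yi in zip(u, y)) / sxx
    b0 = mean_y - b1 * mean_u
    rss = sum((yi - b0 - b1 * ui) ** 2 for ui, yi in zip(u, y))
    return b0, b1, rss

def fit_threshold(x, y, candidates):
    """Grid search over XK; for each candidate, fit b0 and b1 by least squares."""
    best = None
    for xk in candidates:
        u = [xi if xi > xk else 0.0 for xi in x]   # the regressor X·[X > XK]
        if len(set(u)) < 2:
            continue                               # no variation, cannot fit a slope
        b0, b1, rss = fit_line(u, y)
        if best is None or rss < best[3]:
            best = (xk, b0, b1, rss)
    return best

# Made-up, noise-free data generated with b0 = 1, b1 = 2 and a threshold between 4 and 5.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1, 1, 1, 1, 11, 13, 15, 17]
xk, b0, b1, rss = fit_threshold(x, y, candidates=[1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5])
print(xk, b0, b1)
```

Any threshold between 4 and 5 fits these data equally well, a small instance of the non-unique solutions mentioned on the slide.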

32. Summary: Methods with One Dependent Variable (in order of increasing complexity) • Simple Linear Regression • One-way ANOVA • Multiple Linear Regression • Multi-way ANOVA • Analysis of Covariance • General Linear Model • Generalized Linear Model • Non-Linear Model