
GENERAL LINEAR MODELS

GENERAL LINEAR MODELS. Oneway ANOVA, GLM Univariate (n-way ANOVA, ANCOVA). The dependent variable is continuous; independent variables are nominal, categorical (factor, CLASS) or continuous (covariate).

otylia

Presentation Transcript


  1. GENERAL LINEAR MODELS: Oneway ANOVA, GLM Univariate (n-way ANOVA, ANCOVA)

  2. BASICS • The dependent variable is continuous • Independent variables are nominal, categorical (factor, CLASS) or continuous (covariate) • Question: are the means of the dependent variable different across the groups defined by the independent variables? • Main effects, interactions and nested effects • Often used for testing hypotheses with experimental data

  3. BASICS • 3 x 2 full factorial design (full: each cell has observations) • Balanced design: each cell has an equal number of observations
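
The two definitions on this slide can be checked mechanically by counting observations per cell. The following is a minimal Python sketch, not part of the original deck; the factor names (firm size, industry) and the data are hypothetical, and `describe_design` is a helper invented for illustration.

```python
from collections import Counter
from itertools import product


def describe_design(rows, levels_a, levels_b):
    """Summarize a two-factor design from (level_a, level_b) pairs, one per observation."""
    counts = Counter(rows)
    cell_sizes = [counts[cell] for cell in product(levels_a, levels_b)]
    full = all(n > 0 for n in cell_sizes)          # full: no empty cells
    balanced = full and len(set(cell_sizes)) == 1  # balanced: same n in every cell
    return {"cells": len(cell_sizes), "full": full, "balanced": balanced}


# Hypothetical 3 x 2 design: firm size by industry, 4 observations per cell.
sizes = ["small", "medium", "large"]
industries = ["manufacturing", "trade"]
observations = [(s, i) for s in sizes for i in industries for _ in range(4)]

balanced_design = describe_design(observations, sizes, industries)
unbalanced_design = describe_design(observations[1:], sizes, industries)  # drop one row
print(balanced_design)
print(unbalanced_design)
```

Dropping a single observation leaves the design full (all six cells are still populated) but no longer balanced.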

  4. ASSUMPTIONS • Enough observations in each group (n > 20) • Independence of observations • Similarity of variance-covariance matrices (no problem if the largest group variance < 1.5 x the smallest group variance, or < 4 x in a balanced design) • Normality • Linearity • No outlier observations
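
The variance rule of thumb on this slide is simple arithmetic, so it may help to see it spelled out. This is a rough Python sketch under the assumption that "balanced" means equal group sizes; the helper name `variance_ratio_check` and the data are made up for illustration.

```python
from statistics import variance


def variance_ratio_check(groups):
    """Apply the slide's rule of thumb to a list of groups (each a list of numbers)."""
    variances = [variance(g) for g in groups]      # sample variances, n - 1 denominator
    ratio = max(variances) / min(variances)
    balanced = len({len(g) for g in groups}) == 1  # same n in every group
    limit = 4.0 if balanced else 1.5
    return ratio, limit, ratio < limit


# Hypothetical data: three groups of five observations each.
groups = [[1, 2, 3, 4, 5], [2, 3, 5, 6, 9], [4, 5, 6, 7, 8]]
ratio, limit, ok = variance_ratio_check(groups)
print(f"ratio={ratio:.2f}, limit={limit}, ok={ok}")
```

Here the group variances are 2.5, 7.5 and 2.5, so the ratio is 3 and passes the balanced-design limit of 4; with unequal group sizes the stricter limit of 1.5 would apply.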

  5. STEPS OF INTERPRETATION • Model significance? F-test and R square; use the Welch test if group variances are unequal (this can be tested using the Levene or Brown-Forsythe test) • Significance of effects? F-test and partial eta squared • Which group differences are significant? Post hoc or contrast tests • What are the group differences like? Estimated marginal means for groups
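
For readers who want to see what the Welch alternative computes, here is a minimal Python sketch of the usual textbook form of Welch's one-way test, which weights each group by n_i / s_i^2. It is an illustration with hypothetical data, not the deck's SAS output, and it stops at the statistic and its degrees of freedom (the p-value needs an F-distribution routine that the standard library does not provide).

```python
from statistics import mean, variance


def welch_anova(groups):
    """Welch's F statistic and its degrees of freedom for k independent groups."""
    k = len(groups)
    n = [len(g) for g in groups]
    means = [mean(g) for g in groups]
    weights = [n_i / variance(g) for n_i, g in zip(n, groups)]  # w_i = n_i / s_i^2
    total_w = sum(weights)
    weighted_mean = sum(w * m for w, m in zip(weights, means)) / total_w

    numerator = sum(w * (m - weighted_mean) ** 2 for w, m in zip(weights, means)) / (k - 1)
    lam = sum((1 - w / total_w) ** 2 / (n_i - 1) for w, n_i in zip(weights, n))
    denominator = 1 + 2 * (k - 2) / (k ** 2 - 1) * lam

    f_stat = numerator / denominator
    df1 = k - 1
    df2 = (k ** 2 - 1) / (3 * lam)  # usually not an integer
    return f_stat, df1, df2


# Hypothetical groups with visibly different spreads.
f_stat, df1, df2 = welch_anova([[4, 5, 6], [3, 5, 7], [1, 5, 9, 13]])
print(f_stat, df1, df2)
```

Unlike the ordinary F-test, the denominator degrees of freedom are estimated from the data, which is why they are generally fractional.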

  6. Oneway ANOVA • A continuous dependent variable (y) and one categorical independent variable (x) with at least 3 categories; k = number of categories • Assumptions: y is normally distributed with equal variance in each x category • H0: the mean of y is the same in all x categories • The variance of y is divided into two components: within groups (error) and between groups (model, treatment) • Test statistic = between mean square / within mean square, which follows the F-distribution with k-1 and n-k degrees of freedom • The F-test can be replaced by the Welch test if the variances are unequal
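
The variance decomposition and the test statistic described on this slide can be written out in a few lines. The following Python sketch uses made-up data and a hypothetical helper `oneway_anova`; it computes the sums of squares and F only, leaving the p-value to statistical software.

```python
from statistics import mean


def oneway_anova(groups):
    """Return the sums of squares, degrees of freedom and F for a one-way ANOVA."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)  # model
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)          # error
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return {"ss_between": ss_between, "ss_within": ss_within,
            "df": (df_between, df_within), "F": f_stat}


# Hypothetical data: three groups with means 5, 7 and 9.
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
result = oneway_anova(groups)
print(result)
```

For these data the between-groups sum of squares is 24 and the within-groups sum of squares is 6, giving F = 12 with (2, 6) degrees of freedom.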

  7. Oneway ANOVA • If the F-test is significant, you can use post hoc tests for pairwise comparison of means across the groups • Alternatively (in experiments) you can define contrasts ex ante

  8. SAS: oneway ANOVA

  9. SAS: oneway ANOVA • Welch: use this instead of F if variances are not equal • BF (Brown-Forsythe) or Levene, H0: group variances are equal
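
Both homogeneity tests named on this slide can be understood as a one-way ANOVA on deviations from each group's center. Below is a minimal Python sketch of the absolute-deviation form, with the median as center for Brown-Forsythe and the mean for Levene. The data and helper names are hypothetical, and implementations differ on details such as absolute versus squared deviations, so this is not meant to reproduce any particular package's output.

```python
from statistics import mean, median


def anova_f(groups):
    """F statistic of a one-way ANOVA."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = mean(x for g in groups for x in g)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def homogeneity_f(groups, center=median):
    """One-way ANOVA F computed on absolute deviations from each group's center."""
    deviations = [[abs(x - center(g)) for x in g] for g in groups]
    return anova_f(deviations)


# Hypothetical groups with the same center but increasing spread.
groups = [[4, 5, 6], [3, 5, 7], [1, 5, 9]]
bf_stat = homogeneity_f(groups, center=median)   # Brown-Forsythe
levene_stat = homogeneity_f(groups, center=mean) # Levene (absolute deviations)
print(bf_stat, levene_stat)
```

A large F on the deviations is evidence against H0 of equal group variances. In this toy data the group means equal the group medians, so the two variants coincide.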

  10. SAS: oneway ANOVA • Post hoc tests

  11. SAS: oneway ANOVA

  12. SAS: oneway ANOVA

  13. MODEL FIT

  14. EQUALITY OF VARIANCES

  15. GROUP MEANS

  16. POST HOC TEST

  17. BOXPLOTS

  18. Multiway ANOVA, GLM • A continuous dependent variable y, two or more categorical independent variables (factorial design) • ANCOVA, if there are continuous independents (covariates) • Main effects and interaction effects can be modeled • Fixed factor, if all groups are present, and random factor, if only some groups are randomly represented in the data • Eta squared = SSK/SST (effect sum of squares / total sum of squares) expresses what % of the variance in y is explained by x (not in EG! Use SAS code: model y = x1 x2 / ss3 EFFECTSIZE;)
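
To make the eta squared formula concrete, here is a small Python sketch for a two-factor design. It assumes a balanced design, where the sums of squares for A, B and A*B add up cleanly, and it also reports partial eta squared (effect SS / (effect SS + error SS)). The data, factor labels and function name are hypothetical.

```python
from statistics import mean


def eta_squared_balanced(cells):
    """cells maps (level_a, level_b) to a list of y values; equal cell sizes assumed."""
    all_y = [y for ys in cells.values() for y in ys]
    grand = mean(all_y)
    levels_a = sorted({a for a, _ in cells})
    levels_b = sorted({b for _, b in cells})

    def marginal_ss(axis, levels):
        ss = 0.0
        for level in levels:
            ys = [y for key, vals in cells.items() if key[axis] == level for y in vals]
            ss += len(ys) * (mean(ys) - grand) ** 2
        return ss

    ss_a = marginal_ss(0, levels_a)
    ss_b = marginal_ss(1, levels_b)
    ss_cells = sum(len(ys) * (mean(ys) - grand) ** 2 for ys in cells.values())
    ss_ab = ss_cells - ss_a - ss_b  # this subtraction is valid only for balanced data
    ss_total = sum((y - grand) ** 2 for y in all_y)
    ss_error = ss_total - ss_cells
    effects = {"A": ss_a, "B": ss_b, "A*B": ss_ab}
    eta = {name: ss / ss_total for name, ss in effects.items()}
    partial = {name: ss / (ss + ss_error) for name, ss in effects.items()}
    return eta, partial


# Hypothetical balanced 2 x 2 data, two observations per cell.
cells = {("a1", "b1"): [1, 3], ("a1", "b2"): [3, 5],
         ("a2", "b1"): [5, 7], ("a2", "b2"): [11, 13]}
eta, partial = eta_squared_balanced(cells)
print(eta)
print(partial)
```

In this example factor A accounts for 60 % of the total variance (eta squared 0.6), while its partial eta squared is 0.9 because the variance explained by B and A*B is removed from the denominator.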

  19. INTERACTION EFFECT • Synergy of two factors: the effect of one factor differs between the groups of the other factor • Crossing effect = interaction effect • Ordinal (lines in the means plot have different slopes but do not cross) • Disordinal (lines cross in the means plot)

  20. NO INTERACTION • Size and industry both have a significant main effect • No interaction, homogeneity of slopes

  21. INTERACTIONS • Ordinal interaction (the effect of size is stronger in manufacturing than in trade) • Disordinal interaction (the effect of size has a different sign in manufacturing and trade)
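
The distinction drawn on this slide can be expressed as a comparison of simple effects. The sketch below is a descriptive Python illustration only: it classifies a table of hypothetical cell means by the sign and size of the size effect within each industry, following the slide's wording, and it says nothing about statistical significance.

```python
def classify_interaction(cell_means, tol=1e-9):
    """cell_means[industry] = (mean for small firms, mean for large firms)."""
    # Simple effect of size within each industry.
    effects = [large - small for small, large in cell_means.values()]
    if max(effects) - min(effects) <= tol:
        return "no interaction"   # same effect everywhere: parallel lines
    if min(effects) < 0 < max(effects):
        return "disordinal"       # the effect changes sign
    return "ordinal"              # same direction, different strength


# Hypothetical cell means.
parallel = {"manufacturing": (2, 4), "trade": (3, 5)}
ordinal = {"manufacturing": (2, 6), "trade": (2, 4)}
disordinal = {"manufacturing": (2, 6), "trade": (5, 3)}

for name, table in [("parallel", parallel), ("ordinal", ordinal), ("disordinal", disordinal)]:
    print(name, "->", classify_interaction(table))
```

In practice the interaction term's F-test decides whether such differences in simple effects are larger than chance would produce.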

  22. NESTED EFFECTS • Nested effect B(A): "B nested within A" • size(industry): the effect of size is estimated separately for each industry group • The difference from an interaction effect is that the main effect of B (size) is not included • The slope of B (size) is different in each category of A (industry)

  23. ESTIMATED GROUP MEANS • Estimated marginal means or LS (least squares) means • Predicted group means are calculated using the estimated model coefficients • The effects of other independent variables are controlled for • Not necessarily equal to the observed group means in the sample
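
The difference between observed and estimated group means shows up clearly in an unbalanced design. The Python sketch below assumes a two-factor model with the interaction included and no covariates; in that special case the LS mean of a level is the unweighted average of its cell means, whereas the observed mean weights each cell by its size. Data and labels are hypothetical.

```python
from statistics import mean


def raw_and_ls_means(cells):
    """cells maps (level_a, level_b) to y values; returns raw and LS means per level of A."""
    raw, ls = {}, {}
    for a in sorted({a for a, _ in cells}):
        own_cells = [ys for (level_a, _), ys in cells.items() if level_a == a]
        raw[a] = mean(y for ys in own_cells for y in ys)  # weighted by cell size
        ls[a] = mean(mean(ys) for ys in own_cells)        # each cell counts equally
    return raw, ls


# Hypothetical unbalanced data: family firms are mostly in manufacturing,
# non-family firms mostly in trade, and trade has higher y overall.
cells = {
    ("family", "manufacturing"): [10, 10, 10, 10],
    ("family", "trade"): [20],
    ("nonfamily", "manufacturing"): [12],
    ("nonfamily", "trade"): [22, 22, 22, 22],
}
raw, ls = raw_and_ls_means(cells)
print("observed means:", raw)
print("LS means:      ", ls)
```

The observed means differ by 8 (12 versus 20), but most of that gap comes from the uneven industry mix; once industry is held equal, the LS means differ by only 2 (15 versus 17).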

  24. SUM OF SQUARES • Type I SS does not control for the effects of other independent variables that are specified later in the model • Type II SS controls for the effects of all other independents • Types III and IV SS are better in unbalanced designs; use IV if there are empty cells

  25. POST HOC TESTS • Multiple comparison procedures, mean separation tests • The idea is to control the inflated risk of Type I error that results from doing many pairwise tests, each at the 5% risk level • E.g. Bonferroni, Scheffe, Sidak,… • Tukey-Kramer is generally the most powerful of these for all pairwise comparisons • H0: equal group means -> rejection means that the group means are not equal, but failure to reject does not necessarily mean that they are equal (small sample size -> low power -> failure to reject the null)
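
The reason post hoc adjustments exist is easy to show numerically. This Python sketch computes how the familywise error rate grows with the number of pairwise tests and what per-test level the Bonferroni and Sidak corrections would use. The familywise figure assumes independent tests, which pairwise comparisons are not, so treat it as an approximation.

```python
from math import comb


def multiple_comparison_levels(k, alpha=0.05):
    """Per-test significance levels for all pairwise comparisons among k groups."""
    m = comb(k, 2)                         # number of pairwise comparisons
    familywise = 1 - (1 - alpha) ** m      # risk of at least one false positive
    bonferroni = alpha / m
    sidak = 1 - (1 - alpha) ** (1 / m)
    return m, familywise, bonferroni, sidak


m, familywise, bonferroni, sidak = multiple_comparison_levels(k=4)
print(f"{m} tests, familywise risk {familywise:.3f}")
print(f"Bonferroni level {bonferroni:.5f}, Sidak level {sidak:.5f}")
```

With four groups there are six pairwise tests, and testing each at 5% gives roughly a 26% chance of at least one false positive, which is why each test is run at a stricter level of about 0.008.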

  26. ANCOVA • The model includes a covariate (= continuous independent variable, often one whose effect you want to control for) • Regress y on the covariate -> then ANOVA with factors explaining the residual • The relationship between the covariate and y must be linear, and the slope is assumed to be the same at all factor levels • The covariate and the factor should not be strongly related to each other • Do not include too many covariates: at most 0.1*n - (k-1)
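
The covariate limit in the last bullet is plain arithmetic. A tiny Python sketch, assuming n is the total sample size and k the number of groups (as k is used on the Oneway ANOVA slide) and rounding down to a whole number of covariates:

```python
def max_covariates(n, k):
    """Upper bound 0.1*n - (k - 1), rounded down and never below zero."""
    return max(0, n // 10 - (k - 1))


print(max_covariates(n=100, k=3))  # 100 observations, 3 groups
print(max_covariates(n=25, k=4))   # a small sample leaves no room for covariates
```

With 100 observations in 3 groups the rule allows up to 8 covariates; with 25 observations in 4 groups it allows none.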

  27. SAS: Analyze – ANOVA – Linear Models

  28. Effects to be estimated • Interaction here: first select both variables, then click Cross

  29. Sums of squares

  30. Other options, defaults ok

  31. Post hoc tests

  32. Plots

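  33. SAS - code
      /* ANCOVA: covariate ln_hlo, factors Elinkaari and Perheyr, and their interaction */
      PROC GLM DATA=libname.datafilename
          PLOTS(ONLY)=DIAGNOSTICS(UNPACK)
          PLOTS(ONLY)=RESIDUALS
          PLOTS(ONLY)=INTPLOT;
        CLASS Elinkaari Perheyr;
        /* Type III sums of squares, parameter estimates and effect sizes */
        MODEL growthorient = ln_hlo Elinkaari Perheyr Elinkaari*Perheyr
            / SS3 SOLUTION SINGULAR=1E-07 EFFECTSIZE;
        /* LS means with Bonferroni-adjusted pairwise differences */
        LSMEANS Elinkaari Perheyr Elinkaari*Perheyr / PDIFF ADJUST=BON;
      RUN;
      QUIT;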

  34. Model significance and fit

  35. Significance of predictors

  36. EFFECT SIZE OF PREDICTORS

  37. Parameter estimates

  38. Prediction for 6 cells
      • Elinkaari=2 & Perheyr=0 (growth phase, non-family): Growth = 3.20 + 0.16*ln_hlo + 0.37 - 0.86 + 1.25 = 3.96 + 0.16*ln_hlo
      • Elinkaari=3 & Perheyr=0 (mature phase, non-family): Growth = 3.20 + 0.16*ln_hlo - 0.04 - 0.86 + 0.65 = 2.95 + 0.16*ln_hlo
      • Elinkaari=4 & Perheyr=0 (decline phase, non-family): Growth = 3.20 + 0.16*ln_hlo + 0.00 - 0.86 + 0.00 = 2.34 + 0.16*ln_hlo
      • Elinkaari=2 & Perheyr=1 (growth phase, family): Growth = 3.20 + 0.16*ln_hlo + 0.37 + 0.00 + 0.00 = 3.57 + 0.16*ln_hlo
      • Elinkaari=3 & Perheyr=1 (mature phase, family): Growth = 3.20 + 0.16*ln_hlo - 0.04 + 0.00 + 0.00 = 3.16 + 0.16*ln_hlo
      • Elinkaari=4 & Perheyr=1 (decline phase, family): Growth = 3.20 + 0.16*ln_hlo + 0.00 + 0.00 + 0.00 = 3.20 + 0.16*ln_hlo
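
The six cell equations on this slide all follow one pattern: intercept plus the two main-effect coefficients plus the interaction coefficient, with the reference levels contributing zero. The Python sketch below reproduces the constants from the coefficients quoted on the slide. The dictionary layout and function names are illustrative, and evaluating the prediction at ln(20) assumes that ln_hlo is the natural log of the number of employees.

```python
from math import log

# Coefficients as quoted on the slide (reference levels have coefficient 0.00).
INTERCEPT = 3.20
SLOPE_LN_HLO = 0.16
ELINKAARI = {2: 0.37, 3: -0.04, 4: 0.00}    # life-cycle phase
PERHEYR = {0: -0.86, 1: 0.00}               # 0 = non-family, 1 = family
INTERACTION = {(2, 0): 1.25, (3, 0): 0.65}  # all other cells: 0.00


def cell_constant(elinkaari, perheyr):
    """Constant term of the prediction equation for one cell."""
    return (INTERCEPT + ELINKAARI[elinkaari] + PERHEYR[perheyr]
            + INTERACTION.get((elinkaari, perheyr), 0.0))


def predict(elinkaari, perheyr, ln_hlo):
    """Predicted growth orientation for one cell at a given covariate value."""
    return cell_constant(elinkaari, perheyr) + SLOPE_LN_HLO * ln_hlo


constants = {(e, p): round(cell_constant(e, p), 2) for e in ELINKAARI for p in PERHEYR}
for (e, p), c in sorted(constants.items()):
    print(f"Elinkaari={e}, Perheyr={p}: Growth = {c:.2f} + {SLOPE_LN_HLO}*ln_hlo")

# Assumption: ln_hlo is the natural log of employees; 20 employees as an example.
example = predict(2, 0, log(20))
print(round(example, 2))
```

Because the covariate slope is the same in every cell, the six cells differ only in their constants, which is the homogeneity-of-slopes assumption of ANCOVA in action.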

  39. Parameter estimates

  40. Homoskedasticity

  41. Outlier diagnostics

  42. Residual distribution

  43. Model fit

  44. Influence diagnostics

  45. Residual vs. covariate

  46. Significance of group differences, main effects

  47. Significance of group differences, interaction • Non-family firms in growth phase differ from non-family firms in mature phase

  48. REPORTING GLM • Model fit: F + df + p and R square • Nature and significance of effects: parameter estimates B + s.e. + t + p and F + p • Estimated group means (means plot) • Post hoc test results

  49. Means plot • Employees at its mean value (20)
