1 / 27

LME4

LME4. Andrea Friese, Silvia Artuso , Danai Laina. What we are going to do. introduction of the package linear mixed models: introduction to example fixed and random effects crossed and nested random effects random slopes and random intercepts models

soniam
Download Presentation

LME4

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. LME4 Andrea Friese, Silvia Artuso, DanaiLaina

  2. What we are going to do... • introduction of the package • linear mixed models: introduction to example • fixed and random effects • crossed and nested random effects • random slopes and random intercepts models • comparison and evaluation of the models • statistics

  3. About lme4 • provides functions to fit and analyze statistical linear mixed models, generalized mixed models, and nonlinear mixed models • we focus on linear mixed models • what they are? → I’ll get there

  4. mainly used in fields like biology, physics, or social sciences • developed by Bates, Mächler, Bolker and Walker around 2012 • fundamentally based on its predecessor nlme • lme4 as baseline for multilevel models + various packages that enhance its feature set

  5. Now: what is a linear mixed model? • picture a basic linear model with one dependent variable and one independent variable ( = fixed effect) • add various fixed effects • add random effects • random effect: a factor we want to control, because it can affect the result; “grouping-factor”

  6. In lme4: • command to analyze fixed effects on the dependent variable: lm <- lm(variable ~ fixed effect 1 + … , data = lmm.data) summary(lm) • to see if/ how individual fixed effects respond in other combinations replace the + operator by * or /: ...fixed effect 1 * fixed effect 2 + fixed effect 3...

  7. to fit a random effect into the function: lmer <- lmer(variable ~ fixed effect 1 + ....+ (1 | random), data = lmm.data) summary(lmer)

  8. Exercises • Exercise 1: Run the data that’s provided, then use head(lmm.data) and str(lmm.data) and take a look at your data • Exercise 2: Analyze the fixed effects and the dependent variable from our example • Exercise 3: Add the random variable ‘school’ To simplify the code check the dataset for abbreviations of fixed effects

  9. Types of random effects • we used (1|random)to fit our random effect • on the right side of | operator is the so-called “grouping factor” (--> “clusters”) • random effects (factors) can be crossed or nested - it depends on the relationship between the variables

  10. Crossed random effects Means that a given factor appears in more than one level of the upper level factor. We can have observations of the same subject across all the random variable ranges (crossed) or at least some of them (partially crossed). We would then fit the identity of the subject and the random factor ranges as (partially) crossed random effects. For example, there are pupils within classes measured over several years. To specify for them: (1|class) + (1|pupil)

  11. Be careful: from Wikipedia: “Multilevel models (also known as hierarchical linear models, linear mixed-effects models, mixed models, nested data models, random coefficient, random-effects models, random parameter models, or split-plot designs)” But while all “hierarchical linear models” are mixed models, not all mixed models are hierarchical. The reason is because you can have crossed (or partially crossed) random factors, that do not represent levels in a hierarchy.

  12. Nested random effects When a lower level factor appears only within a particular level of an upper level factor. For example, pupils within classes at a fixed point in time, or classes within schools (implicit nesting). There are two ways to explicitly specify for it in lme4: (1|class) + (1|class:pupil)or(1|class/pupil) Or, we can create a new nested factor: dataset <- within(dataset, sample <- factor(class:pupil))

  13. But the nesting can be also explicit in the dataset, when we code differently and uniquely the variables. In this case, using (1|class/pupil) or (1|class)+(1|pupil)in the model will produce the same effect.

  14. The same model specification can be used to represent both (partially) crossed or nested factors, so you can’t use the models specification to tell you what’s going on with the random factors, you have to look at the structure of the factors in the data. Example:head(dataset) or str(dataset) To check if there is explicit nesting, we can use the functionisNested: with(dataset, isNested(pupil, class)) # FALSE or TRUE

  15. And if we need to make explicit the nesting of a random effect in a fixed effect? In this case: fixed + (1|fixed : random) Because the interaction between a fixed and a random effect is always random.

  16. Exercise • Exercise 4: In the dataset lmm.data, check if class is nested in school • Exercise 5: add the random variable class as a nested effect in school and look at the results;usesummary()andplot() • Exercise 6: Look at the variance explained by the school factor. What happens if you treat it as a fixed effect? Try out

  17. Random slopes vs. random intercepts models Random effects: Groups Name Variance Std.Dev. class:school (Intercept) 8.2046 2.8644 school (Intercept) 93.8433 9.6873 Residual 0.9684 0.9841 Number of obs: 1200, groups: class:school, 24; school, 6 Our model is what is called a random intercept model: subjects and items are allowed to have different intercept, but not different slopes → coef(). But what if we think that a fixed effect is not the same for all subjects and items?

  18. Random slopes model (1 + fixed|random) Example: randomslopes_models <- lmer(response ~ fixedA + fixedB + (1 + fixedA|random), data=data) This allows the slope of the fixedA variable to vary by the random one.

  19. Exercise • Exercise 7: try to fit to our dataset a random slope model, with openness’ slope varying by school • Look at the results

  20. How can we compare and evaluate our mixed models? #Check the arguments of lmer command and find the REML argument ?lmer # Run the following model #Model_1 lmer <- lmer(extro ~ open + agree + social +(1|school/class), data= lmm.data, REML = T) summary(lmer) • Check the REML argument, what does it stand for? • REML stands for the Restricted Maximum Likelihood • It is the default option when constructing a model in lme4 but it is not suitable for comparing models

  21. How can we compare and evaluate our mixed models? # Run the following model #Model_1 lmer <- lmer(extro ~ open + agree + social +(1|school/class), data= lmm.data,REML = F) summary(lmer) Do you notice any differences in the output compared to the previous ones? Linear mixed model fit by maximum likelihood ['lmerMod'] Formula: extro ~ open + agree + social + (1 | school/class) Data: lmm.data AIC BIC logLik deviance df.resid 3545.1 3580.7 -1765.5 3531.1 1193

  22. How can we compare and evaluate our mixed models? #Run the following model #Model_2 lmer2 <- lmer(extro ~ social + (1|class) + (1|school), data=lmm.data,REML = F) summary(lmer2) • Which is the output this time? Linear mixed model fit by maximum likelihood ['lmerMod'] Formula: extro ~ social + (1 | class) + (1 | school) Data: lmm.data AIC BIC logLik deviance df.resid 4715.6 4741.0 -2352.8 4705.6 1195 • Which model should we select?

  23. How can we simply test statistically our models? # Test statistically the two models anova(lmer, lmer2) • Check the output. Which model would you choose? # Record the deviance and df of residuals that you calculated before for the two models # lmer: deviance: 3531.1; df resid: 1193 # lmer2: deviance: 4705.6; df resid: 1195 # Perform a chi-squared test pchisq(4705.6 – 3531.1, 1195 – 1193, lower.tail=F)

  24. And what if we drop out all our fixed factors? # Fit the following models, a full model and a reduced model in which we dropped our fixed effects Full.lmer <- lmer(extro ~ open + agree + social + (1|school/class), data= lmm.data,REML = F) Reduced.lmer <- lmer(extro ~ 1 + (1|school/class), data=lmm.data,REML=F) # Compare them anova(Reduced.lmer, Full.lmer) What would you conclude?

  25. And after choosing the appropriate model? REML calculates the most optimal estimates of the variables, optimizing the model So our model is re-fitted with this criterion • # After choosing the appropriate model • #Model_1 • lmer <- lmer(extro ~ open + agree + social +(1|school/class), data= lmm.data,REML = T) • summary(lmer)

  26. Conclusions • To compare different models with different fixed effects and evaluate the amount of explained information → Maximum Likelihood measures The lower the measures, the better our models fit to the data set • Depending always on our questions, we construct and select the best fitting models, optimizing them with the REML criterion

  27. THANK YOU FOR YOUR ATTENTION

More Related