
Bare-bones pragmatic highlights of Baayen, Davidson, Bates, 2008



Presentation Transcript


    1. Bare-bones pragmatic highlights of Baayen, Davidson, & Bates (2008), by Scottevil, for the MLM reading group, 2/12/09

    2. Outline
       - The basic arguments for mixed-effects models (continuous dependent variables, with random effects)
       - What you get from a model
       - How to compare models with different random effects
       - How to get p-values
       - How to do it all in R
       - Remaining issues/questions:
         - Contrast tests?
         - Comparing models with different fixed effects?
         - Multi-level/hierarchical versions

    3. The pros of mixed-effects
       - More powerful (i.e., less likely to make Type II errors)
       - Very robust against missing data: no need for averaging, etc.
       - Can correct for heteroscedasticity in a more principled way, by explicitly modeling (e.g.) subject effects on the slope as well as the intercept
       - Can explicitly test which random effects should be included in the model: likelihood ratio tests, with some limits (we’ll get to that)

    4. Modeling error terms
       Three basic options (sketched in R below):
       - Random effect on the intercept
       - Random effect on the slope
       - Nesting of random effects
       - ...or combinations of those
       A little terminology:
       - Nesting: “multi-level” or “hierarchical” models
       - Non-nesting: “crossed” random factors
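
    A minimal sketch of those options as lmer formulas, assuming a data frame with RT, SOA, Subj, and Item columns (names are illustrative, carried over from the later slides):

        # Random effect on the intercept only: each subject gets its own baseline RT
        lmer(RT ~ SOA + (1|Subj), data = priming)
        # Random effects on intercept and slope: subjects also differ in the size of the SOA effect
        lmer(RT ~ SOA + (1 + SOA|Subj), data = priming)
        # Nested random effects: items nested within subjects (a "multi-level" model)
        lmer(RT ~ SOA + (1|Subj/Item), data = priming)
        # Crossed random factors: subjects and items as non-nested random effects
        lmer(RT ~ SOA + (1|Subj) + (1|Item), data = priming)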

    5. Comparing models
       Likelihood ratio test: twice the difference in log-likelihood between two models is tested against a chi-square distribution. An explicit test to determine which error terms should be in a model.
       RESTRICTIONS:
       - Not for testing different models of fixed effects (only appropriate for determining which error terms to use)
       - Can only compare nested models, where one model contains a proper subset of the other model’s error terms
       - Only valid for comparing models of the same set of data
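
    A minimal sketch of such a comparison in R (model names are illustrative; note that the restricted model's error terms are a proper subset of the fuller model's):

        # Fuller model: random intercept and slope for subjects
        m.full = lmer(RT ~ SOA + (1 + SOA|Subj), data = priming)
        # Restricted model: random intercept only
        m.restricted = lmer(RT ~ SOA + (1|Subj), data = priming)
        # Likelihood ratio test: twice the log-likelihood difference against chi-square
        anova(m.restricted, m.full)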

    6. Getting p-values with mixed-effects
       - A fitted model gives a t-value, but picking the degrees of freedom is very unclear.
       - In experiments with large numbers of observations (hundreds), picking the Df is less of a problem: one can more or less assume that t > 2 is significant.
       - Alternative: use Markov chain Monte Carlo (MCMC) sampling to estimate a p-value.
       - BDB use data from Raaijmakers et al. (1999) and simulations to show that p-values estimated with the MCMC method match up very well (in terms of Type I & II errors) with alternatives such as quasi-F, with fewer drawbacks.

    7. Examples in R
       Packages needed:
       - languageR (by Baayen); requires: zipfR, coda
       - lme4 (by Bates); requires: Matrix, lattice
       How to install these from a basic installation of R:
         install.packages(c("Matrix", "lattice", "lme4", "coda", "zipfR", "languageR"), repos = "http://cran.r-project.org")
       After the packages are installed, you still need to load them:
         library(languageR)

    8. Getting data into R
       - Make an ASCII table of your data, with a header row (e.g., from Excel, save as a tab-delimited .txt file).
       - Each row is an observation, with columns for all the various fixed & random factors. No need for averaging by-subject or by-item. No need for data replacement.
       - NB: R does NOT like empty cells, however. Either leave the whole row out if there is missing data, or fill the empty cells in with some code that you can use in R to get a subset of the data that does not include those rows.
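
    A minimal sketch of the second option, assuming missing responses were filled in with a (hypothetical) code of -999 in the RT column:

        # Drop the rows that carry the missing-data code before analysis
        priming = priming[priming$RT != -999, ]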

    9. Getting data into R
       Use the read.table() function:
         priming = read.table("baayen2008.txt", header = TRUE)
       Type the name of your data set to have R show it to you. A few other useful things:
       - nrow(): tells you how many rows there are in your data set
       - colnames(): tells you what the column names are
       - data[1:10, ]: displays just the first 10 rows of your data
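
    Putting those together (the file name is from the slide above; the data set name "priming" is just a choice):

        priming = read.table("baayen2008.txt", header = TRUE)  # tab-delimited text with a header row
        nrow(priming)       # number of observations (rows)
        colnames(priming)   # the column names
        priming[1:10, ]     # the first 10 rows, all columns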

    10. Running lmer
       The general form:
         lmer.object = lmer(formula, data = name.of.data.table)
       - lmer.object: the name of the output object
       - name.of.data.table: the name of the data as you imported it
       - formula: where all the action is

    11. Formulas in R
       The basic form is X ~ Y, where X is your dependent variable and Y is your list of factors. On the Y side:
       - “+” is used to add a factor
       - “|” is used to specify error terms
       - “:” is used to specify interactions
       - “*” is used for a “fully-crossed” design; e.g., A*B is the same as A + B + A:B
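
    A minimal sketch of those operators inside an lmer call (variable names are illustrative):

        # These two models are equivalent: * expands to the main effects plus their interaction
        lmer(RT ~ A * B + (1|Subj), data = priming)
        lmer(RT ~ A + B + A:B + (1|Subj), data = priming)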

    12. Looking at the data
       Type “priming” and look at the names of factors, etc.
       NB: in R, columns are of basically three types:
       - Characters
       - Numbers
       - Factors
       In order to use a column as a factor in an analysis in R, it must be of “factor” type. By default, read.table() makes character columns into factors. To check this, type “levels(priming$SOA)”.
       The “$” notation is very useful: the thing after the $ is the name of the column, so “priming$SOA” selects the column named “SOA” from the data table “priming”.
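
    A minimal sketch, in case a column did not come in as a factor and needs converting (SOA as the example column):

        levels(priming$SOA)                   # NULL if SOA is not a factor yet
        priming$SOA = as.factor(priming$SOA)  # convert it explicitly
        levels(priming$SOA)                   # now lists the factor levels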

    13. Example lmer
       priming.lmer = lmer(RT ~ SOA + (1|Item) + (1+SOA|Subj), data = priming)
       - SOA is the fixed effect.
       - (1|Item) specifies Item as a random effect on just the intercept (i.e., Item is expected to have an effect on how fast or slow a trial is responded to overall).
       - (1+SOA|Subj) specifies Subj as a random effect on both intercept and slope. This is an example of modeling heteroscedasticity! (1+SOA|Subj) basically says that subjects vary in how big the effect of SOA is.

    14. Examining the model
       summary(priming.lmer)
       - In the random effects, a high correlation (-1!) for the subject effects suggests that they are redundant (we’ll test this later).
       - In the fixed effects, a t-value is given. This can be helpful for a quick look at whether something is likely to be significant, but it is not directly interpretable as a p-value.
       - The Estimate column in the fixed effects tells you the effect size & means.
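
    A minimal sketch of pulling those pieces out directly (fixef and VarCorr are standard lme4 extractors):

        summary(priming.lmer)   # the full table: random-effect variances/correlations, fixed-effect estimates & t-values
        fixef(priming.lmer)     # just the fixed-effect estimates
        VarCorr(priming.lmer)   # just the random-effect variances and correlations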

    15. Trying some other models
       Different options for random effects:
         priming.lmer2 = lmer(RT ~ SOA + (1|Item) + (1|Subj), data = priming)
       Comparing the models with the likelihood ratio test:
         anova(priming.lmer, priming.lmer2)
       The result (p = 0.2288) tells us that moving to the simpler model (priming.lmer2) makes no significant difference, so it should be preferred over the more complex one.

    16. Lookin’ for p in all the right places
       Testing for significance, using languageR’s pvals.fnc(), which uses the Markov chain Monte Carlo method:
         mcmc = pvals.fnc(priming.lmer2, nsim = 10000)
       - 10,000 is the default nsim.
       - Generates plots of the posterior values. The less skewed these are, the better.
       - Type “mcmc$fixed” to see two p-values: pMCMC and Pr(>|t|).
         - The first is the p-value based on MCMC.
         - The second is a two-tailed t-test, with the Df at the upper bound (number of observations minus number of fixed effects); this is anticonservative for small samples!
       - The more observations in your data, the more similar these two values. The more different these two values are, the more skewed the posterior distributions generated by the MCMC.
       NOTE: pvals.fnc() does not work with random slope effects (yet)! You must resort to some other method, like using estimated Dfs for the t-test (sketched below).
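
    A minimal sketch of that fallback, assuming the model summary showed t = 2.1 for the SOA effect (an illustrative value) and that the model has two fixed-effect parameters (intercept and SOA):

        t.soa = 2.1                   # t-value read off summary(priming.lmer); illustrative
        df.ub = nrow(priming) - 2     # upper-bound Df: observations minus fixed-effect parameters
        2 * (1 - pt(abs(t.soa), df = df.ub))  # two-tailed p-value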

    17. Summary of procedure
       1. Get your data in.
       2. Fit a model using lmer.
       3. Check the lmer output and run some model comparisons (using the anova function) to find the right array of random effects.
       4. Use pvals.fnc to evaluate the fixed effects for significance, using the MCMC method (and the t-test with upper-bound Df).
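
    The whole procedure as one hedged end-to-end sketch (file, column, and model names carried over from the earlier slides):

        library(languageR)                    # also attaches lme4
        priming = read.table("baayen2008.txt", header = TRUE)
        m1 = lmer(RT ~ SOA + (1|Item) + (1 + SOA|Subj), data = priming)
        m2 = lmer(RT ~ SOA + (1|Item) + (1|Subj), data = priming)
        anova(m2, m1)                         # likelihood ratio test for the random slope
        mcmc = pvals.fnc(m2, nsim = 10000)    # MCMC p-values for the preferred model
        mcmc$fixed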

    18. Good things
       - Can deal with Items & Subjects simultaneously
       - High power & low Type I error compared to alternatives
       - No need for averaging, data replacement, etc.
       - Can help figure out which random effects should be in the model
       - Pretty quick & easy!

    19. Remaining issues & questions
       - Contrast tests? I personally don’t know how to do them (yet); Quené & van den Bergh (2008) mention that it’s “complicated”.
       - A method for distinguishing models with different fixed effects?
       - Nested/hierarchical random factors?

    20. Nested (multi-level) random effects
       How to do it in R:
         RT ~ SOA + (1|Subj) + (1|Item:Subj)
       or, equivalently:
         RT ~ SOA + (1|Subj/Item)
       (compare the non-nested, crossed version: RT ~ SOA + (1|Subj) + (1|Item))
       This means: an effect of Subject on the intercept, and an effect of Item *within* Subject (see the sketch below).
       - Baayen et al. (2008) suggest that nesting items within subjects may not always be the best model: it assumes that the effects of an item can be completely different for each subject, which may not be appropriate for items with properties that should have similar effects across subjects (e.g., length).
       - Other problem: the likelihood ratio test is not valid when models are not nested with respect to their random effects (Pinheiro & Bates 2000), so it can’t be used to distinguish between models with nesting vs. models without nesting (aka “crossed random effects” models).
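
    A minimal sketch of the equivalence (data and variable names as in the earlier slides); lme4 expands the “/” notation into the intercept and interaction terms itself:

        # These two calls specify the same nested model
        nested.a = lmer(RT ~ SOA + (1|Subj) + (1|Item:Subj), data = priming)
        nested.b = lmer(RT ~ SOA + (1|Subj/Item), data = priming)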

    21. the end…(?!?) Another example follows if there is time/interest…

    22. Another example
       splitplot is a data set in languageR with a traditional counterbalanced-type design (items appear in both conditions, but only across subjects, in two counterbalanced lists).
       lmer can be run on it with no additional specification of this design needed.

    23. Working through example 2
       Type “splitplot” to make sure you have it loaded. If not, make sure you have languageR loaded (with library(languageR)).
       Make a new data frame called “dat” that is identical to “splitplot”, just to make this example parallel with the one in Baayen et al. (2008):
         dat = splitplot
       Fit the following models with lmer, using different arrangements of random effects:
         dat.lmer1 = lmer(RT ~ list * priming + (1+priming|subjects) + (1+list|items), data = dat)
         dat.lmer2 = lmer(RT ~ list * priming + (1+priming|subjects) + (1|items), data = dat)
         dat.lmer3 = lmer(RT ~ list * priming + (1|subjects) + (1|items), data = dat)
         dat.lmer4 = lmer(RT ~ list * priming + (1|subjects), data = dat)

    24. Working through example 2
       Next, run the likelihood ratio test:
         anova(dat.lmer1, dat.lmer2, dat.lmer3, dat.lmer4)
       The results of this test tell us that dat.lmer3 is a significant improvement over dat.lmer4, but that models 1 & 2 are not worth the additional complications. This tells us that having subjects and items as random effects is important for these data, but that we probably don’t need to worry about heteroscedasticity (i.e., modeling the random slope effects we tried adds nothing).

    25. Working through example 2
       Finally, we run the tests for significance on the model with the best random-effects structure, dat.lmer3 (it should take a few seconds):
         dat.lmer3.pvals = pvals.fnc(dat.lmer3, nsim = 10000)
         dat.lmer3.pvals$fixed
       We can see that the effect of priming is significant by both pMCMC and the t-test, and that the effects involving list are not. Notice the high similarity between the pMCMC and t-test p-values. This is because the number of observations is relatively high (20 items in both conditions for 20 subjects). This also means we could have just eyeballed the t-values in the lmer model itself, looking for t > 2.

    26. The end (I think for reals, this time)
