Longitudinal Data Analysis: Why and How to Do it With Multi-Level Modeling (MLM)?

Oi-man Kwok

Texas A&M University

Road Map

• Why do we want to analyze longitudinal data under the multilevel modeling (MLM) framework?
• Dependency issue
• Advantages of using MLM over traditional methods (e.g., univariate ANOVA, multivariate ANOVA)
• Review of important parameters in MLM
• How can we do it in SPSS?
Regression Model:

e.g.

DV: Test scores in a 1st-year grad-level statistics course

IV: GRE_M (GRE Math test score)

150 students (i = 1, …, 150)

One of the important assumptions of OLS regression?

(Observations are independent of each other)

Ignoring the clustered structure (or dependency between observations) in the analyses can result in:
• Bias in the standard errors
• Bias in the tests of significance and the confidence intervals (Type I errors: inflated alpha level, e.g., nominal α = .05 but actual α = .10)
• Non-replicable results

• Univariate ANOVA: restricts the error structure to Compound Symmetry (CS) (higher statistical power, but the assumption is unlikely to be met in longitudinal data)
• Multivariate ANOVA: places no restriction on the error structure, i.e., Unstructured (UN) (often too conservative, lower statistical power); can only handle completely balanced data (listwise deletion)
• More…
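The contrast between the two error structures can be made concrete by counting how many covariance parameters each one needs for the four time points used later in this deck. The sketch below is only an illustration: the variance of 100 and covariance of 60 are made-up numbers, and the function names are hypothetical.

```python
def compound_symmetry(n_times, variance, covariance):
    """Build a CS matrix: one common variance, one common covariance."""
    return [[variance if r == c else covariance for c in range(n_times)]
            for r in range(n_times)]


def n_params_cs():
    """CS needs two parameters regardless of the number of time points."""
    return 2


def n_params_un(n_times):
    """UN estimates every variance and every covariance separately."""
    return n_times * (n_times + 1) // 2


N_TIMES = 4  # four repeated measures, as in the teaching-method example

# Hypothetical values, chosen only to show the pattern of the matrix.
cs_matrix = compound_symmetry(N_TIMES, variance=100.0, covariance=60.0)
for row in cs_matrix:
    print(row)

print("CS parameters:", n_params_cs())
print("UN parameters:", n_params_un(N_TIMES))
```

With four time points the unstructured matrix already needs five times as many parameters as compound symmetry, which is one way to see why the multivariate approach tends to have lower power.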
Analyzing Longitudinal Data:
• Example (based on actual data; variable names changed for ease of presentation):

Compare two different teaching methods on Achievement over time

• Teaching methods:

78 students are randomly assigned to either:

A. Lecture (control group; 39 students) or

B. Computer (treatment group; 39 students)

• 4 Achievement (Ach) scores (right after the course, and 1 year, 2 years, and 3 years after) were collected from each student after the treatment (i.e., the statistics course)

[Figure: Achievement plotted against Time (Year; 0 to 3) for the Computer and Lecture groups. Time = 0: immediate posttest measure.]

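Multi-Level Model (MLM)

A Simple Regression Model for ONE Student (Student 36)

[Figure: Student 36's achievement scores ($\mathrm{Ach}_t$) plotted against $\mathrm{Time}_t$ (0, 1, 2, 3), with a fitted line of intercept $\beta_0$ and slope $\beta_1$ and residuals $e_0$, $e_1$, $e_2$, $e_3$.]

$\mathrm{Ach}_t = \beta_0 + \beta_1\,\mathrm{Time}_t + e_t$ (t = 0, 1, 2, 3)

$V(e_t) = \sigma^2$

$e_t$: captures the variation of the individual achievement scores around the fitted regression model WITHIN Student 36

• Introduce treatment in example at end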
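The single-student regression can be sketched in a few lines of plain Python. The four scores below are hypothetical (the deck does not list Student 36's actual values), and `fit_line` is an illustrative helper, not part of any package.

```python
def fit_line(times, scores):
    """Ordinary least squares fit of score = b0 + b1 * time."""
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(scores) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, scores))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return intercept, slope


times = [0, 1, 2, 3]               # Time = 0 is the immediate posttest
scores = [50.0, 55.0, 53.0, 60.0]  # hypothetical scores for one student

b0, b1 = fit_line(times, scores)

# Residuals e_t: deviations of the observed scores from the fitted line.
residuals = [y - (b0 + b1 * t) for t, y in zip(times, scores)]

# Within-student error variance, estimated with n - 2 degrees of freedom.
sigma2_hat = sum(e ** 2 for e in residuals) / (len(times) - 2)

print("intercept:", round(b0, 2), "slope:", round(b1, 2))
print("residuals:", [round(e, 2) for e in residuals])
print("sigma^2 estimate:", round(sigma2_hat, 2))
```

Repeating this fit for each of the 78 students gives 78 intercepts and 78 slopes, which is the starting point for the micro and macro level models.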

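[Figure: Fitted regression lines for Students 27, 36, and 52, with $\mathrm{Ach}_{ti}$ plotted against $\mathrm{Time}_{ti}$ (0, 1, 2, 3). Each student has an own intercept ($\beta_0$ for Student 27, Student 36, Student 52) and an own slope ($\beta_1$ for Student 27, Student 36).]

Micro Level Model:

$\mathrm{Ach}_{ti} = \beta_{0i} + \beta_{1i}\,\mathrm{Time}_{ti} + e_{ti}$ (t = 0, 1, 2, 3; i = 1, 2, 3, …, 78)

Compare to the simple regression model for ONE student: the intercept and slope now carry a student index i.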

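[Figure: the same fitted lines for Students 27, 36, and 52, with $\mathrm{Ach}_{ti}$ plotted against $\mathrm{Time}_{ti}$ (0, 1, 2, 3).]

Macro Level Model:

$\beta_{0i} = \gamma_{00} + U_{0i}$ ($\gamma_{00}$: grand intercept; $V(U_{0i}) = \tau_{00}$: variance of the intercepts)

$\beta_{1i} = \gamma_{10} + U_{1i}$ ($\gamma_{10}$: grand slope; $V(U_{1i}) = \tau_{11}$: variance of the slopes)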

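[Figure: Overall model (intercept $\gamma_{00}$ at Time = 0, slope $\gamma_{10}$) together with the individual lines of Students 27, 36, and 52; axes: Ach against Time.]

• $U_{0i}$: captures the deviations of the 78 intercepts from the grand intercept $\gamma_{00}$
• $U_{1i}$: captures the deviations of the 78 slopes from the grand slope $\gamma_{10}$

[Figure: No variation among the 78 intercepts ($\tau_{00} = 0$); lines of Students 27, 36, and 52 and the overall model; axes: Ach against Time.]

[Figure: No variation among the 78 slopes ($\tau_{11} = 0$); the lines of Students 27, 36, and 52 and the overall model all have slope $\gamma_{10}$; axes: Ach against Time.]

[Figure: Overall model only; axes: Ach against Time.]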

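Summary
• Fixed effects: $\gamma_{00}$ (grand intercept) and $\gamma_{10}$ (grand slope)
• G: captures between-student differences
  - $\tau_{00}$: variance of the intercepts
  - $\tau_{11}$: variance of the slopes
  - $\tau_{01} = \tau_{10}$: covariance between intercepts and slopes
• R: captures within-student random errors
  - $V(e_{ti}) = \sigma^2$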
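To see how G and R together determine the covariance among a student's repeated measures, the sketch below computes Var and Cov of the four scores implied by a random-intercept, random-slope model: Cov(y_t, y_s) = τ00 + (t + s)·τ01 + t·s·τ11, plus σ² on the diagonal. The numbers plugged in are the estimates reported later in this deck for the model that includes Comp (τ00 = 176.16, τ11 = 9.81, σ² = 90.00, with τ01 fixed at 0); the function name is hypothetical.

```python
def implied_covariance(times, tau00, tau01, tau11, sigma2):
    """Covariance matrix of one student's scores implied by G and R."""
    matrix = []
    for row_index, t in enumerate(times):
        row = []
        for col_index, s in enumerate(times):
            # Between-student part: [1, t] G [1, s]'
            value = tau00 + (t + s) * tau01 + t * s * tau11
            # Within-student part: sigma^2 on the diagonal only
            if row_index == col_index:
                value += sigma2
            row.append(value)
        matrix.append(row)
    return matrix


TIMES = [0, 1, 2, 3]
V = implied_covariance(TIMES, tau00=176.16, tau01=0.0, tau11=9.81, sigma2=90.00)

for row in V:
    print([round(value, 2) for value in row])
```

The variances on the diagonal grow with time and the covariances differ from pair to pair, a pattern that a compound-symmetry structure cannot represent.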

MACRO vs. MICRO (Cont.)
• MODELS:

MICRO level model: a regression model fitted to the observations within each MACRO unit

MACRO level model: a model that captures the differences between the overall model and the individual regression models of the different MACRO units

• Dependent variable:

Math Achievement (Achieve; repeated measures / MICRO level)

• Predictors:
• Repeated-measure (MICRO) level predictor:

Time (& any time-varying covariates)

• Student (MACRO) level predictor:

Computer (different teaching methods) (& any time-invariant variables such as gender)

Data format under MANOVA approaches (SPSS data format, one row per student):

| Student | Treat | T0 | T1 | T2 | T3 |
|---------|-------|----|----|----|----|
| S1      | 0     | 5  | 3  | 2  | 3  |
| S2      | 1     | 5  | 25 | -- | 33 |
| S3      | 1     | -- | 19 | 17 | 26 |

• S1 has responses at all time points
• S2 has a missing response at time 2 (indicated by "--")
• S3 has a missing response at time 0
• MANOVA: only retains S1 in the analysis

Data format for Multilevel Model (one row per observation; all 3 students are included in the analyses):

| Student | Treat | Time | DV |
|---------|-------|------|----|
| S1      | 0     | 0    | 5  |
| S1      | 0     | 1    | 3  |
| S1      | 0     | 2    | 2  |
| S1      | 0     | 3    | 3  |
| S2      | 1     | 0    | 5  |
| S2      | 1     | 1    | 25 |
| S2      | 1     | 3    | 33 |
| S3      | 1     | 1    | 19 |
| S3      | 1     | 2    | 17 |
| S3      | 1     | 3    | 26 |
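The restructuring from the MANOVA layout to the multilevel layout can be sketched as follows, using the three students shown on the slide. This is a plain-Python illustration with hypothetical helper names (`wide_to_long`, `complete_cases`), not the SPSS restructuring procedure itself; missing responses are coded as `None`.

```python
TIME_COLUMNS = ["T0", "T1", "T2", "T3"]

# Wide (MANOVA) layout: one row per student.
wide_rows = [
    {"Student": "S1", "Treat": 0, "T0": 5, "T1": 3, "T2": 2, "T3": 3},
    {"Student": "S2", "Treat": 1, "T0": 5, "T1": 25, "T2": None, "T3": 33},
    {"Student": "S3", "Treat": 1, "T0": None, "T1": 19, "T2": 17, "T3": 26},
]


def wide_to_long(rows, time_columns):
    """Stack the repeated measures: one output row per observed response."""
    long_rows = []
    for row in rows:
        for time, column in enumerate(time_columns):
            if row[column] is None:
                continue  # skip only the missing occasion, keep the student
            long_rows.append({"Student": row["Student"], "Treat": row["Treat"],
                              "Time": time, "DV": row[column]})
    return long_rows


def complete_cases(rows, time_columns):
    """Students who would survive listwise deletion in MANOVA."""
    return [row["Student"] for row in rows
            if all(row[column] is not None for column in time_columns)]


long_rows = wide_to_long(wide_rows, TIME_COLUMNS)
for record in long_rows:
    print(record)

print("Retained by MANOVA:", complete_cases(wide_rows, TIME_COLUMNS))
print("Retained by MLM:", sorted({record["Student"] for record in long_rows}))
```

The long layout keeps ten of the twelve possible observations, whereas listwise deletion would keep only the four that belong to S1.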

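| Student | Treat | Time | DV |
|---------|-------|------|----|
| S1      | 0     | 0    | 5  |
| S1      | 0     | 7    | 3  |
| S1      | 0     | 12   | 2  |
| S1      | 0     | 13   | 3  |
| S2      | 1     | 1    | 5  |
| S2      | 1     | 3    | 9  |
| S2      | 1     | 4    | 5  |
| S2      | 1     | 6    | 25 |
| S3      | 1     | 3    | 18 |
| S3      | 1     | 15   | 19 |
| S3      | 1     | 28   | 17 |
| S3      | 1     | 31   | 26 |

Can you transform this dataset back into multivariate format?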
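One way to explore the question is to attempt the reverse restructuring and count what happens. The sketch below, with a hypothetical helper `long_to_wide`, spreads the unequally spaced records into one column per distinct measurement time.

```python
# Long layout with unequally spaced occasions: (student, treat, time, dv).
long_records = [
    ("S1", 0, 0, 5), ("S1", 0, 7, 3), ("S1", 0, 12, 2), ("S1", 0, 13, 3),
    ("S2", 1, 1, 5), ("S2", 1, 3, 9), ("S2", 1, 4, 5), ("S2", 1, 6, 25),
    ("S3", 1, 3, 18), ("S3", 1, 15, 19), ("S3", 1, 28, 17), ("S3", 1, 31, 26),
]


def long_to_wide(records):
    """Spread records into one row per student, one column per distinct time."""
    distinct_times = sorted({time for _, _, time, _ in records})
    wide = {}
    for student, treat, time, dv in records:
        row = wide.setdefault(
            student, {"Treat": treat, "scores": {t: None for t in distinct_times}})
        row["scores"][time] = dv
    return distinct_times, wide


distinct_times, wide = long_to_wide(long_records)
missing_per_student = {
    student: sum(value is None for value in row["scores"].values())
    for student, row in wide.items()
}

print("Columns needed:", len(distinct_times), distinct_times)
print("Missing cells per student:", missing_per_student)
```

Every student ends up with mostly empty cells, so no student would survive listwise deletion in the multivariate layout, while the long layout handles the same data without any gaps.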

Questions
• 1. On average, is there any trend in math achievement over time?
• 2. Are there any differences between students in the trend of math achievement over time? (Do all students have the same trend of math achievement over time?)

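Micro Level (Level 1):

$\mathrm{MA}_{ti} = \beta_{0i} + \beta_{1i}\,\mathrm{Time}_{ti} + e_{ti}$, with $V(e_{ti}) = \sigma^2$

Macro Level (Level 2):

$\beta_{0i} = \gamma_{00} + U_{0i}$, with $V(U_{0i}) = \tau_{00}$ ($\gamma_{00}$: grand intercept)

$\beta_{1i} = \gamma_{10} + U_{1i}$, with $V(U_{1i}) = \tau_{11}$ ($\gamma_{10}$: grand slope)

Combined Model:

$\mathrm{MA}_{ti} = \gamma_{00} + \gamma_{10}\,\mathrm{Time}_{ti} + U_{0i} + U_{1i}\,\mathrm{Time}_{ti} + e_{ti}$

• $\gamma_{00} + \gamma_{10}\,\mathrm{Time}_{ti}$: grand intercept and grand slope (macro level)
• $U_{0i} + U_{1i}\,\mathrm{Time}_{ti}$: between-student differences
• $e_{ti}$: within-student errors, $V(e_{ti}) = \sigma^2$ (micro level)

[Figure legend: red = Computer, blue = Lecture]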

SPSS MIXED Syntax:

MIXED mathach WITH Time
  /METHOD = REML
  /FIXED = INTERCEPT Time
  /RANDOM = INTERCEPT Time | SUBJECT(Subid) COVTYPE(UN)
  /PRINT = G SOLUTION TESTCOV.
EXECUTE.

• Line 1 (MIXED): DV BY categorical IVs WITH continuous IVs
• Line 2 (/METHOD): the default is REML (Restricted Maximum Likelihood); the other option is ML (Maximum Likelihood)
• Line 3 (/FIXED): captures the overall model
• Line 4 (/RANDOM): specifies the random effects, which capture the between-student differences
  - SUBJECT(Subid): identity variable for the macro-level units (e.g., Subid)
  - COVTYPE(UN): structure of the G matrix (unstructured)
• Line 5 (/PRINT):
  - G: prints the G matrix
  - SOLUTION: requests the regression coefficients
  - TESTCOV: produces asymptotic standard errors and Wald Z-tests for the covariance parameter estimates

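SPSS Output

• Basic information
• Estimates of fixed effects, requested by the “SOLUTION” keyword in the PRINT statement (Line 5):
  - $\gamma_{00}$: average MA score at Time = 0
  - $\gamma_{10}$: average trend of the MA score
• Estimates of covariance parameters with asymptotic standard errors and Wald Z-tests, requested by the “TESTCOV” keyword in the PRINT statement (Line 5): $\sigma^2$, $\tau_{00}$, $\tau_{01}$, $\tau_{10}$, $\tau_{11}$
• G matrix, requested by the “G” keyword in the PRINT statement (Line 5):

| $\tau_{00}$ | $\tau_{01}$ |
| $\tau_{10}$ | $\tau_{11}$ |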

Can I have a simpler G matrix (i.e., $\tau_{01} = \tau_{10} = 0$)?
• Compare the two models with a likelihood ratio test!

Current model (unstructured G): -2LL = 2509.873

Simpler model (diagonal G): -2LL = ?

Syntax for fitting the simpler G

SPSS syntax:

/RANDOM = INTERCEPT Time | SUBJECT(Subid) COVTYPE(DIAG)

(Model with $\tau_{01} = \tau_{10} = 0$): choose this

-2 Res Log Likelihood (or deviance): 2509.873

(Model with $\tau_{01} = \tau_{10} \neq 0$)

-2 Res Log Likelihood (or deviance): 2509.873

$\chi^2(1) = .000$, $p = 1.00$

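Compare to the model with $\tau_{11} = 0$

SPSS syntax:

/RANDOM = INTERCEPT | SUBJECT(Subid) COVTYPE(DIAG)

(Model with $\tau_{01} = \tau_{10} = 0$, $\tau_{11} \neq 0$): choose this

-2 Res Log Likelihood: 2509.873

(Model with $\tau_{11} = \tau_{01} = \tau_{10} = 0$)

-2 Res Log Likelihood: 2524.387

$\chi^2(1) = 14.51$, $p < .001$

Halved p-value: the null value $\tau_{11} = 0$ lies on the boundary of the parameter space, because a variance cannot be negative.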
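The two deviance comparisons can be reproduced with the standard library alone. The sketch below assumes a 1-degree-of-freedom test, for which the chi-square tail probability equals `erfc(sqrt(x / 2))`; the function name `lrt_one_df` and its `boundary` flag are hypothetical conveniences, and the deviances are the -2 Res Log Likelihood values from the slides.

```python
import math


def lrt_one_df(deviance_simpler, deviance_fuller, boundary=False):
    """Likelihood ratio test with 1 df from two -2LL (deviance) values.

    boundary=True halves the p-value, as done on the slide when the tested
    parameter is a variance whose null value is 0.
    """
    chi_square = max(deviance_simpler - deviance_fuller, 0.0)
    # Upper tail of a chi-square(1): P(Z^2 > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi_square / 2.0))
    if boundary:
        p_value /= 2.0
    return chi_square, p_value


# Covariance test: diagonal G versus unstructured G.
chi_cov, p_cov = lrt_one_df(2509.873, 2509.873)

# Slope-variance test: random intercept only versus intercept plus slope.
chi_slope, p_slope = lrt_one_df(2524.387, 2509.873, boundary=True)

print("covariance test: chi-square =", round(chi_cov, 3), "p =", round(p_cov, 2))
print("slope variance test: chi-square =", round(chi_slope, 2),
      "p < .001:", p_slope < 0.001)
```

Both models in each comparison share the same fixed effects, which is the situation in which comparing REML deviances is appropriate.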

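Result of the Final Model

[SPSS output: fixed effects $\gamma_{00}$ and $\gamma_{10}$; covariance parameters $\sigma^2$, $\tau_{00}$, and $\tau_{11}$]

• Q2. Are there any differences between students in the trend of math achievement over time? (Or, do all students have the same trend of math achievement over time?)

$\tau_{00} = 201.71$, $\tau_{11} = 14.56$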

• Q3. If yes to Q2, what causes the differences?

Null hypothesis:

The different teaching methods have the SAME effect on achievement over time

($H_0$: $\gamma_{11} = 0$)

• Micro Level (Level 1):

$\mathrm{MA}_{ti} = \beta_{0i} + \beta_{1i}\,\mathrm{Time}_{ti} + e_{ti}$

(Variance of $e_{ti}$ = $\sigma^2$)

• Macro Level (Level 2):

$\beta_{0i} = \gamma_{00} + \gamma_{01}\,\mathrm{Comp}_i + U_{0i}$

$\beta_{1i} = \gamma_{10} + \gamma_{11}\,\mathrm{Comp}_i + U_{1i}$

(Variance of $U_{0i}$ = $\tau_{00}$; variance of $U_{1i}$ = $\tau_{11}$)

• Combined Model:

$\mathrm{MA}_{ti} = \gamma_{00} + \gamma_{01}\,\mathrm{Comp}_i + \gamma_{10}\,\mathrm{Time}_{ti} + \gamma_{11}\,\mathrm{Time}_{ti} \times \mathrm{Comp}_i + U_{0i} + U_{1i}\,\mathrm{Time}_{ti} + e_{ti}$
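The combined model follows from substituting the two macro-level equations into the micro-level equation. Written out step by step, using the same symbols as the slides, the substitution shows that the cross-level interaction term $\gamma_{11}$ arises because Comp predicts the slope $\beta_{1i}$:

```latex
\begin{align*}
\mathrm{MA}_{ti}
  &= \beta_{0i} + \beta_{1i}\,\mathrm{Time}_{ti} + e_{ti} \\
  &= (\gamma_{00} + \gamma_{01}\,\mathrm{Comp}_i + U_{0i})
     + (\gamma_{10} + \gamma_{11}\,\mathrm{Comp}_i + U_{1i})\,\mathrm{Time}_{ti}
     + e_{ti} \\
  &= \underbrace{\gamma_{00} + \gamma_{01}\,\mathrm{Comp}_i
     + \gamma_{10}\,\mathrm{Time}_{ti}
     + \gamma_{11}\,\mathrm{Time}_{ti}\,\mathrm{Comp}_i}_{\text{fixed part}}
     + \underbrace{U_{0i} + U_{1i}\,\mathrm{Time}_{ti} + e_{ti}}_{\text{random part}}
\end{align*}
```

The fixed part corresponds to the terms on the /FIXED subcommand and the random part to the terms on the /RANDOM subcommand plus the residual.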
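• SPSS MIXED Syntax:

MIXED mathach WITH Time Comp
  /METHOD = REML
  /FIXED = INTERCEPT Comp Time Time*Comp
  /RANDOM = INTERCEPT Time | SUBJECT(Subid) COVTYPE(DIAG)
  /PRINT = G SOLUTION TESTCOV.
EXECUTE.

(Comp is listed after WITH because every predictor named on /FIXED must be declared on the MIXED line; as a 0/1 indicator it can be treated as a covariate.)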

Covariance parameter estimates WITHOUT “Comp” in the macro models: $\tau_{00} = 201.71$, $\tau_{11} = 14.56$

Covariance parameter estimates WITH “Comp” in the macro models: $\tau_{00} = 176.16$, $\tau_{11} = 9.81$

Proportion of variance in the intercept ($\tau_{00}$) explained by “Comp” = (201.71 - 176.16)/201.71 = .13 (or 13%)

Proportion of variance in the slope ($\tau_{11}$) explained by “Comp” = (14.56 - 9.81)/14.56 = .33 (or 33%)
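The two proportions are simple reductions in the variance components once Comp enters the macro models. A minimal check of the arithmetic, with a hypothetical helper name:

```python
def proportion_explained(variance_without, variance_with):
    """Proportional reduction in a variance component after adding a predictor."""
    return (variance_without - variance_with) / variance_without


# Variance components from the slides (without Comp, with Comp).
intercept_share = proportion_explained(201.71, 176.16)
slope_share = proportion_explained(14.56, 9.81)

print("intercept variance explained:", round(intercept_share, 2))
print("slope variance explained:", round(slope_share, 2))
```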

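Solution for Fixed Effects

| Effect    | Estimate | Standard Error | DF  | t Value | Pr > \|t\| |
|-----------|----------|----------------|-----|---------|------------|
| Intercept | 50.3769  | 2.4764         | 76  | 20.34   | <.0001     |
| time      | 0.5756   | 0.8445         | 232 | 0.68    | 0.4962     |
| computer  | 7.7583   | 3.5021         | 76  | 2.22    | 0.0297     |
| time*comp | 3.6009   | 1.1943         | 232 | 3.02    | 0.0029     |

Overall model for students in the Lecture method group (Comp = 0):

$\widehat{\mathrm{MA}}_{t} = 50.38 + 0.58\,\mathrm{Time}_{t}$

Overall model for students in the Computer method group (Comp = 1):

$\widehat{\mathrm{MA}}_{t} = (50.38 + 7.76) + (0.58 + 3.60)\,\mathrm{Time}_{t} = 58.14 + 4.18\,\mathrm{Time}_{t}$

Random Effect

$V(e_{ti}) = \sigma^2 = 90.00$

[Figure: Fitted Achievement plotted against Time (Year) for the Computer and Lecture groups. Time = 0: immediate posttest measure.]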
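The fixed-effect estimates in the table can be turned into predicted group means at each measurement occasion. The sketch below only evaluates the fixed part of the combined model; the dictionary keys mirror the effect labels in the output, and `predicted_mean` is a hypothetical helper.

```python
# Fixed-effect estimates copied from the "Solution for Fixed Effects" table.
FIXED = {"intercept": 50.3769, "time": 0.5756, "computer": 7.7583,
         "time*comp": 3.6009}


def predicted_mean(time, comp):
    """Fixed part of the combined model for a given year and group (0 or 1)."""
    return (FIXED["intercept"]
            + FIXED["computer"] * comp
            + FIXED["time"] * time
            + FIXED["time*comp"] * time * comp)


YEARS = [0, 1, 2, 3]
lecture = [predicted_mean(year, comp=0) for year in YEARS]
computer = [predicted_mean(year, comp=1) for year in YEARS]
gap = [c - l for c, l in zip(computer, lecture)]

for year, l, c, g in zip(YEARS, lecture, computer, gap):
    print(f"Year {year}: Lecture {l:.2f}  Computer {c:.2f}  difference {g:.2f}")
```

The difference between the groups widens by the interaction estimate for every additional year, which is what the significant time*comp effect expresses.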

Conclusion
• Advantages of using MLM over traditional ANOVA approaches for analyzing longitudinal data:
• 1. Can flexibly model the variance function
• 2. Retains the meaning of the random effects
• 3. Can explore factors that predict individual differences in change over time (e.g., treatment effect)
• 4. Takes both unequal spacing and missing data into account
Take Home Exercise

A clinical psychologist wants to examine the impact of the stress level of each family member (STRESS) on his/her level of symptomatology (SYMPTOM). There are 100 families, and families vary in size from three to eight members. The total number of participants is 400.

a) Can you write out the model? (Hint: What is in the micro model? What is in the macro model?)

b) Can you write out the syntax (SPSS) to analyze this model?

c) In designing the study, what possible macro predictors do you think the clinical psychologist should include in her study? (e.g., family size?)

d) In designing the study, what possible micro predictors do you think the clinical psychologist should include in her study? (e.g., participant’s neuroticism?)

e) Can you write out the model? (Hint: What is in the micro model? What is in the macro model?)

f) Can you write out the syntax (SPSS) to analyze this model?

a)

Micro-level model:

$\mathrm{SYMPTOM}_{ij} = \beta_{0j} + \beta_{1j}\,\mathrm{STRESS}_{ij} + e_{ij}$

Macro-level model:

$\beta_{0j} = \gamma_{00} + U_{0j}$

$\beta_{1j} = \gamma_{10} + U_{1j}$

Combined model:

$\mathrm{SYMPTOM}_{ij} = \gamma_{00} + \gamma_{10}\,\mathrm{STRESS}_{ij} + U_{0j} + U_{1j}\,\mathrm{STRESS}_{ij} + e_{ij}$

b)

SPSS syntax:

MIXED Symptom WITH Stress
  /FIXED = INTERCEPT Stress
  /RANDOM = INTERCEPT Stress | SUBJECT(Family) COVTYPE(UN)
  /PRINT = G SOLUTION TESTCOV.
EXECUTE.

THE END!

THANK YOU!