# Mixed Models


Simon Sheather

Michael Speed

TAMU

### Part 1

Choosing the Mean Structure

Fixed Effects

Learning Outcomes – Part 1
• The participant will learn:
• How to determine what is being tested by the Type I, II, and III sums of squares
• How to use the Estimable Functions in SAS
• Which non-estimable functions are used in SAS and their implications
• Why testing for “Main” Effects in the presence of interaction may be useful
• How missing data will affect what is being tested
• Why “gain” scores may cause a problem
• What is being tested in ANCOVA
Who Uses?
• Proc GLM
• Proc Mixed
• Enterprise Guide
Linear Model

Y = Xβ + ε

where Y denotes the vector of observed yi's, X is the known matrix of xij's, β is the unknown fixed-effects parameter vector, and ε is the unobserved vector of independent and identically distributed Gaussian random errors.

Mixed Model

Formulation of the Mixed Model

The previous general linear model is certainly a useful one (Searle 1971), and it is the one fitted by the GLM procedure. However, many times the distributional assumption about ε is too restrictive. The mixed model extends the general linear model by allowing a more flexible specification of the covariance matrix of ε. In other words, it allows for both correlation and heterogeneous variances, although you still assume normality.

The mixed model is written as

Y = Xβ + Zγ + ε

where everything is the same as in the general linear model except for the addition of the known design matrix, Z, and the vector of unknown random-effects parameters, γ. The matrix Z can contain either continuous or dummy variables, just like X. The name mixed model comes from the fact that the model contains both fixed-effects parameters, β, and random-effects parameters, γ. Refer to Henderson (1990) and Searle, Casella, and McCulloch (1992) for historical developments of the mixed model.

Random and Error Terms
• A key assumption in the foregoing analysis is that γ and ε are normally distributed with E(γ) = 0, E(ε) = 0, Var(γ) = G, Var(ε) = R, and Cov(γ, ε) = 0.
V Matrix

The variance of Y is, therefore,

V = ZGZ' + R.

You can model V by setting up the random-effects design matrix Z and by specifying covariance structures for G and R
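
The expression for V follows in one step from the model equation. The derivation below is a sketch in LaTeX notation; it assumes the mixed model Y = Xβ + Zγ + ε with Var(γ) = G, Var(ε) = R, and Cov(γ, ε) = 0, where γ and ε are the symbols assumed here for the random effects and the errors.

```latex
\begin{aligned}
\operatorname{Var}(Y)
  &= \operatorname{Var}(X\beta + Z\gamma + \varepsilon) \\
  &= Z\,\operatorname{Var}(\gamma)\,Z'
     + Z\,\operatorname{Cov}(\gamma,\varepsilon)
     + \operatorname{Cov}(\varepsilon,\gamma)\,Z'
     + \operatorname{Var}(\varepsilon) \\
  &= Z G Z' + R .
\end{aligned}
```

The fixed part Xβ contributes nothing to the variance. The general linear model is the special case with no Z term and R = σ²I, which gives V = σ²I.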

More

And Many More

Endpoints
• Continuous Endpoints: Y is continuous
• Discrete or Categorical Endpoints
• We will consider only continuous endpoints
Fixed Effects
• Let us examine the fixed effects in the simple case with no random effects and i.i.d. normal errors.
• Continuous Endpoints
• Discussion of “Analysis of Clinical Trials Using SAS” (Dmitrienko et al.)
Example
• The example is a clinical trial comparing an experimental drug (D) with a placebo (P) in patients with a major depressive disorder.
• The primary efficacy measure was the change from baseline to the end of the 9-week acute treatment phase in the 17-item Hamilton depression rating scale (HAMD17).
• Patient randomization was stratified by center (5 centers).
Hypothesis Testing – TYPE III
• Main Effect Drug
• Main Effect Center
• No Interaction
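
Written in terms of the population (cell) means, these three Type III hypotheses can be sketched as follows. The notation is an assumption introduced here: μ_ij is the population mean for drug i (D or P) in center j = 1, ..., 5, and all ten drug-by-center populations are assumed to have been sampled.

```latex
\begin{aligned}
\text{Drug:}\quad
  & H_0:\ \tfrac{1}{5}\sum_{j=1}^{5}\mu_{Dj}
        = \tfrac{1}{5}\sum_{j=1}^{5}\mu_{Pj} \\[4pt]
\text{Center:}\quad
  & H_0:\ \tfrac{1}{2}(\mu_{D1}+\mu_{P1})
        = \tfrac{1}{2}(\mu_{D2}+\mu_{P2})
        = \cdots
        = \tfrac{1}{2}(\mu_{D5}+\mu_{P5}) \\[4pt]
\text{Interaction:}\quad
  & H_0:\ \mu_{Dj}-\mu_{Pj} = \mu_{D1}-\mu_{P1},
        \qquad j = 2,\dots,5
\end{aligned}
```

Each main-effect hypothesis compares unweighted averages of cell means, so it does not depend on the cell sample sizes.
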
SAS Code – Mixed & GLM

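    /* The same fixed-effects model fitted with PROC MIXED and with PROC GLM */
    PROC MIXED DATA=WORK.SORTTempTableSorted METHOD=REML;
        CLASS drug center;
        MODEL change = drug center drug*center / HTYPE=3;
    RUN;

    PROC GLM DATA=WORK.SORTTempTableSorted;
        CLASS drug center;
        /* SS1-SS4 request all four types of sums of squares;
           E1-E4 print the matching estimable functions */
        MODEL change = center drug drug*center /
            INTERCEPT SS1 SS2 SS3 SS4 SOLUTION
            E1 E2 E3 E4 ZETA=1E-08 SINGULAR=1E-07;
    RUN;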

Point #1
• GLM & Mixed give same results for fixed effects with errors i.i.d. normal with mean 0 and constant variance
Confusion on Type I, II, III & IV Sums of Squares

Population Parameters

Population Parameters

Jargon
• I am testing “R(α|μ), which represents the additional reduction due to fitting the treatment effect after fitting the mean and helps assess the amount of variability explained by the treatment accounting for the mean.”
• Wonderful, now tell me what hypotheses you are testing in terms of the population parameters.
• What is H0?
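
For a two-factor layout, the question has a standard answer. The sketch below uses notation that is an assumption here: μ_ij is the population mean for level i of the factor fitted first and level j of the other factor, n_ij is the number of observations in that cell, and n_i· is the total for level i.

```latex
H_0:\quad
\frac{1}{n_{i\cdot}}\sum_{j} n_{ij}\,\mu_{ij}
\;=\;
\frac{1}{n_{i'\cdot}}\sum_{j} n_{i'j}\,\mu_{i'j}
\qquad \text{for all } i \neq i',
\qquad n_{i\cdot} = \sum_{j} n_{ij}
```

The reduction R(α|μ) therefore tests equality of sample-size-weighted averages of the cell means, so the hypothesis changes whenever the n_ij change. With equal cell sizes it reduces to equality of the unweighted marginal means.
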
Other Jargon
• I am “doing an unadjusted analysis.”
• I am “doing an adjusted analysis.”
• “I am testing hypotheses about the population parameters and here they are.”
Change the Order

What is being tested by “center” “drug” “drug*center” in terms of the population parameters?

Point #2
• Different hypotheses are being tested by Type I, II and III sums of squares when the sample sizes from the populations are unequal.
Estimable Functions Can Help

    PROC GLM DATA=WORK.SORTTempTableSorted;
        CLASS drug center;
        /* E1-E4 print the estimable functions behind each type of SS */
        MODEL change = center drug drug*center /
            INTERCEPT SS1 SS2 SS3 SS4 SOLUTION
            E1 E2 E3 E4 ZETA=1E-08 SINGULAR=1E-07;
    RUN;

We Only Need drug*center

Interaction

Center

Int

Drug

Intercept is testing:

Why ? Let L1 =1

We Only Need drug*center

Center

Center is testing:

Why ?

Let L2 = 1;L3=L4 = L5=0

Let L3 = 1; L2=L4=L5=0

Let L4 =1; L2=L3=L5 = 0

Let L5 = 1; L2=L3= L4=0

We Only Need drug*center

Drug

Drug is testing:

Why ? Let L7 =1

We Only Need drug*center

Interaction

Drug*center is testing:

Why ?

Let L9=1; L10=L11=L12=0

Let L10 = 1 ; L9 = L11 = L12 = 0

Let L11 = 1; L9 = L10 = L12 = 0

Let L12 = 1; L9 = L10 = L11 = 0

Coefficients – Function of Sample Size

Recall: The subjects were assigned to the Centers at random.

Point #3
• The hypotheses being tested by the Type I and II sums of squares are a function of the number of times each population is sampled.
• In general, it is not a good idea to use Type I or II.
• Type III is useful if all populations are sampled at least once.
Testing “Main” Effects in the Presence of Interaction
• May we (can we) test for “main” effects in the presence of interaction, i.e., after we reject the hypothesis of no interaction?
• Sure – it is a valid test.
• Should we do the test? Well, it depends.
Need Input from Researcher
• Does the hypothesis being tested make sense?
• Is there a difference between the new drug and the placebo when you average over the centers?
Point #4
• Testing “Main” Effects in the presence of interaction may be correct if the test makes sense to the researcher.
• “Main Effect for Drug” is not unique. The hypothesis depends on the type of sum of squares used.
Effect of Missing Populations

Suppose we did not sample the (2,3) population, so one drug-by-center cell has no observations.

What effect does this have on hypothesis testing?

Type 3 Center

No Missing

Missing

Point #5
• Even Type III (3) sums of squares give rather strange hypotheses when there are some populations with no samples.
No Type IV as in GLM
• Contrast
• Estimate
Estimate - Mixed

ESTIMATE 'label' <fixed-effect values ...> <| random-effect values ...>, ... </ options>;

SAS Code: Population Mean Model

    PROC MIXED DATA=WORK.SORTTempTableSorted METHOD=REML;
        /* drug is listed first so that the drug*center cells of the first
           drug level come first, matching the ESTIMATE coefficients */
        CLASS drug center;
        MODEL change = drug*center / NOINT HTYPE=3 SOLUTION DDFM=KENWARDROGER;
        ESTIMATE 'Drug' drug*center 4 4 4 4 4 -5 -5 -5 -5 / DIVISOR=20;
        LSMEANS drug*center;
    RUN;
    QUIT;
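
The ESTIMATE coefficients can be read directly as a hypothesis about the cell means. The sketch below assumes that the nine drug*center parameters are ordered with the five D cells first and the four sampled P cells after them, and that the unsampled population is placebo in center 3; μ_ij is notation introduced here for the mean of drug i in center j.

```latex
\frac{1}{20}\Bigl(4\sum_{j=1}^{5}\mu_{Dj}
                  \;-\; 5\!\!\sum_{j\in\{1,2,4,5\}}\!\!\mu_{Pj}\Bigr)
\;=\;
\frac{1}{5}\sum_{j=1}^{5}\mu_{Dj}
\;-\;
\frac{1}{4}\!\!\sum_{j\in\{1,2,4,5\}}\!\!\mu_{Pj}
```

Under these assumptions the statement estimates the difference between the unweighted average of the five drug means and the unweighted average of the four sampled placebo means. The divisor of 20 only lets the weights 1/5 and 1/4 be entered as the integers 4 and 5.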

SAS Code: Over-parameterized Model

    PROC MIXED DATA=WORK.SORTTempTableSorted METHOD=REML;
        CLASS drug center;
        MODEL change = drug center drug*center /
            HTYPE=3 SOLUTION DDFM=KENWARDROGER;
        /* E prints the L coefficients actually used for the estimate */
        ESTIMATE 'Drug' drug 20 -20
                        center -1 -1 4 -1 -1
                        drug*center 4 4 4 4 4 -5 -5 -5 -5
                        / DIVISOR=20 E;
        LSMEANS center drug center*drug;
    RUN;
    QUIT;
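
The drug and center coefficients in this ESTIMATE statement can be derived by substituting the over-parameterized model into the cell-means comparison. The derivation below is a sketch under assumptions: μ_ij = μ + δ_i + κ_j + (δκ)_ij is notation introduced here (δ for drug, κ for center), and the unsampled population is taken to be placebo in center 3.

```latex
\begin{aligned}
\frac{1}{5}\sum_{j=1}^{5}\mu_{Dj}
 - \frac{1}{4}\sum_{j\neq 3}\mu_{Pj}
 &= (\mu-\mu) + (\delta_D-\delta_P)
    + \sum_{j=1}^{5} c_j\,\kappa_j
    + \frac{1}{5}\sum_{j=1}^{5}(\delta\kappa)_{Dj}
    - \frac{1}{4}\sum_{j\neq 3}(\delta\kappa)_{Pj}, \\[4pt]
c_j &= \tfrac{1}{5}-\tfrac{1}{4} = -\tfrac{1}{20}\quad (j\neq 3),
\qquad
c_3 = \tfrac{1}{5} = \tfrac{4}{20}.
\end{aligned}
```

Multiplying through by 20 gives drug 20 -20, center -1 -1 4 -1 -1, and drug*center 4 4 4 4 4 -5 -5 -5 -5, with a zero coefficient on the intercept. The center terms do not cancel because center 3 contributes to only one of the two averages.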

• Many researchers do not walk over and ask us to test specific statistical hypotheses.
• If they are given a chance to describe what they are interested in, it is often in terms of the population parameters.
• We may have to explain to them that we cannot simply test drug in the drug by center experiment.
• Showing the conceptual design layout can help.
Point #6
• Test the hypotheses the researcher wants to test
• Don’t rely on the computer program to test meaningful hypotheses
GLMMOD: Over-parameterized Model (P,102) Missing

    PROC GLMMOD DATA=SASUSER.SORTSORTEDQUERY1_FOR_HAMD17
                OUTDESIGN=SASUSER.design_matrix_1 OUTPARM=parm;
        CLASS drug center;
        MODEL change = drug center center*drug / ZETA=1E-08 SINGULAR=1E-07;
    RUN;

GLMMOD: Population Parameter Model (P,102) Missing

    PROC GLMMOD DATA=SASUSER.SORTSORTEDQUERY1_FOR_HAMD17
                OUTDESIGN=SASUSER.design_matrix OUTPARM=parm;
        CLASS drug center;
        MODEL change = center*drug / NOINT ZETA=1E-08 SINGULAR=1E-07;
    RUN;

USE SQL
• Use SQL to drop the response variable y from the two design-matrix data sets (X and XD)
Proc IML

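    proc iml;
        /* x: design matrix of the cell-means (NOINT) model */
        use SASUSER.QUERY_FOR_DESIGN_MATRIX;
        read all var _NUM_ into x;    /* USE only opens the data set; READ loads it */
        print x;
        /* xd: design matrix of the over-parameterized model */
        use SASUSER.QUERY_FOR_DESIGN_MATRIX_1;
        read all var _NUM_ into xd;
        print xd;
        /* d maps the over-parameterized parameters to the cell means */
        d = inv(x`*x)*x`*xd;
        print d;
        /* H: cell-mean coefficients of the 'Drug' comparison */
        H = {.2 .2 .2 .2 .2 -.25 -.25 -.25 -.25};
        print H;
        /* L: the same comparison expressed in the over-parameterized model */
        L = H*D;
        LT = L`;
        print LT;
    quit;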
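
The matrix computation above can be justified in a few lines. This is a sketch with notation introduced here: X is the full-column-rank design matrix of the cell-means model with parameter vector μ, X_d is the design matrix of the over-parameterized model with parameter vector β, and both models are assumed to describe the same expected values.

```latex
\begin{aligned}
\operatorname{E}(Y) = X\mu &= X_d\,\beta \\
\Longrightarrow\quad
\mu &= (X'X)^{-1}X'X_d\,\beta = D\,\beta \\
\Longrightarrow\quad
H\mu &= (HD)\,\beta = L\,\beta
\end{aligned}
```

Any comparison H of cell means therefore corresponds to the coefficient vector L = HD on the over-parameterized parameters, which is what the ESTIMATE statement needs.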

Estimate Statement

    ESTIMATE 'Drug'
        drug 20 -20
        center -1 -1 4 -1 -1
        drug*center 4 4 4 4 4 -5 -5 -5 -5
        / DIVISOR=20 E;

Relation Between

Over-parameterized Model Estimates – No Missing

Point #7
• Correctly testing the fixed effect hypotheses is just as important in the case where there are random terms and errors with complex covariance matrices as it is in the simple case.
Weight Gain

Data layout: Treatment, BEFORE (X), AFTER (Y)

Example - Milliken & Johnson (pre_post.sav)
• Cholesterol Study
• Pre and Post
• 4 Diets
Trouble In River City ??
• Not So Fast – General Approach
• Analysis of Covariance
Computer Model - SAS

    PROC MIXED DATA=WORK.SORTTempTableSorted METHOD=REML;
        CLASS diet;
        /* diet and pre_ch*diet allow a separate intercept and slope per diet */
        MODEL post_ch = diet pre_ch pre_ch*diet /
            HTYPE=3 DDFM=KENWARDROGER INTERCEPT E3
            OUTPM=SASUSER.PRDMMIXEDPREDICTEDMEANSPREP_0000(LABEL="Predicted means data set for ECLIB000.PREPSOT")
            OUTP=SASUSER.PREDMIXEDPREDICTIONSPREPSOT_0000(LABEL="Predicted values data set for ECLIB000.PREPSOT");
    RUN;
    QUIT;

More Results - General

What is being Tested?

A Different Look

    PROC MIXED DATA=WORK.SORTTempTableSorted METHOD=REML;
        CLASS diet;
        /* NOINT: one intercept (diet) and one slope (pre_ch*diet) per diet */
        MODEL post_ch = diet pre_ch*diet /
            NOINT HTYPE=3 DDFM=KENWARDROGER SOLUTION E3;
    RUN;
    QUIT;

Point #8
• Be careful about using gain scores
• Be careful about what is being tested in ANCOVA
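
Both cautions can be seen by writing the models side by side. The sketch below uses notation that is an assumption here: Y_ij and X_ij are the post and pre cholesterol values of subject j on diet i, and α_i and β_i are the diet-specific intercept and slope.

```latex
\begin{aligned}
\text{ANCOVA, unequal slopes:}\quad
  & Y_{ij} = \alpha_i + \beta_i X_{ij} + \varepsilon_{ij} \\[2pt]
\text{Gain score:}\quad
  & Y_{ij} - X_{ij} = \alpha_i + \varepsilon_{ij}
  \;\Longleftrightarrow\;
  Y_{ij} = \alpha_i + 1\cdot X_{ij} + \varepsilon_{ij} \\[2pt]
\text{Diet difference at } X = x:\quad
  & (\alpha_i - \alpha_{i'}) + (\beta_i - \beta_{i'})\,x
\end{aligned}
```

A gain-score analysis is the ANCOVA model with every slope forced to equal 1, which the data may not support. When the slopes differ, the difference between two diets depends on the pre value x, so a test that compares only the intercepts compares the diets at x = 0, which may not be the comparison the researcher has in mind.
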
Conclusions for Part 1
• Try to get the researcher to describe the hypotheses of interest
• Formulate your tests in terms of population parameters
• Verify what is actually being tested