HGLM
  • It really is little more than the combination of GLM and HLM
  • Example
    • Estimating the probability of voting
    • Data: Cumulative NES file (24 NESs)
Data
  • Micro level variables:
    • Partisan strength
    • Education
    • Age
    • White
    • Income
Data
  • Macro level
    • Random term
    • Presidential election
R command
  • The call is a hybrid of glm and lmer: lmer’s random-effects syntax plus glm’s family argument (in current versions of lme4, this combination is fit with glmer)

library(lme4)
M1 <- glmer(y ~ x1 + x2 + x3 + x4 + x5 + (1 | year), family = binomial(link = "logit"))

Results

Generalized linear mixed model fit using Laplace
Formula: y ~ x1 + x2 + x3 + x4 + x5 + (1 | year)
Family: binomial(logit link)
   AIC   BIC logLik deviance
 40427 40486 -20206    40413
Random effects:
 Groups Name        Variance Std.Dev.
 year   (Intercept) 0.243    0.493
number of obs: 36752, groups: year, 24
Estimated scale (compare to 1) 1

Fixed effects:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)  3.582592   0.122241    29.3  < 2e-16 ***
x1          -0.363597   0.012498   -29.1  < 2e-16 ***
x2          -0.294791   0.008584   -34.3  < 2e-16 ***
x3          -0.027922   0.000802   -34.8  < 2e-16 ***
x4          -0.206562   0.032863    -6.3  3.3e-10 ***
x5          -0.317875   0.012053   -26.4  < 2e-16 ***

Add a year level variable
  • Note the standard deviation of the random term is 0.493; now add the year-level predictor z, the presidential-election indicator (sketched below)
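A sketch of the updated call, assuming the same workspace as M1 (M2 is a hypothetical name):

M2 <- glmer(y ~ x1 + x2 + x3 + x4 + x5 + z + (1 | year), family = binomial(link = "logit"))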
Results

Generalized linear mixed model fit using Laplace
Formula: y ~ x1 + x2 + x3 + x4 + x5 + z + (1 | year)
Family: binomial(logit link)
   AIC   BIC logLik deviance
 40391 40459 -20187    40375
Random effects:
 Groups Name        Variance Std.Dev.
 year   (Intercept) 0.0451   0.212
number of obs: 36752, groups: year, 24
Estimated scale (compare to 1) 1

Fixed effects:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)  4.062479   0.096465    42.1  < 2e-16 ***
x1          -0.364464   0.012496   -29.2  < 2e-16 ***
x2          -0.291768   0.008535   -34.2  < 2e-16 ***
x3          -0.027870   0.000802   -34.8  < 2e-16 ***
x4          -0.213119   0.032828    -6.5  8.5e-11 ***
x5          -0.319210   0.012050   -26.5  < 2e-16 ***
z           -0.885312   0.090501    -9.8  < 2e-16 ***

Results
  • Adding the indicator of whether it is a presidential election year soaks up a lot of the year-level variance: the random-intercept standard deviation drops from 0.493 to 0.212
  • Do we still need the random term?
    • Remember—this is a nuisance term. It is there to account for what we do not specify in the intercept equation.
    • The test is a deviance test (see the sketch below)
    • The difference in deviance is 183—yes, we need it
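A sketch of that deviance test, comparing the glmer fit to a plain glm with no random term (m2.glm is a hypothetical name; M2 is the fit above):

m2.glm <- glm(y ~ x1 + x2 + x3 + x4 + x5 + z, family = binomial(link = "logit"))
dev.diff <- deviance(m2.glm) - deviance(M2)   # the slide reports 183
pchisq(dev.diff, df = 1, lower.tail = FALSE)  # chi-squared p-value; a boundary test, so this is conservative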
Add a random slope
  • We want to see if we need random slopes
  • Start with a random slope for x1
M3 <- glmer(y ~ x1 + x2 + x3 + x4 + x5 + z + (1 + x1 | year), family = binomial(link = "logit"))
Results

Generalized linear mixed model fit using Laplace
Formula: y ~ x1 + x2 + x3 + x4 + x5 + z + (1 + x1 | year)
Family: binomial(logit link)
   AIC   BIC logLik deviance
 40023 40108 -20002    40003
Random effects:
 Groups Name        Variance Std.Dev. Corr
 year   (Intercept) 0.4595   0.678
        x1          0.0543   0.233    -0.951
number of obs: 36752, groups: year, 24
Estimated scale (compare to 1) 1

Fixed effects:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)  4.16969    0.16339     25.5  < 2e-16 ***
x1          -0.38179    0.04930     -7.7  9.7e-15 ***
x2          -0.29784    0.00861    -34.6  < 2e-16 ***
x3          -0.02845    0.00081    -35.1  < 2e-16 ***
x4          -0.21587    0.03314     -6.5  7.3e-11 ***
x5          -0.32396    0.01213    -26.7  < 2e-16 ***
z           -0.89406    0.08948    -10.0  < 2e-16 ***

Results
  • Ok, first: that correlation (-0.951) is really high
  • Why? What is the intercept?
    • It is the predicted value when all the x’s equal zero.
    • But none of the x’s are ever zero
    • The data are not centered
    • So, subtract off the median (sketched below)
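A sketch of that centering step; the “b” names match the formulas on the following slides:

x1b <- x1 - median(x1)  # median-center each predictor
x2b <- x2 - median(x2)
x3b <- x3 - median(x3)
x4b <- x4 - median(x4)
x5b <- x5 - median(x5)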
New Results

Generalized linear mixed model fit using Laplace
Formula: y ~ x1b + x2b + x3b + x4b + x5b + z + (1 + x1b | year)
Family: binomial(logit link)
   AIC   BIC logLik deviance
 40023 40108 -20002    40003
Random effects:
 Groups Name        Variance Std.Dev. Corr
 year   (Intercept) 0.0468   0.216
        x1b         0.0543   0.233    0.252
number of obs: 36752, groups: year, 24
Estimated scale (compare to 1) 1

Fixed effects:
             Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.06417    0.07202     -0.9     0.37
x1b         -0.38132    0.04929     -7.7  1.0e-14 ***
x2b         -0.29784    0.00861    -34.6  < 2e-16 ***
x3b         -0.02844    0.00081    -35.1  < 2e-16 ***
x4b         -0.21587    0.03314     -6.5  7.4e-11 ***
x5b         -0.32394    0.01213    -26.7  < 2e-16 ***
z           -0.89418    0.08939    -10.0  < 2e-16 ***

Results
  • The correlation is now moderate (0.252)
  • The coefficient on x1 changed slightly, but not much, from the model without the random slope
  • The deviance test says we need the random slope term (deviance falls from 40375 to 40003)
  • But what if the slope varies as a function of presidential election?
  • Add the interaction term (sketched below)
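A sketch of that call (M4 is a hypothetical name):

M4 <- glmer(y ~ x1b + x2b + x3b + x4b + x5b + z + x1b:z + (1 | year), family = binomial(link = "logit"))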
Generalized linear mixed model fit using Laplace
Formula: y ~ x1b + x2b + x3b + x4b + x5b + z + x1b:z + (1 | year)
Family: binomial(logit link)
   AIC   BIC logLik deviance
 40365 40442 -20174    40347
Random effects:
 Groups Name        Variance Std.Dev.
 year   (Intercept) 0.0443   0.210
number of obs: 36752, groups: year, 24
Estimated scale (compare to 1) 1

Fixed effects:
             Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.051128   0.071366    -0.7     0.47
x1b         -0.299611   0.017549   -17.1  < 2e-16 ***
x2b         -0.291545   0.008530   -34.2  < 2e-16 ***
x3b         -0.027875   0.000802   -34.8  < 2e-16 ***
x4b         -0.212297   0.032868    -6.5  1.1e-10 ***
x5b         -0.319416   0.012048   -26.5  < 2e-16 ***
z           -0.916731   0.089988   -10.2  < 2e-16 ***
x1b:z       -0.128740   0.024592    -5.2  1.7e-07 ***

  • The interaction term adds to the model now, but what if we also add the random slope? (Sketched below.)
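A sketch of the model with both the interaction and the random slope (M5 is a hypothetical name):

M5 <- glmer(y ~ x1b + x2b + x3b + x4b + x5b + z + x1b:z + (1 + x1b | year), family = binomial(link = "logit"))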
Generalized linear mixed model fit using Laplace
Formula: y ~ x1b + x2b + x3b + x4b + x5b + z + x1b:z + (1 + x1b | year)
Family: binomial(logit link)
   AIC   BIC logLik deviance
 40024 40118 -20001    40002
Random effects:
 Groups Name        Variance Std.Dev. Corr
 year   (Intercept) 0.0467   0.216
        x1b         0.0513   0.226    0.248
number of obs: 36752, groups: year, 24
Estimated scale (compare to 1) 1

Fixed effects:
             Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.05055    0.07299     -0.7     0.49
x1b         -0.32210    0.07065     -4.6  5.1e-06 ***
x2b         -0.29783    0.00861    -34.6  < 2e-16 ***
x3b         -0.02845    0.00081    -35.1  < 2e-16 ***
x4b         -0.21610    0.03314     -6.5  7.0e-11 ***
x5b         -0.32389    0.01213    -26.7  < 2e-16 ***
z           -0.91938    0.09224    -10.0  < 2e-16 ***
x1b:z       -0.11019    0.09619     -1.1     0.25

  • Now the interaction term is insignificant!
The interaction does not add to the model
  • If we run the paired model comparisons (see the sketch below), we find:
    • Including the interaction term is better than omitting it if there is no random slope
    • Including the random slope is better than omitting it
    • The model with the random slope and the interaction is not superior to the model with only the random slope
  • The interaction is insignificant
  • The deviance test tells us to reject including it
  • So? The best model omits it. We don’t need it for specification (though we might for theory).
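A sketch of those comparisons, using the hypothetical names above plus M3b for the centered random-slope-only model:

M3b <- glmer(y ~ x1b + x2b + x3b + x4b + x5b + z + (1 + x1b | year), family = binomial(link = "logit"))
anova(M4, M5)   # adding the random slope to the interaction model: deviance 40347 vs 40002—keep it
anova(M3b, M5)  # adding the interaction to the random-slope model: 40003 vs 40002—drop it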
Other x’s
  • Long story short: both the random term and the interaction with presidential election improve the model for x2, x4, and x5
  • Only the random term improves fit for x3
Multiple random slopes

If we add a random slope on x2, we improve the model

Formula: y ~ x1b + x2b + x3b + x4b + x5b + z + x2b:z + x4b:z + x5b:z + (1 + x1b + x2b | year)
Family: binomial(logit link)
   AIC   BIC logLik deviance
 39425 39561 -19697    39393
Random effects:
 Groups Name        Variance Std.Dev. Corr
 year   (Intercept) 0.0789   0.281
        x1b         0.0709   0.266    0.328
        x2b         0.0228   0.151    0.383 0.988
number of obs: 36752, groups: year, 24
Estimated scale (compare to 1) 1

Fixed effects:
             Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.142094   0.093006    -1.5  0.12657
x1b         -0.295615   0.055877    -5.3  1.2e-07 ***
x2b         -0.246501   0.034072    -7.2  4.7e-13 ***
x3b         -0.029652   0.000818   -36.3  < 2e-16 ***
x4b         -0.148631   0.047011    -3.2  0.00157 **
x5b         -0.243784   0.016835   -14.5  < 2e-16 ***
z           -0.462598   0.124808    -3.7  0.00021 ***
x2b:z       -0.045045   0.023232    -1.9  0.05251 .
x4b:z       -0.251922   0.065749    -3.8  0.00013 ***
x5b:z       -0.177566   0.024155    -7.4  2.0e-13 ***

The correlation between the random slopes is really high (0.988)
  • Adding a random slope for x3 is intractable—you get negative estimates of the standard deviations.
  • This is a serious problem
Item response
  • Basic idea
    • Each person has multiple indicators which tap the underlying concept of interest
    • Usually, not everyone gets the same indicators
    • No indicator is used for every person
    • Indicators differ in their difficulty
  • So, the dv is the probability that the answer from a specific person on a specific question is a success (a 1)
IRT
  • The model has two parameters: αj, person j’s latent ability, and βk, question k’s difficulty (see the sketch below)
  • If people get different questions, we need to add a subscript i to denote which response we are speaking of
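In standard notation (following Gelman and Hill), the model and its response-indexed version are:

Pr(yjk = 1) = logit^-1(αj − βk)

Pr(yi = 1) = logit^-1(αj[i] − βk[i])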
IRT
  • Examples:
    • Roll call votes
    • Supreme Court decisions
    • Test scores (SAT, GRE)
    • Democracy
    • Knowledge
  • The inherent problem is that we need to estimate a person’s ability at the same time as we estimate the question’s difficulty
Identifiability
  • The other problem is that the model is not identified
  • Add a constant to all of the abilities and all of the difficulties and you get the same answer
  • We just need to constrain the problem somehow—force a question to have a fixed difficulty or a person to have a fixed ability
    • If 0 and 1 aren’t natural anchors, you have a reflection problem too
    • Easy
IRT
  • How about we make this a multilevel problem (sketched below)?
  • Solve identification by fixing one of the means to zero.
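A sketch of that multilevel setup, in the same notation:

αj ~ N(0, σα²)    (the mean of the abilities is fixed at zero for identification)
βk ~ N(μβ, σβ²)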
IRT
  • We can easily add group level predictors (sketched after this list):
  • What are these?
    • The X’s in the first equation are things that we think predict the person’s ability (gender, race, party,…)
    • In the second equation, they are whatever information we have about the question’s difficulty, separate from what the data themselves tell us.
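A sketch of those two equations; Xj and Wk are hypothetical labels for the person-level and question-level predictors, with coefficient vectors γ and δ:

αj ~ N(Xjγ, σα²)   (the X’s: person-level predictors such as gender, race, party)
βk ~ N(Wkδ, σβ²)   (question-level information about difficulty)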
IRT
  • Hold on a minute
  • What are these?
  • The basic idea is that everyone has a probability of getting a question right.
  • That probability is based on two things:
    • Your ability
    • How hard the question is
  • Given these two things, the probability can be defined for each person
IRT
  • The basic prediction is easy: Person j will be a success on question k if his or her ability is greater than the difficulty of the question.
  • So you want to find the set of difficulty and ability parameters that best fit the data
IRT-discrimination
  • The graph two slides ago (parallel lines) assumed that the questions were equally good at discriminating based on ability
    • More specifically that the effect of ability on each of the questions was the same—fixed slope
    • We can allow that to vary (see the sketch below)
    • Gamma (γ) defines the ability of the question to discriminate—higher values mean that the question is a better predictor. That also means a sharper curve
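In the same notation, one standard form of the discrimination model is:

Pr(yi = 1) = logit^-1(γk[i](αj[i] − βk[i]))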
IRT
  • We won’t estimate this yet
  • It is really hard to estimate and needs Bayesian machinery
  • It is, however, pretty cool. Lots of applications
    • This is a fundamental measurement issue
    • Can improve on classic factor analyses or standard scaling techniques
Other HGLM models
  • The R code is basically the same as the logit—just supply a different link and family (see the sketch below)
  • The problem is that the likelihood-based estimation problems get worse. My experience is that binary choice is the easiest.
  • More parameters to estimate
  • Wait until we can do this as Bayesian
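A minimal sketch, assuming a hypothetical count outcome y2 in the same workspace:

M.pois <- glmer(y2 ~ x1 + x2 + x3 + x4 + x5 + (1 | year), family = poisson(link = "log"))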
Next time, MCMC
  • This gets into detailed probability theory
    • Gelman and Hill, Chapter 18
    • Jackman, Simon. 2000. “Estimation and Inference via Bayesian Simulation.” AJPS: 375–404.
    • Casella and George. 1992. “Explaining the Gibbs Sampler.” The American Statistician: 167–174.
    • “Markov Chain Monte Carlo in Practice: A Roundtable Discussion.” 1998. The American Statistician: 93–100.