Generalized method of moments estimation

FINA790C, HKUST, Spring 2006

Outline
  • Basic motivation and examples
  • First case: linear model and instrumental variables
    • Estimation
    • Asymptotic distribution
    • Test of model specification
  • General case of GMM
    • Estimation
    • Testing
Basic motivation
  • Method of moments: suppose we want to estimate the population mean $\mu$ and population variance $\sigma^2$ of a random variable $v_t$
  • These two parameters satisfy the population moment conditions

$$E[v_t] - \mu = 0$$

$$E[v_t^2] - (\sigma^2 + \mu^2) = 0$$

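Method of moments
  • So let’s think of the analogous sample moment conditions, with $\hat\mu$ and $\hat\sigma^2$ denoting the estimates:

$$T^{-1}\sum_{t=1}^{T} v_t - \hat\mu = 0$$

$$T^{-1}\sum_{t=1}^{T} v_t^2 - (\hat\sigma^2 + \hat\mu^2) = 0$$

  • This implies

$$\hat\mu = T^{-1}\sum_{t=1}^{T} v_t$$

$$\hat\sigma^2 = T^{-1}\sum_{t=1}^{T} (v_t - \hat\mu)^2$$

Basic idea: population moment conditions provide information which can be used for estimation of parameters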
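As a minimal sketch of the mean and variance example, the snippet below solves the two sample moment conditions on simulated data. The function name `method_of_moments`, the use of NumPy, and the simulated normal sample are illustrative assumptions, not part of the slides.

```python
import numpy as np

def method_of_moments(v):
    """Solve the two sample moment conditions for the mean and variance."""
    v = np.asarray(v, dtype=float)
    m1 = v.mean()               # T^{-1} sum v_t
    m2 = (v ** 2).mean()        # T^{-1} sum v_t^2
    mu_hat = m1                 # first condition: m1 - mu = 0
    sigma2_hat = m2 - m1 ** 2   # second condition: m2 - (sigma^2 + mu^2) = 0
    return mu_hat, sigma2_hat

# Simulated sample (assumed for illustration): mean 1.5, variance 4.
rng = np.random.default_rng(0)
v = rng.normal(loc=1.5, scale=2.0, size=10_000)
mu_hat, sigma2_hat = method_of_moments(v)
```

The variance estimate coincides with the average squared deviation from the sample mean, which is the second expression implied by the sample moment conditions.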

  • Simple supply and demand system:

$$q_t^D = a\,p_t + u_t^D$$

$$q_t^S = \beta_1 n_t + \beta_2 p_t + u_t^S$$

$$q_t^D = q_t^S = q_t$$

  • $q_t^D$, $q_t^S$ are quantity demanded and supplied; $p_t$ is price and $q_t$ equals quantity produced
  • Problem: how to estimate $a$, given $q_t$ and $p_t$
  • OLS estimator runs into problems: $p_t$ is determined jointly with $q_t$, so it is correlated with $u_t^D$
  • One solution: find $z_t^D$ such that $\mathrm{cov}(z_t^D, u_t^D) = 0$
  • Then $\mathrm{cov}(z_t^D, q_t) - a\,\mathrm{cov}(z_t^D, p_t) = 0$
  • So if $E[u_t^D] = 0$ then $E[z_t^D q_t] - a\,E[z_t^D p_t] = 0$
  • Method of moments leads to

$$a^* = \frac{T^{-1}\sum_{t} z_t^D q_t}{T^{-1}\sum_{t} z_t^D p_t}$$

which is the instrumental variables estimator of $a$ with instrument $z_t^D$
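The following sketch simulates the supply and demand system and compares OLS with the instrumental variables estimator. It assumes, purely for illustration, that the supply shifter $n_t$ is used as the instrument, that all shocks are independent standard normals, and that the coefficient values are as set in the code; none of these choices come from the slides.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 50_000
a_true, b1, b2 = -1.0, 1.0, 1.0     # assumed demand slope and supply coefficients

n = rng.normal(size=T)              # supply shifter n_t, used here as the instrument
u_d = rng.normal(size=T)            # demand shock u_t^D
u_s = rng.normal(size=T)            # supply shock u_t^S

# Market clearing: a*p + u_d = b1*n + b2*p + u_s, solved for the price.
p = (b1 * n + u_s - u_d) / (a_true - b2)
q = a_true * p + u_d

a_ols = (p @ q) / (p @ p)           # OLS (no intercept): price is correlated with u_d
a_iv = (n @ q) / (n @ p)            # a* = sum(z q) / sum(z p) with z = n
```

In this design the price depends on the demand shock, so `a_ols` stays away from the true slope even in a large sample, while `a_iv` settles close to it.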

Method of moments
  • Population moment condition: a vector of observed variables $v_t$ and a $p \times 1$ vector of parameters $\theta$ satisfy a $p \times 1$ vector of conditions $E[f(v_t,\theta)] = 0$ for all $t$
  • The method of moments estimator $\theta^*_T$ solves the analogous sample moment conditions

$$g_T(\theta^*_T) = T^{-1}\sum_{t=1}^{T} f(v_t,\theta^*_T) = 0 \qquad (1)$$

where $T$ is the sample size

  • Intuitively, $\theta^*_T \to \theta_0$ in probability, where $\theta_0$ is the solution of the population moment conditions
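Generalized method of moments
  • Now suppose $f$ is a $q \times 1$ vector and $q > p$, so that (1) has more equations than unknowns and in general no exact solution
  • The GMM estimator chooses the value of $\theta$ which comes closest to satisfying (1) as the estimator of $\theta_0$, where “closeness” is measured by

$$Q_T(\theta) = \Big\{T^{-1}\sum_{t=1}^{T} f(v_t,\theta)\Big\}'\, W_T\, \Big\{T^{-1}\sum_{t=1}^{T} f(v_t,\theta)\Big\} = g_T(\theta)'\, W_T\, g_T(\theta)$$

and $W_T$ is positive semi-definite with $\mathrm{plim}(W_T) = W$ positive definite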

GMM example 1
  • Power utility based asset pricing model
  • $E_t[\beta (C_{t+1}/C_t)^{-a} R_{i,t+1} - 1] = 0$ with unknown parameters $\beta$, $a$
  • The population unconditional moment conditions are

$$E[(\beta (C_{t+1}/C_t)^{-a} R_{i,t+1} - 1)\, z_{jt}] = 0 \quad \text{for } j = 1,\dots,q$$

where $z_{jt}$ is in the information set at time $t$
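A hedged sketch of how the unconditional moment conditions of this example could be coded. The function name `euler_moments`, the array layout (row $t$ holds consumption growth and the gross return from $t$ to $t+1$ alongside the instruments known at $t$), and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def euler_moments(theta, cons_growth, gross_ret, Z):
    """Return the (T, q) array whose row t is the Euler error times the instruments.

    theta       : (discount factor, curvature parameter a)
    cons_growth : C_{t+1}/C_t, shape (T,)
    gross_ret   : R_{i,t+1}, shape (T,)
    Z           : instruments known at time t, shape (T, q)
    """
    beta, a = theta
    u = beta * cons_growth ** (-a) * gross_ret - 1.0   # Euler equation error
    return u[:, None] * Z

# Synthetic check: build returns so the Euler error is zero at (beta0, a0).
rng = np.random.default_rng(8)
T, q = 200, 3
cons_growth = np.exp(rng.normal(0.02, 0.02, size=T))
beta0, a0 = 0.97, 2.0
gross_ret = cons_growth ** a0 / beta0
Z = np.column_stack([np.ones(T), rng.normal(size=(T, q - 1))])

f = euler_moments((beta0, a0), cons_growth, gross_ret, Z)
g = f.mean(axis=0)                                     # sample moments g_T(theta)
g_wrong = euler_moments((0.90, a0), cons_growth, gross_ret, Z).mean(axis=0)
```

At the parameter values used to construct the data the sample moments are zero up to rounding, while a wrong discount factor leaves a nonzero first moment.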

GMM example 2
  • CAPM says

$$E[R_{i,t+1} - \gamma_0(1 - b_i) - b_i R_{m,t+1}] = 0$$

  • Market efficiency

$$E[(R_{i,t+1} - \gamma_0(1 - b_i) - b_i R_{m,t+1})\, z_{jt}] = 0$$

GMM example 3
  • Suppose the conditional probability density function of the continuous stationary random vector $v_t$, given $V_{t-1} = \{v_{t-1}, v_{t-2}, \dots\}$, is $p(v_t;\theta_0,V_{t-1})$
  • The MLE of $\theta_0$ based on the conditional log likelihood function is the value of $\theta$ which maximizes $L_T(\theta) = \sum_{t=1}^{T} \ln\{p(v_t;\theta,V_{t-1})\}$, i.e. which solves $\partial L_T(\theta)/\partial\theta = 0$
  • This implies that the MLE is just the GMM estimator based on the population moment condition

$$E\big[\partial \ln\{p(v_t;\theta,V_{t-1})\}/\partial\theta\,\big|_{\theta=\theta_0}\big] = 0$$

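GMM estimation
  • The GMM estimator $\theta^*_T = \arg\min_{\theta\in\Theta} Q_T(\theta)$ generates the first order conditions

$$\Big\{T^{-1}\sum_{t=1}^{T} \partial f(v_t,\theta^*_T)/\partial\theta'\Big\}'\, W_T\, \Big\{T^{-1}\sum_{t=1}^{T} f(v_t,\theta^*_T)\Big\} = 0 \qquad (2)$$

where $\partial f(v_t,\theta^*_T)/\partial\theta'$ is the $q \times p$ matrix with $i,j$ element $\partial f_i(v_t,\theta^*_T)/\partial\theta_j$

  • There is typically no closed form solution for $\theta^*_T$ so it must be obtained through numerical optimization methods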
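A minimal sketch of numerical GMM estimation, assuming SciPy's `minimize` with the BFGS method, an identity weighting matrix, and an illustrative over-identified problem (three moment conditions for the mean and variance of a symmetric distribution). The moment functions, starting values, and simulated data are assumptions chosen for the example.

```python
import numpy as np
from scipy.optimize import minimize

def moments(theta, v):
    """f(v_t, theta) stacked by rows: q = 3 conditions for p = 2 parameters."""
    mu, sigma2 = theta
    d = v - mu
    return np.column_stack([d, d ** 2 - sigma2, d ** 3])

def Q(theta, v, W):
    """GMM objective Q_T(theta) = g_T(theta)' W_T g_T(theta)."""
    g = moments(theta, v).mean(axis=0)
    return g @ W @ g

rng = np.random.default_rng(2)
v = rng.normal(1.5, 2.0, size=20_000)      # simulated data: mean 1.5, variance 4
W = np.eye(3)                              # sub-optimal but valid weighting matrix

res = minimize(Q, x0=np.array([0.0, 1.0]), args=(v, W), method="BFGS")
theta_star = res.x                         # numerical minimizer of Q_T
```

Because there are more moment conditions than parameters, `Q(theta_star, v, W)` is small but generally not exactly zero.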
Example: Instrumental variable estimation of linear model
  • Suppose we have $y_t = x_t'\theta_0 + u_t$, $t = 1,\dots,T$
  • where $x_t$ is a $(p \times 1)$ vector of observed explanatory variables, $y_t$ is an observed scalar, and $u_t$ is an unobserved error term with $E[u_t] = 0$
  • Let $z_t$ be a $q \times 1$ vector of instruments such that $E[z_t u_t] = 0$ (contemporaneously uncorrelated)
  • Problem is to estimate $\theta_0$
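IV estimation
  • The GMM estimator $\theta^*_T = \arg\min_{\theta\in\Theta} Q_T(\theta)$

where $Q_T(\theta) = \{T^{-1}u(\theta)'Z\}\, W_T\, \{T^{-1}Z'u(\theta)\}$, $u(\theta) = y - X\theta$, and $X$ $(T \times p)$ and $Z$ $(T \times q)$ stack $x_t'$ and $z_t'$ by rows

  • The FOCs are

$$(T^{-1}X'Z)\, W_T\, (T^{-1}Z'y) = (T^{-1}X'Z)\, W_T\, (T^{-1}Z'X)\, \theta^*_T$$

  • When # of instruments $q$ = # of parameters $p$ (“just-identified”) and $(T^{-1}Z'X)$ is nonsingular then

$$\theta^*_T = (T^{-1}Z'X)^{-1}(T^{-1}Z'y)$$

independently of the weighting matrix $W_T$

IV estimation
  • When # instruments $q$ > # parameters $p$ (“over-identified”) then

$$\theta^*_T = \{(T^{-1}X'Z)\, W_T\, (T^{-1}Z'X)\}^{-1}(T^{-1}X'Z)\, W_T\, (T^{-1}Z'y)$$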
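The closed-form linear estimator can be sketched as below. The function name `linear_gmm` and the simulated design (one endogenous regressor plus a constant, three excluded instruments) are assumptions for illustration; the check at the end exercises the claim that the just-identified estimate does not depend on the weighting matrix.

```python
import numpy as np

def linear_gmm(y, X, Z, W):
    """Closed-form linear GMM/IV estimator for a given weighting matrix W."""
    T = len(y)
    ZX = Z.T @ X / T                    # T^{-1} Z'X, shape (q, p)
    Zy = Z.T @ y / T                    # T^{-1} Z'y, shape (q,)
    return np.linalg.solve(ZX.T @ W @ ZX, ZX.T @ W @ Zy)

rng = np.random.default_rng(3)
T = 5_000
z = rng.normal(size=(T, 3))
e = rng.normal(size=T)
x = z @ np.array([1.0, 0.5, 0.5]) + e
u = 0.6 * e + 0.8 * rng.normal(size=T)  # correlated with x, uncorrelated with z
y = 1.0 + 2.0 * x + u                   # theta_0 = (1, 2)

X = np.column_stack([np.ones(T), x])
Z_over = np.column_stack([np.ones(T), z])   # q = 4 > p = 2
Z_just = Z_over[:, :2]                      # q = 2 = p

# Just-identified: any positive definite W gives the same estimate.
A = rng.normal(size=(2, 2))
W_random = A @ A.T + np.eye(2)
theta_just_I = linear_gmm(y, X, Z_just, np.eye(2))
theta_just_W = linear_gmm(y, X, Z_just, W_random)

# Over-identified: the estimate depends on W; identity is used here for illustration.
theta_over = linear_gmm(y, X, Z_over, np.eye(4))
```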

Identifying and overidentifying restrictions
  • Go back to the first-order conditions
  • $(T^{-1}X'Z)\, W_T\, (T^{-1}Z'u(\theta^*_T)) = 0$
  • These imply that GMM = method of moments based on population moment conditions $E[x_t z_t']\, W\, E[z_t u_t(\theta_0)] = 0$
  • When $q = p$ GMM = method of moments based on $E[z_t u_t(\theta_0)] = 0$
  • When $q > p$ GMM sets $p$ linear combinations of $E[z_t u_t(\theta_0)]$ equal to 0
Identifying and over-identifying restrictions: details
  • From (2), GMM is method of moments estimator based on population moments

$$\{E[\partial f(v_t,\theta_0)/\partial\theta']\}'\, W\, \{E[f(v_t,\theta_0)]\} = 0 \qquad (3)$$

  • Let $F(\theta_0) = W^{1/2} E[\partial f(v_t,\theta_0)/\partial\theta']$ and $\mathrm{rank}(F(\theta_0)) = p$. (The rank condition is necessary for identification of $\theta_0$.) Rewrite (3) as

$$F(\theta_0)'\, W^{1/2} E[f(v_t,\theta_0)] = 0$$

or

$$F(\theta_0)[F(\theta_0)'F(\theta_0)]^{-1}F(\theta_0)'\, W^{1/2} E[f(v_t,\theta_0)] = 0 \qquad (4)$$

  • This says that the least squares projection of $W^{1/2}E[f(v_t,\theta_0)]$ on to the column space of $F(\theta_0)$ is zero. In other words the GMM estimator is based on $\mathrm{rank}\{F(\theta_0)[F(\theta_0)'F(\theta_0)]^{-1}F(\theta_0)'\} = p$ restrictions on the (transformed) population moment condition $W^{1/2}E[f(v_t,\theta_0)]$. These are the identifying restrictions; GMM chooses $\theta^*_T$ to satisfy them.
Identifying and over-identifying restrictions: details
  • The restrictions that are left over are

$$\{I_q - F(\theta_0)[F(\theta_0)'F(\theta_0)]^{-1}F(\theta_0)'\}\, W^{1/2} E[f(v_t,\theta_0)] = 0$$

  • This says that the projection of $W^{1/2}E[f(v_t,\theta_0)]$ on to the orthogonal complement of the column space of $F(\theta_0)$ is zero, generating $q - p$ restrictions on the transformed population moment condition. These over-identifying restrictions are ignored by the GMM estimator, so they need not be satisfied in the sample
  • From (2),

$$W_T^{1/2}\, T^{-1}\sum_{t} f(v_t,\theta^*_T) = \{I_q - F_T(\theta^*_T)[F_T(\theta^*_T)'F_T(\theta^*_T)]^{-1}F_T(\theta^*_T)'\}\, W_T^{1/2}\, T^{-1}\sum_{t} f(v_t,\theta^*_T)$$

where $F_T(\theta) = W_T^{1/2}\, T^{-1}\sum_{t} \partial f(v_t,\theta)/\partial\theta'$. Therefore $Q_T(\theta^*_T)$ is like a sum of squared residuals, and can be interpreted as a measure of how far the sample is from satisfying the over-identifying restrictions.

Asymptotic properties of GMM instrumental variables estimator

It can be shown that:

  • $\theta^*_T$ is consistent for $\theta_0$
  • $(\theta^*_{T,i} - \theta_{0,i})/\sqrt{V^*_{T,ii}/T} \sim N(0,1)$ asymptotically, where
    • $V^*_T = \{(T^{-1}X'Z)\,W_T\,(T^{-1}Z'X)\}^{-1}\,(T^{-1}X'Z)\,W_T\,S^*_T\,W_T\,(T^{-1}Z'X)\,\{(T^{-1}X'Z)\,W_T\,(T^{-1}Z'X)\}^{-1}$
    • $S^*_T$ is a consistent estimator of $S = \lim_{T\to\infty} \mathrm{Var}[T^{-1/2}Z'u]$
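Covariance matrix estimation
  • Assuming $u_t$ is serially uncorrelated

$$\mathrm{Var}[T^{-1/2}Z'u] = E\Big[\Big\{T^{-1/2}\sum_{t} z_t u_t\Big\}\Big\{T^{-1/2}\sum_{t} z_t u_t\Big\}'\Big] = T^{-1}\sum_{t} E[u_t^2 z_t z_t']$$

  • Therefore,

$$S^*_T = T^{-1}\sum_{t} u_t(\theta^*_T)^2\, z_t z_t'$$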

Two step estimator
  • Notice that the asymptotic variance depends on the weighting matrix $W_T$
  • The optimal choice is $W_T = S^{*-1}_T$, which gives $V^*_T = \{(T^{-1}X'Z)\, S^{*-1}_T\, (T^{-1}Z'X)\}^{-1}$
  • But we need $\theta^*_T$ to construct $S^*_T$. This suggests a two-step (iterative) procedure
    • (a) estimate with sub-optimal $W_T$, get $\theta^*_T(1)$, get $S^*_T(1)$
    • (b) estimate with $W_T = S^*_T(1)^{-1}$
    • (c) iterated GMM: from (b) get $\theta^*_T(2)$, get $S^*_T(2)$, set $W_T = S^*_T(2)^{-1}$, repeat
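Steps (a) to (c) can be sketched for the linear IV model as follows. The identity matrix as first-step weighting matrix, the function names, and the heteroskedastic simulated data are assumptions for illustration; the covariance estimator assumes serially uncorrelated errors, and the standard errors use the covariance estimate recomputed at the final parameter estimate.

```python
import numpy as np

def gmm_iv(y, X, Z, W):
    """Linear GMM estimator for a given weighting matrix W."""
    T = len(y)
    ZX, Zy = Z.T @ X / T, Z.T @ y / T
    return np.linalg.solve(ZX.T @ W @ ZX, ZX.T @ W @ Zy)

def s_star(y, X, Z, theta):
    """S*_T = T^{-1} sum u_t^2 z_t z_t' (serially uncorrelated errors)."""
    Zu = Z * (y - X @ theta)[:, None]
    return Zu.T @ Zu / len(y)

def iterated_gmm(y, X, Z, n_steps=2):
    T = len(y)
    theta = gmm_iv(y, X, Z, np.eye(Z.shape[1]))   # (a) sub-optimal W = I (assumed)
    for _ in range(n_steps - 1):                  # (b), and (c) when n_steps > 2
        S = s_star(y, X, Z, theta)
        theta = gmm_iv(y, X, Z, np.linalg.inv(S))
    S = s_star(y, X, Z, theta)
    ZX = Z.T @ X / T
    V = np.linalg.inv(ZX.T @ np.linalg.inv(S) @ ZX)   # variance under optimal W
    return theta, np.sqrt(np.diag(V) / T)

rng = np.random.default_rng(4)
T = 5_000
z = rng.normal(size=(T, 3))
e = rng.normal(size=T)
x = z @ np.array([1.0, 0.5, 0.5]) + e
# Heteroskedastic error: variance depends on z, but E[z u] = 0 still holds.
u = (0.6 * e + 0.8 * rng.normal(size=T)) * np.sqrt(0.5 + z[:, 0] ** 2)
y = 1.0 + 2.0 * x + u
X = np.column_stack([np.ones(T), x])
Z = np.column_stack([np.ones(T), z])

theta_2step, se_2step = iterated_gmm(y, X, Z, n_steps=2)
```

Setting `n_steps` above 2 repeats the update, which corresponds to iterated GMM in step (c).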
GMM and instrumental variables
  • If $u_t$ is homoskedastic, $\mathrm{var}[u] = \sigma^2 I_T$, then $S^*_T = \sigma^{*2}\, T^{-1} Z'Z$ where $Z$ is a $(T \times q)$ matrix with $t$-th row $= z_t'$ and $\sigma^{*2}$ is a consistent estimate of $\sigma^2$
  • Choosing this as the weighting matrix $W_T = S^{*-1}_T$ gives

$$\theta^*_T = \{X'Z(Z'Z)^{-1}Z'X\}^{-1} X'Z(Z'Z)^{-1}Z'y = \{\hat X'\hat X\}^{-1}\hat X'y$$

where $\hat X = Z(Z'Z)^{-1}Z'X$ is the predicted value of $X$ from a regression of $X$ on $Z$. This is two-stage least squares (2SLS): first regress $X$ on $Z$, then regress $y$ on the fitted value of $X$ from the first stage.
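A short numerical check of the equivalence, under an assumed simulated design: the GMM formula with $W_T$ proportional to $(Z'Z)^{-1}$ and the explicit two-stage regression should give the same estimate. Variable names and data are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)
T = 2_000
z = rng.normal(size=(T, 3))
e = rng.normal(size=T)
x = z @ np.array([1.0, 0.5, 0.5]) + e
u = 0.6 * e + 0.8 * rng.normal(size=T)
y = 1.0 + 2.0 * x + u
X = np.column_stack([np.ones(T), x])
Z = np.column_stack([np.ones(T), z])

# GMM with W proportional to (Z'Z)^{-1}; the scalar sigma*^2 cancels.
ZtZ_inv = np.linalg.inv(Z.T @ Z)
XZ = X.T @ Z
theta_gmm = np.linalg.solve(XZ @ ZtZ_inv @ XZ.T, XZ @ ZtZ_inv @ (Z.T @ y))

# Two-stage least squares: regress X on Z, then y on the fitted values.
X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
theta_2sls = np.linalg.lstsq(X_hat, y, rcond=None)[0]
```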

Model specification test
  • Identifying restrictions are satisfied in the sample regardless of whether the model is correct
  • Over-identifying restrictions are not imposed in the sample
  • This gives rise to the overidentifying restrictions test

$$J_T = T\,Q_T(\theta^*_T) = T^{-1/2}u(\theta^*_T)'Z\; S^{*-1}_T\; T^{-1/2}Z'u(\theta^*_T)$$

  • Under $H_0$: $E[z_t u_t(\theta_0)] = 0$, $J_T$ asymptotically follows a $\chi^2(q-p)$ distribution
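The test can be sketched for the linear IV model as below, assuming a two-step estimator with an identity first-step weighting matrix, serially uncorrelated errors, and `scipy.stats.chi2` for the p-value. The function name `j_test` and both simulated data sets (one with valid instruments, one where the third instrument enters the error) are illustrative assumptions.

```python
import numpy as np
from scipy.stats import chi2

def j_test(y, X, Z):
    """Two-step GMM J statistic and its asymptotic chi-square p-value."""
    T, q = Z.shape
    p = X.shape[1]
    ZX, Zy = Z.T @ X / T, Z.T @ y / T

    def gmm(W):
        return np.linalg.solve(ZX.T @ W @ ZX, ZX.T @ W @ Zy)

    theta1 = gmm(np.eye(q))                       # first step, W = I (assumed)
    Zu = Z * (y - X @ theta1)[:, None]
    S = Zu.T @ Zu / T                             # S*_T from first-step residuals
    theta2 = gmm(np.linalg.inv(S))                # second step, W = S*^{-1}
    g = Z.T @ (y - X @ theta2) / T                # sample moments at theta2
    J = T * g @ np.linalg.solve(S, g)             # J_T = T Q_T(theta2)
    return J, chi2.sf(J, df=q - p)

rng = np.random.default_rng(6)
T = 5_000
Z = rng.normal(size=(T, 3))
e = rng.normal(size=T)
x = Z @ np.ones(3) + e
u = 0.5 * e + rng.normal(size=T)
X = x[:, None]                                    # p = 1, q = 3, so 2 degrees of freedom

J_ok, p_ok = j_test(2.0 * x + u, X, Z)                       # valid instruments
J_bad, p_bad = j_test(2.0 * x + u + 0.5 * Z[:, 2], X, Z)     # third instrument invalid
```

In the second data set the moment condition for the third instrument fails, so the statistic is far in the tail of the reference distribution.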
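(From last time)
  • In this case the MRS is

$$m_{t,t+j} = \beta^j \{C_{t+j}/C_t\}^{-\gamma}$$

  • Substituting in above gives, with $\Delta\ln c_{t+j} = \ln(C_{t+j}/C_t)$:

$$E_t[\ln R_{i,t,t+j}] = -j\ln\beta + \gamma E_t[\Delta\ln c_{t+j}] - \tfrac{1}{2}\gamma^2\,\mathrm{var}_t[\Delta\ln c_{t+j}] - \tfrac{1}{2}\,\mathrm{var}_t[\ln R_{i,t,t+j}] + \gamma\,\mathrm{cov}_t[\Delta\ln c_{t+j}, \ln R_{i,t,t+j}]$$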

Example: power utility + lognormal distributions
  • First order condition from investor maximizing power utility with log-normal distribution gives:

$$\ln(R_{i,t+1}) = \mu_i + a\,\Delta\ln C_{t+1} + u_{i,t+1}$$

  • the error term $u_{i,t+1}$ could be correlated with $\Delta\ln C_{t+1}$ so we can’t use OLS
  • However $E_t[u_{i,t+1}] = 0$ means $u_{i,t+1}$ is uncorrelated with any $z_t$ known at time $t$
Instrumental variables regressions for returns and consumption growth

Return   Stage 1            Stage 2               JT test
         r        Δlnc      γ         (no label)  r        Δlnc
CP       0.297    0.102     -0.953    -0.118      0.221    0.091
         (0.00)   (0.15)    (0.57)    (0.11)      (0.00)   (0.10)
Stock    0.110    0.102     -0.235    -0.008      0.105    0.097
         (0.11)   (0.15)    (1.65)    (0.06)      (0.06)   (0.08)

Annual US data 1889-1994 on growth in log real consumption of nondurables & services, log real return on S&P500, real return on 6-month commercial paper. Instruments are 2 lags each of: real commercial paper rate, real consumption growth rate and log of dividend-price ratio

GMM estimation – general case
  • Go back to GMM estimation but let $f$ be a vector of continuous nonlinear functions of the data and unknown parameters
  • In our case, we have $N$ assets and the moment condition is that $E[(m(x_t,\theta_0)R_{it} - 1)\, z_{j,t-1}] = 0$ using instruments $z_{j,t-1}$ for each asset $i = 1,\dots,N$ and each instrument $j = 1,\dots,q$
  • Collect these as $f(v_t,\theta) = u_t(x_t,\theta) \otimes z_{t-1}'$ where $z_{t-1}$ is a $1 \times q$ vector of instruments and $u_t$ is an $N \times 1$ vector with elements $m(x_t,\theta)R_{it} - 1$. $f$ is a column vector with $qN$ elements; it contains the cross-product of each instrument with each element of $u$.
  • The population moment condition is

$$E[f(v_t,\theta_0)] = 0$$

GMM estimation – general case
  • As before, let $g_T(\theta) = T^{-1}\sum_{t=1}^{T} f(v_t,\theta)$. The GMM estimator is $\theta^*_T = \arg\min_{\theta\in\Theta} Q_T(\theta)$
  • The FOCs are

$$G_T(\theta^*_T)'\, W_T\, g_T(\theta^*_T) = 0$$

where $G_T(\theta)$ is a matrix of partial derivatives with $i,j$ element $\partial g_{Ti}/\partial\theta_j$

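GMM asymptotics

It can be shown that:

  • $\theta^*_T$ is consistent for $\theta_0$
  • the asymptotic variance of $T^{1/2}(\theta^*_T - \theta_0)$ is $V = MSM'$ where
    • $M = (G_0'WG_0)^{-1}G_0'W$
    • $G_0 = E[\partial f(v_t,\theta_0)/\partial\theta']$
    • $S = \lim_{T\to\infty} \mathrm{Var}[T^{1/2}g_T(\theta_0)]$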
Covariance matrix estimation for GMM

In practice $V^*_T = M^*_T S^*_T M^{*\prime}_T$ is a consistent estimator of $V$, where

  • $M^*_T = (G_T(\theta^*_T)'W_TG_T(\theta^*_T))^{-1}\, G_T(\theta^*_T)'W_T$
  • $S^*_T$ is a consistent estimator of $S$
  • Estimator of $S$ depends on time series properties of $f(v_t,\theta_0)$. In general $S$ is

$$S = \Gamma_0 + \sum_{i=1}^{\infty} (\Gamma_i + \Gamma_i')$$

where $\Gamma_i = E[\{f_t - E(f_t)\}\{f_{t-i} - E(f_{t-i})\}'] = E[f_t f_{t-i}']$ is the $i$-th autocovariance matrix of $f_t = f(v_t,\theta_0)$.

GMM asymptotics
  • So $(\theta^*_{T,i} - \theta_{0,i})/\sqrt{V^*_{T,ii}/T} \sim N(0,1)$ asymptotically
  • As before, we can choose $W_T = S^{*-1}_T$ and
    • $V^*_T$ is then $(G_T(\theta^*_T)'S^{*-1}_TG_T(\theta^*_T))^{-1}$
    • a test of the model’s over-identifying restrictions is given by $J_T = T\,Q_T(\theta^*_T)$ which is asymptotically $\chi^2(qN - p)$
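Covariance matrix estimation
  • We can consistently estimate $S$ with

$$S^*_T = \Gamma^*_0(\theta^*_T) + \sum_{i=1}^{\ell} (\Gamma^*_i(\theta^*_T) + \Gamma^*_i(\theta^*_T)')$$

where $\Gamma^*_i(\theta^*_T) = T^{-1}\sum_{t=i+1}^{T} f(v_t,\theta^*_T)\, f(v_{t-i},\theta^*_T)'$ and $\ell$ is the number of lags included

  • If theory implies that the autocovariances of $f(v_t,\theta_0)$ are zero at some lags then we can exclude these from $S^*_T$
    • e.g. $u_t$ serially uncorrelated implies $S^*_T = T^{-1}\sum_{t} \{u_t(v_t,\theta^*_T)\,u_t(v_t,\theta^*_T)' \otimes (z_{t-1}'z_{t-1})\}$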
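A minimal sketch of the autocovariance-based estimator of $S$, taking as input a $(T \times \text{number of moments})$ array of moment contributions evaluated at the estimate. The function names, the `n_lags` truncation argument, the optional `demean` switch, and the optional Bartlett weights (a common way to keep the estimate positive semi-definite, not mentioned in the slides) are all assumptions for illustration.

```python
import numpy as np

def autocov(f, i):
    """Gamma*_i = T^{-1} sum_{t=i+1}^{T} f_t f_{t-i}' for a (T, m) array f."""
    T = f.shape[0]
    return f[i:].T @ f[:T - i] / T

def long_run_cov(f, n_lags, demean=False, bartlett=False):
    """Gamma*_0 plus the sum over included lags of (Gamma*_i + Gamma*_i')."""
    f = np.asarray(f, dtype=float)
    if demean:
        f = f - f.mean(axis=0)          # subtract sample means of the moments
    S = autocov(f, 0)
    for i in range(1, n_lags + 1):
        # Unweighted sum by default; Bartlett weights are an optional assumption.
        w = 1.0 - i / (n_lags + 1) if bartlett else 1.0
        G = autocov(f, i)
        S = S + w * (G + G.T)
    return S

# Illustration on a scalar MA(1) series: Gamma_0 = 1.25, Gamma_1 = 0.5, S = 2.25.
rng = np.random.default_rng(7)
e = rng.normal(size=50_001)
f = (e[1:] + 0.5 * e[:-1])[:, None]
S0 = long_run_cov(f, n_lags=0)          # ignores serial correlation
S1 = long_run_cov(f, n_lags=1)          # includes the first autocovariance
```

With `n_lags=0` the function reduces to the serially uncorrelated case, which is the situation where theory lets the higher-order terms be dropped.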
GMM adjustments
  • Iterated GMM is recommended in small samples
  • More powerful tests can be obtained by subtracting the sample means of $f(v_t,\theta^*_T)$ when calculating $\Gamma^*_i(\theta^*_T)$
  • Asymptotic standard errors may be understated in small samples: multiply asymptotic variances by a “degrees of freedom adjustment” $T/(T-p)$ or $(N+q)T/\{(N+q)T - k\}$ where $k = p + ((Nq)^2 + Nq)/2$
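As a small worked example of the two adjustment factors, the sketch below evaluates them for hypothetical values $T = 100$, $p = 2$, $N = 2$, $q = 3$; the function name `dof_adjustments` and these numbers are assumptions chosen only to illustrate the arithmetic.

```python
def dof_adjustments(T, p, N, q):
    """Return (T/(T-p), (N+q)T/((N+q)T - k), k) with k = p + ((Nq)^2 + Nq)/2."""
    k = p + ((N * q) ** 2 + N * q) / 2
    simple = T / (T - p)
    system = (N + q) * T / ((N + q) * T - k)
    return simple, system, k

# Hypothetical sizes: here Nq = 6, so k = 2 + (36 + 6)/2 = 23.
simple, system, k = dof_adjustments(T=100, p=2, N=2, q=3)
```

Both factors exceed one, so multiplying the asymptotic variances by either of them widens the reported standard errors.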