Few notes on panel data (materials by Alan Manning)


### Few notes on panel data (materials by Alan Manning)

Development

Workshop

A Brief Introduction to Panel Data
• Panel Data has both time-series and cross-section dimension – N individuals over T periods
• Will restrict attention to balanced panels – the same number of observations on each individual
• Whole books have been written about panel data, but the basics can be understood very simply and are not very different from what we have seen before
• Asymptotics are typically done for large N, small T
• Use $y_{it}$ to denote the variable for individual i at time t
The Pooled Model
• Can simply ignore panel nature of data and estimate:

$$y_{it} = \beta' x_{it} + \varepsilon_{it}$$

• This will be consistent if $E(\varepsilon_{it} \mid x_{it}) = 0$ or $\operatorname{plim}(X'\varepsilon/N) = 0$
• But the computed standard errors will only be consistent if the errors are uncorrelated across observations
• This is unlikely:
• Correlation between residuals of same individual in different time periods
• Correlation between residuals of different individuals in same time period (aggregate shocks)
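The effect of within-individual correlation on the pooled standard errors can be illustrated with a small simulation. This is only a sketch under assumed values: the panel dimensions, the data-generating process and all variable names are hypothetical, and only correlation within individuals is handled (not aggregate shocks across individuals).

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 500, 5                      # hypothetical panel dimensions
ids = np.repeat(np.arange(N), T)   # individual index for each row

# Regressor and error each contain an individual-specific component,
# so both are correlated over time for the same individual.
x = rng.normal(size=N)[ids] + rng.normal(size=N * T)
e = rng.normal(size=N)[ids] + rng.normal(size=N * T)
y = 1.0 + 2.0 * x + e              # assumed true slope of 2

# Pooled OLS, ignoring the panel structure.
X = np.column_stack([np.ones(N * T), x])
XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
resid = y - X @ beta_hat

# Conventional OLS variance s^2 (X'X)^{-1}: valid only for uncorrelated errors.
s2 = resid @ resid / (N * T - X.shape[1])
se_conventional = np.sqrt(np.diag(s2 * XtX_inv))

# Cluster-robust sandwich variance: the middle term sums x_it * e_it
# within each individual before taking outer products.
scores = X * resid[:, None]
cluster_sums = np.zeros((N, X.shape[1]))
np.add.at(cluster_sums, ids, scores)
meat = cluster_sums.T @ cluster_sums
se_clustered = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))

print("slope estimate: ", beta_hat[1])
print("conventional SE:", se_conventional[1])
print("clustered SE:   ", se_clustered[1])
```

In this setup the slope estimate stays close to the assumed true value, while the conventional standard error understates the sampling uncertainty relative to the clustered one.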
A More Plausible Model
$$y_{it} = \beta' x_{it} + \theta_i + \varepsilon_{it}$$
• Should recognise this as a model with ‘group-level’ dummies or residuals
• Here, the individual is a ‘group’
Three Models
• Fixed Effects Model
• Treats $\theta_i$ as a parameter to be estimated (like $\beta$)
• Consistency does not require any assumption about the correlation between $\theta_i$ and $x_{it}$
• Random Effects Model
• Treats $\theta_i$ as part of the residual (like $\varepsilon_{it}$)
• Consistency does require no correlation between $\theta_i$ and $x_{it}$
• Between-Groups Model
• Runs regression on averages for each individual
The fixed effect estimator of β will be consistent if:
• $E(\varepsilon_{it} \mid x_{it}) = 0$
• $\operatorname{Rank}(X, D) = N + K$, where D is the matrix of individual dummies
• Proof: Simple application of what you should know about linear regression model
Intuition
• First condition should be obvious – regressors uncorrelated with residuals
• Second condition requires regressors to be of full rank
• The main way in which this is likely to fail in the fixed effects model is if some regressors vary only across individuals and not over time
• Such a variable is perfectly multicollinear with the individual fixed effect
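The rank condition can be checked numerically. The sketch below uses a tiny made-up panel (all names and dimensions are hypothetical) and shows that adding a regressor that is constant within each individual does not raise the rank of (X, D).

```python
import numpy as np

N, T = 4, 3                                  # hypothetical tiny panel
ids = np.repeat(np.arange(N), T)

# D: one dummy column per individual (NT x N).
D = (ids[:, None] == np.arange(N)).astype(float)

rng = np.random.default_rng(1)
x_varying = rng.normal(size=N * T)           # varies over time
x_fixed = rng.normal(size=N)[ids]            # constant for each individual

# K = 1: full rank is N + 1.
rank_ok = np.linalg.matrix_rank(np.column_stack([x_varying, D]))

# K = 2: full rank would be N + 2, but x_fixed is a linear
# combination of the dummy columns, so the rank stays at N + 1.
rank_bad = np.linalg.matrix_rank(np.column_stack([x_varying, x_fixed, D]))

print(rank_ok, rank_bad)
```

Both ranks equal N + 1, so the rank condition holds with the time-varying regressor alone and fails once the time-invariant regressor is added.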
Estimating the Fixed Effects Model
• Can estimate by ‘brute force’ - include separate dummy variable for every individual – but may be a lot of them
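• Can also estimate in mean-deviation form:
$$y_{it} - \bar{y}_i = \beta'(x_{it} - \bar{x}_i) + (\varepsilon_{it} - \bar{\varepsilon}_i)$$
How does de-meaning work?
• Averaging the model over time for each individual gives $\bar{y}_i = \beta'\bar{x}_i + \theta_i + \bar{\varepsilon}_i$; subtracting this from the original model removes $\theta_i$
• Can do simple OLS on de-meaned variables
• Stata command is like: xtreg y x, fe i(id)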
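That the ‘brute force’ dummy-variable regression and OLS on de-meaned data give the same slope can be verified on simulated data. This is a minimal sketch: the data-generating process, the helper `demean_by_group` and all variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
N, T = 200, 4                                   # hypothetical panel dimensions
ids = np.repeat(np.arange(N), T)
theta = rng.normal(size=N)                      # individual effects
x = 0.8 * theta[ids] + rng.normal(size=N * T)   # x is correlated with theta
y = 1.5 * x + theta[ids] + rng.normal(size=N * T)

def demean_by_group(v, ids, n_groups):
    """Subtract each individual's time average from its observations."""
    means = np.bincount(ids, weights=v, minlength=n_groups) / np.bincount(ids, minlength=n_groups)
    return v - means[ids]

# Brute force: regress y on x and a full set of individual dummies.
D = (ids[:, None] == np.arange(N)).astype(float)
coef_lsdv, *_ = np.linalg.lstsq(np.column_stack([x, D]), y, rcond=None)
beta_lsdv = coef_lsdv[0]

# Mean-deviation form: OLS of de-meaned y on de-meaned x.
x_dm = demean_by_group(x, ids, N)
y_dm = demean_by_group(y, ids, N)
beta_within = (x_dm @ y_dm) / (x_dm @ x_dm)

# Pooled OLS with an intercept, for comparison.
xc, yc = x - x.mean(), y - y.mean()
beta_pooled = (xc @ yc) / (xc @ xc)

print(beta_lsdv, beta_within, beta_pooled)
```

The first two numbers coincide up to floating-point error. The pooled slope is pushed upwards here because x was constructed to be positively correlated with the individual effect.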
Problems with fixed effect estimator
• Only uses variation within individuals – sometimes called the ‘within-group’ estimator
• This variation may be a small part of the total (so low precision) and more prone to measurement error (so more attenuation bias)
• Cannot use it to estimate the effect of a regressor that is constant for an individual
Random Effects Estimator
• Treats $\theta_i$ as part of the residual (like $\varepsilon_{it}$)
• Consistency does require no correlation between $\theta_i$ and $x_{it}$
• Should recognise this as like a model with clustered standard errors
• But the random effects estimator is a feasible GLS estimator
More on RE Estimator
• Will not describe how we compute $\hat{\Omega}$ – see Wooldridge
• Stata command: xtreg y x, re i(id)
The random effects estimator of β will be consistent if:
• a. $E(\varepsilon_{it} \mid x_{i1}, \ldots, x_{it}, \ldots, x_{iT}) = 0$
• b. $E(\theta_i \mid x_{i1}, \ldots, x_{it}, \ldots, x_{iT}) = 0$
• c. $\operatorname{Rank}(X'\Omega^{-1}X) = K$
• Proof: The RE estimator is a special case of the feasible GLS estimator, so the conditions for consistency are the same.
• The error has two components, so we need a. and b.
• The assumption about exogeneity of errors is stronger than for the FE model – need to assume $\varepsilon_{it}$ is uncorrelated with the whole history of x – this is called strong exogeneity (often termed strict exogeneity)
• The rank condition is weaker than for the FE model, e.g. can estimate the effect of variables that are constant for a given individual
Another reason why we may prefer the RE to the FE model
• If the exogeneity assumptions are satisfied, the RE estimator will be more efficient than the FE estimator
• Application of the general principle that imposing a true restriction on the data leads to an efficiency gain.
Another Useful Result
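• Can show that the RE estimator can be thought of as an OLS regression of:
$$y_{it} - \lambda \bar{y}_i$$
• On:
$$x_{it} - \lambda \bar{x}_i$$
• Where:
$$\lambda = 1 - \sqrt{\frac{\sigma_\varepsilon^2}{\sigma_\varepsilon^2 + T\sigma_\theta^2}}, \qquad \sigma_\varepsilon^2 = \operatorname{Var}(\varepsilon_{it}), \quad \sigma_\theta^2 = \operatorname{Var}(\theta_i)$$
• This is sometimes called quasi-time demeaning
• See Wooldridge (ch. 10, pp. 286-7) if you want to know more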
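Quasi-time demeaning can be sketched in a few lines. The code below assumes the standard weight λ = 1 − sqrt(σ²ε / (σ²ε + T σ²θ)) from Wooldridge's treatment, treats the variance components as known rather than estimated, and uses a model with no intercept and mean-zero data to keep the algebra short. The functions `quasi_demean` and `slope_no_intercept` and all parameter values are hypothetical.

```python
import numpy as np

def quasi_demean(v, ids, n_groups, lam):
    """Subtract lam times each individual's time average."""
    means = np.bincount(ids, weights=v, minlength=n_groups) / np.bincount(ids, minlength=n_groups)
    return v - lam * means[ids]

def slope_no_intercept(x, y):
    """OLS slope for a single regressor without an intercept."""
    return (x @ y) / (x @ x)

rng = np.random.default_rng(3)
N, T = 300, 4                             # hypothetical panel dimensions
ids = np.repeat(np.arange(N), T)
sigma_theta, sigma_eps = 1.0, 1.0         # assumed known, not estimated
theta = sigma_theta * rng.normal(size=N)
x = rng.normal(size=N * T)                # uncorrelated with theta (RE assumption)
y = 2.0 * x + theta[ids] + sigma_eps * rng.normal(size=N * T)

# Quasi-demeaning weight, assuming the standard formula.
lam = 1.0 - np.sqrt(sigma_eps**2 / (sigma_eps**2 + T * sigma_theta**2))

beta_re = slope_no_intercept(quasi_demean(x, ids, N, lam), quasi_demean(y, ids, N, lam))
beta_pooled = slope_no_intercept(quasi_demean(x, ids, N, 0.0), quasi_demean(y, ids, N, 0.0))
beta_fe = slope_no_intercept(quasi_demean(x, ids, N, 1.0), quasi_demean(y, ids, N, 1.0))

print(lam, beta_re, beta_pooled, beta_fe)
```

Setting λ = 0 reproduces pooled OLS and λ = 1 reproduces the fixed effects (full de-meaning) estimator, so the RE estimator sits between the two.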
Between-Groups Estimator
• This takes individual means and estimates the regression by OLS:
$$\bar{y}_i = \beta'\bar{x}_i + \theta_i + \bar{\varepsilon}_i$$
• Stata command is xtreg y x, be i(id)
• Condition for consistency is the same as for the RE estimator
• But the BE estimator is less efficient as it does not exploit variation in regressors for a given individual
• And cannot estimate the effect of variables like time trends whose average values do not vary across individuals
• So why would anyone ever use it? Let's think about measurement error
Measurement Error in Panel Data Models
• Assume true model is:
$$y_{it} = \beta x^*_{it} + \theta_i + \varepsilon_{it}$$
• Where x* (the true value of x) is one-dimensional
• Assume $E(\varepsilon_{it} \mid x_{i1}, \ldots, x_{it}, \ldots, x_{iT}) = 0$ and $E(\theta_i \mid x_{i1}, \ldots, x_{it}, \ldots, x_{iT}) = 0$ so that the RE and BE estimators are consistent
Measurement Error Model
• Assume:
$$x_{it} = x^*_{it} + u_{it}, \qquad x^*_{it} = x^*_i + \eta_{it}$$
• where $u_{it}$ is classical measurement error, $x^*_i$ is the average value of x* for individual i, and $\eta_{it}$ is the variation of the true value around that average, which is assumed to be iid and uncorrelated with $x^*_i$ and $u_{it}$.
• We know this measurement error is likely to cause attenuation bias, but its size will vary between the FE, RE and BE estimators.
Proposition 5.4
• For FE model we have:
• For BE model we have:
• For RE model we have:
• Where:
What should we learn from this?
• All rather complicated – don’t worry too much about details
• But intuition is simple
• Attenuation bias largest for FE estimator – Var(x*) does not appear in denominator – FE estimator does not use this variation in data
Conclusions
• Attenuation bias larger for RE than BE estimator as T>1>κ
• The averaging in the BE estimator reduces the importance of measurement error.
• Important to note that these results are dependent on the particular assumption about the measurement error process and the nature of the variation in xit – things would be very different if measurement error for a given individual did not vary over time
• But the general point is that measurement error considerations could affect the choice of model to estimate with panel data
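The ranking of attenuation bias for the FE and BE estimators can be illustrated by simulation. This is a sketch under assumptions: it takes the measurement error model to be x_it = x*_i + η_it + u_it with all components independent, and the probability limits in the code are derived under those assumptions rather than copied from Proposition 5.4. All parameter values and names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
N, T, beta = 5000, 5, 1.0                       # hypothetical dimensions and true slope
var_xstar, var_eta, var_u = 1.0, 1.0, 1.0       # assumed variance components

x_star_i = rng.normal(scale=np.sqrt(var_xstar), size=(N, 1))   # individual average of x*
eta = rng.normal(scale=np.sqrt(var_eta), size=(N, T))          # true within variation
u = rng.normal(scale=np.sqrt(var_u), size=(N, T))              # classical measurement error
x_true = x_star_i + eta
x_obs = x_true + u
theta = rng.normal(size=(N, 1))
y = beta * x_true + theta + rng.normal(size=(N, T))

# FE (within) estimator on the mismeasured regressor.
x_w = x_obs - x_obs.mean(axis=1, keepdims=True)
y_w = y - y.mean(axis=1, keepdims=True)
beta_fe = (x_w * y_w).sum() / (x_w ** 2).sum()

# BE estimator: OLS (with intercept) on individual means.
x_m, y_m = x_obs.mean(axis=1), y.mean(axis=1)
x_mc, y_mc = x_m - x_m.mean(), y_m - y_m.mean()
beta_be = (x_mc @ y_mc) / (x_mc @ x_mc)

# Probability limits implied by the assumptions above.
plim_fe = beta * var_eta / (var_eta + var_u)
plim_be = beta * (var_xstar + var_eta / T) / (var_xstar + (var_eta + var_u) / T)

print("FE:", beta_fe, "plim", plim_fe)
print("BE:", beta_be, "plim", plim_be)
```

With these values the FE estimate is attenuated towards half of the true slope, while the BE estimate stays much closer to it, because averaging over T periods shrinks the measurement error variance and Var(x*) enters the BE denominator.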
Comparison of two methods
• Estimate parameters by OLS on differenced data
• If there are only 2 observations per individual, this gives the same estimates as the ‘de-meaning’ method
• But the standard errors are different
• Why? The two methods make different assumptions about autocorrelation in the residuals
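The equality of the two sets of estimates with two periods can be confirmed numerically. This is a minimal sketch on simulated data with a single regressor and no intercept or time effects; the data-generating process and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)
N = 200                                                # hypothetical number of individuals, T = 2
theta = rng.normal(size=(N, 1))                        # individual effects
x = 0.5 * theta + rng.normal(size=(N, 2))              # N x 2 array: one column per period
y = 1.0 * x + theta + rng.normal(size=(N, 2))

# De-meaning: subtract each individual's two-period average.
x_dm = x - x.mean(axis=1, keepdims=True)
y_dm = y - y.mean(axis=1, keepdims=True)
beta_demeaned = (x_dm * y_dm).sum() / (x_dm ** 2).sum()

# First-differencing: period 2 minus period 1.
dx = x[:, 1] - x[:, 0]
dy = y[:, 1] - y[:, 0]
beta_differenced = (dx @ dy) / (dx @ dx)

print(beta_demeaned, beta_differenced)
```

With T = 2 each de-meaned observation is plus or minus half the first difference, which is why the two slopes agree exactly.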
What are these assumptions?
• For de-meaned model:
• For differenced model:
• These are not consistent:
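One way to see the conflict, as a sketch: assume the de-meaned model takes $\varepsilon_{it}$ to be serially uncorrelated with constant variance $\sigma^2$, and the differenced model takes $\Delta\varepsilon_{it} = \varepsilon_{it} - \varepsilon_{i,t-1}$ to be serially uncorrelated. These are the usual forms of the two assumptions and are stated here as assumptions, not taken from the slides. Under the first assumption, adjacent differences share the term $\varepsilon_{i,t-1}$:

```latex
\operatorname{Cov}(\Delta\varepsilon_{it}, \Delta\varepsilon_{i,t-1})
  = \operatorname{Cov}(\varepsilon_{it} - \varepsilon_{i,t-1},\; \varepsilon_{i,t-1} - \varepsilon_{i,t-2})
  = -\operatorname{Var}(\varepsilon_{i,t-1})
  = -\sigma^2 \neq 0
```

So if the residuals in levels are serially uncorrelated, the differenced residuals are negatively autocorrelated, and both assumptions cannot hold at once when T > 2. With T = 2 there is only one difference per individual, so the issue does not arise.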