Few notes on panel data (materials by Alan Manning). Development Workshop.

A Brief Introduction to Panel Data

  • Panel data has both a time-series and a cross-section dimension: N individuals observed over T periods

  • Will restrict attention to balanced panels: the same number of observations on each individual

  • Whole books have been written about panel data, but the basics can be understood very simply and are not very different from what we have seen before

  • Asymptotics are typically done for large N and small T

  • Use $y_{it}$ to denote the variable for individual i at time t

The Pooled Model

  • Can simply ignore panel nature of data and estimate:

$$ y_{it} = x_{it}'\beta + \varepsilon_{it} $$

  • This will be consistent if $E(\varepsilon_{it} \mid x_{it}) = 0$ or $\mathrm{plim}(X'\varepsilon/N) = 0$

  • But computed standard errors will only be consistent if errors uncorrelated across observations

  • This is unlikely:

    • Correlation between residuals of same individual in different time periods

    • Correlation between residuals of different individuals in same time period (aggregate shocks)
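The standard-error point can be illustrated with a small Monte Carlo sketch. It assumes a scalar regressor and, purely for illustration, that both the regressor and the error contain an individual-specific component that is constant over time; the variances, sample sizes and function names are hypothetical choices, not part of the original notes. The sketch compares the conventional OLS standard error with the actual spread of the pooled estimate across replications.

```python
import random
import statistics


def pooled_slope_and_naive_se(x, y):
    """OLS of y on x with an intercept; returns (slope, conventional iid standard error)."""
    n = len(x)
    xbar, ybar = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    resid = [(yi - ybar) - slope * (xi - xbar) for xi, yi in zip(x, y)]
    sigma2 = sum(r * r for r in resid) / (n - 2)
    return slope, (sigma2 / sxx) ** 0.5


def simulate(n_individuals=50, n_periods=5, beta=1.0, reps=400, seed=0):
    """Repeat the pooled regression on fresh panels and summarise the results."""
    rng = random.Random(seed)
    slopes, naive_ses = [], []
    for _rep in range(reps):
        x, y = [], []
        for _i in range(n_individuals):
            a_i = rng.gauss(0, 1)      # persistent part of the regressor (assumed)
            theta_i = rng.gauss(0, 1)  # persistent part of the error, independent of x
            for _t in range(n_periods):
                x_it = a_i + rng.gauss(0, 1)
                x.append(x_it)
                y.append(beta * x_it + theta_i + rng.gauss(0, 1))
        b, se = pooled_slope_and_naive_se(x, y)
        slopes.append(b)
        naive_ses.append(se)
    return statistics.fmean(slopes), statistics.stdev(slopes), statistics.fmean(naive_ses)


mean_slope, empirical_sd, mean_naive_se = simulate()
print("mean pooled estimate:", mean_slope)
print("spread of estimates across replications:", empirical_sd)
print("average conventional standard error:", mean_naive_se)
```

In this design the pooled estimate is centred on the true coefficient, because the individual effect is independent of the regressor, but the conventional standard error understates the sampling variability.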

A More Plausible Model

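  • Add an individual-specific effect $\theta_i$ to the pooled model:

$$ y_{it} = x_{it}'\beta + \theta_i + \varepsilon_{it} $$

  • Should recognise this as a model with ‘group-level’ dummies or residuals

  • Here, the individual is a ‘group’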

Three Models

  • Fixed Effects Model

    • Treats $\theta_i$ as a parameter to be estimated (like $\beta$)

    • Consistency does not require any assumption about correlation between $\theta_i$ and $x_{it}$

  • Random Effects Model

    • Treats $\theta_i$ as part of the residual (like $\varepsilon_{it}$)

    • Consistency requires no correlation between $\theta_i$ and $x_{it}$

  • Between-Groups Model

    • Runs regression on averages for each individual

The fixed effect estimator of β will be consistent if:

  • $E(\varepsilon_{it} \mid x_{it}) = 0$

  • $\mathrm{Rank}(X, D) = N + K$, where D is the matrix of N individual dummies and K is the number of regressors in X

  • Proof: Simple application of what you should know about the linear regression model

  • First condition should be obvious: regressors uncorrelated with residuals

  • Second condition requires the regressors, together with the individual dummies, to be of full rank

  • Main way in which this is likely to fail in the fixed effects model is if some regressors vary only across individuals and not over time

  • Such a variable is perfectly multicollinear with the individual fixed effects

Estimating the Fixed Effects Model

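  • Can estimate by ‘brute force’: include a separate dummy variable for every individual, but there may be a lot of them

  • Can also estimate in mean-deviation form:

$$ y_{it} - \bar{y}_i = (x_{it} - \bar{x}_i)'\beta + (\varepsilon_{it} - \bar{\varepsilon}_i) $$

How does de-meaning work?

  • Averaging the model over t for each individual gives $\bar{y}_i = \bar{x}_i'\beta + \theta_i + \bar{\varepsilon}_i$; subtracting this from the equation for $y_{it}$ removes $\theta_i$

  • Can do simple OLS on de-meaned variables

  • STATA command is like: xtreg y x, fe i(id)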
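As a minimal sketch of OLS on de-meaned variables, the following simulation assumes a scalar regressor that is correlated with the individual effect; all parameter values and helper names are hypothetical. It compares the pooled slope with the slope obtained after subtracting each individual's time mean.

```python
import random
import statistics


def slope(x, y):
    """Slope from an OLS regression of y on x with an intercept."""
    xbar, ybar = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - xbar) ** 2 for a in x)
    return sum((a - xbar) * (b - ybar) for a, b in zip(x, y)) / sxx


def demean_within(values, ids):
    """Subtract each individual's own time mean from its observations."""
    groups = {}
    for v, i in zip(values, ids):
        groups.setdefault(i, []).append(v)
    means = {i: statistics.fmean(vs) for i, vs in groups.items()}
    return [v - means[i] for v, i in zip(values, ids)]


rng = random.Random(1)
beta, n_individuals, n_periods = 2.0, 2000, 4
ids, x, y = [], [], []
for i in range(n_individuals):
    theta_i = rng.gauss(0, 1)
    for _t in range(n_periods):
        x_it = theta_i + rng.gauss(0, 1)  # regressor correlated with theta_i (assumed)
        ids.append(i)
        x.append(x_it)
        y.append(beta * x_it + theta_i + rng.gauss(0, 1))

pooled = slope(x, y)
within = slope(demean_within(x, ids), demean_within(y, ids))
print("pooled OLS slope:", pooled)
print("within (de-meaned) slope:", within)
```

Because the individual effect is correlated with the regressor in this design, the pooled slope is biased away from the true coefficient, while the de-meaned regression is not.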

Problems with fixed effect estimator

  • Only uses variation within individuals – sometimes called ‘within-group’ estimator

  • This variation may be small part of total (so low precision) and more prone to measurement error (so more attenuation bias)

  • Cannot use it to estimate effect of regressor that is constant for an individual

Random Effects Estimator

  • Treats $\theta_i$ as part of the residual (like $\varepsilon_{it}$)

  • Consistency requires no correlation between $\theta_i$ and $x_{it}$

  • Should recognise as like model with clustered standard errors

  • But random effects estimator is feasible GLS estimator

More on RE Estimator

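  • The feasible GLS estimator is $\hat{\beta}_{RE} = (X'\hat{\Omega}^{-1}X)^{-1}X'\hat{\Omega}^{-1}y$, where $\hat{\Omega}$ is the estimated covariance matrix of the composite error $\theta_i + \varepsilon_{it}$

  • Will not describe how we compute $\hat{\Omega}$: see Wooldridge

  • STATA command: xtreg y x, re i(id)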

The random effects estimator of β will be consistent if:

  • (a) $E(\varepsilon_{it} \mid x_{i1}, \dots, x_{it}, \dots, x_{iT}) = 0$

  • (b) $E(\theta_i \mid x_{i1}, \dots, x_{it}, \dots, x_{iT}) = 0$

  • (c) $\mathrm{Rank}(X'\Omega^{-1}X) = K$

  • Proof: RE estimator is a special case of the feasible GLS estimator, so the conditions for consistency are the same.

  • Error has two components, so need (a) and (b)

  • Assumption about exogeneity of errors is stronger than for FE model: need to assume $\varepsilon_{it}$ is uncorrelated with the whole history of x. This is called strict exogeneity (sometimes strong exogeneity)

  • Assumption about rank condition is weaker than for FE model, e.g. can estimate the effect of variables that are constant for a given individual

Another reason why may prefer RE to FE model

  • If exogeneity assumptions are satisfied RE estimate will be more efficient than FE estimator

  • Application of general principle that imposing true restriction on data leads to efficiency gain.

Another Useful Result

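  • Can show that RE estimator can be thought of as an OLS regression of:

$$ y_{it} - \lambda \bar{y}_i $$

  • On:

$$ x_{it} - \lambda \bar{x}_i $$

  • Where, with $\sigma_\varepsilon^2 = \mathrm{Var}(\varepsilon_{it})$ and $\sigma_\theta^2 = \mathrm{Var}(\theta_i)$:

$$ \lambda = 1 - \left[ \frac{\sigma_\varepsilon^2}{\sigma_\varepsilon^2 + T\sigma_\theta^2} \right]^{1/2} $$

  • This is sometimes called quasi-time demeaning

  • See Wooldridge (ch10, pp286-7) if you want to know more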
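Quasi-time demeaning can be sketched in a few lines. The sketch assumes the standard weight 1 − [σ²ε / (σ²ε + T σ²θ)]^(1/2) from Wooldridge's treatment, with the variance components taken as known; in practice they are estimated. The function names and the toy data are hypothetical.

```python
import statistics


def quasi_demeaning_weight(sigma2_eps, sigma2_theta, n_periods):
    """Weight on the individual mean: 0 gives pooled OLS, 1 gives the within (FE) transform."""
    return 1.0 - (sigma2_eps / (sigma2_eps + n_periods * sigma2_theta)) ** 0.5


def quasi_demean(values, n_periods, weight):
    """Subtract weight * (individual time mean) from each observation.

    Assumes a balanced panel stored individual by individual, n_periods rows each.
    """
    out = []
    for start in range(0, len(values), n_periods):
        block = values[start:start + n_periods]
        mean_i = statistics.fmean(block)
        out.extend(v - weight * mean_i for v in block)
    return out


# Toy balanced panel: 2 individuals, 3 periods each (made-up numbers).
lam = quasi_demeaning_weight(sigma2_eps=1.0, sigma2_theta=1.0, n_periods=3)
y = [1.0, 2.0, 3.0, 10.0, 10.0, 13.0]
y_tilde = quasi_demean(y, n_periods=3, weight=lam)
print(lam, y_tilde)
```

The weight moves towards 1 (the fixed effects transform) as T or the variance of the individual effect grows, and equals 0 (pooled OLS) when there is no individual effect. Applying the same transform to the regressors and running OLS on the transformed data gives the RE estimate for that weight.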

Between-Groups Estimator

  • This takes individual means and estimates the regression by OLS:

$$ \bar{y}_i = \bar{x}_i'\beta + \theta_i + \bar{\varepsilon}_i $$

  • Stata command is xtreg y x, be i(id)

  • Condition for consistency is the same as for RE estimator

  • But BE estimator is less efficient as it does not exploit variation in regressors for a given individual

  • And cannot estimate the effect of variables like time trends whose average values do not vary across individuals

  • So why would anyone ever use it? Let's think about measurement error

Measurement Error in Panel Data Models

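  • Assume true model is:

$$ y_{it} = \beta x^*_{it} + \theta_i + \varepsilon_{it} $$

  • Where $x^*_{it}$ is one-dimensional

  • Assume $E(\varepsilon_{it} \mid x^*_{i1}, \dots, x^*_{it}, \dots, x^*_{iT}) = 0$ and $E(\theta_i \mid x^*_{i1}, \dots, x^*_{it}, \dots, x^*_{iT}) = 0$, so that RE and BE estimators are consistent when $x^*_{it}$ is observed without error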

Measurement Error Model

  • Assume:

$$ x_{it} = x^*_{it} + u_{it}, \qquad x^*_{it} = x^*_i + \eta_{it} $$

  • where $u_{it}$ is classical measurement error, $x^*_i$ is the average value of $x^*$ for individual i, and $\eta_{it}$ is the variation of the true value around that average, which is assumed to be iid and uncorrelated with $x^*_i$ and $u_{it}$

  • We know this measurement error is likely to cause attenuation bias, but this will vary between FE, RE and BE estimators.

Proposition 5.4

  • For FE model we have:

  • For BE model we have:

  • For RE model we have:

  • Where:
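One way to write probability limits that are consistent with the discussion of this proposition is sketched below. It is a derivation under stated assumptions, not a transcription of the original slide: the true model is taken to be $y_{it} = \beta x^*_{it} + \theta_i + \varepsilon_{it}$ with $x_{it} = x^*_{it} + u_{it}$ and $x^*_{it} = x^*_i + \eta_{it}$, the RE estimator is treated as OLS on quasi-demeaned data with weight $\lambda$, and the variance symbols and the expression for $\kappa$ are this sketch's own notation.

```latex
% Assumed notation: \sigma^2_{x^*} = Var(x^*_i), \sigma^2_\eta = Var(\eta_{it}), \sigma^2_u = Var(u_{it})
% \lambda is the probability limit of the quasi-demeaning weight used by the RE estimator
\begin{align*}
\operatorname{plim}\hat\beta_{FE} &= \beta\,\frac{\sigma^2_\eta}{\sigma^2_\eta+\sigma^2_u} \\[4pt]
\operatorname{plim}\hat\beta_{BE} &= \beta\,\frac{T\sigma^2_{x^*}+\sigma^2_\eta}{T\sigma^2_{x^*}+\sigma^2_\eta+\sigma^2_u} \\[4pt]
\operatorname{plim}\hat\beta_{RE} &= \beta\,\frac{\kappa\,\sigma^2_{x^*}+\sigma^2_\eta}{\kappa\,\sigma^2_{x^*}+\sigma^2_\eta+\sigma^2_u},
\qquad \kappa = \frac{T(1-\lambda)^2}{T-\lambda(2-\lambda)}
\end{align*}
```

With $0 < \lambda < 1$ and $T > 1$ this gives $0 < \kappa < 1$, so the multiplier on $\sigma^2_{x^*}$ is 0 for FE, $\kappa$ for RE and $T$ for BE.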

What should we learn from this?

  • All rather complicated – don’t worry too much about details

  • But intuition is simple

  • Attenuation bias largest for FE estimator – Var(x*) does not appear in denominator – FE estimator does not use this variation in data


  • Attenuation bias larger for RE than BE estimator as T>1>κ

  • The averaging in the BE estimator reduces the importance of measurement error.

  • Important to note that these results depend on the particular assumption about the measurement error process and the nature of the variation in $x_{it}$: things would be very different if measurement error for a given individual did not vary over time

  • But the general point is that measurement error considerations could affect the choice of model to estimate with panel data
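The ordering of the attenuation biases can be checked with a small simulation. This is a sketch under assumed values: a scalar regressor, unit variances for every component, T = 4, a true coefficient of 1, and a fixed quasi-demeaning weight of 0.5 standing in for the RE estimator (in practice the weight is estimated). All names are hypothetical.

```python
import random
import statistics


def slope(x, y):
    """Slope from an OLS regression of y on x with an intercept."""
    xbar, ybar = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - xbar) ** 2 for a in x)
    return sum((a - xbar) * (b - ybar) for a, b in zip(x, y)) / sxx


def group_means(values, n_periods):
    """Time means for a balanced panel stored individual by individual."""
    return [statistics.fmean(values[k:k + n_periods])
            for k in range(0, len(values), n_periods)]


def partial_demean(values, means, n_periods, weight):
    """Subtract weight * (individual mean); weight = 1 is the within transform."""
    return [v - weight * means[k // n_periods] for k, v in enumerate(values)]


rng = random.Random(2)
beta, n_individuals, n_periods = 1.0, 10000, 4
x, y = [], []
for _i in range(n_individuals):
    x_star_i = rng.gauss(0, 1)   # individual average of the true regressor
    theta_i = rng.gauss(0, 1)    # individual effect, independent of the regressor
    for _t in range(n_periods):
        x_star_it = x_star_i + rng.gauss(0, 1)  # true value varies around the average
        x.append(x_star_it + rng.gauss(0, 1))   # observed with classical measurement error
        y.append(beta * x_star_it + theta_i + rng.gauss(0, 1))

xbar, ybar = group_means(x, n_periods), group_means(y, n_periods)
be = slope(xbar, ybar)
fe = slope(partial_demean(x, xbar, n_periods, 1.0),
           partial_demean(y, ybar, n_periods, 1.0))
lam = 0.5  # assumed quasi-demeaning weight, for illustration only
re = slope(partial_demean(x, xbar, n_periods, lam),
           partial_demean(y, ybar, n_periods, lam))
print("FE:", fe, "RE:", re, "BE:", be)
```

In this design all three estimates fall below the true coefficient, with FE the most attenuated and BE the least, in line with the intuition that averaging reduces the importance of measurement error.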

Estimating Fixed Effects Model in Differences

  • Can also get rid of fixed effect by differencing:

$$ y_{it} - y_{i,t-1} = (x_{it} - x_{i,t-1})'\beta + (\varepsilon_{it} - \varepsilon_{i,t-1}) $$

Comparison of two methods

  • Estimate parameters by OLS on differenced data

  • If only 2 observations then get same estimates as ‘de-meaning’ method

  • But standard errors different

  • Why?: assumption about autocorrelation in residuals
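The equality of the two sets of estimates with two observations per individual can be verified numerically. The sketch below assumes a scalar regressor, T = 2, and a differenced regression without an intercept (that is, no time effect in the model); the data-generating values are hypothetical.

```python
import random

rng = random.Random(3)
beta, n_individuals = 1.5, 500

num_fe = den_fe = 0.0  # sums for OLS on de-meaned data
num_fd = den_fd = 0.0  # sums for OLS on first differences (no intercept)
for _i in range(n_individuals):
    theta_i = rng.gauss(0, 1)
    x1, x2 = theta_i + rng.gauss(0, 1), theta_i + rng.gauss(0, 1)
    y1 = beta * x1 + theta_i + rng.gauss(0, 1)
    y2 = beta * x2 + theta_i + rng.gauss(0, 1)

    # De-meaning: subtract the individual's mean over the two periods.
    x_mean, y_mean = (x1 + x2) / 2, (y1 + y2) / 2
    for x_t, y_t in ((x1, y1), (x2, y2)):
        num_fe += (x_t - x_mean) * (y_t - y_mean)
        den_fe += (x_t - x_mean) ** 2

    # Differencing: one observation per individual.
    dx, dy = x2 - x1, y2 - y1
    num_fd += dx * dy
    den_fd += dx * dx

beta_fe = num_fe / den_fe
beta_fd = num_fd / den_fd
print(beta_fe, beta_fd)
```

With T = 2 each de-meaned observation is plus or minus half the first difference, so the two ratios coincide up to floating-point rounding.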

What are these assumptions?

  • For de-meaned model:

  • For differenced model:

  • These are not consistent:
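A sketch of what these assumptions plausibly look like, and why they cannot both hold, is given below. It assumes the usual OLS standard errors are being used in each case, which require serially uncorrelated errors in the equation actually estimated; the notation $\Delta\varepsilon_{it} = \varepsilon_{it} - \varepsilon_{i,t-1}$ and $\sigma^2_\varepsilon = \mathrm{Var}(\varepsilon_{it})$ is this sketch's own.

```latex
\begin{align*}
\text{De-meaned model:}\quad & \operatorname{Cov}(\varepsilon_{it}, \varepsilon_{is}) = 0 \quad \text{for } s \neq t \\
\text{Differenced model:}\quad & \operatorname{Cov}(\Delta\varepsilon_{it}, \Delta\varepsilon_{is}) = 0 \quad \text{for } s \neq t \\
\text{But if the first holds:}\quad & \operatorname{Cov}(\Delta\varepsilon_{it}, \Delta\varepsilon_{i,t-1})
  = \operatorname{Cov}(\varepsilon_{it} - \varepsilon_{i,t-1},\; \varepsilon_{i,t-1} - \varepsilon_{i,t-2})
  = -\sigma^2_\varepsilon \neq 0
\end{align*}
```

So serially uncorrelated levels imply negatively autocorrelated differences, while serially uncorrelated differences imply that the level error behaves like a random walk.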

This leads to time series…

  • Which is ‘better’ depends on which assumption is right – how can we decide this?

  • Much of this you have covered in Macroeconometrics course…
