GY460 Techniques of Spatial Analysis

GY460 Techniques of Spatial Analysis. Lecture 3: Spatial regression and ‘neighbourhood’ effects models. Steve Gibbons. Introduction: formal aspects of spatial regression models, from a spatial econometrics perspective; spatial ‘x’ models; spatial ‘y’ (‘lagged’ dependent variable) models.

Presentation Transcript


  1. GY460 Techniques of Spatial Analysis Lecture 3: Spatial regression and ‘neighbourhood’ effects models Steve Gibbons

  2. Introduction • Formal aspects of spatial regression models, from a spatial econometrics perspective • Spatial ‘x’ models • Spatial ‘y’ (‘lagged’ dependent variable) models • Spatial ‘ε’ (error) models • Problems and ways of estimation • Limitations • Comparison with neighbourhood effects models • Applications

  3. Readings • Anselin (2002) Under the hood: Issues in the specification and interpretation of spatial regression models, Agricultural Economics 27 (3): 247-267 • Moffitt, Robert (2001) Policy interventions, low-level equilibria, and social interactions, in Social Dynamics (S. N. Durlauf and H. Young eds.), Cambridge MA: MIT, 45-82 • Some other applications at the end…

  4. Spatial ‘x’ models

  5. Spatial ‘x’ models • Appropriate when agents’ (or regions’) behaviour or outcome reacts to the exogenous observable characteristics of neighbours • Outcome is dependent on the Xs for neighbours • Spillovers from observed neighbour characteristics, role models, peer groups, agglomeration, etc. • Q: Unbiased and consistent estimation by OLS requires that the error term and the regressors are uncorrelated. Is there anything in the model that makes this assumption invalid?

  6. Spatial ‘x’ models • Simplest spatial models to estimate. No special techniques required • …if the errors are not correlated with X and WX for reasons not written down in the model • E[ε|X]=0 and E[ε|WX]=0 • Assumes no spatial sorting (E[ε|WX]=0) • Violated e.g. • if motivated parents choose better neighbourhoods for their children (e.g. WX = neighbourhood poverty) • if firms that would be productive anywhere choose to locate in cities (e.g. WX = urbanisation)
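
The claim that a spatial ‘x’ model needs no special techniques can be checked on simulated data. The following is a minimal sketch, not part of the original slides: the ring-shaped weights matrix, the helper `ring_weights`, the sample size and the coefficient values are all assumptions chosen for illustration. The errors are drawn independently of x and Wx, so there is no sorting by construction.

```python
import numpy as np

def ring_weights(n):
    """Row-normalised W for n locations on a ring (hypothetical layout):
    each location has two neighbours, each with weight 0.5."""
    W = np.zeros((n, n))
    idx = np.arange(n)
    W[idx, (idx - 1) % n] = 0.5
    W[idx, (idx + 1) % n] = 0.5
    return W

rng = np.random.default_rng(0)
n = 2000
W = ring_weights(n)
x = rng.normal(size=n)
eps = rng.normal(size=n)      # independent of x and Wx: no spatial sorting
beta, gamma = 1.0, 0.5        # assumed true effects of own x and neighbours' x
y = beta * x + gamma * (W @ x) + eps

# Plain OLS of y on a constant, x and the neighbours' average x
Z = np.column_stack([np.ones(n), x, W @ x])
coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
print("constant, beta_hat, gamma_hat:", np.round(coef, 3))
```

If eps were instead generated so that it depends on W @ x (sorting), the same regression would no longer recover gamma.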

  7. Spatial ‘y’ models (spatial dependence/spatial lagged dependent variable)

  8. Spatial ‘y’ models • Conceptually appropriate when behaviour/outcome reacts to the behaviour/outcome of others • Outcome is dependent on the observable outcome for neighbours • Reaction functions, direct spillovers from neighbours occurring through observed behaviour, peer effects • Q: Unbiased and consistent estimation by OLS requires that the error term and the regressors are uncorrelated. Does this assumption hold for this model?

  9. Spatial ‘y’ models • A: No. The parameters are not consistently estimated by Ordinary Least Squares • Consider the simple i-j case • The ‘spatially lagged’ or ‘average neighbouring’ dependent variable y_j is correlated with the unobserved error term for i, because y_j itself depends on y_i

  10. Spatial ‘y’ models • More generally, the ‘spatially lagged’ or ‘average neighbouring’ dependent variable Wy is correlated with the unobserved error term
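
The equations on these two slides did not survive in the transcript. The derivation below is a sketch of the standard argument, with notation that is an assumption rather than the lecturer's: ρ is the coefficient on the neighbour's outcome, β the coefficient on x, and ε the error term.

```latex
% Simple i-j case: two observations that are each other's neighbour
\begin{aligned}
y_i &= \rho\, y_j + x_i\beta + \varepsilon_i, \qquad
y_j = \rho\, y_i + x_j\beta + \varepsilon_j \\
\Rightarrow\quad y_j &= \frac{x_j\beta + \rho\, x_i\beta + \varepsilon_j + \rho\,\varepsilon_i}{1-\rho^{2}}
\end{aligned}

% General case with weights matrix W
\begin{aligned}
y &= \rho W y + X\beta + \varepsilon \\
\Rightarrow\quad y &= (I-\rho W)^{-1}(X\beta + \varepsilon) \\
W y &= W(I-\rho W)^{-1}X\beta + W(I-\rho W)^{-1}\varepsilon
\end{aligned}
```

In the i-j case the regressor y_j contains ρε_i, so it is correlated with the error in the equation for y_i whenever ρ ≠ 0. In the general case the last term shows that Wy contains the error vector itself.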

  11. Spatial ‘y’ models • The average neighbouring dependent variable includes • the neighbours’ error terms • the neighbours’ neighbours’ error terms • the neighbours’ neighbours’ neighbours’ error terms … • So y in any observation i depends on the error terms in all other observations (despite the zero-diagonal W restriction) • Q: There are some specifications of W for which this is not true. Write one down.

  12. Intuition: you are your neighbour’s neighbour [Figure: location 1]

  13. Intuition: you are your neighbour’s neighbour [Figure: locations 1 and 2]

  14. …and your neighbour’s neighbour’s neighbour’s neighbour… [Figure: locations 1, 2 and 3]

  15. …and your neighbour’s neighbour’s neighbour’s neighbour… • Shocks at 1 affect 2 and 4 directly • And 3 indirectly via 2 and 4 • Shocks at 1 get reflected back to 1 from 2 and 4 • And from 2 to 3 to 4 to 1 etc… [Figure: locations 1, 2, 3 and 4]
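
The four-location intuition can be made concrete with a small numerical sketch. It assumes the locations sit on a ring (1 neighbours 2 and 4, 2 neighbours 1 and 3, and so on), that each neighbour gets weight 0.5, and that the neighbour effect is 0.5; none of these values come from the slides.

```python
import numpy as np

rho = 0.5  # assumed strength of the neighbour effect
# Rows/columns are locations 1..4 on a ring, each neighbour weighted 0.5
W = np.array([
    [0.0, 0.5, 0.0, 0.5],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.5, 0.0, 0.5, 0.0],
])

# Total effect of a unit shock at each location on every outcome
M = np.linalg.inv(np.eye(4) - rho * W)

# The same multiplier built up round by round: I + rho*W + rho^2*W^2 + ...
series = sum(np.linalg.matrix_power(rho * W, k) for k in range(60))

effect_of_shock_at_1 = M[:, 0]
print(np.round(effect_of_shock_at_1, 3))
```

With these values a unit shock at 1 moves location 1 by 7/6 (more than one-for-one, because the shock is reflected back), locations 2 and 4 by 1/3 each, and location 3 by 1/6 even though 3 is not a neighbour of 1.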

  16. Methods of estimation

  17. ML Estimation • Possible to estimate by maximum likelihood • Start from the spatial ‘y’ model and assume the unobservables are normally distributed, with no heteroscedasticity, serial correlation etc. • A ‘likelihood’ function can then be derived. This is the probability of observing the data y, given values for the model parameters, the other characteristics X and the weights matrix W

  18. ML Estimation • Joint density of a multivariate random normal distribution (with zero mean) • Log likelihood for spatial ‘y’ model
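
The two formulas on this slide are missing from the transcript. The block below gives the textbook versions as a sketch; the symbols ρ (spatial lag coefficient), β, σ² and ε are assumed notation, and the errors are taken to be independent normal with common variance, as the previous slide states.

```latex
% Joint density of n independent N(0, sigma^2) errors
f(\varepsilon) = (2\pi\sigma^{2})^{-n/2}
  \exp\!\left(-\frac{\varepsilon'\varepsilon}{2\sigma^{2}}\right)

% Spatial 'y' model: epsilon = (I - rho W) y - X beta, with Jacobian |I - rho W|
\ln L(\rho,\beta,\sigma^{2}) =
  -\frac{n}{2}\ln(2\pi\sigma^{2})
  + \ln\left|I-\rho W\right|
  - \frac{1}{2\sigma^{2}}
    \left(y-\rho W y - X\beta\right)'\left(y-\rho W y - X\beta\right)
```

The term ln|I − ρW| is what distinguishes this from an ordinary regression likelihood: it comes from changing variables from ε to y.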

  19. ML Estimation • Use iterative numerical maximisation techniques on a computer to find the parameter values that maximise this ln L function • Built into dedicated spatial software, e.g. ‘SpaceStat’, GeoDa • Disadvantages: • The n × n matrix term in ln L is difficult (slow) to evaluate when the sample size n is very big • It has to be recalculated on every iteration of the maximisation procedure • The matrix must be non-singular (invertible) • Not true for all values of the spatial parameter • Procedure assumes a normal distribution: ‘parametric identification’. Not popular with applied economists these days
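
To show where the computational cost comes from, here is a minimal sketch of maximum likelihood for a spatial ‘y’ model on simulated data. It is not the routine used by SpaceStat or GeoDa. It swaps the iterative maximiser for a plain grid search over the spatial parameter, with the regression coefficients and error variance concentrated out; the ring-shaped W, the helper names and the true values are assumptions.

```python
import numpy as np

def ring_weights(n):
    """Row-normalised ring W (hypothetical): two neighbours, weight 0.5 each."""
    W = np.zeros((n, n))
    idx = np.arange(n)
    W[idx, (idx - 1) % n] = 0.5
    W[idx, (idx + 1) % n] = 0.5
    return W

def concentrated_loglik(rho, y, X, W):
    """ln L at a given rho, with beta and sigma^2 replaced by their
    maximising values (additive constants dropped)."""
    n = len(y)
    A = np.eye(n) - rho * W
    sign, logdet = np.linalg.slogdet(A)   # the costly n x n step
    if sign <= 0:
        return -np.inf                    # singular or inadmissible rho
    Ay = A @ y
    beta, *_ = np.linalg.lstsq(X, Ay, rcond=None)
    resid = Ay - X @ beta
    sigma2 = resid @ resid / n
    return logdet - 0.5 * n * np.log(sigma2)

rng = np.random.default_rng(1)
n, rho_true, beta_true = 400, 0.5, 2.0    # assumed values
W = ring_weights(n)
X = np.column_stack([np.ones(n), rng.normal(size=n)])
eps = rng.normal(size=n)
y = np.linalg.solve(np.eye(n) - rho_true * W,
                    X @ np.array([0.0, beta_true]) + eps)

# The determinant is recomputed at every candidate value of rho
grid = np.linspace(-0.9, 0.9, 181)
values = [concentrated_loglik(r, y, X, W) for r in grid]
rho_hat = grid[int(np.argmax(values))]
print("rho_hat:", round(float(rho_hat), 2))
```

Each evaluation needs the log-determinant of an n × n matrix, which is why the slide flags large samples as a problem.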

  20. Instrumental variables/2SLS/GMM estimation • Consider the simple case we looked at earlier • Q: What instrument is available for y_j (assuming the structure is correct!)?

  21. Instrumental variables/2SLS/GMM estimation • More generally, a possible set of ‘instruments’ (predictors) for Wy is WX, W²X, … • These are correlated with Wy but uncorrelated with the error term • Instruments for Wy are the spatial lags, second lags, etc. of X. Use in standard IV/2SLS estimation (e.g. STATA, SPSS)
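
The instrumenting idea can be sketched in a few lines. This is an illustration on simulated data, not the STATA or SPSS procedure: the ring-shaped W, the function `two_stage_least_squares` and the true values are assumptions, and the model deliberately excludes WX from the outcome equation so that the exclusion restriction holds.

```python
import numpy as np

def ring_weights(n):
    """Row-normalised ring W (hypothetical): two neighbours, weight 0.5 each."""
    W = np.zeros((n, n))
    idx = np.arange(n)
    W[idx, (idx - 1) % n] = 0.5
    W[idx, (idx + 1) % n] = 0.5
    return W

def two_stage_least_squares(y, regressors, instruments):
    """2SLS: project the regressors on the instruments, then run OLS of y
    on the fitted values."""
    first_stage, *_ = np.linalg.lstsq(instruments, regressors, rcond=None)
    regressors_hat = instruments @ first_stage
    coef, *_ = np.linalg.lstsq(regressors_hat, y, rcond=None)
    return coef

rng = np.random.default_rng(2)
n, rho_true, beta_true = 1000, 0.4, 2.0   # assumed values
W = ring_weights(n)
x = rng.normal(size=n)
eps = rng.normal(size=n)
y = np.linalg.solve(np.eye(n) - rho_true * W, beta_true * x + eps)

ones = np.ones(n)
regressors = np.column_stack([ones, x, W @ y])           # Wy is endogenous
instruments = np.column_stack([ones, x, W @ x, W @ W @ x])

iv = two_stage_least_squares(y, regressors, instruments)
ols, *_ = np.linalg.lstsq(regressors, y, rcond=None)
print("2SLS (const, beta, rho):", np.round(iv, 3))
print("OLS  (const, beta, rho):", np.round(ols, 3))
```

The constant is not lagged because W is row-normalised, so W times a column of ones is again a column of ones and would add nothing as an instrument.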

  22. IV/2SLS/GMM estimation • Advantages • Easy to estimate, even on big samples • Works if unobservables are correlated with Wy for other reasons • e.g. if there are unobserved local factors that affect both Wy and y • More on this later… • Doesn’t require parametric assumptions about distribution of unobservable factors • Note ML only identified using parametric assumptions (normality, functional form) unless IV assumptions are also valid

  23. IV/2SLS/GMM estimation • Disadvantages • Assumes that the neighbouring characteristics WX do not directly affect outcomes y • i.e. the error term for i is uncorrelated with neighbouring x • Can’t use if the model also includes WX as a regressor • The instruments WX, W²X, … would be nearly collinear with the regressors WX • Not efficient (precise) relative to ML (assuming the ML assumptions are OK!)

  24. IV/2SLS/GMM estimation • Disadvantages • Need some exogenous characteristics • No good for a model with no exogenous X • Commonly used in regional science-type applications, but modern applied economists won’t buy your “identification strategy” • Note: ‘GMM’ in linear econometric models is essentially just efficient IV: instruments weighted to take account of the error structure and provide more precise estimates

  25. Spatial ‘ε’ error models

  26. Spatial ‘ε’ models • Appropriate when agents’ (or regions’) behaviour or outcome reacts to (or is correlated with) the unobservable characteristics of neighbours • Outcome is dependent on the ε for neighbours • Spillovers from unobserved neighbour characteristics, role models, peer groups, agglomeration, etc. • The error ε follows a spatial autoregression whose innovation u is not correlated across space

  27. Spatial error models • Previous example is a ‘spatially autoregressive’ error model • Other formulations possible e.g. spatial moving average

  28. Spatial error models • Example in the simple i-j case • Q: Remember, consistent estimation by OLS requires that the error term (unobservables) is uncorrelated with the regressors • Is this assumption valid?

  29. Spatial error models • A: Yes • OLS gives consistent and unbiased estimates, since the error term is uncorrelated with x • Q: What problems are there in estimating this model?

  30. Spatial error models • Efficiency of OLS, and correct standard errors, require that the error term is homoscedastic and has no autocorrelation, i.e. E[εε′] = σ²I • By definition, our spatial error model has spatially autocorrelated error terms! • OLS is not the most efficient (precise) estimator available • Standard errors computed by the usual formulae will be wrong • Potential for making mistaken inferences (e.g. concluding that a coefficient is significantly different from zero when in truth it is zero)

  31. Spatial error models • Efficiency is often not a major issue. We just want consistent estimates • Wrong standard errors are a nuisance • ‘Robust’ estimates of the standard errors might be appropriate • Or ‘bootstrap’/simulate the distribution under the null • If we know or specify ρ, we can use Generalised Least Squares to estimate the remaining parameters • GLS is efficient and provides consistent estimates of standard errors

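  32. Spatial error models • GLS weights all the variables in the model to make the error term non-spatially correlated • Use the transformation (I − ρW)y = (I − ρW)Xβ + u • Pre-multiplying by (I − ρW) is called spatial filtering • Filters out the spatial autocorrelation, e.g. with a row-normalised W each variable becomes its own value minus ρ times the average for its neighbours • Or, if ρ = 1, its own value minus the neighbours’ average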
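
Spatial filtering is easy to try on simulated data. The sketch below treats the error parameter as known (the slide's rho, called `rho_err` here); the ring-shaped W, the helper names and the true values are assumptions for illustration only.

```python
import numpy as np

def ring_weights(n):
    """Row-normalised ring W (hypothetical): two neighbours, weight 0.5 each."""
    W = np.zeros((n, n))
    idx = np.arange(n)
    W[idx, (idx - 1) % n] = 0.5
    W[idx, (idx + 1) % n] = 0.5
    return W

def neighbour_corr(resid, W):
    """Correlation between each residual and its neighbours' average residual."""
    return np.corrcoef(resid, W @ resid)[0, 1]

rng = np.random.default_rng(3)
n, rho_err, beta_true = 500, 0.6, 2.0     # rho_err assumed known
W = ring_weights(n)
X = np.column_stack([np.ones(n), rng.normal(size=n)])
u = rng.normal(size=n)                    # not correlated across space
eps = np.linalg.solve(np.eye(n) - rho_err * W, u)   # eps = rho*W*eps + u
y = X @ np.array([1.0, beta_true]) + eps

# OLS on the raw data: consistent, but residuals are spatially correlated
b_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
resid_ols = y - X @ b_ols

# Spatial filter: pre-multiply y and X by (I - rho*W), then run OLS (GLS)
F = np.eye(n) - rho_err * W
b_gls, *_ = np.linalg.lstsq(F @ X, F @ y, rcond=None)
resid_gls = F @ y - (F @ X) @ b_gls

print("OLS slope:", round(float(b_ols[1]), 3),
      "neighbour corr:", round(float(neighbour_corr(resid_ols, W)), 3))
print("GLS slope:", round(float(b_gls[1]), 3),
      "neighbour corr:", round(float(neighbour_corr(resid_gls, W)), 3))
```

Both slopes should be close to the assumed true value; the difference shows up in the residuals, which stay correlated with their neighbours under OLS and not after filtering.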

  33. Estimating ρ in spatial error models • How to estimate the parameter ρ? • Maximum likelihood • Estimates the parameters • Consistent and efficient • Unbiased standard errors • But assumes normality and is infeasible in big samples • IV/2SLS not possible using spatial lags of X • Simple cases = random effects (next lecture) • Generalised Method of Moments possible (Kelejian and Prucha 1999), applied in Bell and Bockstael (2000), Review of Economics and Statistics, p72-82

  34. Comparison of these spatial models

  35. Comparison of spatial models • Though different in interpretation, these spatial models are observationally very similar • Very difficult to distinguish empirically • All give rise to some form of spatial autocorrelation: the outcome in one location is correlated with outcomes in neighbouring locations • Spatial autoregressive error model can be transformed into a spatial lag model with non-autocorrelated errors • Q: How is this different from the spatial lag model and spatial X model?
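
The transformation referred to on this slide is missing from the transcript. A sketch of the standard algebra follows, in assumed notation: ρ is the error autoregression parameter, u the spatially uncorrelated innovation.

```latex
\begin{aligned}
y &= X\beta + \varepsilon, \qquad \varepsilon = \rho W \varepsilon + u \\
\Rightarrow\quad (I-\rho W)\,y &= (I-\rho W)\,X\beta + u \\
\Rightarrow\quad y &= \rho W y + X\beta - \rho\, W X \beta + u
\end{aligned}
```

The result has a spatial lag of y and spatial lags of X, but the coefficient on WX is tied to the other two (it equals −ρβ). An unrestricted spatial lag model with spatial X terms leaves that coefficient free, which is one way to think about the question on the slide.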

  36. Comparison of spatial models • Spatial dependence model can be transformed into a reduced form with neighbourhood exogenous variables and spatially autocorrelated errors • In spatial y models, the outcome at i depends on all the neighbouring characteristics (Xs) and neighbouring shocks • In spatial ε models it’s just the shocks that are autocorrelated • Problem with the reduced form version is that it is observationally equivalent to a spatial X model with spatially autocorrelated errors! • Does this matter?...

  37. The ‘social multiplier’ • Outcome_i = β × treatment_i + θ × mean group outcome • Testing in the ‘lab’: my policy improves outcome by β for person i • E.g. providing sex education reduces probability of pregnancy by β% • Let treatment = 1 • Policy: treat 50 people (e.g. teenage girls) • The communities (e.g. roommates)

  38. The ‘social multiplier’ • Target 1: give one person i in each group sex education • Outcome for the 50 treated = β/(1 − θ²) • Outcome for the 50 untreated = βθ/(1 − θ²) • Total = 50 × β(1 + θ)/(1 − θ²) = 50 × β/(1 − θ) • Target 2: give both people in 25 groups sex education • Outcome for the 50 treated = β/(1 − θ) • Outcome for the 50 non-treated = 0 • Total = 50 × β/(1 − θ) • Treatment effect is amplified (for the treated group) if the treatment is group-targeted
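
The arithmetic on this slide can be verified by solving the two-person system directly. The sketch below makes several assumptions that are not stated in the transcript: groups are pairs of roommates, ‘mean group outcome’ means the roommate's outcome, there are 50 pairs in total, and the second targeting rule treats both people in 25 of them so that 50 people are treated either way. The names `beta` (direct effect) and `theta` (effect of the roommate's outcome) and their values are illustrative.

```python
import numpy as np

beta, theta = 1.0, 0.5   # assumed direct effect and peer effect

def pair_outcomes(t1, t2, beta, theta):
    """Solve y1 = beta*t1 + theta*y2 and y2 = beta*t2 + theta*y1 for one pair."""
    A = np.array([[1.0, -theta], [-theta, 1.0]])
    return np.linalg.solve(A, beta * np.array([t1, t2], dtype=float))

# Target 1: one treated person in each of the 50 pairs
y_treated, y_untreated = pair_outcomes(1, 0, beta, theta)
total_spread = 50 * (y_treated + y_untreated)

# Target 2: both people treated in 25 pairs, nobody treated in the other 25
y_both = pair_outcomes(1, 1, beta, theta)
total_grouped = 25 * y_both.sum()

print("Target 1: treated", y_treated, "untreated", y_untreated,
      "total", total_spread)
print("Target 2: treated", y_both[0], "total", total_grouped)
```

The two totals coincide, but the outcome per treated person is larger under group targeting, which is the amplification the slide describes.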

  39. The ‘social multiplier’ • So it seems useful to know if the behaviour depends on others’ behaviour (spatial y model) • But does it…? • Suppose I know the reduced form: • Outcome_i = π₁ × treatment_i + π₂ × mean group treatment • Social multiplier is implicit in the reduced-form coefficients • Q: However, the conceptual advantage of the spatial y model is that it implies that the social multiplier extends to any treatment, whereas the reduced form requires that we estimate it for every treatment

  40. Limitations of ‘spatial econometric’ methods • Much of the traditional regional science literature is concerned with • a) obtaining consistent estimates of the spatial interaction parameter by ML or IV, where the focus is on bias caused by two-way causality between an observation and its neighbours • b) obtaining efficient estimates when the spatial error structure is known or can be estimated • Does not deal with the fundamental problem of spatial ‘sorting’ or unobserved similarities between neighbours • The similarity between outcomes in neighbouring places caused by unobserved similarity of these places • Or by the fact that agents with similar characteristics or preferences tend to group together • These links can be represented by ‘spatial error’ models, but the spatial econometric interpretation of these is about interdependency between neighbours

  41. Limitations of ‘spatial econometric’ methods • To address bias induced by sorting we need to employ the kind of strategies used throughout modern applied work to tackle endogeneity • ‘Differences-in-differences’/fixed effects • Instrumental variables (using instruments from elsewhere) • Experimental/quasi-experimental approaches • Regression discontinuity designs • We see examples of these in past and future seminars • See also Angrist and Krueger (1999), Empirical Strategies in Labor Economics, Handbook of Labor Economics, Vol. 3A (supplied) • And chapters in Angrist and Pischke (2009) on reading list

  42. Neighbourhood or ‘social interaction’ models

  43. The neighbourhood effects model • Standard model used for neighbourhood/peer group effects regressions • Where the means are within a ‘reference’ group - neighbourhood, classroom etc to which i belongs • Note: the group means could be derived from within the estimation data, or from elsewhere (e.g. matched census data) • Q: how does this compare to the spatial econometrics models we just looked at?

  44. The neighbourhood effects model • Manski’s (1993) oft-cited (and useful) taxonomy of neighbourhood effects splits these into • Endogenous effects: captured by the coefficient on the group mean outcome • Techniques for estimation analogous to spatial dependence model (spatial y) • Exogenous or contextual (sociological) effects: captured by the coefficient on the group mean characteristics (spatial x) • Correlated effects: similarities between unobserved factors affecting the group j, including group sorting: captured by f_j • Analogous to spatially autocorrelated error terms (spatial ε)

  45. The neighbourhood effects model • Estimation problems are pretty much the same as in the ‘spatial regression’ context • In general it is not possible to estimate all these parameters without excluding one type of interaction • Bottom line is it’s probably best to just focus on the reduced form model with ‘contextual’ effects • But sorting still matters: whatever generates differences in context may affect i directly

  46. The neighbourhood effects model • The definitive reading on this is • Moffitt, R.A. (2001) Policy Interventions, Low-Level Equilibria and Social Interactions, Chapter 3 in Durlauf and Peyton-Young eds., Social Dynamics, Brookings Institution

  47. Applications of spatial and neighbourhood effects models (using spatial econometrics methods)

  48. Some examples: aggregated data • Regional models with technological spillovers • Fischer and Varga (2003), Spatial Knowledge Spillovers and University Research, Annals of Regional Science, 37: 303-322 • Regional growth models • Rey, S.J. and B.D. Montouri (1999), US Regional Income Convergence: A Spatial Perspective, Regional Studies 33(2): 143-156 • Interaction between governments • Brueckner, J.K. (1998), Testing for Strategic Interaction among Local Governments: The Case of Growth Controls, Journal of Urban Economics, 44 (3): 438-467 • Figlio, D.N., V.W. Kolpin and W.E. Reid, Do States Play Welfare Games?, Journal of Urban Economics, 46 (3): 437-454

  49. Some examples: micro data • Property markets • Ioannides, Y. (2003), Interactive Property Valuations, Journal of Urban Economics 53 (1): 145-170 • Peer group effects • Gaviria, A. and Raphael, S. (2001), School-Based Peer Effects and Juvenile Behavior, Review of Economics and Statistics, 83(2): 257-268 • Neighbourhood effects • Case, A. and Katz, L., The Company You Keep: The Effects of Family and Neighbourhood on Disadvantaged Youths, NBER working paper 3705 • Case, A. (1992), Neighbourhood Influence and Technological Change, Regional Science and Urban Economics, 22 • We look(ed) at other methods in the seminar

  50. Conclusions • Straightforward to show that data is spatially autocorrelated • Difficult to identify the source of the autocorrelation • Effect of neighbour outcomes? • Effect of neighbour observable characteristics? • Effect of neighbour unobservables? • Reduced form (spatial x) models are generally the best way forward for applied, policy-relevant work • You can focus on identification strategies that deal with standard endogeneity/omitted variables problems (E[ε|x] ≠ 0) and sorting issues (E[ε|Wx] ≠ 0)
