
GY460 Techniques of Spatial Economic Analysis

GY460 Techniques of Spatial Economic Analysis. Lecture 2: Spatial smoothing and weighting. Steve Gibbons. Objectives: outline methods for ‘smoothing’ data (spatial and non-spatial); understand the relevance of this method in visualisation and exploratory analysis.

Presentation Transcript


  1. GY460 Techniques of Spatial Economic Analysis Lecture 2: Spatial smoothing and weighting Steve Gibbons

  2. Objectives • Outline methods for ‘smoothing’ data (spatial and non-spatial) • Understand relevance of this method in visualisation and exploratory analysis • Understand relevance of this method to modelling and regression approaches (leads to next lecture)

  3. Readings • No essential papers for this section, though we will see examples later • Haining (2003) Chapters 5, 7 • Various chapters in Fotheringham et al. • Härdle, W. (1990) Applied Nonparametric Regression, Cambridge University Press, is a useful but non-spatial reference

  4. Smoothing and spatial surfaces (1) • We can (conceptually at least) decompose a random variable $x$ at places $s$ into ‘trend’ and ‘residual’ components: $x(s) = m(s) + \varepsilon(s)$ • An estimate of $m(s)$ can be useful for …

  5. Smoothing and spatial surfaces (2) • Exploratory Spatial Data Analysis… • Visualisation of the underlying spatial trend • Looking for spatial ‘heterogeneity’, clusters and hotspots i.e. high and low values of m(s) • Prediction/interpolation: what can we expect at locations not in the sample? • Or at locations in the sample at times we haven’t sampled • Input into further analysis – e.g. effects of market potential, accessibility, spillovers • Question: how can we define ‘large’ and ‘small’ scale? • This is not well determined

  6. Motivation: London house prices, aggregated patterns

  7. Motivation: London house prices, disaggregated patterns

  8. Various smoothing techniques

  9. General structure • Goal is to estimate the smooth part $m(s)$ • When data is generated by a process of the form $x_i = m(s_i) + \varepsilon_i$ • And $m(\cdot)$ is an unknown and (probably) very non-linear function of $s$ • Note: if $\varepsilon_i$ is uncorrelated with $s_i$, then $E[x_i \mid s_i] = m(s_i)$ • i.e. this is essentially a non-linear regression problem

  10. General structure • The most common non/semi-parametric estimators take the general form $\hat{m}(s) = \sum_i w(s_i, s)\, x_i$ • Where $w(s_i, s)$ is a scalar weight assigned to data point $i$ given its distance from location $s$ • And $\sum_i w(s_i, s) = 1$ • So $\hat{m}(s)$ is basically a moving weighted average
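The moving weighted average described on this slide can be sketched as follows. This is a minimal illustration in one dimension, not the lecture's own code: the function names, the `weight_fn` interface and the data are all hypothetical.

```python
def weighted_mean(values, weights):
    """Weighted average: sum_i w_i * x_i, with the weights rescaled to sum to 1."""
    total = sum(weights)
    if total == 0:
        raise ValueError("all weights are zero at this location")
    return sum(w * x for w, x in zip(weights, values)) / total


def smooth(locations, values, weight_fn, targets):
    """Estimate m(s) at each target location s.

    weight_fn(distance) returns a non-negative weight (hypothetical interface).
    """
    estimates = []
    for s in targets:
        weights = [weight_fn(abs(s_i - s)) for s_i in locations]
        estimates.append(weighted_mean(values, weights))
    return estimates


# Illustrative data, not taken from the lecture.
locations = [0.0, 1.0, 2.0, 3.0]
values = [1.0, 3.0, 2.0, 6.0]


def window(d):
    """Simple 'window' weight: 1 within distance 1 of the target, 0 outside."""
    return 1.0 if d <= 1.0 else 0.0


m_hat = smooth(locations, values, window, targets=[0.0, 1.5, 3.0])
```

Swapping `window` for a different weight function changes the smoother without changing the averaging step, which is the sense in which the methods on the following slides share one structure.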

  11. Example 1: k-nearest neighbours
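A k-nearest-neighbours smoother can be sketched as below, assuming one-dimensional locations and equal weight on each of the k nearest observations. The function name and data are illustrative only.

```python
def knn_smooth(locations, values, s, k):
    """Average of the k observations nearest to location s (1-D sketch)."""
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and the number of observations")
    # Sort observation indices by distance from s and keep the k closest.
    order = sorted(range(len(locations)), key=lambda i: abs(locations[i] - s))
    nearest = order[:k]
    return sum(values[i] for i in nearest) / k


# Illustrative data, not taken from the lecture.
locations = [0.0, 1.0, 2.0, 4.0, 7.0]
values = [10.0, 12.0, 11.0, 20.0, 30.0]
estimate = knn_smooth(locations, values, s=1.2, k=3)
```

Here the window width adapts to the data: it is narrow where observations are dense and wide where they are sparse.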

  12. Example 2: kernel regression or smoothing • h: bandwidth – defines width of the kernel ‘windows’

  13. Example kernels • Uniform/rectangular • Normal/Gaussian
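Kernel smoothing with the two example kernels can be sketched as follows. This assumes the standard kernel-weighted (Nadaraya-Watson style) average, with weights proportional to K((s_i - s)/h); the slide formulas are not in the transcript, so the exact form used in the lecture is an assumption, and the names are hypothetical.

```python
import math


def uniform_kernel(u):
    """Uniform/rectangular kernel: constant weight inside the window |u| <= 1."""
    return 0.5 if abs(u) <= 1.0 else 0.0


def gaussian_kernel(u):
    """Normal/Gaussian kernel: standard normal density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kernel_smooth(locations, values, s, h, kernel=gaussian_kernel):
    """Kernel-weighted average at location s with bandwidth h (1-D sketch)."""
    weights = [kernel((s_i - s) / h) for s_i in locations]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no observations fall inside the kernel window")
    return sum(w * x for w, x in zip(weights, values)) / total


# Illustrative data, not taken from the lecture.
locations = [0.0, 1.0, 2.0, 3.0]
values = [1.0, 3.0, 2.0, 6.0]
m_uniform = kernel_smooth(locations, values, s=1.0, h=1.0, kernel=uniform_kernel)
m_gauss = kernel_smooth(locations, values, s=1.0, h=1.0)
```

A larger bandwidth h widens the window and gives a smoother estimate; a smaller h tracks the data more closely.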

  14. Example 3: Locally weighted regression (1) • See Cleveland (1979), Journal of the American Statistical Association 74(368) • E.g. fit a local polynomial using regressions

  15. Example 3: Locally weighted regression (2) • Straightforward to estimate gradients from estimated coefficients
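A local linear fit, and the gradient read off from its slope coefficient, can be sketched as below. This is a simplification, not Cleveland's full procedure: it uses Gaussian weights and a single weighted least squares pass, both of which are assumptions made for brevity, and the function name is hypothetical.

```python
import math


def local_linear(locations, values, s, h):
    """Fit x_i = a + b * (s_i - s) by weighted least squares around s.

    Returns (a, b): a estimates m(s) and b estimates the gradient of m at s.
    Gaussian weights with bandwidth h are an assumption of this sketch.
    """
    w = [math.exp(-0.5 * ((s_i - s) / h) ** 2) for s_i in locations]
    z = [s_i - s for s_i in locations]  # regressor centred on the target
    sw = sum(w)
    z_bar = sum(wi * zi for wi, zi in zip(w, z)) / sw
    x_bar = sum(wi * xi for wi, xi in zip(w, values)) / sw
    s_zz = sum(wi * (zi - z_bar) ** 2 for wi, zi in zip(w, z))
    s_zx = sum(wi * (zi - z_bar) * (xi - x_bar)
               for wi, zi, xi in zip(w, z, values))
    if s_zz == 0.0:
        raise ValueError("need at least two distinct locations with weight")
    slope = s_zx / s_zz
    level = x_bar - slope * z_bar
    return level, slope


# Illustrative data lying exactly on the line x = 2 + 3s.
locations = [0.0, 1.0, 2.0, 3.0, 4.0]
values = [2.0 + 3.0 * s_i for s_i in locations]
level, slope = local_linear(locations, values, s=1.5, h=1.0)
```

Because the regressor is centred on the target location, the intercept is the local estimate of m(s) and the slope is the local gradient.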

  16. Example 4: Splines • If local polynomials are made to join locally then you have a spline

  17. All these methods try to estimate $E[x \mid s]$ • Estimate at any data point or arbitrary grid point • But potential problems at edges (out-of-sample data)

  18. Generalising to two-dimensional space • We could use multivariate weighting functions, with different bandwidths in different directions • In the case of spatial kernels, it’s usual to just combine the two dimensions (N-S and E-W) into one: distance! • To see how this relates to the previous discussion, consider multiplying two univariate Gaussian kernels for E-W and N-S coordinates • Q: What does this combination of two dimensions into one assume?
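The product of two univariate Gaussian kernels can be written out as follows. This sketch assumes the same bandwidth $h$ in both directions and the standard normal kernel $K(u) = (2\pi)^{-1/2} e^{-u^2/2}$; the symbols $(e_i, n_i)$ for the E-W and N-S coordinates of observation $i$, and $(e, n)$ for the target location, are introduced here for illustration.

```latex
K\!\left(\frac{e_i - e}{h}\right) K\!\left(\frac{n_i - n}{h}\right)
  = \frac{1}{2\pi}\exp\!\left(-\frac{(e_i - e)^2 + (n_i - n)^2}{2h^2}\right)
  = \frac{1}{2\pi}\exp\!\left(-\frac{d_i^2}{2h^2}\right),
\qquad d_i^2 = (e_i - e)^2 + (n_i - n)^2 .
```

Under these assumptions the weight depends on the two coordinates only through the straight-line distance $d_i$, so every direction is treated alike.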

  19. Property sales

  20. Smoothing and interpolation (1) • Note: we can estimate m(.) at points in the data, and at points between locations represented in the data – allows interpolation to other places… [Figure: a point (x_i, s_i1, s_i2) with surrounding observations y_k, y_l, y_m, y_n, y_p, y_q, y_r and distances d_k, d_l, d_m, d_n, d_p, d_q, d_r]

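  21. Smoothing and interpolation • Assign average values (e.g. prices) from points to raster cells, e.g. by inverse distance weighting (IDW) [Figure: worked IDW example with b = 1; the labels 2, 1.6, 1, 1.8, 1.7, 8, 1.1, 7 and 4.24 cannot be matched to points and distances from this transcript]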
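Inverse distance weighting to a single raster cell can be sketched as below, assuming weights of the form d^(-b) rescaled to sum to one. The function name, the handling of a zero distance and the data are illustrative choices, not taken from the slide's worked example.

```python
import math


def idw(points, target, b=1.0):
    """Inverse-distance-weighted mean at `target`.

    points: list of ((x, y), value) pairs; b: distance exponent.
    """
    numerator = 0.0
    denominator = 0.0
    for (px, py), value in points:
        d = math.hypot(px - target[0], py - target[1])
        if d == 0.0:
            # Target coincides with an observation: return its value directly
            # (one possible convention for avoiding division by zero).
            return value
        w = d ** -b
        numerator += w * value
        denominator += w
    return numerator / denominator


# Illustrative data: two sales and one raster cell centre.
sales = [((0.0, 0.0), 0.0), ((3.0, 0.0), 3.0)]
cell_value = idw(sales, target=(1.0, 0.0), b=1.0)
```

In this example the nearer point has weight 1 and the farther point weight 0.5, so the cell value is pulled towards the nearer observation.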

  22. Property sales prices (IDW)

  23. London log property prices (Gaussian kernel) • Source: Gibbons and Machin (2003), Journal of Urban Economics

  24. Alternative, parametric methods • Alternatively you can model m(s) as a parametric function • Polynomial series • Or polar coordinates e.g. Cheshire and Sheppard (1995), Economica • Application to land value in Reading

  25. Weights matrices: notation and practical implementation

  26. Matrix representation of spatial weights • You can simplify the notation for weighted averages for one location: $\hat{m}(s_i) = w_i' x$ • Or the whole vector of means: $\hat{m} = Wx$ • Usual to normalise so that $Wx$ creates a mean and not a sum • Let’s think about the case where $Wx$ is projecting data on to the same set of locations • It is common then to exclude observation $i$ from the mean for location $i$ • You can apply any of the weights schemes we’ve discussed already

  27. Inverse distance weights • $w_{ij} = 1/d_{ij}$ • Or in general $w_{ij} = d_{ij}^{-b}$
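A row-normalised inverse distance weights matrix, with observation i excluded from its own mean, can be sketched as follows. The exponent name `b`, the helper names and the data are assumptions for illustration, and the sketch assumes no two observations share a location.

```python
import math


def inverse_distance_matrix(coords, b=1.0):
    """Row-normalised weights w_ij proportional to d_ij ** -b, zero on the diagonal."""
    n = len(coords)
    W = [[0.0] * n for _ in range(n)]
    for i, (xi, yi) in enumerate(coords):
        for j, (xj, yj) in enumerate(coords):
            if i != j:  # exclude observation i from the mean for location i
                W[i][j] = math.hypot(xi - xj, yi - yj) ** -b
        row_sum = sum(W[i])
        if row_sum > 0.0:
            # Normalise so that Wx creates a mean and not a sum.
            W[i] = [w / row_sum for w in W[i]]
    return W


def matvec(W, x):
    """Compute the vector Wx."""
    return [sum(w * xj for w, xj in zip(row, x)) for row in W]


# Illustrative data: three places along a line.
coords = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
x = [1.0, 2.0, 4.0]
W = inverse_distance_matrix(coords)
Wx = matvec(W, x)
```

Each row of `W` sums to one, so each element of `Wx` is a weighted mean of the other places' values.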

  28. 1st order contiguity weights • A traditional weighting scheme in spatial econometrics and regional science

  29. Spatial weights matrix for 1st order contiguity • For all n observations (regions), first-order contiguity

  30. The ‘average neighbours’ vector
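First-order contiguity weights and the resulting ‘average neighbours’ vector can be sketched as below. The four regions arranged in a line and their adjacency lists are a made-up example, and the function names are hypothetical.

```python
def contiguity_matrix(neighbours, n):
    """Binary first-order contiguity: c_ij = 1 if regions i and j share a border."""
    C = [[0.0] * n for _ in range(n)]
    for i, adjacent in neighbours.items():
        for j in adjacent:
            C[i][j] = 1.0
    return C


def row_normalise(C):
    """Divide each row by its sum so that each row of W sums to one."""
    W = []
    for row in C:
        total = sum(row)
        W.append([c / total for c in row] if total > 0.0 else list(row))
    return W


# Illustrative layout: regions 0-1-2-3 arranged in a line.
neighbours = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
x = [10.0, 20.0, 30.0, 40.0]
W = row_normalise(contiguity_matrix(neighbours, n=4))
average_neighbours = [sum(w * xj for w, xj in zip(row, x)) for row in W]
```

Each element of `average_neighbours` is the simple mean of x over the regions bordering region i.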

  31. Other weight schemes • Neighbourhood/district blocks • Uniform weight on observations sharing the same ‘neighbourhood’; zero otherwise • Q: how would this look for 9 observations in 3 neighbourhoods? • Social/economic weights • E.g. absolute difference in incomes between two places • Weights derived from other analyses (e.g. trade flows – see Head and Mayer (2004) in later lectures) • Or commuting or migration flows (e.g. Figlio et al strategic interactions paper) • Network distances: along road or rail networks • Need GIS to do this easily
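One possible reading of the neighbourhood-block scheme, for 9 observations in 3 neighbourhoods of 3, can be sketched as follows. The sketch assumes observations are sorted by neighbourhood and that each observation is included in its own neighbourhood mean; excluding it is an equally valid convention. Labels and data are made up.

```python
def block_weights(labels):
    """Uniform weight on observations sharing a label, zero otherwise (rows sum to 1)."""
    n = len(labels)
    W = []
    for i in range(n):
        same = [1.0 if labels[j] == labels[i] else 0.0 for j in range(n)]
        size = sum(same)  # neighbourhood size, counting observation i itself
        W.append([s / size for s in same])
    return W


# Illustrative data: 9 observations in 3 neighbourhoods.
labels = ["A", "A", "A", "B", "B", "B", "C", "C", "C"]
x = [1.0, 2.0, 3.0, 10.0, 20.0, 30.0, 5.0, 5.0, 5.0]
W = block_weights(labels)
block_means = [sum(w * xj for w, xj in zip(row, x)) for row in W]
```

With this ordering `W` is block diagonal: three 3×3 blocks with every entry equal to 1/3, and zeros elsewhere, so `Wx` returns each observation's neighbourhood mean.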

  32. Practical issues • It’s rarely necessary (or feasible) to work with an N×N spatial weights matrix, although this is used in much spatial econometrics notation • Weights can more easily be dealt with as a column vector, and it’s rarely necessary to assign all N weights to each of the N observations (e.g. nearest neighbours) • If all else fails: you can calculate the weighted averages one observation at a time (“do” loops) • Zero distances are a problem when using inverse distance weights
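The loop-based approach can be sketched as below: weighted averages are computed one observation at a time, using only the k nearest neighbours and never building the full N×N matrix. The distance floor `min_dist` is one hypothetical way to guard against zero distances, not a recommendation from the lecture, and the data are made up.

```python
import math


def knn_idw_means(coords, x, k, min_dist=1e-6):
    """Inverse-distance-weighted mean of each observation's k nearest neighbours.

    Only k (index, weight) pairs are held per observation at any time.
    min_dist is an assumed floor that prevents division by zero when two
    observations share a location.
    """
    means = []
    for i, (xi, yi) in enumerate(coords):
        # Distances from observation i to every other observation.
        dists = [
            (math.hypot(xi - xj, yi - yj), j)
            for j, (xj, yj) in enumerate(coords)
            if j != i
        ]
        dists.sort()
        weights = [(j, 1.0 / max(d, min_dist)) for d, j in dists[:k]]
        total = sum(w for _, w in weights)
        means.append(sum(w * x[j] for j, w in weights) / total)
    return means


# Illustrative data: observations 1 and 2 share the same location.
coords = [(0.0, 0.0), (1.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
x = [1.0, 2.0, 4.0, 8.0]
means = knn_idw_means(coords, x, k=2)
```

With the floor in place, a neighbour at zero distance receives a very large but finite weight, so it dominates the mean without causing an error.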

  33. Applications

  34. Applications • These spatial weights systems are a fundamental building block of quantitative spatial analysis • We will discuss them in greater detail in future lectures and seminars…

  35. Market Potential • Construction of market potential measures – e.g. for trade and economic geography • X is income, wages, expenditure, or population • Many, many examples • Harris (1954), The Market as a Factor in the Localization of Industry in the United States, Annals of the Association of American Geographers • Hanson (2005), Market potential, increasing returns and geographic concentration, Journal of International Economics
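A Harris-style market potential measure can be sketched as a distance-weighted sum of X over other places. The slide's formula is not in the transcript, so the inverse-distance form and the exclusion of each place's own X are assumptions of this sketch; names and data are illustrative.

```python
import math


def market_potential(coords, x):
    """Sum over other places j of x_j / d_ij (weights are not row normalised)."""
    potential = []
    for i, (xi, yi) in enumerate(coords):
        total = 0.0
        for j, (xj, yj) in enumerate(coords):
            if i == j:
                continue  # own-place term left out in this sketch
            total += x[j] / math.hypot(xi - xj, yi - yj)
        potential.append(total)
    return potential


# Illustrative data: three places and their incomes.
coords = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
income = [100.0, 50.0, 200.0]
mp = market_potential(coords, income)
```

Because the weights are not normalised, the result is a sum rather than a mean: a place close to many large markets scores higher than one close to a single market of average size.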

  36. Accessibility, agglomeration • Measurement of ‘accessibility’ • X is employment, GDP, population or other variable of interest • Distance weights often computed along road and rail network • Weights are usually not row normalised, i.e. the measure is a weighted sum rather than a weighted mean • e.g. Vickerman, Spiekermann, Wegener (1999) Accessibility and Economic Development in Europe, Regional Studies • Work by Daniel Graham for UK Department of Transport http://www.dft.gov.uk/pgr/economics/rdg/webia/webtheory/investigatingthelinkbetweenp1077

  37. Spillovers, neighbourhood effects and interactions • We looked at papers on strategic interaction in seminar • Brueckner (1999), Figlio et al (1999) • X was indicators of government policy • Big literature on ‘neighbourhood effects’ and peer groups • X can be anything you think affects adult outcomes or child development • Technological spillovers • X could be R&D expenditure • More about these in next lecture(s)

  38. Conclusions • Most spatial analysis involves estimation of local means (or sums) by re-weighting the data • Close similarity between many methods, but diverse applications… • …signals the need for some caution in analytical interpretation!
