**GEOG4650/5650 – Fall 2007Spatial Interpolation** • Triangulation • Inverse-distance • Kriging (optimal interpolation)

**What is “Interpolation”?** • Predicting the value of attributes at “unsampled” sites from measurements made at point locations within the same area or region • Predicting the value outside the area - “extrapolation” • Creating continuous surfaces from point data - the main procedures

**Types of Spatial Interpolation** • Global or Local • Global-use every known points to estimate unknown value. • Local – use a sample of known points to estimate unknown value. • Exact or inexact interpolation • Exact – predict a value at the point location that is the same as its known value. • Inexact (approximate) – predicts a value at the point location that differs from its known value. • Deterministic or stochastic interpolation • Deterministic – provides no assessment of errors with predicted values • Stochastic interpolation – offers assessment of prediction errros with estimated variances.

**Classification of Spatial Interpolation Methods** Global Local Stochastic Deterministic Stochastic Deterministic Regression (inexact) Kriging (exact) Trend surface (inexact) Thiessen (exact) Density estimation(inexact) Inverse distance weighted (exact) Splines (exact)

**Global Interpolation** • use all available data to provide predictions for the whole area of interest, while local interpolations operate within a small zone around the point being interpolated to ensure that estimates are made only with data from locations in the immediate neighborhood. • Trend surface and regression methods

**Trend Surface Analysis** • Approximate points with known values with a polynomial equation. • Math equation – you don’t want to know…. • Local polynomial interpolation – uses a sample of known points, such as convert TIN to DEM

**Local, deterministic methods** • Define an area around the point to be predicted • finding the data points within this neighborhood • choosing a math model • evaluating the point

**Thiessen Polygon (nearest neighbor)** • Any point within a polygon is closer to the polygon’s known point than any other known points. • One observation per cell, if the data lie on a regular square grid, then Thiessen polygons are all equal, if irregular then irregular lattice of polygons are formed • Delauney triangulation - lines joining the data points (same as TIN - triangular irregular network)

**Thiessen polygons** Delauney triangulation

**Example data set** • soil data from Mass near the village of Stein in the south of the Netherlands • all point data refer to a support of 10x10 m, the are within which bulked samples were collected using a stratified random sampling scheme • Heavy metal concentration measured

**Exercise: create Thiessen polygon for zinc concentration** • Create a new project • Copy “classfiles\GEOG4650-Li\data\10-24\Soil_poll.dbf” and add it to the project. • After add the table into the project, you need to create an event theme based on this table • Go to Tools > Add XY Data and make sure the “Easting” is shown in “X” and “Northing” is in “Y”. (Don’t worry the “Unknown coordinate” • Click on OK then the point theme will appear on your project.

**Exercise1) Try to plot this point data based on Zinc** concentration(Try “Graduate Color” and “Graduate Symbol”)

**This is what you might see on screen**

**Create a polygon theme ** • The next thing you need to do is provide the Thiessen polygon a boundary so that the computing of irregular polygons can be reasonable • Use ArcCatalog to create a new shapefile and name it as “Polygon.shp” • Add this layer to your current project. • Use “Editor” to create a polygon.

**Creating Polygon Theme**

**Notes: 1)Remember to stop Edits, otherwise your polygon** theme will be under editing mode all the time2)Remember to remove the “selected” points from the “Soil_poll_data.txt”. If you are done so, your Thiessen polygons will be based on the selected points only.

**Thiessen Polygon from ArcToolBox** • In Arctoolbox | Analysis Tools | Proximity | Create Thiessen Polygon • Make sure the soil_poll event is the Input Features and output to your own folder. Select “All” for Output Fields.

**Result from Thiessen polygon**

**Create Thiessen Polygons from Spatial Analyst: Set Extent** and Cell Size • Go to “Spatial Analyst > Options” and click on tab and use “Polygon” as the “Analysis Mask”. • If the Analysis Mask is not set, the output layer will have rectangular shape.

**Thiessen Polygon from Spatial Analyst** • Select Spatial Analyst > Distance > Allocation. • In “Assign to”, select “soil_poll Event” and click OK to create cell in temporary folder.

**Vector vs Raster Results** • Polygons from “Analysis Tools” are vector polygons with attributes. • Polygons from “Spatial Analyst” are raster polygons with same values inside of each polygon, required to be converted to vector and NO attributes..

**Zinc Concentration** Plot thiessen polygons using zinc concentrations from the attribute table. Before you plot the map, trim thiessen polygons based on polygon.shp.

**Inverse Distance Weighted** • the value of an attribute z at some unsampled point is a distance-weighted average of data point occurring within a neighborhood, which compute: =estimated value at an unsampled point n= number of control points used to estimate a grid point k=power to which distance is raised d=distances from each control points to an unsampled point

**Computing IDW** 6 Z1=40 Z2=60 4 Z4=40 Z3=50 2 2 4 6 X Do you get 49.5 for the red square?

**Exercise - generate a Inversion distance weighting surface** and contour • Spatial Analyst > Interpolate to Raster > Inverse Distance Weighted • Make sure you have set the Output cell size to 50.

**Contouring** • create a contour based on the surface from IDW • Spatial Analyst | Surface Analysis | Contour

**IDW and Contouring**

**Problem - solution** • Unsampled point may have a higher data value than all other controlled points but not attainable due to the nature of weighted average: an average of values cannot be lesser or greater than any input values - solution: • Fit a trend surface to a set of control points surrounding an unsampled point • Insert X and Y coordinates for the unsampled point into the trend surface equation to estimate a value at that point

**Splines** • draughtsmen used flexible rulers to trace the curves by eye. The flexible rulers were called “splines” - mathematical equivalents - localized • piece-wise polynomial function p(x) is

**Spline - math functions** • piece-wise polynomial function p(x) is • p(x)=pi(x) xi<x<xi+1 • pj(xi)=pj(xi) j=0,1,,,, • i=1,2,,,,,,k-1 i+1 x1 xk+1 x0 xk break points

**Spline** • r is used to denote the constraints on the spline (the functions pi(x) are polynomials of degree m or less • r = 0 - no constraints on function

**Exercise: create surface from spline** • have point data theme activated • Spatial Analyst | Interpolate to Raster | Spline • Define the output area and other parameters • Select “Zn” for Z Value Field and “regularized” as type and “50” for Output cell size.

**Kriging** • comes from Daniel Krige, who developed the method for geological mining applications • Rather than considering distances to control points independently of one another, kriging considers the spatial autocorrelation in the data

**Semivariance ()** 20 Z1 Z2 Z3 Z4 Z5 10 20 30 35 40 50 10 20 30 40 50 Zi = values of the attribute at control points h=multiple of the distance between control points n=number of sample points

**Semivariance** h=1, h=2 h=3 h=4 21.88 91.67 156.25 312.50 (Z1-Z1+h)2 100 25 25 25 175 8 225 100 100 425 6 400 225 625 4 625 625 2 (Z2-Z2+h)2 (Z3-Z3+h)2 (Z4-Z4+h)2 sum 2(n-h)

**semivariance** • the semivariance increases as h increases : distance increases -> semivariance increases • nearby points to be more similar than distant geographical data

**data no longer similar to nearby values** sill range h

**kriging computations** • we use 3 points to estimate a grid point • again, we use weighted average =w1Z1 + w2Z2+w3Z3 = estimated value at a grid point Z1,Z2 and Z3 = data values at the control points w1,w2, and w3 = weighs associated with each control point

**In kriging the weighs (wi) are chosen to minimize the** difference between the estimated value at a grid point and the true (or actual) value at that grid point. • The solution is achieved by solving for the wi in the following simultaneous equations • w1(h11) + w2(h12) + w3(h13) = (h1g) w1(h12) + w2(h22) + w3(h23) = (h2g) w1(h13) + w2(h32) + w3(h33) = (h3g)

**w1(h11) + w2(h12) + w3(h13) = (h1g) ** w1(h12) + w2(h22) + w3(h23) = (h2g) w1(h13) + w2(h32) + w3(h33) = (h3g) Where (hij)=semivariance associated with distance bet/w control points i and j. (hig) =the semivariance associated with the distance bet/w ith control point and a grid point. • Difference to IDW which only consider distance bet/w the grid point and control points, kriging take into account the variance between control points too.

**Example** distance 1 2 3 g Z1(1,4)=50 1 2 3 g 0 3.16 2.24 2.24 Z3(3,3)=25 0 2.24 1.00 0 1.41 Zg(2,2)=? 0 Z2(2,1)=40 w10.00+w231.6+w322.4=22.4 w131.6+w20.00+w322.4=10.0 w122.4+w222.4+w30.00=14.1 =10h h

** =0.15(50)+0.55(40) + 0.30(25)** = 37

**Homework 6 – due next Friday midnight (11/2/07)** • See website