Spatial Interpolation III ENVE 424/524
Spatial Interpolation We can estimate the value of a variable at a location based on the values of that variable at neighboring locations. The trick is defining the weight spatial estimation method zi is the estimated value at location i n is the number of data points zj is the value at data point j wij is the weight assigned to data point j continuous surface of estimates (map) point monitoring data
Kriging Steps • Describe the spatial variation in the sample data • Summarize the spatial variation by a mathematical function • Use this model to determine the interpolation weights
Variations on Kriging Simple Uses a known (at least assumed), constant mean across domain. Usually used by subtracting point data values from mean, interpolating the residuals, and then adding back to mean. Universal (kriging with a trend) Uses a modeled trend across domain. The trend is subtracted from the point data values and the residuals interpolated. Indicator kriging Point data are transformed to indicator variables (usually binary). For example, a threshold value is defined and data points that are below the threshold are assigned a value of 1 and those above the threshold are assigned a value of 0. The indicator values are interpolated and the resulting surface shows the probabilities (0-1) of exceeding (or being below) the threshold. Probability Similar to indicator kriging but also incorporates the difference between a data pint value and the defined threshold. Using this “proximity” to threshold information can result in more accurate probabilities.
Block Kriging Used when estimates are required not at points but an average value is required within a prescribed local area. The estimate from block kriging using four points (Figure a) is equal to averaging the estimates from kriging at each of the four points individually.
Kriging with secondary information Kriging can incorporate secondary information to aid the estimation process when the primary variable is sparsely sampled. Secondary information that is physically related to the primary variable and measured at a high spatial density than the primary variable can aid in “filling in” spatial gaps during interpolation. (hard information) (soft information) Examples: Primary Secondary rainfall amount digital elevation model particulate matter concentrations visibility measurements cadmium concentrations zinc concentrations seasonal rainfall (1970-1980) seasonal rainfall (1990-2000)
Kriging with an external drift (KED) A secondary variable related to the primary variable can be used to model the "drift", or trend, of the primary variable in the kriging estimation. In areas without primary sampling sites, the secondary variable aids the estimate through a function relating the secondary to primary variable. As the estimation location gets farther from available primary samples, the estimate approaches the drift value. The secondary variable must vary smoothly in space and must be known at all estimation locations as well as all primary sampling locations. The function relating the secondary and primary variable must be linear. & Ozone measurements in Paris Air quality model grid Ozone concentrations using KED
Cokriging Cokriging is an extension of ordinary kriging that uses a high spatial resolution secondary variable network to improve the estimation of the primary variable. Like ordinary kriging, a variogram model is developed for the primary variable. A second variogram model is developed for the secondary variable A third variogram model is generated from the cross correlation of the primary and secondary variables. The secondary data is "transformed" to the scale of the primary variable in order to derive a cross variogram. The secondary data are then treated identically to the primary data during spatial estimation. All U and V points are paired, binned, and semi-variance calculated U = Secondary Data V = Primary Data V Variogram UV Cross-Variogram U Variogram
Cokriging Example Cokriging does not improve estimates much if -- all the variogram models are very similar in shape -- the secondary information is does not have a higher density than the primary information
Kriging Error Factors influencing interpolation error: • Number of nearby samples • Proximity of nearby samples • Spatial arrangement of samples • Nature of phenomenon being estimated
Cross Validation Cross Validation, a “jackknife technique”, provides a measure of interpolation error Remove site from network (red circle) Obtain estimate at removed site location. Interpolation Error = Estimated Value – Measured Value
Updates ArcGIS now available for class use on the computer in Urbauer 319A Plan is to have it available on two other machines in Cupples II first floor. Reading forerror and uncertainty lectures Focus on Chapters 6 and 15 in Geographic Information Systems and Science, Longley et. Al Optional: Chapter 9 in Principles of Geographical Information Systems, Burrough and McDonnell
Review for Exam Exam will cover material from lecture 1 through today’s lecture Exam questions will be based on problem sets, lecture slides, lab exercises, and the reading. The exam questions will be a mix of: -- multiple choice -- match -- short answer -- equation-based calculations
Types of Attributes A common approach to classifying attributes is based on their level of measurement • Nominal– Simply identifies or labels an entity so that it can be distinguished from another. e.g. sensor ID, building name (Lopata House vs. Lopata Hall) • Cannot be manipulated using mathematical operations. However, frequency distributions are meaningful. • Ordinal – Values based on an order or ranking, e.g. agricultural potential classes • Cannot be manipulated using mathematical operations. However, frequency distributions are meaningful. • Interval – Differences between entities are defined using fixed equal units, e.g. Celsius temperature • Can be manipulated using addition and subtraction • Ratio - Differences between entities can be defined using ratios, e.g. distance • Can be manipulated using multiplication and division • Cyclic - differences between entities depend on direction, e.g. wind direction
Data Types Entities and fields can be transformed to the other type Vectors compared to rasters
Polygon Analysis Methods • Filtering • Retrieves a subset of a dataset • Examples • Query (search) • Aggregation • Combines attributes or features within data sources (layers) • Examples • Reclassify, dissolve • Integration • Combine two or more data sources (layers) • Example • Polygon Overlay (intersection, union, clip)
Slope Slope is the change is elevation (rise) with a change in horizontal position (run). It provides a measure of the steepest decent between a cell and its neighbors. Slope is often reported in degrees (0° is flat, 90° is vertical) but is also expressed as a percent
Friction Surface © Paul Bolstad, GIS Fundamentals
Coordinate Systems A geographical coordinate system uses a three-dimensional spherical surface to define locations on the earth. Divides space into orderly structure of locations. Two types: cartesian and angular (spherical) © Paul Bolstad, GIS Fundamentals
Projection Types • Equal area – the ratio of areas on the earth and on the map are constant. Shape, angle, and scale are distorted. • Conformal – the shape of any small surface of the map is preserved in its original form. If meridians and parallel lines are at 90-degree angles, then angles are also preserved. • Equidistant - preserve distances between certain points. Scale is not maintained correctly, however, typically one or more lines has its scale maintained. Three general types of projections:
Conceptualizing Projections Projections can be conceptualized by the placement of paper on or around a globe. The “location” of the paper is a factor in defining the projection. The paper can be either tangent or secant.
F(d) G and F Function G reflects how close together events are within a region while F indicates how distant events are from arbitrary locations G(d) G(d) F(d)
K Function Nearest neighbor is limiting because only nearest distances are considered The K function uses all distances between events and provides a measure of spatial dependence over a wider range of scales. • Steps: • Center circles of radius h on each event, C(si,h) • Count the number of other events inside each circle • Calculate the mean count for all events • Divide the mean count by the overall area event density,
Kernel Windows What about the edges? © Paul Bolstad, GIS Fundamentals
Key Factors in Spatial Interpolation Search Radius - sets the maximum distance at which a monitoring site is to be included for estimation. Maximum Number of Stations - creates a flexible radius of influence. The radius adjusts itself to the number of available stations. Minimum Number of Stations - prevents estimates from being made in areas with an inadequate number of stations. Weighting Factor – determines the relative influence of each monitoring site in deriving an estimate
Declustering Methods Cell-declustering of ozone monitoring sites in Missouri. The numbers in the corners of the grid cells indicate the number of stations within the grid cell. The weight assigned to stations within the grid cell is the inverse of the number of stations. The polygon separation method uses an “area of influence.” Sites within clusters have small areas of influence while remote sites have large areas of influence.
Variogram Attributes Sill: the part of the fitted function that levels off indicating that there is no spatial dependence between data points because all estimates of variances of differences will be invariant with sample separation distance Range is the distance at which the function reaches the sill. If the range is large, then long-range variation occurs over short distances. If the range is small, then major variations occur over short distances. Nugget is the local variance attributable to measurement error or random error
ArcGIS Main Components ArcCatalog ArcToolbox ArcMap