Geostatistic Analysis Topic 5 Starting 2/27/2007
Purposes • Estimating or interpolating values at unsampled sites within the area covered by existing observations • Visiting every location is usually difficult or expensive. • Assumption: spatially distributed objects are spatially correlated. in other words, things are close together tend to have similar characteristics (first law of geography, also called spatial autocorrelation).
example • Few sample points – to fill all cells
Global interpolation • global interpolators determine a single function which is mapped across the whole region • a change in one input value affects the entire map • global algorithms tend to produce smoother surfaces with less abrupt changes are used when there is an hypothesis about the form of the surface, e.g. a trend
Local interpolation • Local interpolators apply an algorithm repeatedly to a small portion of the total set of points. On average, values at points closer in space are more likely to be similar than point further apart (spatial autocorrelation) • A change in an input value only affects the result within the window • Two important steps for local interpolation • Define sampling neighborhood • Find points (samples) in the neighborhood
If no directional influence • If there are no directional influences in the data, you want to give equal weight to sample points regardless of their direction from the prediction location. This means that you probably want your neighborhood to be a circle
If directional influence • if there is directional influence in your data (such as might be caused by water draining downhill), then you may want to make an ellipse with the major axis running uphill/downhill
Sample points • Once a shape is specified, you can restrict which sample points within the neighborhood are used. You do this by specifying the maximum and minimum numbers of points to use and by dividing the neighborhood into sectors. If the neighborhood is sectored, then the maximum and minimum constraints are applied to each part
1. Explore data before interpolation • Before creating a surface, explore data (ED) tool enables you to gain a deeper understanding of the phenomena you are investigating so that you can make better decisions on issues relating to your data. • Exploring the distribution of the data, looking for global and local outliers, looking for global trends, examining spatial autocorrelation, and understanding the covariation among multiple datasets. • Tools includes Histogram, Voronoi Map, Normal QQPlot, Trend Analysis, Semivariogram/Covariance Cloud, General QQPlot, and Crosscovariance Cloud. • Only works on point and polygon layers
1.1 Histogram • provides a univariate (one-variable) description of your data, displays the frequency distribution for the dataset of interest and calculates summary statistics. • measures of location: - mean, median, and quartiles • measures of spread: - standard deviation, variance • measures of shape: - skewness, kurtosis
1.2 QQPlot The quantile-quantile (q-q) plot is a graphical technique for determining if two data sets come from populations with a common distribution. A q-q plot is a plot of the quantiles of the first data set against the quantiles of the second data set. By a quantile, we mean the fraction (or percent) of points below the given value. That is, the 0.3 (or 30%) quantile is the point at which 30% of the data fall below and 70% fall above that value The q-q plot is used to answer the following questions: • Do two data sets come from populations with a common distribution? • Do two data sets have common location and scale? • Do two data sets have similar distributional shapes? • Do two data sets have similar tail behavior?
standard • QQPlots are graphs on which quantiles from two distributions are plotted relative to each other. • a cumulative distribution is produced by ordering the data and producing a graph of the ordered values versus cumulative distribution values Normal QQPlot General QQPlot
1.3 Voronoi map • Voronoi maps are constructed from a series of polygons (thiessen polygon) formed around the location of a sample point. • The value for each polygon can be calculated using any of methods: - simple, mean, mode, cluster, entropy, median standard deviation, IQR (interquartile range)
1.4 Trend analysis • You may be interested in mapping a trend, or you may wish to remove a trend from the dataset before using kriging. The Trend Analysis tool can help identify global trends in the input dataset.
globe trend and anisotropy • globe trend is an overriding process that affects all measurements, can be represented by a mathematical formula (polynomial) • anisotropy is a random process that shows higher autocorrelation in one direction than in another (directional autocorrelation or directional influence). the reason for directional influence may not be known, but they can be statistically quantified.
1.5 Semivariogram/covariance cloud • The semivariogram/covariance cloud shows the empirical semivariogram (half of the difference squared) and covariance for all pairs of locations within a dataset and plots them as a function of the distance between the two locations. • the empirical semivariogram for the (i,j)th pair is simply 0.5*(z(si)-z(sj))2, and the empirical covariance is the cross-product where is the average of the data. The semivariogram/covariance cloud can be used to examine the local characteristics of spatial autocorrelation within a dataset and look for outliers.
Creating Variography Semivariogram depicts the spatial autocorrelation
Understanding a semivariogram -range, sill, and nugget The distance where the model first flattens out is known as the range The value at which the model attains the range is called the sill The value at which the model intercepts the y-axis is called the nugget
Fitting the semivariogram • Circular, spherical, exponential, Gaussian, and linear.
Making a prediction • prediction based on the semivarioogram model and the measured values that are nearby (using search radius) • fixed search radius: requires a distance and minimum number of points • variable search radius: number of points needs to be specified. You can also specify a maximum distance (radius), that the search radius cannot exceed.
To explore a directional influence in the semivariogram cloud, we use the Search Direction tools • the direction the pointer is facing determiners which pairs of data locations are plotted on the semivariogram. • lag size is the size of a lag distance, used to reduce the larger number of possible combinations. it is the size of the cells in the semivariogram surface. • number of cells is called number of lags, counted as the number of adjacent cells in a straight horizontal or vertical line from the center to the edge of the figure.
1.6 Crosscovariance cloud • The crosscovariance cloud shows the empirical crosscovariance for all pairs of locations between two datasets and plots them as a function of the distance between the two locations. • Let z(si) denote the value at the i th location in dataset 1, and let y(tj) denote the value at the j th location in dataset 2.
2. Inverse Distance Weighted (IDW) • Each sample point has a local influence that diminishes with distance. • Weights the points closer to the processing cell more heavily than those farther away. • Operator controls how weighting is done. • Power. • High power gives more weight to closer points. • Radius type. • Considers how far away to look. • Barrier. • Search can be limited by other polygons or polyline.
IDW Characteristics • Inverse Distance Weighting (IDW) is a quick deterministic interpolator that is exact. There are very few decisions to make regarding model parameters. It can be a good way to take a first look at an interpolated surface. However, there is no assessment of prediction errors, and IDW can produce "bulls eyes" around data locations. There are no assumptions required of the data.
r r r IDW: a local interpolator We usually uses power functions greater than 1. A r = 2 is known as the inverse distance squared weighted interpolation.
3. Global Polynomial interpolator • Global Polynomial (GP) is a quick deterministic global interpolator that is smooth (inexact). There are very few decisions to make regarding model parameters. It is best used for surfaces that change slowly and gradually. However, there is no assessment of prediction errors and it may be too smooth. Locations at the edge of the data can have a large effect on the surface. There are no assumptions required of the data.
GP interpolation fits a polynomial regression to x,y coordinates First order Second order: Third order:
4. Local Polynomial interpolator • Global Polynomial interpolation is the only method in Geostatistical Analyst that does not use a search neighborhood. If you add the idea of a search neighborhood to Global Polynomial interpolation, you get Local Polynomial interpolation; • Local Polynomial (LP) is a moderately quick deterministic interpolator that is smooth (inexact). It is more flexible than the global polynomial method, but there are more parameter decisions. There is no assessment of prediction errors. The method provides prediction surfaces that are comparable to kriging with measurement errors. Local polynomial methods do not allow you to investigate the autocorrelation of the data, making it less flexible and more automatic than kriging. There are no assumptions required of the data.
Local polynomial interpolation creates a surface from many different formulas, each of which is optimized for a neighborhood • The neighborhood shape, maximum and minimum number of points, and sector configuration can be specified. In addition, as with IDW, the sample points in a neighborhood can be weighted by their distance from the prediction location. Thus, local polynomial interpolation produces surfaces that better account for local variation.
5. Radial Basis Functions • Radial Basis Functions (RBF) are moderately quick deterministic interpolators that are exact. They are much more flexible than IDW, but there are more parameter decisions. There is no assessment of prediction errors. The method provides prediction surfaces that are comparable to the exact form of kriging. Radial Basis Functions do not allow you to investigate the autocorrelation of the data, making it less flexible and more automatic than kriging. Radial Basis Functions make no assumptions about the data. • RBF seems like a rubber membrane that is fitted to each of the measured data points while minimizing the total curvature of the surface. Because the surface must pass through each sampled point, radial basis functions are exact interpolators
Radial basis functions (Spline) • Radial basis functions (RBF) methods are a series of exact interpolation techniques, that is, the surface must go through each measured sample value. • There are five different basis functions: thin-plate spline, spline with tension, completely regularized spline, multiquadric function, and inverse multiquadric spline • RBFs are conceptually similar to fitting a rubber membrane through the measured sample values while minimizing the total curvature of the surface. IDW will never predict values above the maximum measured value or below the minimum measured value. Spline can http://www.math.ucla.edu/~baker/java/hoefer/Spline.htm
Wi Bias or error Radial basis functions Wi When running spline in Spatial Analyst weight: - regularized: 0, 0.001, 0.01, 0.1, 0.5 the higher the weight, the smoother the surface - tension: 0, 1, 5, 10 the higher, the coarser number of points: - used in the calculation. the more points. the smoother the surface
6. Kriging • Kriging is a moderately quick interpolator that can be exact or smoothed depending on the measurement error model. It is very flexible and allows you to investigate graphs of spatial autocorrelation. Kriging uses statistical models that allow a variety of map outputs including predictions, prediction standard errors, probability, etc. The flexibility of kriging can require a lot of decision-making. Kriging assumes the data come from a stationary stochastic process (1. constant mean throughout the region, or 2. variance of differences between any two samples is independent of position, but depends on separated distance), , and some methods assume normally-distributed data.
Trend and error • In Kriging, a predicted value depends on two factors: a trend and an additional element of variability. This is an intuitive idea with plenty of analogies in the real world. For instance, if you go from the ocean to the top of a mountain, you have an upward trend in elevation. However, there is likely to be variation on the way—you will go both up and down when crossing valleys, streams, knobs and other features. • In Kriging, the trend part of a prediction is called the trend. The fluctuation part is called spatially-autocorrelated random error. "Error" doesn't mean a mistake—it just means a fluctuation from the trend. Z(s) = μ(s) + ε (s) • Assumption one: ε (s) is zero, (positive errors and negative errors) • Assumption two: the autocorrelation of the error is purely spatial; it depends only on distance and not on any other property (such as position) of a location. Autocorrelation between ε(s1) and ε(s1+h) is the same as the one between ε(s2) and ε(s2+h)
Ordinary Kriging • In many cases, however, there is no trend in the data—or, if there is one, it is weak enough that your predictions are just as good when you ignore it. Assuming that there is no trend in the data is mathematically equivalent to assuming that the data have a constant mean value. • If the mean is a simple constant, such as μ(s) = μ (i.e., no trend) for all locations s, and if µ is unknown (you do not have prior knowledge of the mean value), then this is the model on which Ordinary Kriging is based.
Universal Kriging • Sometimes, there is a trend where the data values change consistently with the spatial coordinates. Mathematically, this is represented as a linear regression equation on the spatial x- and y-coordinates. Trends that vary (do not have a constant mean), and for which the regression coefficients are unknown, form models for Universal Kriging.
Simple Kriging and indicator Kriging • A known constant mean for the entire dataset, then you have the model for Simple Kriging. • look at the left side of the equation, Z(s) = μ(s) + ε(s). You can do some useful math operations on Z(s). For example, suppose you want to predict the probability that Z(s) is above or below some threshold value, such as 0.12 ppm for ozone concentration. You can transform Z(s) to an indicator variable, where it gets the value 0 if Z(s) is below the threshold and 1 if it is above it. This is called Indicator Kriging.
Ordinary Kriging Simple Kriging Universal Kriging Indicator Kriging
7. Cokriging • Cokriging is a moderately quick interpolator that can be exact or smoothed depending on the measurement error model. Cokriging uses multiple datasets and is very flexible, allowing you to investigate graphs of cross-correlation and autocorrelation. Cokriging uses statistical models that allow a variety of map outputs including predictions, prediction standard errors, probability, etc. The flexibility of cokriging requires the most decision-making. Cokriging assumes the data come from a stationary stochastic process, and some methods assume normally-distributed data.
8. definitions of output maps (for Kriging methods) • Prediction maps (interpolated maps) estimate values at locations where measurements have not been taken. • Standard error maps (the square root of the variance of a prediction) show the distribution of prediction error for a surface. Error tends to be highest in places where there is little or no sample data. • Quantile maps show the values with 100 percent probability that the true values are less than the quantile map values. • Probability maps show the odds that the true value at a location is greater than a threshold value. The probability of exceeding a threshold is determined from predicted values, the error distribution, and the specified threshold value
Kriging family and output maps • Transformation and trend removal can help justify assumptions of normality and stationarity