Spatial Modeling and Analysis
Deana D. Pennington, PhD, University of New Mexico

What is spatial analysis?
Analyses where the data are spatially located and explicit consideration is given to the possible importance of their spatial arrangement in the analysis.

Statistical Issues
Valid statistics depend on:
BUT Ecology & Earth Science violate all of these!
We study:
Tobler’s Law: All things are related, but nearby things are more related than distant things
***Field samples tend to be taken from nearby locations, and are almost always spatially autocorrelated***
Non-independent observations: autocorrelated observations partly duplicate one another in the sample set, so the sample carries less information than the same number of independent observations. This affects the mean, variance, confidence intervals, and significance tests.
Hyperspectral Example
[Figure: true-color and false-color images of the 6 km² study area]
300 x 300 pixels; 192 training pixels out of 90,000 total pixels; 7 mislabeled
* low % samples
* errors in samples

Training pixels per class (total 192):
Class         Pixels
Clouds        23
Roads         33
River         23
Barren        22
Riparian      28
Agriculture   38
Arid upland   25
[Figure: K-means unsupervised classification, 10 classes. Cluster labels: River/agriculture, Riparian, Clouds/barren, Arid upland, Semiarid upland]
[Figure legend for the supervised classifications: Unclassified, Clouds, River, Riparian, Arid upland, Roads, Barren, Agriculture]
Classifier               Accuracy
Maximum Likelihood       89.44%
Naïve Bayesian           83.33%
Parallelepiped           82.78%
Support Vector Machine   77.22%
Minimum Distance         69.44%
Components of Spatial Analysis
Exploratory Spatial Data Analysis (ESDA): finding interesting patterns.
Visualization: showing interesting patterns.
Spatial Modeling: explaining interesting patterns.
Things to consider:
Biological survey where each point denotes the observation of an endangered species. If a pattern exists, as in this diagram, we may be able to analyze behavior in terms of environmental characteristics.
Make maps from points
  Distance interpolation
  Kriging
  Trend surface analysis
  Spline
Network analysis
  Path analysis
  Allocation
  Connectivity
Test models with space as causal factor
  Mantel test
  Mantel correlogram
  Multivariate analysis
Describe spatial structure
  Context: adjacency measures, cross variogram, cross correlogram
  Gradient, periodic
    Single scale of pattern: semivariogram, correlogram
    Multiscale pattern: spectral analysis
  Point pattern analyses
    Single scale of pattern: quadrat analysis, nearest neighbor
    Multiscale pattern: refined nearest neighbor, 2nd-order analysis, Ripley’s K
  Self-similarity: fractal dimension
  Edge: wavelet analysis
Statistical tests for significant patterns in data, compared with the null hypothesis of random spatial pattern
The standard against which spatial point patterns are compared is a Completely Spatially Random (CSR) point process. A Poisson probability distribution (mean = variance) is used to generate spatially random points.
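A CSR pattern can be simulated to see the mean = variance property in quadrat counts. The sketch below is illustrative only: the point count, the unit-square study area, the 20 x 20 quadrat grid, and the function name `quadrat_counts` are all assumptions, not values from the slides.

```python
import numpy as np

def quadrat_counts(x, y, n_side, extent=1.0):
    """Count points in an n_side x n_side grid of square quadrats."""
    counts, _, _ = np.histogram2d(
        x, y, bins=n_side, range=[[0.0, extent], [0.0, extent]]
    )
    return counts

# Hypothetical CSR pattern: independent uniform x and y coordinates.
rng = np.random.default_rng(seed=0)
n_points = 4000
x = rng.uniform(0.0, 1.0, n_points)
y = rng.uniform(0.0, 1.0, n_points)

counts = quadrat_counts(x, y, n_side=20)   # 400 quadrats
mean_count = counts.mean()                 # n_points / number of quadrats
var_count = counts.var(ddof=1)
vmr = var_count / mean_count               # variance-to-mean ratio, near 1 under CSR
```

A ratio well below 1 would point toward a uniform pattern and a ratio well above 1 toward clustering.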
[Figure: histogram of number of cells against number of points per cell, showing the expected CSR distribution (the null hypothesis) alongside a uniform pattern]
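Quadrat Analysis
Expected mean number of points per cell under CSR: $\lambda = N / (\text{number of quadrats})$
For a Poisson distribution:
$$p(x) = \frac{e^{-\lambda}\,\lambda^{x}}{x!}$$
Chi-square:
$$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$$

x (points per cell)   O_i   P(x)     E_i    E_i pooled (x = 0 to 2)   (O - E)^2 / E
0                     2     0.0156   0.39
1                     2     0.0649   1.62   5.39                      2.42
2                     5     0.1350   3.38
3                     1     0.1873   4.68
…
                                                                      Sum = $\chi^2$
Check the chi-square table.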
If H0 is rejected, mean ≠ variance:
Mean > variance (uniform)
Mean < variance (clustered)
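The quadrat table can be checked with a few lines of arithmetic. This is a sketch under stated assumptions: `lam = 4.16` and `n_quadrats = 25` do not appear on the slide; they are back-calculated from the tabulated P(x) and E_i values, and only the four table rows that are shown are used.

```python
import math

def poisson_pmf(x, lam):
    """P(x) = exp(-lam) * lam**x / x! for a Poisson distribution."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

# Assumed inputs, back-calculated from the table: P(0) = 0.0156 implies
# lam of about 4.16, and E_i / P(x) implies 25 quadrats.
lam = 4.16
n_quadrats = 25
observed = {0: 2, 1: 2, 2: 5, 3: 1}   # O_i for the rows shown (x = 0..3)

prob = {x: poisson_pmf(x, lam) for x in observed}
expected = {x: n_quadrats * p for x, p in prob.items()}

# Classes 0-2 have small expected counts, so the table pools them into one
# class before computing their chi-square contribution.
pooled_obs = sum(observed[x] for x in (0, 1, 2))
pooled_exp = sum(expected[x] for x in (0, 1, 2))
pooled_term = (pooled_obs - pooled_exp) ** 2 / pooled_exp
```

With these inputs the pooled expected count and its chi-square term agree with the 5.39 and 2.42 shown in the table. The full statistic would sum such terms over all classes, which the slide elides.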
Nearest Neighbor
Look up significance in a z-statistic table.
If H0 is rejected:
observed mean < expected and Z < 0 => clustered
observed mean > expected and Z > 0 => uniform
[Figure: clustered point pattern]
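The comparison of observed and expected mean distance can be sketched with the Clark-Evans form of the nearest-neighbor test. The slides do not give the formulas, so the CSR expectation `0.5 / sqrt(density)` and standard error `0.26136 / sqrt(n * density)` used here are the standard textbook values, swapped in as an assumption; no edge correction is applied, and both example patterns are hypothetical.

```python
import numpy as np

def nearest_neighbor_z(points, area):
    """Z statistic for mean nearest-neighbor distance (no edge correction)."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    # Pairwise distances; the diagonal is masked so a point is not its own neighbor.
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)
    observed_mean = dist.min(axis=1).mean()

    density = n / area
    expected_mean = 0.5 / np.sqrt(density)       # expectation under CSR
    std_error = 0.26136 / np.sqrt(n * density)   # standard error under CSR
    z = (observed_mean - expected_mean) / std_error
    return observed_mean, expected_mean, z

# Uniform example: a regular 10 x 10 lattice in a 10 x 10 study area.
gx, gy = np.meshgrid(np.arange(10) + 0.5, np.arange(10) + 0.5)
lattice = np.column_stack([gx.ravel(), gy.ravel()])

# Clustered example: the same lattice shrunk into one corner of the area.
cluster = lattice * 0.05

obs_u, exp_u, z_uniform = nearest_neighbor_z(lattice, area=100.0)
obs_c, exp_c, z_cluster = nearest_neighbor_z(cluster, area=100.0)
```

The lattice gives an observed mean larger than expected (Z > 0, uniform); the shrunken copy gives a smaller one (Z < 0, clustered).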
Ripley’s K
[Figure: L(d) plotted against radius d, compared with the expected CSR mean]
*** Note the added information: mean clustering distance
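A naive version of the L(d) curve can be sketched as follows. The estimator used here, K(d) = A * (number of ordered pairs within d) / (n(n-1)) with L(d) = sqrt(K(d)/pi) - d, is one common convention and an assumption on my part; the slides do not state which form they use, and no edge correction is applied. The two-cluster pattern is hypothetical.

```python
import numpy as np

def ripley_l(points, area, radii):
    """Naive L(d) = sqrt(K(d) / pi) - d, without edge correction."""
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)        # exclude each point from its own count
    radii = np.asarray(radii, dtype=float)
    # Number of ordered pairs (i, j) with distance <= d, for every radius d.
    pair_counts = (dist[None, :, :] <= radii[:, None, None]).sum(axis=(1, 2))
    k = area * pair_counts / (n * (n - 1))
    return np.sqrt(k / np.pi) - radii     # near 0 under CSR, above 0 if clustered

# Hypothetical pattern: two tight 5 x 5 clusters in a 10 x 10 study area.
offsets = np.column_stack(
    [np.repeat(np.arange(5), 5), np.tile(np.arange(5), 5)]
) * 0.02
clusters = np.vstack([offsets + [2.0, 2.0], offsets + [8.0, 8.0]])

radii = np.array([0.25, 0.5, 1.0])
l_values = ripley_l(clusters, area=100.0, radii=radii)
```

Because every point has many neighbors at short radii, L(d) sits well above the CSR expectation of zero for this pattern.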
Species B habitat
Range Species A = 4 cells
Species A depends on B
Grid Analysis: Focal Analysis
Spatial filters: the output value for each cell is calculated from neighboring cells (moving windows).
Neighborhood shapes:
Majority
Maximum
Mean
Median
Minimum
Range
Standard deviation
Sum
Variety
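A moving-window filter of this kind can be sketched with NumPy. The square 3 x 3 neighborhood, the edge handling (border values repeated), and the function name `focal_statistic` are assumptions made for illustration; a GIS package may treat edges and NoData cells differently.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def focal_statistic(grid, size=3, stat=np.mean):
    """Apply `stat` over a size x size moving window centered on each cell.

    Edge cells are handled by repeating the border values (an assumption).
    """
    pad = size // 2
    padded = np.pad(np.asarray(grid, dtype=float), pad, mode="edge")
    # windows has shape (rows, cols, size, size): one window per output cell.
    windows = sliding_window_view(padded, (size, size))
    return stat(windows, axis=(-2, -1))

grid = np.arange(9).reshape(3, 3)     # hypothetical 3 x 3 input grid
focal_mean = focal_statistic(grid, stat=np.mean)
focal_max = focal_statistic(grid, stat=np.max)
focal_min = focal_statistic(grid, stat=np.min)
focal_range = focal_max - focal_min
```

Other statistics from the list (median, standard deviation, sum) follow by passing `np.median`, `np.std`, or `np.sum` as `stat`.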
Area
Centroid
Geometry
Perimeter
Majority
Maximum
Mean
Median
Minimum
Range
Statistics
Standard deviation
Sum
Thickness
Variety
Vegetation class A
or land use A
Vegetation class B
or land use B
Vegetation class C
or land use C
Output is:
a) grid with same value in each cell for a given zone
b) table with values by zone
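The two output forms, a grid and a table, can be sketched for a zonal mean. The zone labels, the values, and the function name `zonal_mean` are hypothetical; the sketch assumes the zone grid and the value grid have the same shape and alignment.

```python
import numpy as np

def zonal_mean(zones, values):
    """Return (table, grid): the mean per zone, and that mean written to every cell of the zone."""
    zones = np.asarray(zones)
    values = np.asarray(values, dtype=float)
    zone_ids, inverse = np.unique(zones.ravel(), return_inverse=True)
    inverse = inverse.ravel()
    sums = np.bincount(inverse, weights=values.ravel())
    counts = np.bincount(inverse)
    means = sums / counts
    table = dict(zip(zone_ids.tolist(), means.tolist()))   # output b) table by zone
    grid = means[inverse].reshape(zones.shape)             # output a) grid, one value per zone
    return table, grid

# Hypothetical zones (vegetation class or land use A, B, C) and cell values.
zones = [["A", "A", "B"],
         ["A", "C", "B"],
         ["C", "C", "B"]]
values = [[1, 2, 10],
          [3, 7, 20],
          [8, 9, 30]]

table, zone_grid = zonal_mean(zones, values)
```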
Parametric Stats (w.r.t. study extent)
  Univariate (x): mean, variance
  Multivariate (x, y): correlation, covariance
Spatial Stats (h = lag, in time or space)
  Univariate (x, h): semivariance, lag correlation, lag covariance
  Multivariate (x, y, h): cross-semivariance (variogram), cross correlation, cross covariance (correlogram); the variogram and correlogram are inverse
Variance:
$$s^2 = \frac{1}{N}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2$$
Semivariance:
$$\gamma_h = \frac{1}{2N_h}\sum_{i=1}^{N_h}\left(x_i - x_{i+h}\right)^2$$
where $x_i$ and $x_{i+h}$ are the values in two cells separated by lag $h$, and $N_h$ is the number of such pairs (windows).
Number of cells N = 10
Number of windows $N_h$ = (number of cells) - h
h = 1: $N_h$ = 9
h = 5: $N_h$ = 5
Limit h to 1/3 of the study extent.
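The semivariance formula and the window count can be sketched for a one-dimensional transect. The 10-cell transect values are hypothetical (a simple gradient), and the function name `semivariance` is an illustrative choice.

```python
import numpy as np

def semivariance(x, h):
    """Return (gamma_h, N_h) with gamma_h = sum((x_i - x_{i+h})**2) / (2 * N_h)."""
    x = np.asarray(x, dtype=float)
    n_h = len(x) - h                 # number of windows: cells minus lag
    diffs = x[:n_h] - x[h:]          # x_i - x_{i+h} for every pair at lag h
    return (diffs ** 2).sum() / (2 * n_h), n_h

transect = np.arange(1, 11)          # hypothetical 10-cell transect: 1, 2, ..., 10
max_lag = len(transect) // 3         # limit h to 1/3 of the extent
variogram = {h: semivariance(transect, h) for h in range(1, max_lag + 1)}
```

For this gradient the semivariance keeps growing with lag (0.5, 2.0, 4.5), so no sill is reached.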
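[Figure: semivariogram of $\gamma_h$ against lag h. The curve starts at the nugget at h = 0, rises through the zone of spatial dependence, and levels off at the sill; the lag where it levels off is the range, beyond which observations are independent]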
Semivariogram
If $x_i$ is similar to $x_{i+h}$, $\gamma_h$ is small, and they are spatially correlated.
If $x_i$ is not similar to $x_{i+h}$, $\gamma_h$ is large, and they are not spatially correlated.
=> $\gamma_h$ measures heterogeneity.
Nugget: value of $\gamma_h$ at distance 0 (not in data); a measure of unexplained variability.
Range: distance h at which $\gamma_h$ levels off. Below the range, heterogeneity increases in a predictable manner; above the range, heterogeneity is constant. A measure of independence.
Sill: measure of maximum heterogeneity in the data ($\gamma_{max}$).
[Figure: two semivariograms of $\gamma_h$ against h. Left: periodic, cyclic pattern. Right: gradient, with no sill or range]
Examples: timber harvest, forest age
range ~ harvest area
sill ~ rotation
Lag Covariance: Geary’s C
Centered around mean values of x (local mean).
$$C_h = \frac{1}{N_h}\sum_{i=1}^{N_h}\left(x_i - x_{i-h}\right)\left(x_i - x_{i+h}\right)$$
where $x_{i-h}$ and $x_{i+h}$ are the values at lag h on either side of $x_i$.
If $x_i$, $x_{i+h}$ and $x_{i-h}$ are all the same, $C_h = 0$.
If values are increasing or decreasing through space ($x_{i-h} < x_i < x_{i+h}$, or $x_{i-h} > x_i > x_{i+h}$), one term is negative and $C_h$ is negative: things are not similar. Otherwise $C_h$ is positive: things are similar.
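The sign behavior described above can be checked numerically. The sketch follows the slide's formula as written, not the more common textbook definition of lag covariance; it assumes the sum runs over cells that have a neighbor at lag h on both sides, so the number of terms is N - 2h. The function name and the three test series are hypothetical.

```python
import numpy as np

def lag_covariance(x, h):
    """Mean of (x_i - x_{i-h}) * (x_i - x_{i+h}) over cells with both neighbors."""
    x = np.asarray(x, dtype=float)
    center = x[h:-h]          # x_i
    before = x[:-2 * h]       # x_{i-h}
    after = x[2 * h:]         # x_{i+h}
    return ((center - before) * (center - after)).mean()

c_constant = lag_covariance(np.full(10, 5.0), h=1)             # all values equal
c_gradient = lag_covariance(np.arange(10), h=1)                # steadily increasing
c_alternating = lag_covariance(np.array([0, 1, 0, 1, 0, 1]), h=1)
```

The constant series gives 0 and the gradient gives a negative value, matching the two cases described; the alternating series gives a positive value.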
Correlograms have the inverse shape of semivariograms
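Lag Correlation: Moran’s I
Centered around mean values of x; standardized against sample variation.
Lag covariance:
$$C_h = \frac{1}{N_h}\sum_{i=1}^{N_h}\left(x_i - x_{i-h}\right)\left(x_i - x_{i+h}\right)$$
Lag correlation:
$$P_h = \frac{C_h}{s_{x-h}\, s_{x+h}}$$
[Figure: $\gamma_h$ plotted against h]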
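Comparison (correlated vs. independent)
Statistic                       Symbol        Bounds
Semivariance                    $\gamma_h$    $0 < \gamma_h < \infty$
Lag covariance (Geary’s C)      $C_h$         $-\infty < C_h < \infty$
Lag correlation (Moran’s I)     $P_h$         $-1 < P_h < +1$
[Figure: $C_h$ and $P_h$ plotted against lag h, with +1, zero, and -1 marked on the vertical axis; the curves share a similar range]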
Surfaces do not have to be elevation; they could be population density, species richness, or any other measured attribute.
Trend analysis
Spline
Given geolocated point data, calculate values at regular intervals between points
Inverse distance weighting
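Inverse distance weighting can be sketched as a weighted average in which each sample's weight is 1 / distance^power. The power of 2, the use of all samples for every estimate (no search radius), and the sample coordinates and values are assumptions for illustration.

```python
import numpy as np

def idw(sample_xy, sample_values, query_xy, power=2.0):
    """Inverse distance weighted estimate at each query location."""
    sample_xy = np.asarray(sample_xy, dtype=float)
    sample_values = np.asarray(sample_values, dtype=float)
    query_xy = np.asarray(query_xy, dtype=float)
    diff = query_xy[:, None, :] - sample_xy[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))           # shape: queries x samples
    estimates = np.empty(len(query_xy))
    for q, d in enumerate(dist):
        exact = d == 0.0
        if exact.any():
            estimates[q] = sample_values[exact][0]    # query sits on a sample point
        else:
            weights = 1.0 / d ** power
            estimates[q] = (weights * sample_values).sum() / weights.sum()
    return estimates

# Hypothetical geolocated samples and a regular line of output locations.
samples = [(0.0, 0.0), (2.0, 0.0)]
measured = [10.0, 20.0]
regular_points = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (2.0, 0.0)]
surface = idw(samples, measured, regular_points)
```

Estimates reproduce the measured value at a sample location and move toward the nearer sample in between.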
[Figure: data-to-model workflow diagram with the labels Gauge Points, Field Data (Vector), DEM, Watershed, Hydro Model, Samples, Grid, Process, Land Cover, Statistics, Soil, and Modeling (regression, et al.)]
Example: Integrating Species Occurrence Points and Images
Access file:
Sample 1, lat, long, species, presence
Sample 2, lat, long, species, presence
Sample 3, lat, long, species, absence
Layers: vegetation cover type, elevation (m), mean annual temperature (C)
Integrated data:
P, juniper, 2200m, 16C
P, pinyon, 2320m, 14C
A, creosote, 1535m, 22C
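The integration step amounts to looking up the grid cell under each occurrence point in every layer. The sketch below assumes north-up grids that share one origin and cell size; the coordinates, the 1 x 3 grid layout, and the function name `sample_grid` are hypothetical, while the cell values are taken from the integrated-data rows of the example.

```python
import numpy as np

def sample_grid(grid, x, y, x_min, y_max, cell_size):
    """Return the grid value under each (x, y) point; row 0 is the northern edge."""
    cols = np.floor((np.asarray(x) - x_min) / cell_size).astype(int)
    rows = np.floor((y_max - np.asarray(y)) / cell_size).astype(int)
    return np.asarray(grid)[rows, cols]

# Hypothetical 1 x 3 layers sharing the same origin and cell size.
vegetation = [["juniper", "pinyon", "creosote"]]
elevation_m = [[2200, 2320, 1535]]
mean_temp_c = [[16, 14, 22]]

# Hypothetical sample locations (x = long, y = lat in grid units) and records.
xs = [0.5, 1.5, 2.5]
ys = [0.5, 0.5, 0.5]
presence = ["P", "P", "A"]

geo = dict(x_min=0.0, y_max=1.0, cell_size=1.0)
records = list(zip(
    presence,
    sample_grid(vegetation, xs, ys, **geo).tolist(),
    sample_grid(elevation_m, xs, ys, **geo).tolist(),
    sample_grid(mean_temp_c, xs, ys, **geo).tolist(),
))
```

Each record then carries presence or absence together with the environmental values at that location, ready for a statistical or niche model.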
Geographic patterns of species richness of 17 native rodent species. Sanchez-Cordero and Martinez-Meyer, 2000
ENM Results
Model building and testing: a) training data; b) predictive model. Peterson, Ball and Cohoon, 2002