Deana D. Pennington, PhD, University of New Mexico




Deana D. Pennington, PhD

University of New Mexico

Spatial Modeling and Analysis


What is spatial analysis?

Analyses where the data are spatially located and explicit consideration is given to the possible importance of their spatial arrangement in the analysis


Statistical Issues

Valid statistics depend on:

  • Temporal stability and causal transience

  • Unit homogeneity

  • Independence

  • Constant effects

    BUT Ecology & Earth Science violate all of these!

    We study:

  • Change with time (no temporal stability)

  • Legacies, persistence, recovery (no causal transience)

  • Heterogeneity through space and time (no unit homogeneity)

  • Spatial structure (no independence)

  • Differences in response through space/time (non-constant effects)

  • Attributes rather than causal factors, which must be inferred


Issues in Spatial Analysis

  • Error

  • Small sample sizes compared with size of environmental data sets

  • Spatial dependency

  • Spatial heterogeneity

  • Boundary effects

  • Modifiable Areal Unit Problem


Spatial Dependency

Tobler’s Law: All things are related, but nearby things are more related than distant things

***Field samples tend to be taken from nearby locations, and are almost always spatially autocorrelated***

Non-independent observations partly duplicate one another in the sample set, so they carry less information than the same number of independent observations. This affects the mean, variance, confidence intervals and significance tests
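To make the loss of information concrete, here is a minimal sketch of an effective sample size. It assumes a first-order autoregressive (AR(1)) dependence with lag-1 autocorrelation `rho`, which is an illustrative model and not something the slides specify; the function name is hypothetical.

```python
def effective_sample_size(n, rho):
    """Approximate number of independent observations among n autocorrelated ones.

    Assumption (not from the slides): AR(1) dependence with lag-1
    autocorrelation rho, for which n_eff is roughly n * (1 - rho) / (1 + rho).
    """
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    return n * (1.0 - rho) / (1.0 + rho)


# 100 samples with moderate positive autocorrelation behave like about 33
# independent samples when estimating the mean.
n_eff = effective_sample_size(100, 0.5)
print(round(n_eff, 1))
```

Under this assumption, confidence intervals computed with n = 100 would be too narrow, which is how autocorrelation inflates apparent significance.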


Spatial Heterogeneity

Heterogeneity in spatial data

  • Stratification of the landscape (regions, classes, etc) problematic due to gradational nature

  • Intra-strata variability, mixtures

  • Differences in numbers of observations within strata


Hyperspectral Example

[Figure: true-color and false-color images of the study area; scale label 6 km²]

Training pixels by class:

Class          Training pixels
Clouds         23
Roads          33
River          23
Barren         22
Riparian       28
Agriculture    38
Arid upland    25

  • Low % samples: 300 x 300 pixels, 192 training pixels out of 90,000 total pixels

  • Errors in samples: 7 mislabeled


Hyperspectral Results

[Figure: K-means unsupervised classification, 10 classes. Class labels: River/agriculture; Riparian (4 classes); Clouds/barren; Arid upland (2 classes); Semi-arid upland (2 classes)]

  • Confusion between river & agriculture

  • Confusion between clouds and barren

  • Unsampled semi-arid upland

  • Mislabeled arid upland

  • Unsampled variability in riparian

  • Road variability

[Figure: classification maps for K-means (unsupervised) and the supervised classifiers listed below. Legend: Unclassified, Clouds, River, Riparian, Arid upland, Roads, Barren, Agriculture]

  • Confusion between river & agriculture

  • Confusion between clouds and barren

  • Unsampled semi-arid upland

  • Mislabeled arid upland (4.4%)

  • Unsampled variability in riparian

  • Road variability

Classifier                 Accuracy (as reported on the slide)
Maximum Likelihood         89.44%
Naïve Bayesian             83.33%
Parallelepiped             82.78%
Support Vector Machine     77.22%
Minimum Distance           69.44%


Boundary Effects

  • Loss of neighbors in analyses that depend on neighborhood values

  • Solution: collect data along a border outside of the analysis area


Modifiable Areal Unit Problem (MAUP)

  • Results sensitive to cell size, location, orientation
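A small sketch of this sensitivity, using a made-up one-dimensional transect (the data and the `aggregate` helper are hypothetical): the same values are averaged into cells of different size and with a shifted cell origin, and the apparent variability changes each time.

```python
from statistics import mean, pvariance


def aggregate(values, cell_size, offset=0):
    """Average consecutive blocks of `cell_size` values, starting at `offset`."""
    usable = values[offset:]
    n_blocks = len(usable) // cell_size
    return [mean(usable[i * cell_size:(i + 1) * cell_size]) for i in range(n_blocks)]


# Hypothetical transect with patches two cells wide.
transect = [1, 1, 5, 5, 1, 1, 5, 5, 1, 1, 5, 5]

fine = aggregate(transect, 2)               # cells aligned with the patches
coarse = aggregate(transect, 4)             # larger cells average patches away
shifted = aggregate(transect, 2, offset=1)  # same cell size, different location

for name, cells in [("fine", fine), ("coarse", coarse), ("shifted", shifted)]:
    print(name, cells, "variance =", pvariance(cells))
```

The underlying data never change; only the cell size and cell location do, yet the pattern appears strongly patchy in one case and perfectly uniform in the others.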


Components of Spatial Analysis

Exploratory Spatial Data Analysis (ESDA)

Finding interesting patterns.

Visualization

Showing interesting patterns.

Spatial Modeling

Explaining interesting patterns.


Spatial Analyses

Things to consider:

  • Objective: describe, map, causation

  • Data type: binary (Y/N), categorical, continuous

  • Expected pattern: gradient, periodic, clustered

  • Scale of pattern

  • Univariate/multivariate


Spatial Analyses

Biological survey where each point denotes the observation of an endangered species. If a pattern exists, like this diagram, we may be able to analyze behavior in terms of environmental characteristics

  • Quantify pattern

    • Attraction or repulsion

    • Directionality

  • Make inferences about process based on observed pattern


Choices

Make maps from points

  • Distance interpolation

  • Kriging

  • Trend surface analysis

  • Spline

Network Analysis

  • Path analysis

  • Allocation

  • Connectivity

Test models with space as causal factor

  • Mantel test

  • Mantel correlogram

  • Multivariate analysis

Describe spatial structure

  • Point pattern analyses

    • Single scale of pattern: quadrat analysis, nearest neighbor

    • Multiscale pattern: refined nearest neighbor, 2nd order analysis (Ripley’s K)

  • Context: adjacency measures, cross variogram, cross correlogram

  • Gradient, periodic

    • Single scale of pattern: semivariogram, correlogram

    • Multiscale pattern: spectral analysis

  • Self-similarity: fractal dimension

  • Edge: wavelet analysis


Point Pattern Analysis

[Figure: example point patterns: uniform (repulsion) and clustered (attraction)]


Point Pattern Analysis

Statistical tests for significant patterns in data, compared with the null hypothesis of random spatial pattern

The standard against which spatial point patterns are compared is a completely spatially random (CSR) point process. A Poisson probability distribution (mean = variance) is used to generate spatially random points
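As an illustration of the CSR null model, the sketch below scatters points uniformly at random over a rectangle and checks that quadrat counts have a variance close to their mean, as expected for Poisson-like counts. The study area size, point count, grid size, and function names are all hypothetical choices; placing a fixed number of points uniformly is a common stand-in for a Poisson process, not necessarily the procedure used in the lecture.

```python
import random
from statistics import mean, pvariance


def simulate_csr(n_points, width, height, rng):
    """Place n_points independently and uniformly in a width x height rectangle."""
    return [(rng.uniform(0, width), rng.uniform(0, height)) for _ in range(n_points)]


def quadrat_counts(points, width, height, nx, ny):
    """Count points falling in each cell of an nx x ny grid of quadrats."""
    counts = [0] * (nx * ny)
    for x, y in points:
        col = min(int(x / width * nx), nx - 1)   # clamp points on the far edge
        row = min(int(y / height * ny), ny - 1)
        counts[row * nx + col] += 1
    return counts


rng = random.Random(42)  # fixed seed so the run is repeatable
points = simulate_csr(500, 10.0, 10.0, rng)
counts = quadrat_counts(points, 10.0, 10.0, 5, 5)

# Variance-to-mean ratio: near 1 for CSR, above 1 for clustered patterns,
# below 1 for uniform patterns.
vmr = pvariance(counts) / mean(counts)
print("mean per quadrat:", mean(counts), "variance/mean:", round(vmr, 2))
```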


Quadrat Analysis

[Figure: histogram of number of cells against number of points per cell, comparing clustered and uniform patterns with the expected CSR distribution (the null hypothesis)]

  • Divide the area up into quadrats

  • Count the number of points in each quadrat

  • Compare counts with expected counts in random distribution

Expected mean number of points per cell under CSR: $\lambda = N / (\text{number of quadrats})$

For a Poisson distribution: $p(x) = \dfrac{e^{-\lambda} \lambda^{x}}{x!}$

Chi-square: $\chi^2 = \sum_i \dfrac{(O_i - E_i)^2}{E_i}$

x (points per cell)   O_i   P(x)     E_i    Pooled E_i (x = 0 to 2)   Chi-square term
0                     2     0.0156   0.39
1                     2     0.0649   1.62   5.39                      2.42
2                     5     0.1350   3.38
3                     1     0.1873   4.68
…                                                                     Σ = χ²

Check the chi-square table

If H0 is rejected:

Mean <> variance

Mean > variance (uniform)

Mean < variance (clustered)
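The expected counts in the table can be reproduced with a few lines of Python. The values λ ≈ 4.16 and 25 quadrats are not stated on the slide; they are back-calculated here from the P(x) and Ei columns and should be read as assumptions. The pooling of the first three classes is likewise inferred from the 5.39 and 2.42 entries.

```python
import math


def poisson_pmf(x, lam):
    """Poisson probability of observing x points in a cell with mean lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


LAMBDA = 4.16      # assumed: back-calculated from P(0) = 0.0156
N_QUADRATS = 25    # assumed: back-calculated from E_i / P(x)
observed = [2, 2, 5, 1]  # O_i for x = 0, 1, 2, 3, as listed in the table

expected = [N_QUADRATS * poisson_pmf(x, LAMBDA) for x in range(4)]

# Classes with small expected counts are pooled before computing chi-square.
pooled_observed = sum(observed[:3])
pooled_expected = sum(expected[:3])
pooled_term = (pooled_observed - pooled_expected) ** 2 / pooled_expected

for x, e in enumerate(expected):
    print(x, round(poisson_pmf(x, LAMBDA), 4), round(e, 2))
print("pooled E:", round(pooled_expected, 2), "chi-square term:", round(pooled_term, 2))
```

The remaining rows of the table (elided on the slide) would be added in the same way, and the terms summed to give the chi-square statistic.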


Nearest Neighbor Distance

  • Calculate the distance to the nearest neighbor for every point

  • Calculate the mean nearest-neighbor distance $\bar{d}$

  • Calculate the expected mean for a CSR distribution: $E(d_i) = 0.5 \sqrt{A/N}$, where A = area and N = number of points

  • Compare expected mean to observed mean with Z statistic

  • $Z = \dfrac{\bar{d} - E(d_i)}{\sqrt{0.0683\, A / N^2}}$

Look up the significance in a z-statistic table

If H0 is rejected:

observed mean < expected and Z < 0 => clustered

observed mean > expected and Z > 0 => uniform
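The steps above can be sketched as follows. The code uses the Clark-Evans form of the statistic with square roots, E(d) = 0.5·sqrt(A/N) and standard error sqrt(0.0683·A/N²), and applies no edge correction; the two test patterns and the function name are hypothetical.

```python
import math


def nearest_neighbor_z(points, area):
    """Return (observed mean NN distance, expected mean under CSR, Z statistic)."""
    n = len(points)
    nn_distances = []
    for i, (xi, yi) in enumerate(points):
        nn_distances.append(min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points) if j != i
        ))
    observed = sum(nn_distances) / n
    expected = 0.5 * math.sqrt(area / n)
    std_error = math.sqrt(0.0683 * area / n ** 2)
    return observed, expected, (observed - expected) / std_error


# Hypothetical patterns in a 10 x 10 study area (area = 100).
grid = [(i + 0.5, j + 0.5) for i in range(10) for j in range(10)]              # uniform
cluster = [(5 + 0.01 * i, 5 + 0.01 * j) for i in range(10) for j in range(10)]  # clustered

obs_grid, exp_grid, z_grid = nearest_neighbor_z(grid, 100.0)
obs_cluster, exp_cluster, z_cluster = nearest_neighbor_z(cluster, 100.0)
print("uniform:   Z =", round(z_grid, 1))
print("clustered: Z =", round(z_cluster, 1))
```

The evenly spaced grid gives a positive Z (observed mean larger than expected), and the tight cluster gives a negative Z, matching the interpretation rules above.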


Ripley’s K

  • Expand a circle of increasing radius around each point

  • Count the number of points within each circle

  • Calculate L(d), a measure of the expected number of points within distance d: $L(d) = \sqrt{\dfrac{A \sum k_{ij}}{\pi N (N-1)}}$, where A = area and $\sum k_{ij}$ = number of points j within distance d of all i points

  • Test significance with Monte Carlo simulations or a t-test

[Figure: L(d) plotted against radius for uniform and clustered patterns, compared with the expected CSR mean]

Note the added information: mean clustering distance
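A minimal sketch of the L(d) calculation, reading the formula as L(d) = sqrt(A·Σk_ij / (π·N·(N−1))). It applies no edge correction, and the point patterns and function name are hypothetical. Under CSR, L(d) is expected to be close to d, so values above d suggest clustering at that scale and values below d suggest regular spacing.

```python
import math


def ripley_l(points, area, d):
    """L(d) from the count of ordered point pairs (i, j), i != j, within distance d."""
    n = len(points)
    pairs = 0
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            if i != j and math.hypot(xi - xj, yi - yj) <= d:
                pairs += 1
    return math.sqrt(area * pairs / (math.pi * n * (n - 1)))


# Hypothetical patterns in a 10 x 10 study area (area = 100).
grid = [(i + 0.5, j + 0.5) for i in range(10) for j in range(10)]
cluster = [(5 + 0.01 * i, 5 + 0.01 * j) for i in range(10) for j in range(10)]

l_grid_short = ripley_l(grid, 100.0, 0.9)    # no neighbors closer than 1 unit
l_grid_unit = ripley_l(grid, 100.0, 1.0)     # picks up the 4 adjacent neighbors
l_cluster = ripley_l(cluster, 100.0, 1.0)    # every point is within 1 unit of all others
print(l_grid_short, round(l_grid_unit, 3), round(l_cluster, 3))
```

Evaluating `ripley_l` over a range of radii and plotting the result against d gives the kind of curve described for this slide; a Monte Carlo envelope could be built by repeating the calculation on simulated CSR patterns.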


Lab #12A: Point pattern analysis


Analysis of Continuous Data

  • Variation in mean values

  • Describe local variability & spatial dependence


Mean Trends

[Figure: input and output grids for focal, zonal, and global operations; output labels include "or table" and "Single value (surface analysis)"]


Grid Analysis: Focal Analysis

Spatial filters: output value for each cell is calculated from neighboring cells (moving windows)

Neighborhood shapes: [figure]

Neighborhood statistics: majority, maximum, mean, median, minimum, range, standard deviation, sum, variety

[Figure: Species A habitat and Species B habitat; range of Species A = 4 cells; Species A depends on B]

  • Low pass: Smoothing, removing noise

  • High pass: Emphasize local variation

  • Edge enhancement
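A moving-window (focal) mean can be sketched in a few lines. This is a low-pass filter over a square neighborhood; the grid is made up, and cells at the boundary simply use whichever neighbors exist, which is one of several possible ways to handle the loss of neighbors at the edges.

```python
from statistics import mean


def focal_mean(grid, radius=1):
    """Low-pass filter: replace each cell with the mean of its square neighborhood."""
    rows, cols = len(grid), len(grid[0])
    output = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            out_row.append(mean(window))
        output.append(out_row)
    return output


# Hypothetical grid with a single noisy spike in the center.
grid = [
    [0, 0, 0],
    [0, 9, 0],
    [0, 0, 0],
]
smoothed = focal_mean(grid)
for row in smoothed:
    print(row)
```

Swapping `mean` for `max`, `min`, or `statistics.median` gives other neighborhood statistics, and subtracting the smoothed grid from the original is one simple way to obtain a high-pass result.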


Grid Analysis: Zonal Analysis

Geometry: area, centroid, perimeter, thickness

Statistics: majority, maximum, mean, median, minimum, range, standard deviation, sum, variety

[Figure: zones labeled Vegetation class A or land use A; Vegetation class B or land use B; Vegetation class C or land use C]

Output is:

a) grid with same value in each cell for a given zone

b) table with values by zone
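Both output forms can be produced from a value grid and a zone grid of the same shape. The sketch below computes a zonal mean; the grids and zone labels are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def zonal_mean(values, zones):
    """Return (table of mean value per zone, grid with each cell set to its zone mean)."""
    by_zone = defaultdict(list)
    for value_row, zone_row in zip(values, zones):
        for value, zone in zip(value_row, zone_row):
            by_zone[zone].append(value)
    table = {zone: mean(vals) for zone, vals in by_zone.items()}
    grid = [[table[zone] for zone in zone_row] for zone_row in zones]
    return table, grid


# Hypothetical zones (e.g. vegetation classes A, B, C) and a value layer.
zones = [
    ["A", "A", "B"],
    ["A", "B", "B"],
    ["C", "C", "C"],
]
values = [
    [1, 2, 10],
    [3, 20, 30],
    [5, 5, 5],
]

table, zone_grid = zonal_mean(values, zones)
print(table)       # output b) table with values by zone
print(zone_grid)   # output a) grid with the same value in each cell of a zone
```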



Geostatistics Basics

Parametric stats

  • Univariate (x): mean, variance

  • Multivariate (x, y): correlation, covariance

Spatial stats

  • Univariate (x, h): semi-variance, lag correlation, lag covariance

  • Multivariate (x, y, h): cross-semivariance (variogram), cross correlation, cross covariance (correlogram); the variogram and correlogram are inverse

h = lag (time or space)


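Semi-variance $\gamma_h$

Variance: $s^2 = \dfrac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})^2$

Semi-variance: $\gamma_h = \dfrac{1}{2 N_h} \sum_{i=1}^{N_h} (x_i - x_{i+h})^2$

[Figure: transect showing $x_i$ and $x_{i+h}$, with the local mean w.r.t. study extent]

  • Slide x through space to get $\gamma_h$

  • Vary h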

Semi-variance $\gamma_h$

Semi-variance: $\gamma_h = \dfrac{1}{2 N_h} \sum_{i=1}^{N_h} (x_i - x_{i+h})^2$

[Figure: transect showing $x_i$ and $x_{i+h}$, with the local mean]

Number of cells N = 10

Number of windows $N_h$ = # cells − h

  • h = 1: $N_h$ = 9

  • h = 5: $N_h$ = 5

Limit h to 1/3 of study extent
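The semi-variance formula, read as γh = Σ(x_i − x_{i+h})² / (2·N_h) with N_h = N − h, translates directly into code. The sketch below uses a made-up 10-cell transect with a steady gradient; the function name and data are hypothetical.

```python
def semivariance(x, h):
    """Semi-variance at lag h for a regularly spaced 1-D transect."""
    n_h = len(x) - h  # number of windows (pairs separated by lag h)
    if h < 1 or n_h < 1:
        raise ValueError("lag h must satisfy 1 <= h < len(x)")
    return sum((x[i] - x[i + h]) ** 2 for i in range(n_h)) / (2 * n_h)


# Hypothetical transect of N = 10 cells with a steady gradient.
transect = list(range(10))

# Following the slide, lags are limited to about 1/3 of the study extent.
max_lag = len(transect) // 3
variogram = [(h, semivariance(transect, h)) for h in range(1, max_lag + 1)]
print(variogram)
```

Because this transect is a pure gradient, γh keeps rising with h and never levels off; data with a finite range of spatial dependence would instead flatten out at larger lags.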


Semi-variogram

Semi-variance: $\gamma_h = \dfrac{1}{2 N_h} \sum_{i=1}^{N_h} (x_i - x_{i+h})^2$

[Figure: semi-variogram of $\gamma_h$ against h, rising from the nugget at h = 0 to the sill at the range; spatial dependence below the range, independence above it]

If $x_i$ is similar to $x_{i+h}$, $\gamma_h$ is small, and they are spatially correlated

If $x_i$ is not similar to $x_{i+h}$, $\gamma_h$ is large, and they are not spatially correlated

=> $\gamma_h$ measures heterogeneity

Nugget: value of $\gamma_h$ at distance 0 (not in data); a measure of unexplained variability

Range: distance h at which $\gamma_h$ levels off. Below the range, heterogeneity increases in a predictable manner; above the range, heterogeneity is constant. A measure of independence

Sill: measure of maximum heterogeneity in data ($\gamma_{max}$)


Semi-variograms

[Figure: two semi-variograms of $\gamma_h$ against h: (1) periodic, cyclic; (2) gradient, no sill or range]

Examples: timber harvest, forest age

range ~ harvest area

sill ~ rotation


Lag Covariance: Geary’s C

Centered around mean values of x, $\bar{x}$

Lag covariance: $C_h = \dfrac{1}{N_h} \sum_{i=1}^{N_h} (x_i - x_{i-h})(x_i - x_{i+h})$

[Figure: transect showing $x_{i-h}$, $x_i$ and $x_{i+h}$, with the local mean]

If x, $x_{i+h}$ and $x_{i-h}$ are all the same, $C_h = 0$

If values are increasing or decreasing through space ($x_{i-h} < x < x_{i+h}$, or $x_{i-h} > x > x_{i+h}$), one term is negative and $C_h$ is negative: things are not similar. Otherwise $C_h$ is positive: things are similar

Correlograms have the inverse shape of semi-variograms
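The lag covariance can be computed exactly as the formula is written on the slide, with $x_i$ as the center value of each window. One assumption is needed: N_h is taken here as the number of positions where both x_{i−h} and x_{i+h} exist (N − 2h), which the slide does not spell out. The transects and function name are hypothetical.

```python
def lag_covariance(x, h):
    """Lag covariance C_h as written on the slide, for a regularly spaced transect.

    Assumption: windows are the positions i where both x[i - h] and x[i + h] exist.
    """
    indices = range(h, len(x) - h)
    n_h = len(indices)
    if h < 1 or n_h < 1:
        raise ValueError("lag h is too large for this transect")
    return sum((x[i] - x[i - h]) * (x[i] - x[i + h]) for i in indices) / n_h


flat = [4] * 10               # identical values everywhere
gradient = list(range(10))    # values steadily increasing through space

c_flat = lag_covariance(flat, 1)
c_gradient = lag_covariance(gradient, 1)
print(c_flat, c_gradient)
```

The flat transect gives zero and the gradient gives a negative value, matching the two cases described above.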


Lag Correlation: Moran’s I

Centered around mean values of x, $\bar{x}$

Standardized against sample variation

Lag covariance: $C_h = \dfrac{1}{N_h} \sum_{i=1}^{N_h} (x_i - x_{i-h})(x_i - x_{i+h})$

Lag correlation: $P_h = \dfrac{C_h}{s_{x-h}\, s_{x+h}}$


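Comparison

Statistic                       Symbol        Range (correlated … independent)
Semi-variance                   $\gamma_h$    $0 < \gamma_h < \infty$
Lag covariance (Geary’s C)      $C_h$         $-\infty < C_h < \infty$
Lag correlation (Moran’s I)     $P_h$         $-1 < P_h < +1$

[Figure: $\gamma_h$, $C_h$ and $P_h$ plotted against h, with axis marks at 0, zero, +1 and −1; annotation: "similar h range"]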



Surface Analysis

  • Spatial distribution of surface information in terms of a three-dimensional structure

    Surfaces do not have to be elevation; they could be population density, species richness, or any other measured attribute


Surface Analysis

Given geolocated point data, calculate values at regular intervals between points

Inverse distance weighting

  • Can’t create extremes (ridges, valleys)

  • Isotropic influence (not ridge preserving)

  • Best with dense samples

Kriging

  • Uses semi-variogram to determine relative importance (weighting) of data at different distances

  • Uses global variation; only works well if the semi-variogram captures variation across the entire map

Trend analysis

  • Calculates a best-fit polynomial equation using linear regression

  • Recalculates all positions using equation (lose original data)

  • Smoothing depends on polynomial order

Spline

  • Calculates a 2-D minimum curvature surface that passes through every input point
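Inverse distance weighting is simple enough to sketch directly. The sample points, the power of 2, and the function name are illustrative assumptions. Because the estimate is a weighted average of the samples, it can never exceed the largest sample value or fall below the smallest, which is why IDW cannot create ridges or valleys beyond the observed data.

```python
import math


def idw(samples, x, y, power=2.0):
    """Inverse-distance-weighted estimate at (x, y) from (sx, sy, value) samples."""
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            return value  # exactly on a sample point: return its value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical geolocated samples: (x, y, measured value).
samples = [(0.0, 0.0, 10.0), (10.0, 0.0, 20.0), (0.0, 10.0, 30.0)]

at_sample = idw(samples, 0.0, 0.0)
midway = idw(samples, 5.0, 0.0)
print(at_sample, round(midway, 2))
```

Calling `idw` for every cell center of a regular grid produces the interpolated surface; weights depend only on distance, so influence is the same in every direction.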



Network Analysis

  • Designed specifically for line features organized in connected networks; typically applied to transportation problems and location analysis

  • Streams

  • Dispersal vectors

  • Community interactions


Network Analysis

  • Pathfinding: shortest or least cost

  • Allocation of network areas to a center based on supply, demand and impedance

  • Connectivity
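Least-cost pathfinding on a network is commonly done with Dijkstra's algorithm; the sketch below is one such implementation, not necessarily the method used by any particular GIS package. The network, node names, and impedance values are hypothetical.

```python
import heapq
import math


def least_cost_path(graph, start, goal):
    """Dijkstra's algorithm on a dict-of-dicts graph of non-negative impedances."""
    queue = [(0.0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, impedance in graph.get(node, {}).items():
            if neighbor not in visited:
                heapq.heappush(queue, (cost + impedance, neighbor, path + [neighbor]))
    return math.inf, []  # goal is not connected to start


# Hypothetical directed network: node -> {neighbor: impedance}.
network = {
    "A": {"B": 4, "C": 1},
    "B": {"D": 1},
    "C": {"B": 2, "D": 5},
    "D": {},
}

cost, path = least_cost_path(network, "A", "D")
print(cost, path)
```

A return value of infinity with an empty path doubles as a simple connectivity check between two nodes.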


Integrated Analysis

[Diagram labels: Gauge points; Field data (vector); DEM; Watershed; Hydro model; Samples; Grid; Process; Land cover; Soil; Statistics; Modeling (regression, et al.)]



Sampling

  • Spatial dependency must be considered in sample design

    • Non-independent observations

    • Fewer degrees of freedom

    • Differences within groups will appear small => overestimate significance of between-group variation

    • Spatial structure & heterogeneity can affect experimental results – response due to treatments or due to inherent spatial structure?

  • Solutions:

    • Include space as an explanatory variable (Mantel test)

    • Sample at greater distance than the variogram range
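The Mantel test mentioned above compares two distance matrices, for example geographic distance and difference in a measured response, and judges significance by permutation. The following is a minimal one-sided sketch using only the standard library; the site coordinates, response values, number of permutations, and function names are hypothetical.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def upper_triangle(matrix, order):
    """Upper-triangle entries of a distance matrix after reordering the sites."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel(dist_a, dist_b, permutations=999, seed=0):
    """One-sided Mantel test: correlation of two distance matrices and its p-value."""
    rng = random.Random(seed)
    identity = list(range(len(dist_a)))
    a = upper_triangle(dist_a, identity)
    r_observed = pearson(a, upper_triangle(dist_b, identity))
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute the sites of one matrix
        if pearson(a, upper_triangle(dist_b, order)) >= r_observed:
            hits += 1
    return r_observed, (hits + 1) / (permutations + 1)


# Hypothetical sites along a transect with a spatially structured response.
positions = [0, 1, 2, 3, 4, 5, 6, 7]
response = [1.0, 1.2, 1.9, 3.1, 4.2, 4.8, 6.1, 7.0]
geo_dist = [[abs(p - q) for q in positions] for p in positions]
resp_dist = [[abs(u - v) for v in response] for u in response]

r, p_value = mantel(geo_dist, resp_dist)
print(round(r, 3), p_value)
```

A high correlation with a small p-value indicates that the response is spatially structured, so space would need to be accounted for before attributing differences to treatments.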


Example: Integrating Species Occurrence Points and Images

[Figure labels: Excel File; Access File; Vegetation cover type; Elevation (m); Mean annual temperature (C)]

Occurrence samples:

  • Sample 1, lat, long, species, presence

  • Sample 2, lat, long, species, presence

  • Sample 3, lat, long, species, absence

Integrated data:

  • P, juniper, 2200m, 16C

  • P, pinyon, 2320m, 14C

  • A, creosote, 1535m, 22C

Steps:

  • Semantics

  • Compatible scales

  • Reproject

  • Resample grain

  • Clip extent

  • Sample occurrence points
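The final step, sampling the layers at each occurrence point, can be sketched as a lookup from map coordinates to grid row and column. This assumes the layers have already been reprojected, resampled, and clipped to a common grid. The 2 x 2 layers, grid origin, cell size, and point coordinates are hypothetical; only the attribute values echo the integrated records shown on the slide.

```python
def sample_grid(grid, x_min, y_max, cell_size, x, y):
    """Value of the grid cell containing (x, y), or None if outside the extent.

    Row 0 is the northern (top) edge of the grid, as in most raster formats.
    """
    col = int((x - x_min) // cell_size)
    row = int((y_max - y) // cell_size)
    if 0 <= row < len(grid) and 0 <= col < len(grid[0]):
        return grid[row][col]
    return None


# Hypothetical co-registered layers on the same 2 x 2 grid.
vegetation = [["pinyon", "juniper"], ["creosote", "creosote"]]
elevation_m = [[2320, 2200], [1535, 1535]]
temperature_c = [[14, 16], [22, 22]]
X_MIN, Y_MAX, CELL = 0.0, 2.0, 1.0

# Hypothetical occurrence points: (x, y, presence "P" or absence "A").
occurrences = [(1.5, 1.5, "P"), (0.5, 1.5, "P"), (0.5, 0.5, "A")]

integrated = [
    (
        status,
        sample_grid(vegetation, X_MIN, Y_MAX, CELL, x, y),
        sample_grid(elevation_m, X_MIN, Y_MAX, CELL, x, y),
        sample_grid(temperature_c, X_MIN, Y_MAX, CELL, x, y),
    )
    for x, y, status in occurrences
]
for record in integrated:
    print(record)
```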


ENM Results

[Figure: Geographic patterns of species richness of 17 native rodent species. Sanchez-Cordero and Martinez-Meyer, 2000]

[Figure: Model building and testing. a) training data; b) predictive model. Peterson, Ball and Cohoon, 2002]