Geospatial Data and Spatial Data Analysis Tools For Ecologists

University of California – Santa Barbara www.nceas.ucsb.edu Rick Reeves / March 17, 2005.

Presentation Transcript


  1. Geospatial Data and Spatial Data Analysis Tools For Ecologists University of California – Santa Barbara www.nceas.ucsb.edu Rick Reeves / March 17, 2005

  2. Presentation Goals • Overview: Geospatial Data Analysis • Defining and distinguishing between spatial, geospatial, geographic data • Addressing the particular attributes of geospatial data • Inventory of Geospatial Data Types • Primary data types and common sources for data • Survey of Geoprocessing Software Tools • Key issues driving choice of geospatial processing software • A Tour of NCEAS Scientific Computing Web Site • Spatial Datasets, Tools, Tutorials, and Project Archives • Some Examples: Geospatial Data Analysis at NCEAS • From the Annals of the NCEAS Scientific Programmer: ‘Real World’ solutions to Ecological research challenges

  3. Meet the Scientific Programmer • Rick’s Academic and Professional Background • Undergraduate: Environmental Remote Sensing • Graduate: Spatial Operations Research / Location-Allocation Heuristic Development • Spatial Modeling branch of Geographic Data Analysis • Problem Domain: Transportation and Facility Location within networks • Professional: Software Development, geospatial database development, training curriculum development

  4. Spatial Data: A Hierarchical Definition • Spatial Data • Observations are distributed in multidimensional space • X / Y / Z coordinates attached to each data element • Geospatial Data • Spatial Data with attached Geographic coordinates • Latitude / Longitude, UTM • Optional: data subjected to a map projection transformation • Geographic Data • Geospatial Data that captures ‘Earth System’ phenomena • Terrain height • Drainage Network • Land surface cover or urban Land Use • Meteorological / climate data forecasts • Ecologists may work with any or all during a project
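
The step from spatial to geospatial data can be sketched in a few lines of Python. This is only an illustration: the coordinate values are made up (roughly Santa Barbara), the dictionary keys are hypothetical, and `utm_zone` uses the standard 6-degree zone rule while ignoring the special-case zones around Norway and Svalbard.

```python
import math


def utm_zone(longitude_deg: float) -> int:
    """Return the standard 6-degree UTM zone number (1-60) for a longitude."""
    if not -180.0 <= longitude_deg <= 180.0:
        raise ValueError("longitude must be within [-180, 180]")
    # Zones are 6 degrees wide and numbered eastward starting at 180 W.
    # Longitude +180 would compute as zone 61, so it is clamped to zone 60.
    return min(int(math.floor((longitude_deg + 180.0) / 6.0)) + 1, 60)


# Spatial data: X / Y / Z in an arbitrary local frame (illustrative values).
spatial_point = {"x": 12.5, "y": 40.0, "z": 3.2}

# Geospatial data: the observation is tied to geographic coordinates.
geospatial_point = {"lat": 34.42, "lon": -119.70, "elevation_m": 15.0}

# The UTM zone follows directly from the longitude.
zone = utm_zone(geospatial_point["lon"])
print(f"UTM zone: {zone}")
```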

  5. Overview: Geospatial / Geographic Data • Two Broad Primary Categories • Raster: A multi dimensional, regularly-spaced grid of values (samples) • Dimensions: Northing, Easting, Altitude, Time • Examples: Satellite Image, Digital Terrain, land surface cover maps • Vector: Three primary shapes stored in drawing-optimized format • Point, Line, Polygon, (TIN, vector field) • Thousands of datasets exist in hundreds of formats • Remote Sensing Imagery / Digital Elevation Models • Surface Features (political, physiographic) as points/lines/polygons • Meteorological data (observed / forecasted (short-and long-term)) • File format standards set by Industry, Government, user community • Data Ingestion: First Step in Geospatial Analysis • Data input / format conversion / spatial registration
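
As a rough sketch of the raster / vector distinction, the Python below stores a raster as a regularly spaced grid plus the georeferencing needed to locate each cell, and stores vector features as coordinate lists. The class name `Raster`, its fields, and all coordinate values are assumptions made for illustration; real formats carry much more metadata.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Raster:
    values: List[List[float]]  # row-major grid; row 0 is the northern edge
    west: float                # easting of the grid's left edge
    north: float               # northing of the grid's top edge
    cell_size: float           # width and height of one square cell

    def cell_center(self, row: int, col: int) -> Tuple[float, float]:
        """Map a grid index to the easting / northing of the cell center."""
        easting = self.west + (col + 0.5) * self.cell_size
        northing = self.north - (row + 0.5) * self.cell_size
        return easting, northing

    def value_at(self, easting: float, northing: float) -> Optional[float]:
        """Return the sample covering a coordinate, or None if off the grid."""
        col = int((easting - self.west) // self.cell_size)
        row = int((self.north - northing) // self.cell_size)
        if 0 <= row < len(self.values) and 0 <= col < len(self.values[0]):
            return self.values[row][col]
        return None


# A tiny 2 x 3 elevation raster with 30 m cells (illustrative numbers).
dem = Raster(
    values=[[10.0, 12.0, 15.0], [11.0, 14.0, 18.0]],
    west=500000.0,
    north=3810000.0,
    cell_size=30.0,
)

# Vector features: the three primary shapes as coordinate lists.
point = (500045.0, 3809960.0)
line = [(500000.0, 3810000.0), (500090.0, 3809940.0)]
polygon = [(500000.0, 3809940.0), (500090.0, 3809940.0), (500090.0, 3810000.0)]

print(dem.cell_center(0, 0), dem.value_at(*point))
```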

  6. Geospatial Data Analysis • Geospatial Information Analysis: 3 Categories • From O’Sullivan & Unwin (2003) • Spatial Data Manipulation: Investigate the relationships between geographic dataset layers • Examples: ‘point-in-polygon’, buffer zones around spatial features • GIS software typically used to view / manipulate / create layers • Spatial/Statistical Data Analysis: Descriptive and Explanatory: What is there? How do we categorize it? • Data points treated as statistical ‘population’, compared to others • Spatial Modeling: Construct models to explore and understand geospatial systems • Based on ‘abstraction’ of domain-specific problem into a systems framework. Some examples: • Predicting network flows; optimizing facility locations among demands • Lessons learned building model as valuable as model’s ‘answers’
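
The two spatial data manipulation examples named on this slide, point-in-polygon and buffer zones, can be sketched in plain Python. The point-in-polygon test below uses the common ray-casting approach, which is one of several ways a GIS might implement it; the function names and the sample coordinates are illustrative, and points lying exactly on a boundary are not handled specially.

```python
import math


def point_in_polygon(x, y, polygon):
    """Ray-casting test: count edge crossings of a ray running east from (x, y)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def within_buffer(x, y, feature_x, feature_y, radius):
    """True if (x, y) lies inside a circular buffer around a point feature."""
    return math.hypot(x - feature_x, y - feature_y) <= radius


# Illustrative study area (a square) and two sample sites.
study_area = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
print(point_in_polygon(2.0, 2.0, study_area))  # site inside the square
print(point_in_polygon(5.0, 2.0, study_area))  # site east of the square
print(within_buffer(3.0, 4.0, 0.0, 0.0, 5.0))  # site on the buffer edge
```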

  7. The Challenge of Geospatial Analysis • Geospatial Data violate some key statistical assumptions • Must be addressed in the experimental design and sampling scheme • Require specialized assessment techniques to factor out effects • Spatial Autocorrelation • Samples are NOT randomly selected from normally-distributed population • In fact, nearby samples more likely to be similar than distant ones • Autocorrelated data points introduce redundancy into the sample set • Spatial Scaling • AKA Modifiable Areal Unit Problem • Statistical relationships in an area may change at different aggregations • The placement of sampling grid can introduce artifacts • Nonuniform sampling space, edge effects • Geospatial Data Attributes have explanatory power • Spatial relationships may be causes for observed phenomena
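
Spatial autocorrelation can be quantified in several ways. One widely used statistic, which the slide does not name, is Moran's I; the sketch below computes it for a small grid using rook adjacency (each cell's neighbors are the cells directly north, south, east, and west) with all neighbor weights equal to 1. The grids are synthetic examples. Values near +1 indicate that nearby samples are similar, and values near -1 indicate that neighbors are dissimilar.

```python
def morans_i(grid):
    """Moran's I for a rectangular grid with binary rook-adjacency weights."""
    rows, cols = len(grid), len(grid[0])
    n = rows * cols
    mean = sum(sum(row) for row in grid) / n
    dev = [[value - mean for value in row] for row in grid]

    cross_sum = 0.0   # sum over neighbor pairs of dev_i * dev_j
    weight_sum = 0    # total number of (ordered) neighbor pairs
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    cross_sum += dev[r][c] * dev[rr][cc]
                    weight_sum += 1

    variance_sum = sum(d * d for row in dev for d in row)
    if variance_sum == 0 or weight_sum == 0:
        raise ValueError("Moran's I is undefined for a constant or 1-cell grid")
    return (n / weight_sum) * (cross_sum / variance_sum)


# A smooth west-to-east gradient: neighbors resemble each other.
gradient = [[c for c in range(4)] for _ in range(4)]
# A checkerboard: every neighbor has the opposite value.
checkerboard = [[(r + c) % 2 for c in range(4)] for r in range(4)]

print(morans_i(gradient), morans_i(checkerboard))
```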

  8. Selecting Geospatial Software Tools • Geospatial software: layered software architecture • Data layer: Efficiently store geospatial data • Feature Set + spatial coordinates • Analytic Layer: Spatial/statistical analysis algorithms • Statistical packages increasingly contain geospatial analysis tools • Visualization Layer: Creates data views (AKA maps) • Geospatial tools broadly divided in two categories • Geographic Information Systems (GIS) • Three software layers are each extensive, ‘feature rich’ • Geospatial Analysis Packages • Data layer is ‘thinner’, Analytic layer ‘thicker’ • Visualization layer built on existing data plotting tools

  9. Geospatial Software Tools: GIS ‘Value Added’ • Data layer is optimized for efficient geospatial data storage/processing • Raster and Vector Data storage, ‘mixed mode’ operations • Georeferencing tools for data layer projection, spatial registration • Map Algebra tools foster analysis and creation of data layers • Comprehensive cartographic tools for output map design
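
Map algebra, in its simplest 'local' form, combines aligned raster layers cell by cell to create a new layer. The sketch below is a minimal pure-Python illustration, not how any particular GIS implements it; `local_op`, the land cover code `FOREST = 1`, and the sample grids are all hypothetical.

```python
def local_op(func, *layers):
    """Apply func cell by cell across aligned raster layers of equal shape."""
    rows, cols = len(layers[0]), len(layers[0][0])
    for layer in layers:
        if len(layer) != rows or any(len(row) != cols for row in layer):
            raise ValueError("layers must share the same grid dimensions")
    return [
        [func(*(layer[r][c] for layer in layers)) for c in range(cols)]
        for r in range(rows)
    ]


FOREST = 1  # hypothetical land cover class code

elevation_m = [[120, 480, 900], [300, 650, 1500]]
land_cover = [[1, 1, 2], [2, 1, 1]]

# New layer: 1 where the cell is forest below 500 m, else 0.
low_forest = local_op(
    lambda elev, cover: 1 if elev < 500 and cover == FOREST else 0,
    elevation_m,
    land_cover,
)
print(low_forest)
```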

  10. Geospatial Software Tools: GIS Caveats • Underdeveloped geostatistical processing tools • Vendors pressured to include them in product • Yet validation data and algorithm details not available • Often, these are critical tools for ecological analysis • Steep Learning Curve • Identifying, mastering ‘essential’ features a challenge • Cost: GIS Software can be expensive • Upfront purchase and yearly license fees • Time investment in training and data maintenance • Workload • If non-GIS must be used for part of analysis, time must be spent moving between s/w packages

  12. Geospatial Software Tools: Choosing • Some Suggested Selection Criteria • Research Objectives should drive choice of tools • Identify the project’s core geospatial processing needs • Platform Flexibility • Select tools supported on multiple platforms (hardware/OpSys) • Widely supported/used platforms foster collaboration • Solution ‘Visibility’ • Can you obtain the details of the algorithm? • Does the community recognize the accuracy of the algorithm? • Costs of implementing your research idea in software • Scripted solutions using integrated environments are best • R, SAS, MATLAB • Avoid development in high-level programming languages

  13. Geospatial Software Tools: Choosing • Select GIS for core needs: • Construct, compare, create multiple spatial data layers • Simultaneously analyzing vector and raster data • Creating detailed production quality study site maps • Your data is exclusively in the GIS product format • You require spatial analysis tools unavailable outside GIS • Select Geospatial Analysis tools for core needs: • Spatial/Statistical data analysis is the focus • Your mapping requirements are modest • two-dimensional data plots with geographic coordinates, legend • You need in-depth understanding of algorithms used • Or, you wish to extend / modify the algorithms

  14. Sources for Geospatial Software Tools • Commercial Software Products • For-profit corporations sell or license their software • Major players produce comprehensive products • ESRI (ArcGIS) is the dominant GIS vendor • Their goal: Provide solution for every geospatial application • Other vendors offer tailored solutions • Examples: ENVI / IDL, ERDAS: Remote Sensing oriented GIS • Example: S Plus Spatial Statistics: Geospatial statistics and spatial data visualization enhancements to statistical package • Example: MATLAB has mapping and image processing toolkits • Example: SAS offers GIS, geospatial software tools • Commercial products often drive geospatial data formats • Example: ESRI Shape File, ERDAS IMG file

  15. Sources for Geospatial Software Tools • Open Source Software • Broad-based effort by worldwide scientific and research community • Distributed under General Public License (GPL) • Software development and maintenance by the user community • Most significant geospatial analysis products: R, GRASS GIS • Examples of others: PostGIS, GDAL libraries • Visit FreeGIS.org, or the open software foundation sites.

  16. Tradeoffs: Commercial GIS Software • Centralized documentation and product support….. • At a price of $100s to $1000s per year • Comprehensive, integrated software product • Data/Analytic/Visualization layers populated w/ features • Steep learning curve: Where are my ‘essential features?’ • Training always available – at a cost…. • Details of proprietary geospatial algorithms usually unavailable

  17. Tradeoffs: Open Source GIS Software • Open Source Software • Distributed under General Public License (GPL) • Software development and maintenance by the user community • Most significant geospatial analysis products: R, GRASS GIS • Many applications available via the Internet but…. • Quality, features, support, and documentation are inconsistent • Algorithms and even source code are freely available • Open Source software drawbacks are shrinking as user support community evolves and matures • But active participation in the community is advised for those wishing to stay technically proficient

  18. Sources for Geospatial Data • Government Agencies • National Mapping and Survey Agencies: surface cover data • USGS • Research Centers: Climate forecasting models • NOAA, NASA, NCDC • For-Profit Corporations • The highest-quality UNCLASSIFIED imagery now acquired by the private sector • Sometimes, no-cost government data is resold to public • Data widely available via the Internet • Many data sets available at no- or low-cost • Notable Exception: Satellite Remote Sensing data • Some discounts available to education and/or research entities • The best sites allow ‘search by geographic coordinates’ • Examples from NCEAS Scientific Computing web site

  19. Popular Geospatial Data Formats • Meteorological and Climatological Data • Historical measurements • Short-term model-based forecasts (3 – 10 days from now) • Long-term predictions (10 – 100 years): General Circulation Models • Widely-Used Formats: Gridded Binary (GRIB), NetCDF • Political and Physiographic features • Country Boundaries • Road Networks • Drainage Networks • Widely-Used Formats: Digital Line Graphs (DLG), ESRI Shape Files (.shp) • Most GIS/Geospatial packages ingest these formats • Or conversion utilities are available to ingest them

  20. Popular Geospatial Data Formats • Remote Sensing Imagery • Many operational systems provide many kinds of images • Multispectral Imagery: Landsat, SPOT, IKONOS • Data Formats tend to be sensor-specific • Most GIS can ingest most imagery types • Portal sites Commercial: http://www.vterrain.org/Imagery/commercial.html Govt: http://www.nationalgeographic.com/maps/map_links.html • Digital Terrain Models • Raster Grid datasets containing elevation measurements • Available for complete Earth land surface • Primary format: USGS Digital Elevation Model (DEM) • AKA National Elevation Dataset (NED) • Portal sites: USGS: http://gisdata.usgs.net/Website/Seamless/ Terrainmap.org: http://www.terrainmap.org/

  21. Tour of the Scientific Computing Web Site • Links to Data Sources • Links to Geospatial Software Sources • Links to Tutorials and Research Papers • Archive of NCEAS Research Projects http://www.nceas.ucsb.edu/scicomp

  22. Example: Spatial Modeling: Optimization • Route vehicles along network using environmental costs as a metric • Simultaneously locate facilities along shipment routes that mitigate environmental costs • Optimal Location of species reserve sites • Develop and compare performance of alternate solution methods • Mathematically optimal but operationally impractical • Heuristically derived, near-optimal, usable solution
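
The contrast between an exact method and a heuristic can be illustrated with a much simpler location problem than the routing model described on this slide. The sketch below is not the presenter's model: it assumes a basic 'p-median' setup (choose p sites that minimize total straight-line distance from each demand point to its nearest chosen site), and all coordinates are made up. The exhaustive search is guaranteed optimal but its cost grows combinatorially with the number of candidates, while the greedy heuristic is fast and may be only near-optimal.

```python
import math
from itertools import combinations


def total_cost(sites, demands):
    """Sum of distances from each demand point to its nearest chosen site."""
    return sum(min(math.dist(d, s) for s in sites) for d in demands)


def exhaustive(candidates, demands, p):
    """Evaluate every combination of p sites; optimal but combinatorial."""
    return list(min(combinations(candidates, p),
                    key=lambda sites: total_cost(sites, demands)))


def greedy(candidates, demands, p):
    """Add one site at a time, always taking the best immediate improvement."""
    chosen = []
    remaining = list(candidates)
    for _ in range(p):
        best = min(remaining, key=lambda s: total_cost(chosen + [s], demands))
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Illustrative demand points and candidate facility sites.
demands = [(0, 0), (1, 0), (0, 1), (9, 9), (10, 9), (9, 10), (5, 5)]
candidates = [(0, 0), (5, 5), (9, 9), (2, 8), (8, 2)]

optimal_sites = exhaustive(candidates, demands, p=2)
greedy_sites = greedy(candidates, demands, p=2)
print(total_cost(optimal_sites, demands), total_cost(greedy_sites, demands))
```

Comparing the two totals on progressively larger inputs is one way to judge how much solution quality the heuristic gives up in exchange for speed.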

  23. Spatial Modeling: The Problem Domain

  24. Geospatial Dataset: Routes + Locations

  25. Spatial Model Solution: Alternative Methods

  26. Selecting Species Reserves Locations Dr. Ross Gerrard, UCSB Biogeography Lab, 1996

  27. Example: Spatial Data Manipulation • Elevation zone threshold calculation • Digital Elevation Models for selected worldwide sites • Classify sites into 100 meter ‘wide’ elevation zones • General Circulation Model climate data extraction • Identify, obtain, import GCM data files • Import the data into GIS as raster grid • Overlay point file, extract matching climate values
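
Both manipulation tasks listed on this slide reduce to simple grid operations. The sketch below is a hedged illustration rather than the workflow actually used at NCEAS: it classifies a small elevation grid into 100 meter zones, then extracts the value of the nearest climate-grid cell for each sample point. All function names, grid values, and coordinates are hypothetical.

```python
from collections import Counter


def classify_zones(dem, zone_width_m=100.0):
    """Zone k covers elevations from k * width up to (k + 1) * width."""
    return [[int(value // zone_width_m) for value in row] for row in dem]


def zone_counts(zones):
    """Number of grid cells falling in each elevation zone."""
    return Counter(z for row in zones for z in row)


def extract_at_points(grid, cell_lats, cell_lons, points):
    """For each (lat, lon) point, return the value of the nearest grid cell.

    cell_lats / cell_lons hold the cell-center coordinates of the grid rows
    and columns, as is typical for gridded climate model output.
    """
    values = []
    for lat, lon in points:
        r = min(range(len(cell_lats)), key=lambda i: abs(cell_lats[i] - lat))
        c = min(range(len(cell_lons)), key=lambda j: abs(cell_lons[j] - lon))
        values.append(grid[r][c])
    return values


# Illustrative elevation grid in meters.
dem = [[95.0, 100.0, 250.0], [310.5, 399.9, 1200.0]]
zones = classify_zones(dem)
counts = zone_counts(zones)

# Illustrative climate grid (for example, mean temperature) on a coarse grid.
cell_lats = [30.0, 32.5, 35.0]
cell_lons = [-120.0, -117.5]
temperature = [[18.0, 19.0], [16.5, 17.0], [15.0, 15.5]]
sites = [(34.4, -119.7), (31.0, -117.9)]
site_values = extract_at_points(temperature, cell_lats, cell_lons, sites)
print(zones, dict(counts), site_values)
```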

  28. Digital Elevation Data Ingestion / Clipping

  29. Elevation Zone Data Analysis

  30. General Circulation Model data extraction

  31. Spatial Analysis: Arc GIS and R Platforms • ESRI Shape files exported to the R programming environment • R Geostatistical and Spatial Analysis methods can then be applied

  32. A Sampling: R Geospatial Analysis packages • clim.pact: Climate data analysis and downscaling tools • geoR: Geostatistical Data Analysis: variograms, et al. • maptools: read/manipulate polygon data (ESRI .shp) • shapefiles: read/manipulate ESRI shape files • sgeostat: Geostatistical modeling code • splancs: Spatial and space-time point patterns • spatstat: Spatial Point Pattern analysis

  33. Concluding thoughts • NCEAS Associates use geospatial data extensively, in many creative ways • Geospatial Data Analysis requires specialized techniques • GIS and geospatial analysis tools are available from commercial vendors and the open source community • Choosing geospatial data and tools can be overwhelming and distract from the primary ‘science mission’ • Scientific Programming Team has geospatial expertise, and can assist NCEAS Associates in this domain • Coming soon: Short course on the R Programming Language!
