
GLOBE

GLOBE. Matthew D Schmill, Lindsey Gordon, Erle Ellis, Nicholas Magliocca, Tim Oates. University of Maryland, Baltimore County. Analytics for Assessing Global Representativeness. GLOBE: Enhancing Scientific Workflows.


Presentation Transcript


  1. GLOBE • Matthew D Schmill, Lindsey Gordon, Erle Ellis, Nicholas Magliocca, Tim Oates • University of Maryland, Baltimore County • Analytics for Assessing Global Representativeness

  2. GLOBE: Enhancing Scientific Workflows • The goal: accelerate and improve scientific workflows for land change science • Joint work with Wayne Lutters, Erle Ellis, Tim Oates, Penny Rheingans at University of Maryland, Baltimore County • IS, CSEE, GES • Supported by NSF’s Cyber-Enabled Discovery & Innovation program • Fourth and final year of the program • Centerpiece is the GLOBE system • Enabling better science through • Real-time statistical assessments, interactive geovisualization tools • Scientific collaboration platform

  3. Land Change Science • Study of interaction between human systems, ecosystems, the atmosphere, and other Earth systems as mediated through human use of land • Cuts across many disciplines of social and natural science • Typified by this challenge: how to integrate and synthesize local studies into “globalized” results • Though GLOBE is targeted at land change scientists • The concept of representativeness is a very general concern • The GLOBE system is appropriate to any discipline engaged in synthesizing local studies into global results

  4. Representativeness • The degree to which a sample represents a global pattern • A converse to bias • A representative sample is not biased; a biased sample is not representative • Sampling bias: a typical criticism anywhere that samples are used to make inferences • A land change science example: • Are you representing only accessible sites? • Accessibility measured as travel time to a city (Nelson, 2008) • A measure of representativeness should be • Intuitive, understandable • Statistically sound

  5. Measures of Representativeness • Pearson’s Chi Square • Requires the variable space be discrete • Unreliable with small sample sizes • Kolmogorov-Smirnov Goodness-of-Fit Test • Does not require discrete space • Scaling and visualizing beyond 1d is hard • f-Divergence (Hellinger, Jensen-Shannon) • Requires discrete variable space

  6. Measures of Representativeness • Pearson’s Chi Square • Requires the variable space be discrete • Unreliable with small sample sizes • Kolmogorov-Smirnov • Does not require discrete space • Scaling and visualizing beyond 1d is hard • f-Divergence (Hellinger, Jensen-Shannon) • Requires discrete variable space • Probability Estimates • Chi Square – simple • Monte Carlo methods for the rest
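The three families of measures listed on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not the GLOBE implementation: the function names are hypothetical, and the Monte Carlo scheme (redrawing random samples of the same size from the population and counting how often they look at least as extreme as the observed sample) is an assumption about what "Monte Carlo methods for the rest" could mean in practice.

```python
import numpy as np

def chi_square_stat(sample_counts, pop_probs):
    """Pearson's chi-square statistic for binned (discrete) data."""
    sample_counts = np.asarray(sample_counts, dtype=float)
    expected = sample_counts.sum() * np.asarray(pop_probs, dtype=float)
    return float(((sample_counts - expected) ** 2 / expected).sum())

def ks_distance(sample, population):
    """Largest gap between the two empirical CDFs (continuous data)."""
    sample = np.sort(np.asarray(sample, dtype=float))
    population = np.sort(np.asarray(population, dtype=float))
    grid = np.concatenate([sample, population])
    ecdf_s = np.searchsorted(sample, grid, side="right") / sample.size
    ecdf_p = np.searchsorted(population, grid, side="right") / population.size
    return float(np.abs(ecdf_s - ecdf_p).max())

def hellinger(p, q):
    """Hellinger distance between two discrete distributions."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.sqrt(0.5 * ((np.sqrt(p) - np.sqrt(q)) ** 2).sum()))

def monte_carlo_p(stat_fn, sample, population, n_draws=500, seed=0):
    """Assumed scheme: share of random same-size draws from the population
    whose statistic is at least as large as the observed one."""
    rng = np.random.default_rng(seed)
    observed = stat_fn(sample, population)
    null = [
        stat_fn(rng.choice(population, size=len(sample), replace=False), population)
        for _ in range(n_draws)
    ]
    return (1 + sum(s >= observed for s in null)) / (n_draws + 1)
```

A small Monte Carlo p-value suggests the sample differs from the population more than random sampling alone would explain, which is the sense in which the slides speak of a biased sample.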

  7. Representativeness • Gives you • Quick metric for judging level of bias • Basis for comparing samples/sampling methods • A way to compute the probability of incorrectly concluding a sample is biased • Does not give you • Any guidance on where to look to address sampling bias • Any way to view this geographically

  8. Representedness • The degree to which a location or member of the population is represented by the collection • The complement of representativeness • Useful for visualization and analysis • Heat maps that show geographically where gaps lie • Can be used as a basis for case study search to fill study gaps

  9. Computing Representedness • Get datum for land unit (precipitation): 1573 mm/yr • Locate datum in global distribution • Compute representativeness for that value • Chi Square: p-value of χ² times sign of the difference between sample and population • KS Distance: difference in ECDF for population versus sample at unit datum

  10. Computing Representedness • Get datum for land unit: 49.2 m • Locate datum in global distribution • Compute representativeness for that value • Discrete: p-value of χ² times sign of the difference between sample and population • Continuous: difference in ECDF for population versus sample at unit datum • Compute RGB (heat map)
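The continuous branch of this pipeline (look up a land unit's datum, take the ECDF difference at that value, turn it into a heat-map color) might look roughly like the sketch below. The sign convention (sample minus population), the blue-white-red color ramp, and all function names are assumptions made for illustration; the slides do not specify them.

```python
import numpy as np

def ecdf_at(values, x):
    """Empirical CDF of `values`, evaluated at x (scalar or array)."""
    values = np.sort(np.asarray(values, dtype=float))
    return np.searchsorted(values, x, side="right") / values.size

def representedness(unit_data, sample, population):
    """ECDF difference at each land unit's datum.
    Assumed sign: sample ECDF minus population ECDF."""
    unit_data = np.asarray(unit_data, dtype=float)
    return ecdf_at(sample, unit_data) - ecdf_at(population, unit_data)

def to_rgb(score):
    """Assumed diverging ramp: -1 is blue, 0 is white, +1 is red."""
    s = np.clip(np.asarray(score, dtype=float), -1.0, 1.0)
    red = 1.0 - np.maximum(-s, 0.0)
    green = 1.0 - np.abs(s)
    blue = 1.0 - np.maximum(s, 0.0)
    return np.stack([red, green, blue], axis=-1)
```

Applying `to_rgb(representedness(...))` to the datum of every grid cell would give one color per cell, which is the raw material for a heat map.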

  11. Addressing Bias • Study Gap Search • Identify areas where density in population is significantly higher than sample • Search case database using that criterion • Additional criteria available (fts, metadata) • Case Weighting • Addresses biases in statistical analysis by • Over-weighting (> 1.0) cases in under-represented areas • Under-weighting (< 1.0) cases in over-represented areas • Computed using representedness
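One simple way to realize the case-weighting idea is a density ratio per bin: a case's weight is the population share of its bin divided by the sample share of that bin. The slides only say that weights are computed using representedness, so the formula, the binning, and the name `case_weights` are assumptions for this sketch.

```python
import numpy as np

def case_weights(sample, population, bins=10):
    """Weight each case by (population share) / (sample share) of its bin.
    Weights above 1.0 mark cases from under-represented bins."""
    sample = np.asarray(sample, dtype=float)
    population = np.asarray(population, dtype=float)
    edges = np.histogram_bin_edges(population, bins=bins)
    pop_share = np.histogram(population, bins=edges)[0] / population.size
    # Interior edges only, so out-of-range cases fall into the end bins.
    idx = np.digitize(sample, edges[1:-1])
    samp_share = np.bincount(idx, minlength=len(edges) - 1) / sample.size
    return pop_share[idx] / samp_share[idx]
```

For example, with a population spread evenly over two bins and a sample that puts three of four cases in the first bin, the lone case in the second bin receives weight 2.0 and the other three receive about 0.67.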

  12. The GLOBE Application • Our platform for better Land Change Science • By improving workflows • As a social/collaborative platform • Formally introduced at the GLP OSM in March 2014 • Features • Allows researchers to create and manage case studies and their geometry • Integrates global data layers to augment user cases • Provides real-time analytics and visual tools • Similarity search • Representativeness analysis

  13. Global Data • Organized into a Discrete Global Grid [Sahr, White, and Kimerling, 2003] • ISEA Aperture 3, Hexagonal • 1.5M 96 km² equal-area hexagons at resolution 12 (native GLOBE resolution) • Downsampled grid at resolution 10 (863.8 km²) for approximate calculations • Currently 75 global variables; variables can be processed and submitted to GLOBE • Human, remote sensing, biological, surface, climate
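The two cell areas quoted on this slide are consistent with an aperture 3 grid, in which each step up in resolution divides the cell area by 3. The short check below derives the resolution 12 area from the resolution 10 figure; the function name and its default arguments are illustrative only.

```python
APERTURE = 3  # each finer resolution has cells one third the area

def cell_area_km2(resolution, ref_resolution=10, ref_area_km2=863.8):
    """Cell area at `resolution`, scaled from a known reference resolution."""
    return ref_area_km2 * APERTURE ** (ref_resolution - resolution)

# Two resolution steps finer means a factor of 3 * 3 = 9 in area.
area_res12 = cell_area_km2(12)  # roughly 96 km², matching the slide
```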

  14. GLOBE Cases

  15. GLOBE Cases • GLOBE GES team has georeferenced and entered 630 cases • Currently a total of 927 georeferenced, completed cases

  16. Similarity Assessment

  17. Representativeness Analysis – Monte Carlo

  18. Representativeness Analysis – χ²

  19. Representativeness Analysis – Gap Search

  20. In Summary • Representativeness is an issue anywhere inferences are made from samples • Representedness is a companion piece that enables geovisualization and gap search • Can be implemented many ways • Classical hypothesis test (χ²) • Monte Carlo methods: f-divergence, KS-distance • GLOBE application enables representativeness workflow for land change science • Real-time assessment & visualizations • Gap search and case weighting

  21. In the Pipeline • Multidimensional Analysis • Quantifying the impact of data scarcity (small sample size) • Heuristic tools for guiding the user • Improved visual tools • Dimensionality reduction • Identifying if and when it is possible • Automated exploratory analysis • Helping the user to identify what analysis they should be running

  22. Thanks! • Visit us at http://globe.umbc.edu

  23. Representativeness Analysis – KS

  24. Conceptual Overview • Global Data: discrete global grid • GLOBE GCE: analytical & computational engine • GLOBE Web App: visual & interactive tools • GLOBE Cases: geography + data
