SciScope: A Data Discovery and Retrieval Tool for Environmental Science

Bora Beran, Catharine van Ingen

Microsoft Research


It is often useful and sometimes necessary for scientists and engineers to work with data from multiple sources… Why?


For data variety

USGS

NOAA


For better spatial coverage

  • STORET has 758 sites in Texas, TCEQ has 8407.

  • STORET has 47,602 sites in Florida, NWIS has 27,906.

  • NWIS has 121,545 sites in Minnesota, STORET has 22,260.

TCEQ data from David Maidment, UTX-Austin


For better temporal coverage

Figure: Nitrogen, with periods 1957-1977, 1977-2003 and 2003-2007


Data Discovery/Retrieval today

1 month later

What sites are in Russian River basin?

Rainfall vs. Runoff in Russian River


What do we need?

A search engine that creates a unified view over multiple heterogeneous data repositories, allowing scientists to discover and retrieve data in a simple and intuitive way.

In technical terms:

  • A searchable metadata repository/aggregator

  • A mediator (semantics/syntax/structure)

  • A light-weight web GIS


www.sciscope.org

+ Download data

Automated download

+ Add to favorites


SciScope Demo


SciScope Stack


Microsoft Virtual Earth

  • A free JavaScript based mapping tool

Geographical Features Catalog

  • Collection of features such as dams, aquifers, geologic formations, watersheds, sensors

  • Based on data and maps from USGS, EPA, National Atlas

Metadata Repository

  • Contains information on where, when (time frame) and what is being measured for about 1.7 million sites in the US

  • Scraped/crawled on a regular basis
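
As a rough illustration of a "where, when, what" metadata record and how it could be searched, here is a minimal Python sketch. The slides do not describe the repository's schema, so the `SiteRecord` fields and the `search` function are assumptions made for this example only.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Tuple


@dataclass(frozen=True)
class SiteRecord:
    """Hypothetical metadata record: where, when and what a site measures."""
    site_id: str
    source: str                 # publishing repository
    lat: float                  # where
    lon: float
    first_year: int             # when (time frame)
    last_year: int
    variables: FrozenSet[str]   # what is being measured


def search(
    records: Iterable[SiteRecord],
    bbox: Tuple[float, float, float, float],  # (south, west, north, east)
    variable: str,
    years: Tuple[int, int],                   # (start, end), inclusive
) -> List[SiteRecord]:
    """Return sites inside bbox that measure `variable` during `years`."""
    south, west, north, east = bbox
    start, end = years
    return [
        r for r in records
        if south <= r.lat <= north
        and west <= r.lon <= east
        and variable in r.variables
        # Two year ranges overlap when each one starts before the other ends.
        and r.first_year <= end
        and r.last_year >= start
    ]
```

Only metadata is touched here; the matching records would then tell a retrieval step which publisher to ask for the actual observations.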


Knowledge base

  • Relationships are stored as RDF triples in a relational database

  • Supports transitive, symmetric and inverse properties

  • Inferred statements are pre-computed

‘Escherichia coli’ = ‘E. coli’

‘E. coli’ is-a ‘Indicator Organism’

‘Nitrogen’ is-a ‘Macronutrient’

‘Macronutrient’ is-a ‘Nutrient’

‘Hypoxia’ isMeasuredUsing ‘DissolvedOxygen’

‘Hypoxia’ isRelatedTo ‘Eutrophication’
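
One way to picture "RDF triples in a relational database" is a single three-column table. The sketch below uses Python's built-in sqlite3 module. The table name `triples`, its columns, the `inferred` flag and the `objects_of` helper are assumptions for illustration, not SciScope's actual schema.

```python
import sqlite3

# Hypothetical schema: one row per (subject, predicate, object) statement.
# The `inferred` flag separates asserted rows from pre-computed ones.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE triples ("
    " subject TEXT, predicate TEXT, object TEXT,"
    " inferred INTEGER DEFAULT 0,"
    " UNIQUE (subject, predicate, object))"
)

# Statements taken from the slide above.
asserted = [
    ("E. coli", "is-a", "Indicator Organism"),
    ("Nitrogen", "is-a", "Macronutrient"),
    ("Macronutrient", "is-a", "Nutrient"),
    ("Hypoxia", "isMeasuredUsing", "DissolvedOxygen"),
    ("Hypoxia", "isRelatedTo", "Eutrophication"),
]
conn.executemany(
    "INSERT INTO triples (subject, predicate, object) VALUES (?, ?, ?)",
    asserted,
)


def objects_of(subject, predicate):
    """Return every object stored for the given subject and predicate."""
    rows = conn.execute(
        "SELECT object FROM triples"
        " WHERE subject = ? AND predicate = ? ORDER BY object",
        (subject, predicate),
    )
    return [row[0] for row in rows]
```

Because inferred statements are pre-computed, a lookup such as `objects_of("Nitrogen", "is-a")` stays a plain indexed query with no reasoning at search time.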


Inference

  • Transitive

    ‘Nitrogen’ is-a ‘Macronutrient’

    ‘Macronutrient’ is-a ‘Nutrient’

    Inferred: ‘Nitrogen’ is-a ‘Nutrient’

  • Symmetric

    ‘Hypoxia’ isRelatedTo ‘Eutrophication’

    Inferred: ‘Eutrophication’ isRelatedTo ‘Hypoxia’

  • Inverse

    ‘Macronutrient’ is-a ‘Nutrient’

    Inferred: ‘Nutrient’ isBroaderThan ‘Macronutrient’
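
The three rules above can be pre-computed with a simple fixed-point loop. This is a minimal Python sketch under stated assumptions: which predicates are transitive, symmetric or inverse is taken from the slide's examples, and the `precompute` function is a generic forward-chaining routine, not the tool's actual implementation.

```python
# Property declarations, assumed from the examples on the slide.
TRANSITIVE = {"is-a"}
SYMMETRIC = {"isRelatedTo"}
INVERSE = {"is-a": "isBroaderThan", "isBroaderThan": "is-a"}


def precompute(asserted):
    """Apply the three rules repeatedly until no new statement appears."""
    facts = set(asserted)
    while True:
        new = set()
        for s, p, o in facts:
            if p in SYMMETRIC:
                new.add((o, p, s))
            if p in INVERSE:
                new.add((o, INVERSE[p], s))
            if p in TRANSITIVE:
                # Chain (s p o) with every (o p o2) to get (s p o2).
                for s2, p2, o2 in facts:
                    if p2 == p and s2 == o:
                        new.add((s, p, o2))
        if new <= facts:  # fixed point reached
            return facts
        facts |= new


closure = precompute([
    ("Nitrogen", "is-a", "Macronutrient"),
    ("Macronutrient", "is-a", "Nutrient"),
    ("Hypoxia", "isRelatedTo", "Eutrophication"),
])
```

The resulting `closure` contains the three inferred statements shown on the slide. It also contains statements that follow from combining rules, such as 'Nutrient' isBroaderThan 'Nitrogen'.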


Definitions and related material

  • Definitions of search terms from various glossaries

  • Pointers to relevant records in the Integrated Taxonomic Information System (ITIS) and the EPA Substance Registry System (SRS).


Data retrieval

  • SciScope currently hosts only metadata.

  • Data are requested on the fly from the original publisher using web service wrappers written specifically for each data source.

  • Data are reformatted to provide a unified view over the repositories.
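
The wrapper idea can be sketched as one small adapter per source, each converting that source's native response into a shared record type. Everything below is hypothetical: `source_a` and `source_b` and their payload formats are invented stand-ins, not the formats of any real publisher, and no network call is made.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Observation:
    """Unified record returned to the user, whatever the source."""
    site_id: str
    variable: str
    time: datetime
    value: float
    unit: str


def wrap_source_a(raw: str) -> List[Observation]:
    """Hypothetical source A: CSV lines 'site,variable,ISO time,value,unit'."""
    out = []
    for line in raw.strip().splitlines():
        site, variable, stamp, value, unit = line.split(",")
        out.append(
            Observation(site, variable, datetime.fromisoformat(stamp),
                        float(value), unit)
        )
    return out


def wrap_source_b(raw: List[dict]) -> List[Observation]:
    """Hypothetical source B: dicts with epoch seconds and other field names."""
    return [
        Observation(
            r["station"],
            r["param"],
            datetime.fromtimestamp(r["epoch"], tz=timezone.utc),
            float(r["val"]),
            r["units"],
        )
        for r in raw
    ]


# One wrapper per data source; adding a source means adding one entry.
WRAPPERS: Dict[str, Callable[..., List[Observation]]] = {
    "source_a": wrap_source_a,
    "source_b": wrap_source_b,
}


def retrieve(source: str, raw) -> List[Observation]:
    """Reformat a source's raw response into unified Observation records."""
    return WRAPPERS[source](raw)
```

In a live system the `raw` argument would come from a request to the original publisher made on the fly. Here it is passed in directly so the reformatting step can be read in isolation.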


Future outlook & potential use cases

  • SciScope as a data publishing/sharing platform

  • SciScope as a service

  • SciScope as a tool for general consumer use


Thank you

www.sciscope.org