DTC Archive: data repositories in the fight against diffuse pollution

Mark Hedges, Richard Gartner: King’s College London

Mike Haft, Hardy Schwamm: Freshwater Biological Association

Open Repositories 2012, Edinburgh, Scotland/UK, 10th July 2012

A message from our sponsors
  • Collaboration between the Freshwater Biological Association and King’s College London (Centre for e-Research)
  • Funded by Defra (Department for Environment, Food and Rural Affairs)
    • A UK government ministry
  • Runs from Jan. 2011 – Dec. 2014
Diffuse Pollution – what is it?
  • Pollution processes that:
    • Individually, have minimal effect
    • Cumulatively, have significant impact
  • Some examples:
    • Run-off of water/rain (e.g. from roads, commercial properties)
    • Farm fertilisers and waste
    • Seepage from developed landscapes
Water Framework Directive
  • What is an EU Directive?
    • An EU Directive is a European Union legal instruction or secondary European legislation which is binding on all Member States but which must be implemented through national legislation within a prescribed time-scale.
  • Water Framework Directive concerns water quality
  • Freshwater (rivers, lakes, groundwater) is adversely affected by diffuse pollution
  • Failure to comply means problems!
DTC Project
  • DTC = Demonstration Test Catchment
  • Investigate measures for reducing impact of diffuse water pollution on ecosystems
  • Evaluate the extent to which on-farm mitigation measures can reduce impact of water pollution on river ecology
    • cost-effectively
    • maintaining food production capacity
Defra Demonstration Test Catchments (DTCs)

3 catchment areas in England selected for tests

How does the DTC project work?
  • The procedure is (roughly speaking):
    • Monitor various environmental markers
    • Try out mitigation measures
    • Analyse changes in baseline trends of markers in response to these measures
  • All this produces a great variety of data
  • The DTCs create data; the DTC Archive project has to make it usable and useful!
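
The third step of the procedure, analysing changes in baseline trends, can be illustrated with a minimal sketch. It assumes a single marker sampled at regular intervals, fits a straight-line baseline to the readings taken before a mitigation measure, and reports how far the later readings sit from that extrapolated baseline. The function names and all numbers are hypothetical and purely illustrative; the slides do not describe the statistical methods the DTCs actually use.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def mean_departure_from_baseline(pre, post):
    """Mean difference between post-measure readings and the extrapolated
    pre-measure trend. `pre` and `post` are lists of (time, value) pairs.
    A negative result means the readings fell below the baseline trend."""
    slope, intercept = fit_line([t for t, _ in pre], [v for _, v in pre])
    return mean(v - (slope * t + intercept) for t, v in post)


# Made-up readings of some environmental marker (illustration only).
pre = [(0, 10.0), (1, 10.5), (2, 11.0), (3, 11.5)]  # before the measure
post = [(4, 11.0), (5, 11.5)]                       # after the measure
departure = mean_departure_from_baseline(pre, post)
```

A real analysis would also have to account for seasonality, weather and measurement uncertainty, which this sketch ignores.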

Equipment for data capture

Bank-side water-quality monitoring station

Drilling a borehole for monitoring groundwater

Images thanks to Wensum DTC

[Image from Wensum DTC: bank-side water-quality monitoring station, left-hand-side (LHS) and right-hand-side (RHS) views. Labelled components: mains power, nitrate probe, ammonium analyser, ISCO automatic water sampler, pump, flow cell, YSI multi-parameter sonde, Meteor telemetry unit, total P and total reactive P analyser]


Purpose of the archive
  • Curating data generated and captured by DTC projects
  • DTCs create data; we have to make it useful!
  • Data archive, but also querying, browsing, visualising, analysing, other interactions
  • Integrated views across diverse data
  • Must meet the needs of different users: researchers, but also land managers, civil servants, planners, ...
The Data
  • Mostly numerical in some form: spreadsheets, databases, CSV files
    • Sensor data (automated, telemetry)
    • Manual samples/analyses
  • Species/ecological data
  • Geo-data
  • Also less highly structured information:
    • Time series images, video
    • Stakeholder surveys
    • Unstructured documents
Example: water quality data

61,752 data points per year for all stations

Challenges of data
  • Not primarily an issue of scale
  • Datasets diverse in terms of structure
    • Different degrees of structuring:
      • Highly structured (e.g. sensor outputs)
      • Highly unstructured (e.g. surveys, interviews)
    • Different types of structure (tables of data, geospatial)
    • Some small, hand-crafted data sets
  • Idiosyncratic metadata, description, vocabularies
  • Varying provenance and reliability
INSPIRE
  • Another EU directive 
  • An Infrastructure for Spatial Information in the European Community
    • Create a European Spatial Data Infrastructure for improved sharing of spatial information
  • Includes standards for describing, representing, disseminating geo-spatial data, e.g.
    • Gemini2 for catalogue metadata
    • GML (Geography Markup Language)
  • Builds on ISO standards (ISO 19100 series)
Generic Data Model

ISO 19156: Observations and Measurements

Multiple Data Representations

Generic data model implemented in several ways for different purposes:

  • Archival representation
    • Based on library/archive standards
  • Data representation for data integration
    • “Atomic” representation as triples
  • Various derived representations
    • Generated for input to specific tools/analysis
Model for Integration
  • RDF triples
  • Atomic statements forming network of node/relations
  • Discrete datasets mapped into common format

[Diagram: a triple links a Subject to an Object through a predicate. Terms are identified by URIs; an object can also be a literal value. Example: Species memberOf Genus; hasCommonName "Water flea" (literal value)]

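
As a rough illustration of triples as atomic statements, the sketch below stores the example statements as plain Python tuples and looks them up by pattern. The namespace `http://example.org/dtc/`, the identifiers `species/1` and `genus/1`, and the helper `match` are hypothetical, and attaching the common name to the species node is an assumption; the slides do not give the archive's real URIs or tooling.

```python
# Hypothetical namespace; the archive's real URIs are not given in the slides.
EX = "http://example.org/dtc/"

# Each statement is one (subject, predicate, object) tuple.
# Subjects and predicates are URIs; objects are URIs or literal values.
triples = {
    (EX + "species/1", EX + "memberOf", EX + "genus/1"),
    (EX + "species/1", EX + "hasCommonName", "Water flea"),  # literal object
}


def match(store, s=None, p=None, o=None):
    """Return the triples that agree with every term that is not None."""
    return [
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Look up every common name recorded in the store.
common_names = [o for _, _, o in match(triples, p=EX + "hasCommonName")]
```

Because every dataset is reduced to the same three-part statements, statements from different sources can be merged into one set and queried together.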

Example dataset

English Lake District rainfall dataset – from FISH.Link project

[Diagram: entities and attributes of the example dataset]
  • Dataset
  • Actor
  • CollectionMethod
  • Site: Name
  • Tarn: Name
  • Location: GridReference, Easting, Northing, Latitude, Longitude
  • ObservationSet (two shown): About: Rainfall, Type: Raw, Unit: Inch
  • ObservationSet (two shown): About: Rainfall, Type: Derived, Unit: mm, DependsOn: OS1, OS2, Duration: 1Day
  • Observation (four shown): StartDate, EndDate, Value

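
The relationship between raw and derived observation sets can be sketched in code. The class and field names below follow the labels in the diagram, but the identifier `OS3`, the function `derive_mm` and the sample values are hypothetical. The slides do not say how the derived sets are computed from OS1 and OS2, so this sketch assumes a plain unit conversion from inches to millimetres (1 inch = 25.4 mm).

```python
from dataclasses import dataclass, field
from datetime import date

MM_PER_INCH = 25.4


@dataclass
class Observation:
    start_date: date
    end_date: date
    value: float


@dataclass
class ObservationSet:
    set_id: str
    about: str
    kind: str  # "Raw" or "Derived" (the diagram calls this Type)
    unit: str
    observations: list = field(default_factory=list)
    depends_on: list = field(default_factory=list)


def derive_mm(set_id, sources):
    """Build a derived set in mm from raw sets recorded in inches.
    Assumption: derivation is only a unit conversion."""
    derived = ObservationSet(set_id, "Rainfall", "Derived", "mm",
                             depends_on=[s.set_id for s in sources])
    for source in sources:
        if source.unit != "Inch":
            raise ValueError(f"expected inches, got {source.unit}")
        for obs in source.observations:
            derived.observations.append(
                Observation(obs.start_date, obs.end_date,
                            obs.value * MM_PER_INCH))
    return derived


day = date(2012, 1, 1)  # made-up date and values, for illustration only
os1 = ObservationSet("OS1", "Rainfall", "Raw", "Inch", [Observation(day, day, 0.5)])
os2 = ObservationSet("OS2", "Rainfall", "Raw", "Inch", [Observation(day, day, 1.0)])
os3 = derive_mm("OS3", [os1, os2])
```

Recording `depends_on` alongside the derived values keeps the provenance of each derived set traceable back to its raw sources.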

Dataset capture and mapping
  • Columns, concepts, entities mapped to formal vocabularies
  • Mappings defined in archive objects
  • Automated
      • e.g. sensor output files
  • Computer-assisted
      • e.g. some spreadsheets
  • Manual
      • by domain experts
      • e.g. mark up values in texts

Spreadsheet transformation workflow – from FISH.Link project
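
A minimal sketch of the automated case: column headers are mapped to vocabulary terms, and each mapped cell becomes one triple. The namespace, the column names, the predicate names and the function `rows_to_triples` are all hypothetical; the actual FISH.Link and DTC Archive mappings are defined in archive objects and are not reproduced in the slides.

```python
import csv
import io

EX = "http://example.org/dtc/"  # hypothetical namespace

# Hypothetical mapping from spreadsheet column headers to vocabulary terms.
COLUMN_TO_PREDICATE = {
    "Site": EX + "site",
    "Date": EX + "startDate",
    "Rain (in)": EX + "value",
}


def rows_to_triples(csv_text, mapping, row_prefix):
    """Turn each mapped, non-empty cell into a (subject, predicate, literal)
    triple. Each row gets its own subject URI; unmapped columns are skipped."""
    triples = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for i, row in enumerate(reader, start=1):
        subject = f"{row_prefix}{i}"
        for column, value in row.items():
            predicate = mapping.get(column)
            if predicate is not None and value:
                triples.append((subject, predicate, value))
    return triples


# Made-up sample file: "Notes" has no mapping, so it is ignored.
sample = "Site,Date,Rain (in),Notes\nExample Tarn,2012-01-01,0.5,\n"
triples = rows_to_triples(sample, COLUMN_TO_PREDICATE, EX + "observation/")
```

Keeping the mapping as data rather than code means a domain expert can adjust it for a new spreadsheet layout without touching the conversion logic.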

Architectural Overview

[Diagram: architectural layers, as labelled from top to bottom: Browsing, Visualisation, Search and Analysis; Mappings; RDF triples; Mappings; Archive Objects; Source datasets]


Current Status and Next Steps
  • Archive project started Jan. 2011, runs till end 2014.
  • Datasets are already being generated in large quantities.
  • Prototype functionality
  • Modelling and Ingestion of data (incremental)
  • Next steps:
    • Extend types of dataset covered.
    • User interactions (queries, visualisation etc.)