
DTC Archive: data repositories in the fight against diffuse pollution



DTC Archive: data repositories in the fight against diffuse pollution. Mark Hedges, Richard Gartner: King’s College London. Mike Haft, Hardy Schwamm: Freshwater Biological Association. Open Repositories 2012, Edinburgh, Scotland/UK, 10th July 2012.


Presentation Transcript



DTC Archive: data repositories in the fight against diffuse pollution

Mark Hedges, Richard Gartner: King’s College London

Mike Haft, Hardy Schwamm: Freshwater Biological Association

Open Repositories 2012, Edinburgh, Scotland/UK, 10th July 2012


A message from our sponsors


  • Collaboration between the Freshwater Biological Association and King’s College London (Centre for e-Research)

  • Funded by DEFRA (Department for Environment, Food and Rural Affairs)

    • A UK government ministry

  • Runs from Jan. 2011 – Dec. 2014



Background: water quality and the DTC project



Diffuse Pollution – what is it?

  • Pollution processes that:

    • Individually, have minimal effect

    • Cumulatively, have significant impact

  • Some examples:

    • Run-off of water/rain (e.g. from roads, commercial properties)

    • Farm fertilisers and waste

    • Seepage from developed landscapes



Catchments – what are they?



Water Framework Directive

  • What is an EU Directive?

    • An EU Directive is a European Union legal instruction or secondary European legislation which is binding on all Member States but which must be implemented through national legislation within a prescribed time-scale.

  • Water Framework Directive concerns water quality

  • Freshwater (rivers, lakes, groundwater) is adversely affected by diffuse pollution

  • Failure to comply means problems!



DTC Project

  • DTC = Demonstration Test Catchment

  • Investigate measures for reducing impact of diffuse water pollution on ecosystems

  • Evaluate the extent to which on-farm mitigation measures can reduce impact of water pollution on river ecology

    • cost-effectively

    • maintaining food production capacity



Defra Demonstration Test Catchments (DTCs)

3 catchment areas in England selected for tests



How does the DTC project work?

  • The procedure is (roughly speaking):

    • Monitor various environmental markers

    • Try out mitigation measures

    • Analyse changes in baseline trends of markers in response to these measures

  • All this produces a great variety of data

  • The DTCs create data, the DTC Archive project has to make it usable and useful!
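The monitor, mitigate, analyse loop above can be illustrated with a very rough sketch. It assumes a single marker (nitrate concentration), invented readings, and an invented start date for a mitigation measure; none of these come from the DTC project. Comparing two means is also far cruder than a real analysis of baseline trends, so treat this only as an illustration of the idea.

```python
from datetime import date
from statistics import mean

# Hypothetical readings: (sample date, nitrate concentration in mg/l).
# All values are invented for illustration.
readings = [
    (date(2011, 3, 1), 8.0),
    (date(2011, 6, 1), 9.0),
    (date(2011, 9, 1), 10.0),
    (date(2012, 3, 1), 7.0),
    (date(2012, 6, 1), 6.0),
    (date(2012, 9, 1), 5.0),
]


def mean_change(readings, measure_start):
    """Return (baseline mean, post-measure mean, difference) for one marker."""
    before = [value for day, value in readings if day < measure_start]
    after = [value for day, value in readings if day >= measure_start]
    if not before or not after:
        raise ValueError("need readings on both sides of the start date")
    baseline, post = mean(before), mean(after)
    return baseline, post, post - baseline


# Assumed (hypothetical) date on which a mitigation measure was introduced.
baseline, post, change = mean_change(readings, date(2012, 1, 1))
print(f"baseline={baseline} post={post} change={change}")
```

A negative change here would only suggest an improvement; seasonal effects, weather and measurement reliability would all need to be accounted for before drawing conclusions.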



Equipment for data capture

Bank-side water-quality monitoring station

Drilling a borehole for monitoring groundwater

Images thanks to Wensum DTC



Bank-side water-quality monitoring station [Image from Wensum DTC]

Labelled components (left-hand side and right-hand side views): mains power, nitrate probe, ammonium analyser, ISCO automatic water sampler, pump, flow cell, YSI multi-parameter sonde, Meteor telemetry unit, Total P and Total reactive P analyser.



DTC Archive



Purpose of the archive

  • Curating data generated and captured by DTC projects

  • DTCs create data, we have to make it useful!

  • Data archive, but also querying, browsing, visualising, analysing, other interactions

  • Integrated views across diverse data

  • Need to meet needs of different users – researchers, also land managers, civil servants, planners, ...



The Data

  • Mostly numerical in some form: spreadsheets, databases, CSV files

    • Sensor data (automated, telemetry)

    • Manual samples/analyses

  • Species/ecological data

  • Geo-data

  • Also less highly structured information:

    • Time series images, video

    • Stakeholder surveys

    • Unstructured documents



Example: water quality data

61,752 data points per year for all stations



Example: weather station data



Example: Field Use Data



Challenges of data

  • Not primarily an issue of scale

  • Datasets diverse in terms of structure

    • Different degrees of structuring:

      • Highly structured (e.g. sensor outputs)

      • Highly unstructured (e.g. surveys, interviews)

    • Different types of structure (tables of data, geospatial)

    • Some small, hand-crafted data sets

  • Idiosyncratic metadata, description, vocabularies

  • Varying provenance and reliability


INSPIRE

  • Another EU directive

  • An Infrastructure for Spatial Information in the European Community

    • Create a European Spatial Data Infrastructure for improved sharing of spatial information

  • Includes standards for describing, representing, disseminating geo-spatial data, e.g.

    • Gemini2 for catalogue metadata

    • GML (Geography Markup Language)

  • Builds on ISO standards (ISO 19100 series)


Generic Data Model

ISO 19156: Observations and Measurements


Multiple Data Representations

Generic data model implemented in several ways for different purposes:

  • Archival representation

    • based on library/archive standards

  • Data representation for data integration

    • “Atomic” representation as triples

  • Various derived representations

    • Generated for input to specific tools/analysis


Archival Data Representation


Model for Integration

  • RDF triples

  • Atomic statements forming network of node/relations

  • Discrete datasets mapped into common format

[Diagram: a Subject is linked to an Object by a predicate, with resources identified by URIs. Example: Species memberOf Genus; hasCommonName “Water flea” (a literal value).]
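As a minimal sketch of the triple model, the example from the diagram can be written as plain Python tuples with a small pattern matcher. Everything specific here is an assumption for illustration: the `http://example.org/` URIs are placeholders, the choice of Daphnia pulex as the species is mine, and attaching `hasCommonName` to the species (rather than the genus) is a guess at what the diagram shows. A real system would more likely use an RDF library and a triple store.

```python
# All URIs below are hypothetical placeholders, not identifiers
# used by the DTC Archive.
EX = "http://example.org/"

# Each triple is (subject, predicate, object). Subjects and predicates are
# URIs; an object is either a URI or a literal value (a plain string here).
triples = {
    (EX + "species/daphnia-pulex", EX + "memberOf", EX + "genus/daphnia"),
    (EX + "species/daphnia-pulex", EX + "hasCommonName", "Water flea"),
}


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (obj is None or t[2] == obj)
    )


# Find every common name recorded in the graph.
common_names = [o for _, _, o in match(triples, predicate=EX + "hasCommonName")]
print(common_names)
```

Because every dataset is reduced to the same three-part statements, a query such as `match` does not need to know which spreadsheet or sensor file a statement originally came from.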


Example dataset

[Diagram: English Lake District rainfall dataset, from FISH.Link project]

Labels shown in the diagram:

  • Entities and attributes: Dataset, Site, Actor, Tarn, Name (twice), CollectionMethod, Location, GridReference, Easting, Northing, Latitude, Longitude

  • ObservationSet (shown twice): About: Rainfall; Type: Raw; Unit: Inch

  • ObservationSet (shown twice): About: Rainfall; Type: Derived; Unit: mm; DependsOn: OS1, OS2; Duration: 1Day

  • Observation (shown four times): StartDate, EndDate, Value
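The relationship between raw and derived observation sets can be sketched with a few dataclasses. The field names follow the diagram labels, but the class layout, the sample dates and values, and the rule for combining inputs are all assumptions: the slide does not say how a derived set is computed from OS1 and OS2, so this sketch simply converts each raw observation from inches to millimetres and records the dependency.

```python
from dataclasses import dataclass, field
from datetime import date

MM_PER_INCH = 25.4  # exact by definition of the international inch


@dataclass
class Observation:
    start_date: date
    end_date: date
    value: float


@dataclass
class ObservationSet:
    set_id: str
    about: str
    kind: str  # "Raw" or "Derived" (the diagram calls this "Type")
    unit: str
    observations: list = field(default_factory=list)
    depends_on: list = field(default_factory=list)


def derive_mm(set_id, raw_sets):
    """Build a derived set in mm from raw sets recorded in inches."""
    derived = ObservationSet(set_id, "Rainfall", "Derived", "mm",
                             depends_on=[s.set_id for s in raw_sets])
    for raw in raw_sets:
        if raw.unit != "Inch":
            raise ValueError(f"{raw.set_id}: expected Inch, got {raw.unit}")
        for obs in raw.observations:
            derived.observations.append(
                Observation(obs.start_date, obs.end_date,
                            obs.value * MM_PER_INCH))
    return derived


# Invented sample readings, one day each.
os1 = ObservationSet("OS1", "Rainfall", "Raw", "Inch",
                     [Observation(date(1950, 1, 1), date(1950, 1, 1), 0.5)])
os2 = ObservationSet("OS2", "Rainfall", "Raw", "Inch",
                     [Observation(date(1950, 1, 2), date(1950, 1, 2), 1.0)])
os3 = derive_mm("OS3", [os1, os2])
print(os3.depends_on, [round(o.value, 1) for o in os3.observations])
```

Keeping `depends_on` on the derived set preserves provenance: a user looking at millimetre values can trace them back to the raw inch readings they came from.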


Dataset capture and mapping

  • Columns, concepts, entities mapped to formal vocabularies

  • Mappings defined in archive objects

  • Automated

    • e.g. sensor output files

  • Computer-assisted

    • e.g. some spreadsheets

  • Manual

    • by domain experts

    • e.g. mark up values in texts

[Figure: Spreadsheet transformation workflow, from FISH.Link project]
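An automated column mapping of the kind listed above might look roughly like the following sketch. The vocabulary URIs, the column names, the sample CSV and the shape of the mapping are all hypothetical; the slides do not describe the format in which the archive stores its mappings.

```python
import csv
import io

# Hypothetical vocabulary namespace, for illustration only.
VOCAB = "http://example.org/vocab/"

# Mapping from source column name to formal vocabulary terms.
COLUMN_MAPPING = {
    "rain_in": {"about": VOCAB + "Rainfall", "unit": VOCAB + "Inch"},
    "temp_c": {"about": VOCAB + "AirTemperature",
               "unit": VOCAB + "DegreeCelsius"},
}

# Invented sample of a sensor output file.
SAMPLE_CSV = """date,rain_in,temp_c
2012-07-01,0.25,14.5
2012-07-02,0.00,16.0
"""


def rows_to_statements(text, mapping, date_column="date"):
    """Turn each mapped cell into a statement using vocabulary terms."""
    statements = []
    for row in csv.DictReader(io.StringIO(text)):
        for column, terms in mapping.items():
            statements.append({
                "date": row[date_column],
                "about": terms["about"],
                "unit": terms["unit"],
                "value": float(row[column]),
            })
    return statements


statements = rows_to_statements(SAMPLE_CSV, COLUMN_MAPPING)
for statement in statements:
    print(statement)
```

Keeping the mapping as data rather than code means the same function can serve many file layouts; computer-assisted and manual capture would differ mainly in who produces or corrects the mapping.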


Architectural Overview

[Diagram. Labels shown: Browsing, Visualisation, Search, Analysis; RDF triples; Mappings (twice); Archive Objects; Source datasets]


Current Status and Next Steps

  • Archive project started Jan. 2011, runs till end 2014.

  • Datasets are already being generated in large quantities.

  • Prototype functionality

  • Modelling and ingestion of data (incremental)

  • Next steps:

    • Extend types of dataset covered.

    • User interactions (queries, visualisation etc.)


Thank you

[email protected]

[email protected]

http://dtcarchive.org/

