DTC Archive: data repositories in the fight against diffuse pollution


DTC Archive: data repositories in the fight against diffuse pollution. Mark Hedges, Richard Gartner: King’s College London. Mike Haft, Hardy Schwamm: Freshwater Biological Association. Open Repositories 2012, Edinburgh, Scotland/UK, 10th July 2012.



DTC Archive: data repositories in the fight against diffuse pollution

Mark Hedges, Richard Gartner: King’s College London

Mike Haft, Hardy Schwamm: Freshwater Biological Association

Open Repositories 2012, Edinburgh, Scotland/UK, 10th July 2012


A message from our sponsors

  • Collaboration between the Freshwater Biological Association and King’s College London (Centre for e-Research)

  • Funded by DEFRA (Department for the Environment, Food and Rural Affairs)

    • A UK government ministry

  • Runs from Jan. 2011 – Dec. 2014


Background: water quality and the DTC project


Diffuse Pollution – what is it?

  • Pollution processes that:

    • Individually, have minimal effect

    • Cumulatively, have significant impact

  • Some examples:

    • Run-off of water/rain (e.g. from roads, commercial properties)

    • Farm fertilisers and waste

    • Seepage from developed landscapes


Catchments – what are they?


Water Framework Directive

  • What is an EU Directive?

    • An EU Directive is a European Union legal instruction or secondary European legislation which is binding on all Member States but which must be implemented through national legislation within a prescribed time-scale.

  • Water Framework Directive concerns water quality

  • Freshwater (rivers, lakes, groundwater) is adversely affected by diffuse pollution

  • Failure to comply means problems!


DTC Project

  • DTC = Demonstration Test Catchment

  • Investigate measures for reducing impact of diffuse water pollution on ecosystems

  • Evaluate the extent to which on-farm mitigation measures can reduce impact of water pollution on river ecology

    • cost-effectively

    • maintaining food production capacity


Defra Demonstration Test Catchments (DTCs)

3 catchment areas in England selected for tests


How does the DTC project work?

  • The procedure is (roughly speaking):

    • Monitor various environmental markers

    • Try out mitigation measures

    • Analyse changes in baseline trends of markers in response to these measures

  • All this produces a great variety of data

  • The DTCs create data, the DTC Archive project has to make it usable and useful!
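The "analyse changes in baseline trends" step can be illustrated with a minimal sketch. It assumes a marker is sampled at regular intervals and that the trend is summarised as a least-squares slope; the function names, the nitrate readings and the start index of the measure are all hypothetical and are not taken from the DTC project.

```python
from statistics import mean


def slope(values):
    """Least-squares slope of values against their index (0, 1, 2, ...)."""
    n = len(values)
    x_bar = (n - 1) / 2
    y_bar = mean(values)
    num = sum((i - x_bar) * (v - y_bar) for i, v in enumerate(values))
    den = sum((i - x_bar) ** 2 for i in range(n))
    return num / den


def trend_change(series, measure_index):
    """Difference between the trend after and before a measure starts."""
    before = series[:measure_index]
    after = series[measure_index:]
    return slope(after) - slope(before)


# Hypothetical weekly nitrate readings (mg/l); the measure starts at week 4.
nitrate = [10.0, 10.5, 11.0, 11.5, 11.5, 11.0, 10.5, 10.0]
change = trend_change(nitrate, 4)
print(f"Change in weekly trend: {change:+.2f} mg/l per week")
```

A negative value here means the marker was rising more slowly, or falling, after the measure began. A real analysis would also need to account for weather, seasonality and measurement error.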


Equipment for data capture

Bank-side water-quality monitoring station

Drilling a borehole for monitoring groundwater

Images thanks to Wensum DTC


Bank-side water-quality monitoring station, LHS and RHS views [Image from Wensum DTC]. Labelled components: mains power, nitrate probe, ammonium analyser, ISCO automatic water sampler, pump, flow cell, YSI multi-parameter sonde, Meteor telemetry unit, total P and total reactive P analyser.


DTC Archive


Purpose of the archive

  • Curating data generated and captured by DTC projects

  • DTCs create data, we have to make it useful!

  • Data archive, but also querying, browsing, visualising, analysing, other interactions

  • Integrated views across diverse data

  • Need to meet needs of different users – researchers, also land managers, civil servants, planners, ...


The Data

  • Mostly numerical in some form: spreadsheets, databases, CSV files

    • Sensor data (automated, telemetry)

    • Manual samples/analyses

  • Species/ecological data

  • Geo-data

  • Also less highly structured information:

    • Time series images, video

    • Stakeholder surveys

    • Unstructured documents


Example: water quality data

61,752 data points per year for all stations


Example: weather station data


Example: Field Use Data


Challenges of data

  • Not primarily an issue of scale

  • Datasets diverse in terms of structure

    • Different degrees of structuring:

      • Highly structured (e.g. sensor outputs)

      • Highly unstructured (e.g. surveys, interviews)

    • Different types of structure (tables of data, geospatial)

    • Some small, hand-crafted data sets

  • Idiosyncratic metadata, description, vocabularies

  • Varying provenance and reliability


INSPIRE

  • Another EU directive

  • An Infrastructure for Spatial Information in the European Community

    • Create a European Spatial Data Infrastructure for improved sharing of spatial information

  • Includes standards for describing, representing, disseminating geo-spatial data, e.g.

    • Gemini2 for catalogue metadata

    • GML (Geography Markup Language)

  • Builds on ISO standards (ISO 19100 series)


Generic Data Model

ISO 19156: Observations and Measurements


Multiple Data Representations

Generic data model implemented in several ways for different purposes:

  • Archival representation

    • based on library/archive standards

  • Data representation for data integration

    • “Atomic” representation as triples

  • Various derived representations

    • Generated for input to specific tools/analysis


Archival Data Representation


Model for Integration

  • RDF triples

  • Atomic statements forming network of nodes/relations

  • Discrete datasets mapped into common format

[Diagram: a Subject linked to an Object by a predicate, with nodes and predicates identified by URIs. Example: a Species is linked to a Genus by memberOf, and to the literal value "Water flea" by hasCommonName.]
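The subject/predicate/object structure above can be sketched in a few lines of plain Python. This is only an illustration of the triple model, not the archive's implementation: the `http://example.org/` namespace, the node identifiers and the `objects` helper are hypothetical, and a real system would more likely use an RDF library and a triple store.

```python
from typing import List, NamedTuple, Union


class URI(str):
    """A node or predicate identified by a URI (as opposed to a literal)."""


class Triple(NamedTuple):
    subject: URI
    predicate: URI
    obj: Union[URI, str]  # another node (URI) or a plain literal value


EX = "http://example.org/"  # hypothetical namespace, for illustration only

species = URI(EX + "species/1")
genus = URI(EX + "genus/1")
member_of = URI(EX + "memberOf")
has_common_name = URI(EX + "hasCommonName")

store: List[Triple] = [
    Triple(species, member_of, genus),                # node-to-node relation
    Triple(species, has_common_name, "Water flea"),   # node-to-literal relation
]


def objects(triples, subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [t.obj for t in triples
            if t.subject == subject and t.predicate == predicate]


print(objects(store, species, has_common_name))
```

Because every statement has the same three-part shape, datasets with very different original layouts can be merged simply by concatenating their triples.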


Example dataset

English Lake District rainfall dataset (from the FISH.Link project)

[Diagram: node labels include Tarn, Name, CollectionMethod, Location, GridReference, Easting, Northing, Latitude, Longitude, Dataset, Site, Actor, ObservationSet and Observation.]

  • Two ObservationSets with About: Rainfall, Type: Raw, Unit: Inch

  • Two ObservationSets with About: Rainfall, Type: Derived, Unit: mm, DependsOn: OS1, OS2, Duration: 1Day

  • Four Observations, each with StartDate, EndDate, Value
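The relationship between raw and derived observation sets can be sketched as follows. The field names mirror the labels in the example (About, Type, Unit, DependsOn, Duration, StartDate, EndDate, Value), but the classes, the `derive_mm` helper and the sample values are hypothetical. The slides do not say how the derived sets are computed from OS1 and OS2; this sketch assumes a plain inch-to-millimetre conversion of every raw observation.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List

MM_PER_INCH = 25.4


@dataclass
class Observation:
    start_date: date
    end_date: date
    value: float


@dataclass
class ObservationSet:
    name: str
    about: str
    kind: str  # "Raw" or "Derived"
    unit: str
    observations: List[Observation] = field(default_factory=list)
    depends_on: List[str] = field(default_factory=list)
    duration: str = ""


def derive_mm(name, sources):
    """Build a derived set in mm from raw sets recorded in inches.

    Assumption: each raw observation is converted one-to-one.
    """
    derived = ObservationSet(
        name=name, about=sources[0].about, kind="Derived", unit="mm",
        depends_on=[s.name for s in sources], duration="1Day",
    )
    for source in sources:
        if source.unit != "Inch":
            raise ValueError(f"{source.name}: expected inches, got {source.unit}")
        for obs in source.observations:
            derived.observations.append(
                Observation(obs.start_date, obs.end_date,
                            round(obs.value * MM_PER_INCH, 2))
            )
    return derived


# Hypothetical raw readings for two consecutive days.
os1 = ObservationSet("OS1", "Rainfall", "Raw", "Inch",
                     [Observation(date(2012, 7, 9), date(2012, 7, 10), 0.5)])
os2 = ObservationSet("OS2", "Rainfall", "Raw", "Inch",
                     [Observation(date(2012, 7, 10), date(2012, 7, 11), 1.0)])
os3 = derive_mm("OS3", [os1, os2])
print(os3.depends_on, [o.value for o in os3.observations])
```

Recording `depends_on` alongside the derived values keeps the provenance explicit, so a user can trace a figure in millimetres back to the raw readings it came from.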


Dataset capture and mapping

  • Columns, concepts, entities mapped to formal vocabularies

  • Mappings defined in archive objects

  • Automated

    • e.g. sensor output files

  • Computer-assisted

    • e.g. some spreadsheets

  • Manual

    • by domain experts

    • e.g. mark up values in texts

Spreadsheet transformation workflow (from the FISH.Link project)
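An automated mapping of the kind described above might look roughly like the sketch below, which turns the columns of a CSV file into triples using a column-to-vocabulary table. Everything specific here is an assumption made for illustration: the vocabulary URIs, the column names, the `rows_to_triples` function and the sample rows are hypothetical and do not come from the DTC Archive or FISH.Link.

```python
import csv
import io

VOCAB = "http://example.org/vocab/"  # hypothetical vocabulary namespace

# Mapping from spreadsheet column headers to vocabulary terms.
COLUMN_MAP = {
    "site": VOCAB + "siteName",
    "date": VOCAB + "startDate",
    "rain_in": VOCAB + "rainfallInches",
}


def rows_to_triples(csv_text, column_map, base="http://example.org/obs/"):
    """Emit one (subject, predicate, value) triple per mapped, non-empty cell."""
    triples = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=1):
        subject = f"{base}{row_number}"  # one subject URI per row
        for column, predicate in column_map.items():
            value = row.get(column)
            if value:  # skip missing or empty cells
                triples.append((subject, predicate, value))
    return triples


sample = "site,date,rain_in\nTarn A,2012-07-10,0.5\nTarn A,2012-07-11,\n"
triples = rows_to_triples(sample, COLUMN_MAP)
for triple in triples:
    print(triple)
```

Keeping the mapping as data rather than code is what would allow it to be stored in an archive object next to the source dataset and reapplied when the file is updated.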


Architectural Overview

[Diagram labels, in the order shown: Browsing, Visualisation, Search, Analysis; Mappings; RDF triples; Mappings; Archive Objects; Source datasets]


Current Status and Next Steps

  • Archive project started Jan. 2011, runs until the end of 2014

  • Datasets are already being generated in large quantities

  • Prototype functionality

  • Modelling and ingestion of data (incremental)

  • Next steps:

    • Extend types of dataset covered

    • User interactions (queries, visualisation etc.)


Thank you

[email protected]

[email protected]

http://dtcarchive.org/

