


DTC Archive: data repositories in the fight against diffuse pollution

Mark Hedges, Richard Gartner: King’s College London

Mike Haft, Hardy Schwamm: Freshwater Biological Association

Open Repositories 2012, Edinburgh, Scotland/UK, 10th July 2012


A message from our sponsors

  • Collaboration between the Freshwater Biological Association and King’s College London (Centre for e-Research)

  • Funded by DEFRA (Department for the Environment, Food and Rural Affairs)

    • A UK government ministry

  • Runs from Jan. 2011 – Dec. 2014



Diffuse Pollution – what is it?

  • Pollution processes that:

    • Individually, have minimal effect

    • Cumulatively, have significant impact

  • Some examples:

    • Run-off of water/rain (e.g. from road, commercial properties)

    • Farm fertilisers and waste

    • Seepage from developed landscapes



Water Framework Directive

  • What is an EU Directive?

    • An EU Directive is a European Union legal instruction or secondary European legislation which is binding on all Member States but which must be implemented through national legislation within a prescribed time-scale.

  • Water Framework Directive concerns water quality

  • Freshwater (rivers, lakes, groundwater) adversely affected by diffuse pollution

  • Failure to comply means problems!


DTC Project

  • DTC = Demonstration Test Catchment

  • Investigate measures for reducing impact of diffuse water pollution on ecosystems

  • Evaluate the extent to which on-farm mitigation measures can reduce impact of water pollution on river ecology

    • cost-effectively

    • maintaining food production capacity


Defra Demonstration Test Catchments (DTCs)

3 catchment areas in England selected for tests


How does the DTC project work?

  • The procedure is (roughly speaking):

    • Monitor various environmental markers

    • Try out mitigation measures

    • Analyse changes in baseline trends of markers in response to these measures

  • All this produces a great variety of data

  • The DTCs create data, the DTC Archive project has to make it usable and useful!
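
The "analyse changes in baseline trends" step can be illustrated with a minimal Python sketch. It assumes that a marker is a plain list of equally spaced readings and that "trend" means a least-squares slope; the slides do not say how the DTCs actually analyse trends. The function names are hypothetical and the readings are invented for illustration only.

```python
from statistics import mean


def slope(values):
    """Least-squares slope of values against their index (0, 1, 2, ...)."""
    xs = range(len(values))
    x_bar, y_bar = mean(xs), mean(values)
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den


def trend_change(baseline, after_measure):
    """Post-measure trend minus baseline trend (negative means improvement here)."""
    return slope(after_measure) - slope(baseline)


# Hypothetical nitrate readings (mg/l); not real DTC data.
baseline = [5.0, 5.2, 5.4, 5.6]        # rising before the mitigation measure
after_measure = [5.6, 5.5, 5.4, 5.3]   # falling after the mitigation measure

change = trend_change(baseline, after_measure)
print(f"Change in trend: {change:+.2f} mg/l per reading")
```

A real analysis would also need to account for seasonality and weather, which this sketch ignores.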


Equipment for data capture

Bank-side water-quality monitoring station

Drilling a borehole for monitoring groundwater

Images thanks to Wensum DTC


Bank-side water-quality monitoring station [Image from Wensum DTC]

[Figure: LHS and RHS views of the station. Labelled components: mains power, nitrate probe, ammonium analyser, ISCO automatic water sampler, pump, flow cell, YSI multi-parameter sonde, Meteor telemetry unit, total P and total reactive P analyser.]


DTC Archive


Purpose of the archive

  • Curating data generated and captured by DTC projects

  • DTCs create data, we have to make it useful!

  • Data archive, but also querying, browsing, visualising, analysing, other interactions

  • Integrated views across diverse data

  • Need to meet needs of different users – researchers, also land managers, civil servants, planners, ...


The Data

  • Mostly numerical in some form: spreadsheets, databases, CSV files

    • Sensor data (automated, telemetry)

    • Manual samples/analyses

  • Species/ecological data

  • Geo-data

  • Also less highly structured information:

    • Time series images, video

    • Stakeholder surveys

    • Unstructured documents


Example: water quality data

61,752 data points per year for all stations




Challenges of data

  • Not primarily an issue of scale

  • Datasets diverse in terms of structure

    • Different degrees of structuring:

      • Highly structured (e.g. sensor outputs)

      • Highly unstructured (e.g. surveys, interviews)

    • Different types of structure (tables of data, geospatial)

    • Some small, hand-crafted data sets.

  • Idiosyncratic metadata, description, vocabularies

  • Varying provenance and reliability


INSPIRE

  • Another EU directive

  • An Infrastructure for Spatial Information in the European Community

    • Create a European Spatial Data Infrastructure for improved sharing of spatial information

  • Includes standards for describing, representing, disseminating geo-spatial data, e.g.

    • Gemini2 for catalogue metadata

    • GML (Geography Markup Language)

  • Builds on ISO standards (ISO 19100 series)


Generic Data Model

ISO 19156: Observations and Measurements


Multiple Data Representations

Generic data model implemented in several ways for different purposes:

  • Archival representation

    • based on library/archive standards

  • Data representation for data integration

    • “Atomic” representation as triples

  • Various derived representations

    • Generated for input to specific tools/analysis
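
As an illustration of a "derived representation", the following Python sketch pivots triples into a CSV table that a spreadsheet or statistics tool could read. It is a sketch under assumptions: the slides do not describe how derived representations are generated, and the function name, predicate names and sample values are all hypothetical.

```python
import csv
import io


def triples_to_csv(triples, predicates):
    """Pivot (subject, predicate, object) triples into one CSV row per subject."""
    rows = {}
    for s, p, o in triples:
        if p in predicates:                      # ignore predicates the tool does not need
            rows.setdefault(s, {})[p] = o
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=["subject", *predicates], lineterminator="\n"
    )
    writer.writeheader()
    for s, cells in rows.items():
        writer.writerow({"subject": s, **cells})  # missing cells are left empty
    return out.getvalue()


# Hypothetical triples; identifiers and values are invented for illustration.
sample = [
    ("obs1", "value", "12.7"),
    ("obs1", "unit", "mm"),
    ("obs2", "value", "3.0"),
]
table = triples_to_csv(sample, ["value", "unit"])
print(table)
```

The triples stay the single source for integration, and each tool-specific table is regenerated from them when needed.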



Model for Integration

  • RDF triples

  • Atomic statements forming network of node/relations

  • Discrete datasets mapped into common format

[Diagram: Subject, predicate, Object, identified by URIs. Example: Species, memberOf, Genus; Species, hasCommonName, literal value “Water flea”.]
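
The triple model on this slide can be sketched in a few lines of Python. This is a minimal illustration using only the standard library rather than an RDF toolkit; the `example.org` namespace and the identifiers `species/1` and `genus/1` are assumptions, since the slides do not give the archive's real URIs.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str  # either a URI or a literal value


# Hypothetical namespace; the archive's actual URIs are not given in the slides.
EX = "http://example.org/dtc/"

triples = [
    Triple(EX + "species/1", EX + "memberOf", EX + "genus/1"),      # object is a URI
    Triple(EX + "species/1", EX + "hasCommonName", "Water flea"),   # object is a literal
]


def objects(store, subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [t.obj for t in store if t.subject == subject and t.predicate == predicate]


print(objects(triples, EX + "species/1", EX + "hasCommonName"))
```

Because every statement has the same three-part shape, triples from unrelated datasets can be placed in one list and queried together.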


Example dataset

English Lake District rainfall dataset – from FISH.Link project

[Diagram of the dataset model. Labels: Dataset, Actor, Site, Name, Tarn, Name, CollectionMethod, Location, GridReference, Easting, Northing, Latitude, Longitude.

ObservationSet (two instances): About: Rainfall; Type: Raw; Unit: Inch.

ObservationSet (two instances): About: Rainfall; Type: Derived; Unit: mm; DependsOn: OS1, OS2; Duration: 1Day.

Observation (four instances): StartDate; EndDate; Value.]
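
The relationship between raw and derived observation sets can be sketched in Python as follows. The class and field names follow the diagram labels, but the derivation rule is an assumption: the slide only states that the derived sets are in mm and depend on OS1 and OS2, so this sketch simply converts inches to millimetres and records the dependency. The dates and readings are invented for illustration.

```python
from dataclasses import dataclass, field
from datetime import date

MM_PER_INCH = 25.4


@dataclass
class Observation:
    start_date: date
    end_date: date
    value: float


@dataclass
class ObservationSet:
    about: str
    kind: str                      # "Raw" or "Derived" (the diagram's "Type")
    unit: str
    observations: list = field(default_factory=list)
    depends_on: list = field(default_factory=list)


def derive_mm(*raw_sets):
    """Build a derived set in mm from raw sets in inches (assumed derivation rule)."""
    derived = ObservationSet("Rainfall", "Derived", "mm", depends_on=list(raw_sets))
    for raw in raw_sets:
        for o in raw.observations:
            derived.observations.append(
                Observation(o.start_date, o.end_date, o.value * MM_PER_INCH)
            )
    return derived


# Hypothetical raw readings in inches; not taken from the FISH.Link dataset.
os1 = ObservationSet("Rainfall", "Raw", "Inch",
                     [Observation(date(2012, 7, 9), date(2012, 7, 10), 0.5)])
os2 = ObservationSet("Rainfall", "Raw", "Inch",
                     [Observation(date(2012, 7, 10), date(2012, 7, 11), 2.0)])
derived = derive_mm(os1, os2)
print([round(o.value, 1) for o in derived.observations])
```

Keeping `depends_on` on the derived set preserves provenance, so a user can trace a value in mm back to the raw readings it came from.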


Dataset capture and mapping

  • Columns, concepts, entities mapped to formal vocabularies

  • Mappings defined in archive objects

  • Automated

    • e.g. sensor output files

  • Computer-assisted

    • e.g. some spreadsheets

  • Manual

    • by domain experts

    • e.g. mark up values in texts

Spreadsheet transformation workflow – from FISH.Link project
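
Mapping spreadsheet columns to a formal vocabulary can be sketched in Python as below. This is not the FISH.Link workflow itself, which the slides do not detail. The column names, vocabulary URIs, row URIs and sample row are all hypothetical.

```python
import csv
import io

# Hypothetical mapping from spreadsheet column names to vocabulary URIs.
COLUMN_MAP = {
    "rain_in": "http://example.org/vocab/rainfall",
    "site": "http://example.org/vocab/site",
}


def rows_to_triples(csv_text, column_map, base="http://example.org/row/"):
    """Turn each mapped cell into a (subject, predicate, object) tuple."""
    triples = []
    for i, row in enumerate(csv.DictReader(io.StringIO(csv_text))):
        subject = f"{base}{i}"                   # one subject URI per data row
        for column, value in row.items():
            if column in column_map:             # unmapped columns are skipped
                triples.append((subject, column_map[column], value))
    return triples


# Invented sample data; the "notes" column has no mapping and is ignored.
sample = "site,rain_in,notes\nTarn A,0.5,checked\n"
mapped = rows_to_triples(sample, COLUMN_MAP)
for t in mapped:
    print(t)
```

Holding the mapping as data rather than code matches the idea that mappings are defined in archive objects, so the same routine can serve many differently shaped spreadsheets.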


Architectural Overview

[Diagram labels, top to bottom: Browsing, Visualisation, Search, Analysis; Mappings; RDF triples; Mappings; Archive Objects; Source datasets]


Current Status and Next Steps

  • Archive project started Jan. 2011, runs until the end of 2014.

  • Datasets are already being generated in large quantities.

  • Prototype functionality

  • Modelling and ingestion of data (incremental)

  • Next steps:

    • Extend types of dataset covered.

    • User interactions (queries, visualisation etc.)


Thank you

[email protected]

[email protected]

http://dtcarchive.org/

