Open Provenance Model Tutorial Session 4: Use cases from data.gov.uk


Outline

  • Background about data.gov.uk

  • The use cases

    • XML serialization

    • Data transformation on the fly

    • Complex and nested processes


data.gov.uk

  • Linking UK government data

  • Aims:

    • Provide a set of best practices for government agencies

    • Provide the minimum set of tooling and specification to facilitate the publication of data

    • Encourage “responsible” data publishing


XML -> RDF

  • [Diagram 1: an XSLT Processor with input and output arrows. Labels: XSLT Stylesheet, XSLT Template, XSLT Parameter Binding, RDF File. Annotation: who, when, which version, how.]

  • [Diagram 2: the same XSLT Processor step, extended with the history of its input (downloaded from, unzipped from, etc.) and with the RDF File being made accessible. Annotation: who, when, which version, how.]
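
The "who, when, which version, how" questions attached to the XSLT step can be written down as a small provenance record. The following is a minimal sketch, not the serialization used in the tutorial: it borrows the OPM edge names used, wasGeneratedBy, wasControlledBy and wasDerivedFrom, while the function name, file names, agent and version strings are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timezone


def describe_xslt_run(source_xml, stylesheet, parameters, rdf_file,
                      agent, processor_version, finished_at):
    """Return OPM-style edges for one XSLT run as (effect, edge, cause) triples."""
    process = "xslt-run:" + finished_at.isoformat()
    edges = [
        (process, "used", source_xml),          # how: the XML input
        (process, "used", stylesheet),          # how: the stylesheet and its templates
        (process, "wasControlledBy", agent),    # who
        (rdf_file, "wasGeneratedBy", process),  # the output of the run
        (rdf_file, "wasDerivedFrom", source_xml),
    ]
    annotations = {
        process: {
            "endTime": finished_at.isoformat(),     # when
            "processorVersion": processor_version,  # which version
            "parameterBindings": dict(parameters),  # how: parameter bindings
        }
    }
    return edges, annotations


# Hypothetical values, for illustration only.
edges, annotations = describe_xslt_run(
    source_xml="input.xml",
    stylesheet="to-rdf.xsl",
    parameters={"baseUri": "http://example.org/"},
    rdf_file="output.rdf",
    agent="publisher@example.org",
    processor_version="example-xslt-processor 1.0",
    finished_at=datetime(2010, 1, 1, 12, 0, tzinfo=timezone.utc),
)
```

Keeping the edges as plain triples makes it straightforward to extend the record with the upstream steps (download, unzip) and the downstream publication step from the second diagram.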


On-the-fly Transformation

Who, when, which version, how

http://mytransportatio.db/j10

Data transformation wrapper
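
One way to picture a data transformation wrapper is as a function decorator that records who, when, which version and how each time data is transformed on request. This is a sketch under stated assumptions rather than the wrapper from the slides: `with_provenance`, `PROVENANCE_LOG`, `to_rdf`, the agent name, the version string and the example.org vocabulary URI are all hypothetical; only the source URI is taken from the slide.

```python
import functools
from datetime import datetime, timezone

PROVENANCE_LOG = []  # in-memory store; a real service would persist this


def with_provenance(agent, version):
    """Wrap a transformation so each call appends a provenance record."""
    def decorator(transform):
        @functools.wraps(transform)
        def wrapper(source_uri, *args, **kwargs):
            result = transform(source_uri, *args, **kwargs)
            PROVENANCE_LOG.append({
                "who": agent,
                "when": datetime.now(timezone.utc).isoformat(),
                "whichVersion": version,
                "how": transform.__name__,
                "source": source_uri,
            })
            return result
        return wrapper
    return decorator


@with_provenance(agent="transport-service", version="0.1")
def to_rdf(source_uri):
    # Stand-in for the real transformation: emit one N-Triples-like line.
    return '<%s> <http://example.org/vocab#transformed> "true" .' % source_uri


rdf = to_rdf("http://mytransportatio.db/j10")
```

Because the record is written at call time, the provenance describes the transformation that actually served the request, not one that was run ahead of time.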


Complex Data Creation Pipeline

GATE Pipeline

GateXMLRegressionTransformation

GateXMLRdfaTransformation

RdfaRdfXmlTransformation

Courtesy of Paul Appleby from TSO (Data Enrichment Service)


Complex Data Creation Pipeline

  • GATE Pipeline, expanded into its processing steps:

    • Document Reset PR

    • ANNIE English Tokeniser

    • ANNIE English Splitter

    • ANNIE POS Tagger

    • Data.gov.uk Morphological Analyzer

    • Data.gov.uk Flexible Roof Gazetteer

    • Data.gov.uk Generic Gazetteer

    • GATE Noun Phrase Chunker

    • Data.gov.uk Generic Transducer

    • TSO Coreference

  • Transformation steps:

    • GateXMLRegressionTransformation

    • GateXMLRdfaTransformation

    • RdfaRdfXmlTransformation

Courtesy of Paul Appleby from TSO (Data Enrichment Service)


[Diagram: provenance of execution at two levels]

  • Level 1: Provenance of execution at higher level

  • Level 0: Provenance of execution at detailed level

  • Node groups: services used by executions; artifacts; a data collection

  • Edge labels: accessedService, wasTriggeredBy, iterationOfProcess, hasParentProcess, followed, wasGeneratedBy, wasDerivedFrom
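
The two levels can be related by following edges: wasDerivedFrom traces an artifact back through earlier artifacts, and wasGeneratedBy followed by hasParentProcess moves from a detailed (Level 0) step up to the higher-level (Level 1) execution. The sketch below assumes that hasParentProcess points from the detailed step to its higher-level process; that direction, and every identifier in `EDGES`, is an assumption made for illustration and is not taken from the slides.

```python
# (subject, edge, object) triples; all identifiers are hypothetical.
EDGES = [
    ("collection-1", "wasGeneratedBy", "step-tokenise"),
    ("collection-2", "wasGeneratedBy", "step-tag"),
    ("collection-2", "wasDerivedFrom", "collection-1"),
    ("step-tag", "wasTriggeredBy", "step-tokenise"),
    ("step-tokenise", "hasParentProcess", "pipeline-run"),
    ("step-tag", "hasParentProcess", "pipeline-run"),
]


def objects(subject, edge):
    """Objects of every triple with the given subject and edge label."""
    return [o for s, e, o in EDGES if s == subject and e == edge]


def lineage(artifact):
    """All artifacts reachable via wasDerivedFrom, nearest first."""
    found, frontier = [], [artifact]
    while frontier:
        current = frontier.pop(0)
        for source in objects(current, "wasDerivedFrom"):
            if source not in found:
                found.append(source)
                frontier.append(source)
    return found


def high_level_process(artifact):
    """Level 0 to Level 1: parents of the process that generated the artifact."""
    parents = []
    for process in objects(artifact, "wasGeneratedBy"):
        for parent in objects(process, "hasParentProcess"):
            if parent not in parents:
                parents.append(parent)
    return parents
```

For example, `lineage("collection-2")` walks the detailed level, while `high_level_process("collection-2")` answers the same question at the level of the whole pipeline run.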


Non-digital Data Objects

  • Organizations

    • Organizational structure changes over time

    • Origin organization, resulting organization

  • Boundary

  • Legislation

An organization ontology: http://www.epimorphics.com/public/vocabulary/org.html


The Challenges

  • Data in different representations, physical forms, and granularities

  • No tooling support

  • Provenance across different types of systems

    • Identification

    • Different terminologies


The Gaps

  • A vocabulary able to describe the provenance of all types of data, from different systems

  • A vocabulary that still provides enough terms to describe provenance accurately


This work is licensed under a Creative Commons Attribution-Share Alike 3.0 License (http://creativecommons.org/licenses/by-sa/3.0/)

