
“provenance”

6th e-Infrastructure Concertation Lyon 24 Nov 2008. “provenance”. DATA TRACK Chair : Krystyna Marek Rapporteur: Wolfram Horstmann. Motivation. Last two meetings were on standards It was proposed to have a more focussed discussion


Presentation Transcript


  1. 6th e-Infrastructure Concertation Lyon 24 Nov 2008 “provenance” DATA TRACK Chair: Krystyna Marek Rapporteur: Wolfram Horstmann

  2. Motivation • Last two meetings were on standards • It was proposed to have a more focussed discussion • Focus on practice and interoperability rather than standards • Select an arbitrary but important topic

  3. Notions of Provenance • Where do data objects* originate from? • Scientific Work -- examples • Instrumentation techniques • Manufacturers of hard- and software • Methodologies • Processes, e.g. gene sequencing • Technical/Local -- examples • (web)-identifiers • Database, repository name * Primary data, documents, metadata …

  4. Why Provenance? • Quoting / Citing / Referencing as global scientific principle • “Reproducible research” • Giving credit to authors / creators in distributed environments • Original location / context has to be known • Experienced in Grid environments [1]

  5. Provenance & Interoperability • Re-Use / Sharing: “Addressing/Accessing” • Common view, common use • Unidirectional: No change of data objects! • Federation: “Discovering in Context” • Remote representation of distributed data objects (DOs) • Aggregation: “Contextualizing” • Add unchanged object in a context • Processing/Annotation: “Changing” • Uni- vs. Bidirectional: Change of DOs and remote representation vs. back-storage (e.g. CVS)

  6. IVOA • Astronomy area: Repositories use OAI-PMH to provide general • Provenance as a kind of metadata • “Observation data model” • History of data (process “lineage”) • Processing • Configuration: telescope, camera • Ambient conditions: temperature etc. • Versioning is included (also algorithms etc.)
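The kinds of provenance listed on this slide (lineage, configuration, ambient conditions, versioning) can be pictured as one record per observation. The following is a minimal sketch only: the class names, field names and sample values are hypothetical and are not taken from the IVOA observation data model.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ProcessingStep:
    """One entry in the processing history (lineage) of a data object."""
    name: str
    algorithm_version: str  # versioning covers algorithms as well as data


@dataclass
class ObservationProvenance:
    """Hypothetical provenance record for a single observation."""
    configuration: Dict[str, str]          # e.g. telescope, camera
    ambient_conditions: Dict[str, float]   # e.g. temperature
    data_version: str
    lineage: List[ProcessingStep] = field(default_factory=list)

    def add_step(self, name: str, algorithm_version: str) -> None:
        """Append a processing step to the lineage."""
        self.lineage.append(ProcessingStep(name, algorithm_version))

    def history(self) -> List[str]:
        """Return the lineage as readable strings, oldest step first."""
        return [f"{s.name} (v{s.algorithm_version})" for s in self.lineage]


# Illustrative values only.
record = ObservationProvenance(
    configuration={"telescope": "example-telescope", "camera": "example-camera"},
    ambient_conditions={"temperature_celsius": 4.5},
    data_version="1",
)
record.add_step("bias subtraction", "1.0")
record.add_step("flat-field correction", "2.1")
print(record.history())
```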

  7. MetaFor • Data from numerical models • Descriptive information from model • Models are often transformed • Database / Registry for models in distributed repositories

  8. D4Science • Framework for • More than simple import framework • Graphs representing provenance information • Thematic: fishing site / statistic /

  9. DRIVER • Focus on document repositories • Some 100 … • Simple Provenance • OAI-PMH • Further (2nd order) Provenance • OAI-PMH (“about”): repository identifiers • Enhanced Publications >> OAI-ORE • Semantic Model (named graphs) representing packages of documents and data objects
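As an illustration of second-order provenance carried in an OAI-PMH “about” container, the sketch below reads an `originDescription` as defined by the OAI provenance schema. The repository URL, identifier and dates are invented, the enclosing OAI-PMH record envelope is omitted for brevity, and `read_origin` is a hypothetical helper, not part of DRIVER.

```python
import xml.etree.ElementTree as ET

# Namespace of the OAI provenance schema.
NS = {"p": "http://www.openarchives.org/OAI/2.0/provenance"}

# Simplified, invented example of an "about" container.
ABOUT_XML = """
<about>
  <provenance xmlns="http://www.openarchives.org/OAI/2.0/provenance">
    <originDescription harvestDate="2008-11-24T10:00:00Z" altered="false">
      <baseURL>http://repository.example.org/oai</baseURL>
      <identifier>oai:repository.example.org:1234</identifier>
      <datestamp>2008-10-01</datestamp>
      <metadataNamespace>http://www.openarchives.org/OAI/2.0/oai_dc/</metadataNamespace>
    </originDescription>
  </provenance>
</about>
"""


def read_origin(about_xml: str) -> dict:
    """Extract the originating repository and identifier from an 'about' block."""
    root = ET.fromstring(about_xml.strip())
    origin = root.find(".//p:originDescription", NS)
    if origin is None:
        raise ValueError("no originDescription found")
    return {
        "base_url": origin.findtext("p:baseURL", namespaces=NS),
        "identifier": origin.findtext("p:identifier", namespaces=NS),
        "harvest_date": origin.get("harvestDate"),
        "altered": origin.get("altered") == "true",
    }


origin = read_origin(ABOUT_XML)
print(origin["base_url"], origin["identifier"])
```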

  10. Solutions • Provenance • Registries for curator, publisher etc. • Resolving over registry • Diversity of approaches • CIDOC-CRM, OPM, EuroStats, • Languages: RDF / OAI-ORE
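Graph-based approaches such as OPM describe provenance as edges between artifacts and processes. The sketch below uses two OPM edge types (`used`, `wasGeneratedBy`) on an invented example and walks back from a result to everything it depends on; the node names and the `trace_origins` helper are assumptions for illustration, and a real deployment would more likely use an RDF store.

```python
from collections import defaultdict

# (subject, relation, object) edges; node names are invented.
EDGES = [
    ("calibrated_image", "wasGeneratedBy", "calibration"),
    ("calibration", "used", "raw_image"),
    ("catalogue", "wasGeneratedBy", "source_extraction"),
    ("source_extraction", "used", "calibrated_image"),
]


def build_index(edges):
    """Index outgoing edges by subject."""
    index = defaultdict(list)
    for subject, relation, obj in edges:
        index[subject].append((relation, obj))
    return index


def trace_origins(node, index):
    """Return every artifact or process the node depends on, in discovery order."""
    seen = []
    stack = [node]
    while stack:
        current = stack.pop()
        for _relation, target in index.get(current, []):
            if target not in seen:
                seen.append(target)
                stack.append(target)
    return seen


index = build_index(EDGES)
print(trace_origins("catalogue", index))
```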

  11. Differentiations • Expertise from Data-Centers as opposed to Data-Providers • Infrastructures should provide functions to add provenance information (but do not) • e.g. EGEE provides an additional module for recording provenance data

  12. Hot topics • Propagating provenance: versioning • Disambiguation / Deduplication • different identical objects • Who provides the data? • Each processing step should provide at least some metadata
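Two of these topics lend themselves to a small sketch: having each processing step record some metadata, and detecting identical objects held in different places. The code below is one possible approach under stated assumptions: it uses a SHA-256 content hash as the identity of an object, and `run_step`, `is_duplicate` and the trail layout are hypothetical names, not an existing infrastructure API.

```python
import hashlib
from datetime import datetime, timezone


def fingerprint(payload: bytes) -> str:
    """Content hash used here as the identity of a data object."""
    return hashlib.sha256(payload).hexdigest()


def run_step(name, func, payload, trail):
    """Apply one processing step and append a minimal metadata record for it."""
    result = func(payload)
    trail.append({
        "step": name,
        "input_sha256": fingerprint(payload),
        "output_sha256": fingerprint(result),
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    })
    return result


def is_duplicate(a: bytes, b: bytes) -> bool:
    """Two copies count as the same object when their content hashes match."""
    return fingerprint(a) == fingerprint(b)


trail = []
raw = b"  Example Data  "
cleaned = run_step("strip", bytes.strip, raw, trail)
upper = run_step("uppercase", bytes.upper, cleaned, trail)
print([entry["step"] for entry in trail])
```

Because each record stores both the input and the output hash, consecutive records chain together, which gives a simple way to propagate provenance across versions.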

  13. Recommendations for Infrastructure • Standards for Provenance: Non-existing? • Each processing step should provide at least some metadata • Look deeper into specific implementations in subject communities • Technical point to point organisation • Bilateral • Programming a meeting • 24/25th ESA: earth science meeting?
