
Metadata Concepts / Use in Climate Research

Metadata Concepts / Use in Climate Research. Stephan Kindermann, Martina Stockhause, German Climate Computing Center (DKRZ), Hamburg, Germany. Overview. Metadata descriptions: sources and usage at the data level, preservation level, model level, and domain knowledge level.


Presentation Transcript


  1. Metadata Concepts / Use in Climate Research. Stephan Kindermann, Martina Stockhause, German Climate Computing Center (DKRZ), Hamburg, Germany

  2. Overview
     Metadata descriptions: sources, usage (data level, preservation level, model level, domain knowledge level)
     Metadata standards, IT principles

  3. Metadata descriptions: sources, usage
     (I) Data Description Level:
         source: model run output
         format: GRIB, NetCDF3/4 container formats (including basic metadata)
         metadata homogenization: „Climate and Forecast Convention (CF)“ conformance, CMOR2 compliance, controlled vocabularies
         usage: analysis tools, data access scripts, data search („linked data principle“)
     (II) Data Preservation Level:
         target: legacy data centers (e.g. WDCC)
         format: internal DB, various external formats, e.g. ISO 19139, DIF, ..
         usage: long-term data storage and access, citation, e.g. using DOIs
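The metadata homogenization item on slide 3 can be illustrated with a minimal sketch. It assumes the variable attributes have already been read from a file header into plain dictionaries. The function name `missing_cf_attributes` and the short list of required attributes are illustrative assumptions; this is not the official CF checker and not CMOR2.

```python
# Hypothetical helper for illustration only (not the CF checker, not CMOR2).
# Assumption: a small subset of attributes this sketch chooses to require.
REQUIRED_ATTRS = ("standard_name", "units")

def missing_cf_attributes(variables):
    """Return {variable_name: [missing attribute names]} for incomplete variables."""
    problems = {}
    for name, attrs in variables.items():
        missing = [a for a in REQUIRED_ATTRS if not attrs.get(a)]
        if missing:
            problems[name] = missing
    return problems

# Made-up attribute dictionaries standing in for a file header.
variables = {
    "tas": {"standard_name": "air_temperature", "units": "K"},
    "pr": {"long_name": "precipitation"},
}
print(missing_cf_attributes(variables))
```

A check of this kind would run before data are published, so that analysis tools and search services can rely on uniform attribute names.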

  4. Metadata descriptions: sources, usage
     (III) Model Description Level:
         source: researcher interviews, online questionnaire
         format: CIM (Common Information Model, from the Metafor FP7 project, „Common Metadata for Climate Modelling Digital Repositories“); Con-CIM: UML, APP-CIM: XSD + vocabs
         usage: model intercomparison, scientific portals, information space browsing / search
     (IV) Semantic Annotation Level:
         source: data metadata, model metadata, domain knowledge metadata
         format: OWL (RDF)
         usage: user navigation in portals, „faceted search“ etc.
         deployments: Earth System Grid CMIP5 portal, IS-ENES portal
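The „faceted search“ usage named on slide 4 can be sketched in a few lines. The records, facet names, and function names below are made up for illustration; the portals mentioned in the slides work on an RDF triple store, whereas this sketch uses plain dictionaries to show only the principle of filtering by selected facets and counting the remaining values.

```python
from collections import Counter

# Made-up metadata records; facet names and values are illustrative only.
RECORDS = [
    {"model": "ECHAM", "realm": "atmos", "frequency": "mon"},
    {"model": "ECHAM", "realm": "atmos", "frequency": "day"},
    {"model": "other-model", "realm": "ocean", "frequency": "mon"},
]

def filter_records(records, **selected):
    """Keep records whose facets match every selected facet value."""
    return [r for r in records if all(r.get(k) == v for k, v in selected.items())]

def facet_counts(records, facet):
    """Count how many records carry each value of one facet."""
    return Counter(r[facet] for r in records if facet in r)

hits = filter_records(RECORDS, realm="atmos")
print(len(hits), dict(facet_counts(hits, "frequency")))
```

After each selection the counts are recomputed on the remaining hits, which is what lets a portal show how many results each further refinement would leave.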

  5. B) Metadata standards, IT principles
     (I) Data Description Level:
     Metadata: file naming convention based on CVs, building uniform URIs (DRS, Data Reference Syntax)
     Data: Activity/Product/Institute/Model/Exp/frequ/realm/Variable/ensemble
     GRIB, NetCDF data containers, 10s of PBytes
     Data servers, MD catalogue servers
     Enabling „linked data“: wget http://server.org/Activity/Product/../ensemble
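How a DRS path turns into a uniform URI can be sketched as follows. The component order is taken from the slide; the function name `drs_url`, the lower-case keyword names, and the example facet values are assumptions made for illustration, and `server.org` is the slide's own placeholder host.

```python
# Component order as listed on the slide (names lower-cased for use as keywords).
DRS_COMPONENTS = (
    "activity", "product", "institute", "model", "experiment",
    "frequency", "realm", "variable", "ensemble",
)

def drs_url(base, **facets):
    """Join the DRS components, in the prescribed order, onto a base URL."""
    missing = [c for c in DRS_COMPONENTS if c not in facets]
    if missing:
        raise ValueError(f"missing DRS components: {missing}")
    path = "/".join(str(facets[c]) for c in DRS_COMPONENTS)
    return base.rstrip("/") + "/" + path

# Illustrative facet values, not taken from the slides.
example = dict(
    activity="cmip5", product="output1", institute="MPI-M", model="MPI-ESM-LR",
    experiment="historical", frequency="mon", realm="atmos", variable="tas",
    ensemble="r1i1p1",
)
print(drs_url("http://server.org", **example))
```

Because every component comes from a controlled vocabulary, the same dictionary of facets can serve both as a search query and as the address of the data.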

  6. B) Metadata standards, IT principles
     (II) Data Preservation Level: WDCC Metadata Concept
     Diagram labels: CERA GUI; IS-ENES Portal; …; search API; common CV; CERA2 DB schema; OWL conceptual model; QC, DOI assignment, ..; tape archive
     • Scalability • Sustainability • Flexibility • User-friendly GUIs

  7. B) Metadata standards, IT principles
     (III) Model Description Level: Metafor FP7 project: Common Information Model (CIM)
     • Formal metadata model of the climate modelling process
     • It includes descriptions of the experiments being undertaken, the simulations being run in support of these experiments, the software models and tools being used to implement the simulations, and the data generated by the software.
     • CMIP5 use case: CV collection, CMIP5 questionnaire

  8. Metafor CIM overview
     Diagram labels: CONCIM (UML); automatic translation; ISO, Geography Markup Language (GML) series; APPCIM (XSD); CMIP5 portal(s); IS-ENES portal; Metafor catalogue; CIM instances (interlinked XML files)

  9. Metadata collection

  10. Automatic XML → RDF translation
      Diagram labels: ESG OWL instances; IS-ENES portal; CMIP5 gateway(s)
      (IS-ENES: Infrastructure for the European Network for Earth System Modelling)
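The idea of translating XML metadata into RDF statements can be illustrated with a minimal sketch. The XML snippet, its element names, and both mapping tables are invented for this example and do not follow the CIM schema or the actual translation used by the portals; only the `isenes:`, `dc:`, `rdfs:` and `rdf:` prefixes are borrowed from the RDF shown later in the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML snippet: element names are invented, not the CIM schema.
XML_DOC = """
<component id="echam">
  <title>ECHAM</title>
  <description>Global circulation model</description>
</component>
"""

# Assumed mappings from XML names to RDF classes and predicates.
ROOT_TYPES = {"component": "isenes:ComponentModel"}
PREDICATES = {"title": "dc:title", "description": "rdfs:comment"}

def xml_to_triples(xml_text, prefix="isenes"):
    """Translate one XML record into (subject, predicate, object) triples."""
    root = ET.fromstring(xml_text)
    subject = f"{prefix}:{root.attrib['id']}"
    triples = []
    if root.tag in ROOT_TYPES:
        triples.append((subject, "rdf:type", ROOT_TYPES[root.tag]))
    for child in root:
        predicate = PREDICATES.get(child.tag)
        if predicate and child.text:
            # Literal values are written as quoted strings.
            triples.append((subject, predicate, f'"{child.text.strip()}"'))
    return triples

for triple in xml_to_triples(XML_DOC):
    print(" ".join(triple), ".")
```

Keeping the mapping in small tables, rather than in code, is one way to let the translation follow an evolving OWL model without rewriting the translator.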

  11. (CON)CIM Overview

  12. B) Metadata standards, IT principles
      (IV) Semantic Annotation Level
      Diagram labels: Portal(s); ESG Gateways; RDF; CIM XML; OWL ontologies: http://ontologies.ucar.edu/owl; data object XML; triple store; IS-ENES Portal; content management system; community content; RDF triple store; relational DB; evolving OWL model

  13. CMIP5 Quality Control
      Diagram labels: THREDDS Data Server; Metafor / CIM Questionnaire; MD on model + simulation; MD on data; MD Quality Checks L2; Data Quality Checks L2; QC DB; Metadata Repository; Files; Data; Metadata; CIM Metadata; Data in prescribed DRS Syntax; Information MD; Quality MD; Data MD

  14. CMIP5 STD-DOI Publication
      Diagram labels: THREDDS Data Server; Metafor / CIM; MD on model + simulation + data + quality; MD on data; QC DB; Data Quality Checks L3 (double check, cross checks); TIB: DOI Registration Agency; Data; Data Node; Metadata; DOI Target Page: access to data and metadata; Filesystem; STD-DOI Catalogue; Quality MD; Data MD; Information MD; Long-term Archive; STD-DOI MD; Information MD; WDCC: DOI Publication Agent

  15. B) Metadata standards, IT principles
      (IV) Semantic Annotation Level
      Diagram labels: Portal(s); ESG Gateways; RDF; CIM XML; OWL ontologies: http://ontologies.ucar.edu/owl; data object XML; triple store; IS-ENES Portal; content management system; community content; RDF triple store; relational DB; evolving OWL model

  16. IS-ENES Info Portal

  17. 2010-07-07 16:49:13 INFO triplestorefill.utility Adding item <ComponentModel at /test7/echam> with ID echam at http://localhost:8080/test7/echam
      2010-07-07 16:49:13 INFO triplestorefill.sesameconnector Storing RDF... (1118 byte)
      2010-07-07 16:49:13 INFO triplestorefill.sesameconnector RDF data:
      @prefix foaf: <http://xmlns.com/foaf/0.1/> .
      @prefix owl: <http://www.w3.org/2002/07/owl#> .
      @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
      @prefix dc: <http://purl.org/dc/elements/1.1/> .
      @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
      @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
      @prefix isenes: <http://www.enes.org/isenes#> .
      isenes:echam rdf:type isenes:ComponentModel .
      isenes:echam foaf:page <http://plone.dkrz.de/test7/echam> .
      <http://plone.dkrz.de/test7/echam> foaf:topic isenes:echam .
      isenes:echam dc:title "ECHAM" .
      isenes:echam rdfs:label "ECHAM" .
      isenes:echam rdfs:comment "Global circulation model" .
      isenes:dkrz isenes:isResponsibleFor isenes:echam .
      isenes:echam isenes:hasResponsible isenes:dkrz .
      isenes:joachim-biercamp rdfs:label "Joachim Biercamp" .
      isenes:joachim-biercamp rdf:type foaf:Person .
      isenes:dkrz rdfs:label "DKRZ" .
      isenes:dkrz rdf:type foaf:Organization .
      isenes:joachim-biercamp isenes:isMemberOf isenes:dkrz .
      isenes:dkrz isenes:hasMember isenes:joachim-biercamp .
      isenes:dkrz dc:title "DKRZ" .
      isenes:joachim-biercamp foaf:mbox "biercamp@dkrz.de"
      „save“ → Triple Store
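The logged RDF stores each relation in both directions (`isenes:isResponsibleFor` with `isenes:hasResponsible`, `isenes:isMemberOf` with `isenes:hasMember`). The sketch below shows one way such pairs could be generated before serializing statements in a Turtle-like form. Whether the `triplestorefill` tool works this way is not stated in the slides; the function names `with_inverses` and `to_turtle` are hypothetical.

```python
# Inverse property pairs as they appear in the logged RDF.
INVERSES = {
    "isenes:isResponsibleFor": "isenes:hasResponsible",
    "isenes:isMemberOf": "isenes:hasMember",
}

def with_inverses(triples):
    """Return the triples plus the inverse statement for each known property."""
    both = dict(INVERSES)
    both.update({v: k for k, v in INVERSES.items()})  # look up in either direction
    result = list(triples)
    seen = set(result)
    for s, p, o in triples:
        if p in both:
            inverse = (o, both[p], s)
            if inverse not in seen:
                seen.add(inverse)
                result.append(inverse)
    return result

def to_turtle(triples):
    """Serialize triples as one 'subject predicate object .' line each."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)

triples = [
    ("isenes:dkrz", "isenes:isResponsibleFor", "isenes:echam"),
    ("isenes:joachim-biercamp", "isenes:isMemberOf", "isenes:dkrz"),
]
print(to_turtle(with_inverses(triples)))
```

Storing both directions explicitly makes simple lookups from either resource possible without relying on an inference engine in the triple store.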

  18. (B) From a user's perspective. Picture: Plone page with „related info“ portlet

  19. (B) From a user's perspective. Picture: Plone page after clicking the „related“ link: faceted search

  20. Summary
      • The international CMIP5 / IPCC effort is the key driver for collection / standardization of CVs, metadata, and conceptual models (ontologies)
      • Metadata is mainly used for model intercomparison, uniform data search / access, and data processing
      • Prepare for Climate Impact Community use cases!

  21. ..workshop reminder..
      ..therefore we would like the presenters to focus on a few points allowing all of us to draw conclusions at the end:
      - Usage and quality of the descriptive keyword type of metadata used in your domain to manage data.
      - Types of usages of this metadata (management, retrieval, research statistics, machine processing, etc).
      - The standards used for your metadata descriptions (structure, elements, vocabularies).
      - Adherence to common IT principles (explicit syntax, registered semantics, use of PIDs, etc).
      - Compliance with the recommendations to be found in the report of the e-IRG task force on Data Management http://www.e-irg.eu/publications/e-irg-task-force-reports.html

  22. Methodology to create CMIP5 CIM instances

  23. Motivation
      • Producers: providers of models, tools, model results, HPC ecosystem, Grid .., community
      • Consumers: ENES community, impact community
      Diagram labels: Portal; E-infrastructure components; Governance (agreements, commitments, sociology,..); Virtual Earth System Modeling Resource Centre; CMIP5/AR5/+ data services; ticketing; AAI; collaboration; metadata (CIM,..); protocols; APIs

  24. IS-ENES vERC Portal
      Table columns: Requirement; E-Infra component; Technology used
      Entries (in slide order): (A) Community info presentation (models, tools, descriptions,..); Content Management System (CMS, Collab. Tool); Plone + IS-ENES „content-types“; Project Management / Ticketing Tool; Redmine; (B) Community development support; Zope/Plone plugin(s); (C) Data portal to AR5 archives; Web Framework; (external) Metafor service(s); (external) ESG-gateway; (D) CIM metadata; Web service (proxies); Python info collector using Atom, OAI-PMH,.. protocols; (E) External content / metadata collection; Info (XML) harvester; „Cross-selling“, semantic interlinking; (F) Additional value provisioning; RDF triple store (Sesame)
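The slide names OAI-PMH as one of the protocols used by the Python info collector. As a minimal sketch of the request side only, the function below builds a `ListRecords` request URL. `ListRecords`, `metadataPrefix`, `set`, and the `oai_dc` prefix are part of the OAI-PMH protocol; the function name and the repository address are hypothetical, and nothing here reflects how the IS-ENES harvester is actually implemented.

```python
from urllib.parse import urlencode

def oai_list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL (no network access here)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec  # optional selective harvesting by set
    return f"{base_url}?{urlencode(params)}"

# Hypothetical repository endpoint.
print(oai_list_records_url("http://repository.example.org/oai"))
```

A harvester would fetch this URL, parse the returned XML records, and pass them on for translation and storage in the triple store.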
