The Earth System Curator


Presentation Transcript


  1. The Earth System Curator Metadata Representations Prototype Portal in Collaboration with ESMF and ESG Rocky Dunlap Spencer Rugaber Georgia Tech

  2. Who we are • Cecelia DeLuca, NCAR • V. Balaji, GFDL/Princeton University • Don Middleton, NCAR • Chris Hill, MIT • Serguei Nikonov, GFDL • Sylvia Murphy, NCAR • Luca Cinquini, NCAR • Julien Chastang, NCAR • Spencer Rugaber, Georgia Tech • Leo Mark, Georgia Tech • Rocky Dunlap, Georgia Tech Plus other collaborators: NMM, Metafor, BFG2, and others

  3. What is the Earth System Curator? • The goal of Curator is to link climate datasets with a detailed description of the model that ran to produce the dataset • Transparent access to models and datasets • Use cases for climate model metadata • Provenance (history of what happened) • Archival and search (for models and datasets) • Model inter-comparison • Compatibility checking • Generation of coupler components

  4. Collaborations with Related Projects • Earth System Modeling Framework (ESMF) • Software infrastructure to facilitate building numerical Earth System models • Component-based model development • Built-in tools for managing common modeling tasks (coupling fields, calendars, grid creation, etc.) • Earth System Grid (ESG) • A large-scale distributed portal for hosting data produced by Earth System models • Services such as dataset ingest, faceted search, dataset browsing, viewing metadata, downloading datasets

  5. Representations of Curator Metadata • UML • RDF/OWL • XML/XML Schema • Relational DB - SQL

  6. UML • Unified Modeling Language • What it is • A visual modeling language for representing software systems • Source • OMG Standard • Motivation • Conceptual modeling, human-to-human communication of the model, object-oriented representation • Of the 13 diagram types in UML 2.0, we are using one: the class diagram • Static structure in terms of classes, attributes on classes, relationships between classes

  7. UML • Metamodel • Access to the metamodel for creating UML Profiles: the ability to define a subset of UML used for building your own models • Tool support • Enterprise Architect – recommended • Others – Rational Rose, Poseidon, ArgoUML, Microsoft Visio • Constraint and query language – Object Constraint Language (OCL)

  8. http://swiki.cc.gatech.edu:8080/Curator/46

  9. RDF/OWL • What it is • “Semantic web” ontology language • Primary modeling constructs are properties and classes • Conceptual implementation language (not low level like XML) • RDF – Resource Description Framework • Based on {subject, predicate, object} triples • OWL – Web Ontology Language (2.0 coming soon!) • Strong theoretical basis on Description Logics • Source • W3C standard

  10. RDF/OWL • Motivations • Now a widely accepted standard • Simple data model, but OWL still allows complex class descriptions • Very “web friendly” for use with external systems, semantic mediation, URIs, XML format for interchange • “Non-experts” can build an ontology using Protégé • Architectural considerations: faceted search interface • Tool support • Protégé • Sesame Triple Store, Jena Java API

  11. Example RDF Statements • “Balaji works at GFDL.” as a triple: Balaji worksAt GFDL • Curator meeting hasLocation GFDL • Curator meeting starts “18 Oct 2007” • Curator meeting ends “19 Oct 2007”

  12. RDF XML Representation
      <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
               xmlns:esc="http://www.earthsystemcurator.org">
        <rdf:Description rdf:about="http://....#OctCuratorMeeting">
          <esc:hasLocation rdf:resource="http://....#GFDL"/>
          <esc:starts>18 Oct 2007</esc:starts>
          <esc:ends>19 Oct 2007</esc:ends>
        </rdf:Description>
        <rdf:Description rdf:about="http://....#Balaji">
          <esc:worksAt rdf:resource="http://....#GFDL"/>
        </rdf:Description>
      </rdf:RDF>
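As a rough illustration of how the RDF/XML above maps back to {subject, predicate, object} triples, here is a minimal Python sketch using only the standard library. The base URI `http://example.org/curator` is a hypothetical stand-in for the elided `http://....` on the slide, and `extract_triples` is an illustrative helper, not part of Curator, Sesame, or Jena. It handles only the flat `rdf:Description` form shown here; a real RDF parser covers far more of the syntax.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# Hypothetical base URI: the slide elides the real one ("http://....").
BASE = "http://example.org/curator"

RDF_XML = f"""<rdf:RDF xmlns:rdf="{RDF_NS}"
         xmlns:esc="http://www.earthsystemcurator.org">
  <rdf:Description rdf:about="{BASE}#OctCuratorMeeting">
    <esc:hasLocation rdf:resource="{BASE}#GFDL"/>
    <esc:starts>18 Oct 2007</esc:starts>
    <esc:ends>19 Oct 2007</esc:ends>
  </rdf:Description>
  <rdf:Description rdf:about="{BASE}#Balaji">
    <esc:worksAt rdf:resource="{BASE}#GFDL"/>
  </rdf:Description>
</rdf:RDF>"""


def extract_triples(rdf_xml):
    """Return (subject, predicate, object) tuples from flat rdf:Description blocks."""
    triples = []
    root = ET.fromstring(rdf_xml)
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree expands tags to "{namespace}localname"; keep the local name.
            predicate = prop.tag.split("}", 1)[1]
            # The object is either a resource URI or a literal value.
            obj = prop.get(f"{{{RDF_NS}}}resource") or (prop.text or "").strip()
            triples.append((subject, predicate, obj))
    return triples


triples = extract_triples(RDF_XML)
for s, p, o in triples:
    print(s, p, o)
```

Each child element of an `rdf:Description` becomes one triple, so the two descriptions yield the four statements from the previous slide.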

  13. ESG Ontology with Curator Extensions • Protégé 4 beta: http://protege.stanford.edu/download/registered.html#p4 • Updated Pizza Tutorial (HIGHLY RECOMMENDED) • http://www.co-ode.org/resources/tutorials/ProtegeOWLTutorial-p4.0.pdf

  14. XML/XML Schema • What it is • Very widely accepted format for communication between applications, tag-based markup • Source • W3C Standards • Motivations • A standard implementation that modeling groups can adhere to (most will not be comfortable with RDF/OWL) • Can be output by modeling frameworks such as ESMF • “Use profiles” are small chunks of XML for specific purposes (part of the egg white?)

  15. XML/XML Schema • Tool support • XMLSpy, oXygen, Notepad... • Query languages • XQuery, XPath • XSLT for transforming XML to other formats

  16. SQL – Relational Databases (RDBMS) • ANSI standard • Motivations • Very mature technology • RDF/OWL and XML are likely NOT good solutions for long term storage • Fast querying • Large scale metadata storage

  17. Representation Issues/Considerations • What kinds of constraints do we need to precisely model the domain? • structural constraints vs. dynamic constraints • What kinds of reasoning and query capabilities do the applications require? • What role will the meta-model play? • How do you keep consistency among several representations/notations? • What is the role of auto-generation?

  18. Putting it all together... • A prototype application developed this summer at NCAR in collaboration with ESMF and ESG: • ESMF modeling components become “self-describing” • Metadata is exported from an ESMF component in a standardized XML format (multiple conventions allowed) • The XML is ingested into ESG and exposed to the portal for users to search

  19. Metadata Lifecycle

  20. Metadata Lifecycle • ESMF component exports XML metadata • The XML is validated and harvested into a Java object representation • The Java objects are persisted to a relational database (RDBMS) • Metadata in the RDBMS is then harvested into RDF – a Semantic Web ontology language • The RDF is accessed by the ESG web portal for faceted search of the metadata
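The five lifecycle steps can be sketched end to end in a few lines. This is only an illustration under stated assumptions: the prototype used Java objects and a production RDBMS, whereas this sketch swaps in Python dataclasses and an in-memory SQLite database; the table layout, the `hasDiscipline`/`hasAgency` predicate names, and all function names are hypothetical; and "validation" here is a simple structural check rather than validation against an XML Schema.

```python
import sqlite3
import xml.etree.ElementTree as ET
from collections import Counter
from dataclasses import dataclass

# Step 1 (assumed input): a trimmed-down description in the style of the ESMF output.
COMPONENT_XML = """<model_component name="Finite Volume Dynamical Core">
  <discipline_set><discipline name="Atmosphere" /></discipline_set>
  <agency_set><agency name="NASA" /></agency_set>
</model_component>"""


@dataclass
class Component:
    name: str
    discipline: str
    agency: str


def harvest(xml_text):
    """Step 2: check the required elements exist, then build an object."""
    root = ET.fromstring(xml_text)
    discipline = root.find("discipline_set/discipline")
    agency = root.find("agency_set/agency")
    if root.tag != "model_component" or discipline is None or agency is None:
        raise ValueError("not a valid component description")
    return Component(root.get("name"), discipline.get("name"), agency.get("name"))


def persist(conn, comp):
    """Step 3: store the object in a relational table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS component "
        "(name TEXT PRIMARY KEY, discipline TEXT, agency TEXT)"
    )
    conn.execute(
        "INSERT INTO component VALUES (?, ?, ?)",
        (comp.name, comp.discipline, comp.agency),
    )


def to_triples(conn):
    """Step 4: harvest database rows into {subject, predicate, object} triples."""
    triples = []
    rows = conn.execute("SELECT name, discipline, agency FROM component")
    for name, discipline, agency in rows:
        triples.append((name, "hasDiscipline", discipline))
        triples.append((name, "hasAgency", agency))
    return triples


def facet_counts(triples, predicate):
    """Step 5: count components per facet value, as a faceted search UI would."""
    return Counter(o for _, p, o in triples if p == predicate)


conn = sqlite3.connect(":memory:")
persist(conn, harvest(COMPONENT_XML))
triples = to_triples(conn)
print(facet_counts(triples, "hasDiscipline"))
```

Keeping the relational store as the system of record and deriving the triples from it mirrors the slide's ordering: the database handles long-term storage, while the RDF view exists to serve search.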

  21. ESMF XML Output (example), viewed as a simple “use-profile”
      <model_component name="Finite Volume Dynamical Core">
        <discipline_set>
          <discipline name="Atmosphere" />
        </discipline_set>
        <physical_domain_set>
          <physical_domain name="Earth system" />
        </physical_domain_set>
        <agency_set>
          <agency name="NASA" />
        </agency_set>
        <institution_set>
          <institution name="Global Modeling and Assimilation Office (GMAO)" />
        </institution_set>
        ……

  22. ESMF XML Output (example)
        <author_set>
          <author name="Max Suarez" />
        </author_set>
        <coding_language_set>
          <coding_language name="Fortran 90" />
        </coding_language_set>
        <model_component_framework_set>
          <model_component_framework name="ESMF (Earth System Modeling Framework)" />
        </model_component_framework_set>
        <variable_set>
          <variable shortname="DPEDT" longname="Edge pressure tendency" units="Pa s-1" />
          <variable shortname="DUDT" longname="Eastward wind tendency" units="m s-2" />
          ……
        </variable_set>
      </model_component>
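To show how such a use-profile might be queried, the sketch below parses a fragment of the example with Python's standard `xml.etree.ElementTree`, which supports a limited XPath subset. The fragment is assembled from the two slides with the elided parts ("……") simply left out, so it is an assumption about the overall shape and not the full ESMF output.

```python
import xml.etree.ElementTree as ET

# Fragment assembled from the two slides; elided content is omitted, not guessed.
ESMF_XML = """<model_component name="Finite Volume Dynamical Core">
  <coding_language_set>
    <coding_language name="Fortran 90" />
  </coding_language_set>
  <variable_set>
    <variable shortname="DPEDT" longname="Edge pressure tendency" units="Pa s-1" />
    <variable shortname="DUDT" longname="Eastward wind tendency" units="m s-2" />
  </variable_set>
</model_component>"""

root = ET.fromstring(ESMF_XML)

# Path lookup: every <variable> under <variable_set>, keyed by short name.
variables = {
    v.get("shortname"): (v.get("longname"), v.get("units"))
    for v in root.findall("./variable_set/variable")
}

# Attribute predicate: the single variable with a given short name.
dudt = root.find(".//variable[@shortname='DUDT']")

print(variables)
print(dudt.get("units"))
```

A full XPath or XQuery engine would allow richer queries, but even this subset is enough to pull out the variable list that a portal could index.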

  23. ESG Prototype Data Portal (screenshot; callouts: faceted search, harvested component)

  24. ESG Prototype Data Portal

  25. ESG Prototype Data Portal

  26. Demo of Dycore Portal http://dycore.ucar.edu/
