“curator” DB design


  1. “curator” DB design. Curator meeting, GFDL, Sep 20

  2. Why RDBMS
     • A lot of information:
         • Model metadata
         • Experiment metadata
         • Institution/user metadata
         • Data metadata
     • Most of it is in textual form.
     • The information is tightly interlinked, which is easy to express in a relational database.
     • Relational databases have well-developed means of searching and extracting data (the SQL query language and programming interfaces for virtually any language), for local as well as remote users.
     • A very reliable and safe technology.
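The point about tightly linked metadata can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for the MySQL server named later in the slides, so that it runs without any setup. The table names Scenarios and Experiments appear in the slides; every column name and all sample rows are hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two linked tables.

    Table names follow the slides; columns and rows are invented
    for illustration only.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Scenarios (
            scenario_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL
        );
        CREATE TABLE Experiments (
            experiment_id INTEGER PRIMARY KEY,
            name          TEXT NOT NULL,
            scenario_id   INTEGER NOT NULL REFERENCES Scenarios(scenario_id)
        );
        """
    )
    conn.execute("INSERT INTO Scenarios VALUES (1, 'scenario_a')")
    conn.executemany(
        "INSERT INTO Experiments VALUES (?, ?, ?)",
        [(1, "exp_one", 1), (2, "exp_two", 1)],
    )
    conn.commit()
    return conn


def experiments_for_scenario(conn, scenario_name):
    """Follow the Experiments -> Scenarios link with a single SQL join."""
    rows = conn.execute(
        """
        SELECT e.name
        FROM Experiments AS e
        JOIN Scenarios AS s ON s.scenario_id = e.scenario_id
        WHERE s.name = ?
        ORDER BY e.name
        """,
        (scenario_name,),
    ).fetchall()
    return [row[0] for row in rows]
```

Calling experiments_for_scenario(build_demo_db(), "scenario_a") returns the names of the two sample experiments. The link between the tables lives in one foreign-key column rather than in free text.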

  3. Desirable Features of Model Data Factory
     • Relational database storing metadata, containing descriptions of:
         • model components and model configuration
         • scenarios
         • postprocessing (model output and CMOR) directives
         • experiments
         • variables
         • formalized rules of Quality Control
         • data locations
         • task scheduler
         • user and group accounts
     • XML as data exchange format:
         • for compliance with FRE
         • working format of existing third-party software
         • well suited to hierarchical metadata description
         • widely used, easy to exchange with other Data Portals
     • Model Builder (FMS Runtime Environment at GFDL):
         • checks out available model components from DB
         • chooses model datasets from DB
         • sets postprocessing directives
         • checks compatibility of components and configurations
         • builds executable application and runs it
         • writes metadata about the experiment into DB (model configuration, scenario, project, organization/user, postprocessing)

  4. Desirable Features of Model Data Factory (continued)
     • Climate Model Output Rewriter (CMOR) subsystem
         • prepares data in accordance with specific project requirements
     • Data Publisher
         • transfers data to Data Portal storage according to settings from DB
     • Data Portal Software Package
         • Configuration Manager (configures Aggregation Server and Data Portal Interface)
         • Search Catalog Engine
         • Data Subsampling Engine
         • Data Computation Engine
         • Data Visualization
         • Data Delivery Manager

  5. Standard operating scenario of the Model Data Factory (ideal picture)
     • A scientist builds a model in FRE using available model components, datasets and a forcing scenario.
     • FRE puts metadata about the built model, scenario and experiment into the “curator” DB and runs the experiment.
     • The postprocessing subsystem extracts the postprocessing plan metadata from the “curator” DB, executes it, and on completion puts metadata about the processed experiment back into the DB.
     • Data Publisher (DP) regularly checks the “curator” DB for new experiments marked as “public” and, if it finds any, invokes CMOR.
     • CMOR retrieves metadata from the “curator” DB and processes the required data following the metadata instructions.
     • DP calls QAC and then transfers the data to Data Portal storage.
     • Configuration Manager configures the Aggregation Server and Data Portal Interface and puts records about the new public data into the “curator” DB.
     • End of process: the data is ready to go.
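The Data Publisher steps in the scenario above (check for new public experiments, invoke CMOR, call QAC, transfer) could be sketched roughly as follows. This is an illustration under stated assumptions, not the actual GFDL implementation. The columns visibility and published are hypothetical, sqlite3 stands in for MySQL, the cmor, quality_check and transfer callables are placeholders for the real subsystems, and skipping an experiment that fails quality control is an assumption the slides do not spell out.

```python
import sqlite3


def make_demo_db():
    """In-memory stand-in for the "curator" DB (hypothetical columns)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE Experiments (
            experiment_id INTEGER PRIMARY KEY,
            name          TEXT NOT NULL,
            visibility    TEXT NOT NULL,
            published     INTEGER NOT NULL DEFAULT 0
        )
        """
    )
    conn.executemany(
        "INSERT INTO Experiments (name, visibility, published) VALUES (?, ?, ?)",
        [("exp_a", "public", 0), ("exp_b", "private", 0), ("exp_c", "public", 1)],
    )
    conn.commit()
    return conn


def publish_pending(conn, cmor, quality_check, transfer):
    """One polling pass of a hypothetical Data Publisher.

    cmor(name) returns rewritten data, quality_check(data) returns a bool,
    and transfer(data) ships the data to portal storage.
    """
    pending = conn.execute(
        """
        SELECT experiment_id, name
        FROM Experiments
        WHERE visibility = 'public' AND published = 0
        ORDER BY experiment_id
        """
    ).fetchall()
    published = []
    for experiment_id, name in pending:
        rewritten = cmor(name)
        if not quality_check(rewritten):
            # Assumption: failed data stays unpublished and is retried later.
            continue
        transfer(rewritten)
        conn.execute(
            "UPDATE Experiments SET published = 1 WHERE experiment_id = ?",
            (experiment_id,),
        )
        published.append(name)
    conn.commit()
    return published
```

Because the published flag is stored in the database, a second pass over the same data finds nothing left to do, which is what makes regular polling safe to repeat.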

  6. General functional schema of the ‘Model Data Factory’ [diagram]

  7. Database ‘curator’ design
     Database compartments:
     • Model Metadata Compartment: contains model descriptions; allows building a coupled model of the required configuration
     • Variables Compartment: list of all related physical variables
     • Workflow Compartment: contains scenarios, experiments, institutions, projects and user info
     • Postprocessing Compartment: defines the postprocessing plan for the experiment being conducted
     • Data Portal Compartment: contains info about experiment data

  8. MySQL DB CURATOR [diagram]

  9. Model Metadata Compartment (in development) [diagram labels: Coupled_Models, Model_List, Component_Medias, Models, Variables, Experiments; Workflow Compartment, Variables Compartment]

  10. Data Samples from Model Compartment [tables shown: Components_Medias, Coupled_Models, Model_List, Models]

  11. Variables Compartment [diagram labels: Variables, Variable_Bundles, Variable_Lists, Variable_List_Contents, Projects, Proj_Var_Names; Workflow Compartment]

  12. Data Sample from Variables Compartment [tables shown: Proj_Var_Names, Variables, Variable_List_Contents, Variable_Lists, Variable_Bundles]

  13. Workflow Compartment (in development) [diagram labels: GFDL_USERS, Institutions, Experiment_Status, Realization, Projects, Experiments, Scenarios]

  14. Data Samples from Workflow Compartment [tables shown: Scenarios, Experiments]

  15. Postprocessing Compartment [diagram labels: Post_Proc, PP_Units, Coupled_Models, Projects, GFDL_USERS, PP_Content, Average_Periods, Variable_Lists]. Data Samples from Postprocessing Compartment [tables shown: PP_Content, PP_Units]

  16. Data Portal Compartment [diagram labels: Data_Files, Data_Grids, Variables, MissedData_Descriptors, Experiments, Coupled_Models, Variable_Bundles]

  17. Data Samples from Data Portal Compartment [tables shown: Data_Files, MissedData_Descriptors, Data_Grids]

  18. “curator” DB is in use now:
     • CM2.0
     • CM2.1

  19. Future Development
     • Align DB terms with conventional terminology.
     • Establish model metadata schema standards and create tables in the “curator” DB following this schema.
     • Fill these tables with real metadata extracted from the models of GFDL, CCSM and MIT and from the ESMF Component Database.
     • Implement tables for observational data metadata.
     • Implement support for DODS aggregated data.
     • Build an XML bridge for transcoding DB input/output to and from XML.
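The XML bridge in the last bullet is listed as future work, so the slides give no design for it. As a rough sketch of what transcoding DB rows to and from XML could look like, the following uses Python's standard xml.etree.ElementTree. The element names table, row and field are invented for illustration and are not taken from FRE or any existing schema.

```python
import xml.etree.ElementTree as ET


def rows_to_xml(table, columns, rows):
    """Serialize DB rows into a simple XML document (hypothetical layout)."""
    root = ET.Element("table", name=table)
    for row in rows:
        record = ET.SubElement(root, "row")
        for column, value in zip(columns, row):
            field = ET.SubElement(record, "field", name=column)
            field.text = str(value)
    return ET.tostring(root, encoding="unicode")


def xml_to_rows(xml_text):
    """Parse the document back into one dict per row.

    All values come back as strings; restoring column types would need
    schema information that this sketch does not carry.
    """
    root = ET.fromstring(xml_text)
    return [
        {field.get("name"): field.text for field in row.findall("field")}
        for row in root.findall("row")
    ]
```

A round trip such as xml_to_rows(rows_to_xml("Scenarios", ["scenario_id", "name"], [(1, "scenario_a")])) yields one dict per row with string values. NULLs and empty strings are not distinguished here and would need an explicit convention in a real bridge.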

  20. END. Questions? Suggestions? Objections? Thanks!
