C3-Grid * Federation System for Climate Data Handling

C3-Grid* Federation System for Climate Data Handling. Stephan Kindermann, German Climate Computing Center (DKRZ). * Collaborative Climate Community Grid Project (part of the D-Grid Initiative).

Presentation Transcript


  1. C3-Grid* Federation System for Climate Data Handling. Stephan Kindermann, German Climate Computing Center (DKRZ). * Collaborative Climate Community Grid Project (part of the D-Grid Initiative). GO-ESSP 2008

  2. Overview
     • C3Grid overview: architecture, partners, goals
     • C3Grid federation system components:
        • C3Grid ISO discovery metadata and metadata catalog (with a short interoperability study: C3Grid ISO metadata / GeoNetwork)
        • Data access and preprocessing
        • C3Grid security
     • C3Grid / IPCC?

  3. C3Grid: Overview (architecture diagram)
     • C3Grid data providers: World Data Centers, universities, research institutes
     • Institutions shown: IFM-Geomar, FU Berlin, Uni Köln, Climate, Mare, RSAT, DWD, DKRZ, PIK, GKSS, AWI, MPI-M
     • Provider interfaces: (A) ISO discovery metadata, (B) data access interface
     • Other labels: ISO 19139 discovery catalog, portal, C3Grid data and job management middleware, grid data / job interface, collaborative grid workspace (data + metadata), D-Grid (SRM, dCache, ..), C3RC, workflow, result data products + metadata

  4. (A) Metadata for Data Discovery: Design and Implementation
     • Diagram labels: C3Grid data providers, data access interface, ISO discovery metadata (A), ISO 19139 discovery catalog

  5. (A) Metadata: harvesting and lookup components
     • Technology:
        • ISO 19115/19139 metadata profile
        • OAI-PMH harvesting → catalogue
        • Lucene-based catalogue search
        • GridSphere-based portal
     • Fast range queries
     • Java API + web service interface made available on sourceforge.net; see also http://www.panfmp.org
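
The OAI-PMH harvesting step on slide 5 can be illustrated with a minimal sketch. It only builds `ListRecords` request URLs and reads record identifiers and the resumption token from one response page; it does no network access. The base URL, the `iso19139` metadata prefix and the sample response are assumptions made for illustration and do not come from the C3Grid implementation.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace of OAI-PMH 2.0 response documents.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL."""
    if resumption_token:
        # A resumptionToken is sent alone with the verb when paging.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_page(xml_text):
    """Return (record identifiers, next resumption token or None)."""
    root = ET.fromstring(xml_text)
    ids = [
        e.text
        for e in root.findall(".//oai:record/oai:header/oai:identifier", OAI_NS)
    ]
    token = root.find(".//oai:resumptionToken", OAI_NS)
    next_token = token.text if token is not None and token.text else None
    return ids, next_token


# Hypothetical response page, shortened to the parts the sketch reads.
SAMPLE_PAGE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:exp1</identifier></header></record>
    <record><header><identifier>oai:example.org:exp2</identifier></header></record>
    <resumptionToken>page2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url("http://example.org/oai", metadata_prefix="iso19139"))
    ids, token = parse_page(SAMPLE_PAGE)
    print(ids, token)
    print(list_records_url("http://example.org/oai", resumption_token=token))
```

A harvester would repeat the request with the returned token until no token is left, and hand each record to the catalogue index.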

  6. (A) C3Grid ISO 19139 profile
     Design criteria:
     • No schema extensions; profiling by restriction
     • Restriction using Schematron constraints
     • "The granularity of the discovery metadata should reflect the logical organization of the data repository at a sufficiently coarse grained level" (1)
     • CF-based content description
     • Link to resource metadata infrastructure (GT4-MDS based)
     (1) INSPIRE: DT Metadata – Draft Implementing Rules for Metadata (version 2, 02/02/2007)

  7. (A) C3Grid ISO Profile
     • Description at aggregate level (e.g. experiment)
        • Aggregate extent description with multiple verticalExtent sections
     • Sub-selection in data request

  8. (A) C3Grid ISO Profile: CF usage
     • Content description based on (extended) CF names → link to corresponding vertical CRS
     • Annotation on the xlink:href values: reference to vertical CRS

     <contentInfo><MD_CoverageDescription>
       <attributeDescription><gco:RecordType>air_temperature</gco:RecordType></attributeDescription>
       <dimension xlink:href="#verticalCRS_hPa"><MD_RangeDimension>
         <descriptor><gco:CharacterString>K</gco:CharacterString></descriptor>
       </MD_RangeDimension></dimension>
     </MD_CoverageDescription></contentInfo>
     <contentInfo><MD_CoverageDescription>
       <attributeDescription><gco:RecordType>sea_surface_temperature</gco:RecordType></attrib…>
       <dimension xlink:href="#verticalCRS_m"><MD_RangeDimension>
         <descriptor><gco:CharacterString>K</gco:CharacterString></descriptor>
       </MD_RangeDimension></dimension>
     </MD_CoverageDescription></contentInfo>
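
To show how a client might read the content description on slide 8, here is a minimal sketch that extracts the CF name, the referenced vertical CRS and the descriptor from each `contentInfo` block. It assumes the unprefixed elements on the slide belong to the ISO 19139 `gmd` namespace, and it uses a self-written, fully closed sample document with a hypothetical wrapper element; the function name `content_entries` is likewise illustrative.

```python
import xml.etree.ElementTree as ET

# Assumption: unprefixed slide elements live in the gmd namespace.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

SAMPLE = """<gmd:MD_Metadata xmlns:gmd="http://www.isotc211.org/2005/gmd"
    xmlns:gco="http://www.isotc211.org/2005/gco"
    xmlns:xlink="http://www.w3.org/1999/xlink">
  <gmd:contentInfo><gmd:MD_CoverageDescription>
    <gmd:attributeDescription><gco:RecordType>air_temperature</gco:RecordType></gmd:attributeDescription>
    <gmd:dimension xlink:href="#verticalCRS_hPa"><gmd:MD_RangeDimension>
      <gmd:descriptor><gco:CharacterString>K</gco:CharacterString></gmd:descriptor>
    </gmd:MD_RangeDimension></gmd:dimension>
  </gmd:MD_CoverageDescription></gmd:contentInfo>
  <gmd:contentInfo><gmd:MD_CoverageDescription>
    <gmd:attributeDescription><gco:RecordType>sea_surface_temperature</gco:RecordType></gmd:attributeDescription>
    <gmd:dimension xlink:href="#verticalCRS_m"><gmd:MD_RangeDimension>
      <gmd:descriptor><gco:CharacterString>K</gco:CharacterString></gmd:descriptor>
    </gmd:MD_RangeDimension></gmd:dimension>
  </gmd:MD_CoverageDescription></gmd:contentInfo>
</gmd:MD_Metadata>"""


def content_entries(xml_text):
    """List CF name, vertical CRS id and descriptor per coverage description."""
    root = ET.fromstring(xml_text)
    entries = []
    for cov in root.findall(".//gmd:MD_CoverageDescription", NS):
        name = cov.findtext(
            "gmd:attributeDescription/gco:RecordType", default="", namespaces=NS
        )
        dim = cov.find("gmd:dimension", NS)
        href = dim.get(XLINK_HREF, "") if dim is not None else ""
        descriptor = cov.findtext(
            "gmd:dimension/gmd:MD_RangeDimension/gmd:descriptor/gco:CharacterString",
            default="",
            namespaces=NS,
        )
        entries.append(
            {
                "cf_name": name.strip(),
                "vertical_crs": href.lstrip("#"),  # local reference without '#'
                "descriptor": descriptor.strip(),
            }
        )
    return entries


if __name__ == "__main__":
    for entry in content_entries(SAMPLE):
        print(entry)
```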

  9. (A) C3Grid ISO profile
     • Data distributor info:
        • Reference to C3Grid resource metadata catalog (MDS) (names → service endpoints)
        • (Optional: service endpoints)

  10. (A) C3Grid ISO profile
      • Data provenance description:
         • For now (data staging output): simple sequence of ProcessStep descriptions
         • Later (C3Grid-processed data): combined Source/ProcessStep blocks + external data provenance store

  12. C3Grid ISO Profile: A short GeoNetwork experiment
      • Federation building:
         • OAI-PMH, WebDAV, Z39.50, GeoNetwork
      • Full ISO metadata support (ISO 19139/19119)
      • OGC CSW 2.0 reference implementation
      • RSS and GeoRSS newsfeeds
      • SKOS-based thesauri
      • Adaptable to new schemas
      • Schematron constraint checking
      • On roadmap:
         • Flexible ISO profile support
         • Shibboleth integration

  13. C3Grid ISO Profile: A short GeoNetwork experiment

  14. Building complex metadata federations …
      • Harvesting via:
         • CSW
         • OAI-PMH
         • GeoNetwork
         • WebDAV

  15. C3Grid ISO Profile: A short GeoNetwork experiment
      • Import / edit / search: OK
      • Missing:
         • Content (CF) search
         • Vertical search
         • Temporal BBox search
         • Data staging
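
The vertical and temporal searches listed as missing on slide 15 come down to range-overlap tests against each record's extents. The sketch below is one plain way to express that test, assuming a record carries one temporal extent and several vertical extents (as in the aggregate descriptions of slide 7). The class and function names are hypothetical, and this is not how the Lucene-based C3Grid catalogue implements its range queries.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple


@dataclass(frozen=True)
class RecordExtent:
    """Hypothetical extent summary of one discovery metadata record."""
    time: Tuple[date, date]               # (start, end)
    vertical: List[Tuple[float, float]]   # one (min, max) per verticalExtent


def overlaps(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def matches(record, time_query, vertical_query):
    """A record matches if its time range and any vertical range overlap the query."""
    return overlaps(record.time, time_query) and any(
        overlaps(v, vertical_query) for v in record.vertical
    )


if __name__ == "__main__":
    rec = RecordExtent(
        time=(date(1990, 1, 1), date(1999, 12, 31)),
        vertical=[(0.0, 10.0), (500.0, 1000.0)],
    )
    year_1995 = (date(1995, 1, 1), date(1995, 12, 31))
    print(matches(rec, year_1995, (5.0, 20.0)))     # overlaps the first vertical range
    print(matches(rec, year_1995, (20.0, 400.0)))   # falls between the two vertical ranges
```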

  18. → Complete portal prototype to search, access and (pre-)process data described by the C3Grid ISO profile, built in 3 weeks on the GeoNetwork open source solution

  20. (B) Data Access and Preprocessing
      • C3Grid data providers: World Data Centers, research institutes, university partners
      • Provider interfaces: (A) ISO discovery metadata, (B) data access interface
      • Other diagram labels: ISO discovery catalog, collaborative grid workspace (data + metadata), data analysis workflow, result data products + metadata

  21. (B) Data Access and Pre-Processing: Implementation
      • C3Grid Generation 1: secured plain web services
      • (status)
      • C3Grid Generation 2: WSRF service interfaces (scheduled November 08)
      • Generation 2+: full PKI/SAML security stack
      • Data staging request: data IDs; selection (lon, lat, alt; time; content: CF); output properties
      • Offer: time / resource estimation
      • Processing jobs: JSDL-based description
      • Other diagram labels: Data Staging Web Service, WS GRAM, skeleton impl., status .., local resource manager, provider staging jars, provider staging scripts, MD DB, flat file DB, archive, distributed C3Grid workspace
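
The staging request on slide 21 selects data by ID, by lon/lat/alt and time ranges, and by CF content names. The sketch below models such a request and runs a few basic sanity checks before it would be sent. All field names, the ISO 8601 date strings and the specific checks are assumptions for illustration; the slide does not specify the actual C3Grid web service message format.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Tuple


@dataclass(frozen=True)
class StagingRequest:
    """Hypothetical data staging request with the selection criteria of slide 21."""
    data_id: str
    lon: Tuple[float, float]        # (west, east)
    lat: Tuple[float, float]        # (south, north)
    alt: Tuple[float, float]        # (lower, upper)
    time: Tuple[str, str]           # ISO 8601 (start, end)
    cf_names: Tuple[str, ...]       # requested content, as CF standard names


def validate(req: StagingRequest) -> List[str]:
    """Return a list of problems; an empty list means the request looks sane."""
    problems = []
    for name in ("lon", "lat", "alt"):
        lo, hi = getattr(req, name)
        if lo > hi:
            problems.append(f"{name}: lower bound {lo} exceeds upper bound {hi}")
    if req.lat[0] < -90 or req.lat[1] > 90:
        problems.append("lat: outside [-90, 90]")
    start, end = (datetime.fromisoformat(t) for t in req.time)
    if start > end:
        problems.append("time: start is after end")
    if not req.cf_names:
        problems.append("content: at least one CF name is required")
    return problems


if __name__ == "__main__":
    request = StagingRequest(
        data_id="exp1",
        lon=(0.0, 30.0),
        lat=(40.0, 60.0),
        alt=(0.0, 1000.0),
        time=("2000-01-01", "2000-12-31"),
        cf_names=("air_temperature",),
    )
    print(validate(request))
```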

  22. C3Grid Middleware Components
      • Scheduler: Globus WSRF based; accepts WSL workflow descriptions (compute tasks + data staging tasks)
      • Data management: Globus WSRF based; offer negotiation with scheduler; consistent view of distributed data (later: replica management, caching)
      • Globus MDS resource metadata catalog: service registry, resource status
      → Dependency on the Globus software stack; no high-level implementation support tools; implementation migration to Globus 4.1.x ??; problems with the delegation implementation (insufficient documentation and guidance)

  23. C3Grid Workflow Analysis
      • Diagram groups: task-related, workflow-related
      • Interaction and monitoring via the WS-Notification standard
      • Handler to facade single/specific tasks
      • Monitoring and management of workflow execution
      • (Individual) scheduling strategy to optimize the management
      • Analysis and preparation of workflows

  24. (C) Security Infrastructure (diagram)
      • "Home attributes + VO attributes"
      • Diagram labels: Identity Provider, Home Organisation, Attribute Provider, Virtual Organisation, browser, Shibboleth login, SAML, portal, C3Grid middleware, Webstart app, delegation service, GridShib SAML tools, workflow client, grid service, X.509 grid proxy, grid resource, <..SAML Assertions..>, policy, GridShib for GT, GRAM / DataRAM, SLCS (CA), MyProxy, personal / group account, DFN

  25. (C) Security Infrastructure
      • Status:
         • Shibboleth IdPs running at core C3Grid partners
         • Online CA for short-lived credentials tested, set up and operated by DFN (the German NREN)
         • Online CA (DFN-SLCS) accreditation process with EUGridPMA started
         • SLCs contain campus attributes as SAML assertions
         • Java Webstart app to bootstrap SLCS in development at DFN
         • GridShib SAML Tools (v0.6.0) tested
         • Prototype of shibbolized GridSphere portal tested
         • Open issues with GT4 proxy-delegation implementation
      • Next:
         • Integration of components
         • Virtual home organization for C3 users without a Shibboleth IdP
         • Integration of VO attributes (shibbolized VOMS)

  26. C3Grid / IPCC Use Case
      • (0) IPCC metadata harvested / mirrored in CERA DB (WDCC)
      • Metadata visible in C3 portal
      • User issues IPCC data import from external repository
      • User → OpenID IdP / + IPCC_Access role → external repository
      • Download → ?? → C3 repository
      • C3Grid grants access to users with IPCC_Access role
      • "Grant procedure?": contact the IdP/AttributeService before each workflow execution, or use a more offline method?
      • Diagram labels: IPCC data import, analysis workflow, workflow result publication, C3RC / C3 workspace

  27. Appendix

  28. C3Grid Content Info (Version 2)

      <contentInfo>
        <MD_CoverageDescription>
          <attributeDescription>
            <gco:RecordType>CF_name_with_attribute</gco:RecordType>
          </attributeDescription>
          <contentType>
            <MD_CoverageContentTypeCode codeList="http://wis.wmo.int/2006/catalogues/cf-standard-name-table.xml" codeListValue="air_temperature">
              air_temperature with a cell_methods attribute including time:mean (interval: 1 day)
            </MD_CoverageContentTypeCode>
          </contentType>
          <dimension xlink:href="#verticalCRS_hPa"><MD_RangeDimension><descriptor>
            <gco:CharacterString>K</gco:CharacterString>
          </descriptor></MD_RangeDimension></dimension>
        </MD_CoverageDescription>
      </contentInfo>

  29. Security Aspect: C3Grid step 0 → step 1

  32. (C) Data Reuse of Analysis Results: Metadata Generation
      • Context description of analysis data:
         • Aggregation
         • Processing history
      • Data model (diagram): p_data is_part_of / has_parent a parent collection (multiplicities shown: 0..1, *); p_data is_generated_by process steps (0..*); a process step has_input sources
      • Attributes shown: time stamp, description, citation info; for source: description
      • Components (diagram): portal, WS interface, Lucene+ index, OAI harvester, OAI-PMH server, C3Grid workspace, workflow, "quality check", m_tool API prototype (Python)
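
The data model on slide 32 (data product, parent collection, process steps, sources) can be sketched with a few dataclasses. This is an illustrative reading of the diagram, not the m_tool API: all class and field names are hypothetical, and attaching the time stamp to the process step and the citation info to the data product is an assumption, since the flattened diagram does not make the attribute placement clear.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Source:
    """Input of a process step (diagram: has_input)."""
    description: str


@dataclass
class ProcessStep:
    """One processing step (diagram: is_generated_by)."""
    description: str
    timestamp: str  # assumption: the time stamp belongs to the step
    inputs: List[Source] = field(default_factory=list)


@dataclass
class Collection:
    """Parent collection a data product is part of."""
    name: str


@dataclass
class DataProduct:
    """Analysis result (p_data in the diagram)."""
    name: str
    citation: str = ""  # assumption: citation info belongs to the product
    part_of: Optional[Collection] = None
    generated_by: List[ProcessStep] = field(default_factory=list)


def context_description(product: DataProduct) -> dict:
    """Summarise aggregation and processing history for a data product."""
    return {
        "aggregation": product.part_of.name if product.part_of else None,
        "processing_history": [
            {
                "step": step.description,
                "time": step.timestamp,
                "inputs": [src.description for src in step.inputs],
            }
            for step in product.generated_by
        ],
    }


if __name__ == "__main__":
    product = DataProduct(
        name="monthly_mean_air_temperature",
        part_of=Collection("experiment_A"),
        generated_by=[
            ProcessStep("data staging", "2008-06-01T12:00:00", [Source("archive dataset")]),
            ProcessStep("time mean", "2008-06-01T12:30:00", [Source("staged subset")]),
        ],
    )
    print(context_description(product))
```

Keeping the process steps as an ordered list matches the "simple sequence of ProcessStep descriptions" named on slide 10 for data staging output.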
