
Advancing NCAR’s Cyberinfrastructure: the Community Data Portal & the Earth System Grid

Advancing NCAR’s Cyberinfrastructure: the Community Data Portal & the Earth System Grid. Luca Cinquini SCD User Forum, May 2005 http://cdp.ucar.edu/ http://www.earthsystemgrid.org/


Presentation Transcript


  1. Advancing NCAR’s Cyberinfrastructure: the Community Data Portal & the Earth System Grid
     Luca Cinquini, SCD User Forum, May 2005
     http://cdp.ucar.edu/ http://www.earthsystemgrid.org/
     ESG & CDP CI staff: Dave Brown, Michael Burek, Luca Cinquini, James Humphrey, Robert Markel, Don Middleton (PI), Markus Stobbs, Nathan Wilhelmi

  2. Cyberinfrastructure: “Heterogeneous set of computer and information services (hardware and software) that facilitate management, access and analysis of scientific data”

  3. NCAR Cyberinfrastructure (diagram)
     • Data sources: Climate Models, Campaigns, Remote Sensing, Space Weather
     • Data layer: Data Storage, MSS access, Metadata Catalogs
     • Services: Data Search & Discovery, Auth & Authz Services, Data Download, Data Aggregation & Subsetting, Data Transformation, Data Analysis & Visualization
     • Clients: Web Browsers, API-specific Clients

  4. Cyberinfrastructure benefits:
     • Sharing of data among restricted virtual communities, and distribution of data to the largest possible audience, including new communities
     • Seamless access to data independent of location, intercomparison of data, and discovery of new data of interest
     • Reduction of spin-up cost and time for new scientific projects (“no reinventing the wheel”)
     • Adoption of standards for data formats, metadata schemas, and service APIs, leading to cross-agency interoperability
     • Etc.
     Make life easier for data producers and data users.

  5. The Community Data Portal & the Earth System Grid
     ESG and CDP are two synergistic projects currently underway in SCD aimed at advancing the NCAR Cyberinfrastructure, from two different perspectives:
     • CDP
       • Funding: NCAR strategic initiative (NSF)
       • Goal: gateway to all data that is produced/stored at NCAR
       • Strategy: promote collaboration and co-development among many data efforts within NCAR
     • ESG
       • Funding: DOE SciDAC program
       • Participants: ANL, ISI, LANL, LBNL, LLNL, NCAR, ORNL
       • Focus: climate model data (CCSM, PCM, etc.)
       • Technology: Globus toolkit for Grid computing

  6. CDP: scientific data hosting
     CDP system: framework of integrated, shared resources and services for scientific data hosting
     • Hardware (8-processor Sun server, ~28 TB disk farm, connection to NCAR MSS)
     • Software (multi-function web portal, OPeNDAP server, GDS server, etc.)
     • Data holdings: ~700 datasets from a variety of disciplines (model data, campaign data, climate, biosphere, carbon cycle, atmosphere, land, ocean, etc.), located at NCAR and collaborating institutions

  7. CDP architecture (diagram)
     • Community Data Portal web portal (browse, search, download, aggregate, subset), reached through web browser access
     • CDP Data Nodes: disk, XML data catalogs, OPeNDAP server, GDS server, NCAR MSS, reached through API-specific client access
     • Harvest and exchange of data catalogs with other data centers and digital libraries
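
The slide lists an OPeNDAP server alongside aggregation and subsetting. As a rough illustration of what API-specific client access to such a server can look like, the sketch below builds a DAP2 request URL that asks for an index-range subset of one variable. The host, dataset path, and variable name are hypothetical placeholders, not actual CDP holdings; only the `[start:stride:stop]` constraint syntax comes from the DAP2 protocol.

```python
from urllib.parse import quote


def hyperslab(start, stop, stride=1):
    """Format one DAP2 index range as [start:stride:stop]; stop is inclusive."""
    if start < 0 or stop < start or stride < 1:
        raise ValueError("need 0 <= start <= stop and stride >= 1")
    return f"[{start}:{stride}:{stop}]"


def subset_url(base_url, variable, ranges, response="ascii"):
    """Build a DAP2 request URL for one variable restricted to index ranges.

    base_url : dataset URL on an OPeNDAP server (hypothetical in this sketch)
    variable : name of the array variable to request
    ranges   : one (start, stop) or (start, stop, stride) tuple per dimension
    response : DAP2 response suffix, e.g. "ascii", "dods", "dds", "das"
    """
    constraint = variable + "".join(hyperslab(*r) for r in ranges)
    # Keep brackets and colons literal so the constraint stays readable.
    return f"{base_url}.{response}?{quote(constraint, safe='[]:')}"


url = subset_url(
    "http://example.org/dods/mozart/o3.nc",  # hypothetical dataset path
    "O3",                                    # hypothetical variable name
    [(0, 11), (0, 0), (10, 20, 2)],          # e.g. time, level, latitude indices
)
print(url)
```

A client would fetch the resulting URL over HTTP; the server returns only the requested slice, so a user does not need to download the whole file to inspect a region of interest.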

  8. CDP demo
     • Browse
       • Mozart dataset
       • Dataset-level metadata
       • Dataset access services
       • File-level metadata
       • HTTP download
       • MSS download
     • Search:
       • “ozone”
       • Extended results
       • “ozone troposphere”
       • Link back to Mozart data catalog
       • “forest fire”
       • “source code”
       • “tuna”
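
To make the search part of the demo concrete, here is a minimal sketch of keyword search over an XML data catalog. The catalog schema, the dataset entries, and the rule that every query term must match are all assumptions made for illustration; the slides do not describe the portal's real catalog format or search semantics.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog: element names, attributes, and entries are invented.
CATALOG_XML = """
<catalog name="example">
  <dataset name="Mozart" description="Chemistry model output: ozone in the troposphere"/>
  <dataset name="Fire emissions" description="Forest fire emission estimates"/>
  <dataset name="Model source code" description="Climate model source code distribution"/>
</catalog>
"""


def search(catalog_xml, query):
    """Return names of datasets whose name or description contains every query term."""
    terms = query.lower().split()
    root = ET.fromstring(catalog_xml)
    hits = []
    for ds in root.iter("dataset"):
        text = (ds.get("name", "") + " " + ds.get("description", "")).lower()
        # Assumed semantics: all terms must appear (logical AND).
        if all(term in text for term in terms):
            hits.append(ds.get("name"))
    return hits


for query in ["ozone", "ozone troposphere", "forest fire", "source code", "tuna"]:
    print(query, "->", search(CATALOG_XML, query))
```

Each hit carries the dataset name, which is what a portal would need in order to link a search result back to the dataset's catalog page.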

  9. Earth System Grid features
     • ESG shares much of its technology with CDP and offers many of the same services (browsing, searching, download, aggregation and subsetting)
     • Some differences:
       • Access to climate model data on multiple deep storage systems (NCAR MSS, NERSC, ORNL HPSS) and disks (NCAR, LANL) by proxy certificate (no separate logins required!)
       • Uses distributed, cross-updating databases to keep track of file replicas
       • Uses a domain-specific, high-semantic metadata schema to describe climate model data
     • Data holdings:
       • NCAR web portal: CCSM (including control runs, some IPCC runs, CCSM 3.0 source code) and PCM models, for a total of >500 datasets, >50 TB (+ NCL and PyNGL distributions)
       • “Sister” ESG site at PCMDI/LLNL used to distribute IPCC data world-wide
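
The idea of cross-updating databases that track file replicas can be sketched as a mapping from a logical file name to the set of physical copies, with sites exchanging their mappings. The class below is a toy stand-in written for this transcript, not the replica service ESG actually runs; the class, its methods, and the URLs are hypothetical.

```python
from collections import defaultdict


class ReplicaCatalog:
    """Toy mapping from a logical file name to the physical copies of that file."""

    def __init__(self):
        self._replicas = defaultdict(set)

    def register(self, logical_name, physical_url):
        """Record that a copy of logical_name lives at physical_url."""
        self._replicas[logical_name].add(physical_url)

    def locate(self, logical_name):
        """Return all known copies, sorted for stable output."""
        return sorted(self._replicas.get(logical_name, ()))

    def merge_from(self, other):
        """Pull in another site's mappings, as a stand-in for cross-updating."""
        for name, urls in other._replicas.items():
            self._replicas[name] |= urls


# Two sites each know about their own copy of the same logical file.
ncar, ornl = ReplicaCatalog(), ReplicaCatalog()
ncar.register("run-x/atm.nc", "mss://ncar.example/run-x/atm.nc")    # hypothetical URL
ornl.register("run-x/atm.nc", "hpss://ornl.example/run-x/atm.nc")   # hypothetical URL

# After exchanging updates in both directions, either site can list every copy.
ncar.merge_from(ornl)
ornl.merge_from(ncar)
print(ncar.locate("run-x/atm.nc"))
```

With a mapping like this, a portal can present one logical dataset to the user and choose among physical copies at download time.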

  10. ESG architecture (diagram)
     • Web portal with metadata database and XML data catalogs
     • Storage sites: LBNL/NERSC, ORNL (HPSS), LANL (disk), NCAR (MSS, disk)
     • RLS and SRM services at the storage sites

  11. ESG demo
     • Browse:
       • CCSM source code, output data
       • PCM output data
       • Run b04.10
       • Download from NCAR MSS, NERSC, ORNL HPSS
       • Aggregation/subsetting

  12. Conclusions
     • Several active projects aimed at developing NCAR Cyberinfrastructure: ESG, CDP, GridBGC, GIS, VSTO
     • Cyberinfrastructure must be beneficial to producers and users of data:
       • We welcome feedback on current systems (good and bad)
       • Interested in gathering requirements for additional functionality in order to establish and prioritize future development
       • Interested in talking to data providers who want to share data (of any kind) that is of interest to the community
     • Feedback: please email cdp@ucar.edu

  13. The future
     • Greater integration of data portals and services within NCAR: promote sharing of software, services, co-development
     • Still allow the capability for discipline-specific functionality (sub-portals), and branding
     • Growing federation with other data centers across the U.S. and the world
