Comprehensive Large-Array Data Stewardship System (CLASS) Update

Presentation Transcript


  1. DAARWG Meeting: Comprehensive Large-Array Data Stewardship System (CLASS) Update

  2. Overview
     • Background
     • System Development
     • Operational Integration
     • Recent Accomplishments
     • Future Direction
     • Next Steps
     DAARWG Meeting: CLASS Update

  3. Background: Vision
     • No place to put mountain of data. Manage rapidly growing data volume of major observing and modeling systems
     • Every program solving the same problem. Eliminate various “stove-pipe” systems and produce a unified “enterprise” data access system to reduce IT cost
        • Satellite Active Archive (SAA)
        • GOES Active Archive (GAA)
        • Earth Observing System (EOS) Archive
     • Can’t access the data we have. Centralize NOAA’s numerous data systems for environmental data access; create a single portal
     • Don’t break anything. Retain, as much as possible, portions and modules of existing legacy systems

  4. Background: Level 1 Requirements
     • Scope. Enterprise-wide IT system supporting long-term, secure storage of and common access to environmental datasets and information stewarded by NOAA’s Archives.
     • Large Data Campaigns. Satellites (NPP/JPSS, GOES, POES, DMSP, MetOp), Radar (NEXRAD), Models
     • Enterprise Approach
        • Providing common services for development and operation of IT systems supporting NOAA Archives
        • Consolidating legacy archival storage systems
        • Relieving data producers of responsibility for archival

  5. Background: Contract
     • The CLASS support contract was awarded competitively on June 20, 2008 to Diversified Global Partners (DGP) JV LLC
     • Small business set-aside: 8(a) mentor-protégé program
        • Protégé Company: DB Consulting Group
        • Mentor Company: Global Science and Technology (GST) Inc.
     • Potential nine-year period of performance (now in year 3):
        • Base Year
        • Four (4) one-year Option Periods
        • Four (4) one-year Award Term Option Periods
     • Indefinite Delivery/Indefinite Quantity (IDIQ) contract
        • Maximum Ordering Volume of $200M (maximum of nine years)
        • Cumulative Tasking to date valued at $42.0M
     [Figure: map of CLASS sites. Labels: NSOF (Devel/Ingest), Suitland, MD; NGDC (Ops), Boulder, CO; Fairmont, WV (Devel); NCDC (Ops), Asheville, NC]

  6. Development
     • Systems
     • Software
     • Data Integration

  7. Software Development
     • CLASS Software Evolution (CLASS-SE). Five-year project to establish configurable ingest
     • NOAA Enterprise Archive Access Tool (NEAAT). Enterprise Application Program Interface (API)
     [Figure: Open Archival Information System Reference Model (OAIS-RM) applied to the CLASS system. Labels: Stakeholders, Data Stewards, Producer, Consumer, System Administrators, Preservation Planning, Data Management, Ingest, Access, Archival Storage, Administration]

  8. Software Development: CLASS Software Evolution (CLASS-SE)
     • CLASS-SE provides configurable ingest capability
     • Configurability reduces development costs for storage and access of new data types
     • Applications, or services, are “data agnostic” and may be applied against selected data types
     • Workflow engine supports both
        • Collective system operations & resource allocation
        • Allocates services from 1 to N instances of CLASS nodes
        • Individual nodes may perform all or a subset of system capabilities
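The idea of configurable, data-agnostic ingest on this slide can be illustrated with a minimal sketch. It is not CLASS-SE code: the `Granule` record, the three services, and the `INGEST_CONFIG` table are all hypothetical names chosen for illustration. The point is only that adding a data type means adding a configuration entry rather than writing new code.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Granule:
    """Hypothetical unit of ingest; not the actual CLASS data model."""
    data_type: str
    name: str
    size_bytes: int
    log: List[str] = field(default_factory=list)


Service = Callable[[Granule], Granule]


def validate(g: Granule) -> Granule:
    # Reject empty files; the check does not depend on the data type.
    if g.size_bytes <= 0:
        raise ValueError(f"{g.name}: empty file")
    g.log.append("validate")
    return g


def extract_metadata(g: Granule) -> Granule:
    g.log.append("extract_metadata")
    return g


def archive(g: Granule) -> Granule:
    g.log.append("archive")
    return g


# Data-agnostic services, looked up by name.
SERVICES: Dict[str, Service] = {
    "validate": validate,
    "extract_metadata": extract_metadata,
    "archive": archive,
}

# Hypothetical configuration: which services apply to which data type.
INGEST_CONFIG: Dict[str, List[str]] = {
    "CORS_RINEX": ["validate", "extract_metadata", "archive"],
    "CFSR": ["validate", "archive"],
}


def ingest(g: Granule) -> Granule:
    """Run the configured service chain for the granule's data type."""
    steps = INGEST_CONFIG.get(g.data_type)
    if steps is None:
        raise KeyError(f"no ingest configuration for {g.data_type!r}")
    for step in steps:
        g = SERVICES[step](g)
    return g
```

For example, `ingest(Granule("CFSR", "cfsr_file", 1024)).log` lists only the two services configured for that type, while a `CORS_RINEX` granule passes through all three.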

  9. Software Development: NOAA Enterprise Archive Access Tool (NEAAT)
     • Enterprise access to all NOAA archive storage systems
     • Satisfies L1RD Requirements
        • CLASS Access Interface
        • Support to GEO-IDE
        • Interface to Legacy Systems
     • Supports both data access and stewardship applications
     • Service-Oriented Architecture (SOA) Middleware
        • Simple plugin adaptor in integration layer provides interface to NEAAT
        • Support for open source tool kits (e.g., OGC)
     • OPeNDAP prototype provides access to climate model data through NOMADS (National Operational Model Archive & Distribution System)
     [Figure: NEAAT architecture. A Customer Application reaches NEAAT through Standard Protocols (OPeNDAP); five Plugin adaptors connect NEAAT to back-end systems. Labels: Legacy Systems, Satellites, CFSR, NCDC, NCEP, Models, CLASS, HDSS, ESG, INE, IDEAS, ESSE, SPIDR, NARR, NOMADS, Data Migration]
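The plugin-adaptor arrangement described on this slide can be sketched as a small adapter pattern. This is an illustration only, not the NEAAT API: `ArchivePlugin`, `InMemoryPlugin`, and `AccessLayer` are hypothetical names, and the in-memory holdings stand in for real back-end systems.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class ArchivePlugin(ABC):
    """Hypothetical adaptor interface: one subclass per back-end system."""

    @abstractmethod
    def search(self, dataset: str) -> List[str]:
        """Return identifiers of holdings that belong to `dataset`."""


class InMemoryPlugin(ArchivePlugin):
    """Stand-in for a real back end; holdings are kept in a dict."""

    def __init__(self, holdings: Dict[str, List[str]]) -> None:
        self._holdings = holdings

    def search(self, dataset: str) -> List[str]:
        return list(self._holdings.get(dataset, []))


class AccessLayer:
    """Single entry point that fans a query out to every registered plugin."""

    def __init__(self) -> None:
        self._plugins: Dict[str, ArchivePlugin] = {}

    def register(self, system: str, plugin: ArchivePlugin) -> None:
        self._plugins[system] = plugin

    def search(self, dataset: str) -> Dict[str, List[str]]:
        results: Dict[str, List[str]] = {}
        for system, plugin in self._plugins.items():
            hits = plugin.search(dataset)
            if hits:  # omit systems that hold nothing for this dataset
                results[system] = hits
        return results
```

In this sketch a client calls `AccessLayer.search` once and never needs to know which system holds the data; supporting another legacy system means registering one more plugin.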

  10. Data Integration: Data Campaigns (L1RD 5.1.2, 11/6/08)

  11. Data Integration: New Acquisitions

  12. Recent Accomplishments: Archiving Climate Forecast System Reanalysis and Reforecast (CFSRR)
     • First climate model data archived on CLASS; access through NOMADS
     • 245 TB of CFS reanalysis in tape archive
     • 100 TB of “high-priority” data available on disk for rapid access
     • Just began ingest of reforecast
     • NCDC-NCEP-CLASS Project partnership
     • Reuse of existing spinning disk
     • Agile Software Development; rapid integration of open source access protocol
     • One of NCEP’s most important data sets
     • Significant jump in NCDC data access

  13. Recent Accomplishments: Continuously Operating Reference Stations (CORS)
     • First NOS data set in CLASS
     • Complete: CORS File Naming Convention document signed May 2010
     • NGDC has established a centralized ingest interface to CLASS
     • In-process: Interface Control Document
     • CORS goals for CLASS version 5.4 release:
        • archive of forward-looking RINEX files (3173 daily files) and metadata
        • daily ingest ~ 49.5 GB/day
     • Future CORS goals, pending success of CLASS archive:
        • archive of forward-looking binary files
        • archive of historical RINEX and binary files
        • archive of NGS reanalysis data
        • current NGDC archive total: ~69.0 TB
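The CORS figures on this slide imply a mean file size and a yearly ingest volume, which the short calculation below works out. It assumes decimal units (1 TB = 1000 GB = 1,000,000 MB) and a constant daily rate over a 365-day year; neither assumption is stated in the slides.

```python
DAILY_FILES = 3173        # forward-looking RINEX files per day (from the slide)
DAILY_VOLUME_GB = 49.5    # approximate daily ingest (from the slide)
MB_PER_GB = 1000          # assumption: decimal units
GB_PER_TB = 1000          # assumption: decimal units


def mean_file_size_mb() -> float:
    """Average size of one daily RINEX file, in MB."""
    return DAILY_VOLUME_GB * MB_PER_GB / DAILY_FILES


def annual_volume_tb(days: int = 365) -> float:
    """Ingest volume over `days` days at the constant daily rate, in TB."""
    return DAILY_VOLUME_GB * days / GB_PER_TB


if __name__ == "__main__":
    print(f"mean file size: {mean_file_size_mb():.1f} MB")
    print(f"annual volume:  {annual_volume_tb():.1f} TB")
```

Under those assumptions the files average about 15.6 MB each, and a year of forward-looking RINEX ingest comes to roughly 18 TB, compared with the ~69.0 TB already held at NGDC.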

  14. Future Direction: Current Architecture
     • Simple model based on preservation through two-site replication.
     [Figure: current architecture. Labels: Data Sources, Ingest Node (NSOF), Full Node (NCDC), Full Node (NGDC), Replication, Archive]
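Preservation through two-site replication can be sketched as writing every object to both sites and then auditing the copies against a recorded checksum. The sketch below is an assumption-laden illustration, not the CLASS replication mechanism: the `Site` class, the in-memory stores, and the choice of SHA-256 are all hypothetical.

```python
import hashlib
from typing import Dict, Iterable, List, Optional


def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


class Site:
    """Hypothetical archive site backed by an in-memory store."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.store: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> str:
        self.store[key] = data
        return sha256(self.store[key])

    def checksum(self, key: str) -> Optional[str]:
        data = self.store.get(key)
        return None if data is None else sha256(data)


def replicate(key: str, data: bytes, primary: Site, secondary: Site) -> str:
    """Write `data` to both sites; return the checksum to record for audits."""
    expected = sha256(data)
    for site in (primary, secondary):
        if site.put(key, data) != expected:
            raise IOError(f"{site.name}: checksum mismatch after write")
    return expected


def audit(key: str, expected: str, sites: Iterable[Site]) -> List[str]:
    """Return the names of sites whose copy is missing or corrupted."""
    return [s.name for s in sites if s.checksum(key) != expected]
```

A periodic `audit` that returns a non-empty list tells an operator which site needs its copy restored from the other, which is the property two-site replication is meant to provide.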

  15. Future Direction: Potential Architecture
     • Increase distribution of nodes
     • Federate with Centers of Data providing tiers of service
     • Exploit cloud resources for faster access
     • Becoming more H/W agnostic
     [Figure: potential architecture. Labels: Data Producer, Data Processing, Processing Node, Ingest Node, Cloud Access, Data Stewardship, Full Node, Data Center, Sub-Node (Federated), Center of Data]

  16. Future Direction: Prototype Capability
     • NCDC partnership with RENCI and DataNet Consortium
        • Prototype “system of systems” framework; federation of NOAA data systems with NOAA archive using iRODS (Integrated Rule-Oriented Data System)
        • Connectivity to data systems such as RENCI, ORNL, OOI, and Earth System Grid (ESG)
     • Pilot Project
        • Federate with RENCI to share 70 TB of NEXRAD data
        • Utilize highly distributed computing to derive climate-quality precipitation re-analysis, push data products to NCDC archive system (CLASS)
     • Future Plans to Support Climate Assessments
        • Federate with NOS systems via RENCI to integrate data from OOI with climate data at NCDC
        • Federate with GFDL and ESG to integrate climate model data with in situ and satellite data at NCDC

  17. Next Steps
     • #1 Priority: NPP Operational Test & Evaluation
     • Prepare FY13 Submission
     • Release of 5.4.1; implement final version of NEAAT
     • Complete Cloud Computing Study
     • Establish Archive Architecture and ConOps
     • Prepare for Transition to Climate Service
     • Stand-up Project Management Staff
     • Program Review
     • Migrate data from legacy systems

  18. DAARWG Engagement
     • Need for NOAA-level focus on enterprise infrastructure (beyond “comm-lines”), e.g., a NOAA Program for GEO-IDE development and fielding
     • Need for NOAA-level policies, directives, and concepts to constrain operational practices and guide IT investments, e.g., for CLASS

  19. Scott Hausman
     Acting Director
     NOAA’s National Climatic Data Center (NCDC)
     151 Patton Avenue, Room 557
     Asheville, NC 28807-5002
     828-271-4848 • 828-271-4246 • 828-450-9188
     Scott.Hausman@noaa.gov
     www.ncdc.noaa.gov