CLASS Presentation to the NOAA Science Advisory Board’s

CLASS Presentation to the NOAA Science Advisory Board’s Data Archiving and Access Requirements Working Group (DAARWG). Robert Rank, CLASS New Campaigns Manager. May 24-25, 2007.

Presentation Transcript


  1. CLASS Presentation to the NOAA Science Advisory Board’s Data Archiving and Access Requirements Working Group Robert Rank CLASS New Campaigns Manager May 24-25, 2007 DAARWG _ May 24-25, 2007

  2. Objectives • Review CLASS mission, role, drivers, challenges • What is an ‘Archive’ • Open Archival Information System (OAIS) • CLASS Architectural Transitions • Discuss GEO-IDE/CLASS joint efforts • CLASS Campaign Status • NODC IOC/FOC • CLASS APIs • NPP DDR • Open discussion

  3. CLASS Mission In its simplest form, CLASS’s mission is to provide IT infrastructure and support for NOAA archives.

  4. What is an “Archive?” • Historical definition: roughly “record preservation” • Semantics, functions varied significantly across “archives” • Digital information preservation is driving changes • Need for long-term understandability/usability • “Data” vs. “Information” • Modern definition: OAIS-RM • “… an organization of people and systems, that has accepted the responsibility to preserve information and make it available for a Designated Community.”

  5. OAIS-RM • International standard reference model for “open information archives” • Referenced in: • Report to Congress, CLASS L1Rs, ARWG-A&A Reqs • Initially inspired by the need to establish a common framework for discussions between archival entities • Provides • Terminology and concepts • Guidelines • Typical functional entities and services • Information taxonomy • “Mandatory responsibilities” • Core ‘Definition of Archive’

  6. OAIS Mandatory Responsibilities • Negotiate for and accept appropriate information from Producers • Obtain sufficient control of the information to ensure Long-Term Preservation • Determine the Designated Community • Ensure Independent Understandability • Follow documented policies and procedures for preservation • Make information available to the Designated Community

  7. CLASS/Archive Relationship • Information Preservation – (Requirements definition) • Archive “performs” • CLASS “provides IT capabilities in support of” • Expertise • Archive: information structure, semantics, usage; science • CLASS: IT • Focus • Archive: information • CLASS: data (bits) • Stewardship • Archive: science data stewardship - (Requirements development) • CLASS: data stewardship • Key takeaway: CLASS is not (nor does it “have”) an archive

  8. CLASS Mission Revisited CLASS’s mission is to provide IT infrastructure and support that enable NOAA archives to implement OAISs.

  9. OAIS-RM Functional Entities in CLASS Context

  10. Selected CLASS Attributes • Vision • “… An enterprise-wide IT system supporting long-term, secure storage of and access to NOAA’s archived environmental datasets” – CLASS L1Rs • Scope • “… NOAA (digital spatial environmental) data” • “… capable of supporting both existing and new archive collections …” – CLASS L1Rs • Applicability • “NOAA has directed that both legacy and emerging environmental observing systems requiring long-term archive plan to use CLASS.” – CLASS L1Rs • Customers - (Requirements development) • NOAA archives (e.g., NNDCs & Centers of Data) are CLASS’s direct customers • Consumers and Producers are archives’ direct customers

  11. Key CLASS Challenges • NOAA roles, responsibilities • Identification and allocation of all archival mission roles and responsibilities, operations • Essential to CLASS success, both practice and perception • NOAA integration/interoperation – GEO-IDE • Data heterogeneity/specificity • Large data volumes • Data management • Long-term data preservation • Complex, evolving problem • Currently in a developmental, research stage • Widely divergent user needs, requirements • Evolving from legacy system

  12. Addressing the Challenges - Internal • Emphasize generality, scalability, flexibility • Depend on standards wherever possible • Data models • Ontologies • Provide infrastructure customizable by external entities • Customer Interface for Access • Software must support hardware refresh transparently • Abstract hardware in software • Layered architecture • Track preservation-related best-practices and similar archival projects
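
The "abstract hardware in software" and "layered architecture" points can be illustrated with a minimal Python sketch. It assumes a storage interface that upper layers call, so that a hardware refresh only swaps the backend. Every name here (`StorageBackend`, `InMemoryBackend`, `DirectoryBackend`, `ArchivalStore`) is hypothetical and not taken from CLASS.

```python
from abc import ABC, abstractmethod
from pathlib import Path


class StorageBackend(ABC):
    """Hypothetical hardware-facing layer: one subclass per storage technology."""

    @abstractmethod
    def write(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def read(self, key: str) -> bytes: ...


class InMemoryBackend(StorageBackend):
    """Stand-in for the storage hardware currently in use."""

    def __init__(self) -> None:
        self._blobs = {}

    def write(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def read(self, key: str) -> bytes:
        return self._blobs[key]


class DirectoryBackend(StorageBackend):
    """Stand-in for replacement hardware: one file per object under a root directory."""

    def __init__(self, root: Path) -> None:
        self._root = root

    def write(self, key: str, data: bytes) -> None:
        (self._root / key).write_bytes(data)

    def read(self, key: str) -> bytes:
        return (self._root / key).read_bytes()


class ArchivalStore:
    """Upper layer: depends only on the StorageBackend interface."""

    def __init__(self, backend: StorageBackend) -> None:
        self._backend = backend

    def put(self, key: str, data: bytes) -> None:
        self._backend.write(key, data)

    def get(self, key: str) -> bytes:
        return self._backend.read(key)

    def migrate(self, keys, new_backend: StorageBackend) -> None:
        """Copy the listed objects to the new backend, then switch over."""
        for key in keys:
            new_backend.write(key, self._backend.read(key))
        self._backend = new_backend
```

Code that calls `put` and `get` is unchanged by `migrate`, which is the sense in which the refresh is transparent in this sketch.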

  13. Addressing the Challenges - External • Continue to push for clarification of roles and responsibilities • Archive ConOps is key • Continue to stress clear, comprehensive requirements • L1Rs have helped enormously • Push for L2, L*, requirements • ARWG – Archive & Access Requirements - V2.2 • Are they complete? Do they need to be revised? • Transparency • Early, strong commitment to GEO-IDE

  14. CLASS’s SOA • Consistency with FEAF/NOAA EA • Facilitates interoperability & provides public interfaces • Enables custom “client” applications • Reduces pressure to be all things to all people • Foundation for distributed infrastructure • Facilitates evolution from “as-is” system • Facilitates cost-effective extensibility • Primary services • Externally visible • Ingest • Access • Internally visible • Archival Storage • Data Management
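
The split between externally and internally visible services can be sketched in Python as follows. This is an illustrative assumption about how the four primary services might relate, not the CLASS design; all class and method names are hypothetical.

```python
class ArchivalStorage:
    """Internally visible: holds the stored bytes."""

    def __init__(self) -> None:
        self._objects = {}

    def store(self, object_id: str, payload: bytes) -> None:
        self._objects[object_id] = payload

    def retrieve(self, object_id: str) -> bytes:
        return self._objects[object_id]


class DataManagement:
    """Internally visible: holds descriptive metadata for lookup."""

    def __init__(self) -> None:
        self._catalog = {}

    def register(self, object_id: str, metadata: dict) -> None:
        self._catalog[object_id] = metadata

    def find(self, **criteria) -> list:
        # An object matches when every criterion equals its metadata value.
        return [oid for oid, md in self._catalog.items()
                if all(md.get(k) == v for k, v in criteria.items())]


class IngestService:
    """Externally visible: the only entry point for submitting data."""

    def __init__(self, storage: ArchivalStorage, catalog: DataManagement) -> None:
        self._storage = storage
        self._catalog = catalog

    def submit(self, object_id: str, payload: bytes, metadata: dict) -> None:
        self._storage.store(object_id, payload)
        self._catalog.register(object_id, metadata)


class AccessService:
    """Externally visible: the only entry point for finding and fetching data."""

    def __init__(self, storage: ArchivalStorage, catalog: DataManagement) -> None:
        self._storage = storage
        self._catalog = catalog

    def search(self, **criteria) -> list:
        return self._catalog.find(**criteria)

    def fetch(self, object_id: str) -> bytes:
        return self._storage.retrieve(object_id)
```

A custom "client" application in this sketch would hold only an `IngestService` or `AccessService` and never touch the two internal classes directly.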

  15. CLASS Architectural Transitions - Internal • Transition to SOA • Wrap existing functionality in services • Refactor behind-the-scenes later • Transition to layered architecture • Hardware abstraction • Distributed infrastructure • Continue to emphasize, expand componentization • Tactics • Prototype services and vet internally • Incrementally redesign subsystems • Increase use of standards (formats, code, terminology, etc.) • Decouple CLASS Web Interface from CLASS internals
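
"Wrap existing functionality in services, refactor behind the scenes later" can be sketched in Python as below. The legacy function, its pipe-delimited format, and the service name are all invented for illustration and do not describe actual CLASS code.

```python
def legacy_order_lookup(raw_query: str) -> str:
    """Hypothetical stand-in for as-is functionality: pipe-delimited text in and out."""
    dataset, year = raw_query.split("|")
    return f"{dataset}_{year}_a.dat|{dataset}_{year}_b.dat"


class SearchService:
    """Service wrapper: a stable, structured interface over the legacy call."""

    def find_files(self, dataset: str, year: int) -> list:
        # The legacy text format is confined to this method, so it can be
        # replaced later without changing any caller.
        raw = legacy_order_lookup(f"{dataset}|{year}")
        return raw.split("|")


def render_results_page(service: SearchService, dataset: str, year: int) -> str:
    """Web-interface code depends on the service only, not on the internals."""
    return "\n".join(service.find_files(dataset, year))
```

Because `render_results_page` sees only `find_files`, the body of `SearchService` could later be rewritten against a redesigned subsystem while the web interface stays as it is.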

  16. CLASS Architectural Transitions - External • Track industry best-practices and others’ lessons-learned • Work with GEO-IDE to • Specify services, interfaces, standards • Vet CLASS-proposed services, interfaces, standards • Publish services gradually, starting with friendly users • …

  17. Keys to NOAA Integration Efforts • Standard terminology, so people can exchange information efficiently • Interfaces and APIs, so systems know how to communicate • Standards, so data can be exchanged • GEO-IDE partnership development is the key • Specification of roles and responsibilities

  18. Where Should CLASS Participate in Pilots with GEO-IDE? • Services and signatures • Spatial operations • Structural Data Types • Metadata • Standards • Development of guidelines • Early visibility and awareness help us • Interoperability • WHO do we need to interoperate with?

  19. Priorities for CLASS/GEO-IDE Relationship • Start interactions now • CLASS already moving forward • Interactions now will help avoid re-work • Identify contacts and conduits • Identify joint efforts and pilot projects • APIs – Prototype now • NODC IOC/FOC • Agree on priorities, schedule • Initiate joint activities

  20. CLASS NODC IOC Campaign Status NODC IOC – Five (5) Operational Threads – June 07, 2007 • Ingest Operations – The thread begins when the NODC SIP arrives at CLASS. • Dissemination Operations – Two separate sub-threads are considered, depending on the restriction level of an AIP. • Data Update – This thread begins when NODC requests its data from CLASS for purposes of updating its data. • Data Integrity Check Processing – This thread is triggered by a schedule for integrity checks on the stored data. • Restriction Level Reset – This thread begins with the receipt of a restriction level reset request message from NODC.
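
Three of these threads can be sketched in Python under stated assumptions. The slide does not say how integrity is checked or what the two dissemination sub-threads do, so this sketch assumes a SHA-256 digest recorded at ingest and an authorization check for restricted packages. All names are hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class StoredPackage:
    """Hypothetical record for a stored AIP."""
    package_id: str
    payload: bytes
    recorded_sha256: str   # digest computed when the package was ingested
    restricted: bool = False


def ingest(package_id: str, payload: bytes, restricted: bool = False) -> StoredPackage:
    """Ingest thread: store the payload together with its digest."""
    digest = hashlib.sha256(payload).hexdigest()
    return StoredPackage(package_id, payload, digest, restricted)


def integrity_check(packages) -> list:
    """Integrity thread: return ids whose current digest differs from the recorded one."""
    return [p.package_id for p in packages
            if hashlib.sha256(p.payload).hexdigest() != p.recorded_sha256]


def disseminate(package: StoredPackage, requester_authorized: bool) -> bytes:
    """Dissemination thread: assumed split into an unrestricted and a restricted path."""
    if package.restricted and not requester_authorized:
        raise PermissionError(f"{package.package_id} is restricted")
    return package.payload


def reset_restriction(package: StoredPackage, restricted: bool) -> None:
    """Restriction level reset thread: apply the requested level."""
    package.restricted = restricted
```

In an operational setting the integrity function would be run by a scheduler over stored data, matching the slide's description of a schedule-triggered thread.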

  21. CLASS NODC FOC Campaign Status NODC FOC implementation steps – FY07-08 • The Archive Requirements Working Group (ARWG) and CLASS will review the existing Submission Agreement (SA) templates, and will define additional templates for different types of data (e.g., non-periodic data, historical data) • Develop one or more SAs for NODC data as needed • Develop the NODC-CLASS Interface Control Document (ICD) • Evaluate the IOC and improve it if necessary • Resolve the technical issues regarding deletion and versioning of NODC data • Define the FOC requirements • Define steps for transition from the IOC to FOC • Implement the FOC • Transfer all NODC data to CLASS for long-term storage

  22. CLASS API prototype background • NGDC has a strong interest in a CLASS API in order to fully integrate CLASS within the center • NGDC has extensive experience in API development through the SPIDR, SABR and ESG systems • The August 2006 users workshop showed strong user interest in CLASS APIs as well • An initial API was unveiled and discussed at the Asheville workshop (CLASS, SABR, SPIDR, etc.)

  23. CLASS API Goals • First draft of a user-focused WS interface • Demonstration of the concept of “fundamental separation” of archive and storage from access • Interaction with and demonstration for users • Technology discovery and evaluation of cutting-edge tools for CLASS • First integration of multiple data types through CLASS (time-series, grid, swath, etc.)

  24. CLASS-NGDC Prototype Scope

  25. Current snapshot of the CLASS API architecture

  26. CLASS NPP Campaign DDR Deliverables • Updated Documents • CLASS-NPP Submission Agreement • Software Description Document • Allocated Requirements with NPP Requirements • Review Item Discrepancy (RID) form • New Documents • NPP Delta Design Review Organization Note • Software Upgrade Plan - (Gap Analysis) • Hardware Upgrade Plan for NPP • Network Upgrade Plan for NPP • Prioritization Policies and Procedures Document • CLASS Load Test Plan for NPP • Performance Benchmark Technical Report • DDR Presentation Slides

  27. CLASS NPP Campaign Status The CLASS-NPP Delta Design Review (DDR) is scheduled for June 21-22, 2007, at NSOF, Suitland, Md. DAARWG membership is invited.

  28. Discussion? Thank you!

  29. Background Slides

  30. Selected CLASS Bounds/Assertions • CLASS is not an archive/OAIS • OAIS-RM is important for CLASS, but CLASS does not (and cannot) conform/comply with it • CLASS does not have a science mission • “Data producers and data centers are responsible for the science data stewardship missions and the development and maintenance of science data stewardship data, information, and metadata.” – CLASS L1Rs • Ramifications for data specificity problem • CLASS is an extant operational system • Future versions of CLASS will be evolutions • The as-is CLASS system was developed for a very different set of requirements than those which now exist

  31. Selected CLASS Architectural Drivers • Requirements: L1, L* (future), A&A v. 2.2, system and allocated • As-is system • New campaigns • Data heterogeneity & volumes • Constant change • Long-term mission • OAIS-RM • GEO-IDE

  32. Selected CLASS “ilities” • Flexibility: adaptation to change • Generality: support variety in data types, user needs, etc. • Scalability: support increasing data volumes and user activity • Interoperability: fundamental to NOAA, user community • Security: essential aspect of any generally-accessible IT system • Reliability: essential to long-term secure storage mission • Maintainability and evolvability: address long-term mission and change • Openness and standards conformance: support interoperability, usability • Modularity and layering: promote flexibility and maintainability • Heterogeneity: provide flexibility, cost reduction alternatives

  33. Flexibility is Essential • CLASS’s environment is characterized by change • Emphasis on evolution, evolvability • What can be done, not what must be done • Example: support multiple nodes • Example: support small-footprint service deployment • Key goal: provide options to CLASS PM

  34. CLASS Long-Term System Architecture Status • Documents • Long-Term System Architecture Overview - done • To-Be System Architecture Overview – in progress, end of 2007 • Long-Term System Architecture Transition Plan • Long-Term System Architecture Reference Manual • Work in progress • Service decomposition • Interface development • Data specificity approaches workable for both CLASS, NOAA • Infusing LTSA thinking into CLASS redesigns

  35. Service-Oriented Architecture • Design style used throughout all aspects of creating and using business services • Defines the ways in which services are deployed and managed • Increases reuse • Lowers overall costs • Improves extensibility • Maps easily and directly to a business’s operational processes • Supports a better division of labor between IT and business personnel • Uses description model capable of unifying new and old IT systems • Most important application is connecting the various operational systems that automate an enterprise’s business processes • For CLASS • Internally: connecting Ingest, Archival Storage, Data Management, and Access • Externally: enabling participation in NOAA SOA; interoperability with other NOAA systems • For NOAA, connecting new and legacy IT systems (including CLASS) • Facilitates composition of services across disparate pieces of software, whether old or new; inter- or intra-enterprise; and regardless of platform

  36. Archive ConOps • Identification and allocation of all archival mission roles and responsibilities, operations • Essential to CLASS success, both practice and perception • Example: CLASS strategy for dealing with data specificity needs to be feasible within NOAA

  37. Strawman Process for Breaking Down Stovepipes • Extra-CLASS • Decide which stovepipes are to be transitioned into CLASS • Analysis • Analyze stovepipe holdings and capabilities • Assess impacts on CLASS for subsuming stovepipe holdings • Draft Submission Agreement and ICD • IOC • Demonstrate ingest and rudimentary access for sample of stovepipe holdings • Poll stovepipe users regarding UI issues • “Historical ingest” campaign • Finalize Submission Agreement and ICD • Extend CLASS-written UI as needed, or … • … develop new stand-alone UI that duplicates the look-and-feel of the old stovepipe’s interface, but uses CLASS services • Initiate ingest
