
Overview of Metadata Strategy

Kevin J. Kirby, Data Architect, US EPA. March 2008. Summary of Issues for Data Advisory Council.


Presentation Transcript


  1. Overview of Metadata Strategy. Kevin J. Kirby, Data Architect, US EPA, March 2008. Summary of Issues for Data Advisory Council

  2. Direction from July 2007 Meeting
     • Warehousing is only a means to an end
       • Warehouses, ETL tools, data marts, and all the rest are only of interest to the extent that they promote sharing data, that is, making data available for users inside and outside the program.
       • “Enable to share” means enabling EPA to share data within programs, across programs, with partners, and with the public.
     • Interoperability is expensive and long term, and not all data needs to be shared
       • Most EPA data is of interest in a fairly narrow program context.
       • We need to develop a warehousing approach based on a determination of:
         • Data we always need to share (facility, geospatial, substance)
         • Data we occasionally need to share
         • Data we may never need to share

  3. Purpose and General Approach: Phase 1 (through April 14, 2008)
     • Premise: “Enable to Share”
       • Internal to EPA
       • Between agencies (Environmental Line of Business)
       • With the public
     • First priority: Data Object discovery and evaluation
       • What data is available?
       • How do I know if it is adequate to my purpose?
       • How do I get it?
     • Future priorities: Understanding the details
       • Data elements, data models, transfer schema, etc.
     Data object registries are the first access point for discovery.
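The three discovery questions on slide 3 can be read as the minimum a registry entry has to answer. Below is a minimal sketch in Python under that reading. The field names (`title`, `quality_notes`, `access_url`), the `discover` helper, and the sample entries are all hypothetical illustrations, not part of the presentation or of any EPA registry.

```python
from dataclasses import dataclass


@dataclass
class RegistryEntry:
    """Hypothetical registry record for one data object."""
    title: str          # What data is available?
    quality_notes: str  # How do I know if it is adequate to my purpose?
    access_url: str     # How do I get it?


def discover(entries, keyword):
    """Return entries whose title contains the keyword (case-insensitive)."""
    needle = keyword.lower()
    return [e for e in entries if needle in e.title.lower()]


# Illustrative entries only; the notes and URLs are placeholders.
entries = [
    RegistryEntry("Facility locations", "placeholder quality note",
                  "https://example.org/facility"),
    RegistryEntry("Substance list", "placeholder quality note",
                  "https://example.org/substance"),
]
hits = discover(entries, "facility")
```

A real registry would carry far more metadata per entry; the point is only that discovery, evaluation, and access each map to at least one field.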

  4. Proposed Metadata Framework for Data “Objects”. Objects include: DBMS data sets, unstructured data (e-mail, docs), multimedia, etc.

  5. Metadata Framework for Discovery & Evaluation
     • Categories of metadata help the user assess the value of the data set.
     • Levels of metadata exist within an RDBMS set, especially for evaluating quality and security issues.
     • Standard taxonomies aid discovery. These might be specific to broad categories like “Admin./Financial”. EPA Data Classification is a start.

  6. Applying the Framework to the EDA
     • Frequently shared vs. seldom shared
       • A-Level: Frequently shared, complete application of framework required
       • B-Level: Less frequently shared, subset of framework required
       • C-Level: Rarely or never shared, no requirements
     • A + B Levels must be represented in at least one data object registry
     [Diagram: levels C, B, A, ranging from “Seldom shared: least rigorous” (C) to “Frequently shared: most rigorous” (A)]
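The rule on slide 6 that A- and B-level data objects must appear in at least one data object registry is easy to check mechanically. The following is a minimal sketch in Python, assuming a simple in-memory inventory. The `SharingLevel` enum, the `unregistered` function, and the object and registry names are hypothetical and chosen only for illustration.

```python
from enum import Enum


class SharingLevel(Enum):
    A = "frequently shared"
    B = "less frequently shared"
    C = "rarely or never shared"


def unregistered(objects, registries):
    """Return names of A- and B-level objects absent from every registry.

    objects: dict mapping data object name -> SharingLevel
    registries: dict mapping registry name -> set of registered object names
    """
    registered = set().union(*registries.values())
    return sorted(
        name for name, level in objects.items()
        if level in (SharingLevel.A, SharingLevel.B) and name not in registered
    )


# Illustrative inventory; names are placeholders.
objects = {
    "facility": SharingLevel.A,
    "permit_notes": SharingLevel.B,
    "scratch_table": SharingLevel.C,
}
registries = {"registry_1": {"facility"}}
missing = unregistered(objects, registries)
```

C-level objects are never reported, since the slide places no requirements on them.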

  7. Data Object Level Metadata for Sharing
     • A-Level: Entire Framework required
     • B-Level: Business; Security/Sensitivity; Location & Access; ETL Admin/Transaction data only if available; Data Set and Data Profile

  8. Data Object Registry Candidates: Coverage is Incomplete

  9. Candidate Data Object Registries. Only Informatica appears to manage all framework metadata categories, but it applies only to data objects that it manages. [Table: candidate registries compared against the proposed metadata framework categories]

  10. Federated Registries with a Common Front End Search Tool: Conceptual Architecture Using Faceted Search

  11. Conceptual Federated Search Architecture. Major gap is for RDBMS Data Sets not managed by Informatica.
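The federated, faceted search idea on slides 10 and 11 can be sketched in a few lines: one front end queries every registry, merges the matches, and reports counts per facet value for drill-down. This is a minimal Python sketch under assumed conditions, not the architecture from the slides. The registry names, record fields (`name`, `category`, `level`), and both functions are hypothetical, and real registries would be queried over their own interfaces rather than held in memory.

```python
from collections import Counter


def federated_search(registries, **facets):
    """Query every registry and merge matching records into one list.

    registries: dict mapping registry name -> list of metadata dicts
    facets: facet name -> required value (a record must match all of them)
    """
    results = []
    for registry_name, records in registries.items():
        for record in records:
            if all(record.get(k) == v for k, v in facets.items()):
                # Tag each hit with the registry it came from.
                results.append({**record, "registry": registry_name})
    return results


def facet_counts(results, facet):
    """Count results per value of one facet, for a drill-down display."""
    return Counter(r[facet] for r in results if facet in r)


# Illustrative registries and records; all names are placeholders.
registries = {
    "registry_a": [
        {"name": "facility_sites", "category": "facility", "level": "A"},
        {"name": "grant_ledger", "category": "admin_financial", "level": "B"},
    ],
    "registry_b": [
        {"name": "substance_list", "category": "substance", "level": "A"},
    ],
}
level_a = federated_search(registries, level="A")
by_category = facet_counts(level_a, "category")
```

Data sets that appear in no registry never show up in the merged results, which is the kind of coverage gap slide 11 points to.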

  12. Governance Artifacts to Implement this Framework: A National Data Policy Modeled after NGDP

  13. Governance High-level Artifacts
