
Baseline Findings EPA Enterprise Data Architecture / Data Management Metadata

Kevin Kirby, Enterprise Data Architect / EPA Enterprise Architecture Team. Kirby.Kevin@epamail.epa.gov, (202) 566-1656.


Presentation Transcript


  1. Baseline Findings: EPA Enterprise Data Architecture / Data Management Metadata
     Kevin Kirby, Enterprise Data Architect / EPA Enterprise Architecture Team
     Kirby.Kevin@epamail.epa.gov, (202) 566-1656

  2. Agenda
     • Present cross-federal context for metadata study at EPA
     • Discuss related terms/concepts/taxonomies
     • Present and vet approach for proposed Metadata Maturity Model
     • Present Metadata Mapping study
     • Next steps

  3. Conceptual Cross-Federal Data Model

  4. Types of Data
     Transactional data
     • Dollars earned or units sold
     Reference data
     • Entities by which transactions are measured
     • ‘Country’, ‘Prefix’ and ‘Industry’
     Master data
     • Single version of the truth
     • Key corporate reference entities like ‘Customer’, ‘Location’ and ‘Product’
     Metadata
     • Describes objects by connecting objects to the subjects they are about

  5. Types of Metadata
     • Technical - data sources; access protocols (ODBC, JDBC, SQL*Net, etc.); physical schema (database definition, table definition, column definition, etc.); logical data sources (ER models, object models, etc.)
     • Business - contextual data about the information retrieved; taxonomies that define business organizations and product hierarchies; controlled vocabularies or reference data used to define business terms, such as a medical dictionary or financial terminology

  6. Metadata & Related Terms
     • Metadata describes objects by connecting objects to the subjects they are about
     • A controlled vocabulary is a closed list of subjects that can be used for classification
     • A taxonomy is a subject-based classification that arranges the terms in the controlled vocabulary into a hierarchy
     • Thesauri take taxonomies and extend them to make them better able to describe the world
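The distinction between a controlled vocabulary and a taxonomy on this slide can be sketched in a few lines of Python. This is only an illustration: the subject terms, resource names, and the `classify`, `ancestors`, and `find` helpers are hypothetical and do not come from the presentation.

```python
from __future__ import annotations

# Hypothetical subject terms, chosen only for illustration.
# Taxonomy: each controlled term maps to its parent; None marks the root.
TAXONOMY_PARENT: dict[str, str | None] = {
    "environment": None,
    "air": "environment",
    "water": "environment",
    "air quality": "air",
    "drinking water": "water",
}

# Controlled vocabulary: the closed list of subjects allowed for classification.
CONTROLLED_VOCABULARY = frozenset(TAXONOMY_PARENT)


def classify(resource: str, subject: str, index: dict[str, set[str]]) -> None:
    """Record metadata: connect a resource (object) to a subject it is about."""
    if subject not in CONTROLLED_VOCABULARY:
        raise ValueError(f"{subject!r} is not in the controlled vocabulary")
    index.setdefault(resource, set()).add(subject)


def ancestors(term: str) -> list[str]:
    """Walk up the taxonomy from a term to the root."""
    path = []
    parent = TAXONOMY_PARENT[term]
    while parent is not None:
        path.append(parent)
        parent = TAXONOMY_PARENT[parent]
    return path


def find(subject: str, index: dict[str, set[str]]) -> set[str]:
    """Return resources tagged with the subject or with any narrower term."""
    return {
        resource
        for resource, subjects in index.items()
        if any(s == subject or subject in ancestors(s) for s in subjects)
    }
```

Because the vocabulary is closed, `classify` rejects unknown subjects, and because the taxonomy is hierarchical, a search on a broad term such as "environment" also returns resources tagged with narrower terms.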

  7. Taxonomy
     • Metadata can be organized using a taxonomy
     • Helps an audience find information more easily
     • Blue lines – metadata about the paper
     • Black lines – subject-based taxonomy

  8. Taxonomy Categorization Schemes
     [Figure: categorization schemes arranged on a scale from hardest to easiest]

  9. Thesauri (e.g. ISO 2788)
     • BT (Broader Term) - refers to the term above this one in the hierarchy
     • SN (Scope Note) - a string attached to the term explaining its meaning
     • USE - refers to another term that is preferred to this term
     • TT (Top Term) - refers to the topmost ancestor
     • RT (Related Term) - refers to a term related to this term without being a synonym
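The thesaurus relations listed on this slide map naturally onto a small data structure. The sketch below is a minimal illustration, not an ISO 2788 implementation: the `Term` and `Thesaurus` classes and all example terms are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Term:
    label: str
    scope_note: str = ""                               # SN
    broader: str | None = None                         # BT
    use: str | None = None                             # USE (preferred term)
    related: list[str] = field(default_factory=list)   # RT


class Thesaurus:
    def __init__(self) -> None:
        self.terms: dict[str, Term] = {}

    def add(self, term: Term) -> None:
        self.terms[term.label] = term

    def preferred(self, label: str) -> str:
        """Follow USE references until a preferred term is reached."""
        seen = set()
        while self.terms[label].use is not None:
            if label in seen:
                raise ValueError(f"USE cycle at {label!r}")
            seen.add(label)
            label = self.terms[label].use
        return label

    def top_term(self, label: str) -> str:
        """TT: climb BT links from the preferred term to the topmost ancestor."""
        label = self.preferred(label)
        while self.terms[label].broader is not None:
            label = self.terms[label].broader
        return label


# Hypothetical example terms.
thesaurus = Thesaurus()
thesaurus.add(Term("pollution", scope_note="Contamination of the environment"))
thesaurus.add(Term("air pollution", broader="pollution"))
thesaurus.add(Term("smog", broader="air pollution", related=["haze"]))
thesaurus.add(Term("haze", related=["smog"]))
thesaurus.add(Term("atmospheric pollution", use="air pollution"))
```

In this sketch TT is derived by walking BT links rather than stored, so it cannot drift out of sync with the hierarchy; a real thesaurus tool might store it explicitly instead.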

  10. Metadata Maturity Model
      WITH NO METADATA MGMT
      • Information is lost or hidden
      • Data integration is costly
      • Cannot support everyday business
      • Information is difficult to find
      • Partial & dated information
      • Loss of trust in data
      METADATA MANAGEMENT
      The organization of technical and business metadata with the goal of advancing the sharing, retrieval and understanding of enterprise information assets.

  11. Metadata Maturity Model Phase I: Ad Hoc
      PROCESS
      • Changes are locally acquired, made and consumed
      • Sharing through conversations with ‘incumbents’
      • Infrequent changes
      TECHNOLOGY
      • Spreadsheets and unstructured tools
      • Application-specific metadata components
      PEOPLE
      • Small group of rogue metadata warriors
      • Knowledge is in people’s heads
      • Sharing of metadata is ad hoc

  12. Metadata Maturity Model Phase II: Discovered
      PROCESS
      • Limited sharing of metadata
      • Local or semi-local repositories
      • Local attempts at managing metadata
      • Exploration of core metadata and metadata tools
      TECHNOLOGY
      • Modeling tools
      • Application-specific metadata components
      • Some metadata management tools
      • Mix
      PEOPLE
      • Management awareness
      • Sporadic adding to various repositories
      • ‘Talk’ about importance of sharing metadata

  13. Metadata Maturity Model Phase III: Managed
      PROCESS
      • Governance process is created and enforced
      • Workflows
      • Communication with ‘outside’ departments
      • Beginnings of real-time integration
      TECHNOLOGY
      • Metadata management tools with governance process
      • Workflow engine
      • Business rule engine
      • Data integration tools
      PEOPLE
      • Data stewards
      • Data governance body
      • Management understands importance of administering metadata

  14. Metadata Maturity Model Phase IV: Integrated
      PEOPLE
      • Constantly seeking optimization
      • Metadata administrators – centralized validation
      PROCESS
      • Enterprise-level standards
      • Taxonomy, Ontologies, etc.
      • Authoritative data sources for entities
      TECHNOLOGY
      • Collaboration tools
      • Enterprise data modeling tool
      • Vocabulary and taxonomy management tool

  15. Metadata Maturity Model Phase V: Optimized
      PEOPLE
      • Start managing metadata as part of business
      • Critical, ubiquitous, invisible part of the organization
      TECHNOLOGY
      • Ontology management
      • Reasoning technology
      • Data mediation
      PROCESS
      • Automated real-time integration
      • Domain ontologies & topic maps
      • Seamless integration at low cost
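The five phases on slides 11 to 15 form an ordered scale, which can be captured as an ordered enumeration. The sketch below is an assumption about how one might encode the model; the `MaturityPhase` name and both helper functions are hypothetical. The only rule taken from the slides is that a governance process first appears in Phase III.

```python
from __future__ import annotations

from enum import IntEnum


class MaturityPhase(IntEnum):
    """The five phases of the proposed Metadata Maturity Model, in order."""
    AD_HOC = 1
    DISCOVERED = 2
    MANAGED = 3
    INTEGRATED = 4
    OPTIMIZED = 5


def has_governance(phase: MaturityPhase) -> bool:
    """Per slide 13, a governance process is created and enforced from Phase III."""
    return phase >= MaturityPhase.MANAGED


def next_phase(phase: MaturityPhase) -> MaturityPhase | None:
    """Return the following phase, or None once Phase V is reached."""
    if phase is MaturityPhase.OPTIMIZED:
        return None
    return MaturityPhase(phase + 1)
```

Using `IntEnum` keeps the phases comparable, so a check such as "at or beyond Managed" is a plain `>=` comparison.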

  16. Metadata Management Framework v.2
      • Metadata Category
        • Extendable baseline
        • Aids classification
      • Metadata Order
        • Intuitive faceted framework
        • Helps users find data
      • Metadata Taxonomy
        • Prioritizes metadata domains
        • Allows for domain extensions

  17. Metadata Cross-Reference
      Diagram labels: Expandable Set of Core Systems; Flexible & Expandable Categories; Domain-Specific Extensions
      • Small core set, based on Dublin Core
      • Owned and maintained by Enterprise Architecture
      • Extensible by domain and core systems
      • In sync with data standards
      • Main purpose is to aid ‘findability’

  18. The Dublin Core Standard
      • Created in 1995 to aid Internet searches
      • Most common metadata standard
      • Every metadata description should describe just one information resource
      • 15 core data elements
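As a concrete illustration of the 15 core elements, the sketch below lists the element names of the Dublin Core Metadata Element Set and checks a record against them. The `unknown_elements` helper and the dictionary layout are assumptions made for this example; the record's title and creator come from this presentation, and the remaining values are illustrative.

```python
from __future__ import annotations

# The 15 elements of the Dublin Core Metadata Element Set.
DUBLIN_CORE_ELEMENTS = frozenset({
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
})


def unknown_elements(record: dict[str, list[str]]) -> set[str]:
    """Return the keys of a record that are not Dublin Core elements."""
    return set(record) - DUBLIN_CORE_ELEMENTS


# One description for exactly one information resource. Every element is
# optional and repeatable, so each value is a list.
record = {
    "title": ["Baseline Findings: EPA Enterprise Data Architecture / Data Management Metadata"],
    "creator": ["Kevin Kirby"],
    "subject": ["metadata", "data architecture"],  # illustrative values
    "language": ["en"],                            # illustrative value
}
```

Here `unknown_elements(record)` returns an empty set, while a record carrying a non-core key would have that key reported.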

  19. Many standards bodies exist; content is modifiable; extensions can be used & registered

  20. Dublin Core Framework & Extensions
      • Domain-specific metadata extensions (e.g. geospatial)
      • Dublin Core adopted as standard
      • Metadata extensions for managing information through its lifecycle
      • Mandatory set of Common Look and Feel elements
      • Extensions for clusters and gateways
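One way to picture a core set with registered, domain-specific extensions is a small schema registry. This is a sketch under stated assumptions: the `MetadataSchema` class, the choice of core elements, and the geospatial element names (`bounding_box`, `coordinate_system`) are all hypothetical and are not taken from the presentation or from any published extension.

```python
from __future__ import annotations


class MetadataSchema:
    """A core element set plus extensions registered per domain."""

    def __init__(self, core: set[str]) -> None:
        self.core = frozenset(core)
        self.extensions: dict[str, frozenset[str]] = {}

    def register_extension(self, domain: str, elements: set[str]) -> None:
        """Register extra elements for a domain; they may not shadow the core."""
        clash = self.core & set(elements)
        if clash:
            raise ValueError(f"extension redefines core elements: {sorted(clash)}")
        self.extensions[domain] = frozenset(elements)

    def allowed(self, *domains: str) -> set[str]:
        """Core elements plus those of the named, registered domains."""
        allowed = set(self.core)
        for domain in domains:
            allowed |= self.extensions[domain]
        return allowed

    def validate(self, record: dict[str, str], *domains: str) -> set[str]:
        """Return record keys not permitted by the core or the given domains."""
        return set(record) - self.allowed(*domains)


# Illustrative core: a handful of Dublin Core element names.
schema = MetadataSchema({"title", "creator", "subject", "date", "identifier"})
schema.register_extension("geospatial", {"bounding_box", "coordinate_system"})
```

Keeping the core immutable and rejecting extensions that redefine core elements is one way to keep domain extensions from drifting away from the shared baseline.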

  21. Data Governance Components
      • Data Stewards
        • Principal – ‘Guardians’ of Data
        • Business – Help define data and stewardship standards
      • Data Architects
        • Part of EA; Understand EA
        • Broker requests for new data and data changes
        • Responsible for enterprise-wide taxonomy
      • Data Advisory Committee (DAC)
        • Strategic
        • Managers & Execs
        • Broad representation
      • Infrastructure Team
        • Responsible for physical architecture and data provision
        • DBAs & Developers
        • Systems & Network Administrators

  22. DOI Data Governance Framework

  23. Value vs. Cost of Metadata
      Chart callouts:
      • High awareness but no governance
      • ROI point
      • Start of governance
      • Right of Phase III
      • Sharp rise in cost for unmanaged metadata

  24. Next Steps
      • Continue to expand the Baseline findings to include Core Mission Area Segments’ Data Stores
      • Vet findings and proposed metadata framework with System Owners and COI
      • Begin gathering requirements / considerations for metadata tools
