
FHA Data Architecture Working Group: EPA Data Architecture for DRM 2.0

FHA Data Architecture Working Group: EPA Data Architecture for DRM 2.0. Brand Niemann (US EPA), Chair, Semantic Interoperability Community of Practice (SICoP) Best Practices Committee (BPC), CIO Council February 22, 2006 http://web-services.gov/ and


Presentation Transcript


  1. FHA Data Architecture Working Group: EPA Data Architecture for DRM 2.0 Brand Niemann (US EPA), Chair, Semantic Interoperability Community of Practice (SICoP) Best Practices Committee (BPC), CIO Council February 22, 2006 http://web-services.gov/ and http://colab.cim3.net/cgi-bin/wiki.pl?SICoP http://colab.cim3.net/cgi-bin/wiki.pl?DRMImplementationThroughIterationandTestingPilotProjects

  2. Overview • 1. FEA Data Reference Model 2.0 • 2. FEA Enterprise Architecture Assessment Process 2.0 • 3. EPA Data Architecture Based on DRM 2.0 • 4. EPA DRM 2.0 Community of Practice • Appendix: Q&A on DRM 2.0 for the FHA DAWG

  3. 1. FEA Reference Model 2.0 • The FEA framework and its five supporting reference models (Performance, Business, Service, Technical and Data) are now used by departments and agencies in developing their budgets and setting strategic goals. With the recent release of the Data Reference Model (DRM), the FEA will be the “common language” for diverse agencies to use while communicating with each other and with state and local governments seeking to collaborate on common solutions and sharing information for improved services. Source: Expanding E-Government, Improved Service Delivery for the American People Using Information Technology, December 2005, pages 2-3. http://www.whitehouse.gov/omb/budintegration/expanding_egov_2005.pdf

  4. 1. FEA Data Reference Model 2.0 • The following chart illustrates the potential uses of the newly released DRM Version 2.0: • The FEA mechanism for identifying what data the Federal government has and how it can be shared in response to a business/mission requirement. • The frame of reference to facilitate Communities of Interest (which will be aligned with the Lines of Business) toward common ground and common language to facilitate improved information sharing. • Guidance for implementing repeatable processes for sharing data Government-wide. Source: Expanding E-Government, Improved Service Delivery for the American People Using Information Technology, December 2005, pages 2-3. http://www.whitehouse.gov/omb/budintegration/expanding_egov_2005.pdf

  5. 1. FEA Data Reference Model 2.0 Source: Expanding E-Government, Improved Service Delivery for the American People Using Information Technology, December 2005, pages 2-3. http://www.whitehouse.gov/omb/budintegration/expanding_egov_2005.pdf

  6. 1. FEA Data Reference Model 2.0 Relationships and associations • Metamodel: Precise definitions of constructs and rules needed for abstraction, generalization, and semantic models. • Model: Relationships between the data and its metadata. • Metadata: Data about the data. • Data: Facts or figures from which conclusions can be inferred. Source: Professor Andreas Tolk, August 16, 2005 The purpose of this schematic is to show that we need to describe information model relationships and associations in a way that can be accessed and searched.
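The four layers on this slide can be illustrated with a minimal Python sketch. It is only one possible reading of the schematic, not the pilot's implementation: the class names, the field names (`ozone_ppm`, `site_name`), and the sample values are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class Metamodel:
    """Rules every model must obey: here, just the permitted value types."""
    allowed_types: frozenset[str]


@dataclass(frozen=True)
class Metadata:
    """Data about the data: a name, a value type, and a definition."""
    name: str
    value_type: str
    definition: str


@dataclass
class Model:
    """Relationships between data fields and the metadata describing them."""
    metamodel: Metamodel
    fields: dict[str, Metadata] = field(default_factory=dict)

    def describe(self, meta: Metadata) -> None:
        # The metamodel constrains what the model may contain.
        if meta.value_type not in self.metamodel.allowed_types:
            raise ValueError(f"type {meta.value_type!r} not allowed by metamodel")
        self.fields[meta.name] = meta

    def search(self, term: str) -> list[str]:
        # Search names and definitions, so data is found through its metadata.
        needle = term.lower()
        return sorted(
            name for name, meta in self.fields.items()
            if needle in name.lower() or needle in meta.definition.lower()
        )


# Hypothetical example content, invented for illustration only.
metamodel = Metamodel(allowed_types=frozenset({"number", "text"}))
model = Model(metamodel)
model.describe(Metadata("ozone_ppm", "number", "Outdoor air ozone concentration"))
model.describe(Metadata("site_name", "text", "Name of the monitoring site"))

# Data: the facts themselves, keyed by the field names the model describes.
record = {"ozone_ppm": 0.071, "site_name": "Example Site"}

print(model.search("air"))  # ['ozone_ppm']
```

The point of the sketch is that the search runs over the metadata rather than the data values, which is what makes the relationships "accessible and searchable" in the sense of the slide.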

  7. 1. FEA Data Reference Model 2.0 The point of this graph is that Increasing Metadata (from glossaries to ontologies) is highly correlated with Increasing Search Capability (from discovery to reasoning).

  8. 1. FEA Data Reference Model 2.0 • Conceptual Data Model – a model to guide data architecture and not a model to guide database development. • But an ontology provides both and the pilot is both a CDM and an executable application based on DRM 2.0! • So Data Architecture can be implemented in ontology-driven information systems.

  9. 1. FEA Data Reference Model 2.0 Paradigm Shifts • From FEA Reference Model Taxonomies to FEA Reference Model Ontology. • From FEA “Common Language” to FEA Semantic Model. • From DRM 1.0 by committee to DRM 2.0 by open, collaborative process. • From implementation after development to implementation through iteration and testing during development.

  10. 1. FEA Data Reference Model 2.0 • Original FEA Lines of Business (6): • Data and Statistics: • Opted out because of FedStats, Federal Committee on Statistical Methodology, etc. (it had its act together for statistical data management) • Now it’s back with: • The new Data Reference Model 2.0 because statistical programs generally have the best data and metadata and data management practices. • The National Infrastructure for Community Statistics Community of Practice (NICS CoP) • The Federal Health Architecture Data Architecture Working Group because FHA agencies are statistical agencies: • See for example Health, United States, 2005 from the National Center for Health Statistics!

  11. 2. FEA Enterprise Architecture Assessment Process 2.0 • Section 1.3.3 Data Architecture (Information Management): • Description: Enterprise data described at the level of business data entities, linked to the FEA Data Reference Model (DRM) as it evolves and other layers of agency EA. • Rationale: An enterprise data architecture is the key to identifying data sharing and exchange opportunities both within and across agencies. • Mandate: OMB A-11, s.300; GPRA; Clinger-Cohen Act, Data Quality Act, E-Government Act of 2002, OMB M-05-04, OMB A-119, OMB Information Dissemination Memorandum 207(d) http://www.whitehouse.gov/omb/egov/a-2-EAAssessment.html

  12. 2. FEA Enterprise Architecture Assessment Process 2.0 • Level 3 Practices (Example from Levels 1-5): • Activities: The agency has created a high-level target data architecture that identifies opportunities for information sharing and consolidation. When applicable and required by law and policy, the agency has prepared and published inventories of the agency's major information holdings and dissemination products, and otherwise made them available for use by all interested and authorized parties including other agencies and as appropriate, the general public, industry, academia, and other specific user groups. • Artifacts: Target Data Architecture See EPA DRAFT Response in Slide 20.

  13. 2. FEA Enterprise Architecture Assessment Process 2.0 • Appendix A: Artifact Descriptions (Excerpts) • OMB is interested in the content of the artifacts and does not prescribe the form they should take, so long as the artifact can be submitted to OMB without requiring the use of proprietary software products such as EA modeling tools. • The Data Architecture is a perspective of the overall agency EA that provides the information about the agency's baseline and target architectures. Examples of elements that may be included: • Agency data model that describes the key data elements of the agency's business domain, and the relationships between them. The data model may include data dictionaries, thesauri, taxonomies, topic maps. • Linkage between the agency data model and the service components that access the data elements.
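As a rough illustration of the two example elements above, the following Python sketch holds a small data dictionary, the relationships between its elements, and the linkage to service components, then serializes everything to plain JSON so that no proprietary EA modeling tool is needed to read it. All element names, relationship names, and service names are hypothetical and are not taken from any EPA or OMB artifact.

```python
import json

# Hypothetical data dictionary: key data elements and their definitions.
data_dictionary = {
    "facility_id": "Identifier of a regulated facility",
    "substance_name": "Name of a regulated substance",
    "watershed_code": "Code of the watershed containing a facility",
}

# Hypothetical relationships between data elements (subject, relation, object).
relationships = [
    ("facility_id", "located_in", "watershed_code"),
    ("facility_id", "reports", "substance_name"),
]

# Hypothetical linkage: which service components access which data elements.
service_links = {
    "FacilityLookup": ["facility_id", "watershed_code"],
    "SubstanceReport": ["facility_id", "substance_name"],
}


def services_using(element: str) -> list[str]:
    """Return the service components that access the given data element."""
    return sorted(s for s, elems in service_links.items() if element in elems)


def undefined_elements() -> list[str]:
    """Return elements used in relationships or services but not defined."""
    used = {e for elems in service_links.values() for e in elems}
    used |= {e for subj, _, obj in relationships for e in (subj, obj)}
    return sorted(used - set(data_dictionary))


# A plain-text artifact that any tool can open.
artifact = json.dumps(
    {
        "data_dictionary": data_dictionary,
        "relationships": relationships,
        "service_links": service_links,
    },
    indent=2,
    sort_keys=True,
)

print(services_using("facility_id"))
print(undefined_elements())
```

A check such as `undefined_elements()` is one simple way to confirm that every element a service touches is actually described in the agency data model.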

  14. 3. EPA Data Architecture Based on DRM 2.0 • Data Architecture: Architecture is a plan to build a home that we can live in and data architecture is a plan to build a data place that we can work in on the Web. • Methodology and Architecture: Nicola Guarino, Formal Ontology and Information Systems, Proceedings of FOIS ’98, Trento, Italy, 6-8 June 1998. • Methodology Side – the adoption of a highly interdisciplinary approach (e.g., CoP): • Analyze the structure at a high level of generality. • Formulate a clear and rigorous vocabulary. • Architectural Side – the central role in the main components of an information system: • Information resources. • User interfaces. • Application programs.

  15. 3. EPA Data Architecture Based on DRM 2.0 • Suggested Steps: • 1. Base the EPA Conceptual Data Model (CDM) on that of the new DRM 2.0. • 2. Use an ontology to provide both a CDM and an executable application based on DRM 2.0. • 3. Implement the DRM 2.0 Data Architecture with XML Web Services in a Service Oriented Architecture Distributed Content Network. • 4. Strategy: implement with the major EPA data and metadata systems to develop a Universal Core outwards, for example starting with water data indicators and moving to water data interoperability across agencies, work that we have already done or are working on. • Based on December 28, 2005, SICoP DRM 2.0 Pilot of LoB/CoP Started in Support of the New Federal Health Architecture's Data Architecture Working Group to Model (Ontology) the Documents for the Dynamic Knowledge Repository.

  16. 3. EPA Data Architecture Based on DRM 2.0 • Implementation: Executable application in ROE 2007 Electronic Reporting Using Semantic Technologies based on Proposed Indicators for 2007 Report on the Environment (ROE 2007) (5 topics and 17 subtopics with 95 indicators): • 1. Air • Outdoor Air: 25 Indicators • Indoor Air: 2 Indicators • 2. Water • Water and Watersheds: 17 Indicators • Drinking Water: 1 Indicator • Consumption of Fish and Shellfish: 2 Indicators • 3. Land • Land Cover: 2 Indicators • Land Use: 2 Indicators • Chemicals: 5 Indicators • Waste: 2 Indicators • Contaminated Lands: 2 Indicators • 4. Human Health • Health Status: 3 Indicators • Human Disease and Conditions: 10 Indicators • Bio-measures of Exposure: 6 Indicators • 5. Ecological Condition • Extent and Distribution of Ecological Systems: 5 Indicators • Diversity and Biological Balance: 5 Indicators • Ecological Processes: 1 Indicator • Critical Physical and Chemical Attributes: 5 Indicators
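The topic and subtopic counts listed on this slide can be captured as a small nested taxonomy and checked against the stated totals. The sketch below is only an illustration in Python, not the pilot's executable application; the variable and function names are hypothetical, while the topic names and counts are copied from the slide.

```python
# Indicator counts per subtopic, as listed on the slide.
ROE_2007_INDICATORS = {
    "Air": {"Outdoor Air": 25, "Indoor Air": 2},
    "Water": {
        "Water and Watersheds": 17,
        "Drinking Water": 1,
        "Consumption of Fish and Shellfish": 2,
    },
    "Land": {
        "Land Cover": 2,
        "Land Use": 2,
        "Chemicals": 5,
        "Waste": 2,
        "Contaminated Lands": 2,
    },
    "Human Health": {
        "Health Status": 3,
        "Human Disease and Conditions": 10,
        "Bio-measures of Exposure": 6,
    },
    "Ecological Condition": {
        "Extent and Distribution of Ecological Systems": 5,
        "Diversity and Biological Balance": 5,
        "Ecological Processes": 1,
        "Critical Physical and Chemical Attributes": 5,
    },
}


def summarize(taxonomy: dict[str, dict[str, int]]) -> tuple[int, int, int]:
    """Return (number of topics, number of subtopics, number of indicators)."""
    topics = len(taxonomy)
    subtopics = sum(len(subs) for subs in taxonomy.values())
    indicators = sum(n for subs in taxonomy.values() for n in subs.values())
    return topics, subtopics, indicators


def indicators_per_topic(taxonomy: dict[str, dict[str, int]]) -> dict[str, int]:
    """Return the indicator total for each top-level topic."""
    return {topic: sum(subs.values()) for topic, subs in taxonomy.items()}


print(summarize(ROE_2007_INDICATORS))  # (5, 17, 95)
print(indicators_per_topic(ROE_2007_INDICATORS))
```

The totals agree with the slide's summary of 5 topics, 17 subtopics, and 95 indicators.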

  17. 3. EPA Data Architecture Based on DRM 2.0 Note: Implements Slide 6 and Provides SOA with Taxonomy of XML Web Services Nodes!

  18. 3. EPA Data Architecture Based on DRM 2.0 • The majority of EPA data architecture work has occurred outside of EPA's EA Team with the exception of the Strategic Information Model (SIM): • 1. Circa 2002 Data Areas-Data Classes - Target Data Model (Not a Model) Same As 2002 Data Architecture, I-3? • 2. Early 2004 SIM: Strategic Information Model • Joan Karrie and SRA Subcontractor (Interarc Associates, Peter Lang), put the basic structure from the EPA 2003-2008 Strategic Plan in Visible Advantage. Those files (Grid model - hairball - hard to read) were transferred to ERwin and were provided recently in HTML format with no changes in preparation for use in EPA's EA Metis System as follows: • The EPA SIM was developed for EPA in Visible Advantage, April 7, 2004. • The model was transferred to AllFusion ERwin Data Modeler, November 29, 2005. • This subject view was generated from ERwin version, November 29, 2005. • 3. April 7, 2004, EIM: Enterprise Information Model 10 Model Sub-Views. • 4. January 28, 2004, Presentation Contained 6 Model Sub-Views.

  19. 3. EPA Data Architecture Based on DRM 2.0 See http://colab.cim3.net/file/work/SICoP/EPADRM2.0/EPASIM/Untitled.htm Note: This is a non-proprietary format as OMB requires!

  20. 3. EPA Data Architecture Based on DRM 2.0 • Status (draft): The Agency has partially documented elements of its data architecture. We have implemented a System of Registries (SoR) to establish a definitive source for critical data such as facilities and substances. The Environmental Data Standards Council (EDSC) has established and maintained Agency data standards, including the adoption of 14 new data standards in January. In the future, data standard development, maintenance, and revision will be under the supervision of the Network Operations Board (NOB). The Central Data Exchange (CDX) solution has documented each of its data exchange packages and XML Schemas, mapping them to topics, applications, and partners. The Geospatial segment architecture has defined data standards and policy and created an inventory of data repositories that are mapped to applications and servers. Furthermore, the Agency developed a Strategic Information Model (SIM) representative of the Agency's top-level data architecture. Similarly, some program offices have defined their target data architectures.

  21. 3. EPA Data Architecture Based on DRM 2.0 • Some Recommendations (DRAFT): • SIM/EIM is too detailed – do at higher conceptual level like the initial sketch of a building that a person off the street would understand (e.g. the environmental indicator ontology approach). • Evolve EPA Data Standards so they capture the “tripod” (data model, business rules, and process model) more effectively. • Then apply them to real world applications to serve the EPA DRM 2.0 Community of Practice (see next slide).

  22. 4. EPA DRM 2.0 Community of Practice

  23. 4. EPA DRM 2.0 Community of Practice Collaborative Wiki Page See http://colab.cim3.net/cgi-bin/wiki.pl?EPADataArchitectureforDRM2

  24. 4. EPA DRM 2.0 Community of Practice Collaborative File Repository See http://colab.cim3.net/file/work/SICoP/EPADRM2.0/

  25. Appendix: Q&A on DRM 2.0 for the FHA DAWG • Question: Should the FHA DAWG be overly focused on metadata? • Metadata and data are integrated together in DRM 2.0 and the pilot. • Question: Should FHA DAWG work with unstructured or semi-structured data or defer this task to partners/agencies? • All three types of data are integrated together in DRM 2.0 and the pilot. • Question: Should FHA DAWG also add physical data modeling to methodology? • The DRM ITIT Pilot shows how both conceptual and physical data are done together with ontologies. • Question: Should educational material on metadata and data modeling be present in the Data Strategy? • DRM 2.0 put educational material in the DRM Reference Model and ITIT Wiki Pages, not the Reference Model Document itself. See http://web-services.gov/scopefhadawg.ppt

  26. Appendix: Q&A on DRM 2.0 for the FHA DAWG • Question: Should we align more closely to FEA DRM? • Aligning with DRM 2.0 adds credibility to the work, and the pilot specifically demonstrates the three components of DRM 2.0. • Question: How detailed a level of analysis can be performed by the FHA DAWG? • This depends on the level of detailed data and information that the FHA partners are willing to expose, e.g. the pilot uses summary data that is in the public domain. • Question: Does the FHA DAWG analyze only (discover) or does it prescribe a solution (recommendation) like semantic harmonization scenarios? • SICoP and DRM ITIT are concerned with achieving semantic harmonization and interoperability. E.g., the suggestion to include the CHI vocabularies in the pilot should be implemented.
