Procedures for Achieving Metadata Registry Content Consistency: Data Elements. Larry Fitzwater, U.S. EPA; Judith Newton, NIST; Lois Fritts, SAIC. January 17, 2000. Open Forum on Metadata Registries, Santa Fe, NM. SDC-0002-021-JE-2026
Contents of Working Paper 1. Scope 2. References 3. Definitions 4. Types of Abstraction 5. Registry Population Annexes
Scope Based on the model of a data registry described in ISO/IEC 11179, Part 3. Describes business rules for the registration of data elements and their attributes in a registry, to assist in consistently establishing good quality data elements. Helps to achieve metadata content consistency through procedures and examples.
Data Element Abstraction. Judith Newton, NIST
Types of Abstraction Relevant to Data Elements 1. Specialization/generalization – all items in the subclass are also in the superclass 2. Decomposition/aggregation – the part-of relationship
Specialization/Generalization
  State USPS Code
    Geographic State USPS Code
      FACILITY Geographic State USPS Code
      CUSTOMER Geographic State USPS Code
    Mailing Address State USPS Code
      FACILITY Mailing Address State USPS Code
      CUSTOMER Mailing Address State USPS Code
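The specialization hierarchy on this slide can be sketched as a chain of links from each data element to its more general parent. This is a minimal illustration only: the class name `DataElement`, the field `generalization`, and the method `ancestors` are hypothetical and are not taken from ISO/IEC 11179 or the working paper.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataElement:
    """Hypothetical registry entry holding a link to its generalization."""
    name: str
    generalization: Optional["DataElement"] = None

    def ancestors(self) -> List[str]:
        """Return the names of all more general elements, nearest first."""
        names = []
        node = self.generalization
        while node is not None:
            names.append(node.name)
            node = node.generalization
        return names


# Names taken from the slide; the nesting is an assumed reading of the diagram.
state = DataElement("State USPS Code")
mailing = DataElement("Mailing Address State USPS Code", state)
facility_mailing = DataElement("FACILITY Mailing Address State USPS Code", mailing)

print(facility_mailing.ancestors())
```

Walking the links upward from a specialized element recovers every generalization it inherits from.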
Decomposition/Aggregation: Country Identifier, Country Subdivision Code, County Code, Borough Code, Metropolitan District Code, Unitary Auth. Code, Special Area Code
Register Metadata for a Data Element. Lois Fritts, SAIC
A Metadata Registry Can Be Overwhelming
This presentation is a practical approach to populating the content of a data registry for data elements.
Overview • General Procedures • Examples of Registration • Data Element Groups
General Procedures 1. Understanding the data element 2. Content research 3. Population of metadata attributes 4. Classification 5. Quality control
Population of Metadata Attributes • Bottom-Up Approach: a data element is attributed with known facts before its conceptual information is defined. • Top-Down Approach: a classified group is added to the registry, beginning with conceptual domains and value domains and working down to the individual data elements.
Logical Bottom-Up Process (diagram labels): Conceptual Domain and Value Meanings; Data Element Concept; Other Data Element Attributes; Data Element Name and Identifiers; Permissible Values; Data Element Definition
Logical Top-Down Process (diagram labels): Conceptual Domain and Value Meanings; Data Element Concept; Data Element Definition; Permissible Values; Data Element Name and Identifiers; Other Data Element Attributes. SDC-0002-021-LF-1005
Other Attributes • Submitting Organization • Data Steward • Comment • Example • Origin • Document • System • Standard • Administrative
Data Element Examples • ISO Standard–Enumerated • ISO Standard–Non-enumerated • Application System–Enumerated
ISO Standard–Enumerated: ISO 3166 Country Identifiers
  Short English Name: United States
  Long English Name: United States of America
  2-character abbrev.: US
  3-character abbrev.: USA
  3-digit code: 840
  Short French Name: ÉTATS-UNIS
  Long French Name: États-Unis d’Amérique
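The slide shows one country carried in seven representations. A rough sketch of how such equivalent representations might be held and translated follows; the key names (`alpha2`, `numeric`, and so on) and the function `translate` are illustrative assumptions, and only the United States row from the slide is included.

```python
# One row of ISO 3166 as shown on the slide; all other countries are omitted.
# The dictionary keys are hypothetical labels chosen for this sketch.
COUNTRY_ROWS = [
    {
        "short_english_name": "United States",
        "long_english_name": "United States of America",
        "alpha2": "US",
        "alpha3": "USA",
        "numeric": "840",
        "short_french_name": "ÉTATS-UNIS",
        "long_french_name": "États-Unis d’Amérique",
    },
]


def translate(value, source, target, rows=COUNTRY_ROWS):
    """Map a value in one representation to the same country in another."""
    for row in rows:
        if row[source] == value:
            return row[target]
    raise KeyError(f"{value!r} not found in representation {source!r}")


print(translate("US", "alpha2", "numeric"))
```

Each key corresponds to a separate enumerated value domain over the same conceptual domain, which is why a value in one column can be mapped to any other.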
Codes for Data Element Registration
  Definition (Def)
  Permissible Value (PV)
  Value Domain (VD)
  Value Domain Origin (VDO)
  Data Element Name and Identifiers (DEID)
  Data Element Name Context (CNTX)
  Data Element Concept (DEC)
  Conceptual Domain (CD)
  Classification (Cl)
  Layer of Abstraction (LA)
  Registration Status (RS)
ISO 3166–Enumerated
  Def: The short name of a country, represented in the English language
  PV: Afghanistan, Albania,…Zimbabwe
  VD: Short English-language country names
  VDO: ISO 3166-1:1997
  DEID: 209033:1 Short English-language country name
  CNTX: Registry
  DEC: Country identifier
  CD: Countries of the world
  Cl: Geopolitical entities; country identifiers
  LA: Generalization
  RS: Standard
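A registration like the one above can be sketched as a simple record with one field per registration code. The class `DataElementRegistration`, its field names, and the completeness check `missing_attributes` are hypothetical; they only illustrate the idea of checking that every attribute has been populated.

```python
from dataclasses import dataclass, fields, replace
from typing import List


@dataclass
class DataElementRegistration:
    """Hypothetical record with one field per registration code."""
    definition: str                # Def
    permissible_values: List[str]  # PV
    value_domain: str              # VD
    value_domain_origin: str       # VDO
    identifier: str                # DEID (identifier:version)
    name: str                      # DEID (name)
    context: str                   # CNTX
    data_element_concept: str      # DEC
    conceptual_domain: str         # CD
    classification: List[str]      # Cl
    layer_of_abstraction: str      # LA
    registration_status: str       # RS

    def missing_attributes(self) -> List[str]:
        """Return the names of attributes that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


record = DataElementRegistration(
    definition="The short name of a country, represented in the English language",
    # The slide elides the values between Albania and Zimbabwe.
    permissible_values=["Afghanistan", "Albania", "Zimbabwe"],
    value_domain="Short English-language country names",
    value_domain_origin="ISO 3166-1:1997",
    identifier="209033:1",
    name="Short English-language country name",
    context="Registry",
    data_element_concept="Country identifier",
    conceptual_domain="Countries of the world",
    classification=["Geopolitical entities", "country identifiers"],
    layer_of_abstraction="Generalization",
    registration_status="Standard",
)

draft = replace(record, definition="")
print(record.missing_attributes(), draft.missing_attributes())
```

A check of this kind is one possible aid to the quality control step; the working paper itself does not prescribe an implementation.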
ISO Standard–Non-enumerated: ISO 6709 Geographic Point Locations (Latitude, Longitude, Altitude). Latitude Sexagesimal Measure. SDC-0002-021-LF-XXXX
ISO 6709–Non-enumerated
  Def: The sexagesimal measure of the angular distance of a position on the earth on a meridian north or south of the equator
  PV: <All measures recorded as DDMMSS.SS>
  VD: Sexagesimal measures of latitude
  VDO: Not applicable
  DEID: 312345:1 Latitude sexagesimal measure
  CNTX: Registry
  DEC: Latitude distance
  CD: Latitude coordinates
  Cl: Geographic point location
  LA: Generalization
  RS: Recorded
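Because the value domain is non-enumerated, permissible values are described by a format rather than a list. A minimal sketch of checking the DDMMSS.SS format and converting it to decimal degrees is given below. The function name and the range checks are assumptions, and the sketch assumes the north/south hemisphere is carried separately, since the slide does not say how it is recorded.

```python
import re

# DD = degrees, MM = minutes, SS.SS = seconds with two decimal places.
_DDMMSS = re.compile(r"([0-9]{2})([0-9]{2})([0-9]{2}\.[0-9]{2})")


def latitude_to_decimal_degrees(text: str) -> float:
    """Validate a DDMMSS.SS latitude and return it as decimal degrees."""
    match = _DDMMSS.fullmatch(text)
    if match is None:
        raise ValueError(f"not in DDMMSS.SS format: {text!r}")
    degrees = int(match.group(1))
    minutes = int(match.group(2))
    seconds = float(match.group(3))
    if minutes >= 60 or seconds >= 60:
        raise ValueError(f"minutes or seconds out of range: {text!r}")
    value = degrees + minutes / 60 + seconds / 3600
    if value > 90:
        raise ValueError(f"latitude exceeds 90 degrees: {text!r}")
    return value


print(latitude_to_decimal_degrees("385300.00"))
```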
Application System–Enumerated: Mailing Address (Name; Street Address; City, State Postal Code; Country). Mailing Address Country Name
Application Data Element
  Def: The name of a country where the addressee is located
  PV: Afghanistan, Albania,…Zimbabwe
  VD: Short English-language country names
  VDO: ISO 3166-1:1997
  DEID: 5394:1 Mailing_Address.Country_Name
  CNTX: Facility data system
  DEC: Address country identifier
  CD: Countries of the world
  Cl: Mailing address
  LA: Specialization
  RS: Recorded
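The application data element reuses the value domain of the registry-level ISO 3166 element (209033:1) rather than defining its own. One way to picture that reuse is two data element records pointing at a single shared value domain object. The class names and the helper `share_value_domain` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValueDomain:
    name: str
    origin: str


@dataclass(frozen=True)
class DataElement:
    identifier: str
    name: str
    context: str
    layer_of_abstraction: str
    value_domain: ValueDomain


# Registered once, then referenced by both data elements.
country_names = ValueDomain("Short English-language country names", "ISO 3166-1:1997")

registry_element = DataElement(
    "209033:1", "Short English-language country name",
    "Registry", "Generalization", country_names,
)
application_element = DataElement(
    "5394:1", "Mailing_Address.Country_Name",
    "Facility data system", "Specialization", country_names,
)


def share_value_domain(a: DataElement, b: DataElement) -> bool:
    """True when both data elements draw values from the same value domain."""
    return a.value_domain == b.value_domain


print(share_value_domain(registry_element, application_element))
```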
Register a Classification of Data Elements 1. Understanding the classified group 2. Specifying the data elements 3. Understanding the group’s source: • Name • Definition • Authority • Rationale • Potential usage • Identifier
Data Element Classifications • Document • Standard • Composite data element
Classify by Document
  Source: Facility Location and Identification Standard
  Definition: Core set of data elements that supports location and identification of place-based objects
  Authority: Federal Geographic Data Committee (FGDC)
  Rationale: Proposed U.S. National Standard
  Usage: Facilitates data sharing about facilities
  Identifier: 1234
Data Elements in Document: Facility Name, Facility Category Type, Facility Identification Number, Latitude Measure, Longitude Measure
Classify by Standard
  Source: Standard representation of latitude, longitude, and altitude for geographic point locations
  Definition: The horizontal and vertical coordinates that define a point on the earth
  Authority: ISO 6709
  Rationale: International standard
  Usage: System developers to design a database entity and transfer data files
  Identifier: 1345
Data Elements in Standard: Latitude Degrees Measure, Longitude Degrees Measure, Altitude Measure, Latitude Sexagesimal Measure, Longitude Sexagesimal Measure
Classify by Composite Data Element
  Name: Urban-style street address
  Definition: A set of precise data elements that can be combined into a street address
  Authority: U.S. Postal Service, Publication 28: Postal Address Standards
  Rationale: U.S. national standard for creating a mail piece
  Usage: Parse street address for validation of individual segments
  Identifier: 2543
Data Elements in Street Address: Building Number, Pre-directional Code, Street Name, Street Suffix Code, Post-directional Code, Secondary Unit Code, Suite Number
Composite Data Element: 200 N Glebe Road SW Suite 300 (example of data values for Urban-Style Street Address)
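The stated usage of this composite, parsing a street address to validate its individual segments, can be sketched with a regular expression. This is an illustrative sketch only: the suffix and unit lists are small hypothetical subsets rather than the Publication 28 tables, and the group names are assumptions chosen to mirror the component data elements.

```python
import re

# Illustrative subsets; a real parser would use the full postal tables.
DIRECTIONALS = "NE|NW|SE|SW|N|S|E|W"
SUFFIXES = "Road|Street|Avenue|Boulevard"
UNITS = "Suite|Apt|Unit"

ADDRESS = re.compile(
    rf"(?P<building_number>[0-9]+)\s+"
    rf"(?:(?P<pre_directional>{DIRECTIONALS})\s+)?"
    rf"(?P<street_name>.+?)\s+"
    rf"(?P<street_suffix>{SUFFIXES})"
    rf"(?:\s+(?P<post_directional>{DIRECTIONALS}))?"
    rf"(?:\s+(?P<secondary_unit>{UNITS})\s+(?P<suite_number>\S+))?"
)


def parse_street_address(text: str) -> dict:
    """Split an urban-style street address into its component segments."""
    match = ADDRESS.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized street address: {text!r}")
    return match.groupdict()


parts = parse_street_address("200 N Glebe Road SW Suite 300")
for segment, value in parts.items():
    print(f"{segment}: {value}")
```

Once the segments are separated, each can be validated against its own value domain, for example the directional codes against the enumerated list.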
Linking Data Elements • Vertical • Horizontal • Used Together
Vertical Linking (generalization to specialization): State USPS Code; Mailing Address State Code; Facility Mailing Address State Code
Horizontal Linking (equivalent layer of abstraction): PCS_Permit_Facility.Mailing_State; Facility Mailing Address State Code; BRS_Site_Information.Mail_State; RCR_Mailing_Location.State
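Horizontal links can be pictured as a mapping from each application system's data element to the common element it is equivalent to. The sketch below assumes the three system fields on the slide all link to Facility Mailing Address State Code, and reads the stray spaces in the extracted field names as artifacts. The mapping name and the function `equivalents` are hypothetical.

```python
# Application data element -> common data element it is linked to (assumed).
HORIZONTAL_LINKS = {
    "PCS_Permit_Facility.Mailing_State": "Facility Mailing Address State Code",
    "BRS_Site_Information.Mail_State": "Facility Mailing Address State Code",
    "RCR_Mailing_Location.State": "Facility Mailing Address State Code",
}


def equivalents(element, links=HORIZONTAL_LINKS):
    """Return the other elements linked to the same common data element."""
    target = links[element]
    return sorted(
        name for name, linked in links.items()
        if linked == target and name != element
    )


print(equivalents("RCR_Mailing_Location.State"))
```

With such links recorded, a user who finds one system's state field can locate the equivalent fields in the other systems.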
Linking by Use: Sample Quantity; Sample Quantity Units Name. Example: 17 milligrams
This is a practical, logical approach to registering “good” data elements.
Discussion