Procedures to develop and register data elements in support of data standardization

Procedures to Develop and Register Data Elements in Support of Data Standardization. September 2000. Based on: ISO/IEC Draft Technical Report 20943, Information Technology – Procedures for Achieving Metadata Registry (MDR) Content Consistency – Data Elements.

Presentation Transcript

Based on:

ISO/IEC Draft Technical Report 20943,

Information Technology – Procedures for Achieving Metadata Registry (MDR) Content Consistency – Data Elements


Metadata Registry

EPA’s metadata registry is the Environmental Data Registry (EDR):

www.epa.gov/edr

The EDR is based on an international standard for metadata registries.


International Standard for Metadata Registries

ISO/IEC 11179:

Information Technology - Data Management and Interchange - Metadata Registries (MDR)


Parts of the Standard

  • Part 1: Framework for the Specification and Standardization of Data Elements

  • Part 2: Classification for Data Elements

  • Part 3: Registry Metamodel (MDR3)

  • Part 4: Rules and Guidelines for the Formulation of Data Definitions

  • Part 5: Naming and Identification Principles for Data Elements

  • Part 6: Registration of Data Elements


Data Element Registration

  • Characteristics of the data element are recorded as metadata attributes

  • Registration depends on the amount and quality of information available

  • Data elements might range from:

    • Standard data elements–complete, with good quality

    • Application data elements–incomplete, with questionable quality


Steps to Follow When Registering a Data Element

  1. Understanding the data element

  2. Content research

  3. Definition and permissible values

  4. Names and identifiers

  5. Administrative and miscellaneous attributes

  6. Data element concepts

  7. Classification schemes

  8. Quality control


Example of Registration

Registration of a data element for the code used by the United States Postal Service (USPS) to represent a state or state equivalent.



Understanding the Data Element

Step 1

What kind of data will be stored in this data element?

Are the data values determined by an arithmetic or statistical procedure?

Is there a definition or description of data values?

What will the data values look like–names, descriptions, numerals to be calculated, character strings, or identifiers?


Understanding the Data Element - Example

The USPS standard format for preparing a domestic mail piece requires that the last line of the address contain city name, state code, and ZIP code.

The data element to be registered must represent the list of data values for state code that are acceptable to the USPS for mail delivery.



Content Research

Step 2

Is this data element described in an existing standard?

Does a data element that could potentially be reused already exist in this registry or in a federation of registries?


Content Research - Example

National Standards:

  • FIPS PUB 5-2, 6-4, 55-3

    • Contain 2-letter state codes

    • Include a code for U.S. Minor Outlying Islands – not recognized by the USPS

    • U.S. does not intend to continue maintaining FIPS codes

  • National Supercomputer Centers Usage Database

    • Contains only 4 of the 8 outlying territories

    • Omits all 4 freely associated states



Content Research - Example (continued)

National Standards:

  • U.S. Postal Service standards

    • Include codes for all states, outlying territories, and freely associated states of the United States

    • Do not recognize a code for U.S. Minor Outlying Islands, which must be identified on mail pieces by name

    • Include codes for military “States”

International Standards:

  • ISO 3166-Part 2, Country subdivision code

    • Identifies U.S. outlying territories and freely associated states as Countries in Part 1



Content Research - Example (continued)

Existing Data Elements in the EDR:

  • State USPS Code

  • Mailing Address State Code

  • Geographic Address State Code

All of the Above Include:

  • The code for U.S. Minor Outlying Islands–not acceptable for mail delivery

  • Codes for the 12 Canadian provinces
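The comparison behind this research step can be sketched as simple set arithmetic: each candidate source is checked for codes it lacks and codes it carries beyond what mail delivery requires. This is only an illustration. The code sets below are small assumed samples, not the real lists, and the names `REQUIRED`, `CANDIDATES`, `compare_to_requirement`, and `exact_matches` are hypothetical.

```python
# Illustrative samples only; the real code lists are much longer.
# REQUIRED stands in for the codes acceptable to the USPS for mail delivery.
REQUIRED = {"NJ", "DC", "PR", "FM", "AE"}

CANDIDATES = {
    # Assumed sample: carries UM (U.S. Minor Outlying Islands), lacks a military code.
    "FIPS PUB 5-2 (sample)": {"NJ", "DC", "PR", "FM", "UM"},
    # Assumed sample: carries UM plus two Canadian province codes.
    "EDR State USPS Code (sample)": {"NJ", "DC", "PR", "FM", "AE", "UM", "ON", "QC"},
    "USPS standard (sample)": {"NJ", "DC", "PR", "FM", "AE"},
}


def compare_to_requirement(candidate, required):
    """Return (missing, extra) codes of a candidate relative to the requirement."""
    return sorted(required - candidate), sorted(candidate - required)


def exact_matches(candidates, required):
    """Names of the candidate sources whose code set equals the requirement."""
    return [name for name, codes in candidates.items() if codes == required]


for name, codes in CANDIDATES.items():
    missing, extra = compare_to_requirement(codes, REQUIRED)
    print(f"{name}: missing={missing} extra={extra}")
```

A source with any extra or missing codes would need a separate value domain or a documented exception, which is why an exact match is the simplest basis for a standard data element.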



Decision - Preferred Data Source

The preferred data source for a standard data element for state code for mail delivery within the U.S. for states and state equivalents is the USPS standard, available at:

www.USPS.gov/ncsc/lookups/usps_abbreviations.htm



Definition and Permissible Values

Step 3

A definition must capture the essential semantic content of a data element.

Definitions are recorded in context (where did the definition originate or how is it applied?).

Permissible values are the domains of acceptable values for the data element:

Enumerated by a specific list of values?

Defined by a description, procedure, or range?


Permissible Values – Value Domain

Step 3

  • How are values represented (e.g., name, code, text, date)?

  • When did each value become valid/invalid?

  • What are the name and definition/description of the value domain?

  • How many characters are required in the database to store the value?

  • Is the data value recorded as a character string, numerals, integer, or other?

  • Are the data values formatted?


Definition - Example

The code that represents a United States state or state equivalent in a mailing address.

Context: USPS Standard



Permissible Values - Example

  • Representation: Code

  • Value Domain Name: The state codes for states and state equivalents of the United States

  • Definition: All codes recognized by the U.S. Postal Service on a mail piece for identification of a state or state equivalent of the United States

  • Field length: 2

  • Datatype: alphabetic

  • Format: character string

  • List of values: 62 values representing the 50 states, the District of Columbia, the 8 outlying territories and freely associated states, and the 3 codes for military states
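The value domain above can be sketched as an enumerated set with a validation check covering field length, datatype, and list membership. The code list is written from the commonly published USPS two-letter abbreviations rather than from the presentation itself, so treat it as an assumption to verify against the USPS source. The names `PERMISSIBLE_VALUES`, `FIELD_LENGTH`, and `is_permissible` are hypothetical.

```python
# Assumed code lists, to be verified against the USPS abbreviation table.
STATES = (
    "AL AK AZ AR CA CO CT DE FL GA HI ID IL IN IA KS KY LA ME MD "
    "MA MI MN MS MO MT NE NV NH NJ NM NY NC ND OH OK OR PA RI SC "
    "SD TN TX UT VT VA WA WV WI WY"
).split()
DISTRICT = ["DC"]
# Outlying territories and freely associated states.
TERRITORIES_AND_ASSOCIATED_STATES = "AS FM GU MH MP PW PR VI".split()
# Military "state" codes.
MILITARY = "AA AE AP".split()

PERMISSIBLE_VALUES = frozenset(
    STATES + DISTRICT + TERRITORIES_AND_ASSOCIATED_STATES + MILITARY
)

FIELD_LENGTH = 2  # "Field length: 2" in the value domain description


def is_permissible(value):
    """Check field length, alphabetic datatype, and membership in the enumerated list."""
    return (
        isinstance(value, str)
        and len(value) == FIELD_LENGTH
        and value.isalpha()
        and value in PERMISSIBLE_VALUES
    )


print(len(PERMISSIBLE_VALUES))
print(is_permissible("NJ"), is_permissible("UM"))
```

The count check (50 + 1 + 8 + 3 = 62) is a cheap guard against a code being dropped or duplicated when the list is maintained by hand.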



Names and Identifiers

Step 4

A name is a term or phrase that describes the data element–something to call it.

Names are recorded in context (where did the name originate or how is it applied?).

Identifiers are unique. They identify the Registration Authority, the organization, the data element, and the version of the data element if information about the data element changes.


Names and Identifiers - Example

  • Name: State or State Equivalent Code

  • Context: USPS Standard

  • Identifier:

    • Registration Authority: EPA

    • Organization: OEI

    • Sub-organization: OIC

    • Data Element ID: 29324

    • Version: 1
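One way to picture the composite identifier is as an immutable record whose version advances whenever information about the data element changes. This is a minimal sketch: the class name `DataElementIdentifier`, the method names, and the slash-separated display format are hypothetical and are not taken from the EDR.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DataElementIdentifier:
    registration_authority: str
    organization: str
    sub_organization: str
    data_element_id: int
    version: int

    def as_key(self):
        """Render the identifier parts as one string (hypothetical format)."""
        return "/".join(
            [
                self.registration_authority,
                self.organization,
                self.sub_organization,
                str(self.data_element_id),
                f"v{self.version}",
            ]
        )

    def next_version(self):
        """Return a new identifier for a changed data element; the original is untouched."""
        return replace(self, version=self.version + 1)


ident = DataElementIdentifier("EPA", "OEI", "OIC", 29324, 1)
print(ident.as_key())
print(ident.next_version().as_key())
```

Keeping the record frozen means an older version can never be silently overwritten, so every version of the data element keeps a distinct identifier.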



Administrative and Miscellaneous Attributes

Step 5

  • Submitting organization–the organization that has submitted the data element for registration

  • Stewardship contact–the organization delegated the responsibility for maintaining the data element

  • Data element comment–provides remarks about usage, procedure, and other explanatory information that is not appropriate to include in the definition


Administrative and Miscellaneous Attributes (continued)

Step 5

  • Data element example–an example of a value that is permissible for the data element

  • Data element origin–source of information about the data element, including document, standard, system, group, form, or message set

  • Creation/last change date–the system date when a data element was created or updated in the registry


Administrative & Miscellaneous Attributes - Example

  • Submitting organization–Office of Environmental Information

  • Stewardship contact–Data Standards Branch

  • Data element comment–this data element is used to identify states and state equivalents for all United States mailing addresses, including military addresses

  • Data element example–NJ (New Jersey)

  • Data element origin–EPA data standard workgroup

  • Creation/last change date–system date
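The attributes in this example could be held in a small record whose change date is filled from the system clock, as a rough sketch. The class name `AdministrativeAttributes`, its field names, and the `touch` method are hypothetical. A real registry would likely store more attributes than these.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class AdministrativeAttributes:
    submitting_organization: str
    stewardship_contact: str
    comment: str
    example: str
    origin: str
    # Filled from the system date when the record is created.
    last_change_date: date = field(default_factory=date.today)

    def touch(self):
        """Reset the change date to today's system date after an update."""
        self.last_change_date = date.today()


attrs = AdministrativeAttributes(
    submitting_organization="Office of Environmental Information",
    stewardship_contact="Data Standards Branch",
    comment=(
        "Used to identify states and state equivalents for all United States "
        "mailing addresses, including military addresses"
    ),
    example="NJ (New Jersey)",
    origin="EPA data standard workgroup",
)
print(attrs.stewardship_contact, attrs.last_change_date)
```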



Data Element Concept

Step 6

  • Provides conceptual information

  • May relate data elements that convey the same concept with different representations

  • Singular–refers to only one concept

  • Must have a name and definition, recorded in context

  • Specified through a conceptual domain, i.e., the set of possible valid values for a data element concept, expressed without representation


Data Element Concept - Example

  • Name: U.S. State or State Equivalent

  • Definition: An identifier for a primary political subdivision of the United States, including an outlying territory or an associated state

  • Data elements that might share this data concept include:

    • United States State Name–New Jersey

    • State Common Name–Garden State

    • Facility Location State Abbreviation–NJ

  • This data element concept uses a subset of the values in the conceptual domain:

    • Primary Geopolitical Subdivisions of Countries



Conceptual Domain - Example

  • Name: Primary Geopolitical Subdivisions of Countries

  • Definition: Identifiers for the primary geopolitical subdivisions of the countries of the world

  • Value meanings might include:

    • The U.S. state of Alabama

    • The Canadian province of Alberta

    • The Malaysian state of Sabah

    • The U.S. state equivalent of District of Columbia
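The relationship between a conceptual domain, a data element concept, and the data elements that share the concept can be sketched with three small classes. This is an illustration under stated assumptions: all class and function names are hypothetical, and the value meaning for New Jersey is added here only so the three example data elements have something to represent.

```python
from dataclasses import dataclass


@dataclass
class ConceptualDomain:
    name: str
    value_meanings: frozenset  # meanings only, no representation


@dataclass
class DataElementConcept:
    name: str
    conceptual_domain: ConceptualDomain
    value_meanings: frozenset  # must be a subset of the domain's meanings

    def __post_init__(self):
        extra = self.value_meanings - self.conceptual_domain.value_meanings
        if extra:
            raise ValueError(f"not in conceptual domain: {sorted(extra)}")


@dataclass
class DataElement:
    name: str
    concept: DataElementConcept
    representation: dict  # value meaning -> represented value


NJ = "The U.S. state of New Jersey"  # assumed meaning, added for illustration

DOMAIN = ConceptualDomain(
    "Primary Geopolitical Subdivisions of Countries",
    frozenset({
        "The U.S. state of Alabama",
        "The Canadian province of Alberta",
        "The Malaysian state of Sabah",
        "The U.S. state equivalent of District of Columbia",
        NJ,
    }),
)

CONCEPT = DataElementConcept(
    "U.S. State or State Equivalent",
    DOMAIN,
    frozenset({
        "The U.S. state of Alabama",
        "The U.S. state equivalent of District of Columbia",
        NJ,
    }),
)

ELEMENTS = [
    DataElement("United States State Name", CONCEPT, {NJ: "New Jersey"}),
    DataElement("State Common Name", CONCEPT, {NJ: "Garden State"}),
    DataElement("Facility Location State Abbreviation", CONCEPT, {NJ: "NJ"}),
]


def representations_of(meaning, elements):
    """Collect how each data element represents one value meaning."""
    return {e.name: e.representation[meaning] for e in elements if meaning in e.representation}


print(representations_of(NJ, ELEMENTS))
```

The same meaning appears under three different representations, which is what allows a registry to relate data elements that convey one concept.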



Classification Schemes

Step 7

Data elements might be classified according to any of the following types of groups where the data element might be listed:

  • Usage

  • Data standard

  • Application system

  • Data collection form

  • Keywords

  • Object class


Classification Schemes - Example

  • Mailing address group

  • U.S. Postal Service Address Standard

  • Form R for Toxic Release Inventory

  • Keywords: State, Geopolitical
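Classification can be pictured as an index from (scheme type, group) pairs to the data elements listed in each group. The sketch below is a hypothetical illustration: the class `ClassificationIndex` and its methods are invented names, and the pairing of each example group with a scheme type (for instance, treating the mailing address group as a "Usage" group) is an assumption.

```python
from collections import defaultdict


class ClassificationIndex:
    """Maps (scheme type, group) to the data elements classified under it."""

    def __init__(self):
        self._by_group = defaultdict(set)

    def classify(self, element, scheme_type, group):
        self._by_group[(scheme_type, group)].add(element)

    def members(self, scheme_type, group):
        """Data elements listed in one group, sorted by name."""
        return sorted(self._by_group.get((scheme_type, group), set()))

    def groups_for(self, element):
        """All (scheme type, group) pairs under which an element is classified."""
        return sorted(key for key, found in self._by_group.items() if element in found)


ELEMENT = "State or State Equivalent Code"

index = ClassificationIndex()
index.classify(ELEMENT, "Usage", "Mailing address group")
index.classify(ELEMENT, "Data standard", "U.S. Postal Service Address Standard")
index.classify(ELEMENT, "Data collection form", "Form R for Toxic Release Inventory")
index.classify(ELEMENT, "Keywords", "State")
index.classify(ELEMENT, "Keywords", "Geopolitical")

print(index.members("Keywords", "State"))
print(index.groups_for(ELEMENT))
```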



Quality Control

Step 8

  • Registration status–records the position of the data element in the registration life cycle, which indicates its stage of quality review

    • Incomplete–not all metadata are entered

    • Recorded–all metadata are entered

    • Certified–metadata are valid

    • Standard–the preferred data element for Agency use
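The four registration statuses form an ordered progression, which can be sketched as an ordered enumeration plus a function that derives the status from three review outcomes. The names `RegistrationStatus` and `registration_status`, and the three boolean inputs, are hypothetical. An actual registry may apply additional criteria at each stage.

```python
from enum import IntEnum


class RegistrationStatus(IntEnum):
    INCOMPLETE = 1  # not all metadata are entered
    RECORDED = 2    # all metadata are entered
    CERTIFIED = 3   # metadata are valid
    STANDARD = 4    # the preferred data element for Agency use


def registration_status(all_metadata_entered, metadata_valid, adopted_as_standard):
    """Return the highest status whose preconditions are all met."""
    if not all_metadata_entered:
        return RegistrationStatus.INCOMPLETE
    if not metadata_valid:
        return RegistrationStatus.RECORDED
    if not adopted_as_standard:
        return RegistrationStatus.CERTIFIED
    return RegistrationStatus.STANDARD


status = registration_status(True, True, False)
print(status.name)
print(status < RegistrationStatus.STANDARD)
```

Using an ordered enumeration lets a registry ask questions such as "is this element at least Certified?" with a plain comparison.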


Quality Control - Example

Quality Assurance: Registration Status

All data have been entered: Recorded

Data are certified to be accurate: Certified

After becoming Agency standard: Standard
