On the Authoritative Data Sources: One Data Element at a Time
On the Authoritative Data Sources: One Data Element at a Time. DAMA National Capital Region Chapter Meeting, March 9, 2010, Washington, DC. Richard Wang, Ph.D., Deputy Chief Data Officer and Chief Data Quality Officer, Office of the U.S. Army CIO/G-6.


On the Authoritative Data Sources: One Data Element at a Time

DAMA National Capital Region Chapter Meeting

March 9, 2010

Washington, DC

Richard Wang, Ph.D.

Deputy Chief Data Officer

Chief Data Quality Officer

Office of the U.S. Army CIO/G-6

Director, MIT Information Quality Program (on leave)

Massachusetts Institute of Technology

Visiting University Professor of Information Quality

University of Arkansas at Little Rock

Data Quality Books by MIT Information Quality Program

  • http://mitiq.mit.edu/Publications.htm






MIT’s role in the foundations for IQ education (2007, Madnick)

Lots of time & energy

UALR: MSIQ and IQ PhD Degree Programs


  • 2007 ACM Journal on Data and Information Quality (JDIQ)

Conferences and Certification Programs

  • 1996 International Conference on Information Quality (ICIQ)

  • 2002 MIT-IQ program for Executives

  • 2003 IQ-1: Principles and Foundations

  • 2007 IQ Industry Symposium


Rich Wang

(our Harry Potter)


  • Journey to Data Quality (2006)

  • … and many others


  • 1990 Polygen Data Quality Model (VLDB + ICIS)

  • 1996 Beyond Accuracy

  • 1998 Managing Information as a Product


Research Projects

  • 1988 Total Data Quality Management Program (TDQM)

  • 2002 MIT Information Quality (MITIQ) Program

* Not a complete list

One Data Element at a Time: Federal Agency Case

Stakeholders Meeting

Data Element Identification

$1M+ impact per data element

90-day progress

Private Sector Case

Data Element Selection Criteria

  • Critical to Business

  • Recognized Pain Point

  • $1M+ impact

  • Practical to model

  • Practical to Implement

  • Owner identified

  • Commitment by the Stakeholders: 3 C’s + Management
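
The selection criteria above work as a checklist that a candidate data element either passes or fails. A minimal sketch, assuming a hypothetical `Candidate` record with one field per criterion; the field names and the way each criterion is evaluated are illustrative and do not come from the presentation:

```python
from dataclasses import dataclass
from typing import List, Optional

IMPACT_THRESHOLD_USD = 1_000_000  # the "$1M+ impact" criterion from the slide


@dataclass
class Candidate:
    """Hypothetical record describing a candidate data element."""
    name: str
    critical_to_business: bool
    recognized_pain_point: bool
    estimated_impact_usd: float
    practical_to_model: bool
    practical_to_implement: bool
    owner: Optional[str]  # None means no owner has been identified
    stakeholders_committed: bool


def unmet_criteria(c: Candidate) -> List[str]:
    """Return the slide's selection criteria that the candidate fails."""
    checks = {
        "Critical to Business": c.critical_to_business,
        "Recognized Pain Point": c.recognized_pain_point,
        "$1M+ impact": c.estimated_impact_usd >= IMPACT_THRESHOLD_USD,
        "Practical to model": c.practical_to_model,
        "Practical to Implement": c.practical_to_implement,
        "Owner identified": bool(c.owner),
        "Commitment by the Stakeholders": c.stakeholders_committed,
    }
    return [criterion for criterion, ok in checks.items() if not ok]
```

A candidate is selected only when `unmet_criteria` returns an empty list; otherwise the returned names show what still has to be resolved before the element is taken on.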

Army Chief Data Quality Officer FY10 Priorities

300-500 critical Army Data Elements in FY10, 5000 by FY13

Army staffing of Data Elements from Bronze to Silver to Gold

Vertical integration up with semantics, business logic, objects (U-Core, C2-Core ontology)

Authoritative Data Sources, Designated Data Sources, Authoritative Data Elements

Single Element Approach


Establish a Total Data Quality Management (TDQM) Program in the Army while utilizing limited resources

TDQM Cycle


  • Address one data element at a time using priority data elements within priority projects.

  • Take a first few data elements through the entire TDQM cycle to educate and illustrate value.

  • Establish and populate a catalog of data element quality specifications (the “Define” of TDQM) containing priority data elements for broad use.

Early Success

Project: Suicide Mitigation - NIMH Study feed

Elements: UIC, SSN

  • Developed Data Quality Specification to define data quality rules.

  • Constructed Information Product Map (IP-Map) that shows the flow of the data element and its quality checks from data providers to NIMH Study consumer.

  • ADCF implemented quality checks and reported results.

  • Captured DQ Process metrics and DQ element metrics in a Dashboard.

  • Preparing DQ element metric details to feed back to data providers.
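
The quality checks and element metrics described above can be sketched as a set of rules applied to each record, with a pass rate per element for the dashboard and a failure list for feedback to data providers. The slides do not give the actual rules, so the two format patterns below (nine digits for SSN, six uppercase alphanumerics for UIC) and the function names are assumptions for illustration only:

```python
import re
from collections import Counter

# Assumed format rules; the real Data Quality Specification is not in the slides.
RULES = {
    "SSN": re.compile(r"\d{9}"),        # assumption: nine digits, no dashes
    "UIC": re.compile(r"[A-Z0-9]{6}"),  # assumption: six uppercase alphanumerics
}


def element_metrics(records):
    """Return {element: fraction of records whose value passes its rule}."""
    passed = Counter()
    total = Counter()
    for record in records:
        for element, pattern in RULES.items():
            total[element] += 1
            value = record.get(element) or ""  # missing or None counts as a failure
            if pattern.fullmatch(value):
                passed[element] += 1
    return {element: passed[element] / total[element] for element in total}


def failures(records):
    """List (record index, element, value) tuples to feed back to data providers."""
    return [
        (i, element, record.get(element))
        for i, record in enumerate(records)
        for element, pattern in RULES.items()
        if not pattern.fullmatch(record.get(element) or "")
    ]
```

These are format-only checks. Rules that depend on other fields or on the consuming mission would need more than a regular expression.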

(Army) Data Element Yellow Pages

A. Army Data Element specifications are developed through the Data Element Quality Definition Process and entered in the Data Element Yellow Pages.

B. IP Producers utilize the Data Element Yellow Pages to discover Data Element specifications and integrate them into their Information Products

C. IP Consumers access the Data Element Yellow Pages to find Data Element specifications for understanding and correctly using the data.

IP = Information Product

[Diagram labels: Data Element Quality Definition, IP Producer, IP Consumer]

Data Element Yellow Pages Content

Data Element Quality Specification:

  • Element Name

  • Definition

  • Data Quality Rules

  • Approval Level

  • Examples

  • Data Element Owner (Steward?)

  • Authoritative References

  • Usage Notes

  • more…

Data Quality Rules:

Supports “fit for use”

Segmented into Three Levels

  • Container (conceptual format)

  • Content (correct in itself)

  • Context (correct in context)

Approval Level:

  • Gold – ADB Approved

  • Silver – ADC Approved

  • Bronze – CDQO Approved
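
One way to picture a Yellow Pages entry is as a record that carries the fields listed above, with rules tagged by level and an approval level tied to the approving body. A minimal sketch, assuming hypothetical class and method names (`DataElementSpec`, `YellowPages`, `publish`, `lookup`); the slides do not describe how the catalog is actually implemented:

```python
from dataclasses import dataclass, field
from enum import Enum


class RuleLevel(Enum):
    CONTAINER = "Container"  # conceptual format
    CONTENT = "Content"      # correct in itself
    CONTEXT = "Context"      # correct in context


class ApprovalLevel(Enum):
    # Each value is the approving body named on the slide.
    BRONZE = "CDQO"
    SILVER = "ADC"
    GOLD = "ADB"


@dataclass
class QualityRule:
    level: RuleLevel
    description: str


@dataclass
class DataElementSpec:
    """Hypothetical shape of one Data Element Quality Specification."""
    element_name: str
    definition: str
    approval_level: ApprovalLevel
    owner: str = ""
    rules: list = field(default_factory=list)
    examples: list = field(default_factory=list)
    authoritative_references: list = field(default_factory=list)
    usage_notes: str = ""

    def rules_at(self, level):
        """Return only the rules at one of the three levels."""
        return [rule for rule in self.rules if rule.level is level]


class YellowPages:
    """Hypothetical in-memory catalog keyed by element name."""

    def __init__(self):
        self._specs = {}

    def publish(self, spec):
        self._specs[spec.element_name] = spec

    def lookup(self, element_name):
        return self._specs.get(element_name)  # None if not catalogued
```

Producers would call `publish` once a specification is approved, and consumers would call `lookup` to read the definition and rules before using the element.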

ADC Review and Comment Process (proposed)

1. Review DE Specifications with your SMEs

Note: you will find some documents cover the entire project; others have only the definition and quality sections completed. Review the definition and quality sections.

2. Gather and submit your comments to CDQO

All comments are welcome (positive, corrections, content, format, unaddressed). No comment (silence) is treated as concurrence.

Send your comments to CDQO Office.

3. Suspense Date: the week before the next ADC meeting, for readout at the monthly ADC meeting.

ADS Defined

Authoritative Data Source:

A recognized or official data production source with a designated mission statement or source/product to publish reliable and accurate data for subsequent use by customers. An authoritative data source may be the functional combination of multiple, separate data sources.

To assure data quality…

  • A data source is a mechanism through which the publication, storage, or retrieval of data is possible. Within the scope of the Information Technology domain, a data source consists of digitized data, such as a database, a machine-readable file, or a data stream. Data sources contain or provide information and fulfill specific data needs within an identified mission context.

  • A data element is an attribute in a database, a field in a machine readable file, or a basic unit in a data stream.

  • The association of a data need and a given mission characterizes a data source’s intended use.

  • A data source is referred to as a Designated Data Source if the mission and the needed data elements from the data source for this mission are clearly specified.

  • An authoritative body that is responsible for fulfilling a particular data need designates a data source as a Designated Data Source.

  • A designated data source is referred to as an Authoritative Data Source if the underlying data of the data elements needed in the specified mission is certified as accurate, timely, and fit for subsequent use by data consumers.
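
The definitions above form a progression: any data source, then a Designated Data Source for a specific mission, then an Authoritative Data Source once the needed elements are certified. A minimal sketch of that progression, assuming a hypothetical `SourceAssessment` record; the field names are illustrative, and certification is reduced to a simple set of certified element names:

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Optional


class SourceStatus(Enum):
    DATA_SOURCE = "Data Source"
    DESIGNATED = "Designated Data Source"
    AUTHORITATIVE = "Authoritative Data Source"


@dataclass(frozen=True)
class SourceAssessment:
    """Hypothetical assessment of one data source for one mission."""
    mission: Optional[str]              # the mission the data need belongs to
    needed_elements: FrozenSet[str]     # data elements needed for that mission
    certified_elements: FrozenSet[str]  # elements certified accurate, timely, fit for use
    designated_by: Optional[str] = None  # authoritative body making the designation


def classify(a: SourceAssessment) -> SourceStatus:
    """Apply the slide's definitions in order, from weakest to strongest."""
    clearly_specified = bool(a.mission) and bool(a.needed_elements)
    if not (clearly_specified and a.designated_by):
        return SourceStatus.DATA_SOURCE
    # Authoritative only if every needed element is certified.
    if a.needed_elements <= a.certified_elements:
        return SourceStatus.AUTHORITATIVE
    return SourceStatus.DESIGNATED
```

The status belongs to the pairing of a source and a mission, so the same source can be authoritative for one mission and only designated for another.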

Thank you! Q & A