Data Quality Class 3
Goals • Dimensions of Data Quality • Data Extraction, Transformation, and Loading • Data Cleansing Project
Dimensions of Data Quality • Poor data quality is similar to obscenity: it seems as if there is no real way to measure it, but you know it when you see it! • In reality, data quality can be measured • The frame of reference for measurement is different • Dimensions of Data Quality
Dimensions of Data Quality 2 • Data Models • Data Values • Data Presentation • Data Policy
Data Quality of Data Models • Clarity of definition • Comprehensiveness • Flexibility • Robustness
Data Quality of Data Models 2 • Essentialness • Attribute granularity • Precision of domains • Homogeneity
Data Quality of Data Models 3 • Naturalness • Identifiability • Obtainability • Relevance
Data Quality of Data Models 4 • Simplicity • Semantic Consistency • Structural Consistency
Data Quality of Data Values • Accuracy • Null values • Completeness • Consistency • Currency
Accuracy • Agreement with established sources • Database of record • Other corroborative sources
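One way to make "agreement with established sources" measurable is to compare a field against the database of record and report the share of matching values. The sketch below is illustrative only: the function name `accuracy_rate`, the `id` and `zip` fields, and the sample rows are assumptions, not part of the original slides.

```python
def accuracy_rate(records, database_of_record, key, field):
    """Fraction of checkable records whose field agrees with the database of record.

    Records with no entry in the database of record are skipped, because
    there is nothing to corroborate them against.
    """
    checked = 0
    matched = 0
    for row in records:
        reference = database_of_record.get(row[key])
        if reference is None:
            continue  # no established source for this record
        checked += 1
        if row[field] == reference[field]:
            matched += 1
    return matched / checked if checked else None


# Hypothetical sample data, for illustration only.
customers = [
    {"id": 1, "zip": "10001"},
    {"id": 2, "zip": "99999"},
    {"id": 3, "zip": "60601"},  # not present in the database of record
]
reference = {1: {"zip": "10001"}, 2: {"zip": "94105"}}

rate = accuracy_rate(customers, reference, "id", "zip")
```

Here one of the two checkable records agrees with the reference, so `rate` is 0.5; the third record is left out of the measurement rather than counted as wrong.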
Null Values • Null vs. Missing • Unavailable • Not applicable • No value • Not classified • Truly null
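The distinctions listed above are lost when every absent value is stored as a single generic null. As a minimal sketch, assuming the reason for absence is recorded alongside the data, an enumeration can keep the categories apart. The names `NullReason` and `null_profile` and the sample fax numbers are hypothetical.

```python
from collections import Counter
from enum import Enum


class NullReason(Enum):
    """Why a value is absent; the categories mirror the slide's list."""
    UNAVAILABLE = "unavailable"
    NOT_APPLICABLE = "not applicable"
    NO_VALUE = "no value"
    NOT_CLASSIFIED = "not classified"
    TRULY_NULL = "truly null"


def null_profile(values):
    """Count absent values by reason, ignoring values that are present."""
    return Counter(v for v in values if isinstance(v, NullReason))


# Hypothetical column of fax numbers, for illustration only.
fax_numbers = [
    "555-0100",
    NullReason.NOT_APPLICABLE,
    NullReason.UNAVAILABLE,
    NullReason.NOT_APPLICABLE,
]

profile = null_profile(fax_numbers)
```

With this sample, `profile` records two "not applicable" entries and one "unavailable" entry, which a single null marker could not distinguish.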
Completeness • Mandatory attributes require values • Optional attributes may hold values (when and how?) • Inapplicable attributes may not have a value (also when and how?) • Completeness constraints
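Completeness constraints of this kind can be written as executable rules. The following is a sketch under stated assumptions: the function `completeness_violations`, the field names, and the rule that `spouse_name` is inapplicable for single customers are all invented for illustration.

```python
def completeness_violations(row, mandatory, inapplicable_when):
    """Return a list of completeness problems found in one record.

    mandatory: field names that must hold a value.
    inapplicable_when: maps a field name to a predicate over the row;
        when the predicate is true, the field must be empty.
    """
    empty = (None, "")
    problems = []
    for field in mandatory:
        if row.get(field) in empty:
            problems.append(f"{field}: mandatory value missing")
    for field, condition in inapplicable_when.items():
        if condition(row) and row.get(field) not in empty:
            problems.append(f"{field}: value present but attribute is inapplicable")
    return problems


# Hypothetical record and rules, for illustration only.
row = {"name": "Ada", "marital_status": "single", "spouse_name": "Bob"}
rules = {"spouse_name": lambda r: r.get("marital_status") == "single"}

problems = completeness_violations(row, ["name", "customer_id"], rules)
```

Optional attributes need no rule here: they are simply the fields that appear in neither `mandatory` nor `inapplicable_when`.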
Consistency • Are values in one set consistent with values in another set? • Consistency relations between attributes in the same table • Consistency assertions across columns • Consistency relationships between tables
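Two of these consistency checks can be sketched in a few lines: an assertion between attributes of the same row, and a relationship between two tables. The order and customer tables, their field names, and both function names are assumptions made for the example.

```python
from datetime import date


def row_is_consistent(order):
    """Same-table check: an order cannot ship before it was placed."""
    return order["ship_date"] >= order["order_date"]


def orphaned_orders(orders, customers):
    """Cross-table check: every order must refer to a known customer."""
    known = {c["customer_id"] for c in customers}
    return [o["order_id"] for o in orders if o["customer_id"] not in known]


# Hypothetical tables, for illustration only.
customers = [{"customer_id": 1}, {"customer_id": 2}]
orders = [
    {"order_id": "A", "customer_id": 1,
     "order_date": date(2024, 3, 1), "ship_date": date(2024, 3, 4)},
    {"order_id": "B", "customer_id": 9,
     "order_date": date(2024, 3, 5), "ship_date": date(2024, 3, 2)},
]

inconsistent_rows = [o["order_id"] for o in orders if not row_is_consistent(o)]
orphans = orphaned_orders(orders, customers)
```

Order "B" fails both checks in this sample: it ships before its order date and refers to a customer that does not exist.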
Currency/Timeliness • What data is current? • How is it kept up-to-date? • Time expectations for accessibility to data
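A time expectation can be turned into a check by comparing each record's last update with a maximum acceptable age. This is a minimal sketch: the `last_updated` field, the 30-day threshold, and the function name `stale_records` are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta


def stale_records(records, now, max_age):
    """Return the ids of records last updated more than max_age before now."""
    return [r["id"] for r in records if now - r["last_updated"] > max_age]


# Hypothetical records and threshold, for illustration only.
now = datetime(2024, 6, 30)
records = [
    {"id": 1, "last_updated": datetime(2024, 6, 25)},
    {"id": 2, "last_updated": datetime(2024, 1, 15)},
]

stale = stale_records(records, now, max_age=timedelta(days=30))
```

Passing `now` as a parameter, rather than reading the clock inside the function, keeps the check repeatable.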
Data Quality of Data Presentation • Appropriateness • Correct Interpretation • Flexibility • Format Precision
Data Quality of Data Presentation 2 • Portability • Representation Consistency • Representation of Null Values
Data Quality of Data Policy • Access • Metadata • Privacy • Fault-tolerance • Security