
Data Quality Aware Query Systems

School of Information Technology and Electrical Engineering. Data Quality Aware Query Systems. Naiem Khodabandehloo Yeganeh. Supervised by: Dr. Shazia Sadiq; Co-supervisor: Prof. Xiaofang Zhue. At VLDB 2010 PhD Workshop.


Presentation Transcript


  1. School of Information Technology and Electrical Engineering. Data Quality Aware Query Systems. Naiem Khodabandehloo Yeganeh. Supervised by: Dr. Shazia Sadiq; Co-supervisor: Prof. Xiaofang Zhue. At VLDB 2010 PhD Workshop

  2. Data Quality Aware Query Systems • My goal is to answer a query like this and maximize user satisfaction: SELECT TOP 10 Name, Job Title, Phone No, Address, Last Tax Paid, Count(Publications) FROM (Anywhere) WHERE Name="Naiem" AND School="UQ" ORDER BY DataQuality (as defined visually) [Figure: network of data sources; quality dimensions shown: Accurate, Current, Consistent, Complete]

  3. Framework & Assumptions • Database schemas of all data sources (metadata M) are known, and a federated view over all of them exists. • Data sources contribute by generating their own DQ profiles, because they know their data best (e.g. an England-based data source and an Australia-based data source have different rules for measuring the accuracy of Address). [Figure: quality-aware queries sent over a communication network to Org 1, Org 2, ..., Org n, each with its own DB, metadata M, and DQ profiling]

  4. Challenges • Capture user preferences on data quality • Data Quality Aware Query Language • (Pre-)estimate the quality of a query's result • Data Quality Profiling • Respond to the (TOP k) query efficiently • Data Quality Aware Query Planning

  5. Data Quality Aware Query Language • Preferences as partial orders • Goal: capture user preferences • "I have the following preference matrix about data quality" • "I like tea more than coffee"

  6. Data Quality Aware Query Language • We defined an extension to SQL to capture user preferences on data quality • We developed a visual user interface to capture preferences • We developed methods to detect inconsistencies in user preferences, with effective visual feedback
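
One way to picture inconsistency detection over preferences expressed as partial orders: treat each statement "X is preferred to Y" as a directed edge and look for a cycle. The sketch below is only an illustration under that assumption; the function name `find_preference_cycle` and the dimension names are hypothetical, and the slides do not say which method the authors actually use.

```python
from collections import defaultdict

def find_preference_cycle(preferences):
    """Return the items along one cycle, or None if the preferences are consistent.

    preferences: iterable of (better, worse) pairs (hypothetical input format).
    """
    graph = defaultdict(list)
    for better, worse in preferences:
        graph[better].append(worse)

    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    color = defaultdict(int)
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for nxt in graph[node]:
            if color[nxt] == GREY:            # back edge: the path loops onto itself
                return path[path.index(nxt):] + [nxt]
            if color[nxt] == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for node in list(graph):
        if color[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None

consistent = [("accuracy", "currency"), ("currency", "completeness")]
inconsistent = consistent + [("completeness", "accuracy")]
print(find_preference_cycle(consistent))
print(find_preference_cycle(inconsistent))
```

A returned cycle is the kind of thing a visual interface could highlight so the user can decide which preference to drop.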

  7. Data Quality Aware Profiling • Traditional DQ profiling: DQ scores are assigned to a source or schema object, so it cannot estimate the quality of query results. • Example: the quality of information about Apple products on a Microsoft website may not be good, even if the website has high-quality data in general.

  8. Data Quality Aware Profiling • We developed a new profiling method called Conditional DQ Profiling to estimate the quality of the results of a query. • The profile should cover ANY possible WHERE clause (e.g. WHERE Name='Naiem' AND School='UQ')

  9. Data Quality Aware Profiling • Example: a table with data about digital cameras. Brand: C = Canon, S = Sony. Model: S = SLR, N = Normal. Price: H = High, L = Low.

  10. Data Quality Aware Profiling

  11. Data Quality Aware Profiling • Conditional DQ Profile • Reduced Conditional DQ Profile with two thresholds (minimum set = 2, accuracy = 20%)
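
The idea of a conditional profile and its reduction can be sketched roughly as follows. This is a guess at the mechanics, not the authors' algorithm: the rows are made-up camera records using the Brand/Model/Price codes from the example, the `accurate` flag is a hypothetical per-row judgement, and the reading of the two thresholds (drop conditions matching fewer than `min_set` rows, and conditions whose accuracy differs from the overall accuracy by less than `min_diff`) is an assumption.

```python
from collections import defaultdict
from itertools import combinations

def conditional_profile(rows, attributes, flag="accurate"):
    """Map each condition, a tuple of (attribute, value) pairs, to (row count, accuracy)."""
    totals = defaultdict(lambda: [0, 0])   # condition -> [rows matched, accurate rows]
    for row in rows:
        for size in range(len(attributes) + 1):
            for attrs in combinations(attributes, size):
                condition = tuple((a, row[a]) for a in attrs)
                totals[condition][0] += 1
                totals[condition][1] += 1 if row[flag] else 0
    return {c: (n, ok / n) for c, (n, ok) in totals.items()}

def reduce_profile(profile, min_set=2, min_diff=0.20):
    """Keep the unconditional entry plus conditions that pass both (assumed) thresholds."""
    overall = profile[()][1]
    return {
        c: (n, acc)
        for c, (n, acc) in profile.items()
        if c == () or (n >= min_set and abs(acc - overall) >= min_diff)
    }

# Hypothetical data: C = Canon, S = Sony; S = SLR, N = Normal; H = High, L = Low.
rows = [
    {"Brand": "C", "Model": "S", "Price": "H", "accurate": True},
    {"Brand": "C", "Model": "S", "Price": "H", "accurate": True},
    {"Brand": "C", "Model": "N", "Price": "L", "accurate": False},
    {"Brand": "S", "Model": "N", "Price": "L", "accurate": True},
]
profile = conditional_profile(rows, ["Brand", "Model", "Price"])
reduced = reduce_profile(profile)
print(len(profile), len(reduced))
```

Under these assumptions, the estimated accuracy for a query such as WHERE Model='N' is looked up as `profile[(("Model", "N"),)]`; a condition missing from the reduced profile would fall back to a less specific entry.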

  12. Data Quality Aware Profiling • Effect of thresholds on the size of the Conditional DQ profile • PPM – Power Plant Meters Database • DBLP – DBLP Publications Database

  13. Data Quality Aware Profiling

  14. Data Quality Aware Query Planning • Possible join plans for: Select * from join A,B,C,D on ... [Figure: querying interface over relations A, B, C, D, each available from several sources (S1, S2, S3, ..., Sn) reached through the communication infrastructure]
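
To make the planning problem concrete, here is a minimal brute-force sketch: every relation can be fetched from several sources, each with an estimated quality, and a plan picks one source per relation. Scoring a plan by the product of the per-source estimates is an assumption made for illustration, as are the function name `best_plan` and the numbers; the slides do not describe the actual planner, and exhaustive enumeration grows exponentially with the number of relations.

```python
from itertools import product

def best_plan(candidates):
    """Pick one source per relation, maximizing the product of estimated qualities.

    candidates: relation -> {source: estimated quality in [0, 1]} (hypothetical input).
    """
    relations = sorted(candidates)
    best, best_score = None, -1.0
    # Enumerate every combination of (source, quality) choices, one per relation.
    for choice in product(*(candidates[r].items() for r in relations)):
        score = 1.0
        for _, quality in choice:
            score *= quality
        if score > best_score:
            best = {r: source for r, (source, _) in zip(relations, choice)}
            best_score = score
    return best, best_score

candidates = {
    "A": {"S1": 0.9, "S3": 0.6},
    "B": {"S2": 0.8, "S5": 0.5},
}
plan, score = best_plan(candidates)
print(plan, round(score, 2))
```

In a real system the per-source estimates would presumably come from the DQ profiles, and the search would need pruning to answer TOP k queries efficiently.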

  15. Love to get feedback • Questions?
