1 / 32

Relaxing Join and Selection Queries

Relaxing Join and Selection Queries. Rares Vernica UC Irvine, USA Joint work with Nick Koudas, Chen Li, and Anthony K. H. Tung. Query Example. SELECT * FROM Jobs J, Candidates C WHERE J.Salary <= 95 AND J.Zipcode = C.Zipcode AND C.WorkExp >= 5;.

lynton
Download Presentation

Relaxing Join and Selection Queries

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Relaxing Join and Selection Queries Rares Vernica UC Irvine, USA Joint work with Nick Koudas, Chen Li, and Anthony K. H. Tung

  2. Query Example SELECT * FROM Jobs J, Candidates C WHERE J.Salary <= 95 AND J.Zipcode = C.Zipcode AND C.WorkExp >= 5; Rares Vernica, UC Irvine

  3. What if the query answer is empty? SELECT * FROM Jobs J, Candidates C WHERE J.Salary <= 95 AND J.Zipcode = C.Zipcode AND C.WorkExp >= 5; Adjust the conditions • What conditions to adjust? • How to adjust them? Rares Vernica, UC Irvine

  4. Example Percentages of Empty Result Queries • In a Customer Relationship Management (CRM) application developed by IBM • 18.07% (3,396 empty result queries in 18,793 queries) • In a real estate application developed by IBM • 5.75% • In a digital library application [JCM+00] • 10.53% • In a bioinformatics application [RCP+98] • 38% Efficient Detection of Empty-Result Queries (p.1015)Gang Luo (IBM T.J. Watson Research Center, USA) VLDB 2006 Rares Vernica, UC Irvine

  5. Observations Different ways to adjust the conditions: Select vs. Join How much to adjust each condition?Salary <= 100 vs. Salary <= 120 Adjust join vs. Adjust both selections WorkExp >= 5 Salary <= 95 Rares Vernica, UC Irvine

  6. Contributions Query relaxationframework for selections and joins Lattice-based approach for query relaxation Efficient relaxation algorithms Rares Vernica, UC Irvine

  7. Overview • Motivation • Query Relaxation • Lattice-based Relaxation • Relaxation Algorithms • Variations • Experiments Rares Vernica, UC Irvine

  8. Query Relaxation Top-k / Nearest neighbor • Weight for each condition Skyline • No weights are needed • Conditions are not considered equal • Return non dominated points Rares Vernica, UC Irvine

  9. Skyline Stephan Börzsönyi, Donald Kossmann, Konrad Stocker: The Skyline Operator. ICDE 2001 Query Relaxation Rares Vernica, UC Irvine

  10. Overview • Motivation • Query Relaxation • Lattice-based Relaxation • Relaxation Algorithms • Variations • Experiments Rares Vernica, UC Irvine

  11. Lattice-based Relaxation R – select on Jobs J – join condition S – select on Candidates WorkExp >= 5 Salary <= 95 Rares Vernica, UC Irvine

  12. Overview • Motivation • Query Relaxation • Lattice-based Relaxation • Relaxation Algorithms • Variations • Experiments Rares Vernica, UC Irvine

  13. Relaxing Selection Conditions INCORRECT Algorithm: • Compute Skyline on Jobs • Compute Skyline on Candidates • Join the Skylines WorkExp >= 5 Salary <= 95 Empty Join Skyline Skyline Skyline Rares Vernica, UC Irvine

  14. Relaxing Selection Conditions Join First Algorithm: • Compute the join(disregarding the selections) • Compute Skyline on join results WorkExp >= 5 Salary <= 95 Join Skyline Rares Vernica, UC Irvine

  15. Relaxing Selection Condition Variations Pruning Join • Build the Skyline during the join Pruning Join+ • Pruning Join • Build the local Skyline before the join Sorted Access Join • Fagin’s Top-k: sort the columns on relaxation • Compute the join Skyline Rares Vernica, UC Irvine

  16. Relaxing all conditions Queue Skyline Multi-Dim.-Index-based-Relaxation Algorithm: • Traverse the index structure top-down • Form pairs of nodes or records • Build the Skyline Rares Vernica, UC Irvine

  17. Overview • Motivation • Query Relaxation • Lattice-based Relaxation • Relaxation Algorithms • Variations • Experiments Rares Vernica, UC Irvine

  18. Variations Computing Top-k over Skyline • Weight to each condition Queries with multiple joins Conditions on nonnumeric attributes • Dominance checking function Rares Vernica, UC Irvine

  19. Overview • Motivation • Query Relaxation • Lattice-based Relaxation • Relaxation Algorithms • Variations • Experiments Rares Vernica, UC Irvine

  20. Experimental Setting Datasets • Real • Internet Movie Database (IMDB) Movies (120k) & ActorInMovies (1.2m) • Census-Income – UCI KDD Repository Census (200k) • Synthetic Independent, Correlated, and Anticorrelated Implementation • GNU C++ • Spatial Index Library (R-tree) • Linux, AMD Opteron 240, 1GB RAM Rares Vernica, UC Irvine

  21. Different algorithms, different behaviors IMDB Dataset Rares Vernica, UC Irvine

  22. Different datasets, different behaviors Anticorrelated Dataset Correlated Dataset Independent Dataset Rares Vernica, UC Irvine

  23. How big is the Skyline? Rares Vernica, UC Irvine

  24. Relaxing join takes time Self-join on Census Dataset Rares Vernica, UC Irvine

  25. Top-k over Skyline IMDB Dataset Rares Vernica, UC Irvine

  26. Related Work Muslea et al. • Alternate forms of conjunctive expressions Efficient Skyline algorithms • Selection queries Efficient Top-k algorithms • Require weights for conditions Rares Vernica, UC Irvine

  27. Conclusions Query relaxationframework for selections and joins Lattice-based approach for query relaxation Efficient relaxation algorithms Rares Vernica, UC Irvine

  28. Future Work Optimum use of the lattice structure Relax conditions on string attributes Algorithms applicable outside the databases Rares Vernica, UC Irvine

  29. Questions ?

  30. Rares Vernica, UC Irvine

  31. Skyline vs. Top-k Rares Vernica, UC Irvine

  32. Skyline vs. Top-k over Skyline Rares Vernica, UC Irvine

More Related