1 / 43

Relaxing Join Selection & Queries Nick Koudas et al

Relaxing Join Selection & Queries Nick Koudas et al. Presented by Pradnya Chavarkar. Motivation . Complex queries in terms of predicates against large databases. Right parameter values not known. Parameter adjustment becomes a trial-and-error method.

december
Download Presentation

Relaxing Join Selection & Queries Nick Koudas et al

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Relaxing Join Selection & Queries Nick Koudas et al Presented by Pradnya Chavarkar

  2. Motivation • Complex queries in terms of predicates against large databases. • Right parameter values not known. • Parameter adjustment becomes a trial-and-error method. • More the predicates, difficult the adjustment Relaxing Join and Selection Queries

  3. Example Relaxing Join and Selection Queries

  4. Example • How many conditions to adjust?? • How much to adjust?? • No. of choices exponential in number of conditions Relaxing Join and Selection Queries

  5. Outline • Relaxation framework • Relaxing select conditions • Relaxing All conditions • Lattice traversal • Experimental results • Conclusion Relaxing Join and Selection Queries

  6. Query Relaxation • Top–K queries Weights given to each condition • Skyline Points which are not dominated by any other points in the given set Relaxing Join and Selection Queries

  7. Relaxation Framework Selection Condition • Join Condition Relaxing Join and Selection Queries

  8. Relaxation Skyline Relaxing Join and Selection Queries

  9. Simple Solution ! • Compute Skyline on Jobs • Compute Skyline on Candidates • Join the skylines Relaxing Join and Selection Queries

  10. Simple Solution ! (Contd) • Incorrect results ! • Computes skyline locally Relaxing Join and Selection Queries

  11. Relaxing Selection Conditions • Many joins are on Identifier attributes • Relaxation skyline should satisfy : Relaxing Join and Selection Queries

  12. Join First (JF) • Join without select conditions • Compute skyline for select conditions on the resulting tuples • May not be efficient for joins which return a large number of pairs Relaxing Join and Selection Queries

  13. Pruning Join (PJ) • Compute skyline during join step • Assuming index on relation S • For each tuple of R, find joining tuples from S • Call ‘update’ for each such pair and current skyline • Discard the pair if it is already dominated else discard the pairs dominated by the current pair Relaxing Join and Selection Queries

  14. Pruning Join (PJ) (Contd.) • Advantage: Produces less pairs after join step as compared to Join First • Disadvantage Need for modification of join methods Relaxing Join and Selection Queries

  15. Pruning Join+ (PJ+) • Modifies pruning Join Algorithm • Computes “local skylines” for records joining with a record of othe relation • Does Dominance checking within local skyline • If a pair is locally dominated then it will not be in global skyline Relaxing Join and Selection Queries

  16. Pruning Join+ (PJ+) (Contd) R (A, B, C)S (C, D, E) Relaxing Join and Selection Queries

  17. Pruning Join+ (PJ+) (Contd) • Does local pruning hoping to eliminate some S tuples beforehand • May not be always beneficial, when many tuples are not eliminated • Experimental results show that the tradeoff depends upon factors like number of conditions Relaxing Join and Selection Queries

  18. Sorted Access Join (SAJ) • Constructs sorted list of tuple IDs for each select condition based on relaxation • Go through the lists in round robin fashion • For each record in Li find records which can join -- (p,q) • Stop if all the current records in list Lj (i≠j), have relaxation greater that (p,q) Relaxing Join and Selection Queries

  19. Sorted Access Join (SAJ) (Contd) Relaxing Join and Selection Queries

  20. Relaxing All Conditions • Relaxing join conditions along with selection conditions • Assumes multidimensional indexed structure on selection and join conditions • R-Tree on both relations Relaxing Join and Selection Queries

  21. MIDIR • Traverses both R-Trees top down to identify potential pairs which can be in relaxation skyline • Push such pairs in queue • In each iteration, pop one pair and perform dominance checking • If object-object pair, call Update() Relaxing Join and Selection Queries

  22. Dominance checking in MIDIR Compute lower and upper bound on relaxation Pair p’ dominates p if for each condition Ci • To compute the lower and upper bound for Ci (selection condition) convert itinto interval Relaxing Join and Selection Queries

  23. Dominance checking in MIDIR (Contd) • B: MRB or tuple • A: attribute used in B • I(B,A): interval of A in B • Selection Condition: • Join Condition Relaxing Join and Selection Queries

  24. Dominance checking in MIDIR (Contd) • Minimum distance between two intervals Then MINDIST = 0 else MINDIST = Relaxing Join and Selection Queries

  25. Variants of Query Relaxation • Top-k answers • User can specify weights on the conditions • K points that have smallest weighted summation • Store the k best skyline answers • Modify Update() in PJ • Priority queues in MIDIR Relaxing Join and Selection Queries

  26. Variants of Query Relaxation (Contd) • Queries with multiple joins • JF – compute all joins beforehand • SAJ – access multiple index structures to find the joining tuples • MIDIR – queue maintains possible vectors • Relaxation of nonnumeric attributes • RELAX(r,c) an be modified • All algorithms can refer to this computed value Relaxing Join and Selection Queries

  27. Variants of Query Relaxation (Contd) Lattice Structure Relaxing Join and Selection Queries

  28. Lattice Traversal • At least one result should be returned • Examine the reasons for the empty result – • Join condition Cj is strict • Relax Cj with the tuples provided by applying selection conditions • Either CR or Cs is strict • Relax the strict selection condition • If join still strict then relax join with either of the selections Relaxing Join and Selection Queries

  29. Lattice Traversal (Contd) • Both CR or Cs are strict • Relax both selection conditions • If join condition still strict, relax join condition • If still no result, relax all togather Relaxing Join and Selection Queries

  30. Experiments • Experimental Settings • Two real and three synthetic dabases • Three types of correlations • Independent • Correlated • Anti-correlated Relaxing Join and Selection Queries

  31. Experiments (different data size) Relaxing Join and Selection Queries

  32. Experiments (different data sizes) Relaxing Join and Selection Queries

  33. Experiments (Different join cardinality) Relaxing Join and Selection Queries

  34. Experiments (Different selection conditions) Relaxing Join and Selection Queries

  35. Summary • JF and PJ • Almost same performance for relaxation of selection conditions • PJ and PJ+ • PJ works faster, but performance gets affected by join cardinality • SAJ • Efficient for correlated data and query conditions Relaxing Join and Selection Queries

  36. Conclusion • Efficient relaxation algorithm for relaxing join and selection queries • Lattice traversal method for minimal relaxation Relaxing Join and Selection Queries

  37. Related Work (ML for query relaxation ) • LOQR algorithm proposed which is for queries in disjunctive normal form • Three steps • Exacting domain knowledge • Finding the most useful rule • Relaxing the failing query Relaxing Join and Selection Queries

  38. Introduction Let the query by the user be • The query fails because • Laptops with large screen weights more than 3 pounds • Fast laptops with large HDD cost more than $2000 Relaxing Join and Selection Queries

  39. Extracting domain knowledge Take subset D of the target database Find “typical values” of the other attributes for each constraints Forms a new dataset Di, with values of the constraint attribute indicated a Boolean Relaxing Join and Selection Queries

  40. Extracting domain knowledge (contd) Relaxing Join and Selection Queries

  41. Finding the most useful rule • For rules formed using all the constraints individually, find the most similar rule by nearest neighbour search • R1 is the most similar as they differ only on the price attribute • It is guaranteed to give a result as it gives a result on the subset of the target database Relaxing Join and Selection Queries

  42. Relaxing the failing query Most similar query is combined with the original query to give the relaxed query Relaxing Join and Selection Queries

  43. References • Relaxing Join and selection Queries Nick Koudas, Chen Li, and Anthony K. H. Tung, Rares Vernica • Machine learning for online query relaxation Ion Muslea Relaxing Join and Selection Queries

More Related