
Skyline Query Processing for Incomplete Data

Skyline Query Processing for Incomplete Data. Mohamed E. Khalefa, Mohamed F. Mokbel, Justin J. Levandoski. Department of Computer Science and Engineering, University of Minnesota, Minneapolis, MN, USA. ICDE 2008. Outline: Introduction, Problem Formulation, Methods and Algorithms


Presentation Transcript


  1. Skyline Query Processing for Incomplete Data • Mohamed E. Khalefa, Mohamed F. Mokbel, Justin J. Levandoski • Department of Computer Science and Engineering, University of Minnesota, Minneapolis, MN, USA • ICDE 2008

  2. Outline • Introduction • Problem Formulation • Methods and Algorithms • Experiment Results • Conclusion

  3. Introduction • Existing skyline algorithms assume: 1. Data are complete (all dimensions are available for all data points). 2. The dominance relation is transitive: if p1 dominates p2 and p2 dominates p3, then p1 dominates p3.

  4. (Cont.) • If data are incomplete: 1. Some dimensions have no value. 2. The dominance relation is not transitive: p1 dominates p2 and p2 dominates p3, but p1 does not dominate p3; instead, p3 dominates p1. Dominance can therefore be cyclic.
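The cycle described on this slide can be reproduced with a small sketch. It assumes that larger values are better and that two points are compared only on dimensions known in both. The coordinates of `p1`, `p2`, and `p3` are hypothetical values chosen for illustration (the slide's own figure is not in the transcript), and `None` marks an unknown dimension.

```python
from typing import Optional, Sequence

Point = Sequence[Optional[float]]

def dominates(p: Point, q: Point) -> bool:
    """Incomplete-data dominance, larger is better (assumed convention)."""
    strictly_better = False
    for a, b in zip(p, q):
        if a is None or b is None:
            continue  # ignore dimensions unknown in either point
        if a < b:
            return False  # p is worse on a shared dimension
        if a > b:
            strictly_better = True
    return strictly_better

# Hypothetical points: each pair shares exactly one known dimension.
p1 = (2, None, 1)
p2 = (1, 2, None)
p3 = (None, 1, 2)

print(dominates(p1, p2))  # True  (shared dimension 1: 2 > 1)
print(dominates(p2, p3))  # True  (shared dimension 2: 2 > 1)
print(dominates(p1, p3))  # False (shared dimension 3: 1 < 2)
print(dominates(p3, p1))  # True  (the cycle closes)
```

Because each pair overlaps on a different dimension, no single ordering is consistent across all three points, which is why algorithms that rely on transitivity break down here.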

  5. Problem Formulation • Dominance relation for incomplete data: P dominates Q if 1. There is at least one dimension ui where both P.ui and Q.ui are known, and P.ui > Q.ui. 2. For every other dimension uj, j ≠ i, either P.uj is unknown, Q.uj is unknown, or P.uj ≥ Q.uj. • Example: p1 dominates p2. p2 does not dominate p3, and p3 does not dominate p2.
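The two conditions of the definition map directly onto code. The following is a minimal sketch, assuming unknown values are encoded as `None`; the function name `dominates` and the sample points are illustrative and not taken from the paper.

```python
from typing import Optional, Sequence

Point = Sequence[Optional[float]]

def dominates(p: Point, q: Point) -> bool:
    """Return True if p dominates q under the incomplete-data definition."""
    # Only dimensions known in both points take part in the comparison.
    common = [(a, b) for a, b in zip(p, q) if a is not None and b is not None]
    cond1 = any(a > b for a, b in common)   # condition 1: strictly better somewhere
    cond2 = all(a >= b for a, b in common)  # condition 2: never worse elsewhere
    return cond1 and cond2

# Hypothetical points for illustration.
a = (5, None, 3)
b = (4, 2, None)      # shares only dimension 1 with a
c = (None, 7, None)   # shares no dimension with a

print(dominates(a, b))  # True: 5 > 4 on the only shared dimension
print(dominates(a, c))  # False: nothing to compare
print(dominates(c, a))  # False: nothing to compare
```

Two points with no shared known dimension are incomparable: neither can satisfy condition 1.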

  6. (Cont.) • Bitmap representation: 0 = unknown dimension, 1 = known dimension. • Example: p1.B AND p2.B = 100 ← comparable; p1.B AND p3.B = 000 ← incomparable.
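A bitmap like this can be kept as a small integer, so the comparability test becomes a single bitwise AND. The sketch below assumes `None` encodes an unknown value; the helper names `bitmap` and `comparable` and the point values are hypothetical, chosen so that the AND results match the slide's example (100 and 000).

```python
from typing import Optional, Sequence

Point = Sequence[Optional[float]]

def bitmap(p: Point) -> int:
    """Encode known dimensions as 1 bits, first dimension in the highest bit."""
    bits = 0
    for v in p:
        bits = (bits << 1) | int(v is not None)
    return bits

def comparable(p: Point, q: Point) -> bool:
    """Two points are comparable if they share at least one known dimension."""
    return (bitmap(p) & bitmap(q)) != 0

# Hypothetical three-dimensional points.
p1 = (3, None, None)   # bitmap 100
p2 = (5, 2, None)      # bitmap 110
p3 = (None, None, 4)   # bitmap 001

print(format(bitmap(p1) & bitmap(p2), "03b"), comparable(p1, p2))  # 100 True
print(format(bitmap(p1) & bitmap(p3), "03b"), comparable(p1, p3))  # 000 False
```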

  7. Methods and Algorithms • The Replacement Algorithm. • The Bucket Algorithm. • The ISkyline Algorithm.

  8. The Replacement Algorithm • Replace each unknown dimension with −∞. • Use a traditional skyline algorithm to get Ssky. (Figure: incomplete data, with unknown values shown as “–”, is converted by replacement into complete data, from which Ssky is computed.)
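The two steps on this slide can be sketched as follows. The replacement value is taken to be −∞, so that a missing value never beats a known one under larger-is-better dominance. The "traditional" skyline here is a naive quadratic scan standing in for any complete-data algorithm. All function names are hypothetical; the input points are the six points used in the walkthrough slides later in the deck.

```python
import math
from typing import List, Optional, Sequence, Tuple

Point = Sequence[Optional[float]]

def replace_unknown(p: Point) -> Tuple[float, ...]:
    """Step 1: substitute -inf for every unknown (None) dimension."""
    return tuple(-math.inf if v is None else v for v in p)

def dominates_complete(p: Sequence[float], q: Sequence[float]) -> bool:
    """Classic dominance on complete data (larger is better)."""
    return all(a >= b for a, b in zip(p, q)) and any(a > b for a, b in zip(p, q))

def replacement_skyline(points: List[Point]) -> List[Tuple[float, ...]]:
    """Step 2: run a traditional skyline (naive scan) on the replaced data."""
    complete = [replace_unknown(p) for p in points]
    return [p for p in complete
            if not any(dominates_complete(o, p) for o in complete)]

points = [(6, 4, None), (9, None, 1), (None, 3, 1),
          (9, 3, None), (6, None, 1), (None, 6, 5)]
s_sky = replacement_skyline(points)
print(len(s_sky))  # 4
```

The sketch stops at Ssky, as the slide does. For these six points Ssky holds four entries, which illustrates the later remark that Ssky tends to be larger than the Bucket Algorithm's candidate list.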

  9. The Bucket Algorithm • Divide all incoming points into distinct buckets, where all points in a bucket have the same bitmap representation. • Compute the skyline of each bucket: the local skyline. • Collect all local skylines in one list, termed the candidate skyline. • Perform an exhaustive pairwise comparison among all points to get the query answer.
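A minimal sketch of the four bullets above, under two assumptions: unknown values are encoded as `None`, and the final "exhaustive pairwise comparison" checks each candidate against every input point (needed because a point that loses inside its own bucket can still dominate points in other buckets). Function names are hypothetical; the input is the six points from the walkthrough slides.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Sequence, Tuple

Point = Sequence[Optional[float]]

def dominates(p: Point, q: Point) -> bool:
    """Incomplete-data dominance, larger is better."""
    common = [(a, b) for a, b in zip(p, q) if a is not None and b is not None]
    return any(a > b for a, b in common) and all(a >= b for a, b in common)

def bitmap(p: Point) -> Tuple[bool, ...]:
    """Bucket key: which dimensions are known."""
    return tuple(v is not None for v in p)

def bucket_skyline(points: List[Point]) -> List[Point]:
    # 1. Group points by bitmap.
    buckets: Dict[Tuple[bool, ...], List[Point]] = defaultdict(list)
    for p in points:
        buckets[bitmap(p)].append(p)
    # 2-3. Local skyline per bucket, collected into one candidate list.
    candidates: List[Point] = []
    for members in buckets.values():
        candidates.extend(p for p in members
                          if not any(dominates(o, p) for o in members))
    # 4. Exhaustive check of every candidate against all input points.
    return [c for c in candidates
            if not any(dominates(o, c) for o in points)]

points = [(6, 4, None), (9, None, 1), (None, 3, 1),
          (9, 3, None), (6, None, 1), (None, 6, 5)]
print(bucket_skyline(points))  # [(None, 6, 5)]
```

Inside a bucket all points share the same known dimensions, so dominance is transitive there and any complete-data skyline routine could replace the naive scan in step 2.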

  10. (Cont.) (Figure: local skyline, candidate skyline, and global skyline in the Bucket Algorithm.)

  11. (Cont.) • In general, the Bucket Algorithm performs better than the Replacement Algorithm because the candidate list is likely to be smaller than the set Ssky in the Replacement Algorithm. • The candidate skyline may still be excessively large. • It misses a chance to use the bucket data to reduce the number of comparisons.

  12. The ISkyline Algorithm • Virtual Points • Shadow Points • The ISkyline Algorithm

  13. Virtual Points • P1, P2, Q1, P3, and P4 arrive in that order.

  14. Shadow Points • Q1 dominates P3 => add virtual point Q1v to node P's local skyline. • Q4 is dominated by P3, but only the local skyline is checked, so Q4 is not detected as dominated.

  15. (Cont.) • Shadow points: points that are dominated only by virtual points. • Q1 is dominated by S4v. Q3 is dominated by S4v.

  16. The ISkyline Algorithm • Phase I: insert point P. 1. If P is dominated by a real point in the local skyline => remove P. 2. If P is dominated by a virtual point in the local skyline => insert P into the shadow skyline. 3. If P is a local skyline point => insert P into the candidate skyline (go to Phase II). • Phase II: if the size of the candidate skyline exceeds t => insert the candidates into the global skyline.
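The three Phase I cases can be sketched as a small classification routine. This covers only the case analysis on this slide; it is not the full ISkyline algorithm, and it omits Phase II and the bookkeeping that moves existing points between lists. The names `virtual_point` and `classify` are hypothetical. The virtual point is assumed to be the dominating point projected onto the node's known dimensions, which is consistent with Q1(9,-,1) becoming Q1v(9,-,-) in node P = 110 in the walkthrough.

```python
from typing import List, Optional, Sequence, Tuple

Point = Sequence[Optional[float]]

def dominates(p: Point, q: Point) -> bool:
    """Incomplete-data dominance, larger is better."""
    common = [(a, b) for a, b in zip(p, q) if a is not None and b is not None]
    return any(a > b for a, b in common) and all(a >= b for a, b in common)

def virtual_point(q: Point, node_bitmap: Sequence[bool]) -> Tuple[Optional[float], ...]:
    """Keep only the dimensions of q that are known in the target node."""
    return tuple(v if known else None for v, known in zip(q, node_bitmap))

def classify(p: Point, local_real: List[Point], local_virtual: List[Point]) -> str:
    """Phase I decision for a newly arriving point p in its node."""
    if any(dominates(r, p) for r in local_real):
        return "removed"      # case 1: dominated by a real local skyline point
    if any(dominates(v, p) for v in local_virtual):
        return "shadow"       # case 2: dominated only by virtual points
    return "candidate"        # case 3: p is a local skyline point

node_p = (True, True, False)               # node P = 110
q1v = virtual_point((9, None, 1), node_p)  # (9, None, None), i.e. Q1v(9,-,-)

print(classify((9, 3, None), [], [q1v]))              # candidate
print(classify((6, 4, None), [], [q1v]))              # shadow
print(classify((5, 2, None), [(9, 3, None)], [q1v]))  # removed
```

The first call mirrors P2(9,3,-) in the walkthrough: it ties with Q1v on the first dimension, is not dominated, and so becomes a candidate.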

  17. (Cont.) t=2 P1(6,4,-) Global skyline Candidate skyline P1(6,4,-) Node P = 110 Node Q= 101 Node R= 011

  18. (Cont.) t=2 P1(6,4,-) Global skyline Candidate skyline P1 P1(6,4,-) Node P = 110 Node Q= 101 Node R= 011

  19. (Cont.) t=2 P1(6,4,-) Global skyline Candidate skyline P1 P1(6,4,-) Node P = 110 Node Q= 101 Node R= 011

  20. (Cont.) t=2 P1(6,4,-) Q1(9,-,1) Global skyline Candidate skyline P1 Q1 P1(6,4,-) Q1(9,-,1) Node P = 110 Node Q= 101 Node R= 011

  21. (Cont.) t=2 P1(6,4,-) Q1(9,-,1) Global skyline Candidate skyline P1 Q1 P1(6,4,-) Q1(9,-,1) Node P = 110 Node Q= 101 Node R= 011

  22. (Cont.) t=2 P1(6,4,-) Q1(9,-,1) Global skyline Candidate skyline P1 Q1 Q1v(9,-,-) Q1(9,-,1) P1(6,4,-) Node P = 110 Node Q= 101 Node R= 011 P1(6,4,-) Shadow skyline

  23. (Cont.) t=2 P1(6,4,-) Q1(9,-,1) R1(-,3,1) Global skyline Candidate skyline Q1 R1 Q1v(9,-,-) Q1(9,-,1) R1(-,3,1) P1(6,4,-) Node P = 110 Node Q= 101 Node R= 011 P1(6,4,-) Shadow skyline

  24. (Cont.) t=2 P1(6,4,-) Q1(9,-,1) R1(-,3,1) P2(9,3,-) Global skyline Candidate skyline Q1 R1 Q1v(9,-,-) Q1(9,-,1) R1(-,3,1) P2(9,3,-) P1(6,4,-) Node P = 110 Node Q= 101 Node R= 011 P1(6,4,-) Shadow skyline

  25. (Cont.) t=2 P1(6,4,-) Q1(9,-,1) R1(-,3,1) P2(9,3,-) Global skyline Candidate skyline Q1 R1 P2 Q1v(9,-,-) Q1(9,-,1) R1(-,3,1) P2(9,3,-) |Candidate skyline|>2 Insert to Global skyline P1(6,4,-) Node P = 110 Node Q= 101 Node R= 011 P1(6,4,-) Shadow skyline

  26. (Cont.) Compare against Shadow skyline t=2 P1(6,4,-) Q1(9,-,1) R1(-,3,1) P2(9,3,-) Global skyline Q1 R1 P2 Candidate skyline Q1v(9,-,-) Q1(9,-,1) R1(-,3,1) P2(9,3,-) P1(6,4,-) Node P = 110 Node Q= 101 Node R= 011 P1(6,4,-) Shadow skyline

  27. (Cont.) R1 is dominated by P1 t=2 P1(6,4,-) Q1(9,-,1) R1(-,3,1) P2(9,3,-) Global skyline Q1 R1 P2 Candidate skyline Q1v(9,-,-) Q1(9,-,1) R1(-,3,1) P2(9,3,-) P1(6,4,-) Node P = 110 Node Q= 101 Node R= 011 P1(6,4,-) Shadow skyline

  28. (Cont.) t=2 P1(6,4,-) Q1(9,-,1) R1(-,3,1) P2(9,3,-) Global skyline Q1 P2 Candidate skyline Q1v(9,-,-) Q1(9,-,1) R1(-,3,1) P2(9,3,-) P1(6,4,-) Node P = 110 Node Q= 101 Node R= 011 P1(6,4,-) Shadow skyline

  29. (Cont.) t=2 P1(6,4,-) Q1(9,-,1) R1(-,3,1) P2(9,3,-) Q2(6,-,1) Global skyline Q1 P2 Candidate skyline Q1v(9,-,-) Q1(9,-,1) R1(-,3,1) P2(9,3,-) P1(6,4,-) Node P = 110 Node Q= 101 Node R= 011 P1(6,4,-) Shadow skyline

  30. (Cont.) t=2 P1(6,4,-) Q1(9,-,1) R1(-,3,1) P2(9,3,-) Q2(6,-,1) Global skyline Q1 P2 Candidate skyline Q1v(9,-,-) Q1(9,-,1) R1(-,3,1) Q2(6,-,1) P2(9,3,-) P1(6,4,-) Node P = 110 Node Q= 101 Node R= 011 P1(6,4,-) Shadow skyline

  31. (Cont.) t=2 P1(6,4,-) Q1(9,-,1) R1(-,3,1) P2(9,3,-) Q2(6,-,1) R2(-,6,5) Global skyline Q1 P2 Candidate skyline R1(-,3,1) Q1v(9,-,-) Q1(9,-,1) Q2(6,-,1) R2(-,6,5) P2(9,3,-) P1(6,4,-) Node P = 110 Node Q= 101 Node R= 011 P1(6,4,-) Shadow skyline

  32. (Cont.) t=2 P1(6,4,-) Q1(9,-,1) R1(-,3,1) P2(9,3,-) Q2(6,-,1) R2(-,6,5) Global skyline Q1 P2 Candidate skyline Q1v(9,-,-) Q1(9,-,1) R2(-,6,5) Q2(6,-,1) R1(-,3,1) P2(9,3,-) P1(6,4,-) Node P = 110 Node Q= 101 Node R= 011 R2 dominates R1 P1(6,4,-) Shadow skyline

  33. (Cont.) Check Candidate skyline and Global skyline t=2 P1(6,4,-) Q1(9,-,1) R1(-,3,1) P2(9,3,-) Q2(6,-,1) R2(-,6,5) Global skyline Q1 P2 Candidate skyline R2 Q1v(9,-,-) Q1(9,-,1) R2(-,6,5) Q2(6,-,1) R1(-,3,1) P2(9,3,-) P1(6,4,-) Node P = 110 Node Q= 101 Node R= 011 P1(6,4,-) Shadow skyline

  34. (Cont.) Q1 and P2 are dominated by R2 t=2 P1(6,4,-) Q1(9,-,1) R1(-,3,1) P2(9,3,-) Q2(6,-,1) R2(-,6,5) Global skyline Q1 P2 Candidate skyline R2 Q1v(9,-,-) Q1(9,-,1) R2(-,6,5) Q2(6,-,1) R1(-,3,1) P2(9,3,-) P1(6,4,-) Node P = 110 Node Q= 101 Node R= 011 P1(6,4,-) Shadow skyline

  35. (Cont.) Global skyline: Global skyline Candidate skyline t=2 P1(6,4,-) Q1(9,-,1) R1(-,3,1) P2(9,3,-) Q2(6,-,1) R2(-,6,5) Global skyline Candidate skyline R2 Q1v(9,-,-) Q1(9,-,1) R2(-,6,5) Q2(6,-,1) R1(-,3,1) P2(9,3,-) P1(6,4,-) Node P = 110 Node Q= 101 Node R= 011 P1(6,4,-) Shadow skyline

  36. (Cont.) Result is Global skyline: R2 t=2 P1(6,4,-) Q1(9,-,1) R1(-,3,1) P2(9,3,-) Q2(6,-,1) R2(-,6,5) Global skyline R2 Candidate skyline Q1v(9,-,-) Q1(9,-,1) R2(-,6,5) Q2(6,-,1) R1(-,3,1) P2(9,3,-) P1(6,4,-) Node P = 110 Node Q= 101 Node R= 011 P1(6,4,-) Shadow skyline
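As a cross-check on the walkthrough, the six points can be run through a brute-force pairwise comparison using the dominance definition from slide 5. This is a verification sketch, not ISkyline itself: it lists, for each point, which other points dominate it. The variable names are hypothetical; the coordinates are those shown on the slides, with `None` for "-".

```python
from typing import Dict, List, Optional, Sequence

Point = Sequence[Optional[float]]

def dominates(p: Point, q: Point) -> bool:
    """Incomplete-data dominance, larger is better."""
    common = [(a, b) for a, b in zip(p, q) if a is not None and b is not None]
    return any(a > b for a, b in common) and all(a >= b for a, b in common)

walkthrough: Dict[str, Point] = {
    "P1": (6, 4, None), "Q1": (9, None, 1), "R1": (None, 3, 1),
    "P2": (9, 3, None), "Q2": (6, None, 1), "R2": (None, 6, 5),
}

# For every point, collect the names of all points that dominate it.
dominators: Dict[str, List[str]] = {
    name: [other for other, o in walkthrough.items() if dominates(o, p)]
    for name, p in walkthrough.items()
}
skyline = [name for name, doms in dominators.items() if not doms]

for name, doms in dominators.items():
    print(name, "dominated by", doms)
print("skyline:", skyline)  # skyline: ['R2']
```

The dominator lists agree with the slide annotations (R1 is dominated by P1, R2 dominates R1, and Q1 and P2 are dominated by R2), and R2 is the only point left undominated.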

  37. Experiment Results

  38. (Cont.)

  39. (Cont.)

  40. Conclusion • Based on traditional skyline queries: the Replacement Algorithm and the Bucket Algorithm. • New method: the ISkyline Algorithm. • The ISkyline Algorithm performs best of the three.
