CURE for Cubes: Cubing Using a ROLAP Engine

VLDB 2006. CURE for Cubes: Cubing Using a ROLAP Engine. Konstantinos Morfonios, Yannis Ioannidis. University of Athens. Outline: Introduction, Execution Plan, External Partitioning, Storage Format, Experimental Evaluation, Conclusions.

Presentation Transcript


  1. VLDB 2006. CURE for Cubes: Cubing Using a ROLAP Engine. Konstantinos Morfonios, Yannis Ioannidis. University of Athens

  2. Introduction Execution Plan External Partitioning Storage Format Experimental Evaluation Conclusions

  4. Introduction. GrayOnData-warehousing: CUBE. Example query: SELECT region, SUM(revenue) FROM SALES WHERE month = 'September' GROUP BY region

  5. Introduction: CUBE. SELECT A, B, C, SUM(M) FROM R GROUP BY A, B, C; SELECT A, B, SUM(M) FROM R GROUP BY A, B; SELECT SUM(M) FROM R
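
The slide lists three of the group-by queries that make up a cube over A, B, C. As a minimal sketch, assuming the cube is the full set of group-bys over every subset of the dimensions, the queries can be enumerated as follows. The function name `cube_queries` is hypothetical and not part of the presentation; `R`, `M`, and the dimension names come from the slide.

```python
from itertools import combinations

def cube_queries(dims, measure, table):
    """Return one SQL GROUP BY query per subset of dims (hypothetical helper)."""
    queries = []
    # Go from the most detailed grouping down to the grand total.
    for k in range(len(dims), -1, -1):
        for subset in combinations(dims, k):
            if subset:
                cols = ", ".join(subset)
                queries.append(
                    f"SELECT {cols}, SUM({measure}) FROM {table} GROUP BY {cols}"
                )
            else:
                # Empty grouping: a single aggregate over the whole table.
                queries.append(f"SELECT SUM({measure}) FROM {table}")
    return queries

queries = cube_queries(["A", "B", "C"], "M", "R")
for q in queries:
    print(q)
```

With three flat dimensions this yields 2^3 = 8 queries, one per node of the lattice ABC, AB, AC, BC, A, B, C, ∅.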

  6. Introduction • Problems • Construction algorithm • Storage scheme • Focusing on ROLAP techniques (MVs) • Stressed to limits? • Complete solution? Unclear (not finished with efficient storage) Unclear (not focused on hierarchies)

  7. Introduction. Challenges of hierarchies ⇒ CURE: • Number of nodes: often [formula missing] ⇒ Efficient execution plan • Small domains in the higher levels of dimension hierarchies ⇒ New partitioning algorithm • Number of tuples increases ⇒ Novel storage scheme

  10. Execution Plan • Extend BUC (Bottom-Up-Cube) [BR99] • Efficient pipelining • Cheap identification of some kinds of redundancy • Inherent support for iceberg cubes and holistic functions • Existing “BUC-based” methods: BU-BST [WLFY02] and QC-Tables [LPH02]
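
To make the BUC idea concrete, here is a minimal in-memory sketch of bottom-up cubing with iceberg pruning on COUNT. It follows the general scheme of BUC [BR99] (aggregate the current partition, then partition on each remaining dimension and recurse) and is not CURE's implementation; the function `buc`, the row layout, and the sample data are assumptions made for illustration.

```python
from collections import defaultdict

def buc(rows, dims, measure, min_support=1):
    """Sketch of BUC-style cubing. Returns {grouping: (count, sum)} where a
    grouping is a tuple of (dimension, value) pairs; () is the grand total."""
    cube = {}

    def recurse(part, start, prefix):
        # Output the aggregate of the current partition.
        cube[prefix] = (len(part), sum(r[measure] for r in part))
        # Refine the grouping with each remaining dimension in turn.
        for i in range(start, len(dims)):
            groups = defaultdict(list)
            for r in part:
                groups[r[dims[i]]].append(r)
            for value, group in groups.items():
                # Iceberg pruning: a partition below the threshold cannot
                # contain any finer grouping that reaches it, so skip it.
                if len(group) >= min_support:
                    recurse(group, i + 1, prefix + ((dims[i], value),))

    if len(rows) >= min_support:
        recurse(rows, 0, ())
    return cube

rows = [
    {"A": "a1", "B": "b1", "M": 10},
    {"A": "a1", "B": "b2", "M": 20},
    {"A": "a2", "B": "b1", "M": 5},
]
full_cube = buc(rows, ["A", "B"], "M")
iceberg = buc(rows, ["A", "B"], "M", min_support=2)
```

Because each recursive call sees only the tuples of one partition, pruning a partition early removes all of its descendants from the computation, which is the property the later slides exploit.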

  11. Execution Plan. Dimensions: A, B, C. Lattice nodes by level: ABC; AB, AC, BC; A, B, C; ∅

  12. Execution Plan. Dimensions: A0→A1→A2, B0→B1, C0

  13. Execution Plan. Dimensions: A0, A1, A2, B0, B1, C0. Lattice nodes by level: A0B0C0, A0B1C0, A1B0C0, A1B1C0, A2B0C0, A2B1C0; A0B0, A0B1, A0C0, A1B0, A1B1, A1C0, A2B0, A2B1, A2C0, B0C0, B1C0; A0, A1, A2, B0, B1, C0; ∅. Height: 3

  16. Execution Plan. Dimensions: A0→A1→A2, B0→B1, C0. Lattice nodes by level: A0B0C0; A0B0, A0B1C0, A1B0C0; A0B1, A0C0, A1B0, A1B1C0, A2B0C0; A0, A1B1, A1C0, A2B0, A2B1C0, B0C0; A1, A2B1, A2C0, B0, B1C0; A2, B1, C0; ∅. Height: 6
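
The two heights quoted on the slides (3 and 6) can be reproduced with a short sketch. It assumes that the flat layout ranks a node by how many dimensions it drops, and that the taller layout ranks a node by its total number of roll-up steps from A0B0C0, counting a dropped dimension as one step above its top level. That ranking is an interpretation of the node order in the transcript, not something the slides state; all function names are hypothetical.

```python
from itertools import product

# Hierarchy levels per dimension, as on the slides: A0→A1→A2, B0→B1, C0.
LEVELS = {"A": 3, "B": 2, "C": 1}

def lattice_nodes(levels):
    """Each node picks, per dimension, one hierarchy level or None (absent)."""
    choices = [list(range(n)) + [None] for n in levels.values()]
    return list(product(*choices))

def label(node, levels):
    """Render a node like 'A1B0'; the empty grouping is shown as '∅'."""
    text = "".join(f"{d}{lvl}" for d, lvl in zip(levels, node) if lvl is not None)
    return text or "∅"

def flat_rank(node):
    """Row in the flat layout: number of dimensions dropped."""
    return sum(lvl is None for lvl in node)

def tall_rank(node, levels):
    """Row in the taller layout: roll-up steps from the most detailed node."""
    return sum(n if lvl is None else lvl for lvl, n in zip(node, levels.values()))

nodes = lattice_nodes(LEVELS)
flat_height = max(flat_rank(n) for n in nodes)
tall_height = max(tall_rank(n, LEVELS) for n in nodes)

# Group node labels by their row in the taller layout.
rows = {}
for n in nodes:
    rows.setdefault(tall_rank(n, LEVELS), []).append(label(n, LEVELS))
for rank in sorted(rows):
    print(rank, rows[rank])
```

Under these assumptions the lattice has (3+1)(2+1)(1+1) = 24 nodes in both layouts; only the number of rows changes.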

  19. Execution Plan • Important properties of BUC-based cubing: • Recursive calls at higher levels tend to be cheaper • Benefits from pruning recursion early at some node N increase with the number of ancestors of N in the execution plan • Advantage of taller execution plans [Figures: lattice ABC; AB, AC, BC; A, B, C; ∅, and a second diagram with nodes ABC, AB, AC, A]

  20. Execution Plan. CURE's plan: [Figure: execution plan drawn on the lattice over A0→A1→A2, B0→B1, C0]

  23. External Partitioning [Figure labels: Memory, R]

  24. External Partitioning [Figure: the lattice over A0→A1→A2, B0→B1, C0, shown twice; labels: Memory, R]

  25. External Partitioning [Figure labels: Memory, R]

  26. External Partitioning [Figure labels: Partitions, Memory, R]

  27. External Partitioning [Figure labels: Partitions, Memory, R, Sound]

  28. External Partitioning • For sound partitioning: |Biggest partition| ≤ |M| • In flat datasets this holds in general • In hierarchical datasets…

  29. External Partitioning. |R| = 500 GB, |M| = 1 GB, |R|/|M| = 500. A0 (50,000) → A1 (500) → A2 (5) [Figure: lattice over A0→A1→A2, B0→B1, C0]

  37. External Partitioning. |R| = 500 GB, |M| = 1 GB, |R|/|M| = 500. A0 (50,000) → A1 (500) → A2 (5). A2B0C0 is |A0|/|A2| times smaller than R: |A2B0C0| ≈ 50 MB [Figure: lattice over A0→A1→A2, B0→B1, C0]
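
The numbers on this slide can be checked with a few lines of arithmetic. The sketch below assumes uniformly distributed values, so that partitioning R on a level with k distinct values gives partitions of size |R|/k, and it takes 1 GB = 1000 MB to match the slide's round figures. The helper names are hypothetical, and the sketch tests only the soundness condition |Biggest partition| ≤ |M|; it does not reproduce CURE's partitioning algorithm.

```python
def partition_size_gb(relation_gb, cardinality):
    """Partition size when R is split on a level with `cardinality` values,
    assuming a uniform distribution (an assumption, not from the slides)."""
    return relation_gb / cardinality

def sound_levels(relation_gb, memory_gb, cardinalities):
    """Levels whose (uniform) partitions fit in memory."""
    return [level for level, card in cardinalities.items()
            if partition_size_gb(relation_gb, card) <= memory_gb]

def rolled_up_size_gb(relation_gb, fine_card, coarse_card):
    """Estimated size of a node that is fine_card/coarse_card times smaller than R."""
    return relation_gb / (fine_card / coarse_card)

R_GB, M_GB = 500, 1
CARD = {"A0": 50_000, "A1": 500, "A2": 5}

levels = sound_levels(R_GB, M_GB, CARD)
a2_partition_gb = partition_size_gb(R_GB, CARD["A2"])
a2b0c0_mb = rolled_up_size_gb(R_GB, CARD["A0"], CARD["A2"]) * 1000

print("levels with partitions that fit in memory:", levels)
print("partition size when splitting on A2 (GB):", a2_partition_gb)
print("estimated |A2B0C0| (MB):", a2b0c0_mb)
```

Splitting on A2 alone would leave partitions of about 100 GB, far above |M|, which illustrates the slide's point that small domains at the higher hierarchy levels are the hard case, while the node A2B0C0 itself is small.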

  41. Storage Format • Two types of redundancy • Dimensional Redundancy (DR) • Aggregational Redundancy (AR)

  42. Storage Format. Example with flat cube only for simplicity [Figure: the lattice over A0→A1→A2, B0→B1, C0 next to the flat lattice ABC; AB, AC, BC; A, B, C; ∅]

  43. Storage Format. CUBE with DR; CUBE' without DR [Figure labels: t1, t2, t, t']

  46. Storage Format CUBE with DR CUBE’ without DR

  48. Storage Format. Classify tuples according to AR into: • Normal Tuples (NTs) • Trivial Tuples (TTs) • Common Aggregate Tuples (CATs) [Figure: CUBE with DR; CUBE' without DR]

  49. Storage Format
