1 / 20

Parallel Query Optimization

Parallel Query Optimization. Memory. One tuple at a time. Bucket B. Bucket A. Bucket Sizes and I/O Costs. Bucket B does not fit in the memory in its entirety, It must be loaded several times. Memory. One tuple at a time. Bucket B(1). Bucket A(1). Bucket A(2). Bucket B(2). Bucket A(3).

rolf
Download Presentation

Parallel Query Optimization

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Parallel Query Optimization Parallel Query Optimization

  2. Memory One tuple at a time Bucket B Bucket A Bucket Sizes and I/O Costs • Bucket B does not fit in the memory in its entirety, It must be loaded several times. Parallel Query Optimization

  3. Memory One tuple at a time Bucket B(1) Bucket A(1) Bucket A(2) Bucket B(2) Bucket A(3) Bucket B(3) Fit in Memory • Bucket B fits in memory. It needs to be loaded only once. Parallel Query Optimization

  4. Hash-Based Join Parallel Query Optimization

  5. GRACE Algorithm Parallel Query Optimization

  6. Data Skew • System performance is very sensitive to the skewness in tuple distribution. Parallel Query Optimization

  7. Zipf-like Distribution Total: 1,000,000tuples Parallel Query Optimization

  8. Partition Tuning • Best Fit Decreasing Strategy: • In this partition tuning strategy, the hash buckets are first sorted into decreasing order according to size. • In each iteration, the currently largest bucket is assigned to the currently smallest partition (or PN). • This process is repeated until all the buckets have been allocated. • This is a dynamic load balancing technique. Parallel Query Optimization

  9. Best Fit Decreasing Strategy Parallel Query Optimization

  10. Adaptive Load Balancing (ABJ+) Parallel Query Optimization

  11. ABJ+ vs. GRACE Parallel Query Optimization

  12. L_LBO in Multi-way Join Queries • L_LBO: Linear Tree with Load Balancing • A multi-way join query is treated as a sequential order of two-way (or single) joins by using ABJ+. Parallel Query Optimization

  13. B_NLB in Multi-way Join Queries • B_NLB: Bushy Tree without Load Balancing • It tries to join as many pairs of relations as possible. Split Phase: Each PN partitions its portion of each relation into small subbuckets and each subbuckets is transferred to PN corresponding to the bucket ID. Join Phase: Each PN performs the local joins. Parallel Query Optimization

  14. NLBO in Multi-way Join Queries • NLBO: No Load Balancing Optimization • Like B_NLB, it tries to join as many pairs of relations as possible. Hash Phase: Each PN partitions its portion of each relation into small subbuckets and stores them back to its own disks. Partition Tuning Phase: It allocates the buckets to the PNs using the Best Fit Decreasing Strategy. Join Phase: Each PN performs the local joins. Parallel Query Optimization

  15. LBO in Multi-way Join Queries • LBO: Load Balancing Optimization Hash Phase: hashed and stored back into local disks. Optimization Phase: using best fit decreasing strategy and a greedy algorithm to select joins which will be executed concurrently. Executing Phase: Stage 1: Tune the partitions. Stage 2: Perform the join operation. Stage 3: Update the join graph, then go to Optimization Phase. Parallel Query Optimization

  16. Optimization Phase of LBO Parallel Query Optimization

  17. Effect of Bucket Skew Parallel Query Optimization

  18. LBO-FR • LBO-SFR: LBO with Fragment & Replicate Feature • LBO-FR is similar to LBO, except it partitions bucket pairs into subbucket pairs if those buckets are too large. • Example: suppose bucket pair (S1, R1) is too large and |S1| > |R2|. Parallel Query Optimization

  19. LBO-SFR • LBO-SFR: LBO with Symmetric Fragment & Replicate Feature |S1,1,1|>|R1,1,1| |S1|>|R1| |S1,1,1|<|R1,1,1| Parti. S1 Parti. R1 Parti. S1 Parallel Query Optimization

  20. Effect of Bucket Skew Parallel Query Optimization

More Related