1 / 27

280 likes | 442 Views

Revisiting Pipelined Parallelism in Multi-Join Query Processing. Bin Liu and Elke A. Rundensteiner Worcester Polytechnic Institute (binliu|rundenst)@cs.wpi.edu. http://www.davis.wpi.edu/dsrg. Multi-Join Queries. Data Integration over Distributed Data Sources

Download Presentation
## Revisiting Pipelined Parallelism in Multi-Join Query Processing

**An Image/Link below is provided (as is) to download presentation**
Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.
Content is provided to you AS IS for your information and personal use only.
Download presentation by click this link.
While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.
During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

**Revisiting Pipelined Parallelism in Multi-Join Query**Processing Bin Liu and Elke A. Rundensteiner Worcester Polytechnic Institute (binliu|rundenst)@cs.wpi.edu http://www.davis.wpi.edu/dsrg VLDB 2005**Multi-Join Queries**• Data Integration over Distributed Data Sources • i.e., Extract Transform Load (ETL) Services Data Source Data Warehouse Data Source … … Data Warehouse Data Source Persistent Storage (1) High IO costs given large intermediate results (2) Disk access undesirable since one time process VLDB 2005**Applying Parallelism**• Processed in Main Memory of a PC Cluster • Make use of aggregated resources (main memory, CPU) Network Clusters of Machines VLDB 2005**Three Types of Parallelism**Pipelined: Operators be composed into producer and consumer relationship Independent: Independent operators run simultaneously on distinct machines Partitioned: Single operator replicated and run on multiple machines VLDB 2005**key**value ID Date … OID Item … 0011 5/13 … 0011 IPC … (2) Probe hash tables and output results 0012 5/14 … 0012 HPS … (1) Build hash tables of Orders based on ID … … … … … … Orders LineItems OID Item … ID Date … 0011 IPC … 0011 5/13 … 0012 HPS … 0012 5/14 … … … … … … … LineItems Orders Basics of Hash Join • Two-Phase Hash Join [SD89, LTS90] • Demonstrated High Performance • Potential High Degree of Parallelism VLDB 2005**key**value key value key value (2) Probe hash tables and output results Split (1) Build hash tables of Orders based on ID OID Item … ID Date … 0011 IPC … 0011 5/13 … 0012 HPS … 0012 5/14 … … … … … … … LineItems Orders Partitioned Hash Join • Partition (Inputs) Hash Tables across Processors • Have Each Processing Node Run in Parallel VLDB 2005**Left-Deep Query Tree**Steps: B8 P8 (9) Scan R9 – Probe P8 – Output B7 P7 R9 (8) Scan R8 – Probe P7 – Build B8 R8 B2 P2 (3) Scan R3 – Probe P2 – Build B3 B1 P1 (2) Scan R2 – Probe P1 – Build B2 R3 R1 R2 (1) Scan R1 – Build R1 Left-Deep Tree [SD90] Example Join Graph R8 R2 R9 R6 R1 R3 R7 R5 R4 … VLDB 2005**Right-Deep Query Tree**B8 P8 R9 B7 P7 (2) Scan R1, Probe P1, Probe P2, … , Probe P8 R8 B2 P2 • Scan R2 – Build R1, • Scan R3 – Build R2, • …, • Scan R9 – Build R8 B1 P1 R3 R2 R1 Right-Deep Tree [SD90] Example Join Graph R8 R2 R9 R6 R1 R3 R7 R5 R4 VLDB 2005**Tradeoffs Between Left and Right Trees**• Right-Deep Good potential for pipelined parallelism. Intermediate results exist only as a stream. Size of building relations can be predicted accurately. Large memory consumption. • Left-Deep Less memory consumption Less pipelined parallelism VLDB 2005**B8 P8**R9 B7 P7 R8 B4 P4 R5 B3 P3 B2 P2 R4 R3 B1 P1 R1 R2 State-of-the-Art Solutions • Implicit Assumption : Prefer Maximal Pipelined Parallelism !!! VLDB 2005**B8 P8**B8 P8 R9 R9 B7 P7 B7 P7 R8 R8 B4 P4 B4 P4 R5 R5 B3 P3 B3 P3 B2 P2 B2 P2 R4 R4 R3 R3 B1 P1 B1 P1 R1 R1 R2 R2 State-of-the-Art Solutions • Break tree into several pieces, and • Process one piece at a time (as pipeline) • I.e., Static Right-Deep[SD90], • ZigZag [ZZBS93], • Segmented Right-Deep [CLYY92]. • What if : Memory Constrained Environments ? • Strategy : Pipeline ! VLDB 2005**t P12**Computation Machines P12 P13 P14 P22 P23 P24 P32 P33 P34 t R4 Probing Building R3 Partition Partition Partition Partition R2 R1 R1 R2 R3 R4 Pipelined Execution Redirection Cost: The intermediate results generated may need to be partitioned to a different machine. Optimal Degree of Parallelism? I.e., It may not be necessary to partition R2 over a large number of machines if it only has 1000 tuples? VLDB 2005**Pipelined Cost Model**• Compute n-way join over k machines • Probing relation R0, building relations, R1, R2, …, Rn • Ii represents the intermediate results after joining with Ri • Total Work (Wb+Wp) & Total Processing Time (Tb+Tp) VLDB 2005**P4**R9 P2 P3 P1 R7 R1 R0 R3 R2 R7 R6 R4 R5 R1 R0 Break Pipelined Parallelism • To Break Long Pipeline and Introduce Independent Parallelism • Large number of small pipelines • High interdependence between pipelined segments • i.e., P1 > P2, P3 > P4, P2 > P4, VLDB 2005**P3**R8 R3 R2 R9 R7 R4 R1 P1 I1 R0 R5 I2 P2 R6 R5 R0 R1 R8 R7 R4 R3 R2 R6 R9 Segmented Bushy Tree • Basic Idea • Compose large pipelined segment • Run pipelined segments independently • Compose bushy tree with minimal interdependency To balance pipelined and independent parallelism VLDB 2005**Heuristics**Cost-Based Composing Segmented Tree Input: A connected join graph G with n nodes. Number m specifies maximum number of nodes in each graph. Output: Segmented bushy tree with at least n/m subtrees. completed = false; WHILE (!completed) { Choose node V with largest cardinality that has not yet been grouped as probing relation; Enumerate all subgraphs starting from V with at most m nodes; Choose best subgraph, mark nodes in this group as having been selected in original join graph; IF !(exist K, K is a connected subgraph of G with unselected nodes) && (K.size() >= 2) { completed = true; } } Compose segmented bushy tree from all groups; VLDB 2005**G2**G1 G1 R8 R3 R2 R8 R3 R2 R8 R3 R2 R9 R7 R4 R1 R9 R7 R4 R1 R9 R7 R4 R1 R6 R5 R0 R6 R5 R0 R6 R5 R0 Example (1) R7, R8, R9, R6 (2) R7, R9, R6, R8 (3) R7, R4, R8, R5 ... (1) R1, R0, R2, R3 (2) R1, R2, R0, R3 (3) R1, R2, R3, R4 ... VLDB 2005**G2**G1 R8 R3 R2 R9 R7 R4 R1 I1 R0 R5 I2 R6 R5 R0 G3 R1 R8 R7 R4 R3 R2 R6 R9 Example : Segmented Bushy Tree VLDB 2005**Machine Allocation**• Based on building relation sizes of each segment • Nb: total amount of building work. • ki: number of machines allocated to pipeline i k3 Nb = k1 I1 R0 R5 I2 k2 R1 R8 R7 R4 R3 R2 R6 R9 VLDB 2005**Insufficient Main Memory**• Break query based on main memory availability • Compose segmented bushy tree for each part R14 R19 R12 R8 R3 R2 R13 R15 R18 R10 R9 R7 R4 R1 R16 R17 R11 R6 R5 R0 VLDB 2005**Experimental Setup**• 10 Machine Cluster • Each machine has 2 2.4GHz Xeon CPUs, 2G Memory. • Connect by gigabit ethernet switch PIII 800M Hz PC, 256M Memory Application Oracle 8i 10 Machine Cluster ... 2 PIII 1G CPUs, 1G Memory PIII 800M Hz PC, 256M Memory Controller VLDB 2005**Experimental Setup (cont.)**• Generated Data Set with Integer Join Values • Around 40 bytes per tuple • Randomly Generated Join Queries • Acyclic join graph with 8, 12, 16 nodes • Each node represents one join relation • Each edge represents one join condition • Average join ratio is 1 • Cardinality of each relation is from 1K ~ 100K • Up to 600MB per query VLDB 2005**Pipelined vs. Segmented (I)**VLDB 2005**Pipelined vs. Segmented (II)**VLDB 2005**Insufficient Main Memory**VLDB 2005**Related Work**• [SD90] Tradeoffs in processing complex join queries via hashing in multiprocessor database machines. VLDB 1990. • [CLYY92] Using segmented right deep trees for execution of pipelined hash joins. VLDB 1992. • [MLD94] Parallel hash based join algorithms for a shared everything environment. TKDE 1994. • [MD97] Data placement in shared nothing parallel database systems. VLDB 1997. • [WFA95] Parallel evaluation of multi-join queries. SIGMOD 1995. • [HCY94] On parallel execution of multiple pipelined hash joins. SIGMOD 1994. • [DNSS92] Practical skew handling in parallel joins. VLDB 1992. • [SHCF03] Flux: an adaptive partitioning operator for continuous query systems. ICDE, 2003. VLDB 2005**Conclusions**• Observation: Maximal pipelined hash join processing • Redirection costs? optimal degree of parallelism? • Hypothesis: Worthwhile to incorporate independent parallelism into processing • Both, so several shorter pipelines in parallel • Solution: Segmented bushy tree processing • Heuristics and cost-driven algorithm developed • Validation : Extensive experimental studies • Achieve around 50% improvement over pure pipelined processing VLDB 2005

More Related