1 / 42

The Multiway Join Tyranny of Communication Cost The Optimal Algorithm Application to Star Joins Skew Joins

Multiway Join Using MapReduce. The Multiway Join Tyranny of Communication Cost The Optimal Algorithm Application to Star Joins Skew Joins. Jeffrey D. Ullman Stanford University. Communication Cost.

elga
Download Presentation

The Multiway Join Tyranny of Communication Cost The Optimal Algorithm Application to Star Joins Skew Joins

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Multiway Join Using MapReduce The Multiway JoinTyranny of Communication CostThe Optimal AlgorithmApplication to Star JoinsSkew Joins Jeffrey D. Ullman Stanford University

  2. Communication Cost • Communication costfor a MapReduce job = the total number of key-value pairs generated by all the mappers. • For many applications, communication cost is the dominant cost, taking more time than the computation performed by the mappers and reducers. • Focus: minimizing communication cost for a join of three or more relations.

  3. Natural Join of Relations • Given: a collection of relations, each with attributes labeling their columns. • Find: Those tuples over all the attributes such that when restricted to the attributes of any relation R, that tuple is in R.

  4. A B 0 1 1 2 A C 1 3 2 1 B C 2 3 A B C 1 2 3 Example: Natural Join The join:

  5. Joining by MapReduce • Suppose we want to compute R(A,B) JOIN S(B,C), using kreducers. • Remember: a “reducer” corresponds to a key of the intermediate file, so we are really asking how many keys we want to use. • R and S are each stored in a chunked file of a distributed file system. • Subtle point: when joining two relations, each tuple is sent to one reducer, so k can be as large as we like; in a multiway join, large k implies more communication.

  6. Joining by MapReduce– (2) • Use a hash function hfrom B-values to k buckets. • Earlier, we used h = identity function. • Map tasks take chunks from R and S, and send: • Tuple R(a,b) to reducer h(b). • Tuple S(b,c) to reducer h(b). • If R(a,b) joins with S(b,c), then both tuples are sent to reducer h(b). • Thus, their join (a,b,c) will be produced there and shipped to the output file.

  7. 3-Way Join • Consider a chain of three relations: R(A, B) JOIN S(B, C) JOIN T(C,D) • Example: R, S, and T are “friends” relations. • We could join any two by the 2-way MapReduce algorithm, then join the third with the resulting relation. • But intermediate joins are large.

  8. 3-Way Join – (2) • An alternative is to divide the work among k = m2 reducers. • Hash both B and C to mvalues. • A reducer (key) corresponds to a hashed B-value and a hashed C-value.

  9. 3-Way Join – (3) • Each S-tuple S(b,c) is sent to one reducer: (h(b), h(c)). • But each tuple R(a,b) must be sent to mreducers, those whose keys are of the form (h(b), x). • And each tuple T(c,d) must be sent to mreducers of the form (y, h(c)).

  10. S(b, c) R(a, b) T(c, d) Example: m = 4; k = 16. h(c) = 0 1 2 3 h(b)=0 h(b)=1 h(b)=2 h(b)=3

  11. 3-Way Join – (4) • Thus, any joining tuples R(a,b), S(b,c), and T(c,d) will be joined at the reducer (h(b), h(c)). • Communication cost: s + mr + mt. • Convention: Lower-case letter is the size of the relation whose name is the corresponding upper-case letter. • Example: ris the size of R.

  12. Comparison of Methods • Suppose for simplicity that: • Relations R, S, and T have the same size r. • The probability of two tuples joining is p. • The 3-way join has communication cost r(2m+1). • Two two-way joins have a communication cost: • 3r to read the relations, plus • pr2 to read the join of the first two. • Total = r(3+pr).

  13. Comparison – (2) • 3-way beats 2-way if 2m + 1 < 3 + pr. • pris the multiplicity of each join. • Thus, the 3-way chain-join is useful when the multiplicity is high. • Example: relations are “friends”; pris about 300. m2= k can be 20,000. • Example: relations are Web links; pris about 15. m2= k can be 64.

  14. Optimization of MultiwayJoins Share Variables and Their Optimization Special Case: Star Joins Special Case: Chain Joins Application: Skew Joins

  15. Some Questions • When we discussed the 3-way chain-join R(A, B) JOIN S(B, C) JOIN T(C,D), we used attributes B and C for the map-key (attributes that determined the reducers). • Why not include A and/or D? • Why use the same number of buckets for B and C?

  16. Share Variables • For the general problem, we use a share variablefor each attribute. • The number of buckets into which values of that attribute are hashed. • Convention: The share variable for an attribute is the corresponding lower-case letter. • Example: the share variable for attribute A is always a.

  17. Share Variables – (2) • The product of all the share variables is k, the number of reducers. • The communication cost of a multiway join is the sum of the size of each relation times the product of the share variables for the attributes that do not appear in the schema of that relation.

  18. Example: Minimizing Cost • Consider the cyclic join R(A, B) JOIN S(B, C) JOIN T(A, C) • Cost function is rc + sa + tb. • Construct the Lagrangean remembering abc = k: rc+ sa + tb – (abc – k) • Take the derivative with respect to each share variable, then multiply by that variable. • Result is 0 at minimum.

  19. Example – Continued • d/da of rc + sa + tb – (abc – k) is s – bc. • Multiply by a and set to 0: sa – abc = 0. • Note: abc= k : sa = k. • Similarly, d/db and d/dc give: sa= tb = rc = k. • Solution: a = (krt/s2)1/3; b = (krs/t2)1/3; c = (kst/r2)1/3. • Cost = rc + sa + tb = 3(krst)1/3.

  20. Every relation with B Also has A Dominated Attributes • Certain attributes can’t be in the map-key. • A dominatesB if every relation of the join with B also has A. • Example: R(A,B,C) JOIN S(A,B,D) JOIN T(A,E) JOIN U(C,E)

  21. Example – (2) • Cost expression: rde + sce + tbcd + uabd • Since bappears wherever adoes, if there were a minimum-cost solution with b > 1, we could replace bby 1 and aby ab, and the cost would lower. R(A,B,C) JOIN S(A,B,D) JOIN T(A,E) JOIN U(C,E)

  22. Dominated Attributes – Continued • This rule explains why, in the discussion of the chain join R(A, B) JOIN S(B, C) JOIN T(C,D) we did not give dominated attributes A and D a share.

  23. Solving the General Case • Unfortunately, there are more complex cases than dominated attributes, where the equations derived from the Lagrangean imply a positive sum of several terms = 0. • We can fix, generalizing dominated attributes, but we have to branch on which attribute needs to be eliminated from the map-key.

  24. Solving – (2) • Solutions not in integers: • Drop an attribute with a share < 1 from the map-key and re-solve. • Round other nonintegers, and treat kas a suggestion, since the product of the integers may not be k.

  25. Special Case: Star Joins • A star join combines a large fact table F(A1,A2,…,An) with tiny dimension tables D1(A1,B1), D2(A2,B2),…, Dn(An ,Bn). • There may be other attributes not shown, each belonging to only one relation. • Example: Facts = sales; dimensions tell about buyer, product, day, time, etc.

  26. Star-Join Pattern B1 B2 A1 A2 A4 A3 B4 B3

  27. Star Joins – (2) • Map-key = the A’s. • B’s are dominated. • Solution: di /ai = k for all i. • That is, the shares are proportional to the dimension-table sizes.

  28. Cool Application of Star Join Result • Fact/dimension tables are often used for analytics. • Aster Data approach: partition fact table among nodes permanently (shard the fact table); replicate needed pieces of dimension tables. Fact Shard 1 Fact Shard 2 Fact Shard k . . .

  29. Example: Aster-Data Method • Shard fact table by country. • Wins for distributing the Customer dimension table. • Each Customer tuple needed only at the shard for the country of the customer. • Loses big for other dimension tables, e.g., Item. • Pretty much every item has been bought by someone in each country, so Dimension tuples are replicated to every shard.

  30. Star Joins – (2) • Our solution lets you partition the fact table to kshards, one for each reducer. • Analogy: Think of sharding process as taking the join of the fact table and all dimension tables. • Shards = reducers. • Only dimension keys (the A’s) get shares, so each Fact tuple goes to exactly one shard/reducer. • Total copies of dimension tuples = communication cost, which is minimized.

  31. . . . A3 An -1 An A2 A1 A0 Chain Joins • A chain join has the form R(A0, A1) JOIN R(A1, A2) JOIN … JOIN R(An -1, An) • Other attributes may appear, but only in one relation. • A0 and An are dominated; other attributes are in the map-key.

  32. Special Case: All Relations Have the Same Size • Illustrates strange behavior. • Even and odd nhave very different distributions of the share variables. • Evenn : a2 = a4 = … = an -2 = 1; a1= a3 = … = an -1 = k2/n

  33. Pattern for Even n

  34. Odd n, Equal Relation Sizes • Even a’s grow exponentially. • That is, a4 = a22; a6 = a23; a8 = a24,… • The odd a’s form the inverse sequence. • That is, a1 = an-1; a3 = an-3; a5 = an-5;…

  35. Pattern for Odd n

  36. Application to Skew Joins • Suppose we want to join R(A,B) with S(B,C) using MapReduce, as in the introduction. • But half the tuples of R and S have value B=10. • No matter how many reducers we use, one reducer gets half the tuples and dominates the wall-clock time. • To deal with this problem, systems like PIG handle heavy-hitter values like B=10 specially.

  37. How Skew Join Works • Pick one of the relations, say R. • Divide the tuples of R having B=10 into k groups, with a reducer for each. • Note: we’re pretending B=10 is the only heavy hitter. • Works for any value and > 1 heavy hitter. • Send each tuple of S with B=10 to all k reducers. • Other tuples of R and S are treated normally.

  38. Communication Cost of Skew Join • Let R have r tuples with B=10 and S have s tuples with B=10. • Then communication cost = r + ks. • Can be minimum if s << r, but also might not be best.

  39. Improved Skew Join • Let’s partition the tuples of R with B=10 into x groups and partition the tuples of S with B=10 into y groups, where xy = k. • Use one of k reducers for each pair (i, j) of a group for R and a group for S. • Send each tuple of R with B=10 to all y groups of the form (i, ?), and send each tuple of S with B=10 to all x reducers (?, j). • Communication = ry + sx.

  40. Optimizing the Groups • We need to minimize ry + sx, under the constraint xy = k. • Solution: x = kr/s; y = ks/r; communication = 2krs . • Note: 2krs is always < r + ks(the cost of the implemented skew join), often much less. • Example: if r = s, then new skew join takes 2rk, while old takes r(k+1).

  41. Summary • Multiway joins can be computed by replicating tuples and distributing them to many compute nodes. • Minimizing communication requires us to solve a nonlinear optimization. • Multiway beats 2-way joins for star queries and queries on high-fanout graphs. • Exact solution for chain and star queries. • Simple application to optimal skew joins.

  42. Research Questions • We really don’t know the complexity of optimizing the multiway join in general. • The algorithm we offered in 2010 EDBT is exponential in the worst case. • But is it NP-hard? • The Skew-join technique borrows from the multiway join, but is a 2-way join. • Can you generalize the idea to all multiway joins with skew?

More Related