
Adaptively Processing Remote Data



Presentation Transcript


  1. Adaptively Processing Remote Data Zachary G. Ives University of Pennsylvania CIS 650 – Database & Information Systems February 28, 2005

  2. Administrivia Next reading assignment: • Doan et al. – LSD • Recall that the midterm will be due 3/16 • You can go ahead and choose a topic – let me know which one

  3. Today’s Trivia Question

  4. Sources of Query Answering Cost • Regular computation (this has a minimum cost) • But we can get held up by: • Inflexible query plans and delays – we “stall” waiting for I/O (addressed by query scrambling and pipelined hash joins) • Bad query plans – bad estimation of intermediate result sizes (the focus of the Kabra and DeWitt paper) • Insufficient source information (addressed by eddies and ADP; exploration vs. exploitation; extrapolating performance)

  5. Kabra and DeWitt Recap • Provides significant potential for improvement without adding much overhead • Biased towards exploitation, with very limited information-gathering • A great way to retrofit an existing system • In SIGMOD04, IBM had a paper that did this in DB2 • But not appropriate for remote data • Relies on us knowing the (rough) cardinalities of the sources • Query plans aren’t pipelined

  6. A Second Issue: Delays • May have a very computationally inexpensive plan, but slow sources • The query plan might get held up waiting for data • Solution 1: query scrambling • Rescheduling – while delayed, find a non-executing part of the query plan and start it • Operator synthesis – when nothing can be rescheduled, might tinker with the original plan • Want to do this in a cost-based way

  7. Cost-Based Query Scrambling • Divide plan into runnable subtrees based on schedule; schedule those with 75% efficiency • Each may be run out of order if it materializes • Cost = mat. write + processing + mat. read • Efficiency = (M – MR) / (P + MW): savings / cost • When no more runnable subtrees, need to do something • Operator synthesis: try to find a computation that will mask the delay – but how much should we do? • PAIR: only do a single join over a pair of relations • IN: include delay – chooses to defer delayed access to the end • ED: estimated delay – progressively increases the delay estimate
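The efficiency metric on this slide can be sketched directly in code. This is a minimal illustration, not the paper's implementation: `masked_delay` stands in for M (delay time a subtree would hide), and the candidate format is a made-up convenience.

```python
# Sketch of the slide's efficiency formula for choosing which runnable
# subtree to materialize during a delay:
#   efficiency = (M - MR) / (P + MW)
# M = delay masked, MR/MW = materialization read/write cost, P = processing.

def subtree_efficiency(masked_delay, mat_read, mat_write, processing):
    """Savings per unit of extra work for running a subtree out of order."""
    return (masked_delay - mat_read) / (processing + mat_write)

def pick_subtree(candidates):
    """candidates: list of (name, masked_delay, mat_read, mat_write,
    processing). Returns the name of the most efficient subtree."""
    return max(candidates, key=lambda c: subtree_efficiency(*c[1:]))[0]
```

A subtree whose materialization cost approaches the delay it masks gets efficiency near zero, so it is (correctly) not worth scrambling.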

  8. Works Pretty Well for Large Mem.

  9. But Not As Well for Limited Mem

  10. Overall Assessment • Hard to estimate how much work to do! • A trade-off between aggressive and conservative strategies • If enough resources, it doesn’t matter…

  11. Solution 2: Pipelined Hash Joins • They can dynamically adapt to delays, alleviating the need for techniques like query scrambling (except between different pipeline stages) • Disadvantages: • Memory – what happens if we run out? • Focus of [Ives+99], [Urhan+00] • Cost – a query plan with a sub-optimal join order will still be very expensive! • The central piece of the next two techniques we’ll look at…
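The core of a pipelined (symmetric) hash join can be sketched in a few lines: each arriving tuple is inserted into its own side's hash table and immediately probed against the other side's, so results stream out no matter which source is slow. The class and key-extractor arguments here are illustrative, not any system's API.

```python
from collections import defaultdict

# Minimal symmetric hash join sketch: both inputs are build AND probe
# sides, so the operator never blocks waiting for one input to finish.

class SymmetricHashJoin:
    def __init__(self, left_key, right_key):
        self.left_key, self.right_key = left_key, right_key
        self.left_table = defaultdict(list)    # key -> left tuples seen
        self.right_table = defaultdict(list)   # key -> right tuples seen

    def insert_left(self, tup):
        """Add a left tuple; return any new join results it produces."""
        k = self.left_key(tup)
        self.left_table[k].append(tup)
        return [(tup, r) for r in self.right_table[k]]

    def insert_right(self, tup):
        """Add a right tuple; return any new join results it produces."""
        k = self.right_key(tup)
        self.right_table[k].append(tup)
        return [(l, tup) for l in self.left_table[k]]
```

The memory disadvantage on the slide is visible here: both hash tables grow without bound, which is exactly what [Ives+99] and [Urhan+00] address.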

  12. Extreme Adaptivity: Eddies • The basic idea: • Query processing consists of sending tuples through a series of operators • Why not treat it like a routing problem? • Rely on “back-pressure” (i.e., queue overflow) to tell us where to send tuples • Part of the ongoing Telegraph project at Berkeley • Large-scale federated, shared-nothing data stream engine • Variations in data transfer rates • Little knowledge of data sources

  13. Telegraph Architecture • Simple “pre-optimizer” to generate initial plan • Creates operators, e.g.: • Select: predicate(sourceA) • Join: predicate(sourceA, sourceB) • (No support for aggregation, union, etc.) • Chooses implementations – generally relies on pipelined hash joins • Goal: dataflow-driven scheduling of operations • Tuple comes into system • Adaptively routed through operators in “eddies” – may be combined, discarded, etc.

  14. The Eddy • Represents set of possible orderings of a subplan • Each tuple may “flow” through a different ordering (which may be constrained) • N-ary module consisting of query operators • Basic unit of adaptivity • Subplan with select, project, join operators

  15. Example: Join Subplan Alternatives [Figure: alternative join trees over R1, R2, R3, combining the predicates R1.x = R3.x, R2.x = R3.x, and R2.x = R1.x in different orders – each ordering is a valid subplan]

  16. Naïve Eddy • Given tuple, route to operator that’s ready • Analogous to fluid dynamics • Adjusts to operator costs • Ignores operator selectivity
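The naïve eddy's back-pressure routing can be sketched as "send the tuple to any operator it still needs whose input queue has room" – a full queue is how a slow operator pushes back. Operator names, the `done` set, and the fixed capacity are illustrative assumptions, not Telegraph's actual interfaces.

```python
from collections import deque

# Naive-eddy routing sketch: a tuple goes to the first operator it has
# not yet visited whose queue isn't full; expensive (slow-draining)
# operators fill up and so receive fewer tuples ("back-pressure").

def route(tuple_, operators, queues, capacity):
    """tuple_ carries a 'done' set of operators it has passed through.
    queues: operator name -> deque of waiting tuples."""
    for op in operators:
        if op not in tuple_["done"] and len(queues[op]) < capacity:
            queues[op].append(tuple_)
            return op
    return None  # every needed operator is backed up; retry later
```

As the slide notes, this adapts to operator *cost* (via queue lengths) but is blind to *selectivity*: a cheap operator that filters nothing still looks attractive.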

  17. Adapting to Variable-Cost Selection

  18. But Selectivity is Ignored

  19. Lottery-Based Eddy • Need to favor more selective operators • Ticket given per tuple input, returned per output • Lottery scheduling based on number of tickets • Now handles both selectivity and cost
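The ticket scheme above can be sketched as follows: an operator is credited a ticket per input tuple and debited one per tuple it returns, so selective operators (which drop many tuples) accumulate tickets, and the eddy picks the next operator by a ticket-weighted lottery. The function names and the predicate-style operators are stand-ins for illustration.

```python
import random

# Lottery-eddy sketch: route by weighted random draw over ticket counts.

def lottery_route(tickets, rng=random):
    """Pick an operator with probability proportional to its tickets."""
    ops = list(tickets)
    weights = [max(tickets[op], 1) for op in ops]  # avoid zero weights
    return rng.choices(ops, weights=weights, k=1)[0]

def run_tuple(tuple_, operators, tickets):
    """Push one tuple through all operators (here: boolean predicates).
    Ticket given per tuple input, returned per output."""
    remaining = set(operators)
    while remaining:
        op = lottery_route({o: tickets[o] for o in remaining})
        tickets[op] += 1                  # ticket per input tuple
        if not operators[op](tuple_):
            return False                  # dropped: operator keeps ticket
        tickets[op] -= 1                  # ticket returned per output
        remaining.discard(op)
    return True
```

Net tickets equal tuples dropped, so over time the lottery routes tuples to the most selective operator first – handling both selectivity and (via how often an operator is even available) cost.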

  20. Enhancing Adaptivity: Sliding Window • Tickets were for entire query • Weighted sliding window approach • “Escrow” tickets during a window • “Banked” tickets from a previous window
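The escrow/bank idea can be sketched as a small bookkeeping class: tickets earned in the current window sit in escrow, and when the window closes they are deposited into a decayed bank, so old behavior fades. Window length and decay factor are illustrative knobs, not values from the paper.

```python
# Sliding-window ticket sketch: per-operator routing weight combines
# "banked" tickets from closed windows (decayed) with "escrow" tickets
# earned in the window still in progress.

class WindowedTickets:
    def __init__(self, window_len=100, decay=0.5):
        self.window_len, self.decay = window_len, decay
        self.escrow, self.banked, self.count = 0, 0.0, 0

    def earn(self, n=1):
        self.escrow += n
        self.count += 1
        if self.count >= self.window_len:
            # Close the window: decay the old bank, deposit the escrow.
            self.banked = self.decay * self.banked + self.escrow
            self.escrow, self.count = 0, 0

    def weight(self):
        """Current lottery weight for this operator."""
        return self.banked + self.escrow
```

Because the bank decays each window, an operator whose selectivity changes mid-query loses its stale advantage within a few windows instead of keeping whole-query tickets.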

  21. What about Delays? • Problem: we don’t know when a join buffers vs. “discards” tuples [Figure: eddy over sources R, S, T, with source R slow]

  22. Eddy Pros and Cons • Mechanism for adaptively re-routing queries • Makes the optimizer’s task simpler • Can do nearly as well as a well-optimized plan in some cases • Handles variable costs, variable selectivities • But doesn’t really handle joins very well – follow-up work attempts to address this: • STeMs – break a join into separate data structures; requires re-computation at each step • STAIRs – create intermediate state and shuffle it back and forth • Open questions: pre-optimization? Distribution (DeWitt et al.)? Beyond joins?

  23. Generalizing Adaptive Query Processing • We’ve seen a range of different adaptive techniques • How do they fit together? • Can we choose points between eddies and mid-query re-optimization?

  24. Types of Adaptive Query Processing • Adaptive scheduling (q. scrambling [UF98], dyn. pipeline sched. [UF01], XJoin [UF00], PH join [RS86][I+99], ripple join [HH99]) – changes CPU scheduling to improve feedback or reduce delays • Redundant computation (competitive exec. [AZ96]) – compare two+ ways of executing the query • Plan partitioning ([S+76][KD98][I+99][M+04]) – break the plan into stages; re-optimize future stages as necessary • Adaptive info passing ([IT05 sub.]) – pass data between parts of an executing plan • Adaptive data partitioning ([AH00][R+03][DH04]) – break the data into subsets; use a different plan for each subset • The only way to reduce overall computation with fine granularity • First (only) implementation has been eddies

  25. Eddies Combine Adaptive Scheduling and Data Partitioning Decisions • Intuitively, each tuple gets its own query plan • Route to the next operator based on the speed and selectivity of each operator • Elegant and simple to implement • But performing a join creates subresults at the next level! • Local & greedy choices may result in state that needs to join with all future data! • Consider the long-term effects of decisions before making them – separate CPU scheduling from plan selection

  26. Focusing Purely on Adaptive Data Partitioning • Use adaptively scheduled operators to “fill CPU cycles” • Now a query optimizer problem: choose a plan that minimizes long-term cost (in CPU cycles) • To allow multiple plans, distribute union through join (and select, project, etc.): if R1 = R11 ∪ R12 and R2 = R21 ∪ R22, then: R1 ⋈ R2 = (R11 ∪ R12) ⋈ (R21 ∪ R22) = (R11 ⋈ R21) ∪ (R12 ⋈ R22) ∪ (R11 ⋈ R22) ∪ (R12 ⋈ R21) • This generalizes to n-way joins and other SPJ + group-by/union operators…
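The identity above can be checked on concrete data: partition each relation, join every pair of partitions, and confirm the union of the partial joins equals the monolithic join. Relations here are sets of `(key, payload)` tuples; all the data is made up for illustration.

```python
# Distributing union through join, verified on toy relations.

def join(r, s):
    """Natural join on the first attribute of each tuple."""
    return {(a, b) for a in r for b in s if a[0] == b[0]}

R1 = {(1, "a"), (2, "b"), (3, "c"), (4, "d")}
R2 = {(1, "p"), (3, "q"), (4, "r"), (5, "s")}

# Arbitrary partitionings: R1 = R11 ∪ R12, R2 = R21 ∪ R22.
R11, R12 = {(1, "a"), (2, "b")}, {(3, "c"), (4, "d")}
R21, R22 = {(1, "p"), (3, "q")}, {(4, "r"), (5, "s")}

# The four partial joins, unioned, equal the monolithic join.
partitioned = (join(R11, R21) | join(R12, R22)
               | join(R11, R22) | join(R12, R21))
assert partitioned == join(R1, R2)
```

This is what lets ADP run R11 ⋈ R21 under one plan and R12 ⋈ R22 under another: the cross-partition terms (R11 ⋈ R22, R12 ⋈ R21) are exactly what the later "stitch-up" phase must compute.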

  27. Adaptive Data Partitioning: Routing Data across Different Plans [Figure: R ⋈ S ⋈ T computed in phases – phase 1 joins R0, S0, T0 under one plan; a later phase joins R1, S1, T1 under a different plan; stitch-up joins over the unions exclude already-produced results such as R0S0 and R0S0T0, R1S1T1] • Options for combining across phases: • New results always injected into old plan • Old results into new plan • Wait until the end – “stitch-up” plan based on best stats

  28. Special Architectural Features for ADP • Monitoring and re-optimization thread runs alongside execution: • System-R-like optimizer with aggregation support; uses most current selectivity estimates • Periodic monitoring and re-optimization revises selectivity estimates, recomputes expected costs • Query execution with “smart router” operators • Special support for efficient stitch-up plans: • Uses intermediate results from previous plans (a special case of answering queries using views [H01]) • Join-over-union (“stitch-up-join”) operator that excludes certain results
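The stitch-up-join's "exclude certain results" behavior can be sketched as a join over partitioned inputs that skips partition pairs already produced by earlier plans. The data layout (dicts of partitions, a set of done pairs) is an illustrative assumption, not the system's actual operator interface.

```python
# Stitch-up-join sketch: join the unions of each source's partitions,
# but skip partition pairs that a previous plan already joined.

def stitch_up_join(parts_r, parts_s, already_done, key=lambda t: t[0]):
    """parts_r/parts_s: dicts of partition_id -> list of tuples.
    already_done: set of (r_id, s_id) pairs joined by earlier plans."""
    out = []
    for ri, r in parts_r.items():
        for si, s in parts_s.items():
            if (ri, si) in already_done:
                continue  # excluded: this combination exists already
            out.extend((a, b) for a in r for b in s if key(a) == key(b))
    return out
```

Unioning this output with the earlier plans' results gives the complete join without duplicates – the "specialized case of answering queries using views" the slide mentions.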

  29. ADP Application 1: Correcting Cost Mis-estimates • Goal: react to plans that are obviously bad • Don’t spend cycles searching for a slightly better plan • Try to avoid paths that are unlikely to be promising • Monitor/reoptimizer thread watches cardinalities of subresults • Re-estimates plan cost and compares it to the projected costs of alternatives, using several techniques & heuristics (see paper) • Our experiments: re-estimate every 1 sec. • “Smart router” operator does the following: • Waits for the monitor/reoptimizer to suggest a replacement plan • Re-routes source data into the new plan • New plan’s output is unioned with the output of the previous plan; this is fed into any final aggregation operations

  30. Correcting for Unexpected Selectivities (Pentium IV 3.06 GHz, Windows XP)

  31. ADP Application 2: Optimizing for Order • Most general ADP approach: • Pre-generate plans for the general case and for each “interesting order” • “Smart router” sends each tuple to the plan whose ordering constraint that tuple satisfies • But with multiple joins, MANY plans • Instead: do ADP at the operator level • “Complementary join pair” • Does its own stitch-up internally • Easier to optimize for! • Can also do “partial sorting” at the router (priority queue) [Figure: routers split R and S between a merge join and a hash join (hash tables h(R), h(S)), with queues between the stages]
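The "partial sorting at the router" idea can be sketched with a bounded priority queue: buffer up to k tuples and always emit the smallest, so nearly-sorted input comes out fully sorted whenever the disorder is local (displacement < k). This is a generic sketch of the technique, not the system's router code.

```python
import heapq

# Partial sorting with a k-slot priority queue at the router.

def partial_sort(stream, k):
    """Emit tuples of `stream` in order, assuming no element is more
    than k positions away from its sorted position."""
    heap, out = [], []
    for t in stream:
        heapq.heappush(heap, t)
        if len(heap) > k:
            out.append(heapq.heappop(heap))  # smallest of the k+1 buffered
    while heap:
        out.append(heapq.heappop(heap))      # drain the remaining buffer
    return out
```

With k large enough the merge-join plan can be fed sorted input from partially ordered data; if the disorder exceeds k, out-of-order tuples slip through – those are the ones a complementary hash join (or stitch-up) must catch.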

  32. Exploiting Partial Order in the Data (Pentium IV 3.06 GHz, Windows XP; 1024-tuple window)

  33. ADP over “Windows”: Optimizing for Aggregation [Figure: SUM(T.y) GROUP BY T.x computed over T ⋈ R directly, vs. pre-aggregating T with SUM(T.y) GROUP BY T.x, T.joinAttrib below the join and taking SUM of the partial sums GROUP BY T.x above it] • Group-by optimization [CS94]: • May be able to “pre-aggregate” some tuples before joining • Why: aggregates can be applied over union • But once we insert pre-aggregation, we’re stuck (and it’s not pipelined) • Our solution: • “Adjustable window pre-aggregation” • Change window size depending on how effectively we can aggregate • Also allows data to propagate through the plan – better info for adaptivity, early answers
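Window pre-aggregation can be sketched as partial SUMs over bounded windows: tuples of T are grouped by (x, joinAttrib) within a window, partial sums are emitted so the pipeline keeps flowing, and a downstream SUM over the partial sums recovers the exact answer. The tuple layout and fixed window size are illustrative assumptions (the slide's point is precisely that the window size should be adjustable).

```python
from collections import defaultdict

# Windowed pre-aggregation sketch: SUM(y) GROUP BY (x, join_attr),
# flushed every `window_size` input tuples so output stays pipelined.

def pre_aggregate(tuples, window_size):
    """tuples: iterable of (x, join_attr, y).
    Yields ((x, join_attr), partial_sum) records, one batch per window."""
    window = defaultdict(int)
    for i, (x, join_attr, y) in enumerate(tuples, 1):
        window[(x, join_attr)] += y
        if i % window_size == 0:
            yield from window.items()   # flush: emit partial sums
            window.clear()
    yield from window.items()           # flush the final partial window
```

Because SUM distributes over union, summing the partial sums downstream gives the same result as aggregating after the join; a larger window collapses more duplicates, a smaller one delivers earlier answers.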

  34. Pre-Aggregation Comparison

  35. Adaptive QP in Summary A variety of different techniques, focusing on: • Scheduling • Comparison & competition • Information passing • Data + plan partitioning A field that is still fairly open – missing: • Effective exploration methods • A true theory! • What’s possible? What kinds of queries make sense to adapt? • Guarantees of optimality and convergence (perhaps under certain assumptions)
