Avoiding Idle Waiting in the execution of Continuous Queries

Carlo Zaniolo, CSD
CS240B Notes, April 2008

Continuous Query Optimization
• Global optimization objectives can change during execution, e.g.:
  • Response time minimization (typical)
  • Memory minimization, when memory becomes the critical resource
• A change in the optimization objective might cause re-partitioning of the query graph (chain)
• Addition and deletion of continuous queries also change the topology of the execution graph: agile restructuring using a DFA model
• Local (operator-specific) optimization: union and join operators have a potential idle-waiting problem
• Special execution models (backtracking) and timestamp management techniques are needed to minimize response time and memory usage

The Query Graph
[Figure: two example query graphs, a single-source graph over σ, ∑1, and ∑2, and a two-source graph with a union U, each ending in Sink nodes]
• The query graph is a DAG
• Nodes represent operators
  • Selection, projection, union, window-join, aggregates, etc.
• Thick edges represent inter-connecting physical buffers
• The DAG consists of a set of strong components
• Strong components are the units for scheduling
• A complex DAG may need to be partitioned, based on the optimization goal (global optimization)
• For unions and joins (and also slides on logical windows), local optimization is needed to avoid idle waiting

The Idle-Waiting Problem for Union (Joins have the same problem)
[Figure: query graph in which Source1 and Source2 feed a union U that outputs to the Sink]
• The Union operator performs a sort-merge operation
  • The tuple with the smallest timestamp goes through first
  • Output tuples stay sorted by timestamp
• Tuples at a Union are subject to idle-waiting, a short-term blocking behavior
  • Due to network traffic and operator scheduling, timestamps of tuples over multiple inputs may be skewed: earlier tuples on one input may arrive after later tuples on another input
  • When one input is empty, tuples on the other inputs have to wait
  • Input tuples idle-wait for future arrivals, which greatly increases query response time

The Idle-Waiting Problem
[Figure: the union query graph in two states, with buffers C, B, and A on the path that leads into the union. First, one union input holds timestamps 1 and 6 while A holds timestamp 3; then only timestamp 6 remains and A is empty.]
• Only the timestamps of tuples are shown in buffers
• The tuple with TS=1 goes through the union first, followed by the one with TS=3
• The union produces tuples by increasing timestamp values
• Nothing is produced until there is a tuple in A: Idle Waiting
• Idle Waiting: poor response time, and also extra memory used
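
The behavior on this slide can be reproduced with a minimal sketch of a sort-merge union. The class name `Union`, its `push`/`step` methods, and the use of `deque` buffers are illustrative assumptions, not the Stream Mill implementation. The sketch replays the slide's timestamps: 1 and 6 on one input, 3 on the input fed by buffer A.

```python
from collections import deque


class Union:
    """Illustrative sort-merge union over timestamp-ordered input buffers."""

    def __init__(self, n_inputs):
        self.inputs = [deque() for _ in range(n_inputs)]

    def push(self, i, ts, payload=None):
        """Append a tuple with timestamp ts to input buffer i."""
        self.inputs[i].append((ts, payload))

    def step(self):
        """Emit the tuple with the smallest timestamp, or None when idle-waiting."""
        # If any input is empty, a tuple with a smaller timestamp could still
        # arrive there, so nothing can be emitted without breaking the order.
        if any(not buf for buf in self.inputs):
            return None
        i = min(range(len(self.inputs)), key=lambda k: self.inputs[k][0][0])
        return self.inputs[i].popleft()


# Input 0 plays the role of buffer A; input 1 is the other branch.
u = Union(2)
u.push(1, 1)
u.push(1, 6)
blocked = u.step()   # input 0 is empty, so the union idle-waits
u.push(0, 3)
first = u.step()     # timestamp 1
second = u.step()    # timestamp 3
stuck = u.step()     # timestamp 6 waits because input 0 is empty again
```

In this sketch the tuple with timestamp 6 stays buffered until something arrives on input 0, which is the extra latency and memory the slide describes.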

Solving the Idle-Waiting Problem
[Figure: the same query graph, with buffer A empty at the union and timestamp 6 waiting on the other input; buffers B and C lie upstream of A]
• To avoid idle waiting, we need to get values into A fast. How?
• By going back to ∑1, which checks B for tuples to be processed and sent to A. If B is empty, then we go further back to the operator that precedes ∑1, which processes the tuples in C.
• This process is called Backtracking!
• Other execution models, such as those used by other DSMSs, will not do
  • E.g., Round Robin: a fixed execution order can take us to different components or different branches
• Backtracking takes us back to the only buffers and operators that can unblock the idle waiting
• Yes, but what if the source buffer C is empty? On-demand Timestamp Generation!

Time-stamped Punctuation Marks
• Heartbeats: timestamps are generated and sent out from the source
• Periodically, as in Gigascope
  • Effective but far from optimal: too few when needed, too many when not needed
• On demand, as in Stream Mill
  • Avoids useless operations when there is no idle-waiting
  • Sends requests to the right source nodes, the ones that can fix the idle-waiting
  • Much lower response time and less memory, but an execution model capable of supporting backtracking is needed for that
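
A minimal sketch of the on-demand idea follows. It rests on one loudly stated assumption: when asked, a source replies with a timestamp that is a lower bound on every tuple it will send later. The names `Source`, `current_timestamp`, `requests`, and `OnDemandUnion` are hypothetical and chosen only for illustration.

```python
from collections import deque


class Source:
    """Hypothetical source that can report a lower bound on its future timestamps."""

    def __init__(self, clock):
        self.clock = clock      # callable returning the source's current timestamp
        self.requests = 0       # number of on-demand requests received

    def current_timestamp(self):
        self.requests += 1
        return self.clock()


class OnDemandUnion:
    """Union that asks only the sources of empty inputs, and only when blocked."""

    def __init__(self, sources):
        self.sources = sources
        self.inputs = [deque() for _ in sources]

    def push(self, i, ts):
        self.inputs[i].append(ts)

    def step(self):
        nonempty = [i for i, buf in enumerate(self.inputs) if buf]
        if not nonempty:
            return None
        i = min(nonempty, key=lambda k: self.inputs[k][0])
        candidate = self.inputs[i][0]
        for k, buf in enumerate(self.inputs):
            # Assumption: the reply bounds all future timestamps on input k.
            if not buf and self.sources[k].current_timestamp() < candidate:
                return None     # an earlier tuple may still arrive on input k
        return self.inputs[i].popleft()


s0, s1 = Source(lambda: 10), Source(lambda: 10)
u = OnDemandUnion([s0, s1])
u.push(1, 6)
released = u.step()   # input 0 is empty, but its source reports 10 >= 6
```

No request is sent while every input holds a tuple, which mirrors the slide's point about avoiding useless operations; a periodic heartbeat would instead send timestamps whether or not the union is blocked.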

Time Stamps
• External: generated by the system or sensor producing the tuple, normally at a remote location, with delays in transmission
  • A heartbeat mechanism ensures synchronization
• Internal: generated by the DSMS when the tuple arrives
• Missing (actually, latent): generated by the system when (and if) one is needed

Backtracking without Tears
[Figure: linear query graph Source → σ → ∑1 → ∑2 → Sink]
A Simple Rule for Next Operator Selection (NOS), based on the input and output buffers:
• YIELD is true if the output buffer of the current operator contains some tuples
• MORE is true if there are still tuples in the input buffer of the current operator
NOS for Depth-First
• [Forward:] if YIELD then next := succ
• [Encore:] else if MORE then next := self
• [Backtrack:] else next := pred

A General Model: Breadth/Depth First
[Figure: linear query graph Source → σ → ∑1 → ∑2 → Sink]
A Simple Rule for Next Operator Selection (NOS), based on the input and output buffers:
• YIELD is true if the output buffer of the current operator contains some tuples
• MORE is true if there are still tuples in the input buffer of the current operator
NOS for Depth-First
• [Forward:] if YIELD then next := succ
• [Encore:] else if MORE then next := self
• [Backtrack:] else next := pred
NOS for Breadth-First
• [Encore:] if MORE then next := self
• [Forward:] else if YIELD then next := succ
• [Backtrack:] else next := pred
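
The two rule sets can be sketched as plain functions plus one shared execution cycle over a linear chain. The driver `run_chain` is a hypothetical illustration: representing operators as functions from one tuple to a list of tuples, letting the sink drain its buffer and hand control back, and stopping when the first operator backtracks are all assumptions of this sketch.

```python
from collections import deque


def nos_depth_first(yield_, more):
    """Forward if YIELD, else Encore if MORE, else Backtrack."""
    if yield_:
        return "forward"
    return "encore" if more else "backtrack"


def nos_breadth_first(yield_, more):
    """Encore if MORE, else Forward if YIELD, else Backtrack."""
    if more:
        return "encore"
    return "forward" if yield_ else "backtrack"


def run_chain(operators, source_tuples, nos):
    """Run a linear chain under a NOS rule; return (sink output, execution trace)."""
    n = len(operators)
    # buffers[i] is the input of operator i; buffers[n] is the sink buffer.
    buffers = [deque(source_tuples)] + [deque() for _ in range(n)]
    results, trace = [], []
    current = 0
    while current >= 0:                  # backtracking past operator 0 ends the run
        if current == n:                 # sink: consume what was delivered, go back
            results.extend(buffers[n])
            buffers[n].clear()
            current -= 1
            continue
        inp, out = buffers[current], buffers[current + 1]
        if inp:                          # execute the operator on one tuple
            out.extend(operators[current](inp.popleft()))
            trace.append(current)
        action = nos(bool(out), bool(inp))   # YIELD and MORE after execution
        current += {"forward": 1, "encore": 0, "backtrack": -1}[action]
    return results, trace


ops = [lambda x: [x + 1], lambda x: [x * 2]]
dfs_out, dfs_trace = run_chain(ops, [1, 2], nos_depth_first)
bfs_out, bfs_trace = run_chain(ops, [1, 2], nos_breadth_first)
```

Both runs deliver the same tuples to the sink; only the order in which operators execute differs. Depth-first pushes each tuple through to the sink before fetching the next one, while breadth-first empties an operator's input buffer before moving on.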

Timestamp Propagation by Special Arcs
[Figure: Enhanced Query Graph with Source1, Source2, and Source3, operators σ, ∑1, and ∑2, a union U, and a Sink]
Timestamps can be propagated back to the idle-waiting operators:
• By punctuation marks
• By special arcs that connect the source to the idle-waiting operators
  • Shown as dashed arcs in the Enhanced Query Graph (EQG)

Execution Model Benefits
• Simple and regular:
  • The same basic cycle is shared by all strategies, with only the NOS rules being different
  • Amenable to an efficient Deterministic Finite Automaton (DFA) based implementation
• Optimization/scheduling flexibility:
  • At run time, we can easily switch between policies
  • Different strategies can run at the same time in different components
• Highly reconfigurable at run time
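
One way to read the claim that only the NOS rules differ is to store each policy as a transition table keyed by the (YIELD, MORE) pair, so that switching policy at run time is just a change of table. The table layout and the names `NOS_TABLES` and `next_operator` are assumptions made for illustration, not the actual DFA encoding used in Stream Mill.

```python
# Moves along the operator path: successor, self, predecessor.
FORWARD, ENCORE, BACKTRACK = +1, 0, -1

# (YIELD, MORE) -> move, one table per policy.
NOS_TABLES = {
    "dfs": {
        (True, True): FORWARD,
        (True, False): FORWARD,
        (False, True): ENCORE,
        (False, False): BACKTRACK,
    },
    "bfs": {
        (True, True): ENCORE,
        (False, True): ENCORE,
        (True, False): FORWARD,
        (False, False): BACKTRACK,
    },
}


def next_operator(policy, current, yield_, more):
    """Return the index of the next operator under the named policy."""
    return current + NOS_TABLES[policy][(yield_, more)]


# The same state leads to different moves under the two policies.
dfs_next = next_operator("dfs", 3, True, True)
bfs_next = next_operator("bfs", 3, True, True)
```

The two tables differ only in the entry where both YIELD and MORE are true, which is why the rest of the execution cycle can be shared across strategies.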

Experiments – Timestamp Propagation
• Periodic timestamp propagation reduces latency in proportion to the heartbeat rate
• However, memory overhead increases when the heartbeat tuple rate is high
• On-demand timestamp propagation reduces latency to very small values with no memory overhead

DFS vs. BFS
• How DFS and BFS behave under different input burstiness
• We introduce bursts of nearly simultaneous tuples
• Both DFS and BFS show increased latency when burstiness increases, but BFS has a steeper increase

Conclusion
• A tuple-level execution model for DSMS that:
  • Supports different execution strategies with ease
  • Enables optimization of response time by
    • efficiently backtracking, and
    • propagating on-demand timestamp information
  • Is amenable to a simple Deterministic Finite Automaton (DFA) based implementation
  • Supports dynamic reconfiguration: we can re-partition the query graph for optimization and add or delete operators at run time with little overhead
  • Is the linchpin of the Stream Mill System
• Stream Mill System: http://wis.cs.ucla.edu

A Flexible Query Graph Based Model for the Efficient Execution of Continuous Queries
Yijian Bai, Hetal Thakkar, Haixun Wang* and Carlo Zaniolo
Department of Computer Science, University of California, Los Angeles
* IBM T.J. Watson

Time for Questions
Thank You!

The Stream Mill System
• One server, multiple clients
• The server (on Linux) hosts the query language, manages storage, and schedules continuous queries
• Clients (Java-based GUI) allow the user to specify streams and queries and to interact with the server
