
STREAM: The Stanford Data Stream Management System

STREAM: The Stanford Data Stream Management System. Rebuttal Team: Mingzhu Wei, Di Yang. CS525s, Fall 2006. Rebuttal areas: Foundation, Windows, Joins, Full Recalculation Strategy, Language Issues.


Presentation Transcript


  1. STREAM: The Stanford Data Stream Management System. Rebuttal Team: Mingzhu Wei, Di Yang. CS525s, Fall 2006

  2. Rebuttal Areas
     • Foundation
     • Windows
     • Joins
     • Full Recalculation Strategy
     • Language Issues

  3. Foundation
     • Rebuttal
       • No proof is provided to guarantee the correctness or completeness of query plans built by combining operators, queues, and synopses
     • Analysis
       • STREAM is based on relational database theory
       • Relational databases have been around for a long time
       • Proofs exist that demonstrate their correctness
       • CQL is a minor extension to SQL
       • More effort could have been put into providing a formal proof for CQL

  4. Windows
     • Rebuttal
       • STREAM does not provide value-based windows
       • For example, without a value-based window the system cannot efficiently process a query such as:
         • Give me the names of the students who have the top 10 exam scores
     • Analysis
       • This feature is not supported by STREAM
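The example query can be read as keeping a window whose membership depends on a tuple's score rather than on its arrival time or position. The following is a minimal sketch of that reading, not something taken from STREAM or from the slides: the function name `top_k_names`, the `(name, score)` tuple layout, and the tie-breaking rule (earlier arrivals win) are all assumptions made for illustration.

```python
import heapq

def top_k_names(stream, k=10):
    """Return the names with the k highest scores seen so far, best first.

    `stream` is an iterable of (name, score) pairs (hypothetical layout).
    A min-heap of at most k entries is kept, so each arrival costs
    O(log k) instead of re-sorting everything seen so far.
    """
    heap = []  # min-heap of (score, -arrival_index, name)
    for seq, (name, score) in enumerate(stream):
        # -seq makes an earlier arrival rank higher when scores tie.
        item = (score, -seq, name)
        if len(heap) < k:
            heapq.heappush(heap, item)
        elif item > heap[0]:
            # The new tuple beats the weakest member of the window.
            heapq.heapreplace(heap, item)
    return [name for _score, _neg_seq, name in sorted(heap, reverse=True)]
```

For instance, `top_k_names([("ann", 70), ("bob", 95), ("cy", 80)], k=2)` keeps only the two highest-scoring students as tuples stream past.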

  5. Joins
     • Rebuttal
       • STREAM uses only the self-purge mechanism when performing a window-based join
     • Analysis
       • STREAM decides whether a tuple in the window state has expired by comparing its timestamp with those of newly arriving tuples in the same stream
       • Cross-purge might be more efficient in some cases
       • Cross-purge = compare timestamps across the two streams
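The difference between the two purge strategies can be sketched as follows. This is an illustrative toy, not STREAM's implementation: it assumes a time-based window of width `window`, events that arrive in nondecreasing timestamp order across both streams, and an equality join on a key. The names `purge` and `window_join` and the `(side, ts, key)` event layout are hypothetical.

```python
from collections import deque

def purge(state, now, window):
    """Drop tuples whose timestamp is at or before now - window."""
    while state and state[0][0] <= now - window:
        state.popleft()

def window_join(events, window, cross_purge):
    """Join streams 'A' and 'B' on key within a time window.

    events: list of (side, ts, key), in nondecreasing ts order.
    Returns (results, final_state_size); results are (a_ts, b_ts, key).
    """
    state = {"A": deque(), "B": deque()}
    results = []
    for side, ts, key in events:
        other = "B" if side == "A" else "A"
        # Self-purge cleans the arriving tuple's own stream state;
        # cross-purge cleans the opposite stream's state instead.
        purge(state[other] if cross_purge else state[side], ts, window)
        for other_ts, other_key in state[other]:
            # The expiry check is needed under self-purge, because the
            # opposite state may still hold expired tuples.
            if other_key == key and ts - other_ts < window:
                pair = (ts, other_ts) if side == "A" else (other_ts, ts)
                results.append(pair + (key,))
        state[side].append((ts, key))
    return results, len(state["A"]) + len(state["B"])

# Stream B is busy early and then goes quiet, while A keeps arriving.
events = [("B", t, "k") for t in range(5)] + [("A", 6, "k"), ("A", 20, "k")]
self_results, self_size = window_join(events, window=5, cross_purge=False)
cross_results, cross_size = window_join(events, window=5, cross_purge=True)
```

Both runs produce the same join results. In this particular input, self-purge leaves all of B's old tuples in state because no new B tuple arrives to evict them, whereas cross-purge lets the A arrivals evict them. With the roles reversed, the stale tuples would sit in the other state, which is why the slide only claims that cross-purge helps in some cases.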

  6. Full Recalculation Strategy
     • Rebuttal
       • STREAM uses a full recalculation strategy for result updating
       • This could be very inefficient with large window sizes
       • Example:
         • We are trying to join two windows, each of size 1000
         • If both windows slide by only 10 at each step, recalculating the whole result would be much more expensive than updating the result incrementally
     • Analysis
       • Using an incremental result update strategy might be more efficient in some cases
       • Keep most of the joined result and compute only the part involving newly arrived tuples
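The cost gap in the example above can be made concrete with a small sketch. It is a simplified model under stated assumptions, not a description of STREAM's engine: windows are count-based, the join is a nested-loop equality join on a key, and cost is measured only in pairwise comparisons (the work of discarding results for expired tuples is ignored). The functions `full_join`, `incremental_join`, and `demo`, and the `(id, key)` tuple layout, are hypothetical.

```python
def full_join(win_a, win_b):
    """Recompute the whole join; returns (pairs, comparisons)."""
    pairs = {(a, b) for a in win_a for b in win_b if a[1] == b[1]}
    return pairs, len(win_a) * len(win_b)

def incremental_join(prev_pairs, old_a, old_b, new_a, new_b):
    """Update a previous result after both windows slide.

    old_a/old_b: tuples that survive the slide; new_a/new_b: arrivals.
    """
    keep_a, keep_b = set(old_a), set(old_b)
    # Keep previous results whose two sides are both still in the windows.
    pairs = {(a, b) for a, b in prev_pairs if a in keep_a and b in keep_b}
    comparisons = 0
    for a in new_a:                 # new A tuples against all of B
        for b in old_b + new_b:
            comparisons += 1
            if a[1] == b[1]:
                pairs.add((a, b))
    for a in old_a:                 # surviving A tuples against new B only
        for b in new_b:
            comparisons += 1
            if a[1] == b[1]:
                pairs.add((a, b))
    return pairs, comparisons

def demo(size=1000, slide=10):
    """Return (results_match, full_cost, incremental_cost) for one slide."""
    stream_a = [(i, i % 7) for i in range(size + slide)]  # (id, key)
    stream_b = [(i, i % 5) for i in range(size + slide)]
    prev, _ = full_join(stream_a[:size], stream_b[:size])
    full, full_cost = full_join(stream_a[slide:], stream_b[slide:])
    inc, inc_cost = incremental_join(
        prev,
        stream_a[slide:size], stream_b[slide:size],
        stream_a[size:], stream_b[size:],
    )
    return full == inc, full_cost, inc_cost
```

With windows of 1000 sliding by 10, full recalculation performs 1000 × 1000 = 1,000,000 comparisons, while this incremental version performs 10 × 1000 + 990 × 10 = 19,900 and yields the same result set.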

  7. Language Issues
     • Rebuttal
       • STREAM does not provide a stream-to-stream operator
     • Analysis
       • The absence of a stream-to-stream operator is not explicitly justified in the paper
       • Its absence is reasonable because STREAM operators treat all input as relations
       • STREAM does provide operators for converting streams to relations and for converting relations to streams
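Composing the conversion operators gives the effect of a stream-to-stream operator, which may help explain why a dedicated one is absent. The sketch below is only an analogy in Python: `rows_window` is loosely modelled on CQL's row-based window and `istream` on CQL's Istream operator, but the function signatures, the `(ts, value)` tuple layout, and the one-arrival-per-timestamp simplification are assumptions made here.

```python
from collections import Counter

def rows_window(stream, n):
    """Stream-to-relation: after each arrival, the last n values."""
    window = []
    for ts, value in stream:
        window = (window + [value])[-n:]
        yield ts, list(window)

def select(relations, predicate):
    """Relation-to-relation: filter each instantaneous relation."""
    for ts, relation in relations:
        yield ts, [v for v in relation if predicate(v)]

def istream(relations):
    """Relation-to-stream: emit tuples present now but not one step ago."""
    previous = Counter()
    for ts, relation in relations:
        current = Counter(relation)
        # Counter subtraction keeps only positive counts (bag difference).
        for value in (current - previous).elements():
            yield ts, value
        previous = current

# Stream in, stream out, with relations only in the middle.
readings = [(1, 5), (2, 12), (3, 7), (4, 20)]  # (ts, value)
output = list(istream(select(rows_window(readings, 2), lambda v: v > 6)))
```

Here `output` is again a stream of `(ts, value)` pairs, even though every intermediate step operates on relations.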

  8. Language Issues (cont.)
     • Rebuttal
       • STREAM uses an append-only model
       • It does not provide an operator for updating data values in a stream
     • Analysis
       • Although not perfect, this is a common assumption in current stream processing papers

  9. Conclusion
     • Foundation
     • Windows
     • Joins
     • Full Recalculation Strategy
     • Language Issues
