A deterministic algorithm for summarizing asynchronous streams over a sliding window
This presentation is the property of its rightful owner.
Sponsored Links
1 / 58

A Deterministic Algorithm for Summarizing Asynchronous Streams over a Sliding Window PowerPoint PPT Presentation


  • 88 Views
  • Uploaded on
  • Presentation posted in: General

A Deterministic Algorithm for Summarizing Asynchronous Streams over a Sliding Window. Costas Busch Rensselaer Polytechnic Institute Srikanta Tirthapura Iowa State University. Outline of Talk. Introduction. Algorithm. Analysis. Time. Data stream:. For simplicity assume

Download Presentation

A Deterministic Algorithm for Summarizing Asynchronous Streams over a Sliding Window

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


A deterministic algorithm for summarizing asynchronous streams over a sliding window

A Deterministic Algorithm forSummarizing Asynchronous Streamsover a Sliding Window

Costas Busch

Rensselaer Polytechnic Institute

Srikanta Tirthapura

Iowa State University


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Outline of Talk

Introduction

Algorithm

Analysis


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Time

Data stream:

For simplicity assume

unit valued elements


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Most recent time window of duration

Current

time

Data stream:

Goal:

Compute the sum of elements with

time stamps in time window


A deterministic algorithm for summarizing asynchronous streams over a sliding window

  • Example I: All packets on a network link, maintain the number of different ip sources in the last one hour

  • Example II: Large database, continuously maintain averages and frequency moments


A deterministic algorithm for summarizing asynchronous streams over a sliding window

  • Synchronous stream

    • ti: In ascending order

  • Asynchronous stream

    • ti: No order guaranteed

Data stream:


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Why Asynchronous Data Streams?

Synchronous stream

Asynchronous stream

Network delay & multi-path routing

Synchronous

Asynchronous

Synchronous

Merge w/o control


A deterministic algorithm for summarizing asynchronous streams over a sliding window

  • Processing Requirements:

    • One pass processing

    • Small workspace: poly-logarithmic in the size of data

    • Fast processing time per element

    • Approximate answers are ok


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Our results:

A deterministic data aggregation algorithm

Time:

Space:

Relative Error:


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Previous Work:

[Datar, Gionis, Indyk, Motwani.

SIAM Journal on Computing, 2002]

Deterministic, Synchronous

Merging buckets

[Tirthapura, Xu, Busch, PODC, 2006]

Randomized, Asynchronous

Random sampling


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Outline of Talk

Introduction

Algorithm

Analysis


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Time

Current

time

Data stream:

For simplicity assume

unit valued elements


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Most recent time window of duration

Current

time

Data stream:

Goal:

Compute the sum of elements with

time stamps in time window


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Divide time into periods of duration


A deterministic algorithm for summarizing asynchronous streams over a sliding window

sliding window

The sliding window may span at most

two time periods


A deterministic algorithm for summarizing asynchronous streams over a sliding window

sliding window

Sum can be written as two sub-sums

In two time periods


A deterministic algorithm for summarizing asynchronous streams over a sliding window

sliding window

Data structure that

maintains an estimate of

In left time period


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Without loss of Generality,

Consider data structure

in time period


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Data structure consists of various levels

is an upper bound of the sum in a period


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Consider level

Bucket at Level

Time period

Counts up to elements


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Stream:

Increase counter value


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Stream:

Increase counter value


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Stream:

Increase counter value


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Stream:

Increase counter value


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Stream:

Split bucket

Counter threshold of reached


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Stream:

New buckets have threshold also


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Stream:

Increase appropriate bucket


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Stream:

Increase appropriate bucket


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Stream:

Increase appropriate bucket


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Stream:

Split bucket


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Stream:


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Stream:

Increase appropriate bucket


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Stream:

Split bucket


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Stream:


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Splitting Tree


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Max depth =

Leaf buckets of duration 1

are not split any further


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Leaf buckets

The initial bucket may be split into

many buckets


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Leaf buckets

Due to space limitations

we only keep the last

buckets


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Suppose we want to find the sum

of elements in time period


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Consider various levels

of splitting threshold


A deterministic algorithm for summarizing asynchronous streams over a sliding window

First level with a leaf bucket

that intersects timeline


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Estimate of S:

Consider buckets on right of timeline


A deterministic algorithm for summarizing asynchronous streams over a sliding window

OR

First level with a leaf bucket

On right timeline


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Outline of Talk

Introduction

Algorithm

Analysis


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Suppose that we use level

in order to compute the estimate


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Stream:

Consider splitting threshold level

A data element is counted in the

appropriate bucket


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Stream:

We can assume that the element is

placed in the respective bucket


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Stream:

We can assume that when bucket

splits the element is placed in an

arbitrary child bucket


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Stream:

If:

GOOD!

Element counted in correct bucket


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Stream:

If:

BAD!

Element counted in wrong bucket


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Consider Leaf Buckets

GOOD!

If


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Consider Leaf Buckets

BAD!

If

Element counted in wrong bucket


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Consider Leaf Buckets

:elements of left part counted on right

:elements of right part counted on left


A deterministic algorithm for summarizing asynchronous streams over a sliding window

elements of left part

counted on right

Must have been initially inserted

in one of these buckets


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Since tree depth


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Since tree depth

Similarly, we can prove

Therefore:


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Since

It can be proven


A deterministic algorithm for summarizing asynchronous streams over a sliding window

Since

It can be proven

Combined with

We obtain relative error :


  • Login