Online Interval Skyline Queries on Time Series

Bin Jiang, Jian Pei


Outline

  • Problem Definition

  • An On-the-fly Method

    • Interval Skyline Query Answering Algorithm

    • Online Interval Skyline Query Algorithm

      • Radix Priority Search Tree

  • A View-Materialization Method

    • Non-redundant skyline time series---NRSky[i:j]

  • Experiments


Problem Definition

  • Notions

    • Time Series: A time series s consists of a set of (value, timestamp) pairs. We denote the value of s at timestamp i by s[i], and write s as the sequence of values s[1], s[2], …

    • Time Interval: a range in time, denoted as [i:j] with i ≤ j. We write

      k ∈ [i:j] if i ≤ k ≤ j; [i′:j′] ⊆ [i:j] if i ≤ i′ ≤ j′ ≤ j.

Some Notions in This Paper


Problem Definition

  • Interval Skyline

    • Given a set S of time series and an interval [i:j], the interval skyline is the set of time series that are not dominated by any other time series in [i:j], denoted by Sky[i:j].

Suppose S = {S1, S2, S3}.

S1 and S2 are in Sky[16:22], while S3 is dominated by S2.

[Figure: the time series S1, S2, and S3]
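
The definition can be checked directly with a brute-force sketch. It assumes that larger values are preferred, so that s dominates q in [i:j] when s[k] ≥ q[k] at every timestamp k in [i:j] and s[k] > q[k] at some timestamp; the transcript does not spell this out. The function names and the toy data are hypothetical and are not the series from the figure.

```python
def dominates(s, q, i, j):
    """True if s dominates q in [i:j] (1-based, inclusive).

    Assumption: larger values are better.
    """
    a, b = s[i - 1:j], q[i - 1:j]
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def interval_skyline_naive(series, i, j):
    """Names of all series that no other series dominates in [i:j]."""
    return {
        name
        for name, s in series.items()
        if not any(dominates(q, s, i, j) for other, q in series.items() if other != name)
    }


# Hypothetical toy data: S2 dominates S3, while S1 and S2 are incomparable.
toy = {"S1": [1, 5], "S2": [4, 3], "S3": [2, 2]}
print(sorted(interval_skyline_naive(toy, 1, 2)))
```

This compares every pair of series at every timestamp, which is the cost the methods in the rest of the talk try to avoid.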


Problem Definition

  • Interval Skyline

    Property 1: If there exist timestamps k1, …, kl (i ≤ k1 < … < kl ≤ j) such that s[km] = max{s′[km] : s′ ∈ S} for every m = 1, …, l,

    and s is the only such time series, then

    s is in Sky[i:j].


Problem Definition

  • Problem Definition

    • Given a set of time series S such that each time series is in the base interval W = [1:w], we want to maintain a data structure D such that any interval skyline query on an interval [i:j] ⊆ W can be answered efficiently using D.

  • Methods

    • An On-The-Fly Method

      • Original Interval Skyline Query Algorithm

      • Online Interval Skyline Query Algorithm

    • A View-Materialization Method


An Interval Skyline Query Algorithm

  • Idea

    Using the maximum and minimum values of the time series in [i:j], we can decide some dominance relations without comparing the series timestamp by timestamp: a time series whose maximum is smaller than another's minimum is dominated by it.


An Interval Skyline Query Algorithm

  • Algorithm

  • Initialize the current skyline set Sky to the empty set;

  • Sort the time series into a list L in descending order of their maximum values in [i:j];

  • Let maxmin be the largest of the minimum values in [i:j] of the time series in Sky;

  • For each time series s in L that satisfies s.max ≥ maxmin, check whether it is dominated by any time series in Sky. If it is not dominated:

  • add it to Sky;

  • remove from Sky the time series that it dominates;

  • update maxmin;

  • Return Sky;
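
The steps above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes larger values are preferred, recomputes max and min by scanning the interval, and stops as soon as a series' maximum falls below maxmin (every later series in L is then dominated as well). The names `dominates`, `interval_skyline`, and the toy data are hypothetical.

```python
def dominates(s, q, i, j):
    """True if s dominates q in [i:j] (1-based, inclusive); larger is better."""
    a, b = s[i - 1:j], q[i - 1:j]
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def interval_skyline(series, i, j):
    """Skyline in [i:j], pruning with each series' max and the running maxmin."""
    # List L: series names in descending order of their maximum in [i:j].
    order = sorted(series, key=lambda n: max(series[n][i - 1:j]), reverse=True)
    sky = []
    maxmin = float("-inf")
    for name in order:
        s = series[name]
        if max(s[i - 1:j]) < maxmin:
            break  # s and everything after it in L is dominated
        if any(dominates(series[q], s, i, j) for q in sky):
            continue  # s is dominated by a skyline series
        # Remove the series that s dominates, then add s and update maxmin.
        sky = [q for q in sky if not dominates(s, series[q], i, j)]
        sky.append(name)
        maxmin = max(min(series[q][i - 1:j]) for q in sky)
    return sky


# Made-up data (not from the slides), queried on the interval [2:3].
data = {
    "s1": [7, 5, 3],
    "s2": [0, 9, 1],
    "s3": [1, 2, 8],
    "s4": [9, 3, 2],
    "s5": [2, 6, 4],
}
print(interval_skyline(data, 2, 3))
```

On this data the series are visited in the order s2, s3, s5, s1, s4: the first three enter Sky, s1 is dominated by s5, and s4 is cut off by the maxmin test.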


An Interval Skyline Query Algorithm

  • Example

Goal: compute the skyline in interval [2:3]

Steps:

1. s2->Sky, maxmin =1

2. s3->Sky, maxmin =2

3. s5->Sky, maxmin =4

4. s5 dominates s1, so s1 is discarded; maxmin = 4

5. s4.max = 3 < 4 = maxmin, so s4 is discarded.

Return Sky={s2,s3,s5}


An Interval Skyline Query Algorithm

  • Disadvantage

    Computing the max value of each time series and min[i:j] for the query interval [i:j] from scratch at query time is costly.

  • Improvement Idea

  • Utilize Radix Priority Search Tree to maintain the min[i:j]

  • Use a sketch to keep the max value for each time series


Online Interval Skyline Query Algorithm

  • Radix Priority Search Tree

    Radix Priority Search Tree is a two-dimensional data structure, a hybrid of a heap on one dimension and a binary search tree on the other dimension.

  • Advantages:

    • Insertion in O(h)

    • Deletion in O(h)

    • Query in O(h)

  • h: the height of the tree


Online Interval Skyline Query Algorithm

  • Radix Priority Search Tree

    • Build

      • Use the timestamps as the binary tree dimension X and the data value as the heap dimension Y;

      • Map W into a fixed domain of X, {0,1,...,w-1};

      • The height of the tree is O(logw)

    • Update W → W′

      One insertion: s[tc]

      One deletion: s[tc − w]

      tc: the most recent timestamp
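
A compact sketch of such a tree for one time series is given below. It is an illustration under stated assumptions, not the authors' code: timestamps are assumed to be already mapped into {0, ..., w-1} and to be distinct, the heap dimension is a min-heap on the value so that the tree answers min[i:j], and the class and method names (`RadixPST`, `insert`, `delete`, `min_in`) are hypothetical.

```python
class _Node:
    __slots__ = ("pt", "left", "right")

    def __init__(self, pt):
        self.pt, self.left, self.right = pt, None, None


class RadixPST:
    """Radix split on x in {0..size-1}; min-heap on y. Distinct x assumed."""

    def __init__(self, size):
        self.size, self.root = size, None

    def insert(self, x, y):
        self.root = self._insert(self.root, 0, self.size, (x, y))

    def _insert(self, node, lo, hi, pt):
        if node is None:
            return _Node(pt)
        if pt[1] < node.pt[1]:          # keep the smaller value higher up
            node.pt, pt = pt, node.pt
        mid = (lo + hi) // 2            # radix split of the fixed x-domain
        if pt[0] < mid:
            node.left = self._insert(node.left, lo, mid, pt)
        else:
            node.right = self._insert(node.right, mid, hi, pt)
        return node

    def delete(self, x):
        self.root = self._delete(self.root, 0, self.size, x)

    def _delete(self, node, lo, hi, x):
        if node is None:
            return None
        if node.pt[0] == x:
            return self._pull_up(node)
        mid = (lo + hi) // 2
        if x < mid:
            node.left = self._delete(node.left, lo, mid, x)
        else:
            node.right = self._delete(node.right, mid, hi, x)
        return node

    def _pull_up(self, node):
        """Fill the hole at node with the smaller child point, recursively."""
        l, r = node.left, node.right
        if l is None and r is None:
            return None
        if r is None or (l is not None and l.pt[1] <= r.pt[1]):
            node.pt, node.left = l.pt, self._pull_up(l)
        else:
            node.pt, node.right = r.pt, self._pull_up(r)
        return node

    def min_in(self, a, b):
        """Smallest y among points with a <= x <= b, or None."""
        best, stack = None, [(self.root, 0, self.size)]
        while stack:
            node, lo, hi = stack.pop()
            if node is None or hi <= a or lo > b:
                continue                # empty or disjoint x-range
            if best is not None and node.pt[1] >= best:
                continue                # heap order: nothing smaller below
            if a <= node.pt[0] <= b:
                best = node.pt[1]       # subtree minimum lies in the range
                continue
            mid = (lo + hi) // 2
            stack.append((node.left, lo, mid))
            stack.append((node.right, mid, hi))
        return best


tree = RadixPST(4)
for x, y in [(0, 5), (1, 3), (2, 4), (3, 1)]:
    tree.insert(x, y)
print(tree.min_in(0, 2), tree.min_in(2, 3))
```

Because the x-domain is fixed, the tree never needs rebalancing; each operation follows the halving of the domain from the root downwards.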


Maintain max values Using Sketches

  • Sketches

    • A pair (v, t) is maintained only if there is no other pair (v1, t1) such that v1 > v and t1 > t;

    • These pairs form the skyline of points in the interval;

    • The expected number of points in the skyline is O(logw);

    • With the sketches, finding the maximum value in W costs O(1) time ;

W = [1:3]

Sketches: (4,1), (3,2), (2,3)

W = [1:4]

Sketches: (5,4)
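
The sketch can be maintained with a double-ended queue, as in the minimal example below. It assumes a sliding window holding the most recent w timestamps; the class name `MaxSketch` and its methods are hypothetical. The values 4, 3, 2, 5 at timestamps 1 to 4 are the ones implied by the sketches listed above.

```python
from collections import deque


class MaxSketch:
    """Keeps the pairs (v, t) that no later pair with a larger value beats."""

    def __init__(self, w):
        self.w = w                 # window size (number of timestamps kept)
        self.pairs = deque()       # values are non-increasing from left to right

    def push(self, value, t):
        # Older pairs with a smaller value can never be the maximum again.
        while self.pairs and self.pairs[-1][0] < value:
            self.pairs.pop()
        self.pairs.append((value, t))
        # Drop pairs whose timestamp has left the window [t-w+1 : t].
        while self.pairs[0][1] <= t - self.w:
            self.pairs.popleft()

    def max(self):
        return self.pairs[0][0]    # O(1): the leftmost pair holds the maximum


sketch = MaxSketch(w=4)
for t, v in enumerate([4, 3, 2], start=1):
    sketch.push(v, t)
print(list(sketch.pairs))          # sketch for W = [1:3]
sketch.push(5, 4)
print(list(sketch.pairs))          # sketch for W = [1:4]
```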


Online Interval Skyline Query Algorithm

  • Complexity

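    • Space

      • Radix priority search tree: O(w) per time series

      • Sketch of the max values: O(log w) expected per time series

        Total for n time series: O(nw)

    • Time

      • Radix priority search tree: O(log w) per time series

      • Sketch of the max values: O(log w) per time series

        Total for n time series: O(n log w)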


A View-Materialization Method

  • Non-redundant interval skylines

    A time series s is called a non-redundant skyline time series in interval [i:j] if

    • s is in the skyline in interval [i:j]

    • s is not in the skyline in any proper subinterval [i′:j′] ⊂ [i:j]

      Such an interval is a minimal skyline interval of s. By the pigeonhole principle, a time series has at most w minimal skyline intervals: if there were more than w, at least two of them would share the same starting timestamp, so the longer one would contain the shorter one and could not be minimal.
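
The definition can be made concrete with a brute-force sketch that computes NRSky[i:j] for every interval in [1:w]. It only illustrates the definition and is far slower than the maintenance procedure described later; it assumes larger values are preferred, and all names and data are hypothetical.

```python
from itertools import combinations_with_replacement


def dominates(s, q, i, j):
    """True if s dominates q in [i:j] (1-based, inclusive); larger is better."""
    a, b = s[i - 1:j], q[i - 1:j]
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def skyline(series, i, j):
    return {
        n for n in series
        if not any(dominates(series[m], series[n], i, j) for m in series if m != n)
    }


def non_redundant_skylines(series, w):
    """Map every interval (i, j) within [1:w] to NRSky[i:j], by brute force."""
    sky = {
        (i, j): skyline(series, i, j)
        for i, j in combinations_with_replacement(range(1, w + 1), 2)
    }
    nrsky = {}
    for (i, j), members in sky.items():
        in_sub = set()  # series already in the skyline of a proper subinterval
        for (a, b), sub in sky.items():
            if i <= a and b <= j and (a, b) != (i, j):
                in_sub |= sub
        nrsky[(i, j)] = members - in_sub
    return nrsky


# Made-up data over timestamps 1..3.
toy = {"a": [3, 1, 2], "b": [1, 3, 2], "c": [2, 2, 1]}
for interval, members in sorted(non_redundant_skylines(toy, 3).items()):
    print(interval, sorted(members))
```

In this toy data, c is in no single-timestamp skyline but is in Sky[1:2], so [1:2] is its only minimal skyline interval.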


Useful Theories


A View-Materialization Method

  • Idea

    Suppose all non-redundant interval skylines are materialized. To answer a query on [i:j], we take the union of the non-redundant skylines over all subintervals of [i:j] and remove the time series that fail Lemma 2.

    • Algorithm
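
The algorithm listing itself is not preserved in this transcript. As a rough sketch of the idea only, the code below unions the materialized non-redundant skylines of all subintervals of [i:j] and then filters the candidates with a direct dominance test. That test is a stand-in chosen for illustration; it is not the condition of Lemma 2, which is not reproduced here. The names, the dictionary layout of `nrsky`, and the data are hypothetical, and larger values are assumed to be preferred.

```python
def dominates(s, q, i, j):
    """True if s dominates q in [i:j] (1-based, inclusive); larger is better."""
    a, b = s[i - 1:j], q[i - 1:j]
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def query_from_views(series, nrsky, i, j):
    """Answer Sky[i:j] from materialized non-redundant skylines."""
    # Every skyline series is non-redundant in some subinterval of [i:j],
    # so the union below is a superset of the answer.
    candidates = set()
    for (a, b), members in nrsky.items():
        if i <= a and b <= j:
            candidates |= members
    # Stand-in filter (assumption): drop candidates dominated in [i:j].
    return {
        n for n in candidates
        if not any(dominates(series[m], series[n], i, j) for m in candidates if m != n)
    }


# Made-up data over timestamps 1..3 and its non-redundant skylines.
toy = {"a": [3, 1, 2], "b": [1, 3, 2], "c": [2, 2, 1]}
views = {
    (1, 1): {"a"}, (2, 2): {"b"}, (3, 3): {"a", "b"},
    (1, 2): {"c"}, (2, 3): set(), (1, 3): set(),
}
print(sorted(query_from_views(toy, views, 2, 3)))
```

For the query [2:3], the union gives {a, b}; a is in Sky[3:3] but is dominated by b over [2:3], which is why a filtering step is needed after the union.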


A View-Materialization Method

  • Example

W= [2:4]

Goal: compute the interval skyline in [3:4]

Steps:

1. s3->Sky

2. s4->Sky

3. s1->Sky(s2 is dominated by s1)

Return Sky={s1,s3,s4}

How to maintain the non-redundant skylines ?


Maintain Non-Redundant Interval Skylines

  • Steps


Maintain Non-Redundant Interval Skylines

  • Step1

    • Use the on-the-fly algorithm to obtain the interval skyline in the new interval W′.

    • Find possible false negatives.


Maintain Non-Redundant Interval Skylines

  • Step2-Shared Divide-and-Conquer Algorithm

    • This algorithm is an extension of the divide-and-conquer (DC) algorithm.

    • In the shared divide-and-conquer (SDC) algorithm, a space is defined as a time interval. Each timestamp represents a dimension.

    • The related spaces (intervals) are organized as a path, e.g., [j:j], [j-1:j], ..., [i:j] (i < j).


Divide-and-Conquer Algorithm

[Figure: divide step and merge step of the DC algorithm on points P1 to P5 in dimensions A and B; the medians mA and mB split the points into partitions S1, S2 and then S11, S12, S21, S22]
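
A simplified divide-and-conquer skyline on plain points is sketched below for orientation. It is not the SDC algorithm from the talk and it uses a simpler merge than the classic DC algorithm: the points are sorted on the first dimension, split at the median, and the skyline of the lower half is filtered against the skyline of the upper half. Larger values are assumed to be preferred; the function names and the five points are hypothetical and are not the P1 to P5 of the figure.

```python
def dominates(p, q):
    """True if point p dominates point q; larger is better in every dimension."""
    return all(a >= b for a, b in zip(p, q)) and any(a > b for a, b in zip(p, q))


def dc_skyline(points):
    """Skyline of a collection of equal-length tuples, by divide and conquer."""
    return _dc(sorted(set(points)))  # lexicographic sort, duplicates removed


def _dc(pts):
    if len(pts) <= 1:
        return list(pts)
    mid = len(pts) // 2                      # divide at the median position
    low, high = _dc(pts[:mid]), _dc(pts[mid:])
    # Merge: a point in `low` sorts before every point in `high`, so it can
    # never dominate one of them; only `low` has to be filtered.
    kept = [p for p in low if not any(dominates(q, p) for q in high)]
    return kept + high


pts = [(1, 5), (2, 4), (3, 3), (2, 2), (4, 1)]
print(dc_skyline(pts))
```

Here (2, 2) is removed in a merge step because (2, 4) dominates it; the other four points are mutually incomparable.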


SDC Algorithm

  • Comparisons

  • Results


Maintain Non-Redundant Interval Skylines

  • Step3-Remove “redundant time series”


Experiments

  • Parameters


Experiments

  • Synthetic Data Sets

    • Data Sets Properties

    • Query Efficiency


Experiments

  • Synthetic Data Sets

    • Update Efficiency

    • Space Cost


Experiments

  • Stock Data Sets

    • Query Time


Q&A

