caching dynamic skyline queries
Download
Skip this Video
Download Presentation
Caching Dynamic Skyline Queries

Loading in 2 Seconds...

play fullscreen
1 / 50

Caching Dynamic Skyline Queries - PowerPoint PPT Presentation


  • 151 Views
  • Uploaded on

Caching Dynamic Skyline Queries. D. Sacharidis 1 , P. Bouros 1 , T. Sellis 1,2 1 National Technical University of Athens 2 Institute for Management of Information Systems – R.C. Athena. Outline. Introduction Skyline (SL) and dynamic skyline queries (DSL) Related work

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about ' Caching Dynamic Skyline Queries' - liora


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
caching dynamic skyline queries

Caching Dynamic Skyline Queries

D. Sacharidis1, P. Bouros1, T. Sellis1,2

1National Technical University of Athens

2Institute for Management of Information Systems – R.C. Athena

HDMS\'08

outline
Outline
  • Introduction
    • Skyline (SL) and dynamic skyline queries (DSL)
  • Related work
  • Evaluating dynamic skyline queries
    • Computing orthant skylines (OSL)
    • Computing dynamic skyline via caching
      • LRU, LFU, LPP cache replacement policies
  • Experimental evaluation
  • Conclusions and Future work

HDMS\'08

skyline queries sl
Skyline queries (SL)
  • Given a dataset of d-dimensional points
    • SL contains points not dominated by others
    • x dominates y iff x as good as y in all dimensions and strictly better in at least one

HDMS\'08

skyline queries sl1
Skyline queries (SL)
  • Given a dataset of d-dimensional points
    • SL contains points not dominated by others
    • x dominates y iff x as good as y in all dimensions and strictly better in at least one

Price

Distance from sea

  • Example
    • Dataset of hotels
    • Prefer cheap hotels close to the sea

HDMS\'08

skyline queries sl2
Skyline queries (SL)
  • Given a dataset of d-dimensional points
    • SL contains points not dominated by others
    • x dominates y iff x as good as y in all dimensions and strictly better in at least one

Skyline points

Price

Distance from sea

  • Example
    • Dataset of hotels
    • Prefer cheap hotels close to the sea

HDMS\'08

skyline queries sl3
Skyline queries (SL)
  • Given a dataset of d-dimensional points
    • SL contains points not dominated by others
    • x dominates y iff x as good as y in all dimensions and strictly better in at least one

Skyline points

Price

p1

Distance from sea

  • Example
    • Dataset of hotels
    • Prefer cheap hotels close to the sea

HDMS\'08

skyline queries sl4
Skyline queries (SL)
  • Given a dataset of d-dimensional points
    • SL contains points not dominated by others
    • x dominates y iff x as good as y in all dimensions and strictly better in at least one

Skyline points

Price

p1

p2

Distance from sea

  • Example
    • Dataset of hotels
    • Prefer cheap hotels close to the sea

HDMS\'08

dynamic skyline queries dsl
Dynamic skyline queries (DSL)
  • Extension of skyline queries
    • Given a query point q
    • DSL contains points not dynamically dominated by others w.r.t q
    • x dynamically dominates y iff x as preferable as y w.r.t. q in all dimensions and strictly more preferable w.r.t. q in at least one
  • Can be treated as static SL
    • Transform points w.r.t. q

HDMS\'08

dynamic skyline queries dsl1
Dynamic skyline queries (DSL)
  • Extension of skyline queries
    • Given a query point q
    • DSL contains points not dynamically dominated by others w.r.t q
    • x dynamically dominates y iff x as preferable as y w.r.t. q in all dimensions and strictly more preferable w.r.t. q in at least one
  • Can be treated as static SL
    • Transform points w.r.t. q

Query point q

Price

Distance from sea

  • Example
    • User defines “ideal” hotel q

HDMS\'08

dynamic skyline queries dsl2
Dynamic skyline queries (DSL)
  • Extension of skyline queries
    • Given a query point q
    • DSL contains points not dynamically dominated by others w.r.t q
    • x dynamically dominates y iff x as preferable as y w.r.t. q in all dimensions and strictly more preferable w.r.t. q in at least one
  • Can be treated as static SL
    • Transform points w.r.t. q

Dynamic Skyline points

Price

q

Distance from sea

  • Example
    • User defines “ideal” hotel q

HDMS\'08

dynamic skyline queries dsl3
Dynamic skyline queries (DSL)
  • Extension of skyline queries
    • Given a query point q
    • DSL contains points not dynamically dominated by others w.r.t q
    • x dynamically dominates y iff x as preferable as y w.r.t. q in all dimensions and strictly more preferable w.r.t. q in at least one
  • Can be treated as static SL
    • Transform points w.r.t. q

p4

p5

Dynamic Skyline points

Price

q

Distance from sea

  • Example
    • User defines “ideal” hotel q

HDMS\'08

intuition 1
Intuition (1)
  • Traditional SL algorithms need to run anew for each DSL query
  • Our idea
    • Exploit results from past queries to reduce processing cost for future DSL queries
    • Cache past queries
    • Decide which queries in cache are useful

HDMS\'08

intuition 2
Intuition (2)

Price

Distance from sea

HDMS\'08

intuition 21
Intuition (2)
  • 2 past DSL queries
    • qa, qb
  • Each query partitions space in 4quadrants

qa

Price

qb

Distance from sea

HDMS\'08

intuition 3
Intuition (3)

p4

  • A new query q arrives
  • Consider DSL for qa
    • p1 is contained DSL(qa)
    • p1 dominates p2, p3, p4
  • p1 lies in upper right quadrant w.r.t. qa
  • qa lies in upper right quadrant w.r.t. q
  • p1 dominates also p2, p3,p4 w.r.t. to q
    • Exclude p2, p3, p4 from dominance test for DSL(q)

p2

p1

p3

qa

q

Price

qb

Distance from sea

  • Shaded area denotes points dominated by p1

HDMS\'08

contribution in brief
Contribution in brief
  • Caching past DSL queries cannot reduce processing cost for future ones
    • We need more information about dominance relationships
  • Introduce orthantskylines (OSL) and examine their relationship with DSL
  • ExtendBitmap algorithm to compute OSL in parallel with DSL
  • Cache OSL to enhance DSL queries evaluation
    • Present 3 cache replacement policies
      • LRU, LFU, LPP
  • Experimental evaluation of caching mechanism

HDMS\'08

related work
Related work
  • Non-indexed methods
    • Block-Nested Loops (BnL)
    • Bitmap
    • Multidimensional Divide and Conquer (DnC)
    • Sort First Scan (SFS)
  • Index-based methods
    • B-tree
      • sort points according to the lowest valued coordinate
    • R-tree
      • Nearest neighbor based (NN)
      • Branch and bound (BBS)

HDMS\'08

related work1
Related work
  • Non-indexed methods
    • Block-Nested Loops (BnL)
    • Bitmap
    • Multidimensional Divide and Conquer (DnC)
    • Sort First Scan (SFS)
  • Index-based methods
    • B-tree
      • sort points according to the lowest valued coordinate
    • R-tree
      • Nearest neighbor based (NN)
      • Branch and bound (BBS)

HDMS\'08

bitmap
Bitmap
  • BnL variant
  • Suitable for domains with low cardinality and discrete
  • In brief
    • Computes a bitmap representation of the points in the dataset
    • Examines each point separately (dominance test)
      • Checks whether it is contained in the skyline or not
      • Exploits fast bitwise operations OR/AND

HDMS\'08

bitmap dominance test
Bitmap – Dominance test
  • For each point p
    • Define A = A1 & A2 & … & Ad
      • Denotes the points as good as p in all dimensions
    • Define B = B1 | B2 | … | Bd
      • Denotes the points strictly better than p in at least one dimension
    • Dominance test:
      • If C = A & B has all bits set to 0 then p is in SL

HDMS\'08

orthant skyline osl
Orthant skyline (OSL)
  • OSL provides more information about dominance relationships than DSL
    • Useful for pruning
  • Given a dataset of d-dimensional points and a query point q
    • Space partitioned in 2d orthants
    • o-th orthant skyline (OSL) of q contains points of the o-th orthantnot dynamically dominated by others inside orthant o w.r.t q

HDMS\'08

orthant skyline osl1
Orthant skyline (OSL)

Quadrant 1

Quadrant 0

  • OSL provides more information about dominance relationships than DSL
    • Useful for pruning
  • Given a dataset of d-dimensional points and a query point q
    • Space partitioned in 2d orthants
    • o-th orthant skyline (OSL) of q contains points of the o-th orthantnot dynamically dominated by others inside orthant o w.r.t q

Query point q

Price

Distance from sea

Quadrant 2

Quadrant 3

HDMS\'08

orthant skyline osl2
Orthant skyline (OSL)

Quadrant 1

Quadrant 0

  • OSL provides more information about dominance relationships than DSL
    • Useful for pruning
  • Given a dataset of d-dimensional points and a query point q
    • Space partitioned in 2d orthants
    • o-th orthant skyline (OSL) of q contains points of the o-th orthantnot dynamically dominated by others inside orthant o w.r.t q

Query point q

Price

Distance from sea

Quadrant 2

Quadrant 3

HDMS\'08

orthant skyline osl3
Orthant skyline (OSL)

Quadrant 1

Quadrant 0

  • OSL provides more information about dominance relationships than DSL
    • Useful for pruning
  • Given a dataset of d-dimensional points and a query point q
    • Space partitioned in 2d orthants
    • o-th orthant skyline (OSL) of q contains points of the o-th orthantnot dynamically dominated by others inside orthant o w.r.t q

Query point q

Quadrant 2 skyline points

Price

Distance from sea

Quadrant 2

Quadrant 3

HDMS\'08

osl and dsl relationship
OSL and DSL relationship

Quadrant 1

Quadrant 0

Price

q

Distance from sea

Quadrant 2

Quadrant 3

HDMS\'08

osl and dsl relationship1
OSL and DSL relationship

Quadrant 1

Quadrant 0

Price

q

Distance from sea

Quadrant 2

Quadrant 3

HDMS\'08

osl and dsl relationship2
OSL and DSL relationship

Quadrant 1

Quadrant 0

  • Map points from quadrants 1,2,3 to points inside quadrant 0

Price

q

Distance from sea

Quadrant 2

Quadrant 3

HDMS\'08

osl and dsl relationship3
OSL and DSL relationship

Quadrant 1

Quadrant 0

  • Map points from quadrants 1,2,3 to points inside quadrant 0
  • Compute DSL w.r.t. q

Price

q

Distance from sea

Quadrant 2

Quadrant 3

HDMS\'08

osl and dsl relationship4
OSL and DSL relationship

Quadrant 1

Quadrant 0

  • Map points from quadrants 1,2,3 to points inside quadrant 0
  • Compute DSL w.r.t. q
  • Union of all OSLs is superset of DSL w.r.t. to q

Price

q

Distance from sea

Quadrant 2

Quadrant 3

HDMS\'08

osl and dsl relationship5
OSL and DSL relationship

Quadrant 1

Quadrant 0

  • Map points from quadrants 1,2,3 to points inside quadrant 0
  • Compute DSL w.r.t. q
  • Union of all OSLs is superset of DSL w.r.t. to q

p1

p2

Price

q

Distance from sea

Quadrant 2

Quadrant 3

HDMS\'08

osl and dsl relationship6
OSL and DSL relationship

Quadrant 1

Quadrant 0

  • Map points from quadrants 1,2,3 to points inside quadrant 0
  • Compute DSL w.r.t. q
  • Union of all OSLs is superset of DSL w.r.t. to q

p2

Price

q

p3

Distance from sea

Quadrant 2

Quadrant 3

HDMS\'08

computing orthant skylines
Computing orthant skylines
  • Algorithm DBM
    • Extends Bitmap to compute DSL and OSLs at the same time
  • Method:
    • Compute bitmap representation
      • Transform each point coordinates w.r.t. to query q
    • Dominance test, point p, orthant o
      • p not in OSLo and not in DSL
      • p not in DSL, but in OSLo
      • p in DSL and in OSLo

HDMS\'08

dynamic skylines via caching
Dynamic skylines Via Caching
  • Cache OSLs instead of DSLs
    • Query cache contains (query point qj, OSLs)
    • OSLs encode by bitmaps
  • Algorithm cDBM
    • OSL contains information about dominance test inside orthant
    • Discard points inside orthants from dominance tests
  • Method:
    • Compute bitmap representation
    • For each point p consider its position (orthant) w.r.t. to cache queries qj
    • If p in the same orthant o w.r.t qj as qj w.r.t. q and p not in OSLo(qj) then exclude it from OSLo(q), DSL(q)

HDMS\'08

cache replacement policies
Cache Replacement Policies
  • General idea
    • Limited cache space
    • Identify least useful query point in cache
    • Replace it with new one

HDMS\'08

usage based policies
Usage-based policies
  • Only a few queries in cache are useful
  • Log cache query usage
  • Given a new query q
    • Consider as input the query point cache Q
    • Only query points in OSL of Q w.r.t. q are useful
    • Update cache - remove:
      • Least Recently Used (LRU) query point
      • Least Frequently Used (LFU) query point

HDMS\'08

usage based policies1
Usage-based policies
  • Only a few queries in cache are useful
  • Log cache query usage
  • Given a new query q
    • Consider as input the query point cache Q
    • Only query points in OSL of Q w.r.t. q are useful
    • Update cache - remove:
      • Least Recently Used (LRU) query point
      • Least Frequently Used (LFU) query point

qd

qb

qc

qa

Price

q

Distance from sea

HDMS\'08

usage based policies2
Usage-based policies

Redundant queries

  • Only a few queries in cache are useful
  • Log cache query usage
  • Given a new query q
    • Consider as input the query point cache Q
    • Only query points in OSL of Q w.r.t. q are useful
    • Update cache - remove:
      • Least Recently Used (LRU) query point
      • Least Frequently Used (LFU) query point

qd

qb

qc

qa

Price

q

Distance from sea

HDMS\'08

usage based policies3
Usage-based policies

Redundant queries

  • Only a few queries in cache are useful
  • Log cache query usage
  • Given a new query q
    • Consider as input the query points in cache Q
    • Only query points inOSL of Q w.r.t. q are useful
    • Update cache - remove:
      • Least Recently Used (LRU) query point
      • Least Frequently Used (LFU) query point

qd

qb

qc

qa

Price

q

Distance from sea

HDMS\'08

usage based policies4
Usage-based policies

Redundant queries

  • Only a few queries in cache are useful
  • Log cache query usage
  • Given a new query q
    • Consider as input the query points in cache Q
    • Only query points in OSL of Q w.r.t. q are useful
    • Update cache - remove:
      • Least Recently Used (LRU) query point
      • Least Frequently Used (LFU) query point

qd

qb

qc

qa

Price

q

Distance from sea

HDMS\'08

pruning power based policy
Pruning power-based policy
  • Usage-based policies do not indicate usefulness
  • Useful cached query
    • Great pruning power
      • Probability that a query can prune points of dataset from DSL computation
    • Depends on
      • Points dominated by query in an orthant j
      • Points contained in the antisymetric orthantof j
  • Update cache – remove
    • Query point with less pruning power (LPP)

qa

Price

q

Distance from sea

HDMS\'08

pruning power based policy1
Pruning power-based policy
  • Usage-based policies do not indicate usefulness
  • Useful cached query
    • Great pruning power
      • Probability that a query can prune points of dataset from DSL computation
    • Depends on
      • Points dominated by query in an orthant j
      • Points contained in the antisymetric orthantof j
  • Update cache – remove
    • Query point with less pruning power (LPP)

5:24

2:24

qa

3:24

74:24

Price

q

Distance from sea

HDMS\'08

pruning power based policy2
Pruning power-based policy
  • Usage-based policies do not indicate usefulness
  • Useful cached query
    • Great pruning power
      • Probability that a query can prune points of dataset from DSL computation
    • Depends on
      • Points dominated by query in an orthant j
      • Points contained in the antisymetric orthant of j
  • Update cache – remove
    • Query point with less pruning power (LPP)

5:74

2:34

qa

3:44

74:884

Price

q

Distance from sea

HDMS\'08

pruning power based policy3
Pruning power-based policy
  • Usage-based policies do not indicate usefulness
  • Useful cached query
    • Great pruning power
      • Probability that a query can prune points of dataset from DSL computation
    • Depends on
      • Points dominated by query in an orthant j
      • Points contained in the antisymetric orthant of j
  • Update cache – remove
    • Query point with less pruning power (LPP)

5:74

2:3176

qa

3:44

74:884

Price

q

Distance from sea

HDMS\'08

pruning power based policy4
Pruning power-based policy
  • Usage-based policies do not indicate usefulness
  • Useful cached query
    • Great pruning power
      • Probability that a query can prune points of dataset from DSL computation
    • Depends on
      • Points dominated by query in an orthant j
      • Points contained in the antisymetric orthant of j
  • Update cache – remove
    • Query point with less pruning power (LPP)

5:720

2:3176

qa

3:421

74:88222

Price

q

Distance from sea

HDMS\'08

pruning power based policy5
Pruning power-based policy
  • Usage-based policies do not indicate usefulness
  • Useful cached query
    • Great pruning power
      • Probability that a query can prune points of dataset from DSL computation
    • Depends on
      • Points dominated by query in an orthant j
      • Points contained in the antisymetric orthant of j
  • Update cache – remove
    • Query point with less pruning power (LPP)

5:720

2:3176

qa

3:421

74:88222

Price

q

Distance from sea

HDMS\'08

experimental evaluation
Experimental Evaluation
  • Synthetic datasets
    • Distribution types
      • Independent, correlated, anti-correlated
    • Number of points N
      • 10k, 20k, 50k, 100k,
    • Dimensionality
      • d = {2,3,4,5,6}
    • Domain size for dimension
      • |D| = {10,20,50}
  • Compare
    • Bitmap (NO-CACHE)
    • cDBM with LFU,LRU,LPP cache replacement policies
    • Query cache
      • |Q| = {10,20,30,40,50} past query points
      • Cache size is |Q|*N bits uncompressed

HDMS\'08

varying query cache size
Varying query cache size

Independent

Anti-correlated

  • Dataset: N = 50k points, with d = 4 dimensions of |D| = 20 domain size
  • LFU,LRU cache queries not representative for future ones
  • LPP caches queries with great pruning power

HDMS\'08

effect of distribution parameters
Effect of distribution parameters

Correlated vary N

Correlated vary d

  • Relative improvement in running time over NO-CACHE
  • Vary number of points N
    • d = 4 dimensions of |D| = 20 domain size
  • Vary number of dimensions d
    • N = 50k, |D| = 20

HDMS\'08

conclusions and future work
Conclusions and Future work
  • Conclusions
    • Introduced orthant skylines(OSLs) and discussed its relationship with DSL
    • Extended Bitmap to compute OSLs and DSL at the same time (DBM algorithm)
    • Proposed caching mechanism of OSLs to reducecost for future DSL queries
      • LRU, LFU, LPPcache replacement policies
    • Experimentally verified the efficiency of caching mechanism
  • Future work
    • Apply caching mechanism to index-based methods
    • Further increase pruning power of cached queries

HDMS\'08

questions

Questions ?

HDMS\'08

ad