Flexible Data Cube for Range-Sum Queries in Dynamic OLAP Data Cubes


Flexible Data Cube for Range-Sum Queries in Dynamic OLAP Data Cubes. Authors: C.-I Lee and Y.-C. Li. Speaker: Y.-C. Li. Date: Dec. 19, 2002. Outline: Introduction; Related works; Analysis of the average query and update costs; Flexible data cube; Performance analysis; Conclusions.


Presentation Transcript


Flexible Data Cube for Range-Sum Queries in Dynamic OLAP Data Cubes

Authors: C.-I Lee and Y.-C. Li

Speaker: Y.-C. Li

Date: Dec. 19, 2002


Outline

  • Introduction

  • Related works

  • Analysis of the average query and update costs

  • Flexible data cube

  • Performance analysis

  • Conclusions


Introduction

  • Data cubes are frequently adopted to implement OLAP and provide aggregate information

  • Data cube: also known as a Multi-dimensional Database (MDDB)

  • Measure attributes: the attributes chosen as metrics of interest

  • Functional attributes (dimensions): the other attributes of the records

  • Cells: store measure attribute values

  • Range-Sum Query: add all cells in query region
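A range-sum query as defined above can be sketched directly on a small 2-D cube. This is a minimal illustration, not code from the presentation: the function name `range_sum`, the inclusive `(row, column)` bounds, and the cell values are all assumptions made for the example.

```python
def range_sum(cube, lo, hi):
    """Add every cell of a 2-D cube inside the query region.

    lo and hi are inclusive (row, column) corners of the region.
    """
    (r0, c0), (r1, c1) = lo, hi
    return sum(cube[r][c]
               for r in range(r0, r1 + 1)
               for c in range(c0, c1 + 1))


# Hypothetical 3x4 cube of measure values (illustrative numbers only).
cube = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]

whole_cube = range_sum(cube, (0, 0), (2, 3))  # touches all 12 cells
sub_region = range_sum(cube, (1, 1), (2, 2))  # touches 4 cells: 6, 7, 10, 11
```

The cost of this direct approach grows with the number of cells in the query region, which is what the pre-aggregation techniques in the following slides try to reduce.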


Car-sales example

  • Measure attribute → Sale_Volume

  • Dimensions → Year and Age of customers


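  • (Figure: car-sales data cube of Sale_Volume by Year and customer Age; numbers shown in the figure include 255, 4, 1430 and 20)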


  • Several previous approaches accelerate query response time

  • But they slow down updates and require extra space overhead

  • This study considers both query and update costs when constructing data cubes

    • No extra space overhead

    • Choose the best cube for any query/update ratio

  • We also present the FDC method

    • No extra space overhead (for dense data cube)

    • Select or integrate some pre-aggregation techniques for each dimension


Related works

  • The history of pre-aggregate range-sum queries

    • 1997: Prefix Sum (PS) [Ho et al., 1997]

    • 1999: Relative Prefix Sum (RPS) [Geffner et al., 1999a]; Hierarchical Cube (HC) [Chan & Ioannidis, 1999]; Dynamic Data Cube (DDC) [Geffner et al., 1999b]

    • 2000: Space-Efficient Data Cube (SEDC) [Riedewald et al., 2000]; Double RPS [Liang et al., 2000]

    • 2001: Iterative Data Cube (IDC) [Riedewald et al., 2001]


Prefix Sum (PS) (Ho et al., 1997)

  • 3+5+1+2+7+3+2+6+2+4+2+3=40

  • A (original array): 2+3+3+3+1+5+3+5+1+3+3+4 = 36, i.e. 12 cell accesses

  • P (prefix-sum array): 103 - 50 - 35 + 18 = 36, i.e. 4 cell accesses
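The comparison between A and P above follows the usual 2-D prefix-sum pattern: one precomputed cell is read, two overlapping prefixes are subtracted, and their intersection is added back. A minimal sketch is given below; the function names and the cell values are assumptions for illustration and are not the array from the slide.

```python
def build_prefix_sums(a):
    """Return P where P[r][c] is the sum of a[0..r][0..c]."""
    rows, cols = len(a), len(a[0])
    p = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            above = p[r - 1][c] if r > 0 else 0
            left = p[r][c - 1] if c > 0 else 0
            corner = p[r - 1][c - 1] if r > 0 and c > 0 else 0
            p[r][c] = a[r][c] + above + left - corner
    return p


def range_sum_ps(p, lo, hi):
    """Range sum over inclusive corners lo..hi using at most 4 cells of P."""
    (r0, c0), (r1, c1) = lo, hi
    total = p[r1][c1]
    if r0 > 0:
        total -= p[r0 - 1][c1]      # remove the rows above the region
    if c0 > 0:
        total -= p[r1][c0 - 1]      # remove the columns left of the region
    if r0 > 0 and c0 > 0:
        total += p[r0 - 1][c0 - 1]  # add back the part removed twice
    return total


# Hypothetical original array A (illustrative values only).
a = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
p = build_prefix_sums(a)
inner = range_sum_ps(p, (1, 1), (2, 2))  # 54 - 6 - 15 + 1
```

The query reads a constant number of cells regardless of region size, but a single change to A can force many cells of P to be rewritten, which is the update cost the later slides weigh against query cost.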


Prefix Sum(PS)


Other methods

  • RPS (Geffner et al., 1999a)

    • Two levels (local PS and overlay boxes), but needs extra space overhead

  • HC (Chan & Ioannidis, 1999)

    • Hierarchical method

  • DDC (Geffner et al., 1999b)

    • Hierarchical method, but needs extra space overhead

  • SEDC (Riedewald et al., 2000)

    • Removes the extra space overhead of RPS and DDC (SRPS and SDDC)

  • Double RPS (Liang et al., 2000)

    • Three levels, but needs extra space overhead

  • IDC (Riedewald et al., 2001)

    • No extra space overhead (a different method can be used in each dimension)


  • Our work focuses mainly on methods that do not require any extra space overhead for dense data cubes.
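The idea of using a different method in each dimension, with no extra space, can be illustrated on a 2-D cube that keeps one dimension as raw values (A) and stores prefix sums (PS) along the other. This is only a sketch of the general idea under that assumption; it is not the IDC or FDC algorithm itself, and all names and values are hypothetical.

```python
def build_row_prefix(a):
    """Apply PS along the column dimension only; rows stay independent (A)."""
    out = []
    for row in a:
        acc, prefix_row = 0, []
        for value in row:
            acc += value
            prefix_row.append(acc)
        out.append(prefix_row)
    return out


def range_sum_mixed(m, lo, hi):
    """Range sum over inclusive corners lo..hi on the mixed structure."""
    (r0, c0), (r1, c1) = lo, hi
    total = 0
    for r in range(r0, r1 + 1):                # raw dimension: visit each row
        left = m[r][c0 - 1] if c0 > 0 else 0   # PS dimension: at most 2 cells
        total += m[r][c1] - left
    return total


def update_mixed(m, r, c, delta):
    """Add delta to original cell (r, c); return the number of cells rewritten."""
    for j in range(c, len(m[r])):              # only row r, from column c onward
        m[r][j] += delta
    return len(m[r]) - c


# Hypothetical original array (illustrative values only).
a = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
m = build_row_prefix(a)
before = range_sum_mixed(m, (1, 1), (2, 2))
touched = update_mixed(m, 1, 2, 5)
after = range_sum_mixed(m, (1, 1), (2, 2))
```

The structure occupies exactly as many cells as the original array. Queries cost about two cells per row in the region, and an update rewrites part of a single row, so the choice made per dimension shifts cost between queries and updates.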


Analysis of the average query and update costs

  • Assume query ratio + update ratio = 100%

  • Average query cost:

  • Average update cost: Cu(n) / n


Flexible Data Cube (FDC)

  • Finding the optimal pre-aggregated data cube requires exponential time

  • We propose FDC, a heuristic method that selects or integrates any two pre-aggregation techniques for each dimension


  • (Figure: FDC structure diagram; parts are labeled "A, LPS or PS", with one part labeled PS and one labeled A, for k’ = 0 to 7)

The FDC Method

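  • For a given situation:

    • Size

    • Query ratio

  • FDCopt = the FDC candidate with the minimum average cost

  • $FDC_{opt} = \min\{q \times C_{aq}^{FDC} + u \times C_{au}^{FDC}\}$, where q and u are the query and update ratios and $C_{aq}$, $C_{au}$ are the average query and update costs

  • Time complexity: O(9n) = O(n)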
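The selection rule, picking the candidate with the smallest ratio-weighted average cost, can be sketched for a single dimension with just two candidates, the raw array A and the prefix-sum array PS. This is a toy model and not the FDC search itself: the cost measure (number of cells accessed, averaged over all ranges and all update positions), the omission of LPS, and all function names are assumptions made for the example.

```python
from fractions import Fraction


def all_ranges(n):
    """Every inclusive range (l, h) over n cells."""
    return [(l, h) for l in range(n) for h in range(l, n)]


def avg_costs_raw(n):
    """Raw array A: a query reads each cell in range; an update writes 1 cell."""
    ranges = all_ranges(n)
    query = Fraction(sum(h - l + 1 for l, h in ranges), len(ranges))
    return query, Fraction(1)


def avg_costs_ps(n):
    """Prefix sums: a query reads 1 or 2 cells; an update at i rewrites n - i cells."""
    ranges = all_ranges(n)
    query = Fraction(sum(1 if l == 0 else 2 for l, h in ranges), len(ranges))
    update = Fraction(sum(n - i for i in range(n)), n)
    return query, update


def pick_technique(n, q_ratio):
    """Return (best name, cost table) using cost = q * C_aq + u * C_au, u = 1 - q."""
    u_ratio = 1 - q_ratio
    candidates = {"A": avg_costs_raw(n), "PS": avg_costs_ps(n)}
    cost = {name: q_ratio * cq + u_ratio * cu
            for name, (cq, cu) in candidates.items()}
    return min(cost, key=cost.get), cost


query_only, query_only_costs = pick_technique(4, Fraction(1))    # all queries
update_only, update_only_costs = pick_technique(4, Fraction(0))  # all updates
```

In this toy model the preferred structure flips as the query ratio changes, which is the behavior the weighted cost formula is meant to capture.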


Performance analysis

  • Average cost at different query ratios: d = 2; n = 16 and 64


  • Average cost for different dimension sizes: d = 4; q = 1 and 0.9


  • Average cost for different dimension sizes: d = 4; q = 0.1 and 0


Conclusions

  • Both the query and update costs are taken into consideration to select a suitable data cube

  • We propose the FDC method

    • Selects or integrates pre-aggregation techniques for each dimension

    • Outperforms other methods at any query (or update) ratio

    • Determines the best FDC structure in linear time

  • In the future, develop new techniques to support sparse data sets


Thank You

