dynamat a dynamic view management system for data warehouses
Download
Skip this Video
Download Presentation
DynaMat A Dynamic View Management System for Data Warehouses

Loading in 2 Seconds...

play fullscreen
1 / 22

DynaMat A Dynamic View Management System for Data Warehouses - PowerPoint PPT Presentation


  • 82 Views
  • Uploaded on

DynaMat A Dynamic View Management System for Data Warehouses. Vicky :: Cao Hui Ping Sherman :: Chow Sze Ming CTH :: Chong Tsz Ho Ronald :: Woo Lok Yan Ken :: Yiu Man Lung. Outline. Introduction Background DynaMat Experiments Conclusions References. Introduction.

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'DynaMat A Dynamic View Management System for Data Warehouses' - nerice


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
dynamat a dynamic view management system for data warehouses

DynaMatA Dynamic View Management System for Data Warehouses

Vicky :: Cao Hui Ping

Sherman :: Chow Sze Ming

CTH :: Chong Tsz Ho

Ronald :: Woo Lok Yan

Ken :: Yiu Man Lung

outline
Outline
  • Introduction
  • Background
  • DynaMat
  • Experiments
  • Conclusions
  • References
introduction
Introduction
  • On-Line Analytical Processing (OLAP)
    • Why OLAP?
      • A dominant factor for Support Decision Application
    • Ad-hoc data-intensive queries
    • Costly multi-joins and aggregations
  • Materialized View
    • Why materialize view?
      • Data amount in data warehouses is very big
      • OLAP query is very complex and costly
      • OLAP query result maybe summary data
    • Represent a set of redundant entities in a data warehouse that are used to accelerate OLAP.
introduction cont
?How many? Which?Introduction(cont.)
  • Basic rule to materialize view
    • Given some space restriction, select some suitable views to materialize.

Query

Data warehouse

Materialized View

Not all data redundant

background
Background
  • Research topics on materialized view
    • Store summary data as materialized view
    • Efficiently compute and update views
  • Static selection of views
    • Pre-determine which view should be materialized and materialize them before the queries come
      • Static!
background cont
Background(cont.)
  • Limitations of Static Selection of Views
    • Many queries can’t be answered by the materialized data since query patterns change
    • Update is costly as data is changing overtime
    • Administrator:
      • Monitor query patterns
      • Re-calibrate such views by rerunning the query
  • Automated view selection
    • Dynamic View Management: DynaMat

workload heavy!!!

dynamat
DynaMat
  • Charactmaeristics:
    • Dynamically materializes information at different granularity
    • View Selection + View maintenance in a single framework
  • System overview
  • View pool organization
  • Directory index
  • Query execution
  • Pool maintenance
system overview
Off-line update

Support sub-linear search in V

Whether the materialized data can be used to answer query?

Maintain View Pool

Store materialized data

System Overview
  • Components
  • Two phrases
    • On-line Query
    • Off-line Update

1

S

4.1

4.2

2

3.1

3.2

view pool organization
View Pool Organization

F (product, country, year, sales)

Product(p1, p35)

Country (c1, c30)

Year (1995,2000)

  • Multi-Range query(MRQ)
    • Hyper-plane: n-vector
    • n: number of group by attributes
    • Ri:
      • full range of the domain;
      • single value;
      • empty range

Select product, year, sum(sales)

From F

Where product=‘p1’

Group by product, year

view pool organization cont
View Pool Organization(cont.)
  • MRF(Multidimensional Range Fragments)
    • Each fragment can also be represented by a hyper-plane
    • Basic logical unit in the pool
  • Many fragments in the View Pool

MRF

F

directory index
Product

P15

P10

P1

1995

1997

2000

Directory Index
  • Facilitate the search in view pool
  • Directory index is a R-tree based on fragment’s hyper-planes.
  • Each fragment corresponds to one entity in directory index

Year

Directory Index

query execution
Product

P15

P10

P1

1995

1997

2000

Query Execution
  • Query Step:
    • From MR query, get its hyper-plane
    • Query the view pool based on the directory index

f2

f3

Year

Directory Index

query execution cont
Query Execution(cont.)
  • Query cases:
    • One fragment f matches the query exactly
      • Retrieve f and return it back to the user
    • No exact match, but many fragments can be used to answer the query
      • Choose the best fragment to answer the query
    • The query can not be answered by the view pool
      • Perform the query directly on the DW
  • Query results ACE in the later two cases
pool maintenance
Pool Maintenance
  • Admission Control Entity(ACE)
  • Two cases to maintenance
    • New query results come
    • Data in base relation changes
  • Space Bound &Time Bound
    • Space bound: View pool hits the pre-defined space window Wspace replace
    • Time bound: the system restrict the time window Wtime to refresh the fragments.
  • Goodness measure to determine whether a fragment is good enough.
pool maintenance cont
evicted

fnew:

new query result

fvictim

Goodness(fvictim)< goodness(fnew)

Pool Maintenance(cont.)
  • Pool maintenance during queries
    • New query results can be stored in the view pool if it has enough space
    • Call replace algorithm if it hits the space constraint.
      • If goodness(new result) >goodness(fvictim), Evict fvictim,
      • This process doesn’t stop until there is enough space for the new query result.
      • Maintenance of the father pointers

f1

f2

pool maintenance cont1
={(p1,p35)},(1995,2000),(c1,C10)}Pool Maintenance(cont.)
  • Pool maintenance during updates
    • Condition:data in base relation changes
    • Step:
      • For each fragment compute minimum update cost UC(f)
        • Get all necessary deltas, which make change to the DW
        • Get from the directory index
        • Calculate dV and update each f by querying dV
      • Total update cost:
      • Evict fragments from the view pool according to the non-ascending order of their cost, if the UC(V) is greater than the time bound

Delta

dV

pool maintenance cont2
Product

P15

={(p1,p20)},(1995,2000),(c1,C10)}

P10

P1

1995

1997

2000

Pool Maintenance(cont.)

Year

Delta

dV

experiments
Experiments
  • Measure: Detailed Cost Savings Ratio
    • Ci: Cost of answering queries in DW
    • Si: Saving cost when answering queries in view pool
    • The greater the DCSR, the better the performance
experiments cont
Experiments(cont.)
  • Comparison with the optimal static view selection
    • 1 Fact table: 6 dims, 20 million records
    • updates: 40 sets * 100 thousand records
    • Time constraint: 2% of the full Data Cube
    • Queries: 40 sets*500 MR Queries.
conclusion
Conclusion
  • DynaMat: A view management system
    • Dynamically materializes results from incoming queries
    • Exploits them to future use
    • Considering time and space constraint
    • Better performance than static methods
reference
Reference
  • Y. Kotidis, N. Roussopoulos. DynaMat: A Dynamic View Management System for Data Warehouses. In Proceedings of ACM SIGMOD International Conference on Management of Data, 371-382, Philadelphia, Pennsylvania, June 1999.
  • Y. Kotidis, N. Roussopoulos. A Case for Dynamic View Management. ACM Transactions on Database Systems, Volume 26(4), 388-423, 2001.
  • Original presentation by the author, http://www.cs.umd.edu/~kotidis/Publications/Sigmod99
ad