1 / 22

DynaMat A Dynamic View Management System for Data Warehouses

DynaMat A Dynamic View Management System for Data Warehouses. Vicky :: Cao Hui Ping Sherman :: Chow Sze Ming CTH :: Chong Tsz Ho Ronald :: Woo Lok Yan Ken :: Yiu Man Lung. Outline. Introduction Background DynaMat Experiments Conclusions References. Introduction.

nerice
Download Presentation

DynaMat A Dynamic View Management System for Data Warehouses

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. DynaMatA Dynamic View Management System for Data Warehouses Vicky :: Cao Hui Ping Sherman :: Chow Sze Ming CTH :: Chong Tsz Ho Ronald :: Woo Lok Yan Ken :: Yiu Man Lung

  2. Outline • Introduction • Background • DynaMat • Experiments • Conclusions • References

  3. Introduction • On-Line Analytical Processing (OLAP) • Why OLAP? • A dominant factor for Support Decision Application • Ad-hoc data-intensive queries • Costly multi-joins and aggregations • Materialized View • Why materialize view? • Data amount in data warehouses is very big • OLAP query is very complex and costly • OLAP query result maybe summary data • Represent a set of redundant entities in a data warehouse that are used to accelerate OLAP.

  4. ?How many? Which? Introduction(cont.) • Basic rule to materialize view • Given some space restriction, select some suitable views to materialize. Query Data warehouse Materialized View Not all data redundant

  5. Background • Research topics on materialized view • Store summary data as materialized view • Efficiently compute and update views • Static selection of views • Pre-determine which view should be materialized and materialize them before the queries come • Static!

  6. Background(cont.) • Limitations of Static Selection of Views • Many queries can’t be answered by the materialized data since query patterns change • Update is costly as data is changing overtime • Administrator: • Monitor query patterns • Re-calibrate such views by rerunning the query • Automated view selection • Dynamic View Management: DynaMat workload heavy!!!

  7. DynaMat • Charactmaeristics: • Dynamically materializes information at different granularity • View Selection + View maintenance in a single framework • System overview • View pool organization • Directory index • Query execution • Pool maintenance

  8. Off-line update Support sub-linear search in V Whether the materialized data can be used to answer query? Maintain View Pool Store materialized data System Overview • Components • Two phrases • On-line Query • Off-line Update 1 S 4.1 4.2 2 3.1 3.2

  9. View Pool Organization F (product, country, year, sales) Product(p1, p35) Country (c1, c30) Year (1995,2000) • Multi-Range query(MRQ) • Hyper-plane: n-vector • n: number of group by attributes • Ri: • full range of the domain; • single value; • empty range Select product, year, sum(sales) From F Where product=‘p1’ Group by product, year

  10. View Pool Organization(cont.) • MRF(Multidimensional Range Fragments) • Each fragment can also be represented by a hyper-plane • Basic logical unit in the pool • Many fragments in the View Pool MRF F

  11. Product P15 P10 P1 1995 1997 2000 Directory Index • Facilitate the search in view pool • Directory index is a R-tree based on fragment’s hyper-planes. • Each fragment corresponds to one entity in directory index Year Directory Index

  12. Product P15 P10 P1 1995 1997 2000 Query Execution • Query Step: • From MR query, get its hyper-plane • Query the view pool based on the directory index f2 f3 Year Directory Index

  13. Query Execution(cont.) • Query cases: • One fragment f matches the query exactly • Retrieve f and return it back to the user • No exact match, but many fragments can be used to answer the query • Choose the best fragment to answer the query • The query can not be answered by the view pool • Perform the query directly on the DW • Query results ACE in the later two cases

  14. Pool Maintenance • Admission Control Entity(ACE) • Two cases to maintenance • New query results come • Data in base relation changes • Space Bound &Time Bound • Space bound: View pool hits the pre-defined space window Wspace replace • Time bound: the system restrict the time window Wtime to refresh the fragments. • Goodness measure to determine whether a fragment is good enough.

  15. evicted fnew: new query result fvictim Goodness(fvictim)< goodness(fnew) Pool Maintenance(cont.) • Pool maintenance during queries • New query results can be stored in the view pool if it has enough space • Call replace algorithm if it hits the space constraint. • If goodness(new result) >goodness(fvictim), Evict fvictim, • This process doesn’t stop until there is enough space for the new query result. • Maintenance of the father pointers f1 f2

  16. ={(p1,p35)},(1995,2000),(c1,C10)} Pool Maintenance(cont.) • Pool maintenance during updates • Condition:data in base relation changes • Step: • For each fragment compute minimum update cost UC(f) • Get all necessary deltas, which make change to the DW • Get from the directory index • Calculate dV and update each f by querying dV • Total update cost: • Evict fragments from the view pool according to the non-ascending order of their cost, if the UC(V) is greater than the time bound Delta dV

  17. Product P15 ={(p1,p20)},(1995,2000),(c1,C10)} P10 P1 1995 1997 2000 Pool Maintenance(cont.) Year Delta dV

  18. Experiments • Measure: Detailed Cost Savings Ratio • Ci: Cost of answering queries in DW • Si: Saving cost when answering queries in view pool • The greater the DCSR, the better the performance

  19. Experiments(cont.) • Comparison with the optimal static view selection • 1 Fact table: 6 dims, 20 million records • updates: 40 sets * 100 thousand records • Time constraint: 2% of the full Data Cube • Queries: 40 sets*500 MR Queries.

  20. Conclusion • DynaMat: A view management system • Dynamically materializes results from incoming queries • Exploits them to future use • Considering time and space constraint • Better performance than static methods

  21. Reference • Y. Kotidis, N. Roussopoulos. DynaMat: A Dynamic View Management System for Data Warehouses. In Proceedings of ACM SIGMOD International Conference on Management of Data, 371-382, Philadelphia, Pennsylvania, June 1999. • Y. Kotidis, N. Roussopoulos. A Case for Dynamic View Management. ACM Transactions on Database Systems, Volume 26(4), 388-423, 2001. • Original presentation by the author, http://www.cs.umd.edu/~kotidis/Publications/Sigmod99

  22. Thanks! Q&A?

More Related