
Michael Goshey University of Minnesota, Fall 2006 CSci 8701: Overview of Database Research

An Analysis of the Publication "An Overview of Data Warehousing and OLAP Technology" by Surajit Chaudhuri and Umeshwar Dayal.


Presentation Transcript


  1. An Analysis of the Publication "An Overview of Data Warehousing and OLAP Technology" by Surajit Chaudhuri and Umeshwar Dayal • Michael Goshey, University of Minnesota, Fall 2006 • CSci 8701: Overview of Database Research

  2. Outline • Introduction • Problem Addressed • Major Contributions • Key Concepts • Validation Methodology • Assumptions • 2006 Rewrite Michael Goshey: 9/19/2006

  3. Introduction • Selected paper • S. Chaudhuri and U. Dayal, "An Overview of Data Warehousing and OLAP Technology," SIGMOD Record 26(1): 65-74 (1997). • Motivation • Personal interest

  4. Outline • Introduction • Problem Addressed • Major Contributions • Key Concepts • Validation Methodology • Assumptions • 2006 Rewrite

  5. Problem Addressed • Problem Statement • Survey: organizing the data warehousing space • Differing requirements between OLTP and OLAP • Significance • Growth area • Reference work establishing consensus on terms, architectures and issues

  6. Outline • Introduction • Problem Addressed • Major Contributions • Key Concepts • Validation Methodology • Assumptions • 2006 Rewrite

  7. Major Contributions • Bridging the gulf between industry and academia • OLTP vs. OLAP: clarifying the differences • Concise survey of relevant issues, architectures and tools • Concrete list of data warehouse design and build steps

  8. Outline • Introduction • Problem Addressed • Major Contributions • Key Concepts • Validation Methodology • Assumptions • 2006 Rewrite

  9. Key Concepts • Data warehouses and data marts • OLTP and OLAP (ROLAP vs. MOLAP) • Relational and dimensional data models • Bitmap indices • ETL • Metadata • Managed query vs. ad hoc environments • Materialized views • SQL extensions (cube, rollup, rank, percentile, etc.)
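
The cube and rollup extensions listed above can be illustrated with a minimal sketch in plain Python rather than SQL. The sample rows, the function names `cube` and `rollup`, and the use of the string "ALL" as the placeholder for an aggregated-away dimension are assumptions made for illustration; SQL engines typically report NULL in that position. The sketch only shows which groupings each operator produces, not how a database computes them.

```python
from collections import defaultdict
from itertools import combinations

def cube(rows, dims, measure):
    """Sum `measure` over every subset of `dims` (the groupings of GROUP BY CUBE)."""
    totals = defaultdict(int)
    for size in range(len(dims) + 1):
        for kept in combinations(dims, size):
            for row in rows:
                # Dimensions not kept in this grouping collapse to "ALL".
                key = tuple(row[d] if d in kept else "ALL" for d in dims)
                totals[key] += row[measure]
    return dict(totals)

def rollup(rows, dims, measure):
    """Sum `measure` over each prefix of `dims` (the groupings of GROUP BY ROLLUP)."""
    totals = defaultdict(int)
    for size in range(len(dims) + 1):
        kept = dims[:size]
        for row in rows:
            key = tuple(row[d] if d in kept else "ALL" for d in dims)
            totals[key] += row[measure]
    return dict(totals)

# Hypothetical fact rows: two dimensions and one additive measure.
sales = [
    {"year": 2005, "region": "east", "amount": 10},
    {"year": 2005, "region": "west", "amount": 20},
    {"year": 2006, "region": "east", "amount": 5},
]
dims = ("year", "region")

cube_totals = cube(sales, dims, "amount")
rollup_totals = rollup(sales, dims, "amount")
```

With two dimensions, the cube holds totals for four groupings (detail, by year, by region, grand total), while the rollup holds only the three hierarchical ones, so a key such as `("ALL", "east")` appears in `cube_totals` but not in `rollup_totals`.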

  10. Data Warehouse, Data Mart

  11. Relational or Dimensional?

  12. Relational or Dimensional? (image from http://www.laynetworks.com)

  13. Bitmap Indices • Cardinality: unique values / total rows • B-tree vs. bitmap: 1% rule, uniqueness • Boolean algebra directly on indices
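
A minimal sketch of the bitmap index idea from the slide above, using Python integers as bit vectors. The sample rows and the helper names (`build_bitmap_index`, `matching_rows`, `cardinality_ratio`) are hypothetical. The slide does not spell out the "1% rule"; reading it as the common rule of thumb that bitmap indices suit columns whose distinct values are a small fraction (on the order of 1%) of the row count is an assumption.

```python
from collections import defaultdict

def build_bitmap_index(rows, column):
    """Map each distinct value of `column` to an int whose bit i marks row i."""
    index = defaultdict(int)
    for position, row in enumerate(rows):
        index[row[column]] |= 1 << position  # set the bit for this row
    return dict(index)

def matching_rows(bitmap):
    """Return the row positions whose bit is set in `bitmap`."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]

def cardinality_ratio(rows, column):
    """Cardinality as defined on the slide: unique values / total rows."""
    return len({row[column] for row in rows}) / len(rows)

# Hypothetical fact rows with two low-cardinality columns.
sales = [
    {"region": "east", "channel": "web"},
    {"region": "west", "channel": "store"},
    {"region": "east", "channel": "store"},
    {"region": "west", "channel": "web"},
    {"region": "east", "channel": "web"},
]

region_idx = build_bitmap_index(sales, "region")
channel_idx = build_bitmap_index(sales, "channel")

# Boolean algebra directly on the indices: no row is read until the end.
east_and_web = region_idx["east"] & channel_idx["web"]
west_or_store = region_idx["west"] | channel_idx["store"]
```

Here `matching_rows(east_and_web)` yields the positions of rows that satisfy both predicates, computed with a single bitwise AND. A B-tree is usually the better fit for unique or near-unique columns, where each bitmap would mark only one row.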

  14. Outline • Introduction • Problem Addressed • Major Contributions • Key Concepts • Validation Methodology • Assumptions • 2006 Rewrite

  15. Validation Methodology • Survey paper goals • Academic and industry citations • Referencing tools, vendors • Case studies

  16. Outline • Introduction • Problem Addressed • Major Contributions • Key Concepts • Validation Methodology • Assumptions • 2006 Rewrite

  17. Assumptions • Read-only environments • Shortcomings • (Occasional) transactional commitments • The data revision problem

  18. Outline • Introduction • Problem Addressed • Major Contributions • Key Concepts • Validation Methodology • Assumptions • 2006 Rewrite

  19. 2006 Rewrite • Changes in terminology, tools, vendors • Fact constellations -> conformed dimensions • Decision support -> BI • Vendors and tools in BI, ETL, OLAP • Multiple user constituencies • Data history difficulties • Petabyte databases -> very large warehouses common • Data expiry challenges • Slowly changing dimensions

  20. Slowly Changing Dimensions • Before • After: Type 1 • After: Type 2 • After: Type 3
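
The slide's before/after tables did not survive in the transcript, so the following is a hedged sketch of the three slowly changing dimension types as they are commonly defined: Type 1 overwrites the attribute, Type 2 adds a new row and closes the old one, and Type 3 keeps the prior value in an extra column. The customer dimension, its column names (`valid_from`, `valid_to`, `current`, `previous_city`), and the function names are all hypothetical and are not taken from the presentation.

```python
from copy import deepcopy
from datetime import date

# Hypothetical "before" state: one customer row in a dimension table.
before = [
    {"key": 1, "customer_id": "C42", "city": "Duluth",
     "valid_from": date(2000, 1, 1), "valid_to": None, "current": True},
]

def scd_type1(dim, customer_id, new_city):
    """Type 1: overwrite the attribute in place; the old value is lost."""
    dim = deepcopy(dim)
    for row in dim:
        if row["customer_id"] == customer_id:
            row["city"] = new_city
    return dim

def scd_type2(dim, customer_id, new_city, change_date):
    """Type 2: expire the current row and append a new row with a new surrogate key."""
    dim = deepcopy(dim)
    next_key = max(row["key"] for row in dim) + 1
    for row in dim:
        if row["customer_id"] == customer_id and row["current"]:
            row["valid_to"] = change_date
            row["current"] = False
            dim.append(dict(row, key=next_key, city=new_city,
                            valid_from=change_date, valid_to=None, current=True))
            break  # stop iterating right after the append
    return dim

def scd_type3(dim, customer_id, new_city):
    """Type 3: keep only the immediately prior value in an extra column."""
    dim = deepcopy(dim)
    for row in dim:
        if row["customer_id"] == customer_id:
            row["previous_city"] = row["city"]
            row["city"] = new_city
    return dim

after_type1 = scd_type1(before, "C42", "St. Paul")
after_type2 = scd_type2(before, "C42", "St. Paul", date(2006, 9, 19))
after_type3 = scd_type3(before, "C42", "St. Paul")
```

The trade-off is visible in the results: Type 1 keeps one row and no history, Type 2 keeps full history at the cost of extra rows, and Type 3 keeps one row with a single level of history.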

  21. Questions?
