1 / 20

Data Warehouse (IV): The Star Schema

Data Warehouse (IV): The Star Schema. Developed by: Dr Eddie Ip Modified by : Dr Arif Ansari. Outline. DW design Multidimensional design (star schema) Examples. DW Design. OLTP systems use normalized data structure Problem with normalized structure: hard to understand

cdickinson
Download Presentation

Data Warehouse (IV): The Star Schema

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Data Warehouse (IV): The Star Schema Developed by: Dr Eddie Ip Modified by : Dr Arif Ansari

  2. Outline • DW design • Multidimensional design (star schema) • Examples

  3. DW Design • OLTP systems use normalized data structure • Problem with normalized structure: • hard to understand • queries require understanding of SQL • not optimized to support decisional queries

  4. DW Design • Ralph Kimball (1996) : The DW Lifecycle Toolkit • Dimensional Model • Denormalized structure • Schema suitable for decisional support

  5. Dimensional Design • Divide variables into two categories • A. Things we want to track/measure • B. The way how things we measure are broken into

  6. Dimensional Design • Example 1 • “Show me gross margin by product category.” (from manager of store of sporting goods) • The thing he wants to track/measure : gross margin • The way he wants gross margin to be broken into : by product category

  7. Dimensional Design • Example 2 • “Show me average account balances by educational level.” (bank manager) • The thing he wants to measure: ? • The way he wants it to be broken into: ?

  8. Dimensional Design • Example 3 • “What is the return rate by supplier?” (store manager) • The thing he wants to measure: ? • The way he wants it to be broken into: ?

  9. Dimensional Design • Example 4 • “ What is the sales of PC by region by quarter?” (Frito-Lay manager) • The thing he wants to measure: ? • The way he wants it to be broken into: ?

  10. Dimensional Design • The thing (variable) we want to track/ measure = measure or fact • The way we want it to be broken into = dimension

  11. Example • Kitchen appliances manufacturer • Sells products to customers (retailers) through sales reps • p4-8 Adamson & Venerable (reader)

  12. Fact tables • Store numeric data that are of interest to the business • e.g. Retail: sales amount, quantities sold Telecom: length of call, average no. of calls /day Banking: average daily balance

  13. Fact tables • E.g. (more) Airline: ticket cost, baggage weight Finance: ROI, average 60-day share price • fact tables contain lots of records : “deep” tables

  14. Dimension tables • Contain context of facts • Technically: store fields that describe the facts • e.g. Retail: store (attributes: name, zip), product category (attributes=?), time (attribute=?) Telecom: ?? Banking: ??

  15. Star Schema • Fact table (at the center) plus its related dimension tables (on the sides) • Example (p. 6-7 A & V)

  16. Dimension Fact

  17. Dimensional Design • Design to answer business questions • DW: to provide answers to business users • Dimensional design -- user’s view of the DB • Capacity should be built in the DW at the design stage: what facts, what dimension • OLTP: to ensure operations run well • OLTP Design -- IT professional’s view of the DB

  18. Star schema • Issues • Granularity • Multiple fact tables in a single star • different granularity • different processes • core & custom facts • snapshot & transaction

  19. Design principles • Facts : fully normalized (no redundancy) • PK = concatenation of dimension keys • Dimensions : Denormalized (all relevant attributes in one table) • Different processes, different stars • Different grains, different fact tables

  20. Summary • DW design • Dimensional design • Star Schema • Design Principles

More Related