Adv dbms data warehouse csc5301
Sponsored Links
This presentation is the property of its rightful owner.
1 / 15

Adv. DBMS Data Warehouse CSC5301 PowerPoint PPT Presentation

  • Uploaded on
  • Presentation posted in: General

Adv. DBMS Data Warehouse CSC5301. Review Hachim Haddouti. Do You Remember?. DSS. Data cube. MD. OLTP. RollUp. drill down. Slice/dice. MD. ROLAP. Star schema. MOLAP. Data mining. Data extraction. Fact table. Data Warehouses.

Download Presentation

Adv. DBMS Data Warehouse CSC5301

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript

Adv. DBMSData WarehouseCSC5301


Hachim Haddouti

Do You Remember?


Data cube




drill down




Star schema


Data mining

Data extraction

Fact table

Data Warehouses

  • “Subject-oriented, integrated, time-variant, nonvolatile collection of data in support of management’s decision-making process” Inmon (AP = analytical processing is missing)

  • Used for analysis of existing data

  • Resolves performance issues suffered by operational RDBMSs and OLTPs

Sizing DW?

Mining of mobile phone calls:

(Caller, Callee, Time, Duration, Geogr. Location) ~ 100 B/tuple

In Germany

107 users * 10 calls/(day*user) * 100 B/call =

= 1010 B/day ~ 3*1012 B/year = 3 TB/year

Scanning data at 107 B/s takes

3*1012/107 = 3*105 s > 3 days

Data Warehouse Architecture

ER Model

a disaster for querying a huge amount of data (time)

not understandable for users and they can not be navigated usefully by DBMS software.

hard to visualize; many possible connections between tables,

To avoid redundancy

MD Model

better performance

Better data organisation

Better visualization

Business queries (why, what if)

Data model

Typical DWH Analyses/Queries

  • What are the consequences of new orders for production capacity w.r. to investment, personnel, maintenance, extra hours, ...

  • Seasonal adaptions, e.g. when to produce how many skis, bikinis, convertibles, ...

  • Influence of external financing on profits

  • Operations:

  • aggregation

  • slice

  • dice (cube)

  • rollup to coarser level

  • drill down to more detailed level

  • grouping

  • sorting

Data Cube Representation

Steps to build a DWH

  • Acquisition of data

  • Data cleansing

  • Storage

  • Processing: AP

  • Maintenance, ...

    Not possible with classical DB-technology alone

On-Line Analytical Processing

  • OLTP (online transaction processing) for operational data of enterprise, e.g. in relational DBMS, IMS, SAP/R3, ...

  • DSS:Decision Support System to store data/information for strategic management decisions: aggregations, summaries, etc.

  • Optimized to work with data warehouses

  • Used to answer questions

  • Allows users to perceive data as a multidimensional data cube

  • Data mining

OLTP versus OLAP

Thematic focus

  • OLTP: many small transactions (microscopic view of business processes, individual steps at lowest level, single order, delivery)

  • OLAP: finances in general, personnel in general, ...

  • OLAP requires integration and unification of many detailed data into big picture

  • Time orientation

  • Durability: data extracted once, no updates

Technical Comparison OLTP vs OLAP

  • OLTP: high rate of updates, several thousand t/s

  • OLAP: read only transactions, very complex, DWH is loaded at certain time intervals, e.g. after the end of the month, quarter

    • Compute intensive

    • Special systems with new access methods, e.g. multidimensional data organization and access methods

    • Special OLAP systems necessary to offload OLTP systems


Solution 1: ROLAP relational online analytical processing, built on top of relational DBS, additional middleware or client front end (star schema)

Solution 2: MOLAP: multidimensional online analytical processing

  • new model

  • new data organizations

  • new algorithms

  • new query languages

  • new optimization techniques

  • DW Review

  • degenerate dimension

  • big dimensions

  • hierarchies

  • snow falcking

  • Slowly changing dimensions

  • dirty dimensions

  • Hetegrogeneous prodcuts (core and custom)

  • Factless Fact table

  • Login