Adv dbms data warehouse csc5301
This presentation is the property of its rightful owner.
Sponsored Links
1 / 15

Adv. DBMS Data Warehouse CSC5301 PowerPoint PPT Presentation

  • Uploaded on
  • Presentation posted in: General

Adv. DBMS Data Warehouse CSC5301. Review Hachim Haddouti. Do You Remember?. DSS. Data cube. MD. OLTP. RollUp. drill down. Slice/dice. MD. ROLAP. Star schema. MOLAP. Data mining. Data extraction. Fact table. Data Warehouses.

Download Presentation

Adv. DBMS Data Warehouse CSC5301

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript

Adv dbms data warehouse csc5301

Adv. DBMSData WarehouseCSC5301


Hachim Haddouti

Do you remember

Do You Remember?


Data cube




drill down




Star schema


Data mining

Data extraction

Fact table

Data warehouses

Data Warehouses

  • “Subject-oriented, integrated, time-variant, nonvolatile collection of data in support of management’s decision-making process” Inmon (AP = analytical processing is missing)

  • Used for analysis of existing data

  • Resolves performance issues suffered by operational RDBMSs and OLTPs

Sizing dw

Sizing DW?

Mining of mobile phone calls:

(Caller, Callee, Time, Duration, Geogr. Location) ~ 100 B/tuple

In Germany

107 users * 10 calls/(day*user) * 100 B/call =

= 1010 B/day ~ 3*1012 B/year = 3 TB/year

Scanning data at 107 B/s takes

3*1012/107 = 3*105 s > 3 days

Adv dbms data warehouse csc5301

Data Warehouse Architecture

Data model

ER Model

a disaster for querying a huge amount of data (time)

not understandable for users and they can not be navigated usefully by DBMS software.

hard to visualize; many possible connections between tables,

To avoid redundancy

MD Model

better performance

Better data organisation

Better visualization

Business queries (why, what if)

Data model

Adv dbms data warehouse csc5301

Typical DWH Analyses/Queries

  • What are the consequences of new orders for production capacity w.r. to investment, personnel, maintenance, extra hours, ...

  • Seasonal adaptions, e.g. when to produce how many skis, bikinis, convertibles, ...

  • Influence of external financing on profits

Adv dbms data warehouse csc5301

  • Operations:

  • aggregation

  • slice

  • dice (cube)

  • rollup to coarser level

  • drill down to more detailed level

  • grouping

  • sorting

Adv dbms data warehouse csc5301

Data Cube Representation

Steps to build a dwh

Steps to build a DWH

  • Acquisition of data

  • Data cleansing

  • Storage

  • Processing: AP

  • Maintenance, ...

    Not possible with classical DB-technology alone

On line analytical processing

On-Line Analytical Processing

  • OLTP (online transaction processing) for operational data of enterprise, e.g. in relational DBMS, IMS, SAP/R3, ...

  • DSS:Decision Support System to store data/information for strategic management decisions: aggregations, summaries, etc.

  • Optimized to work with data warehouses

  • Used to answer questions

  • Allows users to perceive data as a multidimensional data cube

  • Data mining

Oltp versus olap

OLTP versus OLAP

Thematic focus

  • OLTP: many small transactions (microscopic view of business processes, individual steps at lowest level, single order, delivery)

  • OLAP: finances in general, personnel in general, ...

  • OLAP requires integration and unification of many detailed data into big picture

  • Time orientation

  • Durability: data extracted once, no updates

Technical comparison oltp vs olap

Technical Comparison OLTP vs OLAP

  • OLTP: high rate of updates, several thousand t/s

  • OLAP: read only transactions, very complex, DWH is loaded at certain time intervals, e.g. after the end of the month, quarter

    • Compute intensive

    • Special systems with new access methods, e.g. multidimensional data organization and access methods

    • Special OLAP systems necessary to offload OLTP systems

Rolap and molap


Solution 1: ROLAP relational online analytical processing, built on top of relational DBS, additional middleware or client front end (star schema)

Solution 2: MOLAP: multidimensional online analytical processing

  • new model

  • new data organizations

  • new algorithms

  • new query languages

  • new optimization techniques

Adv dbms data warehouse csc5301

  • DW Review

  • degenerate dimension

  • big dimensions

  • hierarchies

  • snow falcking

  • Slowly changing dimensions

  • dirty dimensions

  • Hetegrogeneous prodcuts (core and custom)

  • Factless Fact table

  • Login