adv dbms data warehouse csc5301 n.
Skip this Video
Download Presentation
Adv. DBMS Data Warehouse CSC5301

Loading in 2 Seconds...

play fullscreen
1 / 15

Adv. DBMS Data Warehouse CSC5301 - PowerPoint PPT Presentation

  • Uploaded on

Adv. DBMS Data Warehouse CSC5301. Review Hachim Haddouti. Do You Remember?. DSS. Data cube. MD. OLTP. RollUp. drill down. Slice/dice. MD. ROLAP. Star schema. MOLAP. Data mining. Data extraction. Fact table. Data Warehouses.

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
Download Presentation

PowerPoint Slideshow about 'Adv. DBMS Data Warehouse CSC5301' - randy

Download Now An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
adv dbms data warehouse csc5301

Adv. DBMSData WarehouseCSC5301


Hachim Haddouti

do you remember
Do You Remember?


Data cube




drill down




Star schema


Data mining

Data extraction

Fact table

data warehouses
Data Warehouses
  • “Subject-oriented, integrated, time-variant, nonvolatile collection of data in support of management’s decision-making process” Inmon (AP = analytical processing is missing)
  • Used for analysis of existing data
  • Resolves performance issues suffered by operational RDBMSs and OLTPs
sizing dw
Sizing DW?

Mining of mobile phone calls:

(Caller, Callee, Time, Duration, Geogr. Location) ~ 100 B/tuple

In Germany

107 users * 10 calls/(day*user) * 100 B/call =

= 1010 B/day ~ 3*1012 B/year = 3 TB/year

Scanning data at 107 B/s takes

3*1012/107 = 3*105 s > 3 days

data model
ER Model

a disaster for querying a huge amount of data (time)

not understandable for users and they can not be navigated usefully by DBMS software.

hard to visualize; many possible connections between tables,

To avoid redundancy

MD Model

better performance

Better data organisation

Better visualization

Business queries (why, what if)

Data model

Typical DWH Analyses/Queries

  • What are the consequences of new orders for production capacity w.r. to investment, personnel, maintenance, extra hours, ...
  • Seasonal adaptions, e.g. when to produce how many skis, bikinis, convertibles, ...
  • Influence of external financing on profits


  • aggregation
  • slice
  • dice (cube)
  • rollup to coarser level
  • drill down to more detailed level
  • grouping
  • sorting
steps to build a dwh
Steps to build a DWH
  • Acquisition of data
  • Data cleansing
  • Storage
  • Processing: AP
  • Maintenance, ...

Not possible with classical DB-technology alone

on line analytical processing
On-Line Analytical Processing
  • OLTP (online transaction processing) for operational data of enterprise, e.g. in relational DBMS, IMS, SAP/R3, ...
  • DSS:Decision Support System to store data/information for strategic management decisions: aggregations, summaries, etc.
  • Optimized to work with data warehouses
  • Used to answer questions
  • Allows users to perceive data as a multidimensional data cube
  • Data mining
oltp versus olap
OLTP versus OLAP

Thematic focus

  • OLTP: many small transactions (microscopic view of business processes, individual steps at lowest level, single order, delivery)
  • OLAP: finances in general, personnel in general, ...
  • OLAP requires integration and unification of many detailed data into big picture
  • Time orientation
  • Durability: data extracted once, no updates
technical comparison oltp vs olap
Technical Comparison OLTP vs OLAP
  • OLTP: high rate of updates, several thousand t/s
  • OLAP: read only transactions, very complex, DWH is loaded at certain time intervals, e.g. after the end of the month, quarter
    • Compute intensive
    • Special systems with new access methods, e.g. multidimensional data organization and access methods
    • Special OLAP systems necessary to offload OLTP systems
rolap and molap

Solution 1: ROLAP relational online analytical processing, built on top of relational DBS, additional middleware or client front end (star schema)

Solution 2: MOLAP: multidimensional online analytical processing

  • new model
  • new data organizations
  • new algorithms
  • new query languages
  • new optimization techniques

DW Review

  • degenerate dimension
  • big dimensions
  • hierarchies
  • snow falcking
  • Slowly changing dimensions
  • dirty dimensions
  • Hetegrogeneous prodcuts (core and custom)
  • Factless Fact table