Adv dbms data warehouse csc5301
This presentation is the property of its rightful owner.
Sponsored Links
1 / 15

Adv. DBMS Data Warehouse CSC5301 PowerPoint PPT Presentation


  • 77 Views
  • Uploaded on
  • Presentation posted in: General

Adv. DBMS Data Warehouse CSC5301. Review Hachim Haddouti. Do You Remember?. DSS. Data cube. MD. OLTP. RollUp. drill down. Slice/dice. MD. ROLAP. Star schema. MOLAP. Data mining. Data extraction. Fact table. Data Warehouses.

Download Presentation

Adv. DBMS Data Warehouse CSC5301

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


Adv dbms data warehouse csc5301

Adv. DBMSData WarehouseCSC5301

Review

Hachim Haddouti


Do you remember

Do You Remember?

DSS

Data cube

MD

OLTP

RollUp

drill down

Slice/dice

MD

ROLAP

Star schema

MOLAP

Data mining

Data extraction

Fact table


Data warehouses

Data Warehouses

  • “Subject-oriented, integrated, time-variant, nonvolatile collection of data in support of management’s decision-making process” Inmon (AP = analytical processing is missing)

  • Used for analysis of existing data

  • Resolves performance issues suffered by operational RDBMSs and OLTPs


Sizing dw

Sizing DW?

Mining of mobile phone calls:

(Caller, Callee, Time, Duration, Geogr. Location) ~ 100 B/tuple

In Germany

107 users * 10 calls/(day*user) * 100 B/call =

= 1010 B/day ~ 3*1012 B/year = 3 TB/year

Scanning data at 107 B/s takes

3*1012/107 = 3*105 s > 3 days


Adv dbms data warehouse csc5301

Data Warehouse Architecture


Data model

ER Model

a disaster for querying a huge amount of data (time)

not understandable for users and they can not be navigated usefully by DBMS software.

hard to visualize; many possible connections between tables,

To avoid redundancy

MD Model

better performance

Better data organisation

Better visualization

Business queries (why, what if)

Data model


Adv dbms data warehouse csc5301

Typical DWH Analyses/Queries

  • What are the consequences of new orders for production capacity w.r. to investment, personnel, maintenance, extra hours, ...

  • Seasonal adaptions, e.g. when to produce how many skis, bikinis, convertibles, ...

  • Influence of external financing on profits


Adv dbms data warehouse csc5301

  • Operations:

  • aggregation

  • slice

  • dice (cube)

  • rollup to coarser level

  • drill down to more detailed level

  • grouping

  • sorting


Adv dbms data warehouse csc5301

Data Cube Representation


Steps to build a dwh

Steps to build a DWH

  • Acquisition of data

  • Data cleansing

  • Storage

  • Processing: AP

  • Maintenance, ...

    Not possible with classical DB-technology alone


On line analytical processing

On-Line Analytical Processing

  • OLTP (online transaction processing) for operational data of enterprise, e.g. in relational DBMS, IMS, SAP/R3, ...

  • DSS:Decision Support System to store data/information for strategic management decisions: aggregations, summaries, etc.

  • Optimized to work with data warehouses

  • Used to answer questions

  • Allows users to perceive data as a multidimensional data cube

  • Data mining


Oltp versus olap

OLTP versus OLAP

Thematic focus

  • OLTP: many small transactions (microscopic view of business processes, individual steps at lowest level, single order, delivery)

  • OLAP: finances in general, personnel in general, ...

  • OLAP requires integration and unification of many detailed data into big picture

  • Time orientation

  • Durability: data extracted once, no updates


Technical comparison oltp vs olap

Technical Comparison OLTP vs OLAP

  • OLTP: high rate of updates, several thousand t/s

  • OLAP: read only transactions, very complex, DWH is loaded at certain time intervals, e.g. after the end of the month, quarter

    • Compute intensive

    • Special systems with new access methods, e.g. multidimensional data organization and access methods

    • Special OLAP systems necessary to offload OLTP systems


Rolap and molap

ROLAP and MOLAP

Solution 1: ROLAP relational online analytical processing, built on top of relational DBS, additional middleware or client front end (star schema)

Solution 2: MOLAP: multidimensional online analytical processing

  • new model

  • new data organizations

  • new algorithms

  • new query languages

  • new optimization techniques


Adv dbms data warehouse csc5301

  • DW Review

  • degenerate dimension

  • big dimensions

  • hierarchies

  • snow falcking

  • Slowly changing dimensions

  • dirty dimensions

  • Hetegrogeneous prodcuts (core and custom)

  • Factless Fact table


  • Login