adv dbms data warehouse csc5301
Download
Skip this Video
Download Presentation
Adv. DBMS Data Warehouse CSC5301

Loading in 2 Seconds...

play fullscreen
1 / 15

Adv. DBMS Data Warehouse CSC5301 - PowerPoint PPT Presentation


  • 100 Views
  • Uploaded on

Adv. DBMS Data Warehouse CSC5301. Review Hachim Haddouti. Do You Remember?. DSS. Data cube. MD. OLTP. RollUp. drill down. Slice/dice. MD. ROLAP. Star schema. MOLAP. Data mining. Data extraction. Fact table. Data Warehouses.

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about ' Adv. DBMS Data Warehouse CSC5301' - randy


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
adv dbms data warehouse csc5301

Adv. DBMSData WarehouseCSC5301

Review

Hachim Haddouti

do you remember
Do You Remember?

DSS

Data cube

MD

OLTP

RollUp

drill down

Slice/dice

MD

ROLAP

Star schema

MOLAP

Data mining

Data extraction

Fact table

data warehouses
Data Warehouses
  • “Subject-oriented, integrated, time-variant, nonvolatile collection of data in support of management’s decision-making process” Inmon (AP = analytical processing is missing)
  • Used for analysis of existing data
  • Resolves performance issues suffered by operational RDBMSs and OLTPs
sizing dw
Sizing DW?

Mining of mobile phone calls:

(Caller, Callee, Time, Duration, Geogr. Location) ~ 100 B/tuple

In Germany

107 users * 10 calls/(day*user) * 100 B/call =

= 1010 B/day ~ 3*1012 B/year = 3 TB/year

Scanning data at 107 B/s takes

3*1012/107 = 3*105 s > 3 days

data model
ER Model

a disaster for querying a huge amount of data (time)

not understandable for users and they can not be navigated usefully by DBMS software.

hard to visualize; many possible connections between tables,

To avoid redundancy

MD Model

better performance

Better data organisation

Better visualization

Business queries (why, what if)

Data model
slide7

Typical DWH Analyses/Queries

  • What are the consequences of new orders for production capacity w.r. to investment, personnel, maintenance, extra hours, ...
  • Seasonal adaptions, e.g. when to produce how many skis, bikinis, convertibles, ...
  • Influence of external financing on profits
slide8

Operations:

  • aggregation
  • slice
  • dice (cube)
  • rollup to coarser level
  • drill down to more detailed level
  • grouping
  • sorting
steps to build a dwh
Steps to build a DWH
  • Acquisition of data
  • Data cleansing
  • Storage
  • Processing: AP
  • Maintenance, ...

Not possible with classical DB-technology alone

on line analytical processing
On-Line Analytical Processing
  • OLTP (online transaction processing) for operational data of enterprise, e.g. in relational DBMS, IMS, SAP/R3, ...
  • DSS:Decision Support System to store data/information for strategic management decisions: aggregations, summaries, etc.
  • Optimized to work with data warehouses
  • Used to answer questions
  • Allows users to perceive data as a multidimensional data cube
  • Data mining
oltp versus olap
OLTP versus OLAP

Thematic focus

  • OLTP: many small transactions (microscopic view of business processes, individual steps at lowest level, single order, delivery)
  • OLAP: finances in general, personnel in general, ...
  • OLAP requires integration and unification of many detailed data into big picture
  • Time orientation
  • Durability: data extracted once, no updates
technical comparison oltp vs olap
Technical Comparison OLTP vs OLAP
  • OLTP: high rate of updates, several thousand t/s
  • OLAP: read only transactions, very complex, DWH is loaded at certain time intervals, e.g. after the end of the month, quarter
    • Compute intensive
    • Special systems with new access methods, e.g. multidimensional data organization and access methods
    • Special OLAP systems necessary to offload OLTP systems
rolap and molap
ROLAP and MOLAP

Solution 1: ROLAP relational online analytical processing, built on top of relational DBS, additional middleware or client front end (star schema)

Solution 2: MOLAP: multidimensional online analytical processing

  • new model
  • new data organizations
  • new algorithms
  • new query languages
  • new optimization techniques
slide15

DW Review

  • degenerate dimension
  • big dimensions
  • hierarchies
  • snow falcking
  • Slowly changing dimensions
  • dirty dimensions
  • Hetegrogeneous prodcuts (core and custom)
  • Factless Fact table
ad