Data warehouse prerequisites
This presentation is the property of its rightful owner.
Sponsored Links
1 / 23

Data Warehouse Prerequisites PowerPoint PPT Presentation


  • 120 Views
  • Uploaded on
  • Presentation posted in: General

Data Warehouse Prerequisites. Familiarity with Microsoft SQL Server System Administration for Microsoft SQL Server 7.0 and Implementing a Database on Microsoft SQL Server 7.0 Knowledge of Transact-SQL Usage in Developing OLTP Systems

Download Presentation

Data Warehouse Prerequisites

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


Data warehouse prerequisites

Data Warehouse Prerequisites

  • Familiarity with Microsoft SQL Server

  • System Administration for Microsoft SQL Server 7.0 and Implementing a Database on Microsoft SQL Server 7.0

  • Knowledge of Transact-SQL Usage in Developing OLTP Systems

  • Basic Understanding of Programming Principles and Experience with a Scripting Language

  • Understanding of Basic Database Design, Administration, and Implementation Concepts


What is data warehousing

What is Data Warehousing?

Peter Rawsthorne

Special Thanks to Bill Inmon, the “grandfather”

of data warehousing.


Oltp vs dss

OLTP vs. DSS

  • Online Transaction Processing (OLTP)

  • Decision Support System (DSS)

  • OLTP

    • ATM, Bank Teller, Ticket Master, POS…

  • DSS

    • Marketing, What if?, Inventory, Health (Walmart)…

    • Click-through analysis


In class exercise

In-class Exercise

  • Break into teams of three

  • Think of a business or subject area

  • Determine three OLTP systems required to support business or subject

  • Determine two DSS systems required to support business or subject

  • Provide one example of how the DSS system could be used to predict the future


Exercise example

Exercise Example

  • Business: Yacht Club

  • OLTP

    • Membership System

    • Accounting System

    • Yacht Racing Results System

  • DSS

    • Quarterly and Yearly Expenses

    • Race Results

  • How much beer will we need for next years regatta for the male non-members who are crew on yachts over 40 feet?


What then is a data warehouse

What then is a data warehouse?

A data warehouse is a:

  • subject oriented,

  • integrated,

  • time variant,

  • non volatile

    collection of data in support of management's decision making process.


Subject orientation

Subject Orientation

  • Data is organized via subject rather than process or business function.

  • The application world is concerned both with data base design and process design.

  • The data warehouse world focuses on data modeling and database design exclusively.


Integration

Integration

  • Easily the most important aspect of the data warehouse environment is that data found within the data warehouse is integrated. ALWAYS. WITH NO EXCEPTIONS.

  • consistent naming conventions,

  • consistent measurement of variables,

  • consistent encoding structures,

  • consistent physical attributes of data,

  • and so forth.


Time variant

OPERATIONAL

Current valued data

Time horizon: 60 – 90 days

Key fields may or may not have an element of time

Data can be updated

DATA WAREHOUSE

Snapshot data

Time horizon: 5 – 10 days

Keys do not have an element of time

Once snapshot is made, records cannot be updated

Time Variant


Non volatile

non volatile

  • Inserts, deletes, and changes - are done regularly to the operational environment on a record by record basis.

  • There are only two kinds of operations that occur in the data warehouse - the initial loading of data, and the access of data.


The structure of the warehouse

The structure of the warehouse

The different components of the data warehouse are:

  • metadata,

  • current detail data,

  • old detail data,

  • lightly summarized data, and

  • highly summarized data.


Current detail data

Current Detail Data

  • Most recent happenings

  • Voluminous

  • Lowest level of granularity

  • Almost always stored on disk storage

  • Fast to access

  • Expensive and complex to manage


Older detail data

Older detail data

  • Stored on some form of mass storage

  • Infrequently accessed

  • Stored at a level of detail consistent with current detailed data

  • Often stored on an alternate storage medium

  • Anticipated large volume


Lightly summarized data

Lightly summarized data

  • Distilled from the low level of detail

  • Almost always stored on disk

  • Design issues facing the data architect are;

    • what unit of time

    • what contents – attributes

  • Frequently mined data, a lot of “what if?”


Highly summarized data

Highly summarized data

  • Compact and easily accessible

  • Sometimes found in the data warehouse

  • Sometimes found outside the data warehouse

  • In any case, the highly summarized data is part of the data warehouse

  • Yearly or multi year summaries


Metadata

Metadata

  • Sits in a different dimension

  • Contains no data directly taken from the operational environment

  • Special and very important role

  • Metadata is used as:

    • a directory to locate the contents

    • a guide to the mapping of data

    • a guide to the algorithms used for summarization


Metadata levels of summarization

Metadata – levels of summarization


Flow of data

Flow of Data

  • data enters from the operational environment, it is transformed

  • data goes into the current detail level of detail

  • It resides there and is used there until one of three events occurs:

    • it is purged,

    • it is summarized, and/or

    • it is archived.


Using the data warehouse

Using the Data Warehouse


Example

Example


Summary

Summary

  • A data warehouse is a

    subject oriented,

    integrated,

    time variant,

    non volatile

    collection of data in support of management's decision needs.

  • Four levels of data warehouse data:

    old detail,

    current detail,

    lightly summarized data, and

    highly summarized data.

  • Metadata is a very important part


Lab deliverables

Lab deliverables

  • W2KS Install

  • SQL7.0 Install

  • SQL7.0 OLAP Services Install

  • MSPress install

  • Complete MSPress Chapter 1


Contact information

Contact Information

Peter Rawsthorne, B.Tech, MCSD, MCT, CCR

President, Eclectic Endeavours Inc.

559A Artisan Lane

PO Box 281

Bowen Island, BC

CANADA V0N 1G0

Phone: 604-947-2760

Fax: 604-947-2715

email: [email protected]

web: http://www.endeavours.com


  • Login