Data Warehouse Fundamentals. Chapter 5. Analyzing and Defining Business Requirements for a Data Warehouse. Paul K Chen. 1. Chapter 5- Objectives. Learn what is the definition of the business requirements Understand the role of business dimensions related to DW business requirements
Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.
The requirements can be broadly divided into two areas:
1 Functional requirements—written in user terminology since it is user operations that are being described.
2 Non-functional requirements –these are the limitations and demands imposed upon the computing solutions; such as architectural plan, data storage specifications and information system performance expectations.
The business data of sales units (fact) is measured and analyzed
in three dimensional.
Step 1: Choosing the process
Step 2: Choosing the grain
Step 3: Identifying and conforming the dimensions
Step 4: Choosing the facts
Step 5: Storing pre-calculations in the fact table
Step 6: Rounding out the dimension tables
Step 7: Choosing the duration of the database
Step 8: Tracking slowly changing dimensions
Step 9: Deciding the query priorities and the query modes.
Use the enterprise level data model in selecting appropriate subject area(s)
-- Implement a single subject area (best option)
-- Implement a subset of a subject area
-- Implement a subset of several subject areas (most common)
Weight the Fact attributes based upon the following
Subject areas are collections of like data that support
analysis of the major subjects in a business. Election
Less space used
Percentage_of_breaking_lease (< 3_months)
Percentage_of_breaking_lease (>3 but < 6 months)
Percentage_of_breaking_lease (>6 but <9 months)
Percentage_of_breaking_lease (>9 but <12 months)
Percentage_of_breaking_lease (> 12 months)
Individual Daily sales
Date Product Qty Sales $
Jan 1 nuts 100 300
Jan 1 nuts 200 600
Jan 2 nuts 300 900
Jan 2 nuts 100 300
Jan 3 Nuts 50 150
Jan 3 Nuts 40 120
Daily Sales Summary
Date Product Qty Sales $
Jan 1 Nuts 300 900
Jan 2 Nuts 400 1,200
Jan 3 Nuts 90 20
Group data attributes based on usage and
Aggregation is used to create data marts.
For instance, a group of users frequently perform analysis comparing sales across geographic regions, broken by product line. If a data mart were created that stores the sales data already aggregated to the desired level, the users’ queries would be simpler.
Add up amounts by day
in sql: SELECT date, sum (amt) FROM SALE
GROUP BY date
Summarization building upon a single dimensional
Architectural criteria describes way the entire system
Administration criteria are considered to be essential
to the ‘smooth running’ of a dimensionally-oriented
Expression criteria are mostly analytic capabilities
that are needed in real-life situations.
In order to get an idea of the data to be used by the sales and
Inventory department, a facilitation session was held with
15 key end users and the IT data warehouse team. The
following business questions were generated from that
JAD (Joint Application Development) vs. Traditional Way of
Advantages: Achieving consensus during the session when
multiple sources of information exist, raising and
addressing issues or assigning them for resolution, and
immediately confirming information.
Whether they are formal or informal, there are four
necessary roles to be filled:
The Facilitator is the session leader. It is the facilitator’s
responsibility to ensure that the objectives of the sessions
Scribes are responsible for recording the minutes of the
session and optionally constructing deliverables using an
automated tool as the session progresses.
The users provide knowledge specific to the scope of the
Developers are the team members who will be building
The session is divided into three segments:
Open-ended interview questions
Open describes the interviewee’s options for responding.
They are open.
Closed interview questions
Such as “ How many subordinates do you have?
Type of hard data (other than interviewing and observation) - Quantitative Data
Relevant Architecture Concepts Impacted by
Machine Configuration Example
3 Tier hosting configuration for Typical Queries (or any process that makes a high volume of DB calls) suffers in performance due to Network Messaging overhead.
2 Tier hosting configuration supports the performance requirements of Typical Queries by eliminating 99.8% of Network Messaging overhead.