Chapter 3
Sponsored Links
This presentation is the property of its rightful owner.
1 / 41

Chapter 3 PowerPoint PPT Presentation

  • Uploaded on
  • Presentation posted in: General

Data Warehouse Fundamentals. Chapter 3. Trends In Data Warehousing. Paul K Chen. 1. Data Warehousing is Becoming Mainstream. In the early stages, four significant factors drove many companies to move into data warehousing: Fierce competition Government deregulation

Download Presentation

Chapter 3

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript

Chapter 3

Data Warehouse Fundamentals

Chapter 3

Trends In Data Warehousing

Paul K Chen


Data warehousing is becoming mainstream

Data Warehousing is Becoming Mainstream

In the early stages, four significant factors drove many

companies to move into data warehousing:

  • Fierce competition

  • Government deregulation

  • Need to revamp internal processes

  • Imperative for customized marketing

Walmart vs amazon com

Walmart vs.

  • Walmart is the US company most quoted for the successful application & deployment of Data Warehousing technology.

  • Walmart filed lawsuit against for its unlawful way of pirating its DW technology by hiring away its DA personnel by offering hefty stock option to these people.

Significant factors

Significant Factors

These significant factors reflect the new trends in data


  • Multiple Data Types

  • Data Visualization

  • Parallel Processing

  • Query Tools

  • Browser Tools

  • Data Fusion

  • Multidimensional Analysis

  • Agent Technology

  • E-Business- ERP, KM, CRM

Decision making and data warehousing

Decision Making and Data Warehousing

“A data warehouse is the data, processes, tools, and facilities to manage and deliver complete, timely, accurate, and understandable business information to authorized individuals for effective decision making.”

  • Structured Data

    • Includes traditional relational databases

    • Typically internal and enterprise-owned

    • Predetermined

  • Unstructured Data

    • Includes articles, reports, images, and videos

    • Utilizes external data and expert opinion

    • Ad hoc


Decision making and data warehousing1

Decision Making and Data Warehousing

  • Management Systems

    • Extend relational databases to store and support multimedia

    • User-defined types (UDT) and functions (UDF) in SQL-3

  • Specialized Servers

    • Used for data which is incompatible with relational databases (e.g., Streaming video servers)

    • Objects may be linked to a relational database

  • Search Engines

    • Query by Image Content (shape, color, texture, etc)

    • Text retrieval on free-text documents

    • Audio and video searching

Decision making and data warehousing2

Decision Making and Data Warehousing

~ The trend is toward unstructured data and ad hoc warehouses. ~

~ Trend toward multimedia. ~


Types of decision support tools

Types of Decision Support Tools

  • Data Inquiry

    • A request for a set of data based on some search criteria

  • Data Interpretation

    • Manipulation and visualization of a set of data (statistical analysis)

  • Multidimensional Analysis (OLAP)

    • n-dimensional spreadsheet analysis

  • Information Discovery

    • Pattern recognition, trends

  • Browsers

    • Search metadata catalogs

    • Search information object lists

    • Launch analysis tools


File based processing

File-based Processing


Types of decision support tools1

Types of Decision Support Tools

~ Trend toward utilization of the Web, facilitated by Java. ~


Data warehouse architectures

Data Warehouse Architectures

  • Single Level

    • Decision support tools access operational data directly

    • Feasible only with “clean” data

    • Valid for unstructured data

  • Two Level Reconciled

    • Scrubbed operational data supporting ad hoc queries

  • Two Level Derived

    • Summarized data

  • Three Level

    • Maintains both scrubbed operational data, and summarized data.


Data warehouse architectures1

Data Warehouse Architectures

~ Trend toward multidimensional data. ~


Data stores and access enablers

Data Stores and Access Enablers

  • Specialized Multidimensional Databases

    • Data is peregrinated and loaded into multidimensional databases

    • Long loading times but quick response

  • Relational-like Stores

    • Indexing is used to proved pseudo-multidimensional functionality

  • Relational Data Stores

    • An extra semantic layer generates multidimensional data on the fly

  • Hybrids

    • Details are stored in a traditional relational format

    • A subset is cached in a multidimensional data structure


Database management system dbms

Database Management System (DBMS)


Data stores and access enablers1

Data Stores and Access Enablers

  • ~ Trend toward multidimensional data.




  • Integrated Components

    • All components (sources, stores, etc) use a common metadata repository to maintain their metadata

  • Standardized Metadata Interchange

    • Components keep their own metadata

    • Components use a common interchange information model and syntax to share metadata

  • Synchronized Metadata Interchange

    • Metadata changes are updated automatically across all components

  • Building of Business Metadata

    • Manually entered, free-text, plain language descriptions



~ Trend toward better metadata, exchanged between systems. ~

Middleware gluing the warehouse together

Middleware - Gluing the Warehouse Together

  • Definition: software that shields users and developers from differences in services and resources used by applications

  • Data warehouses often have heterogeneous databases, operating systems, networks, hardware, applications

Business issues for middleware

Business Issues for Middleware

  • Role of middleware

    • Assist developer in data extraction/transformation and populating DW

    • Assist business user in accessing DW

    • Therefore needed at different points in life cycle

  • Types

    • Copy management: data extraction, transformation, replication, and propagation

    • Gateways: DB and independent gateways

    • Program-to program: RPCs, TP monitors, ORBs

    • Message-oriented

Data quality

Data Quality

  • Preprocessing Ownership

    • Source application owners know their data

    • Warehouse owners still must integrate the entire system

  • Automated Preprocessing Tools

    • Specialized packages

    • Generalized tools using pattern processing, lexical analysis, and statistical matching to reconcile a wide range of data sources

    • Custom programming

  • Reliability and Credibility of External Data

    • Quality ratings

    • Posted statistical meta-information (sample size, randomness, etc)


Data quality1

Data Quality

Trend toward better understanding of data quality. ~


Significant trends multiple data types

Significant Trends- Multiple Data Types



Structured Numeric


Data Warehouse


Structured Text



Unstructured Documents

Significant trends data visualization

Significant Trends- Data Visualization

  • More Chart Types-Pie chart, scatter plot

  • Interactive Visualization

  • Chart Manipulation

  • Drill Down

Significant trends parallel processing

Significant Trends- Parallel Processing

  • Aims to solve decision-support problems using multiple nodes working on the same problem.

  • Performs many database operations simultaneously, splitting individual tasks into smaller parts so that tasks can be spread across multiple processors.

  • Parallel DBMSs must be capable of running parallel queries, parallel data loading, table scanning, and data archiving, and back up.

Significant trends parallel processing1

Significant Trends- Parallel Processing

  • Shared memory architecture (SMP)

    • All the servers share all the data

  • Shared nothing architecture (MPP)

    • Each server has its own partition of data

Significant trends query tools browse tools

Significant Trends- Query Tools, Browse Tools

  • Flexible Presentation –online results and report generator

  • Aggregate Awareness

  • Crossing Subject Areas

  • Multiple Heterogeneous Sources

  • Integration

  • Overcoming SQL Limitations

  • Data Fusion

Significant trends integrating erp and data warehouse

Significant Trends- Integrating ERP and Data Warehouse

  • Option 1: Companies implement the data warehouse solutions of the ERP vendor with the currently available functionality and await the enhancements.

  • Option 2: Companies implement customized data warehouse and use third-party tools to extract data from the ERP datasets. Retrieving and loading data from the proprietary ERP datasets is not easy.

  • Option 3: It is a hybrid approach that combines the functionalities provided by the vendor’s data warehouse with additional functionalities from third-party tools.

Significant trends integrating km and data warehouse

Significant Trends- Integrating KM and Data Warehouse

What’s KM?

  • It is a systematic process for capturing, integrating, organizing, and communicating knowledge accumulated by employees.

  • It is a vehicle to share corporate knowledge so that employees may be more more effective and be productive in their work.

  • A knowledge management system must store all such knowledge in a knowledge repository.

Significant trends integrating km and data warehouse1

Significant Trends- Integrating KM and Data Warehouse

A specific corporate scenario:

  • Sales have dropped in the South region.

  • Your marketing VP is able to discern this from your data warehouse by running some queries and doing some preliminary analysis. If he or she has access to a document prepared by an analyst explaining why the sales are low and suggesting remedial action.

  • Knowledge must be linked to the sales result to provide context to the sales numbers from the data warehouse.

Significant trends integrating km and data warehouse2

Significant Trends- Integrating KM and Data Warehouse

An airplane sales scenario: The following information is essential

For a successful pitch for airplane sales.

  • Model configuration

  • Production schedule (Delivery schedule)

  • Part replacement

  • Warranty

    Knowledge obtained from the knowledge management

    system can provide context to the information received from

    the data warehouse to understand the story behind the above


Summary of trends

Summary of Trends

  • Ad Hoc Questions

  • Multidimensional Analysis (OLAP)

  • Web-Enabled Data Warehouse

  • Multimedia

  • Middleware

  • Metadata Interchange

  • Integrating ERP with Data Warehouse

  • Integrating KM with Data Warehouse

Complete e business suite a review

Complete E-Business Suite– A Review








One Database

Order Mgt


Human Resources




Supply Chain (SCM)

Information system categories

Information System Categories

Information system categories1

Information System Categories

Data warehouse erp

Data Warehouse & ERP

  • ERP = Enterprise Resource Planning

    – A software solution that addresses enterprise needs taking the process view of an organization to meet the

    organization goals tightly integrating all the functions

    of an organization.

    -- It integrates all the departments and functions across

    a company into a single computer system that can serve all those different departments’ particular


Why erp


  • Business

    Customer satisfaction

    Business development – new areas, products and services

    Ability to face competition

    Efficient processes required for company’s growth

  • IT

    Present software does not met business needs.

    Legacy systems difficult to maintain

    Obsolete hardware/software difficult to maintain

How erp

How ERP?

  • ERP Combines various department systems into a single, integrated software program that runs off a single database so that the various departments can more easily share information and communicate with each other.

  • The best part of ERP is the way in which it improves the order fulfillment process that is taking the customer order and process it into an invoice and revenue.

  • It doesn’t handle the front-end that is handled by CRM (Customer Relationship Management).

How erp cont d

How ERP?(cont’d)

  • When a customer service representative enters a customer order to an ERP system, he has all the information necessary to complete the order such as customer’s credit rating and order history from the finance module, the company’s inventory levels from the warehouse module and the shipping dock’s trucking schedule from the logistics module.

  • How it’s being done: It integrates the financial information and customer order information . It does so by integrating the following:






How erp cont d1

How ERP? (cont’d)

  • It standardizes and speeds up the manufacturing process. This saves time, increases productivity and reduces head count.

  • It reduces the inventory. Due to the information available about all the orders it helps to maintain the right level of stock and smoothes the manufacturing process.

Data warehouse eai

Data Warehouse & EAI

  • What is EAI? EAI refers to Enterprise Application Integration. EAI is the merging of applications and data from various new and legacy systems within a business. Various means are employed to accomplish EAI, including middleware, in order to unify IT resources, maximize new ERP investments, diminish errors and get everyone on the same page. EAI enables companies to link their existing software applications with each other and with portals. EAI provide the ability to get their applications to exchange critical data. EAI is usually close to the top of any CIO's list of concerns. There are different approaches to EAI. Some rely on linking specific applications with tailored code, but most rely on generic solutions, typically called middleware. XML, combined with SOAP and UDDI is a kind of middleware.

E business


~ Trend toward better understanding as well as consolidation of internal processes and data ~

~ Trend toward web-enabled data warehouse. ~

  • Login