1 / 43

Introduction to Data Warehousing

Introduction to Data Warehousing. Why Data Warehouse?. Scenario 1. ABC Pvt. Ltd is a company with branches at Mumbai, Delhi, Chennai and Bangalore. The Sales Manager wants quarterly sales report. Each branch has a separate operational system. Scenario 1 : ABC Pvt Ltd. Mumbai. Delhi.

steven-moss
Download Presentation

Introduction to Data Warehousing

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Introduction to Data Warehousing

  2. Why Data Warehouse?

  3. Scenario 1 ABC Pvt. Ltd is a company with branches at Mumbai, Delhi, Chennai and Bangalore. The Sales Manager wants quarterly sales report. Each branch has a separate operational system.

  4. Scenario 1 : ABC Pvt Ltd. Mumbai Delhi Sales per item type per branch for first quarter. Sales Manager Chennai Banglore

  5. Solution 1:ABC Pvt Ltd. • Extract sales information from each database. • Store the information in a common repository at a single site.

  6. Solution 1:ABC Pvt Ltd. Mumbai Data Warehouse Report Delhi Query & Analysis tools Sales Manager Chennai Banglore

  7. Scenario 2 One Stop Shopping Super Market has huge operational database. Whenever Executives wants some report the OLTP system becomes slow and data entry operators have to wait for some time.

  8. Scenario 2 : One Stop Shopping Data Entry Operator Operational Database Report Wait Management Data Entry Operator

  9. Solution 2 • Extract data needed for analysis from operational database. • Store it in another system, the data warehouse. • Refresh warehouse at regular intervals so that it contains up to date information for analysis. • Warehouse will contain data with historical perspective.

  10. Solution 2 Data Entry Operator Operational database Data Warehouse Report Extract data Manager Transaction Data Entry Operator

  11. Scenario 3 Cakes & Cookies is a small, new company. The chairman of this company wants his company to grow. He needs information so that he can make correct decisions.

  12. Solution 3 • Improve the quality of data before loading it into the warehouse. • Perform data cleaning and transformation before loading the data. • Use query analysis tools to support adhoc queries.

  13. Solution 3 Expansion sales Data Warehouse Query & Analysis tool Chairman time Improvement

  14. Summing up? • Why do you need a warehouse? • Operational systems could not provide strategic information • Executive and managers need such strategic information for • Making proper decision • Formulating business strategies • Establishing goals • Setting objectives • Monitoring results

  15. Why operational data is not capable of producing valuable information? • Data is spread across incompatible structures and systems • Not only that, improvements in technology had made computing faster, cheaper and available

  16. FAILURES OF PAST DECISION-SUPPORT SYSTEMS OLTP systems

  17. Decision support systems

  18. Operational and informational

  19. What is Data Warehouse??Is it the only viable solution

  20. Business intelligence at DW

  21. Functional definition of a DW • The data warehouse is an informational environment that • Provides an integrated and total view of the enterprise • Makes the enterprise’s current and historical information easily available for decision making • Makes decision-support transactions possible without hindering operational systems • Renders the organization’s information consistent • Presents a flexible and interactive source of strategic information

  22. Questions???? • Describe five differences between operational systems and informational systems • A data warehouse in an environment, not a product. Discuss.

  23. Building Blocks of a Datawarehouse

  24. Inmons’s definition A data warehouse is - subject-oriented, - integrated, - time-variant, - nonvolatile collection of data in support of management’s decision making process.

  25. Subject-oriented • Data warehouse is organized around subjects such as sales, product, customer. • It focuses on modeling and analysis of data for decision makers. • Excludes data not useful in decision support process.

  26. Integration • Data Warehouse is constructed by integrating multiple heterogeneous sources. • Data Preprocessing are applied to ensure consistency. RDBMS Data Warehouse Legacy System Flat File Data Processing Data Transformation

  27. Integration • In terms of data. • encoding structures. • Measurement of attributes. • physical attribute. of data • naming conventions. • Data type format remarks

  28. Time-variant • Provides information from historical perspective, e.g. past 5-10 years • Every key structure contains either implicitly or explicitly an element of time, i.e., every record has a timestamp. • The time-variant nature in a DW • Allows for analysis of the past • Relates information to the present • Enables forecasts for the future

  29. Non-volatile • Data once recorded cannot be updated. • Data warehouse requires two operations in data accessing • Initial loading of data • Incremental loading of data load access

  30. Data Granularity • In an operational system, data is usually kept at the lowest level of detail. • In a DW, data is summarized at different levels. Three data levels in a banking data warehouse

  31. Operational v/s Information System

  32. Operational v/s Information System

  33. Two approaches in designing a DW

  34. Bottom Up Approach

  35. Top Down Approach

  36. A Practical Approach-Kimball • Plan and Define requirements • Create a surrounding architecture • Conform and Standardize the data Content • Implement Data Warehouse as series of super-mart one at a time.

  37. Distribution Sales Product Customer Accounts Marketing Inventory Finance Vendors An Incremental Approach Glossary Common Business Metrics Common Business Rules Common Business Dimensions Common Logical Subject Area ERD Individual Architected Data Marts

  38. Distribution Sales Product Customer Accounts Marketing Inventory Finance Vendors Enterprise Data Warehouse The Eventual Result Architected Enterprise Foundation

  39. Data Warehouse: Holds multiple subject areas Holds very detailed information Works to integrate all data sources Does not necessarily use a dimensional model but feeds dimensional models. Data Mart Often holds only one subject area- for example, Finance, or Sales May hold more summarised data (although many hold full detail) Concentrates on integrating information from a given subject area or set of source systems Is built focused on a dimensional model using a star schema.

  40. Data Warehouse verses data marts

More Related