1 / 24

Atlanta Microsoft Database Forum

Atlanta Microsoft Database Forum. Introduction to Data Warehousing Concepts. Presented by. Brian Thomas. Solution Builders, Inc. March 8, 2004. Brian.Thomas@SolutionBuilders.com. What is a Data Warehouse?.

tom
Download Presentation

Atlanta Microsoft Database Forum

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Atlanta Microsoft Database Forum Introduction to Data Warehousing Concepts Presented by Brian Thomas Solution Builders, Inc. March 8, 2004 Brian.Thomas@SolutionBuilders.com

  2. What is a Data Warehouse? Data collected from one or many systems that exist within and outside the organization. The Data is structured in such a way as to reduce the amount of time that it takes to produce reliable information.

  3. Why Build a Data Warehouse? • To Provide a Consistent Common Source for Corporate Information • To Store Large Volumes of Historical Detail Data from Mission Critical Applications • Improve the Ability to Access, Report Against, and Analyze Information • To Solve or Improve Upon Business Processes

  4. System Generated Reports Sales Analysis is extrapolated from the System Reports. Turning Data into Information Functional Data Warehouse Sales System

  5. Functional Data Warehouse of Sales Information Sales Information is available to a wider audience of decision makers. Turning Data into Information Functional Data Warehouse Sales System

  6. Analysis performed and Decisions drawn from the Cross Organizational Sales Data Centralized Data Warehouse of Sales Data from across the Organization Turning Data into Information Cross Organizational Functional Data Warehouse Division A Sales System Division B Sales System Division C Sales System

  7. Corporate Performance Analysis is extrapolated from the System Reports. System Generated Reports Turning Data into Information Cross Functional Data Warehouse Marketing System Sales System Production Systems

  8. Cross Functional Data Warehouse of Information Corporate Performance Analysis is available to a wider audience. Turning Data into Information Cross Functional Data Warehouse Marketing System Sales System Production Systems

  9. Analysis is performed and Decisions made from the Cross Functional Organizational Performance Data Centralized Cross Functional Data Warehouse of Information Turning Data into Information Cross Organizational & Cross Functional Data Warehouse Division A Division B Division C

  10. Enterprise Data Warehouse Increased Local Specifications Increased Level of Standardization Data Warehouse Architecture Management Systems Access Methods Source Systems Data Warehouse Components Planning & Forecasting Corporate Level Portal / Web Interface Division A Analytics & Modeling Desktop Applications Division B Business Group Level Performance Management DW / DM Data Access & Query Management Services DW / DM Extraction Transformation Load (ETL) DW / DM Printed Reports Division C Scorecards & Dashboards Email Divisional Level DM DM DM DM DM DM Query & Reporting External Data Mobile Devices

  11. Data Warehouse Architecture Source Systems Data Staging Area Data Warehouse Repository Division A Division B Extract, Transformation and Load (ETL) Division C External Data

  12. Data Warehouse Architecture Data Staging Area • Subject Area Oriented • Data Structure more closely mirrors Operational System Data Layouts • Supports Identification of Changed Data • Acts as a Working Area to Support the Transformation Process

  13. Data Warehouse Architecture Extraction, Transformation & Load (ETL) • Perform Attribute Standardization and Cleansing • Apply Business Rules and Calculations • Consolidate using Matching and Merge / Purge Logic • Ensure Proper Linking and Tracking of History Extract, Transformation and Load (ETL)

  14. Data Warehouse Architecture Extraction, Transformation & Load (ETL) App. A: Male , Female App. B: 1 , 0 App. C: x , y App. D: m , f Male, Female Lookup Function App. A: pipeline (cm) App. B: pipeline (inches) App. C: pipeline (mcf) App. D: pipeline (yds) pipeline (cm) Conversion Function App. A: Date (julian) App. B: Date (yyyymmdd) App. C: Date (mm/dd/yyyy) App. D: Date (absolute) Date (julian) Formatting Function App. A: Description App. B: Description App. C: Description App. D: Description Description Merging Function App. A: balance on hand App. B: current balance App. C: cash in house App. D: balance Balance Mapping Function

  15. Data Warehouse Architecture Data Warehouse Repository • Organized around Conformed Dimensions and Facts • Promotes Usability and Intuitiveness • Consolidated and Cross-Functional • Historical and Atomic Representation of Data • Insulated from Source System Modifications and Additions

  16. Data Warehouse Repository Star Schema Concepts Fact Table This table is the core of the Star Schema Structure and contains the Facts or Measures available through the Data Warehouse. These Facts answer the questions of “What”, “How Much”, or “How Many”. Some Examples: Sales Dollars, Units Sold, Gross Profit, Expense Amount, Net Income, Unit Cost, Number of Employees, Turnover, Salary, Tenure, etc.

  17. Data Warehouse Repository Star Schema Concepts Dimension Tables These tables describe the Facts or Measures. These tables contain the Attributes and may also be Hierarchical. These Dimensions answer the questions of “Who”, “What”, “When”, or “Where”. Some Examples: • Day, Week, Month, Quarter, Year • Sales Person, Sales Manager, VP of Sales • Product, Product Category, Product Line • Cost Center, Unit, Segment, Business, Company

  18. Shipper_Dim Time_Dim Customer_Dim Employee_Dim Product_Dim ShipperKey CustomerKey ProductKey EmployeeKey TimeKey Sales_Fact CustomerID . . . ProductID . . . ShipperID . . . TheDate . . . EmployeeID . . . TimeKey EmployeeKey ProductKey CustomerKey ShipperKey Required Data (Business Metrics) or (Measures) . . . Data Warehouse Repository Star Schema Concepts

  19. Data Warehouse Repository Cube Concepts Atlanta Chicago MarketsDimension Denver Grapes Cherries Dallas Melons Product Dimension Apples Q4 Q1 Q2 Q3 TimeDimension

  20. SalesFact Data Warehouse Repository Cube Concepts Atlanta Chicago MarketsDimension Denver Grapes Cherries Dallas Melons Product Dimension Apples Q4 Q1 Q2 Q3 TimeDimension

  21. Data Warehouse Repository Storage Concepts • Relational On-Line Analytical Processing (ROLAP): The information that is stored in the Data Warehouse is held in a relational structure. Aggregations are performed on the fly either by the database or in the analysis tool. • Multidimensional On-Line Analytical Processing (MOLAP): This information is aggregated in a predefined manner based on the characteristics of the Measures and the defined hierarchy of the Dimensions. Since the data is pre-aggregated, navigating through the hierarchies is instantaneous. The user is simply navigating to a point within the Multidimensional Cube and not performing any on the fly aggregations. • Hybrid On-Line Analytical Processing (HOLAP): This is a combination of MOLAP and ROLAP. A portion of the data is predefined and aggregated. This would typically be the set of information that is accessed most frequently. Additional detail can be held in a ROLAP structure and allow a user to drill through the MOLAP structure into the ROLAP structure.

  22. Client perspective MOLAP HOLAP ROLAP Query performance Fastest Faster Fast Storage consumption High Medium Low Data Warehouse Repository Cube Concepts

  23. Enterprise Data Warehouse Increased Local Specifications Increased Level of Standardization Microsoft Office, Reporting Services and .NET Framework Where does Microsoft fit in? SQL Server DTS SQL Server Relational Database and Analysis Services Management Systems Access Methods Source Systems Data Warehouse Components Planning & Forecasting Corporate Level Portal / Web Interface Division A Analytics & Modeling Desktop Applications Division B Business Group Level SharePoint Portal, Exchange, and .NET Framework Performance Management DW / DM Data Access & Query Management Services DW / DM Extraction Transformation Load (ETL) DW / DM Printed Reports Division C Scorecards & Dashboards Email Divisional Level DM DM DM DM DM DM Query & Reporting External Data Mobile Devices SQL Stored Procedures, SQL Views, MDX, and .NET Web Services

  24. Q & A

More Related