mis 5346 data warehousing n.
Download
Skip this Video
Loading SlideShow in 5 Seconds..
MIS 5346 Data warehousing PowerPoint Presentation
Download Presentation
MIS 5346 Data warehousing

Loading in 2 Seconds...

play fullscreen
1 / 30

MIS 5346 Data warehousing - PowerPoint PPT Presentation


  • 148 Views
  • Uploaded on

MIS 5346 Data warehousing. Concepts and Components Chapters 1 , 2, 7, 14, 15. Agenda. Background Data Warehouse vs Operational Data Store Characteristics of a Data Warehouse Improvements in Data Warehousing Relationship to Business Intelligence.

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'MIS 5346 Data warehousing' - fagan


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
mis 5346 data warehousing
MIS 5346 Data warehousing

Concepts and Components

Chapters 1, 2, 7, 14, 15

agenda
Agenda
  • Background
  • Data Warehouse vs Operational Data Store
  • Characteristics of a Data Warehouse
  • Improvements in Data Warehousing
  • Relationship to Business Intelligence
evolution of decision support technologies
Evolution of Decision Support Technologies
  • Business people need information to make plans, decisions, and assess results
  • 60's
    • Batch reports
  • 70’s
    • DSSs
  • 80’s
    • Info Centers
  • 90’s
    • Early DWs
  • 2000's
    • Business Intelligence
  • Issues:
    • Dependency on IT resources
    • Based on OLTP or extracts
    • Functionality often pre-programmed
    • "Big Data" Analytics
dw vs business intelligence
DW vs. Business Intelligence
  • Short:
    • DW = populating structures with data
    • BI = using DW data
  • Long:
    • DW = body of historical data, separate from the operations of the organization, used to create BI
    • BI = the delivery of timely, accurate, and useful information to decision-makers
  • Broad:
    • BI = a broad category of applications, technologies, and organizational processes for gathering, storing, accessing, and analyzing data to help business users make better decisions
need for decision optimized data storage
Need for Decision-Optimized Data Storage
  • Business people need information to make plans, decisions, and assess results
    • What were sales volumes by region and product category for the last 3 years?
    • Which of two new medications will result in the best outcomes (higher recovery rate and shorter hospital stay)?
  • Data captured by complex operational systems (OLTPs) optimized to support well-defined transaction requirements
  • Difficult to get needed information from data grounded in OLTPs
data warehouse
Data Warehouse

“… a subject-oriented, integrated, nonvolatile, and time variant collection of data in support of management decisions.”

    • Managing the Data Warehouse, W. H. Inmon, John Wiley & Sons, December, 1996.
  • “… a copy of transaction data specifically structured for query and analysis.”
    • The Data Warehouse Toolkit, R. Kimball, John Wiley & Sons, February, 1996.
  • Enterprise data, transformed, integrated, accumulated over time, optimized for decision-making, and accessible via analytical tools
characteristics of a dw ala inmon
Characteristics of a DW (ala Inmon)
  • Subject-Oriented
    • As opposed to business-process oriented
  • Integrated
    • Multiple sources, internal and external
    • Critical part of DW implementation
  • Time-Variant
    • History, time periods important
  • Non-Volatile
    • DW data not changed once stored
characteristics of a dw cont
Characteristics of a DW, cont…
  • Subject-Oriented
    • Needs are business subject-focused
  • Integrated
    • Multiple sources, internal and external
  • Time-Variant
    • History, time periods important
  • Non-Volatile
    • DW data not changed once stored
  • Data Granularity
data granularity
Data Granularity
  • Level of detail stored in database
    • Operational focus
    • Analytical focus
  • Examples:
    • Life Insurance Policy vs. Life Insurance Coverage
    • Product Category vs. Product Sales
  • High granularity (eg, transactional grain) is most flexible
challenges in early dw implementation
Challenges in Early DW Implementation
  • Improper or infeasible architectures, approaches
  • Insufficient attention to organizational strategy and culture
  • Early information delivery tools too complex for business users
  • Storage technology made it difficult to store much detail or history, and slow to process
improved technology
Improved Technology
  • User-friendly tools for analysis, visualization
    • Excel
    • Tableau
    • Reporting Services, …
  • Improved technology for accessing, aggregating, partitioning data
  • Advances in processing technology
    • Parallel processing
  • Advances in storage technology
    • RAID
    • Solid State
improved architectures
Improved Architectures
  • Based on
    • Data Marts
    • Conformed dimensions
    • BI-emphasis
data warehouse vs data marts
Data Warehouse vs. Data Marts
  • Enterprise Data Warehouse
    • Information about ALL subjects important to the organization
data warehouse vs data marts cont
Data Warehouse vs. Data Marts, cont…
  • Data Marts
    • Subsets of data warehouse that focus on a selected subject area; typically departmental in nature
bi architecture
BI Architecture

Source: Chaudhuri et. al. , An Overview of Business Intelligence Technology, Communications of the ACM, 54(8), August 2011, pp. 88-98.

bi architecture cont
BI Architecture, cont…

Source: Oracle Corporation. Information Management and Big Data: A Reference Architecture, Oracle White Paper, February 2013, p. 12.

architecture components
Architecture Components
  • Data Sources
  • Data Staging (Movement)
  • Data Storage (Warehouse)
  • Data Analysis/Discovery (Mid-tier)
  • Information Delivery (Front-end Presentation)
1 data sources
1. Data Sources
  • Identifying required business data from
    • Production
    • Internal, Personal
    • Archived
    • External
2 data staging
2. Data Staging
  • Extract
    • From source systems
  • Transform
    • Cleanse
    • Supplement
    • Convert
    • Combine…
  • Load
    • Populate data warehouse/mart tables
3 data storage
3. Data Storage
  • Data Warehouse / Data Mart
    • Relational database for structured data
    • Non-relational (e.g., Hadoop) data store for "loosely-structured" data
  • Metadata
    • Relational database
      • Catalog
      • Extended properties
      • Custom tables
    • External products/tools
    • Spreadsheets…
4 data analysis supporting knowledge discovery
4. Data Analysis: Supporting Knowledge Discovery
  • Layout-Led Discovery
    • Pre-Designed Reports
  • Data-Led Discovery
    • OLAP Analysis
  • Model-Led Discovery
    • Data Mining
pre defined reports
Pre-Defined Reports
  • Information pushed to user
  • Content and layout pre-determined
  • Can be parameter-driven
  • Can support some drill-down
  • May also include basic report development
slide26
OLAP
  • Online Analytical Processing
    • Providing On-Line Analytical Processing to User Analysts, E. F. Codd, Codd & Date, Inc 1993.
  • Short Definition:
    • Class of applications or tools that support ad-hoc analysis of multidimensional data
  • Longer Definition:
    • “…technology that enables [users]… to gain insight into data through…fast, consistent, interactive access [to]…information that has been transformed…to reflect the real dimensionality of the enterprise…”
      • OLAP Council (www.olapcouncil.org)
data mining
Data Mining
  • Search for patterns in large amounts of data
    • Making connections/associations with data
    • Predicting future outcomes
  • OLAP vs. Data Mining
    • “Report on the past” vs. “Predict the future”
  • Part of Knowledge Discovery…
next time
Next Time…
  • Data Warehouse Design (Dimensional Modeling)
  • *** Assignment 1 Due 1/28 ***