data warehouses n.
Download
Skip this Video
Loading SlideShow in 5 Seconds..
Data Warehouses PowerPoint Presentation
Download Presentation
Data Warehouses

Loading in 2 Seconds...

play fullscreen
1 / 13

Data Warehouses - PowerPoint PPT Presentation


  • 93 Views
  • Uploaded on

Data Warehouses. Richard Goerwitz. What’s a Data Warehouse?. People disagree on what a DW is Instead of defining one up front I’m going to Talk about key concepts used by data warehouse architects Show you an example of a small data warehouse component

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

Data Warehouses


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
data warehouses

Data Warehouses

Richard Goerwitz

what s a data warehouse
What’s a Data Warehouse?
  • People disagree on what a DW is
  • Instead of defining one up front I’m going to
    • Talk about key concepts used by data warehouse architects
    • Show you an example of a small data warehouse component
    • Talk about how you can repeat what I’ve done
key terms acronyms
Key Terms, Acronyms
  • By the end of this talk, you’ll know what these things all mean:
    • OLTP
    • OLAP
    • ETL
    • Star schema
    • Conformed dimension
    • Data mart
    • Data warehouse
slide4
OLTP
  • OLTP = online transaction processing
  • The process of moving data around to handle day-to-day affairs
    • Scheduling classes
    • Registering students
    • Recording grades
    • Recording payments, etc.
  • Systems supporting this kind of activity are called transactional systems
transactional systems
Transactional Systems
  • Transactional systems are optimized primarily for the here and now
    • They allow for constant change
    • Don’t record all previous data states
      • Name changes
      • Course rosters, etc.
    • Support many simultaneous users
    • Permit often heavy read/write access
transactional databases
Transactional Databases
  • Databases that support transactional systems must be heavily normalized
  • This means they should -
    • Shun duplicate data (1NF)
    • Keep dependent data with the stuff it depends on (2NF)
    • Keep data dependencies straightforward within each database table (3NF)
    • Structure tables so that changes only “lock up” a small part of the DB at a time
carleton databases
Carleton Databases
  • Here are some major DBs at Carleton
    • Informix – Card access
    • UniData 6.1 – SIS, financials, etc.
    • MS SQL Server 2000 – Asset tracking, etc.
    • Oracle 9i – Alumni, document management
    • MySQL 4.x – web system back ends
    • PostgreSQL 8.x – courses like this one
  • You tell me -
    • Which ones are transactional in nature?
slide8
OLAP
  • OLAP = online analytical processing
  • The process of analyzing data derived from OLTP (and other) systems
  • What might we analyze?
    • Class enrollments, by department
    • Courses taught, by department
    • Who is majoring in what
    • All of the above, correlated against an additional time dimension
olap support systems
OLAP Support Systems
  • OLAP support systems are optimized for analysis
  • What does this mean?
    • They are geared mainly for read access
    • They support fewer simultaneous users
    • They hold snapshots of OLTP data
      • Provide history
      • Give us time depth to our analyses
    • They change relatively slowly
slide10
ETL
  • When OLAP support systems change, the changes are typically made via periodic (e.g., nightly) ETL processes
  • ETL = extract, transform, load
  • ETL means specifically -
    • Extracting data from various sources
    • Transforming and cleaning the data
    • Loading data into databases used for analysis and reporting
star schemas
Star Schemas
  • ETL processes need to load data into a database
  • How does this database look?
    • It’s structurally simple
    • Separates data into fact tables and dimension tables
    • Reorganizes fact and dimension tables into a series of star-like schemas
    • (Don’t worry, I’ll be showing you a graphic example of one of these.)
data marts
Data Marts
  • A data mart is really just a series of star schemas
  • Dimension tables in a series of star schemas must be identical or at least conformed
    • By conformed I mean
      • The tables are proper subsets of each other
      • They are hierarchized the same way internally
data warehouses1
Data Warehouses
  • So what is a data warehouse (DW)?
  • As noted, people use this term loosely, in various ways
  • But in general we can say that a DW:
    • Encompasses a number of smaller data marts
    • Derives its data from multiple back-end systems
    • Is optimized for OLAP, not OLTP
    • Holds historical snapshots
    • Must be heavily documented