data management data warehouses
Skip this Video
Download Presentation
Data Management & Data Warehouses

Loading in 2 Seconds...

play fullscreen
1 / 28

Data Management & Data Warehouses - PowerPoint PPT Presentation

  • Uploaded on

Data Management & Data Warehouses. MIS 320 Kraig Pencil Summer 2014. Game Plan. Introduction Why use a relational database? Database management systems Data warehouses Data mining Data marts. A. Why use a relational database?.

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
Download Presentation

PowerPoint Slideshow about 'Data Management & Data Warehouses' - deliz

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
data management data warehouses

Data Management & Data Warehouses

MIS 320

Kraig Pencil

Summer 2014

game plan
Game Plan
  • Introduction
  • Why use a relational database?
  • Database management systems
  • Data warehouses
  • Data mining
  • Data marts
a why use a relational database
A. Why use a relational database?

1. A database sounds great, but why don’t we just store all our data in one big table in an Excel spreadsheet?

  • Example: Can you foresee any hassles or potential difficulties associated with entering/storing order information in the following Excel table?
a why use a relational database1
A. Why use a relational database?

1. Why don’t we just store our data in one spreadsheet table? (cont.)

  • Potential problems
    • May have “redundant” data entry
    • Potential for data entry errors (different/wrong phone numbers)
    • Updates can be a hassle/inefficient (e.g., change phone no)
  • Solution
    • “Normalize” the data …

 Break up the table into a set of linked tables in a data base (instead of having one spreadsheet)

      • See example
example normalized tables and the advantages of a database
Example: Normalized Tables(and the advantages of a database)
  • Questions:
  • Any unneeded redundancy?
  • Is it now efficient to update customer info?
  • Where is the foreign key?
example non normalized data table for an auto shop rainer turban fig 4 6
Example: Non-Normalized Data Table for an Auto Shop (Rainer & Turban, Fig 4.6)

Examples of redundancy

b database management systems
B. Database Management Systems

1. What is a “database management system” (DBMS)?

  • SW that allows one to create, store, organize, manage, and use data
    • Example of a DBMS?

2. Key components

  • Data Definition subsystem
  • Data Manipulation subsystem
  • Application Generation subsystem
  • Data Administration subsystem
  • DBMS engine
dbms components
DBMS Components

Lab Tutorials 1,2

Lab Tutorials 3,5

Lab Tutorials 4,6

b database management systems1
B. Database Management Systems

3. Examples of DBMS components in Access

Data Definition subsystem

  • Data dictionary (“Design view” for a table)

Data Manipulation subsystem: Move, change, and “ask questions”

  • View of a table (“Datasheet view” for a table)
  • Query-by-example (QBE) tool
  • Structured query language (SQL)

Application Generation subsystem: the “front end”

  • Design of forms and reports

Data Administration subsystem

  • Optimize query performance
  • Security settings with password
b database management systems2
B. Database Management Systems

4. What aspects of data need to be specified?

  • Lots of aspects!!!
    • Recall table creation in MS Access (Tutorials 1 & 2)
  • Common data properties
    • Data “type” (number, text, date, etc.)
    • Description
    • Field size
    • Required/not required
    • Etc.
  • An important reference for a database system:

 Data dictionary

      • Stores information about the data in a database
access example
Access Example:

Data “type” (number, text, date, etc.)


Field size

Required/not required

Information about the “Gender” field is specified in “Field Properties” section

access example data manipulation subsystem low stock products query
Access Example: Data Manipulation Subsystem(Low Stock Products query)

QBE or SQL may be used to prepare a query. Which approach would be easier for most people?

b database management systems cont
B. Database Management Systems (cont.)

5. DBMS: Example products

  • You are very likely to work with – and possibly help develop a database– using one or more of the following:
    • Small-Midsize DBMS: MicrosoftAccess, dBase, Paradox
    • Mid-to-Large DBMS: MicrosoftSQL Server, Oracle, DB2, Informix, IMS
c data warehouses
C. Data Warehouses
  • Business problem:
    • Difficult for larger organizations to analyze organizational data from multiple sources
    • Solution: Data warehouse
  • Gather/integrate information from existing operational databases into a “warehouse”
    •  Create “Business Intelligence” system
    • See next figure
create a data warehouse from operational databases
Create a Data Warehouse from Operational Databases

From Haag, et al., MIS for the Information Age, 2004

c data warehouses cont
C. Data Warehouses (cont.)

3. Data warehouse features

    • Designed to support business decision making
      • Not transactions!
      • Supports OLAP
        • On-line Analytical Processing
    • Crosses functional boundaries of an organization
    • Can be very large
    • Note: Warehouse is “read only”
      • Why?
    • Can be a significant strategic resource for a company
      • Can yield a high ROI
  • Examples
    • ???
c data warehouses cont1
C. Data Warehouses (cont.)
  • Implementation issues
    • People may be reluctant to share information
    • “ETL” process is not easy
      • Extraction, transformation, load
    • Expensive
d data mining
D. Data Mining
  • Provides a means to extract patterns and relationships from large amount of data (e.g., a data warehouse)
  • Mining analogy
    • Sift through raw dirt/rock to find something of value
    • Large volumes of data are sifted in an attempt to find something worthwhile
  • Example: market basket analysis
    • Identify products that may be attractive to a customer
      • See next slide: buyer suggestions
d data mining cont
D. Data mining (cont.)

4. Identify previously unknown patterns

  • e.g., What are characteristics of customers likely to default on a bank loan?“Target knows before it shows”
  • How Target Figured Out A Teen Girl Was Pregnant Before Her Father Did
  • How Companies Learn Your Secrets: NYTimes
  • e.g., Suppose you discovered that beer and diapers*were often found in the same purchase?
    • “Market basket analysis”
    • What could you do with that information to improve sales of one, the other or both?

*This is a common example, not an actual case.

e data marts
E. Data Marts

5. Data marts

  • Warehouses can be overwhelming/difficult to implement …

 Some organizations create “data marts”

  • A subset of a data warehouse
    • Simpler, scaled-down version
    • Focuses on/Integrates a specific area (e.g., Sales department)
    • Provides useful decision making tools

Haggen photo from: ferndale.php

MiniMart photo from:

data mining business intelligence
Data Mining – Business Intelligence
  • A few videos to watch and think about …
big data
Big Data
  • Big data[1][2] is a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications. (Wikipedia)
  • (Image)
global big data 2 5 exobytes day
Global Big Data: + 2.5 exobytes/day
  • The world's technological per-capita capacity to store information has roughly doubled every 40 months since the 1980s[15]
  • As of 2012, every day 2.5 quintillion (2.5×1018) bytes of data were created.[16]
  • (Wikipedia)
  • (Image)
big data1
Big Data
  • The next frontier in data?
  • Some terms:
    • Hadoop (distributed file organization)
    • Distributed databases and server clusters
    • Cassandra (No only SQL DBMS)
    • MapReduce (breaking computation into smaller pieced, then combining the results of each computation)