Data management data warehouses
1 / 28

Data Management & Data Warehouses - PowerPoint PPT Presentation

  • Uploaded on

Data Management & Data Warehouses. MIS 320 Kraig Pencil Summer 2014. Game Plan. Introduction Why use a relational database? Database management systems Data warehouses Data mining Data marts. A. Why use a relational database?.

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
Download Presentation

PowerPoint Slideshow about ' Data Management & Data Warehouses' - deliz

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
Data management data warehouses

Data Management & Data Warehouses

MIS 320

Kraig Pencil

Summer 2014

Game plan
Game Plan

  • Introduction

  • Why use a relational database?

  • Database management systems

  • Data warehouses

  • Data mining

  • Data marts

A why use a relational database
A. Why use a relational database?

1. A database sounds great, but why don’t we just store all our data in one big table in an Excel spreadsheet?

  • Example: Can you foresee any hassles or potential difficulties associated with entering/storing order information in the following Excel table?

A why use a relational database1
A. Why use a relational database?

1. Why don’t we just store our data in one spreadsheet table? (cont.)

  • Potential problems

    • May have “redundant” data entry

    • Potential for data entry errors (different/wrong phone numbers)

    • Updates can be a hassle/inefficient (e.g., change phone no)

  • Solution

    • “Normalize” the data …

       Break up the table into a set of linked tables in a data base (instead of having one spreadsheet)

      • See example

Example normalized tables and the advantages of a database
Example: Normalized Tables(and the advantages of a database)

  • Questions:

  • Any unneeded redundancy?

  • Is it now efficient to update customer info?

  • Where is the foreign key?

Example non normalized data table for an auto shop rainer turban fig 4 6
Example: Non-Normalized Data Table for an Auto Shop (Rainer & Turban, Fig 4.6)

Examples of redundancy

B database management systems
B. Database Management Systems

1. What is a “database management system” (DBMS)?

  • SW that allows one to create, store, organize, manage, and use data

    • Example of a DBMS?

      2. Key components

  • Data Definition subsystem

  • Data Manipulation subsystem

  • Application Generation subsystem

  • Data Administration subsystem

  • DBMS engine

Dbms components
DBMS Components

Lab Tutorials 1,2

Lab Tutorials 3,5

Lab Tutorials 4,6

B database management systems1
B. Database Management Systems

3. Examples of DBMS components in Access

Data Definition subsystem

  • Data dictionary (“Design view” for a table)

    Data Manipulation subsystem: Move, change, and “ask questions”

  • View of a table (“Datasheet view” for a table)

  • Query-by-example (QBE) tool

  • Structured query language (SQL)

    Application Generation subsystem: the “front end”

  • Design of forms and reports

    Data Administration subsystem

  • Optimize query performance

  • Security settings with password

B database management systems2
B. Database Management Systems

4. What aspects of data need to be specified?

  • Lots of aspects!!!

    • Recall table creation in MS Access (Tutorials 1 & 2)

  • Common data properties

    • Data “type” (number, text, date, etc.)

    • Description

    • Field size

    • Required/not required

    • Etc.

  • An important reference for a database system:

     Data dictionary

    • Stores information about the data in a database

Access example
Access Example:

Data “type” (number, text, date, etc.)


Field size

Required/not required

Information about the “Gender” field is specified in “Field Properties” section

Access example data manipulation subsystem low stock products query
Access Example: Data Manipulation Subsystem(Low Stock Products query)

QBE or SQL may be used to prepare a query. Which approach would be easier for most people?

Access example data administration performance analysis for a database
Access Example: Data Administration(Performance Analysis for a Database)

B database management systems cont
B. Database Management Systems (cont.)

5. DBMS: Example products

  • You are very likely to work with – and possibly help develop a database– using one or more of the following:

    • Small-Midsize DBMS: MicrosoftAccess, dBase, Paradox

    • Mid-to-Large DBMS: MicrosoftSQL Server, Oracle, DB2, Informix, IMS

C data warehouses
C. Data Warehouses

  • Business problem:

    • Difficult for larger organizations to analyze organizational data from multiple sources

    • Solution: Data warehouse

  • Gather/integrate information from existing operational databases into a “warehouse”

    •  Create “Business Intelligence” system

    • See next figure

Create a data warehouse from operational databases
Create a Data Warehouse from Operational Databases

From Haag, et al., MIS for the Information Age, 2004

C data warehouses cont
C. Data Warehouses (cont.)

3. Data warehouse features

  • Designed to support business decision making

    • Not transactions!

    • Supports OLAP

      • On-line Analytical Processing

  • Crosses functional boundaries of an organization

  • Can be very large

  • Note: Warehouse is “read only”

    • Why?

  • Can be a significant strategic resource for a company

    • Can yield a high ROI

  • Examples

    • ???

  • C data warehouses cont1
    C. Data Warehouses (cont.)

    • Implementation issues

      • People may be reluctant to share information

      • “ETL” process is not easy

        • Extraction, transformation, load

      • Expensive

    D data mining
    D. Data Mining

    • Provides a means to extract patterns and relationships from large amount of data (e.g., a data warehouse)

    • Mining analogy

      • Sift through raw dirt/rock to find something of value

      • Large volumes of data are sifted in an attempt to find something worthwhile

    • Example: market basket analysis

      • Identify products that may be attractive to a customer

        • See next slide: buyer suggestions

    D data mining cont
    D. Data mining (cont.)

    4. Identify previously unknown patterns

    • e.g., What are characteristics of customers likely to default on a bank loan?“Target knows before it shows”

    • How Target Figured Out A Teen Girl Was Pregnant Before Her Father Did

    • How Companies Learn Your Secrets: NYTimes

    • e.g., Suppose you discovered that beer and diapers*were often found in the same purchase?

      • “Market basket analysis”

      • What could you do with that information to improve sales of one, the other or both?

        *This is a common example, not an actual case.

    E data marts
    E. Data Marts

    5. Data marts

    • Warehouses can be overwhelming/difficult to implement …

       Some organizations create “data marts”

    • A subset of a data warehouse

      • Simpler, scaled-down version

      • Focuses on/Integrates a specific area (e.g., Sales department)

      • Provides useful decision making tools

    Haggen photo from: ferndale.php

    MiniMart photo from:

    Data marts subsets of data warehouse
    Data Marts: Subsets of Data Warehouse

    Data mining business intelligence
    Data Mining – Business Intelligence

    • A few videos to watch and think about …







    Big data
    Big Data

    • Big data[1][2] is a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications. (Wikipedia)

    • (Image)

    Global big data 2 5 exobytes day
    Global Big Data: + 2.5 exobytes/day

    • The world's technological per-capita capacity to store information has roughly doubled every 40 months since the 1980s[15]

    • As of 2012, every day 2.5 quintillion (2.5×1018) bytes of data were created.[16]

    • (Wikipedia)

    • (Image)

    Big data1
    Big Data

    • The next frontier in data?


    • Some terms:

      • Hadoop (distributed file organization)

      • Distributed databases and server clusters

      • Cassandra (No only SQL DBMS)

      • MapReduce (breaking computation into smaller pieced, then combining the results of each computation)