Data management data warehouses
This presentation is the property of its rightful owner.
Sponsored Links
1 / 28

Data Management & Data Warehouses PowerPoint PPT Presentation


  • 88 Views
  • Uploaded on
  • Presentation posted in: General

Data Management & Data Warehouses. MIS 320 Kraig Pencil Summer 2014. Game Plan. Introduction Why use a relational database? Database management systems Data warehouses Data mining Data marts. A. Why use a relational database?.

Download Presentation

Data Management & Data Warehouses

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


Data management data warehouses

Data Management & Data Warehouses

MIS 320

Kraig Pencil

Summer 2014


Game plan

Game Plan

  • Introduction

  • Why use a relational database?

  • Database management systems

  • Data warehouses

  • Data mining

  • Data marts


A why use a relational database

A. Why use a relational database?

1. A database sounds great, but why don’t we just store all our data in one big table in an Excel spreadsheet?

  • Example: Can you foresee any hassles or potential difficulties associated with entering/storing order information in the following Excel table?


A why use a relational database1

A. Why use a relational database?

1. Why don’t we just store our data in one spreadsheet table? (cont.)

  • Potential problems

    • May have “redundant” data entry

    • Potential for data entry errors (different/wrong phone numbers)

    • Updates can be a hassle/inefficient (e.g., change phone no)

  • Solution

    • “Normalize” the data …

       Break up the table into a set of linked tables in a data base (instead of having one spreadsheet)

      • See example


Example normalized tables and the advantages of a database

Example: Normalized Tables(and the advantages of a database)

  • Questions:

  • Any unneeded redundancy?

  • Is it now efficient to update customer info?

  • Where is the foreign key?


Example non normalized data table for an auto shop rainer turban fig 4 6

Example: Non-Normalized Data Table for an Auto Shop (Rainer & Turban, Fig 4.6)

Examples of redundancy


B database management systems

B. Database Management Systems

1. What is a “database management system” (DBMS)?

  • SW that allows one to create, store, organize, manage, and use data

    • Example of a DBMS?

      2. Key components

  • Data Definition subsystem

  • Data Manipulation subsystem

  • Application Generation subsystem

  • Data Administration subsystem

  • DBMS engine


Dbms components

DBMS Components

Lab Tutorials 1,2

Lab Tutorials 3,5

Lab Tutorials 4,6


B database management systems1

B. Database Management Systems

3. Examples of DBMS components in Access

Data Definition subsystem

  • Data dictionary (“Design view” for a table)

    Data Manipulation subsystem: Move, change, and “ask questions”

  • View of a table (“Datasheet view” for a table)

  • Query-by-example (QBE) tool

  • Structured query language (SQL)

    Application Generation subsystem: the “front end”

  • Design of forms and reports

    Data Administration subsystem

  • Optimize query performance

  • Security settings with password


B database management systems2

B. Database Management Systems

4. What aspects of data need to be specified?

  • Lots of aspects!!!

    • Recall table creation in MS Access (Tutorials 1 & 2)

  • Common data properties

    • Data “type” (number, text, date, etc.)

    • Description

    • Field size

    • Required/not required

    • Etc.

  • An important reference for a database system:

     Data dictionary

    • Stores information about the data in a database


Access example

Access Example:

Data “type” (number, text, date, etc.)

Description

Field size

Required/not required

Information about the “Gender” field is specified in “Field Properties” section


Access example data manipulation subsystem low stock products query

Access Example: Data Manipulation Subsystem(Low Stock Products query)

QBE or SQL may be used to prepare a query. Which approach would be easier for most people?


Access example application generation subsystem employer information form

Access Example: Application Generation Subsystem(Employer Information Form)


Access example data administration performance analysis for a database

Access Example: Data Administration(Performance Analysis for a Database)


B database management systems cont

B. Database Management Systems (cont.)

5. DBMS: Example products

  • You are very likely to work with – and possibly help develop a database– using one or more of the following:

    • Small-Midsize DBMS: MicrosoftAccess, dBase, Paradox

    • Mid-to-Large DBMS: MicrosoftSQL Server, Oracle, DB2, Informix, IMS


C data warehouses

C. Data Warehouses

  • Business problem:

    • Difficult for larger organizations to analyze organizational data from multiple sources

    • Solution: Data warehouse

  • Gather/integrate information from existing operational databases into a “warehouse”

    •  Create “Business Intelligence” system

    • See next figure


Create a data warehouse from operational databases

Create a Data Warehouse from Operational Databases

From Haag, et al., MIS for the Information Age, 2004


C data warehouses cont

C. Data Warehouses (cont.)

3. Data warehouse features

  • Designed to support business decision making

    • Not transactions!

    • Supports OLAP

      • On-line Analytical Processing

  • Crosses functional boundaries of an organization

  • Can be very large

  • Note: Warehouse is “read only”

    • Why?

  • Can be a significant strategic resource for a company

    • Can yield a high ROI

  • Examples

    • ???


  • C data warehouses cont1

    C. Data Warehouses (cont.)

    • Implementation issues

      • People may be reluctant to share information

      • “ETL” process is not easy

        • Extraction, transformation, load

      • Expensive


    D data mining

    D. Data Mining

    • Provides a means to extract patterns and relationships from large amount of data (e.g., a data warehouse)

    • Mining analogy

      • Sift through raw dirt/rock to find something of value

      • Large volumes of data are sifted in an attempt to find something worthwhile

    • Example: market basket analysis

      • Identify products that may be attractive to a customer

        • See next slide: Amazon.com buyer suggestions


    Data management data warehouses

    Data Mining: Example of pattern discovered via mining


    D data mining cont

    D. Data mining (cont.)

    4. Identify previously unknown patterns

    • e.g., What are characteristics of customers likely to default on a bank loan?“Target knows before it shows”

    • How Target Figured Out A Teen Girl Was Pregnant Before Her Father Did

    • How Companies Learn Your Secrets: NYTimes

    • e.g., Suppose you discovered that beer and diapers*were often found in the same purchase?

      • “Market basket analysis”

      • What could you do with that information to improve sales of one, the other or both?

        *This is a common example, not an actual case.


    E data marts

    E. Data Marts

    5. Data marts

    • Warehouses can be overwhelming/difficult to implement …

       Some organizations create “data marts”

    • A subset of a data warehouse

      • Simpler, scaled-down version

      • Focuses on/Integrates a specific area (e.g., Sales department)

      • Provides useful decision making tools

    Haggen photo from: www.callhugh.com/ ferndale.php

    MiniMart photo from: http://www.ae.gatech.edu/research/controls/pictures/f020801_gtar/Mini%20Mart.JPG


    Data marts subsets of data warehouse

    Data Marts: Subsets of Data Warehouse


    Data mining business intelligence

    Data Mining – Business Intelligence

    • A few videos to watch and think about …

    • http://www.youtube.com/user/SASsoftware?v=C14GVhNt7Do&feature=pyv&ad=4782573666&kw=CRM

    • http://www.youtube.com/user/ibm?#p/c/13/fFdITHMuy2w

    • http://www.youtube.com/user/SASsoftware?v=2677nWVNg9M&feature=pyv&ad=4782551166&kw=business%20analytics

    • http://www.youtube.com/watch?v=El_lSd6G5WU

    • http://www.youtube.com/watch?v=uP89kaDU40c

    • http://www.youtube.com/user/SASsoftware?v=C14GVhNt7Do&feature=pyv&ad=4782573666&kw=CRM#p/u/35/ecqk0JUKvAI


    Big data

    Big Data

    • Big data[1][2] is a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications. (Wikipedia)

    • (Image)


    Global big data 2 5 exobytes day

    Global Big Data: + 2.5 exobytes/day

    • The world's technological per-capita capacity to store information has roughly doubled every 40 months since the 1980s[15]

    • As of 2012, every day 2.5 quintillion (2.5×1018) bytes of data were created.[16]

    • (Wikipedia)

    • (Image)


    Big data1

    Big Data

    • The next frontier in data?

    • http://www.eweek.com/c/a/Data-Storage/Big-Data-Analytics-Is-Just-Starting-to-Reach-Its-Potential-10-Reasons-Why-457684/?kc=EWKNLEAU07102012STR1

    • Some terms:

      • Hadoop (distributed file organization)

      • Distributed databases and server clusters

      • Cassandra (No only SQL DBMS)

      • MapReduce (breaking computation into smaller pieced, then combining the results of each computation)


  • Login