slide1 l.
Download
Skip this Video
Loading SlideShow in 5 Seconds..
DATA RESOURCE MANAGEMENT PowerPoint Presentation
Download Presentation
DATA RESOURCE MANAGEMENT

Loading in 2 Seconds...

play fullscreen
1 / 42

DATA RESOURCE MANAGEMENT - PowerPoint PPT Presentation


  • 194 Views
  • Uploaded on

DATA RESOURCE MANAGEMENT. Data Hierarchy in a Computer System. Entitities and Attributes. Problems with the Traditional File Environment. Data redundancy Program-Data dependence Lack of flexibility Poor security Lack of data-sharing and availability. Figure 7-3.

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'DATA RESOURCE MANAGEMENT' - jacob


Download Now An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
slide1

DATA

RESOURCE

MANAGEMENT

slide4

Problems with the Traditional File Environment

  • Data redundancy
  • Program-Data dependence
  • Lack of flexibility
  • Poor security
  • Lack of data-sharing and availability
slide5

Figure 7-3

Traditional File Processing

slide6

Database Management System (DBMS)

  • Creates and maintains databases
  • Eliminates requirement for data definition statements
  • Acts as interface between application programs and physical data files
  • Separates logical and physical views of data
slide8

Components of DBMS

  • Data definition language: Specifies content and structure of database and defines each data element
  • Data manipulation language:
  • Manipulates data in a database
  • Data dictionary:Stores definitions of data elements, and data characteristics
slide10

Figure 7-6

Relational Data Model

slide11

Three Basic Operations in a Relational Database

  • Select:Creates subset of rows that meet specific criteria
  • Join:Combines relational tables to provide users with information
  • Project:Enables users to create new tables containing only relevant information
slide12

Figure 7-7

Three Basic Operations in a Relational Database

ensuring database integrity
Ensuring Database Integrity
  • Database integrity involves the maintenance of the logical and business rules of the database.
  • There are two kinds of “DB Integrity” that must be addressed:
    • Entity Integrity
    • Referential Integrity
entity integrity
Entity Integrity
  • Entity integrity deals with within-entity rules.
  • These rules deal with ranges and the permission of null values in attributes or possibly between records
examples of entity integrity
Examples of Entity Integrity
  • Data Type Integrity: very common and most basic. Checks only for “data type” compatibility with DB Schema, such as: numeric, character, logical, date format, etc.
  • Commonly referred to in GIS manuals as:
    • Range and List domains
      • Ranges - acceptable Numeric ranges for input
      • List - acceptable text entries or drop-down lists.
enforcing integrity
Enforcing Integrity
  • Not a trivial task!
  • Not all database management systems or GIS software enable users to “enforce data integrity” during attribute entry or edit sessions.
  • Therefore, the programmer or the Database Administrator must enforce and/or check for “Integrity.”
referential integrity
Referential Integrity
  • Referential integrity concerns two or more tables that are related.
  • Example: IF table A contains a foreign key that matches the primary key of table B THEN values of this foreign key either match the value of the primary key for a row in table B or must be null.
  • Necessary to avoid: Update anomaly, Delete anomaly.
slide20

Querying Databases:

Elements of SQL

Basic SQL Commands

  • SELECT: Specifies columns
  • FROM: Identifies tables or views
  • WHERE: Specifies conditions
using sql structured query language
Using SQL- Structured Query Language
  • SQL is a standard database protocol, adopted by most ‘relational’ databases
  • Provides syntax for data:
    • Definition
    • Retrieval
    • Functions (COUNT, SUM, MIN, MAX, etc)
    • Updates and Deletes
sql examples
SQL Examples
  • CREATE TABLE SALESREP
    • Item definition expression(s)
      • {item, type, (width)}
  • DELETE table
    • WHERE expression
data retrieval
Data Retrieval
  • SELECT list FROM table WHERE condition
  • list - a list of items or * for all items
    • WHERE - a logical expression limiting the number of records selected
    • can be combined with Boolean logic: AND, OR, NOT
    • ORDER may be used to format results
update tables
UPDATE tables
  • SET item = expression
  • WHERE expression
  • INSERT INTO table
  • VALUES …..
database normalization
Database Normalization
  • Normalization: The process of structuring data to minimize duplication and inconsistencies.
  • The process usually involves breaking down a single Table into two or more tables and defining relationships between those tables.
  • Normalization is usually done in stages, with each stage applying more rigorous rules to the types of information which can be stored in a table.
normalization
Normalization
  • Normalization: a process for analyzing the design of a relational database
    • Database Design - Arrangement of attributes into entities
  • It permits the identification of potential problems in your database design
  • Concepts related to Normalization:
    • KEYS and FUNCTIONAL DEPENDENCE
ex database normalization 1
Ex: Database Normalization (1)
  • Sample Student Activities DB Table
  • Poorly Designed
    • Non-unique records
      • John Smith
  • Test the Design by developing sample reports and queries
ex database normalization 2
Ex: Database Normalization (2)
  • Created a unique “ID” for each Record in the Activities Table
  • Required the creation of an “ID” look-up table for reporting (Students Table)
  • Converted the “Flat-File into a Relational Database
ex database normalization 3
Ex: Database Normalization (3)
  • Wasted Space
  • Redundant data entry
  • What about taking a 3rd Activity?
  • Query Difficulties - trying to find all swimmers
  • Data Inconsistencies - conflicting prices
ex database normalization 4
Ex: Database Normalization (4)
  • Students table is fine
  • Elimination of two columns and an Activities Table restructuring, Simplifies the Table
  • BUT, we still have Redundant data (activity fees) and data insertion anomalies.

Problem: If student #219

transfers we lose all references

to Golf and its price.

ex database normalization 5
Ex: Database Normalization (5)
  • Modify the Design to ensure that “every non-key field is dependent on the whole key”
  • Creation of the Participants Table, corrects our problems and forms a union between 2 tables.

This is a Better Design!

the normal forms
The Normal Forms
  • A series of logical steps to take to normalize data tables
  • First Normal Form
  • Second
  • Third
  • Boyce Codd
  • There’s more, but beyond scope of this
first normal form 1nf
First Normal Form (1NF)
  • All columns (fields) must be atomic
    • Means : no repeating items in columns

Solution: make a separate table for each set of attributes with a primary key (parser, append query)

Customers

CustomerID

Name

Orders

OrderID

Item CustomerID

OrderDate

second normal form 2nf
Second Normal Form (2NF)
  • In 1NF and every non-key column is fully dependent on the (entire) primary key
    • Means : Do(es) the key field(s) imply the rest of the fields? Do we need to know both OrderID and Item to know the Customer and Date? Clue: repeating fields

Solution: Remove to a separate table (Make Table)

Orders

OrderID

CustomerID

OrderDate

OrderDetails

OrderID

Item

third normal form 3nf
Third Normal Form (3NF)
  • In 2NF and every non-key column is mutually independent
    • means : Calculations
  • Solution: Put calculations in queries and forms

OrderDetails

OrderID

Item

Quantity

Price

Put expression in text control or in query:

=Quantity * Price

slide36

Data Warehousing and Datamining

Data warehouse

  • Supports reporting and query tools
  • Stores current and historical data
  • Consolidates data for management analysis and decision making
what is a data warehouse
What is a Data Warehouse?

"A warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management's decision making process".

Bill Inmon (1990)

"A Data Warehouse is a repository of integrated information, available for queries and analysis. Data and information are extracted from heterogeneous sources as they are generated.…”

Anonymous

data mining
Data Mining
  • ON-LINE ANALYTICAL PROCESSING (OLAP):ability to manipulate, analyze large volumes of data from multiple perspectives
  • MINING: Seeking relationships that are not known in advance. A function of the software and data organization.
dw characteristics
DW Characteristics
  • Subject Oriented:Data that gives information about a particular subject instead of about a company's ongoing operations.
  • Integrated: Data that is gathered into the data warehouse from a variety of sources and merged into a coherent whole.
  • Time Variant: All data in the data warehouse is identified with a particular time period.
data acquisition
Data Acquisition
  • The process of moving company data from the source systems into the warehouse.
  • Often the most time-consuming and costly effort.
  • Performed with software products known as ETL (Extract/Transform/Load) tools.
  • Over 50 ETL tools on market.
data cleansing
Data Cleansing
  • Typically performed in conjunction with data acquisition.
  • A complicated process that validates and, if necessary, corrects the data before it is inserted.
  • AKA "data scrubbing" or "data quality assurance".