Organizing Data and Information - PowerPoint PPT Presentation

Organizing data and information l.jpg
1 / 62

Organizing Data and Information. Chapter 5. Topics in Ch. 5. General data management concepts and terms Two major approaches to data management Three database models The common functions performed by database management systems Three popular end-user database management systems

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.

Download Presentation

Organizing Data and Information

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript

Organizing data and information l.jpg

Organizing Data and Information

Chapter 5

Topics in ch 5 l.jpg

Topics in Ch. 5

  • General data management concepts and terms

  • Two major approaches to data management

  • Three database models

  • The common functions performed by database management systems

  • Three popular end-user database management systems

  • Recent database developments.

Data and information l.jpg

Data and Information

What is data?

  • A collection of raw unprocessed facts, figures, and symbols

    What is information?

  • Data that is organized, meaningful, and useful

    How are data and information related?

  • Computers process data into information

Slide4 l.jpg

What is a database?

Includes a collection of data organized so you can access, retrieve, and use the data

Database software allows you to

Create a computerized database

Add, change, and delete data

Sort and retrieve data from the database

Create forms and reports for the data in the database

Database software also called a database management system (DBMS)

Hierarchy of data l.jpg

Hierarchy of data

  • Bit

    • A single binary digit with either a one or zero state

  • Character

    • A byte, which is the basic building block of information

  • Field

    • A combination of one or more characters

    • The smallest unit of data a user accesses

    • A field name uniquely identifies each field

  • Record

    • A collection of related fields

  • Files or Tables

    • A collection of related records

  • Database

    • A collection of related files or tables

The hierarchy of data l.jpg

The Hierarchy of Data

Data entities attributes and keys l.jpg

Data Entities, Attributes, and Keys

Data entities attributes and keys8 l.jpg

Data entities, Attributes, and Keys

  • Entity

    • Generalized class of people, places, or things for which data is collected, stored, and maintained

    • E.g. at a university, students, library books, and courses; the entity in figure 5.2 is employee.

  • Attributes

    • A characteristic of an entity

    • E.g., employee number, last name, first name, hire date, and department number in figure 5.2

  • Key

    • A field or set of fields that identifies a record

      • A primary key: a field or a set of fields the uniquely defines a record: e.g. SSN

      • A secondary key: an alternative key that can be used to access records: e.g., last name

The traditional versus the database approach to data management l.jpg

The Traditional versus the Database Approach to Data Management

Traditional approach file management approach l.jpg

What is a file management approach?

Each department or area within an organization has its own set of files

Records in one file often do not relate to the records in any other file

Traditional approach (File management approach)

The traditional approach l.jpg

The Traditional Approach

Weakness of the traditional approach l.jpg

Weakness of the “Traditional” Approach

  • Data redundancy

    • Wastes resources such as storage space and people’s time

    • The files often store the same fields in multiple files

    • Requires a larger storage capacity

    • Compromises data integrity

      • Data integrity

        • The degree to which data is correct

        • When a database contains errors, it loses integrity

        • Very important because computers and people use information to make decisions and take actions

  • Program-data dependence

    • Programs and data developed and organized for one application are incompatible with programs and data organized differently for another application

  • Isolated data

    • It is often difficult to access data stored in many files across several departments

Database approach l.jpg

Database Approach

What are the strengths of the database approach l.jpg

Reduced data redundancy

Improved data integrity

Shared data

Reduced development time

Easier modification and updating

so on ( Table 5.1 p. 175)

A high cost involved with acquiring and implementing a database

Increased vulnerability

What are the strengths of the database approach?

What are the weakness of the database approach?

Slide15 l.jpg

How do a database application and a file processing application differ in the way they might store data?

File processing example

Database example

Data modeling database models l.jpg

Data Modeling & Database Models

Data design issues l.jpg

Data Design Issues

  • Content: What data should be collected?

  • Access: What data should be given to what users?

  • Logical structure: How will the data be organized to make sense to a particular user?

  • Physical organization: Where will the data actually be located?

Data modeling l.jpg

Data Modeling

  • Logical design

    • How data are grouped together and how that are related to one another

  • Physical design

    • Combining or splitting some of the groups identified in the logical design

  • Planned data redundancy

  • Data model

    • A diagram used by a database designer to show the logical relationships among the entities in the database

    • Entity-relationship (ER) diagrams

Entity relationship er diagrams l.jpg

Entity-Relationship (ER) Diagrams

  • Fig 5.5

Correspondence with er model l.jpg

Correspondence with ER Model

  • Relations (tables) correspond with entity types and with many-to-many relationship types

  • Rows correspond with entity instances and with many-to-many relationship instances

  • Columns correspond with attributes

  • NOTE: The word relation (in relational database) is NOT the same same the word relationship (in ER model)

Key fields l.jpg

Key Fields

  • Keys are special fields that serve two main purposes:

    • Primary keys are unique identifiers of the relation in question. Examples include employee numbers, social security numbers, etc. This is how we can guarantee that all rows are unique

    • Foreign keys are identifiers that enable a dependent relation (on the many side of a relationship) to refer to its parent relation (on the one side of the relationship)

Basic er symbols l.jpg

Basic ER Symbols

Entity symbol

Attribute symbol

Relationship symbol

Slide23 l.jpg

(a) CUSTOMER entity type with simple attributes

(b) CUSTOMER relation

Slide24 l.jpg

(a) CUSTOMER entity type with composite attribute

(b) CUSTOMER relation with address detail

Slide25 l.jpg

Mapping the relationship

Foreign key

Database models l.jpg

Database Models

  • Every database and DBMS is based on a specific data model

  • Consists of rules and standards that define how the database organizes data

  • Defines how users view the organization of the data

  • Three types of database models

    • Hierarchical models

      • Data is organized in a top-down or inverted tree structure

    • Network models

      • An extension of the hierarchical model in which a member may have many owners

    • Relational models

      • Stores data in tables that consist of rows and columns

      • Each row has a primary key and each column has a unique name

Operations of a relational database l.jpg

Operations of a relational database

  • Selection operation

    • The operation extracts data from a row (record)

  • Projection operation

  • Join operation

Slide31 l.jpg

What is an example of a selection operation?

selection operation results

Database management system l.jpg

Database Management System

  • A DBMS is a data storage and retrieval system which permits data to be stored non-redundantly while making it appear to the user as if the data is well-integrated

  • Software that allows you to create, access, and manage a database

  • Available for many sizes and types of computers.

Database management system33 l.jpg











shared data

Database Management System

DBMS manages data resources like an operating system manages hardware resources

Database management systems dbmss l.jpg

Provide a user view

Schema: the logical description of an entire database

Subschema: a file that contains a description of part of the database and identifies which users can view or modify the data items in that part of the database

Provide tools to create & modify the database

Data Definition Language (DDL)

Data dictionary: a detailed description of all the data used in the database

Store & retrieve data

Manipulate data and produce reports

SQL (Structured Query Language)

Data manipulation language (DML)

Database Management Systems (DBMSs)

Schema l.jpg


  • A description of the entire database

  • Subschema

    • Shows only some of the records and their relationships in the database

  • Data Definition Language (DDL)

    • Command used to create a database

    • Commands that define a database, including creating, altering, and dropping tables and establishing constraints

    • Schemas and subschemas are described using a DDL.

Fig 5 11 l.jpg

Fig 5.11

Fig 5 12 l.jpg

Fig 5.12

Slide38 l.jpg

Data Dictionary

System tables that store metadata

Contains data about each file in the database and each field within those files

Sometimes called metadata

Users usually can view some of these tables

Users are restricted from updating them

Slide39 l.jpg

What is a default value for a field?

A value that the DBMS initially displays in a field

Data dictionary allows you to specify a default value for a field

Data dictionary benefits l.jpg

Data Dictionary Benefits

  • Reduced data redundancy

  • Faster program development

  • Easier data & information modification

  • Increased data reliability

Slide41 l.jpg

What is a query?

A request for specific data from the database

Four commonly used methods to access data

Query language

Query by example


Report generator

Slide42 l.jpg

What is Structured Query Language (SQL)?

The standard data manipulation language for relational databases

A query language that allows you to manage, update, and retrieve data

Has special keywords and rules that you include in SQL statements (Data manipulation language)

projection operation

join operation

selection operation

Slide43 l.jpg

Query by Example (QBE)

query results

wizard for querying the database

  • The program retrieves records that match criteria you enter in the form fields

  • Uses a graphical user interface

  • Available in MS Access

  • MS Access translates QBE to SQL and vice versa

Slide44 l.jpg

What is a form?

Sometimes called a data entry form

A window on the screen that provides areas for entering or changing data in a database

Concurrency control l.jpg

Concurrency Control

  • Problem – in a multi-user environment, simultaneous access to data can result in interference and data loss

  • Solution – Concurrency Control

    • A method of dealing with situation in which two or more people need to access the same record in a database at the same time.

    • Serializability –

      • Finish one transaction before starting another

    • Locking Mechanisms

      • The most common way of achieving serialization

      • Data that is retrieved for the purpose of updating is locked for the updater

      • No other user can perform update until unlocked

Popular dbms l.jpg

Popular DBMS

  • Popular mainframe-based DBMSs

    • Oracle

    • IBM DB/2

    • Microsoft SQL Server

  • Popular end-user DBMSs

    • MS Access

    • Lotus Approach

    • Inprise’s dBASE

Selecting a dbms l.jpg

Selecting a DBMS

  • Database size

  • Number of concurrent users

  • Performance

  • Integration

  • Features

  • Vendor

  • Cost

Database developments l.jpg

Database Developments

  • Distributed database

  • Data warehouses

    • Data marts

    • Data mining

  • Online analytical processing (OLAP)

Distributed database l.jpg

Distributed Database

  • Distributed Database: A single logical database that is spread physically across computers in multiple locations that are connected by a data communications link

    • Replicated database: a database that holds a duplicate set of frequently used data

  • Decentralized Database: A collection of independent databases on non-networked computers

They are NOT the same thing!

Reasons for distributed database l.jpg

Reasons forDistributed Database

  • Business unit autonomy and distribution

  • Data sharing

  • Data communication costs

  • Multiple application vendors

  • Database recovery

  • Transaction and analytic processing

Advantages of distributed database over centralized databases l.jpg

Advantages ofDistributed Database over Centralized Databases

  • Increased reliability/availability

  • Local control over data

  • Modular growth

  • Lower communication costs

  • Faster response for certain queries

Disadvantages of distributed database compared to centralized databases l.jpg

Disadvantages ofDistributed Database compared to Centralized databases

  • Software cost and complexity

  • Processing overhead

  • Data integrity exposure

  • Slower response for certain queries

Slide53 l.jpg

  • Data Warehouse:

    • A subject-oriented, integrated, time-variant, non-updatable collection of data used in support of management decision-making processes

    • Subject-oriented: e.g. customers, patients, students, products

    • Integrated: Consistent naming conventions, formats, encoding structures; from multiple data sources

    • Time-variant: Can study trends and changes

    • Nonupdatable: Read-only, periodically refreshed

  • Data Mart:

    • A data warehouse that is limited in scope, a single aspect of a company’s business, e.g. finance, inventory, or personnel

Slide54 l.jpg

An enterprise data warehouse

Need for data warehousing l.jpg

Comparison of operational and informational systems

Need for Data Warehousing

  • Integrated, company-wide view of high-quality information (from disparate databases)

  • Separation of operational and informational systems and data (for improved performance)

Slide56 l.jpg

Data Warehouse vs. Data Mart

D ata warehouses l.jpg

Data Warehouses

  • Data mining: an information analysis tool that involves the automated discovery of patterns and relationships among data

    • E.g.: identifies common characteristics of customers who buy the same products from your company

    • Predicts which customers are likely to leave your company and go to a competitor

    • Reveals the difference between a typical customer this month versus last month

On line analytical processing olap l.jpg

On-Line Analytical Processing (OLAP)

  • Used to store and deliver data warehouse information

  • The use of a set of graphical tools that provides users with multidimensional views of their data and allows them to analyze the data using simple windowing techniques

  • Relational OLAP (ROLAP)

    • Traditional relational representation

  • Multidimensional OLAP (MOLAP)

    • Cube structure

  • OLAP Operations

    • Cube slicing – come up with 2-D view of data

    • Drill-down – going from summary to more detailed views

Slide59 l.jpg

Figure 11-22: Slicing a data cube

Slide60 l.jpg

Summary report

Figure 11-23:

Example of drill-down

Drill-down with color added

Open database connectivity odbc l.jpg

Open Database Connectivity (ODBC)

  • Standards that make it easier to transfer and access data among different databases.

  • Advantages of ODBC

    • ODBC can be used to export, import, or link tables between different applications

  • For example, a table in an Access database can be exported to a Paradox database or a spreadsheet. Table and data can also be imported using ODBC.

  • Linking allows an application, such as a database, to use an object in another application, such as a spreadsheet, without actually importing the object into the application. An advantage of linking is that as the object is updated in its original location, the changes will be reflected in the document to which it is linked.

Object relational database management systems l.jpg

Object-Relational Database Management Systems

  • A DBMS capable of manipulating audio, video, and graphical data

  • Hypertext

  • Hypermedia

  • Spatial data technology

    • Uses an object-relational database to store & access data via location. For example, NASA’s database of satellite photos of the Earth could be studied & manipulated to identify sources of pollutants.

  • Login