Chapters IV, XV, and XVI Database Management Systems Edited by: Dr. El-Masry
What’s the Difference Between a Database and a Spreadsheet??
Flat-File Environment Data User 1 Transactions Program 1 A,B,C User 2 Transactions Program 2 X,B,Y User 3 Transactions Program 3 L,B,M
Data Redundancy & Flat-File Problems(Advantages of DataBase Systems) • Data Storage - creates excessive storage costs of paper documents and/or magnetic form • Data Updating - any changes or additions must be performed multiple times • Currency of Information - potential problem of failing to update all affected files • Task-Data Dependency - user’s inability to obtain additional information as his or her needs change
Database Approach (Figure 4-2) Database User 1 Transactions Program 1 A, B, C, X, Y, L, M D B M S User 2 Transactions Program 2 User 3 Transactions Program 3
Elements of the Database Approach Database Administrator System Development Process System Requests Applications User Programs DBMS Transactions Data Definition Language Host Operating System U S E R S Transactions User Programs Data Manipulation Language Transactions User Programs Query Language Physical Database User Queries
4 Elements of the DB Approach • Users • Database Administrator (DBA) • DBMS—software that runs the DB • Physical database
Users • Access DB in two ways: • User programs • Queries (SQL)
DBA • Personal responsible for maintaining the DB: 1- Responsible for maintaining the data dictionary • Ex: What is meant by “customer” • Complete description of entities and their attributes in the DB 2- Responsible for the backup and recovery of the database 3- Responsible for the logical security of the database (establishing an effective password system).
Physical Database • The lowest level of the DB • The magnetic spots on the hard drive
DBMS Features • User Programs - make the presence of the DBMS transparent to the user • Direct Query - allows authorized users to access data without programming • Application Development - user created applications • Backup and Recovery - copies database • Database Usage Reporting - captures statistics on database usage (who, when, etc.) • Database Access - authorizes access to sections of the database
Internal Controls and DBMS • The purpose of the DBMS is to provide controlled accessto the database. • The DBMS is a special software system programmed to know which data elements each user is authorized to access and deny unauthorized requests of data.
Data Definition Language (DDL) • DDL is a programming language used to define the database to the DBMS (Pg. 111). • The DDL identifies the names and the relationship of all data elements, records, and files that constitute the database. • Viewing Levels (abstract): • internal view - physical arrangement of records (1) • conceptual view - representation of database (1) (ERD is conceptual view) (schema) • user view - the portion of the database each user views (many) (subschema) • Refer to the SQL DDL handout.
Data Manipulation Language (DML) • DML is the proprietary programming language that a particular DBMS uses to insert, update, process, store, and delete data (Pg. 111). • Entire user programs may be written in the DML, or selected DML commands can be inserted into universal programs, such as COBOL and FORTRAN. • Refer to the SQL DML handout.
Data Query Language (DQL) • The query capability permits end users and professional programmers to access data in the database without the need for conventional programs (Pg. 111). • IBM’s Structured Query Language (SQL) is a fourth-generation language that has emerged as the standard query language. • Refer to the SQL DQL handout.
Steps in Designing a Database • Phase I: Prepare the conceptual model (ERD) (Pgs. 554-556) • Identify the entities • Identify the relationships between the entities • Prepare the ER diagram (ERD) • Phase II: Specify the logical design (relational schema) • Form the tables and relationships (associations) between tables • Show the links between the primary keys and foreign keys. • Phase III: Normalize the tables • Final tables in third normal form should be free of anomalies. • Phase IV: Implement the physical design • Develop the physical structures • Populate the tables • Query the tables • Access Lab. in weeks 8 and 9.
Phase 1 Draw an ERD to capture the process. Prepare the Conceptual Model
ER-Diagram Symbols Relationship Attribute Entity Primary Key
An Entity ...is an • individual object • concept • event ...may be a specific • tangible object • intangible object • Entity Class is a collection of entities with similar attributes.
Attributes A property of an entity that we choose to record (of interest to an organization). CUSTOMER (entity) PRODUCT (entity) customer # product # name description address finish telephone no. price balance qty. on hand
Cardinalities (Maximum cardinalities method) (Pg. 566) Entity Relationship Entity 1 1 Sales- person Car Assigned 1 M Places Order Customer M M Vendor Inventory Supplies
Example of a Relationship LinkingTwo Entities M 1 ORDER CUSTOMER places Order Number Item # Name Number
ERD Exercises • Hospital ERD Exercise (handout) • University ERD Exercise (handout)
Relational Model Data Linkages (>1 table) • No explicit pointers are present. The data are viewed as a collection of independent tables. • Relations are formed by an attribute that is common to both tables in the relation. • Assignment of foreign keys: • if 1 to many association, the primary key on the one side is embedded as the foreign key on the many side. • if many to many association, create a new relation with the primary keys of the two entities as its primary key • if 1 to 1 association, either of the tables’ primary keys may be the foreign key and is embedded in the other table • Remark: Mandatory versus optional entities distinction: If there is a mandatory/optional relationship, then primary on the mandatory side becomes foreign on the optional side.
Line items Party to Pays for Made to Received from Increases Received by ER-Diagram using REA Model R E A M M M Sales 1 Inventory Sales person M M 1 Customer 1 M M Cash Collections 1 M Cash Cashier M 1 34
Phase 2 Create a list with all the relational tables and connect them. Specify the Logical Design (Relational schema)
Logical Data Structures • A particular method used to organize records in a database is called the database’s structure. • The objective is to develop this structure efficiently so that system developers can build the database in less time. Also, data can be accessed quickly and easily using the schema. • Types of structures include for example: • hierarchical (AKA the tree structure) • Relational (the one used in this class) • Important Rule: In a relational schema, arrows linking the primary key and the foreign key will always point from foreign key to primary key. • Example: Pg. 597 • Relational Schema Exercises: • Hospital schema exercise (handout) • University schema exercise (handout)
The Relational Model • The relational model portrays data in the form of two dimensional tables: • relation - the database table • attributes (data elements) - form columns • tuples (records) - form rows • data - the intersection of rows and columns
Properly Designed Relational Tables • No repeating values - All occurrences at the intersection of a row and column are a single value. • The attribute values in any column must all be of the same class. • Each column in a given table must be uniquely named. • Each row in the table must be unique in at least one attribute, which is the primary key.
Phase 3 Normalize the tables Transform unnormalized tables into tables in 3rd normal form .
Why Do We Normalize?(Three Types of Anomalies Pgs. 114-116) • Insertion Anomaly: A new item cannot be added to the table until at least one entity uses a particular attribute item. • Deletion Anomaly: If an attribute item used by only one entity is deleted, all information about that attribute item is lost. • Update Anomaly: A modification on an attribute must be made in each of the rows in which the attribute appears. • Anomalies can be corrected by creating relational tables.
Advantages of Relational Tables • Removes all three anomalies • Various items of interest (customers, inventory, sales) are stored in separate tables. • Space is used efficiently. • Very flexible. Users can form ad hoc relationships.
The Normalization Process • A process which systematically splits unnormalized complex tables into smaller tables that meet two conditions: • all nonkey (secondary) attributes in the table are dependent on the primary key • all nonkey attributes are independent of the other nonkey attributes • When unnormalized tables are split and reduced to third normal form, they must then be linked together by foreign keys. • Refer to handout on normalization • Normalization handout exercises
Steps in Normalization Table with repeating groups Remove repeating groups First normal form 1NF Remove partial dependencies Second normal form 2NF Remove transitive dependencies Third normal form 3NF Remove remaining anomalies Higher normal forms
Auditors/Accountants and Data Normalization • The update anomaly can generate conflicting and obsolete database values. • The insertion anomaly can result in unrecorded transactions and incomplete audit trails. • The deletion anomaly can cause the loss of accounting records and the destruction of audit trails. • Auditors and accountants should have an understanding of the data normalization process and be able to determine whether a database is properly normalized.
Phase 4 Implement in Access (or other DB program) Implement the Physical Design
Physical Database Design • Transition from theoretical to physical aspects of database • IS IT • Develop the physical structures • Decisions about software and hardware • Implementation • Establish internal controls • Populate the database with data • Produce physical user views (multiple) • Query the database • Access Lab. in weeks 8 and 9.