Chapter 4 Logical Database Design and the Relational Model

1 / 123

# Chapter 4 Logical Database Design and the Relational Model - PowerPoint PPT Presentation

Chapter 4 Logical Database Design and the Relational Model. Jason C. H. Chen, Ph.D. Professor of MIS School of Business Administration Gonzaga University Spokane, WA 99258 [email protected] Objectives. Define terms List five properties of relations

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.

## PowerPoint Slideshow about ' Chapter 4 Logical Database Design and the Relational Model' - sheila

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript

### Chapter 4Logical Database Design andthe Relational Model

Jason C. H. Chen, Ph.D.

Professor of MIS

Gonzaga University

Spokane, WA 99258

[email protected]

Objectives
• Define terms
• List five properties of relations
• State two properties of candidate keys
• Define first, second, and third normal form
• Describe problems from merging relations
• Transform E-R and EER diagrams to relations
• Create tables with entity and relational integrity constraints
• Use normalization to convert anomalous tables to well-structured relations

Problem Solving for Modeling a Database Project

User interview &

Integrated Model

ER or EER or OO

?????

?????

?????

Normalization

(3NF)

IMPLEMENTATION

Study and Analyze

w/Team

Relation
• Definition: A relation is a named, two-dimensional table of data
• Table is made up of rows (records), and columns (attribute or field)
• Not all tables qualify as relations
• Requirements:
• Every relation has a unique name.
• Every attribute value is atomic (not multivalued, not composite)
• Every row is unique (can’t have two rows with exactly the same values for all their fields)
• Attributes (columns) in tables have unique names
• The order of the columns is irrelevant
• The order of the rows is irrelevant

NOTE: all relations are in 1st Normal form

Correspondence with ER Model
• Relations (tables) correspond with entity types and with many-to-many relationship types
• Rows correspond with entity instances and with many-to-many relationship instances
• Columns correspond with attributes
• NOTE: The word relation (in relational database) is NOT the same as the word relationship (in ER model)
Key Fields
• Keys are special fields that serve two main purposes:
• Primary keys are unique identifiers of the relation in question. Examples include employee numbers, social security numbers, etc. This is how we can guarantee that all rows are unique
• Foreign keys are identifiers that enable a dependent relation (on the many side of a relationship) to refer to its parent relation (on the one side of the relationship)
• Keys can be simple (a single field) or composite (more than one field)
• Surrogate key
• Keys usually are used as indexes to speed up the response to user queries (More on this in Ch. 5)

Primary Key

Foreign Key (implements 1:N relationship between customer and order)

Combined, these are a composite primary key (uniquely identifies the order line)…individually they are foreign keys (implement M:N relationship between order and product)

Figure 4-3 Schema for four relations (Pine Valley Furniture Company)

Fig. 4-3: Schema for four relations (Pine Valley Furniture)

Graphical and Text Representations

CUSTOMER(Customer_ID,

City,State,Zip)

ORDER(Order_ID,

Order_Date,Customer_ID)

ORDER_LINE(Order_ID,

Product_ID,Quantity)

PRODUCT(Product_ID,

Product_Description,

Product_Finish,Unit_Price,

On_Hand)

EmpID Name DeptName Salary

100 Margaret Simpson Marketing 48,000

140 Allen Beeton Accounting 52,000

110 Chris Lucero Info. System 43,000

190 Lorenzo Davis Finance 55,000

150 Susan Martin Marketing 42,000

Fig. 4-1: EMPLOYEE1 Relation with sample data

EMPLOYEE1

EmpID Name DeptName Salary Course Date

Title Completed

100 Margaret Simpson Marketing 48,000 SPSS 6/19/200X

Surveys 10/7/200X

140 Allen Beeton Accounting 52,000 Tax Acc 12/8/200X

110 Chris Lucero Info. System 43,000 SPSS 1/12/200X

C++ 4/22/200X

190 Lorenzo Davis Finance 55,000

150 Susan Martin Marketing 42,000 SPSS 6/16/200X

Java 8/12/200X

Fig. 4-2: Eliminating multi-valued attributes

(a) Table with repeating groups or multi-valued attributes (Un-Normalized)

EMPLOYEE

EmpID Name DeptName Salary Course Date

Title Completed

100 Margaret Simpson Marketing 48,000 SPSS 6/19/200X

Surveys 10/7/200X

140 Allen Beeton Accounting 52,000 Tax Acc 12/8/200X

110 Chris Lucero Info. System 43,000 SPSS 1/12/200X

C++ 4/22/200X

190 Lorenzo Davis Finance 55,000

150 Susan Martin Marketing 42,000 SPSS 6/16/200X

Java 8/12/200X

EmpID Name DeptName Salary Course Date

Title Completed

100 Margaret Simpson Marketing 48,000 SPSS 6/19/200X

100 Margaret Simpson Marketing 48,000 Surveys 10/7/200X

140 Allen Beeton Accounting 52,000 Tax Acc 12/8/200X

110 Chris Lucero Info. System 43,000 SPSS 1/12/200X

110 Chris Lucero Info. System 43,000 C++ 4/22/200X

190 Lorenzo Davis Finance 55,000

150 Susan Martin Marketing 42,000 SPSS 6/16/200X

150 Susan Martin Marketing 42,000 Java 8/12/200X

Figure 4-2 (a) Table with repeating groups – how to “remove” them (and solve the problem)

Figure 4-2 (b) EMPLOYEE2 relation

EmpID Name DeptName Salary Course Date

Title Completed

100 Margaret Simpson Marketing 48,000 SPSS 6/19/200X

100 Margaret Simpson Marketing 48,000 Surveys 10/7/200X

140 Allen Beeton Accounting 52,000 Tax Acc 12/8/200X

110 Chris Lucero Info. System 43,000 SPSS 1/12/200X

110 Chris Lucero Info. System 43,000 C++ 4/22/200X

190 Lorenzo Davis Finance 55,000

150 Susan Martin Marketing 42,000 SPSS 6/16/200X

150 Susan Martin Marketing 42,000 Java 8/12/200X

Fig. 4-2: Eliminating multi-valued attributes

(b) EMPLOYEE2 Relation (Normalized)

EMPLOYEE2

Constraints
• Domain Constraints
• Allowable values for an attribute.
• A domain definition contains: domain name, data type, size, meaning, and allowable values/range (if applicable).
• Entity Integrity
• No primary key attribute may be null.
• Referential Integrity
• A relationship between primary key and foreign key.
• Operational Constraints
• Business rules (see Chapter 3)
Integrity Constraints
• Referential Integrity – rule that states that any foreign key value (on the relation of the many side) MUST match a primary key value in the relation of the one side. (Or the foreign key can be null)
• For example: Delete Rules
• Restrict – don’t allow delete of “parent” side if related rows exist in “dependent” side
• Cascade – automatically delete “dependent” side rows that correspond with the “parent” side row to be deleted

e.g., DROP TABLE tablename CASCADE CONSTRAINTS; (p.110 of Oracle 11g)

• Set-to-Null – set the foreign key in the dependent side to null if deleting from the parent side  not allowed for weak entities

pk

fk

pk

Referential integrity constraints are drawn via arrows from dependent to parent table

cpk/pk

fk

fk

pk

Referential integrity constraints are implemented with foreign key to primary key references

Figure 4-6 SQL table definitions

Well-Structured Relations
• A well-structured relation contains minimal redundancy and allows users to insert, modify, and delete the rows in a table without errors or inconsistencies.
• The following anomalies should be removed for a well-structured relation:
• Insertion Anomaly
• Deletion Anomaly
• Modification Anomaly

EmpID Name DeptName Salary Course Date

Title Completed

100 Margaret Simpson Marketing 48,000 SPSS 6/19/200X

100 Margaret Simpson Marketing 48,000 Surveys 10/7/200X

140 Allen Beeton Accounting 52,000 Tax Acc 12/8/200X

110 Chris Lucero Info. System 43,000 SPSS 1/12/200X

110 Chris Lucero Info. System 43,000 C++ 4/22/200X

190 Lorenzo Davis Finance 55,000

150 Susan Martin Marketing 42,000 SPSS 6/16/200X

150 Susan Martin Marketing 42,000 Java 8/12/200X

Is EMPLOYEE2 a Well-Structured relation?

EMPLOYEE2

EMPLOYEE2

EmpID Name DeptName Salary Course Date

Title Completed

100 Margaret Simpson Marketing 48,000 SPSS 6/19/200X

100 Margaret Simpson Marketing 48,000 Surveys 10/7/200X

140 Allen Beeton Accounting 52,000 Tax Acc 12/8/200X

110 Chris Lucero Info. System 43,000 SPSS 1/12/200X

110 Chris Lucero Info. System 43,000 C++ 4/22/200X

190 Lorenzo Davis Finance 55,000

150 Susan Martin Marketing 42,000 SPSS 6/16/200X

150 Susan Martin Marketing 42,000 Java 8/12/200X

Insertion Anomaly: Inserting a new row, the user must supply values for both EmpID (PK) and CourseTitle (CK and FK). This is an (insertion) anomaly, since the user should be able to enter employee data without knowing (supplying) course (title) data.

EMPLOYEE2

EmpID Name DeptName Salary Course Date

Title Completed

100 Margaret Simpson Marketing 48,000 SPSS 6/19/200X

100 Margaret Simpson Marketing 48,000 Surveys 10/7/200X

140 Allen Beeton Accounting 52,000 Tax Acc 12/8/200X

110 Chris Lucero Info. System 43,000 SPSS 1/12/200X

110 Chris Lucero Info. System 43,000 C++ 4/22/200X

190 Lorenzo Davis Finance 55,000

150 Susan Martin Marketing 42,000 SPSS 6/16/200X

150 Susan Martin Marketing 42,000 Java 8/12/200X

Deletion Anomaly: Deleting the employee number 140, it results in losing not only the employee’s information but also the course had an offering that completed on that date.

EMPLOYEE2

EmpID Name DeptName Salary Course Date

Title Completed

100 Margaret Simpson Marketing 48,000 SPSS 6/19/200X

100 Margaret Simpson Marketing 48,000 Surveys 10/7/200X

140 Allen Beeton Accounting 52,000 Tax Acc 12/8/200X

110 Chris Lucero Info. System 43,000 SPSS 1/12/200X

110 Chris Lucero Info. System 43,000 C++ 4/22/200X

190 Lorenzo Davis Finance 55,000

150 Susan Martin Marketing 42,000 SPSS 6/16/200X

150 Susan Martin Marketing 42,000 Java 8/12/200X

Modification Anomaly: If the employee number 100 gets a salary increase, we must record the increase in each of the rows for that employee (two occurences); otherwise the data will be inconsistent.

EMPLOYEE2

EmpID Name DeptName Salary Course Date

Title Completed

100 Margaret Simpson Marketing 48,000 SPSS 6/19/200X

100 Margaret Simpson Marketing 48,000 Surveys 10/7/200X

140 Allen Beeton Accounting 52,000 Tax Acc 12/8/200X

110 Chris Lucero Info. System 43,000 SPSS 1/12/200X

110 Chris Lucero Info. System 43,000 C++ 4/22/200X

190 Lorenzo Davis Finance 55,000

150 Susan Martin Marketing 42,000 SPSS 6/16/200X

150 Susan Martin Marketing 42,000 Java 8/12/200X

EmpID Name DeptName Salary Course Date

Title Completed

100 Margaret Simpson Marketing 48,000 SPSS 6/19/200X

100 Margaret Simpson Marketing 48,000 Surveys 10/7/200X

140 Allen Beeton Accounting 52,000 Tax Acc 12/8/200X

110 Chris Lucero Info. System 43,000 SPSS 1/12/200X

110 Chris Lucero Info. System 43,000 C++ 4/22/200X

190 Lorenzo Davis Finance 55,000

150 Susan Martin Marketing 42,000 SPSS 6/16/200X

150 Susan Martin Marketing 42,000 Java 8/12/200X

EMPLOYEE1

EMP_COURSE

EmpID Name DeptName Salary

100 Margaret Simpson Marketing 48,000

140 Allen Beet Accounting 52,000

110 Chris Lucero Info. System 43,000

190 Lorenzo Davis Finance 55,000

150 Sususan Martin Marketing 42,000

EmpIDCourse Date

Title Completed

100 SPSS 6/19/200X

100 Surveys 10/7/200X

140 Tax Acc 12/8/200X

110 SPSS 1/12/200X

110 C++ 4/22/200X

150 SPSS 6/19/200X

150 Java 8/12/200X

??

Fig. 4-7: EMP_COURSE: Normalized Relations from EMPLOYEE2

EMPLOYEE2

Is there any ano-maly?

User interview &

Integrated Model

ER or EER or OO

Relations Transformation

(Seven Cases)

Transformation to Relations

?????

Normalization

(3NF)

IMPLEMENTATION

Problem Solving for Modeling a Database Project

Study and Analyze

w/Team

Seven Cases of Transforming EE-R Diagrams into Relations

1. Map Regular Entities

2. Map Weak Entities

3. Map Binary Relationships

4. Map Associative Entities

5. Map Unary Relationships

6. Map Ternary (and n-ary) Relationships

7. Map Supertype/Subtype Relationships

Transforming EE-R Diagrams into Relations

1. Map Regular Entities to Relations

• E-R attributes map directly onto the relation (Fig. 4-8)
• Composite attributes: Use only their simple, component attributes (Fig. 4-9).
• Multi-valued Attribute : Becomes a separate relation with a foreign key taken from the superior entity (Fig. 4-10).

Fig. 4-8: Mapping the regular entity CUSTOMER

(a) CUSTOMER entity type

(b) CUSTOMER relation

[Same name on relation and entity type]

Figure 4-9 Mapping a composite attribute

(a) CUSTOMER entity type with composite attribute

(b) CUSTOMER relation with address detail

[Use only their simple, component attributes]

EMPLOYEE

EmployeeName

EmployeeID

EmployeeID

Skill

Fig. 4-10: Mapping an entity with a multivalued attribute

(a) Employee entity type with multivalued attribute

[Two relations created with one containing all of the attributes except the multi-valued attribute, and the second one contains the pk (on the first one) and the multi-valued attribute]

Multivalued attribute becomes a separate relation with foreign key

(b) Mapping a multivalued attribute

One–to–many relationship between original entity and new relation

Break ! (Ch. 4 - Part I)

In class exercise

#1-I (a) (p.193), apply Figure 2-8.

HW

# 1-I (c), apply

Figure 2-11.a

Fig. 3-8:

Figure 2-8

User interview &

Integrated Model

ER or EER or OO

Relations Transformation

(Seven Cases)

Transformation to Relations

?????

Normalization

(3NF)

IMPLEMENTATION

Problem Solving for Modeling a Database Project

Study and Analyze

w/Team

Seven Cases of Transforming EE-R Diagrams into Relations

1. Map Regular Entities

2. Map Weak Entities

3. Map Binary Relationships

4. Map Associative Entities

5. Map Unary Relationships

6. Map Ternary (and n-ary) Relationships

7. Map Supertype/Subtype Relationships

Transforming EER Diagrams into Relations

2. Mapping Weak Entities

• Becomes a separate relation with a foreign key taken from the superior entity
• Primary key composed of:
• Partial identifier of weak entity
• Primary key of identifying relation (strong entity) (Fig. 4-11)

Fig. 4-11: Example of mapping a weak entity

(a) Weak entity DEPENDENT

Fig. 4-11: (b) Relations resulting from weak entity

[Becomes a separate relation with a foreign key taken from the superior entity]

PK

NOTE: the domain constraint for the foreign key should NOT allow null value if DEPENDENT is a weak entity

CPK

FK

Question: Why need all FOUR attributes to be a CPK?

Transforming EE-R Diagrams Into Relations

3. Map Binary Relationships

• One-to-Many - Primary key on the one side becomes a foreign key on the many side (Fig. 4-12).
• Many-to-Many - Create a new relation with the primary keys of the two entities as its primary key (Fig. 4-13).
• One-to-One - Primary key on the mandatory side becomes a foreign key on the optional side (Fig. 4-14).

Fig. 4-12: Example of mapping a 1:M relationship

(a) Relationship between customers and orders

Note the mandatory one

[Primary key on the one side becomes a foreign key on the many side]

Fig. 4-12: (b) Mapping the relationship

Again, no null value in the foreign key…this is because of the mandatory minimum cardinality

Foreign key

Figure 4-13 Example of mapping an M:N relationship

a) Completes relationship (M:N)

The Completes relationship will need to become a separate relation

Composite primary key (cpk)

Figure 4-13 Example of mapping an M:N relationship (cont.)

b) Three resulting relations

New intersection relation

Foreign key

Foreign key

mandatory

optional

Figure 4-14 Example of mapping a binary 1:1 relationship

a) In_charge relationship (1:1)

Often in 1:1 relationships, one direction is optional.

Fig. 4-14: (b) Resulting relations

mandatory

Same domain as Nurse_ID

PK

optional

FK

[Primary key on the mandatory side becomes a foreign key on the optional side]

Transforming EE-R Diagrams Into Relations

4. Map Associative Entities

• Identifier Not Assigned
• Default primary key for the association relation is composed of the primary keys of the two entities (as in M:N relationship) (Fig. 4-15)
• Identifier Assigned
• It is natural and familiar to end-users.
• Default identifier may not be unique. (Fig. 4-16).

Composite primary key formed from the two foreign keys

Figure 4-15 Example of mapping an associative entity (cont.)

b) Three resulting relations

[Default primary key for the association relation is the primary keys of the two entities]

PK

cpk

fk

fk

PK

Figure 4-16: Mapping an associative entity

(a) Associative entity (SHIPMENT)

(b) Three resulting relations

Primary key differs from foreign keys

Transforming EE-R Diagrams Into Relations

5. Map Unary Relationships

• One-to-Many
• Recursive foreign key in the same relation (Fig. 4-17).
• A recursive FK is a FK in a relation that references the PK values of that same relation. It must have the same domain as the PK.
• Many-to-Many - Bill-of-materials: Two relations:
• One for the entity type.
• One for an associative relation in which the primary key has two attributes, both taken from the primary key of the entity. (Fig. 4-18).

EMPLOYEE

EmployeeID

EmployeeName

ManagerID

EmployeeDateOfBirth

Figure 4-17 Mapping a unary 1:N relationship

A recursive FK is a FK in a relation that references the PK values of that same relation.

It must have the same domain as the PK.

(a) EMPLOYEE entity with unary relationship

(b) EMPLOYEE relation with recursive foreign key

ITEM

ItemNo

ComponentNo

Quantity

ItemNo

ItemDescription

ItemUnitCost

COMPONENT

Figure 4-18: Mapping a unary M:N relationship

One for the entity type.

One for an associative relation in which the primary key has two attributes, both taken from the primary key of the entity.

(a) Bill-of-materials relationships (M:N)

(b) ITEM and COMPONENT relations

Transforming EE-R Diagrams Into Relations

6. Map Ternary (and n-ary) Relationships

• One relation for each entity and one for the associative entity.
• Associative entity has foreign keys to each entity in the relationship
• (Fig. 4-19).

Figure 4-19 Mapping a ternary relationship

a) PATIENT TREATMENT Ternary relationship with associative entity

A patient may receive a treatment one in the morning, then the same treatment in the afternoon.

Figure 4-19 Mapping a ternary relationship (cont.)

b) Mapping the ternary relationship PATIENT TREATMENT

Figure 4-19 Mapping a ternary relationship (cont.)

b) Mapping the ternary relationship PATIENT TREATMENT

It would be better to create a surrogate key like Treatment#

How to create it with Oracle?

But this makes a very cumbersome key…

Remember that the primary key MUST be unique

This is why treatment date and time are included in the composite primary key

Transforming EER Diagrams into Relations

7. Mapping Supertype/Subtype Relationships

• One relation for supertype and for each subtype
• Supertype attributes (including identifier and subtype discriminator) go into supertype relation
• Subtype attributes go into each subtype; primary key of supertype relation also becomes primary key of subtype relation
• 1:1 relationship established between supertype and each subtype, with supertype as primary table (Fig. 4-20).

Figure 4-21

Mapping Supertype/subtype relationships to relations

SELECT*

FROMEMPLOYEE,

SALARIED_EMPLOYEE

WHREEmployee_Number=

S_Employee_Number;

These are implemented as one-to-one relationships

Display a table that contains all the attributes for SALARIED_EMPLOYEE

Break ! (Ch. 4 - Part II)

Figure 3-6(b)

In class exercise

Transform it to relations (NOT 3NF)

#2-III-a , (p.193, apply Figure 3-6.b

HW

#2-III-d, (p.193), apply

Figure 3-10

Partial Specialization

Fig. 3-8:

User interview &

Integrated Model

ER or EER or OO

Relations Transformation

(Seven Cases)

Transformation to Relations

Normalization

Normalization

(3NF)

IMPLEMENTATION

Problem Solving for Modeling a Database Project

Study and Analyze

w/Team

Seven Cases of Transforming EE-R Diagrams into Relations

1. Map Regular Entities

2. Map Weak Entities

3. Map Binary Relationships

4. Map Associative Entities

5. Map Unary Relationships

6. Map Ternary (and n-ary) Relationships

7. Map Supertype/Subtype Relationships

Next Topic
• Next topic is the most important topic (theory) in this database management class.
• What is it?
• Normalization
Data Normalization
• The process of decomposing relations with anomalies to produce smaller, well-structured and stable relations
• Primarily a tool to validate and improve a logical design so that it satisfies certain constraints that avoid unnecessary duplication of data
Well-Structured Relations
• A relation that contains minimal data redundancy and allows users to insert, delete, and update rows without causing data inconsistencies
• Goal is to avoid (minimize) anomalies
• Insertion Anomaly – adding new rows forces user to create duplicate data
• Deletion Anomaly – deleting rows may cause a loss of data that would be needed for other future rows
• Modification Anomaly – changing data in a row forces changes to other rows because of duplication

General rule of thumb: a table should not pertain to more than one entity type

Example – Figure 4.2b

Question – Is this a relation?

Answer – Yes: unique rows and no multivalued attributes

Question – What’s the primary key?

a composite key

Anomalies in this Table
• Insertion – can’t enter a new employee without having the employee take a class
• Deletion – if we remove employee 140, we lose information about the existence of a Tax Acc class
• Modification – giving a salary increase to employee 100 forces us to update multiple records

Why do these anomalies exist?

• Because are two themes (entity types – what are they?) in this one relation (two themes, entity types, were combined). This results in duplication, and an unnecessary dependency between the entities
Functional Dependencies and Keys
• Functional Dependency: The value of one attribute (the determinant) determines the value of another attribute.
• Candidate Key
• A unique identifier. One of the candidate keys will become the primary key
• E.g. perhaps there is both credit card number and SS# in a table…in this case both are candidate keys
• Each non-key field is functionally dependent on every candidate key
Figure 4-23: Representing Functional Dependencies (cont.)

(a) Functional dependencies in EMPLOYEE1 Fig. 4-2a)

(b) Functional dependencies in EMPLOYEE2 (Fig. 4-2b)

First Normal Form
• No multivalued attributes
• Every attribute value is atomic (singled-value)
• Fig. 4-2a is not in 1st Normal Form (multivalued attributes)  it is not a relation
• Fig. 4-2b is in 1st Normal form (but not in a well-structured relation)
• All relations are in 1st Normal Form
• The following example is not from the text will be illustrated for Normalization process.

Figure 4-2 (a)

EmpID Name DeptName Salary Course Date

Title Completed

100 Margaret Simpson Marketing 48,000 SPSS 6/19/200X

Surveys 10/7/200X

140 Allen Beeton Accounting 52,000 Tax Acc 12/8/200X

110 Chris Lucero Info. System 43,000 SPSS 1/12/200X

C++ 4/22/200X

190 Lorenzo Davis Finance 55,000

150 Susan Martin Marketing 42,000 SPSS 6/16/200X

Java 8/12/200X

EmpID Name DeptName Salary Course Date

Title Completed

100 Margaret Simpson Marketing 48,000 SPSS 6/19/200X

100 Margaret Simpson Marketing 48,000 Surveys 10/7/200X

140 Allen Beeton Accounting 52,000 Tax Acc 12/8/200X

110 Chris Lucero Info. System 43,000 SPSS 1/12/200X

110 Chris Lucero Info. System 43,000 C++ 4/22/200X

190 Lorenzo Davis Finance 55,000

150 Susan Martin Marketing 42,000 SPSS 6/16/200X

150 Susan Martin Marketing 42,000 Java 8/12/200X

Figure 4-2 (b)

Second Normal Form
• 1NF and every non-key attribute is fully functionally dependent on the primary key.
• Every non-key attribute must be defined by the entire key (either a single PK or a CK), not by only part of the key.
• No partial functional dependencies.
• Fig. 4-2b is NOT in 2nd Normal Form

Remove Partial Dependencies

Remove …

Figure: 4-22 Steps in normalization

Table with Multivalued

attributes

Remove Multivalued Attributes

First normal

form (1NF)

Second normal

form(2NF)

Third normal

form (3NF)

Remove remaining anomalies resulting from multiple candidate keys

Boyce-Codd normal

form (BC-NF)

Remove Multivalued Dependencies

Fourth normal

Form (4NF)

Remove Remaining Anomalies

Fifth normal

form (5NF)

A Process of 1NF to 2NF (EMPLOYEE2 - - 1NF)

Dependency on entire primary key

EmpID

CourseTitle

Name

DeptName

Salary

DateCompleted

EmpID Name DeptName Salary Course Date

Title Completed

100 Margaret Simpson Marketing 48,000 SPSS 6/19/200X

100 Margaret Simpson Marketing 48,000 Surveys 10/7/200X

140 Allen Beeton Accounting 52,000 Tax Acc 12/8/200X

110 Chris Lucero Info. System 43,000 SPSS 1/12/200X

110 Chris Lucero Info. System 43,000 C++ 4/22/200X

190 Lorenzo Davis Finance 55,000

150 Susan Martin Marketing 42,000 SPSS 6/16/200X

150 Susan Martin Marketing 42,000 Java 8/12/200X

Dependency on only part of the key

(b) Functional Dependencies in EMPLOYEE2

EmpID

CourseTitle

Name

DeptName

Salary

DateCompleted

EmpID, CourseTitle  DateCompleted

EmpID  Name, DeptName, Salary

Functional Dependencies in EMPLOYEE2

Dependency on entire primary key

Dependency on only part of the key (partial dep.)

Therefore, NOT in 2nd Normal Form!!

EMPLOYEE1

EMP_COURSE

2NF

3NF ?

EmpID

Name

DeptName

Salary

EmpID

CourseTitle

DateCompleted

Summary on Normalization: from 1NF to 2NF

EMPLOYEE2

EmpID

CourseTitle

Name

DeptName

Salary

DateCompleted

Partial Depend.

EmpID Name DeptName Salary Course Date

Title Completed

100 Margaret Simpson Marketing 48,000 SPSS 6/19/200X

100 Margaret Simpson Marketing 48,000 Surveys 10/7/200X

140 Allen Beeton Accounting 52,000 Tax Acc 12/8/200X

110 Chris Lucero Info. System 43,000 SPSS 1/12/200X

110 Chris Lucero Info. System 43,000 C++ 4/22/200X

190 Lorenzo Davis Finance 55,000

150 Susan Martin Marketing 42,000 SPSS 6/16/200X

150 Susan Martin Marketing 42,000 Java 8/12/200X

EMPLOYEE1

EMP_COURSE

EmpIDCourse Date

Title Completed

100 SPSS 6/19/200X

100 Surveys 10/7/200X

140 Tax Acc 12/8/200X

110 SPSS 1/12/200X

110 C++ 4/22/200X

150 SPSS 6/19/200X

150 Java 8/12/200X

EmpID Name DeptName Salary

100 Margaret Simpson Marketing 48,000

140 Allen Beet Accounting 52,000

110 Chris Lucero Info. System 43,000

190 Lorenzo Davis Finance 55,000

150 Sususan Martin Marketing 42,000

Summary on Normalization

EMPLOYEE2 (1NF)

3NF

Is there any ano-maly (ies)?

Third Normal Form
• 2NF and no transitive dependencies (functional dependency between non-key attributes.)
• Other examples

Remove Partial Dependencies

Remove Transitive Dependencies

Figure: 4-22 Steps in normalization

Table with Multivalued

attributes

Remove Multivalued Attributes

First normal

form (1NF)

Second normal

form(2NF)

Third normal

form (3NF)

Remove remaining anomalies resulting from multiple candidate keys

Boyce-Codd normal

form (BC-NF)

Remove Multivalued Dependencies

Fourth normal

Form (4NF)

Remove Remaining Anomalies

Fifth normal

form (5NF)

Extra example-1: Relation with transitive dependency

(a) SALES relation with simple data

SALES

What Anomalies might be in SALES relation?

WHY? Because it is …

Not in the 3NF (why?)

(transitive dependency)

• Insertion anomaly ?
• Deletion anomaly ?
• Modification anomaly ?

SALES

N

Extra example-1: Relation with transitive dependency

CustID  Name

CustID  Salesperson

CustID  Region

All this is OK

(2nd NF)

BUT

CustID  Salesperson  Region

implies

Extra example-1:Relation with transitive dependency

CustID  Name

CustID  Salesperson

CustID  Region

and

Salesperson  Region

All this is OK

(2nd NF)

CustID  Region

Transitive dependency

(not in 3rd NF)

Extra example-1:(b) Relations in 3NF

Remove a transitive dependency

Extra example-1: Removing a transitive dependency

(a) Decomposing the SALES relation

SALES1

S_PERSON

Snum Origin Destination Distance

409 Seattle Denver 1,537

618 Chicago Dallas 1,058

723 Boston Atlanta 1,214

824 Denver Los Angeles 1,150

629 Minneapolis St. Louis 587

Extra example-2:Relation transitive dependencies

?NF

Insertion anomaly?

Deletion anomaly?

Modification anomaly?

SHIPMENT

Snum

Origin

Destination

Distance

Snum Origin Destination

409 Seattle Denver

618 Chicago Dallas

723 Boston Atlanta

824 Denver Los Angeles

629 Minneapolis St. Louis

OriginDestination Distance

Seattle Denver 1,537

Chicago Dallas 1,058

Boston Atlanta 1,214

Denver Los Angeles 1,150

Minneapolis St. Louis 587

Extra example-2:Relation transitive dependencies

?NF

EMPLOYEE1

EMP_COURSE

2NF

3NF ?

EmpID

Name

DeptName

Salary

EmpID

CourseTitle

DateCompleted

Summary on Normalization: from 1NF to 2NF

EMPLOYEE2

EmpID

CourseTitle

Name

DeptName

Salary

DateCompleted

Partial Depend.

Remove Multivalued Attributes

Remove Partial Dependencies

Remove Transitive Dependencies

Figure: 4-22 Steps in normalization

Table with Multivalued

attributes

First normal

form (1NF)

Second normal

form(2NF)

Third normal

form (3NF)

Remove remaining anomalies resulting from multiple candidate keys

Boyce-Codd normal

form (BC-NF)

Remove Multivalued Dependencies

Fourth normal

Form (4NF)

Remove Remaining Anomalies

Fifth normal

form (5NF)

User interview &

Integrated Model

Logical Model

(ERD or E/ERD)

(Seven) Relations Transformation

NORMALIZATION (up to 3NF)

Implementation

(w/Physical Model)

Steps of Database Development

User view-1

User view-2

User view-3

User view-N

Conceptual Schema (Model)

(more relations produced)

(more tables created)

END of CHAPTER 4

In class exercise

(p.193)

#3- a,b,c,d

HW (using Visio)

(p.193-194)

#7 - a,b,c,d,e,f

Bonus

#8 – a,b,c,d

In-class Quiz next class

Merging Relations(View Integration)
• In a project development process, there may be a number of separate E-R diagrams and user views created and some of them may be redundant.
• Therefore, some relations should be merged to remove the redundancy.
Merging Relations(View Integration - An example)

No_Years)

EMPLOYEE(EmployeeID, Name, Address, Phone, Jobcode, No_Years)

Merging Relations(Problems on View Integration)

Issues to watch out for when merging entities from different ER models:

• Synonyms: Different names, same meaning.
• Homonyms: Same name, different meanings.
• Transitive Dependencies:
• dependencies–even if relations are in 3NF prior to merging, they may not be after merging
• Supertype/Subtype
• May be hidden prior to merging
Problems on View Integration
• Synonyms: Different names, same meaning.

STUDENT1(StudentID, Name)

• Homonyms: Same name, different meanings.

Problems on View Integration
• Synonyms: Different names, same meaning.

STUDENT1(StudentID, Name)

• Homonyms: Same name, different meanings.

Problems on View Integration
• Transitive Dependencies

STUDENT1(StudentID, Major)

the result is ...

??NF

and after removing transitive dependency

STUDENT_Major(StudentID, Major)

Problems on View Integration
• Supertype/Subtype

PATIENT2(PatientID, Room_No)

Two subtypes are hidden prior to merging

RESIDENT_PATIENT(PatientID, RoomNo)

OUTPATIENT(PatientID, DateTreated)

Enterprise Keys
• Primary keys that are unique in the whole database, not just within a single relation
• Corresponds with the concept of an object ID in object-oriented systems

Figure 4-31 Enterprise keys

a) Relations with enterprise key

b) Sample data with enterprise key

Figure 4-31 Enterprise keys

c) Relations after adding PERSON relation

d) Sample data after adding PERSON relation

Logical Database Design

You have just learned and completed one of the most important concepts and theories, integrity constraints and normalization, for developing a quality of database.

User interviews &

Integrated Model

ER or EER or OO

Relations Transformation

(Seven Cases)

Transformation to Relations

Normalization

Normalization

(3NF)

IMPLEMENTATION

Problem Solving for Modeling a Database Project

Study and Analyze

w/Team

MVC_Hospital HW

Logical Design Phase

Draw a entity-relationship diagram (enterprise model) for Mountain View community Hospital, based on the narrative description of the case and this handout (but the entities are from the five (5) figures shown above). You should create a file and turn in with a hardcopy (called MVC_Hospital_DD.docx) contains the following materials:

1. Read and employ materials from chapters 2,3 and 4.

2. Include entities, associations (with detail cardinality), and attributes.

3. Determine and draw the order of entering data

Next phase -- implementation, create SQL script file for table structure and data base (values).

MVC_Hospital

Create two script files:

1. a script file (MVC_Hospital_Lastname_Firstname.SQL) that contains a set of commands of DROP, CREATE, and INSERT that performs the same functions as in the script file of Northwoods.sql

2. Second script file (MVC_Hospital_QUERIES_Lastname_Firstname.SQL) containing a set of SQL commands that answer the questions. Test the query one/time successfully.

Note that you may need other SQL commands and create database views (see pptx file for introducing VIEWS) for the purpose of answering questions easily. You may need to read other references related the SQL from the text book (e.g., Chapter 7 of McFadden).

3. Spool (2) and save it in the file MVC_Hospital_Spool_Lastname_Firstname.LST. Finally, you create a new file (*.docx) containing all work done from Part I and save them in the file MVC_Hospital_Complete_Lastname_Firstname.docx. The file should contain your class information and personal information.

The Normalization Example in the Text Book

Figure 4-24 INVOICE (Pine Valley Furniture Company)

Figure 4-25 INVOICE Data

Table with multivalued attributes, not in 1st normal form

Note: this is NOT a relation. WHY?

Figure 4-26 INVOICE relation (1NF)

Table with no multivalued attributes and unique rows

Note: this is relation, but not a well-structured one. WHY?

Anomalies in this Table

Insertion–if new product is ordered for order 1007 of existing customer, customer data must be re-entered, causing duplication

Deletion–if we delete the Dining Table from Order 1006, we lose information concerning this item\'s finish and price

Update–changing the price of product ID 4 requires update in several records

• Why do these anomalies exist?
• Because there are multiple themes (entity types) in one relation. This results in duplication and an unnecessary dependency between the entities

110

Figure 4-27 Functional dependency diagram for INVOICE

Order_ID  Order_Date, Customer_ID, Customer_Name, Customer_Address

Product_ID  Product_Description, Product_Finish, Unit_Price

Order_ID, Product_ID  Order_Quantity

Therefore, NOT in 2nd Normal Form

Figure 4-28 Partial Dependencies were Removed (2NF)

Getting it into Second Normal Form

Partial dependencies are removed, but there are still transitive dependencies

Figure 4-29 Transitive Dependencies were Removed (3NF)

Two Relations Remained

Getting it into Third Normal Form

Figure 4-29 Transitive Dependencies were Removed (3NF)

Two Relations Remained

Getting it into Third Normal Form

Getting it into Third Normal Form

Boyce-Codd Normal Form (BCNF)
• A table is in Boyce-Codd normal form (BCNF) if every determinant in the table is a candidate key.
• (A determinant is any attribute whose value determines other values with a row.)
• If a table contains only one candidate key, the 3NF and the BCNF are equivalent.
• BCNF is a special case of 3NF.
Other Normal Forms (from Appendix B)
• Boyce-Codd NF
• All determinants are candidate keys…there is no determinant that is not a unique identifier
• 4th NF
• No multivalued dependencies
• 5th NF
• No “lossless joins”
• Domain-key NF
• The “ultimate” NF…perfect elimination of all possible anomalies

Remove Multivalued Attributes

Remove Partial Dependencies

Remove Transitive Dependencies

(every determinant is

a candidate key)

Remove remaining anomalies resulting from multiple candidate keys

Figure: 4-22 Steps in normalization

Table with Multivalued

attributes

First normal

form (1NF)

Second normal

form(2NF)

Third normal

form(3NF)

Boyce-Codd normal

form (BC-NF)

Remove Multivalued Dependencies

Fourth normal

Form (4NF)

Remove Remaining Anomalies

Fifth normal

form (5NF)

Relational Definitions
• Tuple
• Attribute
• View
Relational Concepts
• Relational Algebra
• Relational Calculus
• Relational Operations
• SELECT
• PROJECT
• JOIN
• Equijoin - Join field appears twice.
• Natural Join - Join field appears once.
Dr. Edgar F. (Ted) Codd
• To fellow pilots and aircrew in the Royal Air Force during World War II and the dons at Oxford: These people were the source of my determination to fight for what I believed was right during the ten or more years in which government, industry, and commerce were strongly opposed to the relational approach to database management.