Loading in 5 sec....

Functional Dependency Armstrong’s Axioms Closure of FDs Closure of attribute(X) – Find KeysPowerPoint Presentation

Functional Dependency Armstrong’s Axioms Closure of FDs Closure of attribute(X) – Find Keys

- 180 Views
- Uploaded on

Download Presentation
## PowerPoint Slideshow about ' Functional Dependency Armstrong’s Axioms Closure of FDs Closure of attribute(X) – Find Keys' - callum-mcknight

**An Image/Link below is provided (as is) to download presentation**

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript

Chapter 15Functional Dependencies and Normalization for Relational DatabasesChapter 16Relational Database Design Algorithms and Further Dependencies

- Functional Dependency
- Armstrong’s Axioms
- Closure of FDs
- Closure of attribute(X) – Find Keys
- Minimal set of FDs
- Normalization
- Lossless Join Decomposition

Outline

- 1 Informal Design Guidelines for Relational Databases
- 1.1Semantics of the Relation Attributes
- 1.2 Redundant Information in Tuples and Update Anomalies
- 1.3 Null Values in Tuples
- 1.4 Spurious Tuples

- 2 Functional Dependencies (FDs)
- 2.1 Definition of FD
- 2.2 Inference Rules for FDs
- 2.3 Equivalence of Sets of FDs
- 2.4 Minimal Sets of FDs

Outline

3 Normal Forms Based on Primary Keys

3.1 Normalization of Relations

3.2 Practical Use of Normal Forms

3.3 Definitions of Keys and Attributes Participating in Keys

3.4 First Normal Form

3.5 Second Normal Form

3.6 Third Normal Form

4 General Normal Form Definitions (For Multiple Keys)

5 BCNF (Boyce-Codd Normal Form)

- What is relational database design?
- The grouping of attributes to form "good" relation schemas

- Two levels of relation schemas
- The logical "user view" level
- The storage "base relation" level

- Design is concerned mainly with base relations
- What are the criteria for "good" base relations?

1.2 Redundant Information in Tuples and Update Anomalies

- Information is stored redundantly
- Wastes storage
- Causes problems with update anomalies
- Insertion anomalies
- Deletion anomalies
- Modification anomalies

Functional Dependency:

Attribute B is functionally dependent on attribute A if at each point in time, each value of A has only one value of B associated with it.

Examples:

- SSN ENAME
- PNUMBER{PNAME, PLOCATION}
- {SSN, PNUMBER}HOURS

functional dependency

A functional dependency (X Y)

between two sets of attributes X and Y that are

subsets of R specifies a constraint on the

possible tuples that can form a relation state r

of R. The constraint is that, for any two tuples

t1 and t2 in r that have t1[X] = t2[X], we must

also have t1[Y] = t2[Y].

--X is a candidate keyof R—this implies that XYfor any subset of attributes Y of R

--If X Yin R, this does not say whether ornot Y Xin R.

Inference Rules:

F = {SSN {ENAME, BDATE, ADDRESS, DNUMBER},

DNUMBER {DNAME,DMGRSSN}}

Can infer additional FDs such as:

SSN {DNAME, DMGRSSN},

SSN SSN,

DNUMBER DNAME

Inference rules (Armstrong’s Axioms)

- Given a set of FDs F, we can infer additional FDs that hold whenever the FDs in F hold
- Armstrong's inference rules:
- IR1. (Reflexive) If Y subset-of X, then X -> Y
- IR2. (Augmentation) If X -> Y, then XZ -> YZ
- (Notation: XZ stands for X U Z)

- IR3. (Transitive) If X -> Y and Y -> Z, then X -> Z

- IR1, IR2, IR3 form a sound and complete set of inference rules
- These are rules hold and all other rules that hold can be deduced from these

Inference rules (Armstrong’s Axioms) -continued

IR4 (decomposition, or projective rule):

{X YZ} |= X Y.

IR5 (union, or additive rule):

{X Y, X Z} |= X YZ.

IR6 (pseudotransitive rule)

{X Y, WY Z } |= WX Z.

Armstrong’s Axioms

- IR1 – Reflexive rule
X Y, then X Y

ssn, name ssn

- IR2 – Augmentation rule
X Y |= XZ YZ

ssn name |= ssn, dob name,dob

- IR3 – Transitive rule
X Y, Y Z |= X Z

Ssn phone#

Phone# zip

Ssn zip

PROOF OF IR4 (USING IR1 THROUGH IR3)

IR4 (decomposition, or projective, rule):

{X YZ} |= X Y.

- X YZ (given).
- YZ Y (using IR1 and knowing that YZ Y).
- X Y (using IR3 on 1 and 2).

PROOF OF IR5 (USING IR1 THROUGH IR3)

IR5 (union, or additive, rule)

{X Y, X Z} |= X YZ.

XY (given).

XZ (given).

X XY (using IR2 on 1 by augmenting with X; notice that XX = X).

XY YZ (using IR2 on 2 by augmenting with Y).

X YZ (using IR3 on 3 and 4).

PROOF OF IR6 (USING IR1 THROUGH IR3)

IR6 (pseudotransitive rule)

{X Y, WY Z } |= WX Z.

X Y (given).

WY Z (given).

WX WY (using IR2 on 1 by augmenting with W).

WX Z (using IR3 on 3 and 2).

Exercise

a) {w y, x z} |= {wx y}

{w y, x z} |= {wx y}

w y Given

wx xy Aug(X)

xy y Reflexive

wx y Transitive

STUID DOB

C# C_TITLE

STUID, C# DOB

Closure of FDs (F+)

Set of FDs that could be derived from FDs using Armstrong’s Axioms

ssn name

ssn dob

=> Ssn name, dob

closure sets with respect to F (Example)

F =

{ SSNENAME,

NUMBER{NAME,PLOCATION},

{SSN, PNUMBER}HOURS

}

{ SSN }+ = { SSN, ENAME }

{ PNUMBER }+ =

{ PNUMBER, PNAME, PLOCATION }

{ SSN, PNUMBER }+ =

{ SSN, PNUMBER, ENAME, PNAME, PLOCATION, HOURS }

Inference Rules for FDs (3)

- Closure of a set F of FDs is the set F+ of all FDs that can be inferred from F
- Closure of a set of attributes X with respect to F is the set X+ of all attributes that are functionally determined by X
- X+ can be calculated by repeatedly applying IR1, IR2, IR3 using the FDs in F

Closure of attribute X (X+)

- Closure of a set of attributes X with respect to F is the set X+ of all attributes that are functionally determined by X
Algorithm 16.1Determining X+, the closure of X

under F

X+ := X

Repeat

old X+ := X+;

for each functional dependency Y Z in F do

if X+ Y then X+ := X+ U Z;

Until ( X+ = old X+ );

p 310

ExampleR(A,B,C,D,M,F,G)

FDs:

A B

B CD

C M

D FG

A+ = A

= AB (A B)

= ABCD (B CD)

= ABCDM (C M)

= ABCDFGM (DFG)

A IS A KEY OF THIS RELATION

R(STUID,F,M,L,STREET,CITY,STATE,ZIP,C#,CNAME,GRADE)

STUID F,M,L,STREET, CITY,STATE,ZIP

C# CNAME

ZIP CITY, STATE

STUID,C# GRADE

(STUID)+ = STUID

= STUID, F,M,L,STREET, CITY,STATE,ZIP

(C#)+ = C#

= C#, CNAME

(STUID,C#)+ = STUID,C#

=STUID,C#, F,M,L,STREET, CITY,STATE,ZIP

= STUID,C#, F,M,L,STREET, CITY,STATE,ZIP ,CNAME

= STUID,C#, F,M,L,STREET, CITY,STATE,ZIP ,CNAME,GRADE

Algorithm 16.2 (a): Finding a Key K for R Given a set F of Functional Dependencies

Input: A universal relation R and a set of functional dependencies F on the attributes of R.

1.Set K := R;

2.For each attribute A in K {

Compute (K - A)+ with respect to F;

If (K - A)+ contains all the attributes in R,

then set K := K - {A};

}

Find key of R( Functional DependenciesABCDFGMN)

FDs:

AC B

B CD

C MN

D FG

Is (AC) a candidate key?

(AC)+ = ABCDFGMN

A+ = A

C+ = CMN

(AC)+ = AC

= ABC (AC B)

= ABCD (B CD)

= ABCDMN (C MN)

= ABCDMNFG (D FG)

Find key of R( Functional DependenciesABCDFGMN)

FDs:

AC B

B CD

C MN

D FG

A C

Is (AC) a candidate key?

(AC)+ = ABCDFGMN

A+ = AC

= ABC (AC B)

= ABCD (B CD)

= ABCDMN (C MN)

= ABCDMNFG (D FG)

Find key of R( Functional DependenciesABCDFGMN)

FDs:

A B

B CD

C MN

D FG

ABCDFGMN

ACDFGMN A B

AFGMN A B, B CD

AFG A B, B C, C MN

A A B, B D, D FG

Find key of R(ABCDFGMN) Functional Dependencies

FDs:

AC B

B CD

C MN

D FG

A C

Is (AC) a candidate key?

(AC)+ = ABCDFGMN

A+ =

C+ =

Example: Functional Dependencies

Patient(Patient#, Pat_Name, Pat_Adr, Med_Exam_Description, Doctor_Number,

Med_Exam#, Date_Test, Test_Result, Doctor_Name)

- Key:

Example: Functional Dependencies

Hotel_Bill(Guest_Id, Room#, Room_Rate, Date_Checkin, Total_Amount, Guest_Fname,

Guest_Lname, InvoiceNumber, Guest_Address)

Key:

16.1.3 Minimal Sets of Functional Dependencies Functional Dependencies

F is minimal if:

- Every dependency in F has a single attribute for its right-hand side.
- We cannot replace any dependency X A in F with a dependency Y A, where Y is a proper subset of X, and still have a set of dependencies that is equivalent to F.
- We cannot remove any dependency from F and still have a set of dependencies that is equivalent to F.

Algorithm 16.2 Functional Dependencies Finding a minimal cover F for a set of functional dependencies E (p550)

Input: a set of functional dependencies E.

1. Set F := E.

2. Replace each functional dependency X {A1, A2,…, An} in F by the n functional dependencies XA1,

XA2, . . ., XAn .

3. For each functional dependency XA in F

for each attribute B that is an element of X

if ((F - {XA}) D {(X - {B}) A}) is equivalent to F,

then replace XA with (X - {B}) A in F.

4. For each remaining functional dependency XA in F

if (F - {XA}) is equivalent to F,

then remove XA from F.

Example Functional Dependencies

1) The set of functional dependencies F of relation R(ABCDEFGHIJKLM) is

A A,B,C,D,E

B C,D,E,F

C E,H

D,E F

D F,G

K L

L K

a) Find a minimal cover of the set F.

b) Find a candidate key for the relation R.

Normalization: Functional Dependencies The process of decomposing complex data structures into simple relations according to a set of dependency rules.

– reduce anomalies (update, insert, delete)

first proposed by Codd (1972)

Assumption: all non-key fields will be updated frequently

- tend to penalize retrieval

Problems with update anomalies Functional Dependencies

- Insertion anomalies
- Deletion anomalies
- Modification anomalies

EXAMPLE OF AN UPDATE ANOMALY Functional Dependencies

- Consider the relation:
- EMP_PROJ(Emp#, Proj#, Ename, Pname, No_hours)

- Update Anomaly:
- Changing the name of project number P1 from “Billing” to “Customer-Accounting” may cause this update to be made for all 100 employees working on project P1.

EXAMPLE OF AN INSERT ANOMALY Functional Dependencies

- Consider the relation:
- EMP_PROJ(Emp#, Proj#, Ename, Pname, No_hours)

- Insert Anomaly:
- Cannot insert a project unless an employee is assigned to it.

- Conversely
- Cannot insert an employee unless a he/she is assigned to a project.

EXAMPLE OF AN DELETE ANOMALY Functional Dependencies

- Consider the relation:
- EMP_PROJ(Emp#, Proj#, Ename, Pname, No_hours)

- Delete Anomaly:
- When a project is deleted, it will result in deleting all the employees who work on that project.
- Alternately, if an employee is the sole employee on a project, deleting that employee would result in deleting the corresponding project.

15.3 Normalization of Relations (1) Functional Dependencies

- Normalization:
- The process of decomposing unsatisfactory "bad" relations by breaking up their attributes into smaller relations

- Normal form:
- Condition using keys and FDs of a relation to certify whether a relation schema is in a particular normal form

Normalization of Relations (2) Functional Dependencies

- 2NF, 3NF, BCNF
- based on keys and FDs of a relation schema

- 4NF
- based on keys, multi-valued dependencies : MVDs; 5NF based on keys, join dependencies : JDs (Chapter 11)

- Additional properties may be needed to ensure a good relational design (lossless join, dependency preservation; Chapter 11)

Definitions of Keys and Attributes Participating in Keys (2)

- If a relation schema has more than one key, each is called a candidate key.
- One of the candidate keys is arbitrarily designated to be the primary key, and the others are called secondary keys.

- A Prime attribute must be a member of some candidate key
- A Nonprime attribute is not a prime attribute—that is, it is not a member of any candidate key.

Problems: (2)

Decomposition by Analysis and Synthesis techniques (2)

Decomposition by analysis

Decomposition by synthesis

First Normal Form(1NF): (2)

A relation is in first normal form if and only if every attribute is single-valued for each tuple.

-- 1NF disallows having a set of values, a tuple of values, or a combination of both as an attribute value for a single tuple. In other words, 1NF disallows "relations within relations" or "relations as attributes of tuples." The only attribute values permitted by 1NF are single atomic (or indivisible) values.

First Normal Form (2)

- Disallows
- composite attributes
- multivalued attributes
- nested relations; attributes whose values for an individual tuple are non-atomic

Second Normal Form (1) (2)

- Uses the concepts of FDs, primary key
- Definitions
- Prime attribute: An attribute that is member of the primary key K
- Full functional dependency: a FD Y -> Z where removal of any attribute from Y means the FD does not hold any more

- Examples:
- {SSN, PNUMBER} -> HOURS is a full FD since neither SSN -> HOURS nor PNUMBER -> HOURS hold
- {SSN, PNUMBER} -> ENAME is not a full FD (it is called a partial dependency ) since SSN -> ENAME also holds

Second Normal Form (2) (2)

- A relation schema R is in second normal form (2NF) if every non-prime attribute A in R is fully functionally dependent on the primary key
- R can be decomposed into 2NF relations via the process of 2NF normalization

Second Normal Form(2NF): (2)

A relation is in second normal form if and only if it is in first normal form and all the nonkey attributes are fully functionally dependent on the key.

Grade_report(SID, C#, stu_name, course_title)

not in 2NF (there are partial dependencies)

Full functional dependency (2)

A functional dependency X Y is a full functional dependency if removal of any attribute A from X means that the dependency does not hold any more; that is, for any attribute AX, (X - {A}) does not functionally determine Y.

(definition from the book) (2)

- Second normal form (2NF) is based on the concept of full functional dependency. A functional dependency X Y is a full functional dependency if removal of any attribute A from X means that the dependency does not hold any more; that is, for any attribute A X, (X - {A}) does not functionally determine Y. A functional dependency X Y is a partial dependency if some attribute A X can be removed from X and the dependency still holds; that is, for some A X, (X - {A}) Y.

- Second normal form (2)
- is violated when a non-key field is a fact about a subset of a key.
- It is only relevant when the key is composite, i.e., consists of several fields

Function Dependencies (F):

Warehouse Warehouse_address

Part, Warehouse quantity

William Kent (CACM1982)

3.4 Third Normal Form (1) (2)

- Definition:
- Transitive functional dependency: a FD X -> Z that can be derived from two FDs X -> Y and Y -> Z

- Examples:
- SSN -> DMGRSSN is a transitive FD
- Since SSN -> DNUMBER and DNUMBER -> DMGRSSN hold

- SSN -> ENAME is non-transitive
- Since there is no set of attributes X where SSN -> X and X -> ENAME

- SSN -> DMGRSSN is a transitive FD

Third Normal Form (2) (2)

- A relation schema R is in third normal form (3NF)if it is in 2NFandno non-prime attribute A in R is transitively dependent on the primary key
- R can be decomposed into 3NF relations via the process of 3NF normalization
- NOTE:
- In X -> Y and Y -> Z, with X as the primary key, we consider this a problem only if Y is not a candidate key.
- When Y is a candidate key, there is no problem with the transitive dependency .
- E.g., Consider EMP (SSN, Emp#, Salary ).
- Here, SSN -> Emp# -> Salary and Emp# is a candidate key.

Normal Forms Defined Informally (2)

- 1st normal form
- All attributes depend on the key

- 2nd normal form
- All attributes depend on the whole key

- 3rd normal form
- All attributes depend on nothing but the key

General Normal Form Definitions (For Multiple Keys) (1) (2)

- The above definitions consider the primary key only
- The following more general definitions take into account relations with multiple candidate keys
- A relation schema R is in second normal form (2NF) if every non-prime attribute A in R is fully functionally dependent on every key of R

General Normal Form Definitions (2) (2)

- Definition:
- Superkey of relation schema R - a set of attributes S of R that contains a key of R
- A relation schema R is in third normal form (3NF) if whenever a FD X -> A holds in R, then either:
- (a) X is a superkey of R, or
- (b) A is a prime attribute of R

- NOTE: Boyce-Codd normal form disallows condition (b) above

A relation is in third normal form if it is in second normal form and no nonkey attribute is transitively dependent on the key.

Course_info(C#, course_title, Instructure_name, Inst_phone)

Not in 3NF – there is a transitive dependency

C# course_title, Instructure_name

Instructure_name Inst_phone

From the book (2)

- Third normal form (3NF) is based on the concept of transitive dependency. A functional dependency X Y in a relation schema R is a transitive dependency if there is a set of attributes Z that is neither a candidate key nor a subset of any key of R, and both X Z and ZY hold.
- Codd’s original definition, a relation schema R is in 3NF if it satisfies 2NFand no nonprime attribute of R is transitively dependent on the primary key

Third normal form (2)is violated when a non-key field is a fact about another non-key field

Function Dependencies (F):

Employee Department

Department Location

William Kent ( CACM 1982)

Summary of Normal Forms Based on Primary Keys and corresponding Normalization.

Unnormalized & 1NF relation corresponding Normalization.

University Database corresponding Normalization.

(Third Normal Form relations)

- Philip Bernstein corresponding Normalization., “Synthesizing Third Normal Form Relations from Functional Dependencies,” ACM Transactions on Database Systems, 1976, pp 277-298. (University of Toronto)
To find 3NF relations

- Find Minimal Cover of FDs

- Each FD— convert to 3NF relation

- If key of the Universal DB is not in any relations then create a relation from the key

A) INVOICE(INVOICE#, CUST_ID, CUST_NAME, BILL_DATE, TOT_AMOUNT, SALES_REP#, SALES_REP_NAME)

Key: Normal Form:

3NF relations:

B) PRJ_EMP(PROJECT#, MANAGER_ID, PROJECT_TITLE, EMP_ID, EMP_LNAME, EMP_FNAME, MANAGER_NAME, EXPECT_END_PROJECT_DATE, EMP_HOURLY_RATE,HOURS_WORK_PER_PRJ)

Key: Normal Form:

3NF relations:

Boyce-Codd Normal Form(BCNF): EMP_LNAME, EMP_FNAME, MANAGER_NAME, EXPECT_END_PROJECT_DATE, EMP_HOURLY_RATE,HOURS_WORK_PER_PRJ)

A relation is in Boyce-Codd normal form if and only if every determinant is a candidate key (1974).

(every relation in BCNF is also in 3NF)

Determinant– attribute(s) on the LHS of FD

A BC (A is determinant)

BCNF (Boyce-Codd Normal Form) EMP_LNAME, EMP_FNAME, MANAGER_NAME, EXPECT_END_PROJECT_DATE, EMP_HOURLY_RATE,HOURS_WORK_PER_PRJ)

- A relation schema R is in Boyce-Codd Normal Form (BCNF) if whenever an FD X -> A holds in R, then X is a superkey of R
- Each normal form is strictly stronger than the previous one
- Every 2NF relation is in 1NF
- Every 3NF relation is in 2NF
- Every BCNF relation is in 3NF

- There exist relations that are in 3NF but not in BCNF
- The goal is to have each relation in BCNF (or 3NF)

Boyce-Codd Normal Form EMP_LNAME, EMP_FNAME, MANAGER_NAME, EXPECT_END_PROJECT_DATE, EMP_HOURLY_RATE,HOURS_WORK_PER_PRJ)

A relation TEACH that is in 3NF but not in BCNF EMP_LNAME, EMP_FNAME, MANAGER_NAME, EXPECT_END_PROJECT_DATE, EMP_HOURLY_RATE,HOURS_WORK_PER_PRJ)

Achieving the BCNF by Decomposition (1) EMP_LNAME, EMP_FNAME, MANAGER_NAME, EXPECT_END_PROJECT_DATE, EMP_HOURLY_RATE,HOURS_WORK_PER_PRJ)

- Two FDs exist in the relation TEACH:
- fd1: { student, course} -> instructor
- fd2: instructor -> course

- {student, course} is a candidate key for this relation and that the dependencies shown follow the pattern in Figure 10.12 (b).
- So this relation is in 3NF but not in BCNF

- A relation NOT in BCNF should be decomposed so as to meet this property, while possibly forgoing the preservation of all functional dependencies in the decomposed relations.
- (See Algorithm 11.3)

Multivalued Dependency EMP_LNAME, EMP_FNAME, MANAGER_NAME, EXPECT_END_PROJECT_DATE, EMP_HOURLY_RATE,HOURS_WORK_PER_PRJ)

Let R be a relation having attributes or sets of attributes A,B,C. There is a multivalued dependence of attribute B on attribute A if and only if the set of B values associated with a given A value is independent of the C values.

A > B,C

B -/-> C or C -/-> B

Fourth Normal Form(4NF): EMP_LNAME, EMP_FNAME, MANAGER_NAME, EXPECT_END_PROJECT_DATE, EMP_HOURLY_RATE,HOURS_WORK_PER_PRJ)

A relation is in fourth normal form if and only if it is in Boyce-Codd normal form and there are no nontrivial multivalued dependencies.

Employee EMP_LNAME, EMP_FNAME, MANAGER_NAME, EXPECT_END_PROJECT_DATE, EMP_HOURLY_RATE,HOURS_WORK_PER_PRJ)> Skill, Language

Language --/ skill

William Kent (1982)

Fifth Normal Form(5NF): EMP_LNAME, EMP_FNAME, MANAGER_NAME, EXPECT_END_PROJECT_DATE, EMP_HOURLY_RATE,HOURS_WORK_PER_PRJ)

A relation is in fifth normal form if no remaining nonloss projections are possible, except the trivial one in which the key appears in each projection.

Fifth normal EMP_LNAME, EMP_FNAME, MANAGER_NAME, EXPECT_END_PROJECT_DATE, EMP_HOURLY_RATE,HOURS_WORK_PER_PRJ) form when its information content cannot be reconstructed from several smaller record types

William Kent (CACM 1982)

Domain-Key Normal Form(DKNF): EMP_LNAME, EMP_FNAME, MANAGER_NAME, EXPECT_END_PROJECT_DATE, EMP_HOURLY_RATE,HOURS_WORK_PER_PRJ)

A relation is in domain-key normal form if every constraint is a logical consequence of domain constraints or key constraints.

Exercise!! EMP_LNAME, EMP_FNAME, MANAGER_NAME, EXPECT_END_PROJECT_DATE, EMP_HOURLY_RATE,HOURS_WORK_PER_PRJ)

Classify each of the following relations as 1NF, 2NF, 3NF, BCNF, 4NF, or 5NF. (State any assumptions you make)

a) COURSE(COURSE#,C_TITLE,INS_NAME,TEXT_BOOK,PUBLISHER)

b) ENROLLMENT(STUID,COURSE#,STUNAME,COURSE_TITLE,

GRADE)

c) PART(PART#,PART_NAME,DEPARTMENT,PRICE)

d) EMPLOYEE( EMP_LNAME, EMP_FNAME, MANAGER_NAME, EXPECT_END_PROJECT_DATE, EMP_HOURLY_RATE,HOURS_WORK_PER_PRJ)EMPID,EMPNAME,DATE_HIRED,JOB_TITLE,

JOB_LEVEL,PROJ#,PROJ_TITLE,DUEDATE)

e) BOOK(ISBN,TITLE,AUTHOR,AUTHOR_INSTITUTE,PRICE,

PUBLISHER_NAME,PUBLISHER_ADR)

f) CUSTOMER(CUSTID,CUSTNAME, PHONE#,

STREET,CITY,STATE,ZIP)

Lossless Join Decomposition EMP_LNAME, EMP_FNAME, MANAGER_NAME, EXPECT_END_PROJECT_DATE, EMP_HOURLY_RATE,HOURS_WORK_PER_PRJ)

(R1, R2) is a decomposition of R

F is a set of functional dependencies

(R1, R2) is lossless join with to F if and only if

R1 R2 R1 – R2

or R1 R2 R2 – R1

And R1 U R2 = R

Dependency preservation EMP_LNAME, EMP_FNAME, MANAGER_NAME, EXPECT_END_PROJECT_DATE, EMP_HOURLY_RATE,HOURS_WORK_PER_PRJ)

- Dependency preservation: Let Fibe the set of dependencies F+ that include only attributes in Ri.
- dependency preserving
(F1 F2 … Fn)+ = F+

- Otherwise, checking updates for violation of functional dependencies may require computing joins, which is expensive.

- dependency preserving

Example -- dependency preservation: EMP_LNAME, EMP_FNAME, MANAGER_NAME, EXPECT_END_PROJECT_DATE, EMP_HOURLY_RATE,HOURS_WORK_PER_PRJ)

R = (A, B, C)F = {A B, B C)

Two ways to decompose this relation:

1) R1 = (A, B), R2 = (B, C)

- Lossless-join decomposition:
R1 R2 = {B} and R2 -R1 = C

- Dependency preserving
2) R1 = (A, B), R2 = (A, C)

- Lossless-join decomposition:
R1 R2 = {A} and R1 -R2 = B

Not dependency preserving (cannot check B C without computing R1 x R2)

3NF – always dependency preservation

BCNF – not always

Exercise EMP_LNAME, EMP_FNAME, MANAGER_NAME, EXPECT_END_PROJECT_DATE, EMP_HOURLY_RATE,HOURS_WORK_PER_PRJ)

1) The set of functional dependencies F of relation R(ABCDEFGHIJKLM) is

A A,B,C,D,E

B C,D,E,F

C E,H

D,E F

D F,G

K L

L K

a) Find a minimal cover of the set F.

b) Find a candidate key for the relation R.

c) Find a lossless join decomposition of R into 3NF (also give a proof of lossless join).

2) The set of functional dependencies F of relation R(ABCDEFGHIJKL) is

A ---> B,C,D,E,K

BC ---> D,E,F

C ---> K,H

D,B ---> G

D ---> F,G

K ---> L

a) Find a minimal cover of the set F.

b) Find a candidate key for the relation R.

c) Find a lossless join decomposition of R into 3NF (also give a proof of lossless join).

- REFERENCES R(ABCDEFGHIJKL) is
- E.F. Codd, "A Relational Model of Data for Large Shared Data Banks", Comm. ACM 13 (6), June 1970, pp. 377-387. The original paper introducing the relational data model.
- E.F. Codd, "Normalized Data Base Structure: A Brief Tutorial", ACM SIGFIDET Workshop on Data Description, Access, and Control, Nov. 11-12, 1971, San Diego, California, E.F. Codd and A.L. Dean (eds.). An early tutorial on the relational model and normalization.
- E.F. Codd, "Further Normalization of the Data Base Relational Model", R. Rustin (ed.), Data Base Systems (Courant Computer Science Symposia 6), Prentice-Hall, 1972. Also IBM Research Report RJ909. The first formal treatment of second and third normal forms.
- C.J. Date, An Introduction to Database Systems (third edition), Addison-Wesley, 1981. An excellent introduction to database systems, with emphasis on the relational.
- R. Fagin, "Multivalued Dependencies and a New Normal Form for Relational Databases", ACM Transactions on Database Systems 2 (3), Sept. 1977. Also IBM Research Report RJ1812. The introduction of fourth normal form.
- R. Fagin, "Normal Forms and Relational Database Operators", ACM SIGMOD International Conference on Management of Data, May 31-June 1, 1979, Boston, Mass. Also IBM Research Report RJ2471, Feb. 1979. The introduction of fifth normal form.
- W. Kent, "A Primer of Normal Forms", IBM Technical Report TR02.600, Dec. 1973. An early, formal tutorial on first, second, and third normal forms.
- T.-W. Ling, F.W. Tompa, and T. Kameda, "An Improved Third Normal Form for Relational Databases", ACM Transactions on Database Systems, 6(2), June 1981, 329-346. One of the first treatments of inter-relational dependencies.

Exercise 14.18 (p498) R(ABCDEFGHIJKL) is

a) {w y, x z} |= {wx y}

{w y, x z} |= {wx y}

w y Given

wx xy Aug(X)

xy y Reflexive

wx y Transitive

STUID DOB

C# C_TITLE

STUID, C# DOB

Download Presentation

Connecting to Server..