Midterm 3 revision 2
This presentation is the property of its rightful owner.
Sponsored Links
1 / 67

Midterm 3 Revision 2 PowerPoint PPT Presentation


  • 77 Views
  • Uploaded on
  • Presentation posted in: General

CS157A Lecture 20. Midterm 3 Revision 2. Prof. Sin-Min Lee Department of Computer Science. Normal Forms. BCNF. 5NF. 2NF. 4NF. 3NF. 1NF. Functional dependencies. Multivalued dependencies. Join dependencies. F + : dependencies induced by Armstrong’s Axioms.

Download Presentation

Midterm 3 Revision 2

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


Midterm 3 revision 2

CS157A Lecture 20

Midterm 3 Revision 2

Prof. Sin-Min Lee

Department of Computer Science


Normal forms

Normal Forms

BCNF

5NF

2NF

4NF

3NF

1NF

Functional

dependencies

Multivalued

dependencies

Join

dependencies


F dependencies induced by armstrong s axioms

F+: dependencies induced by Armstrong’s Axioms

  • Axioms for reasoning about FD’s

  • (i), (ii) and (iii) are Armstrong’s axioms

F+is the set of dependencies which can be deduced from F by applying three inference rules:

(i) reflexivityif Y  X then X ® Y

(ii) augmentation if X ® Y then XZ ® YZ

(iii) transitivityif X ® Y and Y ® Z then X ® Z


Completeness of armstrong axioms

Completeness of Armstrong Axioms

  • Idea

    If X ® Y cannot be deduced using Armstrong’s axioms then there is a relational instance for R in which all the dependencies in F are true,

    but X ® Y does not hold

R=LMNO

X=L

F={L ® M , M ® N, O ® N}

then X+ = LMN

L ® O cannot

be deduced in F*

L

M

N

O

L ® O cannot

be deduced in F+

Counterexample:


Closure of a set of attributes

Closure of a Set of Attributes

  • Let U be a set of attributes and F be a set of functional dependencies on U.

  • Suppose that X  U is a set of attributes.

  • Definition:X+ = { A | F XA}

  • We would like to compute X+

|=


Midterm 3 revision 2

Note that A, B, C, are attributes

We refer to the set {A,B} simply as AB

Using reflexivity, we

can generate all

trivial dependencies

R= ( A, B, C )

F = {A  B, B  C }

F+= {A  A, B  B, C  C,

AB  AB, BC  BC, AC  AC, ABC  ABC,

AB  A, AB  B,

BC  B, BC  C,

AC  A, AC  C,

ABC  AB, ABC  BC, ABC  AC,

ABC  A, ABC  B, ABC  C,

A  B,… (1) ( given )

B  C,… (2) ( given )

A  C,… (3) ( transitivity on (1) and (2) )

AC  BC, … (4) ( augmentation on (1) )

AC  B,… (5) ( decomposition on (4) )

A  AB,… (6) ( augmentation on (1) )

AB  AC, AB  C, B  BC,

A  AC, AB  BC, AB  ABC, AC  ABC, A  BC, A  ABC }


Midterm 3 revision 2

Algorithm to compute closure of attributes X+ under F

closure := X ;

Repeat

for eachU  VinFdo

begin

ifU  closure

thenclosure := closure  V ;

end

Until (there is no change in closure)


Midterm 3 revision 2

R( A, B, C, G, H, I )

F= {A  B, A  C, CG  H, CG  I, B  H }

To compute AG+

closure = AG

closure = ABG ( A  B )

closure = ABCG ( A  C )

closure = ABCGH ( CG  H )

closure = ABCGHI ( CG  I )

Is AG a candidate key?

AG R

A+ R ?

G+ R ?


Example

Example

  • R(ABCDE)

  • F={ABC, CEB, DA, BCE}

    • {A}+ =

    • {A,B}+ =

    • {B,D}+=


Midterm 3 revision 2

Theorem:

R- a relation schema

F- set of functional dependencies on R

The decomposition of R into relations with attribute sets

R1, R2is a lossless-join decomposition iff

( R1  R2 )  R1  F+

OR

( R1  R2 )  R2  F+

i.e., R1  R2 is a superkey for R1 or R2.

(the attributes common to R1 and R2 must contain a key for

either R1 or R2 ).


Example1

Example

  • R(A,B,C,D,E)

  •  = {R1(A,D), R2(A,B),R3(B,E), R4(C,D,E), R5(A,E)}

  • FD1. AC FD2. BC FD3. CD

  • FD4. DEC FD5. CE A.

  • Decide whether the decomposition is lossless.


Bcnf decomposition

BCNF Decomposition

Suppose R is not in BCNF, A is an attribute, and X  A is a FD where X  A =  that violates the condition.

Remove A from R

Create a new relational schema XA

Repeat this process until all the relations are in BCNF

  • It is a lossless join decomposition.

  • But not necessary dependency preserving


Midterm 3 revision 2

Key is C

CSJDPQV

SDP

JS

SDP

CSJDQV

SDP

JS

CJDQV

JS


Midterm 3 revision 2

Key is C

SDP

CSJDPQV

JS

SDP

JP C

CSJDQV

SDP

JS

CJDQV

JS

The result is in BCNF

Does not preserve JPC, we can add a schema:

CJP

Each of SDP, JS, CJDQV, CJP is in BCNF, but there is

redundancy in CJP.


Midterm 3 revision 2

Possible refinement

CSJDPQV

Key is C

SDP

SDP

CSJDQV

SDP

SDQ

SDQ

CSJDV

SDQ

SD is a key in SDP and SDQ,

There is no dependency between P and Q

we can combine SDP and SDQ into one schema

Resulting in SDPQ, CSJDV


Overview

Overview

  • It is possible to decompose any relational schema into a set of relational schemas with the following properties:

  • 1)Attribute Preserving

  • 2)FDs preserving

  • 3)Lossless Join


The decomposition algorithm

The Decomposition Algorithm

  • Step 1. Find a minimal cover G for F

  • Step 2. For each left-hand side X that appears in G, create a relation schema with attributes

    {X  A1  A2, ... ,  Am}

    where X  A1, X  A2, ... , X  Am are all dependencies in G with X as left-hand side.

  • Step 3. If none of the relation schemas contains a key of R, create one more relation schema that contains attributes that form a key for R.


Example2

Example

  • Consider R(A, B, C, D, E) and

    F={AB  C, A  BE, C E}

  • Step 1.minimal cover

    Fmin={AC, AB, CE}

  • Step 2. R1(A,B,C), R2(C,E)

  • Step 3. Key: {A,D}

    we have R3(A,D)

  •  Final Result:

    R1(A, B,C), R2(C,E), and R3(A,D)


Bcnf decomposition1

BCNF Decomposition

  • Property LJ1:

  • A decomposition D={R1, R2} of R has the lossless join property with respect to a set of functional dependencies F on R if and only if either

    • the FD ((R1  R2)  (R1 - R2)) is in F+, or

    • the FD ((R1  R2)  (R2 - R1)) is in F+


Bcnf decomposition2

BCNF Decomposition

  • Property LJ2:

  • If a decomposition D={R1, R2, ..., Rm} of R has the lossless join property with respect to a set of functional dependencies F on R, and if a decomposition D1={Q1, Q2, ... ,Qk} of Ri has the lossless join property with respect to the projection of F on Ri, then the decomposition

    D2={R1, R2, ... Ri-1, Q1, Q2, ..., Qk, Ri+1, ..., Rm} of R has the lossless join property with respect to F


Bcnf decomposition algorithm

BCNF Decomposition - Algorithm

1.Set D  {R}

2.While there is a relation schema Q in D that is not in BCNF do

begin

choose a relation schema Q in D that is not in BCNF;

find a functional dependency X  Y in Q that violates BCNF;

replace Q in D by two schemas

(Q-Y) and (X  Y)

end;


Bcnf decomposition example

BCNF Decomposition - Example

Consider R(A,B,C,D) and

F={A  B, B  C, D  B}

Decompose R into BCNF relations.

Step 1.D={R(A,B,C,D)}

Step 2. Loop 1.R is not in BCNF because A  B and A is not a superkey

decompose R into R1(A, C, D), and R2(A, B)

Loop 2. R1 is not in BCNF because A  C and A is not a superkey

decompose R1 into R11(A, D) and R12(A, C)

Result:D={R11(A,D), R12(A,C), R2(A,B)}


Overview1

Overview

  • Given a relation schema R(A1, A2, ... , An).

  • If R is not in the third normal form (3NF), we wan to decompose it into a set of relation schema D= { R1, R2, ... ,Rm }, where each Ri is in 3NF, such that the following conditions are held:

  • Attribute preservation condition.

  • Dependency preservation condition

  • Lossless join condition


Attribute preservation condition

Attribute Preservation Condition

  • Attribute preservation condition states that the union of attributes of Ri equal to the set of attributes of R.

  •  For example: Given R(A, B, C, D) and the decomposition

    D1={ R1(A,B), R2(B,C) and R3(A,C,D)}. D1 satisfies the attribute preservation condition.


Attribute preservation condition1

Attribute Preservation Condition

  • Given R(A, B, C, D) and the decomposition

    D2={R1(A, B), R2(B,C), R3(A, C)},

  • The attribute preservation condition is violated because D is missing (not preserved in the decomposition).


Dependency preservation condition

Dependency Preservation Condition

  • We say that a decomposition D={R1, R2, ... , Rm} of R is dependency preserving with respect to F if the union of the projections of F on each Ri in D is equivalent to F. That is:

    ((F(R1)  ... F(Rm))+ = F+

  • Given a set of dependencies F on R, the projection of F on Ri, denoted by F(Ri) where Ri is a subset of R, is the set of dependencies X  Y in F+ such that the attributes in X  Y are all contained in Ri.


Dependency preservation condition1

Dependency Preservation Condition

  • Given R(A, B, C, D) and F = { A  B, B  C, C  D}

  •  Let D1={R1(A,B), R2(B,C), R3(C,D)}

    F(R1)={A  B}

    F(R2)={B  C}

    F(R3)={C  D}

    FDs are preserved.


Dependency preservation condition2

Dependency Preservation Condition

  • Given R(A, B, C, D) and F = { A  B, B  C, C  D}

  •  Let D2={R1(A,B}, R2(B,C), R3(A, D)}, then FDs are not preserved.


Dependency preservation condition3

Dependency Preservation Condition

  • Given R(A, B, C, D) and F = { A  B, B  C, C  D}

  •  Let D2={R1(A,B}, R2(B,C), R3(A, D)}, then FDs are not preserved.


Dependency preservation condition4

Dependency Preservation Condition

  • We want to preserve the dependencies because each dependency in F represents a constraint on a database.


Dependency preservation condition5

Dependency Preservation Condition

  • If one of the dependencies is not represented by the dependencies on some individual relation Ri of the decomposition, we will not be able to enforce this constraint by looking only at an individual relation, instead, to enforce the constraint, we will have to join two or more of the relations in the decomposition and then check that functional dependency hold in the result of the join operation. This is very inefficient.


Multivalued dependencies and fourth normal form

Multivalued Dependencies and Fourth Normal Form

  • Formal Definition of Multivalued Dependency


Multi valued dependency

Multi-Valued Dependency

  • Problem: multi-valued (or binary join) dependency

    • Definition: If every instance of schema R can be (losslessly) decomposed using attribute sets (X, Y) such that:

r =  X (r)

 Y (r)

then a multi-valued dependency exists

Ex:Person= SSN,PhoneN(Person)

 SSN,ChildSSN(Person)


Fourth normal form

Fourth Normal Form

  • A schema is in fourth normal form if for every non-trivial multi-valued dependency:

R = X

Y

either:

- X  Y or Y  X (trivial case) or

- X Y is a superkey of R

(i.e., X  Y R )


4th normal form

4th Normal Form

  • No multivalued dependencies and BCNF

  • Create separate tables for each separate functional dependency


Example3

Example

SalesForce (State, SalesPerson)

Delivery (State, Delivery)


Beyond 4th normal form

Beyond 4th Normal Form

  • 5th Normal Form

    • Project-Join Normal Form

  • Domain Key Normal Form (DKNF)


Midterm 3 revision 2

Assume the relation R contains the following two tuples

R(A B C D)

( 1 2 3 4 ) …

( 1 5 6 7 ) …

  What other tuples must R contain so that A ->-> B and A ->-> C hold for R ?

Answer: The tuples that must be included due to the two multi-valued dependency are:

(1 2 6 7)

(1 5 3 4)

(1 2 6 4)

(1 5 3 7)

(1 2 3 7) second round

(1 5 6 4) second round


Example4

Example

  • Consider the following relation and determinants.

    R(a,b,c,d)a,c -> b,da,d -> b

  • To be in BCNF, all valid determinants must be a candidate key. In the relation R, a,c->b,d is the determinate used, so the first determinate is fine.

  • a,d->b suggests that a,d can be the primary key, which would determine b. However this would not determine c. This is not a candidate key, and thus R is not in BCNF.


Tuple relational calculus

Tuple Relational Calculus

  • based on specifying a number of tuple variables

  • a tuple variable refers to any tuple


Simple example 1

Simple example 1

  • To find all employees whose salary is greater than $50,000

    • {t| EMPLOYEE(t) and t.Salary>5000}

      • where

      • EMPLOYEE(t) specifies the range of tuple variable t

    • The above operation selects all the attributes


Simple example 2

Simple example 2

  • To find only the names of employees whose salary is greater than $50,000

    • {t.FNAME, t.NAME| EMPLOYEE(t) and t.Salary>5000}

  • The above is equivalent to

  • SELECT T.FNAME, T.LNAME

  • FROM EMPLOYEE T

  • WHERE T.SALARY > 5000


Elements of a tuple calculus

Elements of a tuple calculus

  • In general, we need to specify the following in a tuple calculus expression:

    • Range Relation (I.e, R(t)) = FROM

    • Selected combination= WHERE

    • Requested attributes= SELECT


Elements of formula

Elements of formula

  • A formula is made of Predicate Calculus atoms:

    • an atom of the from R(ti)

    • ti.A op tj.B op{=, <,>,..}

    • F1 And F2 where F1 and F2 are formulas

    • F1 OR F2

    • Not (F1)

    • F’=(t) (F) or F’= (t) (F)

      •  Y friends (Y, John)

      • X likes(X, ICE_CREAM)


More example

More Example

  • For every project located in ‘Stafford’, retrieve the project number, the controlling department number, and the last name, birthrate, and address of the manger of that department.


Domain relational calculus drc

Domain Relational Calculus (DRC)

  • Another type of formal predicate calculus-based language

  • QBE is based on DRC

  • The language shares a lot of similarities with the tuple calculus


Midterm 3 revision 2

DRC

  • The only difference is the type of variables:

    • variables range over singles values from domains of attributes

  • An expression of DRC is:

    • {x1, x2,…,xn|COND(x1,x2,…,xn, xn+2,…,xn+m)}

      • where x1,x2,…,xn+m are domain var range over attributers

      • COND is a condition (or formula)


Examples

Examples

  • Retrieve the birthdates and address of the employee whose name is ‘John B. Smith’

  • {uv| (q)(r)(s) (EMPLOYEE(qrstuvwxyz) and q=‘John’ and r=‘B’ and s=‘Smith’


Alternative notation

Alternative notation

  • Ssign the constants ‘John’, ‘B’, and ‘Smith’ directly

  • {uv|EMPLOYEE (‘John’, ’B’, ’Smith’ ,t ,u ,v ,x ,y ,z)}


More example1

More example

  • Retrieve the name and address of all employees who work for the ‘Reseach’ department

  • {qsv | ( z) EMPLOYEE(qrstuvwxyz) and ( l) ( m) (DEPARTMENT (lmno) and l=‘Research’ and m=z))}


More example2

More example

  • List the names of managers who have at least on e dependent

  • {sq| ( t) EMPLOYEE(qrstuvwxyz) and (( j)( DEPARTMENT (hijk) and (( l) | (DEPENTENT (lmnop) and t=j and t=l))))}


Characteristics of a decomposition

Characteristics of a Decomposition

  • Two important characteristics of a decomposition:

    • lossless join: necessary, otherwise original relation cannot be recreated, even if tables are not modified

    • dependency preserving: allows us to check that inserts/updates are correct without joining the sub-relations


Lossless join

Lossless Join


Checking

Checking

  • Check for a lossless join using the algorithm from class (with the a-s and b-s)

  • Check for dependency preserving using an algorithm shown today


Dependency preservation

Dependency Preservation

  • R=ABC

  • Decomposition {AB, AC}

  • Dependencies {AB, BC}.

  • Is it lossless?

  • Does this decomposition preserve BC?


Dependency preservation cont d

Dependency Preservation (cont’d)


Definitions

Definitions

  • We define S (F) to be the set of dependencies XY in F+ such that X and Y are in S.

  • We say that a decomposition R1...Rn of R is dependency preserving if for all instances r of R that satisfy the FDs of R:

    (R1(F) U ... U Rn (F))+ = F+

  • Note that one inclusion clearly holds always.

  • This definition implies an exponential algorithm to check if a decomposition is dependency preserving

  • We give a polynomial algorithm


Algorithm

Algorithm

  • Let R be a relation, decomposed into R1, R2,…,Rn

  • Let F be a set of functional dependencies

  • To check whether R1,…,Rn preserves all the functional dependencies in F, run the algorithm on the next slide for each X -> Y in F

  • If the answer is “Yes” for all FDs, then the decomposition preserves F

  • If the answer is “No” for at least one FD, then the decomposition does not preserve F


Testing dependency preservation

Testing Dependency Preservation

  • To check if the decomposition preserves XY:

Z:=X

while changes to Z occur do

for i=1 to n do

Z:= Z  ((Z  Ri)+  Ri)

if YZ

return “yes”

else

return “no”


Example 1

Example (1)

  • R=ABCD

  • F = {A -> B, B -> C, C -> D, D -> A}

  • R1=AB, R2=BC, R3=CD

  • Is this decomposition dependency preserving?


Example 2

Example (2)

  • R = ABCDE

  • F = {A -> ABCDE, BC -> A, DE -> C}

  • Suppose we decompose R into ABDE and DEC.

  • Is the decomposition dependency preserving?


Normal forms1

Normal Forms

  • The basic idea: if a relation is in one of these forms, then it avoids certain problems (e.g., redundancy)

  • Normal Forms:

    • BCNF: Every dependency X->A in F+ must be (1) trivial or (2) X is a super-key

    • 3NF: Every dependency X->A in F+ must be (1) trivial, (2) X is a super-key or (3) A is an attribute of a key


Example5

Example

  • Reminder F+ = {X -> X+ | exist Y->Z in F st Y in X and Z not in X}

  • Suppose that R = ABC. For each of the following values of F, decide whether R is in BCNF/3NF:

    • F = {}

    • F = {A -> B}

    • F = {A -> B, A -> C}

    • F = {A -> B, B -> C}

    • F = {A -> B, BC -> A}


Decomposition into 3nf

Decomposition into 3NF

  • Given a relation R with functional dependencies F

  • Step 1: Find a non-redundant cover G of F

  • Step 2: For each FD XA in G, create a schema XA

  • Step 3: If no schema created so far contains a key, add a key as a schema

  • Step 4: Remove schemas that are contained in other schemas

  • The result is a decomposition into 3NF that is dependency preserving and has a lossless join


Example6

Example

  • Find a decomposition into 3NF for the relation R = ABCDEFGH, with the functional dependencies F = {AB, ABCDE, EFGH, ACDFEG}


Example7

Example

  • Non-redundant cover G = {AB, ACDE, EFG, EFH}

  • Key ACDF

  • Schema: AB, ACDE, EFG, EFH, ACDF


  • Login