Normalization hussam eldin najib cs department an najah national university
This presentation is the property of its rightful owner.
Sponsored Links
1 / 74

Normalization Hussam Eldin Najib CS Department An-Najah National University PowerPoint PPT Presentation


  • 74 Views
  • Uploaded on
  • Presentation posted in: General

Normalization Hussam Eldin Najib CS Department An-Najah National University. Introduction. The purpose of normalization is to design “good” database schemas; normalized schemas avoid certain types of anomaly

Download Presentation

Normalization Hussam Eldin Najib CS Department An-Najah National University

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


Normalization hussam eldin najib cs department an najah national university

NormalizationHussam Eldin NajibCS Department An-Najah National University

HUSSAM NAJIB- CS DEPT- AN-NAJAH NATIONAL UN.


Introduction

Introduction

  • The purpose of normalization is to design “good” database schemas; normalized schemas avoid certain types of anomaly

  • If we use methodologies based on developing a definition of the schema in terms of the ER model and then mapping such schema in a relational schema, most of these anomalies do not arise

  • Very often, however, the normalization process results in a large number of relations with few attributes each; notice as this may result in performance inefficiency, due to the increased need of join operations


Normalization hussam eldin najib cs department an najah national university

Redundancy and anomalies

job

budget

name

salary

project

Red 2 DB 300 technician

Green 3 WEB 500 designer

Green 3 CAD 1500 designer

Black 7 CAD 1500 director

Black 7 WEB 500 consultant

Black 7 DB 300 consultant

Mori 6 DB 300 director

Mori 6 CAD 1500 designer

White 6 CAD 1500 designer

White 6 DB 300 designer


Redundancy and anomalies

Redundancy and anomalies

1) redundancies:

  • The information concerning the salary of each employee is repeated several times

  • The information concerning the budget of each project is repeated several times

  • The values of columns ‘name’ and ‘project’ are repeated and therefore each one of these two columns cannot be taken as key

  • The key is thus (name, project): we do not have repetitions of information related to the relationship project-employee


Redundancy and anomalies1

Redundancy and anomalies

2) updates:

  • Because the information about the salary of each employee is repeated multiple times, if the salary of an employee is updated, such update must be propagated to all the tuples concerning such employee

  • The same applies to updates to the budget of projects


Redundancy and anomalies2

Redundancy and anomalies

3) deletions:

  • Suppose that an employee resigns from the company or does not participate on any of the company project, then data about some projects can be lost if this employee was the only one working on these projects

  • The same happens for employee data if a project is deleted

  • If the key is (project, name) in both cases of deletions the key would have null values, which is a situation not admitted


Redundancy and anomalies3

Redundancy and anomalies

4) insertions:

  • Because the key is (project, name) it is not possible to insert the data of an employee if such employee has not been assigned to at least a project.

  • The same holds for projects. It is not possible to insert the data about a project, if this project does not have at least an employee working on it

  • Accepting the insertion of a project (without an employee) or an employee (without a project) results in introducing null values (not compatible with the key)


Redundancy and anomalies4

Redundancy and anomalies

  • Such anomalies do not in general arise if one follows a good database design methodology: design of an ER schema and mapping of this schema on the relational model

  • It can anyhow happen that deficiencies in the specification or errors in the conceptual shema may result in relations with anomalies

  • The most frequent cases are when dealing with database that are not well documented or when trying to understand the nature of the data from documents that describe the information based on paper forms


Redundancy and anomalies5

Redundancy and anomalies

  • ER correct

  • Mapping onto the relational model

    employees (name,salary)

    projects (project,budget)

    Works-in (name,project,job)

jobs

name

project

(0,n)

(0,n)

projects

employees

budget

salary

job


Problem

Problem

  • Given a relational schema:

    • How can we determine if there the schema has anomalies by only analyzing the schema?

    • In the case in which the schema has anomalies, how can we transform this schema in a schema which is equivalent and free of anomalies?


Approach

Approach

  • Formalization of some knowledge about the domain through some special integrity constraints

    • Functional dependencies

  • Characterization of the properties of the schemas that do not have anomalies

    • Normal forms

  • Algorithms to translate schemas into normalized schemas

    • There is one such algorithm for each normal form


Functional dependencies

Functional dependencies

  • A functional dependency is an integrity constraint for the relational model

  • In the employees-projects example, each employee has a unique salary

    • In the table, given an employee, each time such employee appears in a tuple, the value of the salary column is the same

  • We can say that the value of the column ‘name’ determines the value of the column ‘salary’, that is:

    • There exists a function that associates with each value of the domain of the column ‘name’ one and only one value from the domain of the column ‘salary’


Functional dependencies1

Functional dependencies

  • let R(A1,A2,....An) be a relational schema, and let X and Y be subsets of {A1,A2,....An}; we say that X functionally determines Y, denoted by X  Y, if for each state r (or extension) of R, it is not possible that two tuples exist in r having equal values for all attributes in X, and different values for the attributes in Y

    (in other words, if t1, t2 r and t1.X=t2.X, then t1.Y=t2.Y)

  • In our example:

    name  salary project  budget

    name project  job


Functional dependencies example

Functional dependencies - Example

  • Consider a database with the following schema:

    Members (Name, Address, Balance)

    Orders (Order_no, Name, Item, Quantity)

    Suppliers(Sname, Saddress, Item, Price)

  • Suppose that each customer has a unique address and balance; suppose that there are no two customers with the same name. Such assumptions can be formalized by the following functional dependency

    Name  Address Balance


Functional dependencies example1

Functional dependencies - Example

  • In the Orders relation we can assume that the order_no determines all other attributes, in that different orders have different order numbers; thus

    Order_no  Name Item Quantity

  • In the Suppliers relation, if we assume that a supplier has a unique address and that a supplier supplies a given item at a unique price, we can observe the following dependencies:

    Sname  Saddress

    Sname Item  Price


Functional dependencies example2

Functional dependencies - Example

  • A question is whether dependencies such as:

    Saddress  Sname or Address  Name

    are valid dependencies

  • The reply depends from the data semantics:

    if it is possible that two different customers have the same address, then the dependency

    Address  Name

    is not valid

    (given an address, we can find two different customers with different names associated with this address)


Functional dependencies example3

Functional dependencies - Example

  • In addition some trivial dependencies hold, such as:

    1) Name  Name

    2) Sname Item  Item

  • such dependencies state that:

    1) if two tuples have the same value for Name, they have the same value for Name

    2) two tuples having the same values for Item and Sname have the same value for Item


Functional dependencies example4

Functional dependencies - Example

  • Other non trivial dependencies can be derived from the dependencies we have already specified

  • Sname Item  Saddress Price

    • This dependency states that two tuples, having the same values for Sname and Item, must have the same values for both attributes Saddress and Price

    • This dependency is obtained by observing that Sname by itself functionally determines Saddress, whereas Sname and Item together functionally determine Price

    • Thus if we fix a value of Sname and a value of Item, we find a single value for Saddress and for Price


Functional dependencies closure

Functional dependencies - Closure

  • Let F be a set of functional dependencies defined over a relational schema R, and let X  Y be a functional dependency, we say that F logically implies X  Y, denoted as F |= X  Y, if each relation r of schema R that verifies the functional dependencies in F, also verifies X  Y

  • Example: {A  B , B  C} |= A  C

  • The closure of a set F of functional dependencies, denoted as F+, is defined as the set of functional dependencies logically implied by F

    F+ = {X  Y | F |= X  Y} 


Functional dependencies closure1

Functional dependencies - Closure

Example

  • Notation: in what follows we will use the compact notation A  B C to denote {A B, A  C}

  • R(A,B,C) and F = {AB, BC}

  • F+ consists of all the functional dependencies XY that verify one of the following three conditions

    • X contains A,

    • X contains B but not A and Y does not contain A,

    • X  Y is the dependency C  C

  • F+ contains for example ABC  BC, AB  BC, A  C, BC  B


Functional dependencies closure2

Functional dependencies - Closure

  • A set of inference rules has been defined that allows the computation of all the dependencies in F+

  • Such rules are sound and complete, that is, they generate all and only the dependencies in F+

  • Such set of rules is known as Armstrong’s axioms (from the name of the researcher that defined it in 1974)


Armstrong axioms

Armstrong axioms

  • Let Dbe a set of attributes and F be a set of functional dependencies defined over the attributes in D, the inference rules are as follows:

    • A1: (reflexivity) If Y  X  D, then X  Y is logically implied by F

      such rule generates trivial dependencies, that is dependencies in which the set of attributes in the right hand side of the dependency is a subset of the set of attributes on the left hand side of the dependency; the use of such rule does not depend from F

    • A2: (additivity ) If X  Y, and Z  D, then XZ  YZ is logically implied by F

    • A3: (transitivity ) If X  Y, and Y  Z, then

      X  Z is logically implied by F


Rules derived from armstrong axioms

Rules derived from Armstrong axioms

  • It is possible to derive additional rules from the Armstrong axioms

    a)Union rule

    {X  Y, X  Z} |= X  YZ

    b) Pseudotransitivity rule

    {X  Y, WY  Z} |= XWZ

    c) Decomposition rule

    If Z  Y, XY |= X  Z


Normalization hussam eldin najib cs department an najah national university

Keys

  • The notion of key, that we have already seen, can be formalized by using the functional dependencies

  • Let R be a relational schema with attributes A1 A2.....An and F be a set of functional dependencies. Let X be a subset of R. X is a key of R if:

    • X  A1 A2.....An is in F+

      that is, the functional dependency of each attribute of R from X is either given or logically follows from the given functionally dependencies

    • There is no Y, proper subset of X, such that

      Y  A1 A2.....An is in F+

      (minimality condition)

  • We use the term superkey to denote a superset of the key


Keys example

Keys - Example

  • Person(SSN, FName, LName, BirthDate)

  • F={SSN  FName LName BirthDate}

    (such dependency specifies that the SSN is unique for each person, that is, no two or more tuples exist that have the same value for SSN and different values for the other attributes)

  • It is easy to see that SSN is a key for Person, where as (SSN, LastName) is not, in that a proper subset exists (SSN) that functionally determines all the attributes


Keys example1

Keys - Example

  • R(A,B,C) and F = {AB, BC}

  • The key is A

    because: AA, AB, AC (for the transitivity rule) or equivalently AABC

  • In addition no proper subset X (not including A) of the attributes of R exists such that

    XABC


Closure of a set of attributes

Closure of a set of attributes

  • Let F be a set of functional dependencies defined on a set D of attributes; let X be a subset of D. The closure of X with respect to F is denoted by X+ and is defined as the set of attributes

    {A / F |= XA}  =

    {A / XA follows from F by application of the Armstrong axioms} 

  • XYF+ if and only if Y X+ with respect to F


Closure of a set of attributes1

Closure of a set of attributes

  • The computation of F+ is very expensive 

    • In the worst case it is exponential in the number of the attributes of the schema

  • Computing the transitive closure with respect to a set X of attributes is less expensive

  • Very often we are not interested in computing F+ but only in verifying if F+ includes a certain dependency


Closure of a set of attributes algorithm

Closure of a set of attributes - algorithm

  • Input: A finite set D of attributes, a set F of functional dependencies, and a set X  D

  • Output: X+, closure of X with respect to F

  • Approach: A sequence of attribute sets X(0), X(1),....... is computed by using the following rules

    • X(0) is X

    • X(i+1) = X(i) È A , where A is a set of attributes such that:

      • A dependency YZ is in F

      • A is in Z

      • Y  X(i)

  • because X=X(0)  ...  X(i)  …  D and D is finite, at the end we obtain an index i such that X(i)=X(i+1)

  • If follows that X(i)=X(i+1)=X(i+2)=....... The algorithm then terminates when X(i)=X(i+1)

  • X+ is X(i), where i is such that X(i)=X(i+1)


Closure of a set of attributes example

Closure of a set of attributes - example

  • Let F be the following set and suppose we would like to compute the (BD)+

    ABCCABCDACDB

    DEGBECCGBDCEAG

  • X(0)=BD

  • X(1): the dependency DEG is used

    X(1)=BDEG 

  • X(2): the dependency BECis used

    X(2)=BCDEG

  • X(3): we consider CA, BCD, CGBD e CEAG

    X(3)=ABCDEG

  • (BD)+=X(3)


Equivalence of sets of functional dependencies

Equivalence of sets of functional dependencies

  • Let F and G be setof functional dependencies, we say that F and G are equivalent if F+=G+

  • To determine the equivalence we proceed as follows:

    • For each dependency YZ in F, we test if YZ is in G+ by using the attribute closure algorithm to compute Y+ and then verifying if Z  Y+

    • If some dependency YZ in F is not in G+, then F+G+

    • We then check that each functional dependency in G is in F+


Minimal sets

Minimal sets

  • A set F of functional dependencies is minimal if:

    • The right hand side of each dependency in Fis a single attribute

    • For no dependency XA in F the set F \ {XA} is equivalent to F

    • For no dependency XA in F and no subset Z of X, the set

      F \ {XA} U {ZA} is equivalent to F

  • Intuitively:

    • condition (2) assures that F does not contain redundant dependencies

    • condition (3) assures that the left hand side of each dependency does not have redundant attributes

    • Because on the right hand side of each rule, there is a single attribute (because of condition (1)), certainly there is no rule having redundancies in the right hand side

  • Each set F of functional dependencies is equivalent to a minimal set F'


Minimal sets1

Minimal sets

  • To determine the minimal set we proceed as follows:

    • We generate F' from F by replacing each dependency XY where Y has the form A1....An with a set of dependencies of the form XA1, ....., XAn

    • We consider each dependency in F' having more that one attribute on the left hand side; if it is possible to eliminate an attribute from the left hand side and still have an equivalent set of functional dependencies, such attribute is eliminated (approach: B in X is redundant with respect to the dependency XA if (X\{B})+ computed with respect to F' contains A)

    • Let F" be the set generated by the previous step. We consider each dependency XA in F" according to some order and if F"\ {XA} is equivalent to F", we remove XA from F" (approach: computer X+ with respect to F"\{XA} and if X+ includes A, then XA is redundant)


Minimal sets2

Minimal sets

  • Note that at step (3) the order according to which the functional dependencies are analyzed may result in different minimal sets:

    ABACBACABC

    it is possible to eliminate BA and AC, or BC but not all three

  • Note that at step(3) the order according to which the attributes are analyzed influences the obtained result

    ABC AB BA

    we can eliminate either A or B from ABC but not both


Minimal sets3

Minimal sets

  • Let F be the following set

    ABCCABCDACDB

    DEGBEC CGBDCEAG

  • By simplifying the right hand sides we obtain

    ABC CA BCD ACDB DE DG

    BEC CGB CGD CEACEG

  • Notice that:

    • CEA is redundant in that it is implied by CA

    • CGB è is redundant in that CG+ = CGADBE

      CGD, CA, and ACDB imply CGB as it can be verified by computing (CG)+

    • In addition ACDB can be replaced by CDB because CA

  • result:

    ABC CA BCD CDB DE DG

    BEC CGD CEG


Minimal sets4

Minimal sets

  • It is possible to obtain an other minimal set by eliminating:

    CEA,CGD and ACDB

  • The resulting minimal set is:

    ABCCABCDDE

    DGBECCGBCEG


Boyce codd normal form

Boyce-Codd normal form

  • A relational schema R with a set F of functional dependencies is in Boyce-Codd normal form (BCNF) if:

    • For each X A in F, such that A is not in X, then X is a superkey of R (that is X is a key or includes the key)

  • In other words: all non trivial dependencies for R are dependencies in which a key functionally determines all the other attributes


Boyce codd normal form1

Boyce-Codd normal form

In our example, the key is (name, project)

  • name, project  function verifies the BCNF,

  • name  salary and

    project  budget

    are partial dependencies that cause anomalies

  • The redundancies and anomalies arise because of dependencies X  Y that introduce repetitions in the relation


Boyce codd normal form2

Boyce-Codd normal form

The schema CSZ with dependencies F={CS Z, Z C} is not in BCNF

Key CS: CS+ = CSZ

Key ZS: ZS+=ZSC

The dependency Z C violates the BCNF:

Z+ = {Z,C} and thus because Z+ does not include S, Z is not a key


Boyce codd normal form3

Boyce-Codd normal form

The schemas

Members (Name, Address, Balance)

Orders (Order_no, Name, Item, Quantity)

with the functional dependencies

Name  Address Balance

Order_no  Name Item Quantity

are in BCNF


Third normal form

Third normal form

  • there exists a normal form weaker than BCNF; such normal form is known as third normal form (3NF)

  • an attribute A of a relational schema is called

    prime if it is an element of some key of R (note: R can have several keys)

  • Example: CSZ with F = {CS Z, Z C}

    all attributes are prime in that in both CS and SZ are keys

  • Example: ABCD with F = {AB C, B D, BC A}

    the keys are AB and BC, thus A,B,C are prime, whereas D is not prime


Third normal form1

Third normal form

  • A relation schema R with dependencies F is in third normal form (3NF) if:

    • For each X A in F, such that A is not in X, then:

      (i) X is a superkey of R (that is X is a key or includes a key)

      or

      (ii) A is prime 

  • The main difference is that the 3NF admits functional dependencies that do not have a key on the left hand side, as long as the right hand side only has prime attributes


Example

Example

  • CSZ with F = {CS Z, Z C}

    • is in 3NF in that all attributes are prime

  • SAIP with F ={ SI P, S A}

    • The only key for this relation is SI

    • The attributes prime are {S,I}

    • The dependency SI P does not violate the 3NF in that it verifies condition (i)

    • The dependency S A violate the 3NF in that S is not a superkey and A is not prime


Example1

Example

  • Schema R = (S, I, D, M)

    S = Store I=Item D=Department M=Manager

  • functional dependencies:

    • SI D each item is sold by at most one department per store

    • SD M each department has only one manager

  • the key is SI: it is determined by computing (SI)+

  • the dependency SD  M violates the 3NF in that SD is not a superkey for R and M is not prime


Decomposition of relational schemas

Decomposition of relational schemas

  • The decomposition of a relational schema

    R={A1, A2,...,An} is the substitution of R with a collection C={R1, R2,...,Rk} of subsets of the set of attributes of R such that R = R1 U R2 U... U Rk

  • The various schemas Ri, i=1,...,k, are not typically disjoint

  • If we have a schema which is not normalized, it is possible to decompose it into normalized subschemas


Decomposition of relational schemas examples

Decomposition of relational schemas - examples

  • A decomposition (in BCNF) of the example schema is

    employees (name,salary)

    projects (project,budget)

    worksin (name,project,function)

  • a decomposition (in BCNF) of

    supplier (sname, saddress, item, price)

    with functional dependencies 

    sname saddress and sname item price

    is

    SA (sname, saddress) and SIP(sname , item, price)


Decomposition of relational schemas1

Decomposition of relational schemas

  • It is important however that decompositions have certain properties

  • wrong decompositions may generate relations that, combined with the join, produce relations with additional data

  • We have thus information loss

  • Consider the following example:

  • Branches (name, project, branch)

  • with dependencies:

  • name branch and

  • project  branch


Decomposition of relational schemas2

Decomposition of relational schemas

The key of Branches is:

name, project

project

branch

name

Red DB Milan

Green WEB Turin

Green SE Turin

White CAD Milan

Black SE Turin

If we decompose according to the two dependencies:


Decomposition of relational schemas3

Decomposition of relational schemas

employees

projects

name

branch

project

branch

Red Milan

DB Milan

Green Turin

WEB Turin

White Milan

SE Turin

Black Turin

CAD Milan

The join on the common attribute:

SELECT E.name, P.project, P.branch

FROM employees E, projects P

WHERE E.branch = P. branch


Decomposition of relational schemas4

Decomposition of relational schemas

project

branch

name

Red DB Milan

It creates

tuples that

did not exist!

Red CAD Milan

Green SE Turin

Green WEB Turin

White DB Milan

White CAD Milan

Black SE Turin

Black WEB Turin


Lossless join decompositions

Lossless join decompositions

  • Let R be a relational schema, R1,R2,.....Rk be a decomposition of R, and D a set of functional dependencies, we say that the decomposition is lossless join(with respect to D) if for each relation r of schema R that satisfies D we have:

    that is r is obtained by the natural join of its projections on the schemas Ri (i=1,..,k)

  • If a relation is decomposed, it is important that it can be re-obtained by executing the natural join of the relations into which the initial relation has been decomposed


Dependency preserving decompositions

Dependency preserving decompositions

  • Another important property of a decomposition of a schema R in r=(R1,...,Rk) is that the set F of functional dependencies defined for R be preserved by the dependencies defined on the schemas R1,...,Rk

  • The reason because dependencies are to be preserved is that they represent some integrity constraints

  • R(ABC) AC C-->B

  • Decomposition R1(AC) R2(AB)

    • R1(AC) AC

    • CB ???


Dependency preserving decompositions1

Dependency preserving decompositions

  • Let F be a set of functional dependencies defined over an attribute set D and let Z  D, the projection of F on Z, denoted by PZ(F ), is the set of dependencies XY in F+ such that XY  Z

    Note that XY must not be necessarily be in F, it is sufficient that be in F+

  • A decomposition rpreserves a set F of functional dependencies, if the union all dependencies PRi(F ) (for i=1,....,k) logically implies all the dependencies in F


Dependency preserving decompositions2

Dependency preserving decompositions

  • Schema R=(CSZ)

    with functional dependencies {CSZ, ZC}

    and decomposition (lossless join) in R1=SZ and R2=CZ

  • the projection of {CSZ, ZC} on SZ contains only the trivial dependencies following from the reflexivity rule (that is, SS and ZZ)

  • the projection on CZ contains the trivial dependencies plus the dependency ZC

  • the dependency CSZ is not preserved in that the trivial dependencies and the ZC dependency do not imply the CSZ dependency


Dependency preserving decompositions3

Dependency preserving decompositions

  • Consider the decomposition R1=SZ and R2=CZ 

    R1SZR2CZ

    s1z1c1z1

    s1z2c1z2

  • Execute the join of R1 and R2

    SZC

    s1z1c1

    s1z2c1

  • This join violates the CSZ dependency in that the obtained relation has two tuples that have the same value of C and S, and different values for Z


Dependency preserving and lossless join decompositions

Dependency preserving and lossless join decompositions

  • There are algorithms (that we will not see) to verify whether a decomposition is lossless join

  • An approach to verify whether a decomposition

    r =(R1,......, Rk) preserves a set of dependencies F is the following (deriving from the definition):

    • compute F+ and project it on all Ri (i=1,...,k);

    • compute the union of the dependencies obtained as projection and check if this set implies all dependencies in F


Dependency preserving and lossless join decompositions1

Dependency preserving and lossless join decompositions

  • Such an approach is expensive because the number of dependencies in F+ is exponential in the dimension of F

  • There is an algorithm (that we do not see) to verify whether a decomposition preserves the functional dependencies with a cost which polynomial in the dimension of F


Decomposition in bcnf and 3nf

Decomposition in BCNF and 3NF

  • Each relational schema has a BCNF decomposition that has the lossless join property

  • Each relational schema has a decomposition 3NF decomposition that is lossless join and preserves the dependencies

  • On the other hand it is possible that, given a relational schema, no lossless join BCNF decomposition exists for this schema which preserves the dependencies

  • Example: R=(CSZ) F = {CSZ, ZC}

    the decomposition in R1=SZ and R2=CZ does not preserve the functional dependency CSZ

  • If no decomposition BCNF exists preserving the dependencies, it is preferable to use the 3NF


An algorithm for lossless join bcnf decomposition

An algorithm for lossless join BCNF decomposition

  • Input: A relational schema R and a set F of functional dependencies

  • Output: A decomposition of R which is lossless join, such that each schema obtained from the decomposition is in BCNF with respect to the projections of F on such schemas

  • Method: A decomposition r for R is iteratively built


Algorithm for lossless join decomposition in bcnf

Algorithm for lossless join decomposition in BCNF

  • At each step r is lossless join with respect to F

  • Initially, r contains only R

  • (*) If S is a schema in r and S is not in BCNF,

    • let XA be a dependency that holds for S, such that X is not a superkey of S and A is not in X

    • S is replaced in r with two schemas S1 and S2 such that

      • S1 consists of the attributes in X and A

      • S2 consists of the attributes of S minus A

    • (the projections of F+ on S1 and on S2 are computed)

  • The step (*) is repeated until each schema in r is in BCNF


Example2

Example

  • R=CTHRSG

    C=course, T=teacher , H=hour, S=student G=grade R=room

  • The following dependencies hold:

    • CTeach course has only one teacher

    • HRConly one course can be given in a given classroom and hour

    • HTRa teacher cannot be in two classrooms at the same hour

    • CSGeach student receives a single grade for each exam

    • HSRa student cannot be in two different classrooms at the same time

  • Such schema has only one key, that is, HS


Example3

Example

  • Consider the dependency CSG that violates the BCNF

  • By applying the algorithm, we decompose R in two schemas

    S1=(CSG) and S2=(CTHRS)

    note: given a dependency XA, S1=X U A S2=R-A

  • The projection of F+ on CSG and CTHRS are computed

    • PCSG(F ) = {CSG}key CS

    • PCTHRS(F ) = {CT, HRC, THR, HSR} key HS

  • CSG is in BCNF


Example4

Example

  • CTHRS is not in BCNF

  • Consider the dependency CT that violates the BCNF and let’s decompose CTHRS in CT and CHRS

    • PCT(F ) = {CT}key C

    • PCHRS(F ) = {CHR, HSR, HRC} key HS

      notice that CHR is required in the projection on CHRS but non in CTHRS in that in CTHRS the dependency is implied by CT and THR

  • CT is in BCNF

  • CHRS is not in BCNF

  • Let’s consider the dependency CHR and let’s decompose CHRS in

    • CHR with {CHR, HRC}, with keys {CH, HR}

    • CHS with {HSC}, with keys SH

  • The obtained decomposition is in BCNF


Decomposition tree

Decomposition tree

CTHRSG

Key = HS

C  T

CS  G

HR  C

HS  R

TH  R

CTHRS

Key = HS

C  T

HR  C

HS  R

TH  R

CSG

Key = CS

CS  G

CHRS

Key = HS

HR  C

HS  R

CH  R

CT

Key = C

C  T

CHR

Key = CH, HR

HR  C

CH  R

CHS

Key = HS

HS  C


Example5

Example

  • The final decomposition is:

    • CSG: contains the grade for each student for each course

    • CT: contains the teacher for each course

    • CHR: contains for each course the hour and classroom

    • CHS: contains for each student the courses and the hours  

  • Notice that different decompositions can be obtained depending on the order according to which the dependencies are considered

  • Example: consider the last step

    • CHRS has been decomposed in CHR and CHS based on the use of the CHR dependency

    • If the HRC dependency had been used, one would have obtained CHR and HRS


Example6

Example

  • The THR dependency is not preserved by the obtained decomposition

  • The union of the projections ofF on CSG, CT, CHR, and CHS is (in minimal form) {CSG, HRC, CT, CHR,HSC} and does notimply THR

  • For example the relation CTHRSG 

    CTHRSG

    c1thr1s1g1

    c2thr2s2g2

    does not verify THR, even though its projections on CSG, CT, CHR, and CHS verify all the dependencies obtained by the projection


Algorithm for lossless join decomposition in 3nf

Algorithm for lossless join decomposition in 3NF

  • Input: A relational schema R and a minimal set F of functional dependencies

  • Output: A lossless join decomposition of R such that each schema is in 3NF with respect to F and each dependency is preserved

  • Method: i=0; S=Æ

    • for each functional dependency XA in F do begin

      • i:=i+1;

      • Ri := XA;

      • S := S È {Ri};

        end

    • Optionally: the relations of schemas (XA1),(XA2),...,(XAn) obtained from the dependencies of the form XA1, XA2, ..., XAn can be replaced from the relation of schema (X A1 A2 ... An)


Algorithm for lossless join decomposition in 3nf1

Algorithm for lossless join decomposition in 3NF

  • if no schema Rk (1 £ k £ i) contains a candidate key then begin

    • i:=i+1;

    • Ri:= any candidate key of R

    • S := S È {Ri};

      end

  • for each Ri (RiÎ S)

    • if exists RkÎ S, i  k, such that schema(Rk) É schema(Ri), then S:=S-{Ri};

  • return (S)


Example decomposition in 3nf

Example: decomposition in 3NF

  • R=CTHRSG

  • F = {CT, HRC, HTR,CSG, HSR}

    key HS

  • First part of the algorithm:

    R1=CTR2=CHRR3=HRT

    R4=CGS R5=HRS 

  • Second part: we check that at least a schema exists that includes the key

  • R5 contains the key; therefore we do not need to create the additional relation


Example7

Example

S = (J,K,L) with F = {JKL, LK} keys JK, JL

  • Decomposition in BCNF

    • S is not in BCNF

    • Consider the dependency LK that violates the BCNF

    • We obtain R1=(LK), R2=(JL)

    • Such schema does not preserve the dependency JKL

  • Decomposition in 3NF

    • the schema S is already in 3NF in that

      • JKL has on the left hand side a key

      • LK has on the right hand side a prime attribute


Example8

Example

Schema:

(branch-name,assets,branch-city,loan-number,customer-name,amount)

  • branch-name  assets

    branch-name  branch-city

    loan-number  amount

    loan-number branch-name

  • The key is (loan-number, customer-name)


Example9

Example

  • The schema R is not in BCNF

    1) consider a dependency that violates the BCNF

    branch-name  assets

    • R is replaced by

      • R1 = (branch-name, assets)

      • R2 =(branch-name, branch-city, loan-number, customer-name, amount)

    • R1 is in BCNF with key branch-name

    • R2 is not in BCNF


Example10

Example

2) consider the dependency

branch-name branch-city

  • R2 is replaced by

    • R3 = (branch-name, branch-city)

    • R4 = (branch-name, loan-number, customer-name, amount)

  • R3 is in BCNF with key branch-name

  • R4 is not in BCNF


Example11

Example

3) consider the dependency

loan-number amount

  • R4 is replaced by

    • R5 = (loan-number, amount)

    • R6 = (branch-name, loan-number, customer-name)

  • R5 is in BCNF with key loan-number

  • R6 is not in BCNF

    4) consider the dependency

    loan-number  branch-name

  • R6 is replaced by

    • R7 = (loan-number, branch-name)

    • R8 = (loan-number, customer-name)

  • Final schema: R1, R3, R5, R7, R8


  • Login