Chapter 6 Relational Database Design

Chapter 6 Relational Database Design 6.1 Pitfalls in Relational-Database Design 6.2 Decomposition 6.3 Normalization Using Functional Dependence 6.4 BCNF decomposition 6.5 3NF decomposition 6.6 Fourth Normal Form 6.7 Exercise

Chapter 6 Relational Database Design In general, the goal of a relational-database design is to generate a set of relation schemas that allows us to store information withoutunnecessary redundancy, yet also allows us to retrieve information easily. One approach is to design schemas that are in an appropriate normal form.

6.1 Pitfalls in Relational-Database Design Among the undesirable properties that a bad design may have are. ① Repetition of information ② Inability to represent certain information Example: The information concerning loans is now kept in one single relation, lending, which is defined over the relation schema. Lending-schema=(branch-name, branch-city, assets, customer-name, loan-number, amount)

6.1 Pitfalls in Relational-Database Design Lending Figure A

6.1 Pitfalls in Relational-Database Design Problems: ① add a new loan The loan is made by the Perryridge branch to Adams in the amount of $1500. Let the loan-number be L-31. We add the tuple (Perryridge, Horseneck, 1700000,Adams, L-31, 1500) ⅰRepeating information wastes space. ⅱRepeating information complicates updating the database.

6.1 Pitfalls in Relational-Database Design ② we cannot represent directly the information concerning a branch.(branch-name, branch-city,assets) unless there exists at least one loan at the branch. The problem is that tuples in the lending relation require values for loan-number, amount and customer-name. Solution: introduce null values to handle updates through views.(difficult)

6.2 Decomposition A bad design suggests that we should decompose a relation schema that has many attributes into several schemas with fever attributes. Careless decomposition, however, many lead to another form of bad design.

6.2 Decomposition Example: Consider an alternative design in which Lending-schema is decomposed into the following two schemas: Branch-customer-schema=(branch-name, branch-city, assets,customer-name) Customer-loan-schema=(customer-name,loan-number, amount)

6.2 Decomposition Branch-customer B Customer-loan C

6.2 Decomposition Suppose that we wish to find all branches that have loans with amount less than $1000. We should write. Figure D

∏branch-name(amount<1000(branch-customer customer-loan)) 6.2 Decomposition Problems: Additional tuples: (Downtown, Brooklyn, 9000000,Jones, L-93, 500) (Mianus, Horsereck, 400000,Jones, L-17, 1000) Consider the query, “Find all branches that have made a loan in an amount less than $1000”. Figure A Mianus Round Hill Figure D Mianus Round HillDowntown

If a customer happens to have several loans from different branches, we cannot tell which loan belongs to which branch. Although we have more tuples in branch-customer customer-loan, we actually have less information. We are no longer able, in general, to represent in the database information about which customers are borrowers form which branch. Loss information: 6.2 Decomposition

6.2 Decomposition Lossy decomposition: Because of this loss of information, we call the decomposition of Lending-schema into Branch-customer-schema and customer-loan-schema a lossy decomposition, or a lossy-join decomposition.

r=∏R1(r) ∏R2(r) …… ∏Rn(r) 6.2 Decomposition Lossless-join decomposition: A decomposition that is not a lossy-join decomposition is a lossless-join decomposition. Let C represent a set of constrains on the database. A decomposition {R1,R2……Rn} of a relation schema R is a lossless-join decomposition for R if for all relation r on schema R that are legal under C.

6.3 Normalization Using Functional Dependence Lossless-join Decomposition: We must first present a criterion for determining whether a decomposition is lossy. Let R be a relation schema, and let F be a set of functional dependencies on R. This decomposition is a lossless-join decomposition of R if at least one of the following functional dependencies are in F+. ① R1∩R2→R1 ② R1∩R2→R2

6.3 Normalization Using Functional Dependence Example: We now show that our decomposition of Lending-schema is a lossless-join decomposition by showing a sequence of steps that generate the decomposition ① Branch-schema=(branch-name,branch-city, assets) Loan-info-schema=(branch-name,customer-name,loan-number, amount) Branch-schema∩Loan-info-schema={branch-name} branch-name→{branch-name,branch-city,assets}

6.3 Normalization Using Functional Dependence ② Loan-info-schema Loan-schema=(branch-name,loan-number, amount) Borrower-schema=(customer-name, loan-number) Loan-schema∩Borrower-schema={loan-number} loan-number→{loan-number,branch-name,amount} This step results is a lossless-join decomposition.

6.3 Normalization Using Functional Dependence F={A→B,C→F,E→A,CE→D} Example: Suppose that we decompose the schema R=(A,B,C,D,E,F) into R1={CF,BE,ECD,AB}; R2={ABE,CDEF}. Determine which decomposition is a lossless-join decomposition if the following set F of functional dependencies holds. {ABE }∩{ CDEF}= {E} E→A A→B E→B E→A E→B E→AB E→ABE R2 is a lossless-join decomposition

6.3 Normalization Using Functional Dependence Another Algorithm: F={A→B,C→F,E→A,CE→D} a2 a6 b21 Lossy-join decomposition

6.3 Normalization Using Functional Dependence F={A→B,C→F,E→A,CE→D} a1 a2 Lossless-join decomposition

6.3 Normalization Using Functional Dependence Dependency Preservation: There is another goal in relational-database design: dependency preservation. Let F be a set of functional dependencies on a schema R, and let R1,R2……Rn be a decomposition of R. The restriction of F to Ri is the set Fi of all functional dependencies in F+ that include only attributes of Ri.

6.3 Normalization Using Functional Dependence Let F’=F1∪ F2∪…… ∪ Fn. F’ is a set of functional dependencies on schema R. if F’+= F+ is true, then every dependency in F is logically implied by F’, and, if we verify that F’ is satisfied, we have verified that F is satisfied. We say that a decomposition having the property F’+= F+ is a dependency-preserving decomposition.

6.3 Normalization Using Functional Dependence Normal Form: The process of further normalization—here in after abbreviated to just normalization—is built around the concept of normal forms. A relvar is said to be in a particular normal form if it satisfies a certain prescribed set of conditions.

City Customer-name Customer-city Customer-street Jones Brooklyn Ray Hayes Palo Alto Heroes 6.3 Normalization Using Functional Dependence ① First normal form: A relvar is in 1NF if and only if, in every legal value of that relvar, every tuple contains exactly one value for each attribute.

6.3 Normalization Using Functional Dependence ②Second normal form (definition assuming only one candidate key, which we assume is the primary key): A relvar is in 2NFif and only if it is in 1NF and every nonkey attribute is irreducibly dependent on the primary key. Example: Relation R=(BNO,Title, Author, Price,Lo-No, Card, Name, Dept,Date) F={BNO→Title, Author, Price, Lo-No Card →Name, Dept BNO,Card →Date}

6.3 Normalization Using Functional Dependence Candidate key {BNO,Card} {BNO,Card} →Title BNO →Title∈F So, it not in 2NF Redundancy

6.3 Normalization Using Functional Dependence ③Third normal form(definition assuming only one candidate key, which we further assume is the primary key): A relvar is in 3NFif and only if it is in 2NF and every nonkey attribute is nontransitively dependent on the primary key. Note: “No transitive dependences” implies no mutual dependencies.

SNO →CLASS CLASS →MASTER 6.3 Normalization Using Functional Dependence Example: Relvar student=(SNO, SNAME, SAGE,MASTER,CLASS, SEX) F={SNO→SNAME, SNO →SAGE, SNO →SEX,SNO →CLASS,CLASS →MASTER} Candidate key {SNO} MASTER transitively dependent on SNO.

6.3 Normalization Using Functional Dependence ④ Boyce/codd normal form(BCNF): A relvar is in BCNF if and only if every nontrivial, left-irreducible FD has a candidate key as its determinant. A relation schema R is in BCNF with respect to a set F of functional dependencies if for all functional dependencies in F+ of the form →, where R and R, at least one of the following holds: ⅰ → is a trivial functional dependency (that is  ) ⅱ  is a superkey for schema R

6.3 Normalization Using Functional Dependence Examples: ①customer-schema=(customer-name,customer- street, customer-city) customer-name→{customer-street,customer-city} customer-name is a candidate key customer-schema is in BCNF ② branch-schema=(branch-name,assets,branch-city) branch-name→ {assets,branch-city} branch-schema is a candidate key branch-schema is in BCNF

6.3 Normalization Using Functional Dependence ③loan-info-schema=(branch-name, customer-name,loan-number, amount) loan-number→{amount,branch-name} loan-number is not a candidate key (Downtown,John Bell,L-44,1000) (Downtown,Jane Bell,L-44,1000) loan-number → {amount,branch-name}is not a trivial dependency. loan-info-schema is not in BCNF

6.4 BCNF decomposition loan-info-schema=(branch-name, customer-name,loan-number, amount) loan-info-schema is not in BCNF. Redundancy exist, so we should decompose it. loan-schema=(loan-number, branch-name, amount) borrower-schema=(customer-name,loan-number) loan-number→{amount,branch-name} loan-number is a candidate key of loan-schema loan-schema and borrower-schema are both in BCNF

6.4 BCNF decomposition If R is not in BCNF, we can decompose R into a collection of BCNF schemas R1,R2……Rn. Which not only a BCNF decomposition, but also a lossless-join decomposition.

6.4 BCNF decomposition Algorithm: result:={R}; done:=false; compute F+; while (not done) do if (there is a schema Ri in result that is not in BCNF) then begin let   be a nontrivial functional dependencythat holds on Ri such that  Ri is not in F+ ,and ∩ =; result:=(result-Ri) ∪(Ri- )∪( ,); end else done:=true;

branch-name→{branch-city,assets} F= loan-number→{amount,branch-name} 6.4 BCNF decomposition Example: Lending-schema={branch-name,branch-city, assets,customer-name,loan-number,amount} A candidate key for this schema is: {loan-number,customer-name} BCNF decomposition : branch-name is not a superkey. Thus ,Lending-schema is not in BCNF, so decompose it.

6.4 BCNF decomposition branch-name→{branch-city,assets} {branch-name,branch-city, assets,customer-name,loan-number,amount}{branch-name}+ ① branch-name→ {branch-name,branch-city, assets,customer-name,loan-number,amount} is not in F+ ②branch-name∩{branch-city,assets}=  R1=branch-name∪{branch-cityassets} R2={branch-name,branch-city, assets,customer-name,loan-number,amount}-{branch-city,assets}

loan-number→{amount,branch-name} F’= 6.4 BCNF decomposition Branch-schema={branch-name,branch-city, assets} Loan-info-schema={branch-name,customer-name,loan-number,amount} branch-name is a key for Branch-schema, Thus ,Branch-schema is in BCNF. loan-number is not a key for Loan-info-schema. Thus,Loan-info-schema is not in BCNF Decompose Loan-info-schema

6.4 BCNF decomposition R3= {branch-name,customer-name,loan-number,amount}-{amount,branch-name} loan-number→{amount,branch-name} {branch-name,customer-name,loan-number,amount}{loan-number}+ ① branch-name→{branch-name,customer-name,loan-number,amount} is not in F’+ ②loan-number∩{amount,branch-name}=  R2= loan-number ∪{amount,branch-name}

6.4 BCNF decomposition Loan-schema={branch-name,loan-number,amount} Borrower-schema={customer-name,loan-number} Branch-schema and Borrower-schema are in BCNF. Lending-schema=(branch-name,branch-city, assets,customer-name,loan-number,amount) Branch-schema={branch-name,branch-city, assets} Loan-schema={branch-name,loan-number,amount} Borrower-schema={customer-name,loan-number}

6.4 BCNF decomposition Algorithm: Input: a schema R and R satisfied the FDs F. Output: R is a lossless-join decomposition which satisfies the FDs F and for each subschemas Ri is a BCNF decomposition which satisfies Fi=∏Ri(F)

6.4 BCNF decomposition ① initialize the result closure  to {R}, ={R} ②if all subschemas in  are BCNF then turn to ④. Let us find a subschemas S in  that is not in BCNF. According to the definition of BCNF, there must has a FD X→A and A-X=, X do not include any candidate key of S, S-X-A≠ . ③ Let S={S1,S2}, S1 is XA, S2 is S-A∪(A∩X), replace the S with {S1,S2} turn to ② ④ Stop decomposing and output 

6.4 BCNF decomposition Compute candidate key: If x+=u and y+is a proper subset of x+ (y+ ≠u). then the attributes in R can be classified into four groups: L: the attributes appear only on the left-hand side of FD R: the attributes appear only on the right-hand side of FD LR: the attributes appear both on the left-hand side and left-hand side of FD N: the attributes do not appear on the FD

6.4 BCNF decomposition ③ L+N unknown: ① the classification of R arenot candidate key ② the classification of(L+N) must appear in candidate key ⅰ L+N =x1 x1+=u then haveonly one candidate key ⅱ L+N =x1 x1+≠u find an attribute in R and add it to the x1.repeat!

6.4 BCNF decomposition Example: Relvar R=(O,I,S,Q,B,D) satisfies the following FDs. F={S→D, I →B, IS →Q, B →O} Give a lossless-join decomposition into BCNF of the schema R L: S, I R: D, Q, O LR: B (SI)+=SIDBQO=Uhave only one candidate key. S→D ISS not in BCNF S→D D-S≠  S do not include IS Decompose: R1=SD R2=OISQB

6.4 BCNF decomposition FR2={I→B, IS →Q, B →O} L: S, I R: Q, O LR: B (SI)+=SIBOQ=Uhave only one candidate key. I→B ISI B-I≠  not in BCNF Decompose:R21=IB R22=OISQ FR21={I→B} FR22={IS →Q} =(R1, R21, R22)

6.4 BCNF decomposition Example: Relvar R=(A,B,C,D) satisfies the following FDs. F={A→D, C→D, D →B} Give a lossless-join decomposition into BCNF of the schema R L: A,C R: B LR: D (AC)+=ACDB=Uhave only one candidate key. Decompose: R1=AD R2=ACB FR2={C→B, A →B} ……

banker-name→{branch-name} F= {branch-name,customer-name}→banker-name 6.4 BCNF decomposition Not every BCNF decomposition is dependency preserving. Example: Banker-schema={branch-name,customer-name,banker-name} banker-name is not a key for Banker-schema. Thus, Banker-schema is not in BCNF. Decompose it.

6.4 BCNF decomposition Banker-branch-schema={banker-name,branch-name} Customer-banker-schema={customer-name,banker-name} F1={banker-name→branch-name} F2=  {branch-name,customer-name}→banker-name in F+ {branch-name,customer-name}→banker-name not in {F1∪F2}+ {F1∪F2}+<>F+ and the decomposition is not dependency preserving

6.4 BCNF decomposition ③ Dependency preservation This example demonstrates that not every BCNF decomposition is dependency preserving. Moreover, it demonstrates that we cannot always satisfy all three design goals: ① BCNF ② Lossless join

6.5 3NF decomposition A relation schema R is in 3NF with respect to a set F of functional dependencies if, for all functional dependencies in F+ of the form →, where R and R, at least one of the following holds: ①→ is a trivial functional dependency (that is  ) ② is a superkey for schema R ③ each attribute A in - is contained in a candidate key for R.

Chapter 6 Relational Database Design

Chapter 6 Relational Database Design

Presentation Transcript

Chapter 8: Relational Database Design

Chapter 7: Relational Database Design

Chapter 7: Relational Database Design

Chapter 8: Relational Database Design

Chapter 7: Relational Database Design

Chapter 7: Relational Database Design

Chapter 7: Relational Database Design

Chapter 7: Relational Database Design

Chapter 7: Relational Database Design

Chapter 7: Relational Database Design

Chapter 7: Relational Database Design

Chapter 7: Relational Database Design

Chapter 8: Relational Database Design

Chapter 7: Relational Database Design

Chapter 8: Relational Database Design

Chapter 7: Relational Database Design

Chapter 7: Relational Database Design

Chapter 6: Relational Database Design

Chapter Four Database Design (Relational)

Chapter 7: Relational Database Design

Chapter 7: Relational Database Design