1 / 33

Lecture 09: Functional Dependencies, Database Design

Lecture 09: Functional Dependencies, Database Design. Monday, January 24, 2005. Outline. Functional dependencies (3.4) Rules about FDs (3.5) Design of a Relational schema (3.6). Functional Dependencies. Definition: A 1 , ..., A m  B 1 , ..., B n holds in R if:

johnda
Download Presentation

Lecture 09: Functional Dependencies, Database Design

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Lecture 09:Functional Dependencies,Database Design Monday, January 24, 2005

  2. Outline • Functional dependencies (3.4) • Rules about FDs (3.5) • Design of a Relational schema (3.6)

  3. Functional Dependencies Definition: A1, ..., Am  B1, ..., Bn holds in R if: t, t’  R, (t.A1=t’.A1  ...  t.Am=t’.Am  t.B1=t’.B1  ...  t.Bn=t’.Bn ) R t if t, t’ agree here then t, t’ agree here t’

  4. In Class: Find All FDs

  5. Answer Course  Dept, Room Dept, Room  Course Student, Dept  Course, Room Student, Course  Dept, Room Student, Room  Dept, Course Do all FDsmake sensein practice ?

  6. Review: Closure Example A, B  C A, D  E B  D A, F  B R(A,B,C,D,E,F) Compute {A,B}+ X = {A, B, } Compute {A, F}+ X = {A, F, }

  7. Review: Inference Example: A, B  CA, D  B B  D Step 1: Compute X+, for every X: A+ = A, B+ = BD, C+ = C, D+ = D AB+ = ABCD, AC+ = AC, AD+ = ABCD ABC+ = ABD+ = ACD+ = ABCD (no need to compute– why ?) BCD+ = BCD, ABCD+ = ABCD Step 2: Enumerate all FD’s X  Y, s.t. Y X+ and XY = : AB  CD, ADBC, ABC  D, ABD  C, ACD  B

  8. Keys • A superkey is a set of attributes A1, ..., An s.t. for any other attribute B, we have A1, ..., An B • In other words: a set of attributes X is a superkey if X+ = all attributes • A key is a minimal superkey • I.e. set of attributes which is a superkey and for which no subset is a superkey

  9. Computing (Super)Keys • Compute X+ for all sets X • If X+ = all attributes, then X is a key • List only the minimal X’s Note: there can be many keys ! • Example: R(A,B,C), ABC, BCAKeys: AB and BC

  10. Examples of Keys • Product(name, price, category, color) name, category  price category  color Key is: {name, category} ; superkeys are all supersets • Enrollment(student, address, course, room, time) student  address room, time  course student, course  room, time Keys are: [in class]

  11. Person name ssn address FD’s for E/R Diagrams Given a relation constructed from an E/R diagram, what is its key? Rule 1: If the relation comes from an entity set, the key of the relation is the set of attributes which is the key of the entity set. Person(address, name, ssn)

  12. FD’s for E/R Diagrams Rule 2: If the relation comes from a many-many relationship, the key of the relation is the set of all attribute keys in the relations corresponding to the entity sets name buys Person Product price name ssn date buys(name, ssn, date)

  13. Product Purchase Store CreditCard Person FD’s for E/R Diagrams Except: if there is an arrow from the relationship to E, then we don’t need the key of E as part of the relation key. sname name card-no ssn Purchase(name , sname, ssn, card-no)

  14. FD’s for E/R Diagrams More rules: • Many-one, one-many, one-one relationships • Multi-way relationships • Weak entity sets (Try to find them yourself, or check book)

  15. FD’s for E/R Diagrams Say: “the CreditCard determines the Person” Product sname Purchase name Store Incomplete(what doesit say ?) card-no CreditCard Person ssn Purchase(name , sname, ssn, card-no) card-no  ssn

  16. Relational Schema Design(or Logical Design) Main idea: • Start with some relational schema • Find out its FD’s • Use them to design a better relational schema

  17. Data Anomalies When a database is poorly designed we get anomalies: Redundancy: data is repeated Updated anomalies: need to change in several places Delete anomalies: may lose data when we don’t want

  18. Relational Schema Design Recall set attributes (persons with several phones): SSN Name, City but not SSN  PhoneNumber • Anomalies: • Redundancy = repeat data • Update anomalies = Fred moves to “Bellevue” • Deletion anomalies = Joe deletes his phone number: what is his city ?

  19. Relation Decomposition Break the relation into two: • Anomalies have gone: • No more repeated data • Easy to move Fred to “Bellevue” (how ?) • Easy to delete all Joe’s phone number (how ?)

  20. name buys Person Product price name ssn Relational Schema Design Conceptual Model: Relational Model: plus FD’s Normalization: Eliminates anomalies

  21. Decompositions in General R(A1, ..., An, B1, ..., Bm, C1, ..., Cp) R1(A1, ..., An, B1, ..., Bm) R2(A1, ..., An, C1, ..., Cp) R1 = projection of R on A1, ..., An, B1, ..., Bm R2 = projection of R on A1, ..., An, C1, ..., Cp

  22. Decomposition • Sometimes it is correct: Lossless decomposition

  23. Incorrect Decomposition • Sometimes it is not: What’sincorrect ?? Lossy decomposition

  24. Decompositions in General R(A1, ..., An, B1, ..., Bm, C1, ..., Cp) R1(A1, ..., An, B1, ..., Bm) R2(A1, ..., An, C1, ..., Cp) If A1, ..., An B1, ..., Bm Then the decomposition is lossless Note: don’t need A1, ..., An C1, ..., Cp Example: name price, hence the first decomposition is lossless

  25. Normal Forms First Normal Form = all attributes are atomic Second Normal Form (2NF) = old and obsolete Third Normal Form (3NF) = will discuss Boyce Codd Normal Form (BCNF) = will discuss Others...

  26. Boyce-Codd Normal Form A simple condition for removing anomalies from relations: A relation R is in BCNF if: If A1, ..., An B is a non-trivial dependency in R , then {A1, ..., An} is a superkey for R In English (though a bit vague): Whenever a set of attributes of R is determining another attribute, should determine all the attributes of R.

  27. BCNF Decomposition Algorithm Repeat choose A1, …, Am B1, …, Bn that violates the BNCF condition split R into R1(A1, …, Am, B1, …, Bn) and R2(A1, …, Am, [others]) continue with both R1 and R2Until no more violations Is there a 2-attribute relation that is not in BCNF ? B’s A’s Others In practice, we havea better algorithm (next): R1 R2

  28. Example What are the dependencies? SSN  Name, City What are the keys? {SSN, PhoneNumber} Is it in BCNF?

  29. Decompose it into BCNF SSN  Name, City • Let’s check anomalies: • Redundancy ? • Update ? • Delete ?

  30. Example Decomposition Person(name, SSN, age, hairColor, phoneNumber) SSN  name, age age  hairColor Decompose in BCNF (in class):

  31. BCNF Decomposition Algorithm BCNF_Decompose(R) find X s.t.: X ≠X+ ≠ [all attributes] if (not found) then “R is in BCNF” let Y = X+ - Xlet Z = [all attributes] - X+ decompose R into R1(X  Y) and R2(X  Z) continue to decompose R1 and R2

  32. Find X s.t.: X ≠X+ ≠ [all attributes] Example BCNF Decomposition Person(name, SSN, age, hairColor, phoneNumber) SSN  name, age age  hairColor • Iteration 1: Person • SSN+ = SSN, name, age, hairColor Decompose into: P(SSN, name, age, hairColor) Phone(SSN, phoneNumber) Iteration 2: P age+ = age, hairColor Decompose: People(SSN, name, age) Hair(age, hairColor) Phone(SSN, phoneNumber) What isthe key ?

  33. Example • R(A,B,C,D) A  B, B  C • In R: A+ = ABC ≠ ABCD • Split into R1(A,B,C), R2(A,D) • In R1: B+ = BC ≠ ABC • Split into R11(B,C), R12(A,B) • Final decomposition: R11(B,C), R12(A,B), R2(A,D) • What happens if in R we first pick B+ ? Or AB+ ? What arethe keys ?

More Related