1 / 30

Database

Database. Normalization. Designing Good Schemas. We know how to create schemas, but ... how do we create good schemas? what does good mean? Schema quality measurements: semantics of the attributes minimal redundancy minimal frequency of null values. Functional Dependences.

Download Presentation

Database

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Database Normalization

  2. Designing Good Schemas • We know how to create schemas, but ... • how do we create good schemas? • what does good mean? • Schema quality measurements: • semantics of the attributes • minimal redundancy • minimal frequency of null values

  3. Functional Dependences • A column Y of relational table R is functionally dependent up on column X of relational table R if and only if: Each value of X in R associated with each value of Y at any given time

  4. Functional dependences • Y is functional dependent up on X same as values of X identify values of Y • If X  Y then XZYZ • IF XY and Y  Z then XZ • X Y means that Y depend on X or X identify Y

  5. Examples • S#  Ename • {S#, P#}  Hours • If for each value of S#, there are exactly one corresponding value for sname, state, city then: S# Sname Sate City

  6. Example • If {S#, p#}  Qty S# P# QTY

  7. Redundancy Example • Where’s the redundancy?

  8. Redundancy Example

  9. Example FDs Proper FDs Transitive FDs Partial Key FD Partial Key FDs

  10. Normal Forms • Each normal form is a set of conditions on a schema that guarantees certain properties (relating to redundancy and update anomalies) • The two commonly used normal forms are third normal form (3NF) and Boyce-Codd normal form (BCNF)

  11. remove multi-valued attributes remove partial dependencies remove transitive dependencies 1NF 2NF 3NF remove remaining FD anomal dependencies remove multivalue dependencies remove remaining anomalies BCNF 4NF 5NF Normalization 0NF

  12. 1 NF • First normal form is • NO multi-valued attributes • No composite attribute • No nested relation We create new table or new field (telephone, visiting)

  13. 1NF Normalization Proper translation from ER multi-value attributes will achieve 1NF. Still not a good solution,since we have redundancy in Dnumber and Dmgr_ssn.(This will be handled by 2NF.)

  14. 2 NF form • Second normal form that if primary key is multiple attribute and non-key attribute depend on part of primary key Hours Cname Loc S# P# pname

  15. 2NF Normalization Move the partial key and dependent attributes to a new relation.

  16. Transitive Dependencies • X → Y is a transitive dependency (PD) if there exists Z ⊈ any key such that X → Z → Y • TDs can cause redundancy if there are multiple values of X that determine the same value of Z • the value of Y for that value of Z is stored multiple times • 3NF normalization: move (Z,Y) to new relation in which Z is the primary key

  17. 3 NF • The relation in 3NF if it is 2 NF and every non-key attribute is non-transitively dependent on primary key

  18. 3NF Normalization • Create new relation to hold the attributes in the transitive FD. • LHS of transitive FD becomes PK of new relation.

  19. DEPT COURSE SECTION ROOM INSTR I_OFFICE DEPT COURSE SECTION ROOM INSTR I_OFFICE COMP 51 1 WPC122 DOHERTY CSB109 COMP 51 2 WPC219 CLIBURN CSB107 COMP 163 1 WPC122 DOHERTY CSB109 COMP 53 1 WPC130 CSB108 BOWRING COMP 53 2 WPC130 CSB104 CARMAN Transitive Dependency Example I_OFFICE (instructor's office) is determined by the non-PK attribute INSTR

  20. DEPT COURSE SECTION ROOM INSTR I_OFFICE DEPT COURSE SECTION ROOM INSTR INSTR I_OFFICE NF Decomposition: Foreign Keys Decomposition:

  21. Normalization Goal = BCNF = Boyce-Codd Normal Form = all FD’s follow from the fact “key  everything.” • Formally, R is in BCNF if for every nontrivial FD for R, say XA, then X is a superkey. • “Nontrivial” = right-side attribute not in left side. Why? 1. Guarantees no redundancy due to FD’s. 2. Guarantees no update anomalies = one occurrence of a fact is updated, not all. 3. Guarantees no deletion anomalies = valid fact is lost when tuple is deleted. Arthur Keller – CS 180

  22. Boyce-Codd Normal Form • Sample data for Course Section table • Because Prefix  Department, we know that (Prefix, Num, SecNum) could also be a primary key for this table.

  23. Example Students(name, addr, phones, CarLiked) • A student’s phones are independent of the cars they like. • Thus, each of a student’s phones appears with each of the cars they like in all combinations. • This repetition is unlike redundancy due to FD’s, of which name->addr is the only one.

  24. Example • Students(name, addr, CarLiked, manf, favCar) • FD’s: name->addr favCar, carsLiked->manf • Only key is {name, CarsLiked}. • In each FD, the left side is not a superkey. • Any one of these FD’s shows Students is not in BCNF

  25. Boyce-Codd Normal Form • We say a relation R is in BCNF if whenever X ->A is a nontrivial FD that holds in R, X is a superkey. • Remember: nontrivial means A is not a member of set X. • Remember, a superkey is any superset of a key (not necessarily a proper superset).

  26. Example • Students(name, addr, CarsLiked, manf, favCar) • F = name->addr, name -> favCar, CarsLiked->manf • Pick BCNF violation name->addr. • Close the left side: {name}+ = {name, addr, favCar}. • Decomposed relations: • Students1(name, addr, favCar) • Students2(name, CarsLiked, manf)

  27. 3NF and BCNF • 3rd Normal Form (3NF) modifies the BCNF condition so we do not have to decompose in this problem situation. • X ->A violates 3NF if and only if X is not a superkey, and also A is not prime.

  28. SHIPMENT SID FROM_CITY TO_CITY DISTANCE WEIGHT Exercises • The following relation schema is not in third normal form (3NF). Is this an example of a transitive dependencyor a partial key dependency?Give an equivalent schema that is in 3NF.

  29. Exercises • This relation has been proposed to track Pacific alumni: Alumni( SID, LastName, FirstName, Degree, YearAwarded, Phone).Pacific allows students to receive multiple degrees,possibly in different years. Identify all FDs.Give a new schema that is in third normal form.

  30. Exercises • Consider the following relation schema:Movie(title, genre, length, actor, sag_id, studio, studio_addr) • Every movie has a unique title. • A movie may have multiple actors. • Each actor has a unique sag_id. • An actor may appear in multiple movies. • A movie has exactly one studio, but a studio may produce more than one movie. • Each studio has exactly one address. • Identify all functional dependencies. • Normalize the schema to 3NF.

More Related