1 / 55

Lecture 3 Functional Dependency and Normal Forms

CS157B. Lecture 3 Functional Dependency and Normal Forms. Prof. Sin-Min Lee Department of Computer Science. Database Design Process. Application 1. Application 2. Application 3. Application 4. External Model. External Model. External Model. External Model. Application 1.

elvin
Download Presentation

Lecture 3 Functional Dependency and Normal Forms

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. CS157B Lecture 3 Functional Dependency and Normal Forms Prof. Sin-Min Lee Department of Computer Science

  2. Database Design Process Application 1 Application 2 Application 3 Application 4 External Model External Model External Model External Model Application 1 Conceptual requirements Application 2 Conceptual requirements Conceptual Model Logical Model Internal Model Application 3 Conceptual requirements Application 4 Conceptual requirements

  3. Relational Database Model Relations Source: ESRI Advanced ArcInfo

  4. Source: ESRI Advanced ArcInfo

  5. Source: ESRI Advanced ArcInfo

  6. Source: ESRI Advanced ArcInfo

  7. Georelational Database Model

  8. Attribute Relationships Functional Dependency: refers to the relationships between attributes within a relation. If the value of attribute A determines the value of attribute B, then attribute B is functionally dependent upon attribute A.

  9. Source: ESRI Advanced ArcInfo

  10.  Functional Dependencies X -> Y means: • X functionally determines Y • Y depends on X • Values of Y component depend on, determined by values of X component

  11. Functional Dependencies Given t1 and t2: • if t1[X] = t2 [X] then t1[Y] = t2 [Y] (1) • In other words if the values of X are equal, then Y value are equal • Values of X component uniquely (functionally) determine values of Y component iff (1)

  12. Data Normalization • Primarily a tool to validate and improve a logical design so that it satisfies certain constraints that avoid unnecessary duplication of data. • The process of decomposing relations with anomalies to produce smaller, well-structured relations. • Primary Objective: Reduce Redundancy,Reduce nulls, • Improve “modify” activities: • insert, • update, • delete, • but not read • Price: degraded query, display, reporting

  13. Normal Forms • First Normal Form (1NF) • Second Normal Form (2NF) • Third Normal Form (3NF) • Boyce-Codd Normal Form (BCNF) • Fourth Normal Form (4NF) • Fifth Normal Form (5NF)

  14. Unnormalized Relations First normal form Functional dependencyof nonkey attributes on the primary key - Atomic values only Second normal form No transitive dependency between nonkey attributes Third normal form Boyce- Codd and Higher Full Functional dependencyof nonkey attributes on the primary key All determinants are candidate keys - Single multivalued dependency Normalization

  15. Unnormalized Relations • First step in normalization is to convert the data into a two-dimensional table • In unnormalized relations data can repeat within a column

  16. Unnormalized Relation

  17. First Normal Form • To move to First Normal Form a relation must contain only atomic values at each row and column. • No repeating groups • A column or set of columns is called a Candidate Key when its values can uniquely identify the row in the relation.

  18. First Normal Form

  19. Second Normal Form • A relation is said to be in Second Normal Form when every nonkey attribute is fully functionally dependent on the primary key. • That is, every nonkey attribute needs the full primary key for unique identification

  20. Second Normal Form

  21. Second Normal Form

  22. Second Normal Form

  23. Third Normal Form • A relation is said to be in Third Normal Form if there is no transitive functional dependency between nonkey attributes • When one nonkey attribute can be determined with one or more nonkey attributes there is said to be a transitive functional dependency. • The side effect column in the Surgery table is determined by the drug administered • Side effect is transitively functionally dependent on drug so Surgery is not 3NF

  24. Third Normal Form

  25. Third Normal Form

  26. Functional Dependency and Keys • Functional Dependency: The value of one attribute (the determinant) determines the value of another attribute. • Candidate Key: Each non-key field is functionally dependent on every candidate key.

  27. Steps in Normalization

  28. Normalization – most used • Four most commonly used normal forms are first (1NF), second (2NF) and third (3NF) normal forms, and Boyce–Codd normal form (BCNF). • Based on functional dependencies among the attributes of a relation. • A relation can be normalized to a specific form to prevent possible occurrence of update anomalies.

  29. First Normal Form • No multi-valued attributes. • Every attribute value is atomic. • Why are the following tables not in 1NF Employee (ssn, Name, Salary, Address, ListOfSkills) Department (Did, Dname, ssn)

  30. Second Normal Form • 1NF and every non-key attribute is fully functionally dependent on the primary key. • Every non-key attribute must be defined by the entire key, not by only part of the key. • No partial functional dependencies. Assuming that we have a composite PK (LicensePlate, OwnerSSN) for the Vechicle Table below, why is the table not in 2NF ? Vehicle (LicensePlate, Brand, Model, PurchasePrice, Year, OwnerSSN, OwnerName)

  31. Third Normal Form & BCNF • 2NF and no transitive dependencies (functional dependency between non-key attributes = BCNF) Why are the following tables not in 3NF or BCNF ? • Why is Employee [ssn, name, salary, did, dname] • Customer

  32. 3NF & BCNF • It is very rare for a Table to be in 3NF and not be in BCNF (violation of BCNF). • Given a Relation R with attributes A, B and C where A and B are together the composite PK, IF A, B -> C and C -> B THEN R is in 3NF and is not in BCNF Example: Student, course -> Instructor Instructor -> Course

  33. Steps in Normalization • 1NF: a table, without multivalued attributes • if not, then decompose • 2NF: 1NF and every non-key attribute is fully functionally dependent on the primary key • if not, then decompose • 3NF: 2NF and no transitive dependencies • if not, then decompose • GENERAL: • Each table should describe a single theme • Modification anomalies are minimized Hint: THE KEY, THE WHOLE KEY AND NOTHING BUT THE KEY

  34. Consider the following scheme from an airline database system: • ( P (pilot) , F (flight# ), D (date), T (scheduled time to depart) ) • We have the following FD's : • F ----> T PDT ----> F FD ----> P • Provide some superkeys: • PDT is a superkey, and FD is a superkey. • Is PDT a candidate key? • PD is not a superkey, nor is DT, nor is PT. • So, PDT is a candidate key. • FD is also a candidate key, since neither F or D are superkeys. EXAMPLE - OBTAIN CANDIDATE KEYS

  35. CLOSURE OF A SET OF FD'S • If F is a set of functional dependencies for a relation R, the set of all functional dependencies that can be derived from F, denoted by F+, is called the CLOSURE of F. • We can use Armstrong's axioms, and the 3 derived rules, to compute the closure of F, F+.

  36. WORKING TO GET THE CLOSURE F+ • GIVEN: scheme (A, B, C, G, H, I) • GIVEN: FD set (A--->B, A--->C, CG--->H, CG--->I, B--->H) • Some members of F+ are • A--->H {Transitivity Rule applied to A--->B and B--->H) • CG--->HI {Union Rule applied to CG--->H and CG--->I} • AG--->I {By Augmentation Rule, AG--->CG; then Transitivity}

  37. THE CLOSURE OF A SET OF ATTRIBUTES • GIVEN: FD set F and a given attribute A (or set of attributes A) • FIND : The set of attributes functionally dependent on A, called the closure of A, and denoted by A+ • IMPORTANT USE FOR THIS: To determine if A is a superkey, we compute A+, the set of attributes functionally dependent on A. If A+ consists of ALL the attributes in the relation, then A is a superkey • HOW DO WE FIND A+? The following algorithm does the trick!

  38. ALGORITHM TO FIND THE CLOSURE OF ATTRIBUTE A, DENOTED BY A+ • result := A; • while { result changes } • for each functional dependency B--->C • begin • if B is contained in result, then result := result U C ' end • endwhile • A+ := result

  39. EXAMPLE TO FIND THE CLOSURE A+ OF AN ATTRIBUTE A • GIVEN: Relation R with attributes W, X, Y, Z and FD's W ---> Z YZ ---> X WZ ---> Y • FIND : WZ+ • PSEUDO TRACE OF THE ALGORITHM: • result := WZ • from first 2 FD's, no change to "result" • from WZ ---> Y, since WZ is contained in result, we • get result := WZY • since YZ is contained in result, we get result := WZYX • Thus, every attribute in R is in WZ+, so WZ is a superkey!

  40. Normalization • Normalization of data - method for analyzing schemas • Unsatisfactory schemas decomposed into smaller ones with desirable properties • Objectives of normalization • good relation schemas disallowing update anomalies

  41. Formal framework • database normalized to any degree (1, 2, 3, 4, 5, etc.) • normalization is not done in isolation • need: • lossless join • dependency preservation • additional normal forms meet other desirable criteria

  42. Normal Forms • 1st, 2nd, 3rd, BCNF consider only FD and key constraints • constraints must not be hard to understand or detect • need not normalize to highest form (e.g. for performance reasons)

  43. 1NF - 1st normal form • part of the formal definition of a relation • disallow multivalued attributes, composite attributes and their combination • In 1NF single (atomic, indivisible) values

  44. Normalize into 1NF? • How to normalize nested relations into 1NF? • Remove nested relation attributes into new relation • propagate PK • combine PK and partial PK • recursively unnest - multilevel nesting • useful in converting hierarchical schemes into 1NF

  45.  Difficulties with 1NF • insert, delete, update •  Determine if describe entity identified by PK? • If not, called non-full FDs • we need full FDs for good inserts, deletes, updates

  46.  Second Normal Form - 2NF • Uses the concepts of FDs, PKs and this definition: • An FD is a Full functional dependency if: given Y -> Z Removal of any attribute from Y means the FD does not hold any more

  47. 2NF • A relation schema R is in 2NF if: • Relation is in 1NF • Every non-prime attribute A in R is fully functionally dependent on the primary key Prime attribute - attribute that is a member of the primary key K • R can be decomposed into 2NF relations via the process of 2NF normalization • Remove partial dependencies • create new relations where partials are full

  48. Simplifying Functional Dependencies through Normalization Normalization: the identification of functional dependencies and the modifications required to structurally change the database to remove undesirable dependencies

  49. Source: ESRI Advanced ArcInfo

  50. Source: ESRI Advanced ArcInfo

More Related