1 / 11

Normalization: Forms and Avoiding Redundancies

Learn about the different normal forms in database normalization and how they help in avoiding/minimizing redundancies. Understand the role of functional dependencies in detecting redundancy. Explore examples and solutions for achieving 1NF, 2NF, 3NF, and Boyce-Codd Normal Form (BCNF). Discover the potential problems with decompositions and the trade-off between redundancy and query performance.

laurabell
Download Presentation

Normalization: Forms and Avoiding Redundancies

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Normalization

  2. Normal Forms • If a relation is in a certain normal form(BCNF, 3NF etc.), it is known that certain kinds of redundancies are avoided/minimized. This can be used to help us decide whether decomposing the relation will help. • Role of FDs in detecting redundancy: • Consider a relation R with 3 attributes, ABC. • Given A  BC: There is no redundancy here. • Add B C: Several tuples could have the same B value, and if so, they’ll all have the same C value!

  3. First Normal Form (1NF) • 1NF: Each field is an atomic (scalar) value • All relations are in first normal form (1NF) • Some definitions we’ll need to understand the other normal forms: • A prime attribute is an attribute that is an element of one or more CKs for a relation. • A trivial FD is one in which the RHS is contained in the LHS (e.g., AB  A or B  B). We will ignore such FDs in this lecture.

  4. Second Normal Form (2NF) • Relation R is in 2NF if, for all FDs X  A • A is prime, or • X is not a proper subset of any CK. • Example: R = (part, wh, qty, wh-addr) • FDs: {part, wh}  qty wh  wh-addr • Is R in 2NF?

  5. 2NF (continued) • No, it’s in 1NF only. Only candidate key is {part, wh}, and second FD violates 2NF. • So what? • We’re storing wh-addr once for each part stored at that warehouse. • Solution: R1 ( ) and R2 ( ) • Are R1, R2 in 2NF?

  6. Third Normal Form (3NF) • Relation R is in 3NF if, for all FDs X A • X is a superkey for R, or • A is prime • Example: ED = (enum, ename, sal, dnum, dname, mgr) • FDs: enum  ename, sal, dnum dnum  dname, mgr • Is ED in 2NF? 3NF? • The problem: stored for each employee in a department

  7. 3NF (continued) • One solution: • Emp (enum, ename, sal, mgr) • Dept (dnum, dname, mgr) • Removes the redundancy • A problem remains: • This is an example of a lossy-join decomposition, which is avoidable in 3NF. • Use Emp1 (enum, ename, sal, ) instead of Emp.

  8. Boyce-Codd Normal Form (BCNF) • Relation R is in BCNF if, for all FDs X A • X is a superkey • Example: R = (branch, cust, banker) • FDs: banker  branch cust, branch  banker • Is R in 3NF? BCNF? • Problems: • Bankers must have at least one customer • Branch stored redundantly for each of a banker’s customers

  9. BCNF (continued) • Decomposition: R1 (banker, branch) and R2 (cust, banker) • avoids redundancy and other problems • A problem remains: • In this case, no dependency-preserving decomposition into BCNF is possible.

  10. Problems with Decompositions • There are three potential problems to consider: • Some queries become more expensive. • e.g., find employee and department names • Given instances of the decomposed relations, we may not be able to reconstruct the corresponding instance of the original relation! • First 3NF decomposition attempt • Checking some dependencies may require joining the instances of the decomposed relations. • BCNF decomposition example • Tradeoff: Must consider these issues vs. redundancy.

  11. Summary of Schema Refinement • If a relation is in BCNF, it is free of redundancies that can be detected using FDs. • If a relation is not in BCNF, we can try to decompose it into a collection of BCNF relations: • Lossless-join, dependency preserving decomposition into BCNF is not always possible • Lossless-join decomposition into BCNF is always possible • Lossless-join, dependency preserving decomposition into 3NF is always possible • Decompositions should be carried out and/or re-examined while keeping performance requirements in mind. • Various decompositions of a single schema are possible.

More Related