Understanding Database Normalization: Principles and Decomposition Techniques
This resource provides an in-depth review of database normalization, focusing on the need for normalization to eliminate redundancy and anomalies in relational databases. It covers key concepts such as decomposition into Boyce-Codd Normal Form (BCNF) and Third Normal Form (3NF), including practical algorithms for achieving these forms. The goals of decomposition, including lossless decomposition, preservation of dependencies, and recovering original information, are discussed. Exercises are included to test understanding and application of these principles.
Understanding Database Normalization: Principles and Decomposition Techniques
E N D
Presentation Transcript
Rensselaer Polytechnic Institute CSCI-4380 – Database Systems David Goldschmidt, Ph.D. Database Systems{week 04a}
The need for normalization (review) • Without normalization, problems with relations include: • Unnecessary redundancy • Insert anomalies • Update anomalies • Delete anomalies
Decomposition (review) • Splitting a relation into two (more specific) relations is called decomposition • The objective is to have eachresulting relation be atomic • i.e. each relation should containonly information related to the key
BCNF(review) • A given relation R with set F of functional dependencies is in BCNF if and only if all functional dependencies X Y in F are: • either trivial (i.e. Y X) • or X is a superkey of R • If relation R is not in BCNF, it is possible to use decomposition to transform R to BCNF
Decomposition into BCNF (review) • Given a set F of functional dependencies for relation R( A1, A2, ..., An ) that is not in BCNF: • Convert F to a minimal basis • Find an X Y that violates BCNF • Compute closure X+ • Decompose R into: • R1 containing all attributes of X+ • R2 containing { A1, A2, ..., An } – ( X+ – X ) • Project functional dependencies onto R1 and R2 Repeat!
Lossless decomposition • A decomposition of R into relationsR1, R2, ..., Rn is considered lossless iffor all possible instances of R, weare guaranteed that: • R1⋈ R2⋈ ... ⋈ Rn = R • Note that the order of the natural joinsis not important since ⋈ is both associativeand symmetric
Goals of decomposition • Decomposition (hopefully) achieves: • Elimination of anomalies: remove redundancyand update/insert/delete anomalies • Recoverability of information: can we recoverthe original relation from its decomposition? • Preservation of dependencies:can we reconstruct the originalfunctional dependencies?
Third Normal Form (3NF) • A given relation R with set F of functional dependencies is in 3NF if and only if all functional dependencies X Y in F are: • trivial (i.e. Y X) • or X is a superkey of R • or all attributes of Y are prime attributes • A prime attribute is an attribute that is a member of some key of relation R
Decomposition into 3NF (part 1) • Given a set F of functional dependencies for relation R( A1, A2, ..., An ) that is not in 3NF: • Convert F to a minimal basis • Combine all functional dependencies withthe same left-hand side • Set D = { } • For each functional dependency X Y in F: • If there is no relation in D that contains all attributes in X and Y, then add a relation with attributes X Y to D
Decomposition into 3NF (part 2) • If there are no relations in D that have all attributes of one of the keys of R, then add a new relation to D that contains all attributes in one of the keys of R • Simplify D by removing redundancy: • if R1and R2are in D, but R2 contains allattributes of R1, then remove R1
Algorithm results • The 3NF decomposition algorithm guarantees the following: • The resulting relations are in 3NF • The decomposition is dependency preserving • The decomposition is lossless
Exercises • Given relation R( A, B, C, D, E, F ) andfunctional dependencies AB AC,CE DB, B A, and D AE • What are the keys of R? • What are the superkeys of R that are not keys? • Is relation R in 3NF? If not, decompose R such that it is in 3NF