1 / 23

Normalization

Normalization. A logical design method which minimizes data redundancy and reduces design flaws. Consists of applying various “normal” forms to the database design. The normal forms break down large tables into smaller subsets. Anomalies.

luann
Download Presentation

Normalization

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Normalization • A logical design method which minimizes data redundancy and reduces design flaws. • Consists of applying various “normal” forms to the database design. • The normal forms break down large tables into smaller subsets.

  2. Anomalies Relations that have redundant data may have problems called update anomalies, which are classified as: Insertion anomalies Deletion anomalies Modification anomalies

  3. Example of Anomalies To insert a new staff with branchNo B007 into the StaffBranch relation; To delete a tuple that represents the last member of staff located at a branch B007; To change the address of branch B003. StaffBranch

  4. Example of Anomalies (Contd ...) Staff Branch

  5. Relationship of Normal Forms

  6. Stages of Normalisation Remove repeating groups First normal form (1NF) Remove partial dependencies Second normal form (2NF) Remove transitive dependencies Third normal form (3NF) Remove remaining functional dependency anomalies Boyce-Codd normal form (BCNF) Remove multivalued dependencies Fourth normal form (4NF) Remove remaining anomalies Fifth normal form (5NF)

  7. Normalization Unnormalized Relations First normal form Functional dependencyof nonkey attributes on the primary key - Atomic values only Second normal form No transitive dependency between nonkey attributes Third normal form Boyce- Codd and Higher Full Functional dependencyof nonkey attributes on the primary key All determinants are candidate keys - Single multivalued dependency

  8. First Normal Form (1NF) • Each attribute must be atomic: • First Normal Form is a relation in which • the intersection of each row and column • contains one and only one value. • No repeating columns within a row. • No multi-valued columns.

  9. 1NF Employee (unnormalized) Employee (1NF)

  10. Second Normal Form (2NF) • Each attribute must be functionally dependent on the primary key. • Functional dependence - the property of one or • more attributes that uniquely determines the • value of other attributes. • Any non-dependent attributes are moved into a • smaller (subset) table. • 2NF improves data integrity. • Prevents update, insert, and delete anomalies.

  11. Functional Dependencies Functional dependency describes the relationship between attributes in a relation. For example, if A and B are attributes of relation R, and B is functionally dependent on A ( denoted A B), if each value of A is associated with exactly one value of B. ( A and B may each consist of one or more attributes.) B is functionally A B dependent on A Determinant Refers to the attribute or group of attributes on the left-hand side of the arrow of a functional dependency

  12. Functional dependency Full functional dependency indicates that if A and B are attributes of a relation, B is fully functionally dependent on A if B is functionally dependent on A, but not on any proper subset of A. A functional dependency AB is partially dependentif there is some attributes that can be removed from A and the dependency still holds.

  13. Functional Dependencies (Contd ...) Trival functional dependencymeans that the right-hand side is a subset ( not necessarily a proper subset) of the left- hand side. For example: staffNo, sName  sName staffNo, sName  staffNo They do not provide any additional information about possible integrity constraints on the values held by these attributes. We are normally more interested in nontrivial dependencies because they represent integrity constraints for the relation.

  14. Identifying the primary key Functional dependencyis a property of the meaning or semantics of the attributes in a relation. When a functional dependency is present, the dependency is specified as a constraint between the attributes. An important integrity constraint to consider first is the identification of candidate keys, one of which is selected to be the primary key for the relation using functional dependency.

  15. Examples STORE(SNAME, ADDR, ZIP, ITEM, PRICE) FDs: SNAME  ADDR ADDR  ZIP SNAME, ITEM  PRICE Finding a key: SNAME does not appear in RHS, so SNAME must be a part of the key. Since SNAME  ADDR  ZIP, we know SNAME  ADDR, ZIP But SNAME alone cannot determine any more. How can we determine ITEM and PRICE? If we have ITEM, then we can determine PRICE So, SNAME, ITEM  SNAME, ADDR, ZIP, ITEM, PRICE. So it satisfies the definition of the key Identifying the primary key (Contd ...)

  16. Inference Rules • Armstrong’s axioms • Relfexivity: If B is a subset of A, them A  B • Augmentation: If A  B, then A, C  B • Transitivity: If A  B and B  C, then A C • Self-determination: A  A • Decomposition: If A  B,C then A  B and A C • Union: If A  B and A  C, then A B,C • Composition: If A  B and C  D, then A,C B,D

  17. Functional Dependence Employee (1NF) Name, dept_no, and dept_name are functionally dependent on emp_no. (emp_no -> name, dept_no, dept_name) Skills is not functionally dependent on emp_no since it is not unique to each emp_no.

  18. 2NF Employee (1NF) Employee (2NF) Skills (2NF)

  19. Data Integrity Employee (1NF) • Insert Anomaly - adding null values. eg, inserting a new department does • not require the primary key of emp_no to be added. • Update Anomaly - multiple updates for a single name change, causes • performance degradation. eg, changing IT dept_name to IS • Delete Anomaly - deleting wanted information. eg, deleting the IT • department removes employee Barbara Jones from the database

  20. Third Normal Form (3NF) • Remove transitive dependencies. • Transitive dependence - two separate entities • exist within one table. • Any transitive dependencies are moved into a • smaller (subset) table. • 3NF further improves data integrity. • Prevents update, insert, and delete anomalies.

  21. Transitive Dependence Employee (2NF) dept_name is functionally dependent on dept_no. dept_no is functionally dependent on emp_no, so via the middle step of dept_no, dept_name is functionally dependent on emp_no. (emp_no -> dept_no , dept_no -> dept_name, thus emp_no -> dept_name)

  22. 3NF Employee (2NF) Department (3NF) Employee (3NF)

  23. Other Normal Forms • Boyce-Codd Normal Form (BCNF) • Strengthens 3NF by requiring the keys in the functional dependencies to be superkeys (a column or columns that uniquely identify a row) • Fourth Normal Form (4NF) • Eliminate trivial multivalued dependencies. • Fifth Normal Form (5NF) • Eliminate dependencies not determined by keys.

More Related