1 / 37

Normalization

Normalization. What is normalization ?. Proposed by Codd in 1972 Takes a relation through a series of steps to certify whether it satisfies a certain normal form Initially Codd proposed three normal forms Boyce- Codd normal form is introduced by Boyce and Codd

gay-torres
Download Presentation

Normalization

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Normalization

  2. What is normalization ? • Proposed by Codd in 1972 • Takes a relation through a series of steps to certify whether it satisfies a certain normal form • Initially Codd proposed three normal forms • Boyce-Codd normal form is introduced by Boyce and Codd • Based on functional dependencies between attributes of a relation • Later 4th and 5th normal forms were introduced based on multi-valued dependencies and join dependencies

  3. Normalization is the process of efficiently organizing data in a database • There are two goals of the normalization process: • Eliminating redundant data • For example, storing the same data in more than one table • Ensuring data dependencies make sense • Only storing related data in a table • Reduce the amount of space a database consumes and ensure that data is logically stored

  4. Through normalization we want to design for our relational database a set of files that • Contain all the data necessary for the purposes that the database is to serve • Have as little redundancy as possible • Accommodate multiple values for types of data that require them • Permit efficient updates of the data in the database • Avoid the danger of losing data unknowingly

  5. Normalization Avoids • Duplication of Data  • The same data is listed in multiple lines of the database • Insert Anomaly  • A record about an entity cannot be inserted into the table without first inserting information about another entity – Cannot enter a customer without a sales order • Delete Anomaly • A record cannot be deleted without deleting a record about a related entity.  Cannot delete a sales order without deleting all of the customer’s information. • Update Anomaly • Cannot update information without changing information in many places.  To update customer information, it must be updated for each sales order the customer has placed

  6. The Normal Forms • Guidelines for ensuring that databases are normalized • Numbered from 1 through 5 • 1NF, 2NF, 3NF, 4NF and 5NF • In practical applications, • We often see first three normal forms • Occasionally we can see 4th normal form • And 5th normal form is rarely seen

  7. Normalization is a three stage process – • After the first stage, the data is said to be in first normal form • After the second, it is in second normal form • After the third, it is in third normal form

  8. Before Normalization • Begin with a list of all of the fields that must appear in the database.  Think of this as one big table. • Do not include computed fields • One place to begin getting this information is from a printed document used by the system.  • Additional attributes besides those for the entities described on the document can be added to the database.

  9. ORDERS SalesOrderNo, Date, CustomerNo, CustomerName, CustomerAdd, ClerkNo, ClerkName, ItemNo, Description, Qty, UnitPrice

  10. Some definitions: • Functional Dependency • The value of one attribute in a table is determined entirely by the value of the primary key • Partial Dependency • A type of functional dependency where an attribute is functionally dependent on only part of the primary key (primary key must be a composite key). • Transitive Dependency • A type of functional dependency where an attribute is functionally dependent on an attribute other than the primary key.  Thus its value is only indirectly determined by the primary key.

  11. First Normal Form • To disallow multi-valued attributes, composite attributes and complex attributes • Domain of an attribute must include only atomic values (simple and indivisible) • Disallows ‘relations within relations’ or ‘relations as attribute values within tuples’

  12. Example 1: DLOCATION is not an atomic attribute

  13. The domain of DLOCATIONS contain atomic values • The domain of DLOCATIONS contain sets of values (nonatomic)

  14. Techniques to achieve 1NF • Remove the attribute DLOCATIONS that violates 1NF and place it in a separate relation DEPT_LOCATIONS DEPARTMENT DEPT_LOCATIONS

  15. Expand the key so that there will be separate tuple in the original DEPARTMENT for each location disadvantage : introduces redundancy in relation

  16. If the maximum number of values is known for the attribute, replace the attribute by number of atomic attributes disadvantage : introduces null values

  17. Example 2 : • EMP_PROJ( ENO, ENAME, {PROJS ( PNUMBER, HOURS ) } ) • ENO is the primary key and PNUMBER is partial key of relation

  18. Example 3 : PERSON ( IDNO, ENAME, ADDRESS, AGE, PROFESSION, {CAR_LIC}, {PHONE} )

  19. Second Normal Form • The relation should be in first normal form • Based on full functional dependency • A functional dependency X Y is a full functional dependency if removal of any attribute A from X means that the dependency does not hold any more • A є X, (X - {A}) does not functional determine Y

  20. A partial dependency X Y is a partial dependency if some attribute A є X, (X – {A})  Y

  21. Example 1 : • A relation R is in 2NF if every non-prime attribute A in R is fully functionally dependent on the primary key of R • If primary key contains one attribute, the test need not be applied at all

  22. If the relation is not in 2NF, it can be ‘second normalized’ in to a number of 2NF relations in which non-prime attributes are associated only with the part of the primary key on which they are fully functionally dependent

  23. Third Normal Form • Relation should be in second normal form • Based on transitive dependency • A functional dependency A  Y in a relation R is a transitive dependency if there is a set of attributes Z that is neither a candidate key nor a subset of any key of R, and X Z and Z  Y hold

  24. The dependency ENO  DNUMBER and DNUMBER  DMGRENO hold and DNUMBER is neither a key nor a subset of a key • A relation is in 3NF if it satisfies 2NF and no non-prime attribute of R is transitively dependent on the primary key

  25. In what normal form this relation is ???? GRADES (StudentID, Course#, Semester#, Grade) • Suppose you are given a relation R = (A,B,C,D,E) with the following functional dependencies: {CE ! D,D ! B,C ! A}. a. Find all candidate keys. b. Identify the best normal form that R satisfies ( 1NF , 2NF , 3NF )

  26. What is normalization ???? • A relational database is basically composed of tables that contain related data. The process of organizing this data is called as normalization • What is 1 NF (Normal Form)???? • The domain of attribute must include only atomic (simple, indivisible) values. • What is 2NF???? • A relation schema R is in 2NF if it is in 1NF and every non-prime attribute A in R is fully functionally dependent on primary key. • What is 3NF? • A relation schema R is in 3NF if it is in 2NF and for every FD X    A either of the following is true • X is a Super-key of R. • A is a prime attribute of R. • In other words, if every non prime attribute is non-transitively dependent on primary key.

  27. Conceptual Design : Patient ( NID , Name , Age , {CotactDetails ( Address , {Telephone})} , Ward , WardInCharge, WardLocation) Convert this relation into 1st Normal Form, 2nd Normal Form, 3rd Normal Form

  28. Boyce-Codd Normal Form • BCNF • Simpler form of 3NF • Stricter than 3NF • Every relation in BCNF is also in 3NF • Relation in 3NF is not necessarily in BCNF

  29. A relation schema R is in BCNF if whenever a non-trivial functional dependency X A holds in relation R, then X is a super key of R •  is trivial (i.e.,  ) •  is a superkey for R

  30. R = (A, B, C)F = {AB B  C}Key = {A} • R is not in BCNF • Decomposition R1 = (A, B), R2 = (B, C) • R1and R2 in BCNF • Lossless-join decomposition • Dependency preserving

  31. Example

  32. Example :

  33. Patno--> PatNamePatno,appNo--> Time,doctorTime --> appNo

  34. Example : Grade_report ( StudNo, StudName, Grade ( Major, Advisor, Grade ( CourseNo, Ctitle, InstrucName, InstructLocn, Grade ) ) )

More Related