1 / 31

Database Design Normal forms & Normalization Compiled by S. Z.

Database Design Normal forms & Normalization Compiled by S. Z. Problems with unnormalized data . Contains redundancy Contains multi-values Does not have a primary key identified Cause data anomaly Etc. Example of Unnormalized Data. The idea is simple!.

abra
Download Presentation

Database Design Normal forms & Normalization Compiled by S. Z.

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Database DesignNormal forms& NormalizationCompiled by S. Z.

  2. Problems with unnormalizeddata • Contains redundancy • Contains multi-values • Does not have a primary key identified • Cause data anomaly • Etc.

  3. Example of Unnormalized Data

  4. The idea is simple! • If you want to normalize data, normalize at the higher level first, i.e., normalize the meta data, which is table design!! Guide to Oracle 10g

  5. History • Edgar F. Codd first proposed the process of normalization and what came to be known as the 1st normal form in his paper A Relational Model of Data for Large Shared Data Banks Codd stated: “There is, in fact, a very simple elimination procedure which we shall call normalization. Through decomposition nonsimple domains are replaced by ‘domains whose elements are atomic (nondecomposable) values.’”

  6. Database Normalization • Database normalization is the process of removing redundant data from your tables in to improve storage efficiency, data integrity, and scalability, through removing un-normalized relationship between the attributes in the same table. • Normalization generally involves splitting existing tables into multiple ones, which must be re-joined or linked each time a query is issued.

  7. In the relational model, methods exist for quantifying how efficient a database is. • These classifications are called normal forms (or NF), and there are algorithms for converting a given database between them. Guide to Oracle 10g

  8. Normal Form • Edgar F. Codd originally established three normal forms: 1NF, 2NF and 3NF. • There are now others that are generally accepted, but 3NF is widely considered to be sufficient for most applications. • Most tables when reaching 3NF are also in BCNF (Boyce-Codd Normal Form).

  9. When do you need to check normal forms • Tables mapped from ER model usually meets 3NF, but there is no guarantee. • Therefore once you have a database design, with tables mapped from the constructs of ER diagram, you need to check each table for up to 3NF. • If a table meets 3NF, usually the table is in the good shape. Otherwise, the table usually needs to be further decomposed. Guide to Oracle 10g

  10. More advanced normal forms • Is out the scope of 242 course. • Will be studied in 342 course. • Table design is more challenging than manipulating tables. Guide to Oracle 10g

  11. Dependency and partial dependency • What is dependency? • If you look at two attributes in a table, there are two kinds of relationship. • Independent from each other, for example age and state in student table. • One depends on the other, or in order words, one decide the other. For example, ssn and age. Age depends on ssn. If you know one’s ssn, you know his/her age. This shows the dependency of age on ssn, which usually is key. • Partial dependency (in case primary key consists of multiple fields.) • Fields within the table are dependent only on part of the primary key Guide to Oracle 10g

  12. First normal form (1NF) • Primary key field identified • No multi-valued attributes, no composite attributes, i.e. each attribute is atomic, one value for each attribute.

  13. Normalization (continued) • Second normal form (2NF) • In 1NF • No partial dependencies

  14. Normalization (continued) • Basic procedure for identifying partial dependency: • Look at each field that is not part of the composite primary key • Make certain you are required to have bothparts of the composite field to determine the value of the data element

  15. Bad Example • studenttable • (student ID, course ID, course Name, grade) • In this case, course name depends on courseID only, so called partial dependency. Thus violate 2NF.

  16. Normalization (continued) • Third normal form (3NF) • In 2NF • No transitive dependencies • i.e. the non – primary key attributes should be mutually independent • Transitive dependency • Field is dependent on another field within the table that is notthe primary key field

  17. Bad Example • Studenttable2 • (sid, student, state, state governor) • In this case, state depends on sid, while state governor depends on state, through which, depends on sid, so exist a transitive dependency from state governor to sid via state • Thus violate 3NF

  18. Another Example Table 2

  19. Similar problems with Table 2 • This table is not very efficient with storage. • This design does not protect data integrity. • Third, this table does not scale well.

  20. First Normal Form • In our Table 2, we have two violations of First Normal Form: • First, we have more than one author field, • Second, our subject field contains more than one piece of information. With more than one value in a single field, it would be very difficult to search for all books on a given subject.

  21. First Normal Table • Table 3

  22. We now have two rows for a single book. Additionally, we would be violating the Second Normal Form… • A better solution to our problem would be to separate the data into separate tables- an Author table and a Subject table to store our information, removing that information from the Book table:

  23. Subject Table Author Table Book Table

  24. Each table has a primary key, used for joining tables together when querying the data. A primary key value must be unique with in the table (no two books can have the same ISBN number), and a primary key is also an index, which speeds up data retrieval based on the primary key. • Now to define relationships between the tables

  25. Relationships Book_Author Table Book_SubjectTable

  26. Second Normal Form • As the First Normal Form deals with redundancy of data across a horizontal row, Second Normal Form (or 2NF) deals with redundancy of data in vertical columns. • As stated earlier, the normal forms are progressive, so to achieve Second Normal Form, the tables must already be in First Normal Form. • The Book Table will be used for the 2NF example

  27. 2NF Table Publisher Table Book Table

  28. 2NF • Here we have a one-to-many relationship between the book table and the publisher. A book has only one publisher, and a publisher will publish many books. When we have a one-to-many relationship, we place a foreign key in the Book Table, pointing to the primary key of the Publisher Table. • The other requirement for Second Normal Form is that you cannot have any data in a table with a composite key that does not relate to all portions of the composite key.

  29. Third Normal Form • Third normal form (3NF) requires that there are no functional dependencies of non-key attributes on something other than a candidate key. • A table is in 3NF if all of the non-primary key attributes are mutually independent • There should not be transitive dependencies

  30. Boyce-Codd Normal Form BCNF requires that the table is 3NF and only determinants are the candidate keys

  31. Normalization • Step-by-step process used to determine which data elements should be stored in which tables • Purpose • Eliminate data redundancy without losing information • Several levels of normalization • Forms

More Related