1 / 25

Normalisation

Normalisation. Relation A B C D E F. Relation1 A B. 1NF?. 2NF?. Relation2 A * C D E*. 3NF?. Help me Codd!!. Relation3 E F. Reading: Connolly and Begg 13 & 14 (4th ed), . Normalisation. From this…. …to this. In 3+ easy(?) steps. What is normalisation?.

donal
Download Presentation

Normalisation

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Normalisation Relation A B C D E F Relation1 A B 1NF? 2NF? Relation2 A* C D E* 3NF? Help me Codd!! Relation3 E F Reading: Connolly and Begg 13 & 14 (4th ed),

  2. Normalisation From this… …to this In 3+ easy(?) steps

  3. What is normalisation? • A method for database design • Theory examines how “good” is a schema? • Transform non-normalised schemas • Minimise storage • Takes a set of attributes and derives the relational model • By separating out the required tables • Completely different approach to ERM • But should get the same result • A minimum of 3 steps are used: • For each stage, the normal form gets stronger (i.e. removes redundancy) so less open to update anomalies • All based on functional dependencies

  4. Functional Dependency • Underpins normalisation process • If every value of column A uniquely determines the value in column B, then • B is functionally dependent on A (B depends on A) • A determines B, or, formally, A B (A is called the determinant) • For example, • EmpID  Age, Dept (AB,C) Employee ID, Project  Role (X, Y  Z) • Note multiple attributes are often involved EmpIDProject Age Dept Dsize Budget Role

  5. Rules for functional dependency • A  B does NOT automatically mean B  A • E.g. student ID  name but not name  ID • Transitive dependency:If AB and BC then AC • Many other rules • E.g. if X,YZ but XZ also • In this case Zis partially dependenton X,Y • “Transitive” and “partial” dependency are two key concepts of the normalisation process

  6. A Question for you! EmpIDProject Age Dept Dsize Budget Role A B C D EmpIDProject Age Dept Dsize Budget Role E1 P2 33 D2 10 100 Analyst E1 P1 33 D2 10 200 Prog. E2 P1 34 D5 10 200 Prog. E2 P2 34 D5 20 100 Analyst Which functional dependency is violated by the data?

  7. Unnormalised Form Relation contains: non-atomic attribute values ID Employee Salary Project 1 Grey 31000 A 2 Brown 35000 B,C 3 White 55000 A,B,C 4 Black 47000 A,C Violation of 1NF non-atomic values

  8. redundancy Repeating First Normal Form ID Employee Salary Project Budget 1 Grey 31000 A 10 2 Brown 35000 B 5 2 Brown 35000 C 5 3 White 55000 A 5 3 White 55000 B 5 3 White 55000 C 5 4 Black 47000 A 10 4 Black 47000 C 5 Permits only single (atomic) attribute values Remove Repeating Group along with primary key from other Table ID (fk) Project Budget 1 A 10 2 B 5 2 C 5 3 A 5 3 B 5 3 C 5 4 A 10 4 C 5 ID Employee Salary 1 Grey 31000 2 Brown 35000 3 White 55000 4 Black 47000

  9. Second Normal Form • Full Functional Dependency (FFD) • X  Y is FFD • if removal of any attribute from X removes the dependency • X  Y is partially dependent • if removal of attribute from X leaves the dependency intact • 2NF test • involves testing for partial dependency on the PK (therefore PK MUST be composite to test for 2NF) • Relation R is in 2NF if: • every non-primary-key attribute in R is FFD on the primary key of R

  10. EmpIDProject Age Dept Dsize Budget Role • So which FD’s are violating 2NF? • “Second Normalised” by: • removing non-primary-key attributes and forming a FFD on appropriate part of primary key {EmpID ,Age, Dept , Dsize} {EmpID*, Project*, Role} {Project , Budget} 2NF

  11. Third Normal Form • Remove Transitive Dependency • Conditions • A non-primary-key attribute Z is transitively dependent on primary key X if: • X  Y; Y  Z (Y attribute provides the transition to the PK) [EmpID* Project* Role] A [EmpID Age Dept Dsize] B [Project Budget] C D None of the above Which of the above could have transitive dependency?

  12. Here is an un-normalised Table Ord# Date Cust# Name Prod# Desc Qty Supplier Tel 1 12/1/01 1 Jones 1 Disk 3 X 101 1 12/1/01 1 Jones 2 CD 5 Y 223 2 13/1/01 2 Black 1 Disk 1 X 101 2 13/1/01 2 Black 2 CD 1 Y 223 2 13/1/01 2 Black 3 Mouse 1 X 101 3 13/1/01 1 Jones 3 Mouse 1 X 101

  13. Normalise it to 1NF Ord# Date Cust# Name Prod# Desc Qty Supplier Tel 1 12/1/01 1 Jones 1 Disk 3 X 101 1 12/1/01 1 Jones 2 CD 5 Y 223 2 13/1/01 2 Black 1 Disk 1 X 101 2 13/1/01 2 Black 2 CD 1 Y 223 2 13/1/01 2 Black 3 Mouse 1 X 101 3 13/1/01 1 Jones 3 Mouse 1 X 101 fk Ord#Prod# Desc Qty Supplier Tel 1 1 Disk 3 X 101 1 2 CD 5 Y 223 2 1 Disk 1 X 101 2 2 CD 1 Y 223 2 3 Mouse 1 X 101 3 3 Mouse 1 X 101 Ord# Date Cust# Name 1 12/1/01 1 Jones 2 13/1/01 2 Black 3 13/1/01 1 Jones

  14. Ord# Date Cust# Name 1 12/1/01 1 Jones 2 13/1/01 2 Black 3 13/1/01 1 Jones Ord#Prod# Desc Qty Supplier Tel 1 1 Disk 3 X 101 1 2 CD 5 Y 223 2 1 Disk 1 X 101 2 2 CD 1 Y 223 2 3 Mouse 1 X 101 3 3 Mouse 1 X 101 Already in 2NF Prod# Desc Supplier Tel 1 Disk X 101 2 CD Y 223 3 Mouse X 101 Ord#Prod# Qty 1 1 3 1 2 5 2 1 1 2 2 1 2 3 1 3 3 1 Now we normalise this to 2NF remembering to test on the PK for any partial dependency fk fk

  15. So, any transitive dependency? Ord#Prod# Qty 1 1 3 1 2 5 2 1 1 2 2 1 2 3 1 3 3 1 Ord# Date Cust# Name 1 12/1/01 1 Jones 2 13/1/01 2 Black 3 13/1/01 1 Jones Prod# Desc Supplier Tel 1 Disk X 101 2 CD Y 223 3 Mouse X 101 fk fk

  16. Yes! But not in all ……………. Prod# Desc Supplier Tel 1 Disk X 101 2 CD Y 223 3 Mouse X 101 Ord#Prod# Qty 1 1 3 1 2 5 2 1 1 2 2 1 2 3 1 3 3 1 Cust# Name 1 Jones 2 Black Supplier Tel X 101 Y 223 Ord# Date Cust# Name 1 12/1/01 1 Jones 2 13/1/01 2 Black 3 13/1/01 1 Jones OK! Prod# Desc Supplier (fk) 1 Disk X 2 CD Y 3 Mouse X Ord# Date Cust# (fk) 1 12/1/01 1 2 13/1/01 2 3 13/1/01 1

  17. Final Decomposition Ord#{fk} Prod#{fk} Qty 1 1 3 1 2 5 2 1 1 2 2 1 2 3 1 3 3 1 Ord# Date Cust# (fk) 1 12/1/01 1 2 13/1/01 2 3 13/1/01 1 Cust# Name 1 Jones 2 Black Prod# Desc Supplier (fk) 1 Disk X 2 CD Y 3 Mouse X Supplier Tel X 101 Y 223 Now in 3NF

  18. The underlying E-R Model ….. Ord# Date Cust# Name Prod# Desc Qty Supplier Tel 1 12/1/01 1 Jones 1 Disk 3 X 101 1 12/1/01 1 Jones 2 CD 5 Y 223 2 13/1/01 2 Black 1 Disk 1 X 101 2 13/1/01 2 Black 2 CD 1 Y 223 2 13/1/01 2 Black 3 Mouse 1 X 101 3 13/1/01 1 Jones 3 Mouse 1 X 101 makes  Customer Order 1..1 0..* 0..* has How many tables would you get from mapping?  0..* despatches  Product Supplier 1..* 1..1

  19. So Normalisation to 3NF is Normal!! • Remember, 2NF and 3NF disallow partial and transitive dependencies respectively on the PK, otherwise they are open to update anomalies • But ….. even at 3NF, a relation may be open to update anomalies on rare occasions due to redundancy too • So we look briefly at these • Boyce-Codd • 4NF

  20. Boyce-Codd NF • Is a stronger normalised form then 3NF • Definition: A relation is in BCNF, if and only if, every determinant is a candidate key • And remember that a candidate key is any key that could become the PK of the relation (i.e. there may be competition for it!) • Potential to violate BCNF comes from: • A relation containing at least 2 composite candidate keys • Or candidate keys overlapping (i.e. they have at least one attribute in common)

  21. BCNF Example • Consider the candidate keys for: • FD1 {PK}: clientNo, interviewDate  interviewTime, staffNo, roomNo • FD2 {CK}: staffNo, interviewDate, interviewTime  clientNo • FD3 {CK}: roomNo, interviewDate, interviewTime  staffNo, clientNo • FD4: staffNo, interviewDate  roomNo PK is primary key and CK is candidate key. But what about FD4? It is not a CK Adapted from Connolly and Begg, 2005, 4th ed. Page 420

  22. So new decomposition? So duplication in the room number is now eradicated

  23. 4NF • Comes from 2 multi-valued attributes in a relation • E.g. for each value of A there is a set of values for B and a set for C, while B and C remain independent of each other So if you model your databases from ERM’s this type of dependency should not arise.

  24. Example of 4NF Note: if step 9 applied to multi-valued attributes then we should map this correctly and avoid such redundancy as the two tables on the right would be the result of the mapping! Adapted from Connolly and Begg, 2005, 4th ed. Page 428

  25. Normal Form Summary • A Relation’s degree of normalisation • Stronger in format at each stage • less vulnerable to update anomalies • First Normal Form (1NF) • The relation has no non-atomic values • Or the relation has “no repeating group” • 2nd Normal Form (2NF) • The relation has no partial dependencies • All non-key attributes are fully functionally dependent on the PK • 3rd Normal Form (3NF) • The relation has no transitive dependencies • Boyce-Codd • Every determinant is a candidate key • 4NF – no multi-valued dependencies

More Related