1 / 37

Database Normalization for Data Accuracy

Learn the importance of normalization in database design and the process of verifying and optimizing attribute grouping to prevent data redundancy and update anomalies.

Download Presentation

Database Normalization for Data Accuracy

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Normalization การตรวจสอบความถูกต้องของการออกแบบระบบฐานข้อมูล ต้องมีขบวนการการตรวจสอบอย่างเป็นขั้นตอน เราเรียกขบวนการนี้ว่า Normalization. Normalization is a formal method involved with a series of test to help database designer to be able to identify the optimal grouping of attributes for each relation in the relational schema. Normalization can be applied to individual relations so that database can be normalized to a specific form to prevent the possible occurrence of update anomaly.

  2. Data Redundancy and Update Anomalies วัคถุประสงค์หลักของการออกแบบระบบฐานข้อมูลคือ การจัดกลุ่ม attributes ในแต่ละรีเลชั่นเพื่อลดความซ้ำซ้อนของข้อมูล ซึ่งส่งผลถึงการประหยัดเนื้อที่ในการจัดเก็บฐานข้อมูลด้วย รีเลชั่นที่มีข้อมูลซ้ำซ้อนมาก ก่อปัญหา UPDATE ANOMALIES ซึ่งแบ่งออกเป็น 3 ประเภทคือ Insertion anomalies Deletion Anomalies Modification Anomalies

  3. Insertion Anomalies Deletion Anomalies Modification Anomalies

  4. Insertion Anomalies To insert the details of new members of staff into the Staff_Branch relation, we must include the details of the branch at which the staff are to be located. Deletion Anomalies If we delete a row from the Staff_Branch relation that represents the last member of staff located at a branch, the details about that branch are also lost from the database. Modification Anomalies If we want to change the value of one of the attributes of a particular branch in the Staff_Branch relation, we must update the rows of all staff located at that branch. If this modification is not carried out on all the appropriate rows of the Staff_Branch relation, the data base will become inconsistent.

  5. Staff_Branch Relation

  6. Delete Anomalies Staff_Branch Relation

  7. Modification Anomalies Staff_Branch Relation

  8. Functional Dependencies One of the main concepts associated with normalization is functional dependency, which describes the relationship between attributes. นิยามของกฎเกณฑ์ของการขึ้นต่อกัน (Definition of Functional Dependency) ระบุไว้ว่า สมมุติให้ B เป็นแอตทริบิวตัวหนึ่ง และ A แอตทริบิวอีกตัวหนึ่ง ( A อาจเป็นเซ็ตของแอตทริบิวก็ได้) เรากล่าวว่า Bเป็นฟังก์ชั่นการขึ้นอยู่กับ A(denoted A  B) ก็ต่อเมื่อค่าของ Aสามารถใช้ในการเลือก (determine) ค่าของ B ได้เพียง 1 ค่าเสมอ สัญญลักษณ์การขึ้นต่อกัน (A  B) อ่านได้ดีงนี้ Bมีฟังก์ชั่นการขึ้นอยู่กับ A หรือ Aมีฟังก์ชั่นในการเลือก B หรือ B ขึ้นกับ A

  9. Functional Dependencies อนุญาตให้ผู้ใช้ระบุกฏเกณฑ์การขึ้นต่อกันด้วย attribute ที่ไม่ได้เป็น candidate key ได้ เช่น    (โดยที่ ไม่ได้เป็น candidate key) สำหรับทุก ๆ แถวในรีเลชั่น r หากแถวใดที่มีค่าในแอตทริบิว เหมือนกัน ระบบต้อง รับประกันว่า ในบรรดาแถวเหล่านั้น ต้องมีค่าของแอตทริบิว ที่เหมือนกันด้วย กล่าวคือ ถ้า   holds on R ดังนั้น ถ้า t1[] = t2[] ระบบต้องรับประกันว่า t1[] = t2[] Given a relation R, attribute y of R is dependent on attribute x of R if and only if whenever two tuples of R agree on their x-value, they must necessarily agree on their y-value.

  10. B is functionally dependent on A A B When a functional dependency exists, the attribute or group Of attributes on the left-hand side of the arrow is called the determinant. Position is functionally dependent on Staff_No Staff_No Position SL21 System Engineer Staff_No is not functionally dependent on Position Position Staff_No SL21 SG5 System Engineer

  11. Staff_No  SName, Saddress, Position, Salary, Branch_No Branch_No  BAddress, Tel_No BAddress  Branch_No

  12. Normalization is a formal method involved with a series of test to help database designer to be able to identify the optimal grouping of attributes for each relation in the relational schema. Unnormalized Form 1st Normal Form 2nd Normal Form 3rd Normal Form Boyce-Codd Normal Form Normalization can be applied to individual relations so that database can be normalized to a specific form to prevent the possible occurrence of update anomaly. The process of normalization is a formal method that identifies relations based on primary key (or candidate keys in the case of BCNF the functional dependencies among their attributes.

  13. Relationships of Normal Forms Higher Normal forms

  14. Case Study The DreamHome company manages property on behalf of the owners, and as part of this service, the company takes care of the property’s rental. To simplify this example, we assume that a customer rents a given property only once, and cannot rent more than one property at any one time. Unnormalized form (UNF) : A table that contains one or more repeating groups. Customer_Rental Relation A repeating group is an attribute or group of attributes within a table that occurs with multiple values for a single occurrence of the key attribute (s) for that table. The term key refers to the attribute (s) That uniquely identify each row within the unnormalized table.

  15. First normal form (1NF) : A relation in which the intersection of each row and column contains one and only one value. Customer_Rental Relation For the relational data model, it is important to recognize that it is only first normal form(1NF) that is critical in creating appropriate relations. All the subsequent normal forms are optional. However, to avoid the update anomalies, it is recommended that we proceed to at least 3NF.

  16. Set of the Functional Dependency of Customer_Rental relation fd1 Customer_No, Property_No RentStart, RentFinish(Primary key) fd2 Customer_No  CName(Partial dependency) fd3 Property_No  PAddress, Rent, Owner_No, OName(Partial dependency) fd4 Owner_No  OName (Transitive dependency) fd5 Customer_No, RentStart  Property_No, PAddress, RentFinish, Rent, Owner, OName(Candidate key) fd6 Property_No, RentStart  Customer_No, CName, RentFinish(Candidate key)

  17. Customer_No Property_No CName PAddress RentStart RentFinish Rent Owner_No OName fd1 (Primary key) fd2 (Partial dependency) (Partial dependency) fd3 fd6 (Transitive dependency) fd5 (Candidate key) fd6 (Candidate key)

  18. Second Normal Form (2NF) : A relation that is in the first normal form and every non-primary key attribute is fully functionally dependenton the primary key. Full functional : Indicates that if A and B are attributes of a relation, B is fully functionally dependent dependency on A if B is functionally dependent on A, but not on any proper subset of A. ถ้า B เป็น Non-Key attribute ซึ่งมีฟังก์ชั่นการขึ้นต่อกันอยู่กับส่วนใดส่วนหนึ่งของคีย์หลัก เราจะเรียกว่า B partial dependence on A. Partial dependency ต้องถูกขจัดออกโดยการแยก ออกไปตั้งเป็นตารางใหม่ เพื่อให้ Non-Key attribute ตัวนี้ fully dependent on คีย์หลัก OName Customer_No Property_No CName PAddress RentStart RentFinish Rent Owner_No fd1 (Primary key) fd2 (Partial dependency) fd3 (Partial dependency)

  19. Customer (Customer_No, CName) Rental (Customer_No, Property_No, RentStart, RentFinish) Property_Owner (Property_No, PAddress, Rent, Owner_No, OName) Rental Relation Customer Relation Property-Owner Relation 2NF applies to relations with composite keys, that is, relations with a primary key composed of two or more attributes. A relation with a single attribute primary key is automatically in at least 2NF.

  20. Transitive dependency Customer (Customer_No, CName) Rental (Customer_No, Property_No, RentStart, RentFinish) Property_Owner (Property_No, PAddress, Rent, Owner_No, OName) Transitive dependency Property-Owner Relation Rental Relation Customer Relation

  21. Transitive dependency : A condition where A, B, and C are attributes of a relation such that if A  B and B  C, then C is transitively dependent on A via B (provided that A is not functionally dependent on B or C). Definition of Third Normal Form: A relation that is in first and second normal form, and in which no non-primary key attribute is transitively dependent on the primary key. Customer (Customer_No, CName) Rental (Customer_No, Property_No, RentStart, RentFinish) Property_Owner (Property_No, PAddress, Rent, Owner_No, OName) Property-for-Rent Relation Owner Relation

  22. Customer_Rental Relation Customer (Customer_No, CName) Rental (Customer_No, Property_No, RentStart, RentFinish) Property (Property_No, PAddress, Rent, Owner_No) Owner (Owner_No, Oname)

  23. Customer_Rental 1NF Property_Owner 2NF Customer Rental Property_for_Rent Owner 3NF Rental Relation Customer Relation Property-Owner Relation Owner Relation

  24. From 3NF to Boyce-Codd Normal Form (BCNF) BCNF is based on functional dependencies that take into account all candidate keys in a relation. For a relation with only one candidate key, 3NF and BCNF are equivalent. The difference between 3NF and BCNF is that for a functional dependency AB, 3NF allows this dependency in a relation if B is a primary-key attribute and A is not a candidate key. Whereas, BCNF insists that for this dependency to remain in a relation, A must be a candidate key. Therefore, BCNF is a stronger form of 3NF, such every relation in BCNF is also in 3NF. Boyce-Codd :A relation is in BCNF if and only if every determinant is normal form (BCNF)a candidate key. • Violation of BCNF is quite rare, since it may only happen under specific conditions. The potential to violate BCNF may occur in relation that • contains two (or more) composite candidate keys and • which overlap, that is share at least one attribute in common

  25. Case Study In this example, Client_Interview relation is presented. It contains details of the arrangements for interviews of clients by members of staff of the DreamHome company. The members of staff involved in interviewing clients are allocated to a specific room on the day of interview. However, a room may be allocated to several members of staff as required throughout a working day. A client is only interviewed once on a given date, but may be requested to attend further interviews at later dates. This relation has three candidate keys: (Client_No, Interview_Date), (Staff_No, Interview_Date, Interview_Time), and (Room_No, Interview_Date, Interview_Time). Therefore the Client_Interview relation has three composite candidate keys, which overlap by sharing the common attribute Interview_Date. We select Client_No, Interview_Date) to act as the primary key for this relation.

  26. Client_Interview (Client_No, Inverview_Date, Interview_Time, Staff_No, Room_No) The Client_Interview relation has the following functional dependencies : Fd1 Client_No, Interview_Date Interview_Time, Staff_No, Room_No (Primary key) Fd2 Staff_No, Interview_Date, Interview_Time  Client_No(Candidate key) Fd3 Room_No, Interview_Date, Interview_Time  Staff_No,Client_No(Candidate key) Fd4 Staff_No, Interview_Date  Room_No Client_Interview Relation

  27. Interview (Client_No, Interview-Date, Interview_Time, Staff_No) Staff_Room (Staff_No, Interview-Date, Room_No) Interview Relation Staff_Room Relation

  28. Review of Normalization (1NF to BCNF) The DreamHome company manages property on behalf of the owners, and as part of this service the company undertakes regular inspections of the property by members of staff. When staff are required to undertake these inspections, they are allocated a company car for use on the day of the inspections. However, a car may be allocated to several members of staff, as required throughout the working day. A member of staff may inspect several properties on a given date, but a property is only inspected once on a given date. Property_Inspection Relation Property_Inspection (Property_No, PAddress, IDate, ITime, Comments, Staff_No, SName, OName)

  29. 1NF : Property_Inspection Relation Property_Inspection (Property_No, IDate, ITime, PAddress,Comments, Staff_No, SName, OName) FD1 (Primary key) (Partial dependency) FD2 (Transitive dependency) FD3 FD4 (Candidate key) FD5 FD6 (Candidate key)

  30. FD1 (Primary key) (Partial dependency) FD2 Remove Partial dependency (decompose the relation) to obtain 2NF Property Relation Property_Inspection Relation

  31. Property Relation (Property_No, PAddress) Property_Inspection Relation FD1 (Primary key) FD3 (Transitive dependency) FD4 (Candidate key) FD5 (Candidate key) FD6

  32. Property Relation Remove Transitive dependency (decompose the relation) to obtain 3NF Staff Relation Property_Inspection Relation

  33. Staff Relation Property Relation Remove remaining anomalies from functional dependencies to obtain BCNF Property_Inspection Relation Staff_Car (Staff_No, IDate, Car_Reg) Inspection (Property_No, IDate, ITime, Comments, Staff_No)

  34. From BCNF to Fourth Normal Form (4NF) Although BCNF removes any anomalies due to functional dependencies, further research led to the identification of another type of dependency called multi-valued dependency (MVD), which can cause similar design problems for relations in terms of data redundancy. ตารางต่อไปนี้เป็น BCNF แต่ยังเกิดปัญหา update anomalies Lect_Sub_Research Relation

  35. Multi-valued : Represents a dependency between attributes (for example, A, dependency B, and C) in a relation, such that for each value of A there is a (MVD) set of values for B, and a set of values for C. However, the set of values for B and C are independent of each other. Lecturer > Subject Lecturer > Research A > B A > C Lec_Sub_Research Relation Lec_Sub Relation Lec_Research Relation

  36. Unnormalized form (UNF) Remove repeating groups First normal form (1NF) Remove partial dependencies Second normal form (2NF) Remove transitive dependencies Third normal form (3NF) Remove remaining anomalies From functional dependencies Boyce-Codd form (BCNF) Remove multi-valued dependencies Fourth normal form (4NF)

More Related