1 / 61

Logical Database Design and the Relational Database

Logical Database Design and the Relational Database. Professor Chen School of Business Administration Gonzaga University Spokane, WA 99258. What is a Relation?. A relation is a named , two-dimensional table of data.

Download Presentation

Logical Database Design and the Relational Database

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Logical Database Design andthe Relational Database Professor Chen School of Business Administration Gonzaga University Spokane, WA 99258

  2. What is a Relation? • A relation is a named, two-dimensional table of data. • Each relation consists of a set of named columns and an arbitrary number of unnamed rows. • Not all table are relations

  3. Emp_ID Name Dept_Name Salary 100 Margaret Simpson Marketing 48,000 140 Allen Beeton Accounting 52,000 110 Chris Lucero Info. System 43,000 190 Lorenzo Davis Finance 55,000 150 Susan Martin Marketing 42,000 Fig. 6-1: EMPLOYEE1 Relation with sample data EMPLOYEE1

  4. Relational Definitions • Relation • Every relation has a unique name. • Every attribute value is atomic (singled-value) (Fig. 6-1) • Every row is unique. • Attributes in tables have unique names. • The order of the columns is irrelevant. • The order of the rows is irrelevant.

  5. Emp_ID Name Dept_Name Salary Course_ Date_ Title Completed 100 Margaret Simpson Marketing 48,000 SPSS 6/19/199X Surveys 10/7/199X 140 Allen Beeton Accounting 52,000 Tax Acc 12/8/199X 110 Chris Lucero Info. System 43,000 SPSS 1/12/199X C++ 4/22/199X 190 Lorenzo Davis Finance 55,000 150 Susan Martin Marketing 42,000 SPSS 6/16/199X Java 8/12/199X Fig. 6-2: Eliminating multi-valued attributes (a) Table with repeating groups (Un-Normalized)

  6. Emp_ID Name Dept_Name Salary Course_ Date_ Title Completed 100 Margaret Simpson Marketing 48,000 SPSS 6/19/199X Surveys 10/7/199X 140 Allen Beeton Accounting 52,000 Tax Acc 12/8/199X 110 Chris Lucero Info. System 43,000 SPSS 1/12/199X C++ 4/22/199X 190 Lorenzo Davis Finance 55,000 150 Susan Martin Marketing 42,000 SPSS 6/16/199X Java 8/12/199X Emp_ID Name Dept_Name Salary Course_ Date_ Title Completed 100 Margaret Simpson Marketing 48,000 SPSS 6/19/199X 100 Margaret Simpson Marketing 48,000 Surveys 10/7/199X 140 Allen Beeton Accounting 52,000 Tax Acc 12/8/199X 110 Chris Lucero Info. System 43,000 SPSS 1/12/199X 110 Chris Lucero Info. System 43,000 C++ 4/22/199X 190 Lorenzo Davis Finance 55,000 150 Susan Martin Marketing 42,000 SPSS 6/16/199X 150 Susan Martin Marketing 42,000 Java 8/12/199X

  7. Emp_ID Name Dept_Name Salary Course_ Date_ Title Completed 100 Margaret Simpson Marketing 48,000 SPSS 6/19/199X 100 Margaret Simpson Marketing 48,000 Surveys 10/7/199X 140 Allen Beeton Accounting 52,000 Tax Acc 12/8/199X 110 Chris Lucero Info. System 43,000 SPSS 1/12/199X 110 Chris Lucero Info. System 43,000 C++ 4/22/199X 190 Lorenzo Davis Finance 55,000 150 Susan Martin Marketing 42,000 SPSS 6/16/199X 150 Susan Martin Marketing 42,000 Java 8/12/199X Fig. 6-2: Eliminating multi-valued attributes (b) EMPLOYEE2 Relation (Normalized) EMPLOYEE2

  8. Relational Keys and Structures • Primary Key • Composite Key • Foreign Key • One-to-Many Relationship • Many-to-Many Relationship • Intersection Data • Candidate Key • Surrogate Key

  9. Fig. 6-3: Schema for four relations (Pine Valley Furniture) Graphical Representation

  10. Fig. 6-3: Schema for four relations (Pine Valley Furniture) Graphical and Text Representations CUSTOMER(Customer_ID, Customer_name,Address, City,State,Zip) ORDER(Order_ID, Order_Date,Customer_ID) ORDER_LINE(Order_ID, Product_ID,Quantity) PRODUCT(Product_ID, Product_Description, Product_Finish,Unit_Price, On_Hand)

  11. Integrity Constraints • Domain Constraints • Allowable values for an attribute. • A domain definition contains: domain name, data type, size, meaning, and allowable values/range (if applicable). • Entity Integrity • No primary key attribute may be null. • Operational Constraints • Business rules (see Chapter 4)

  12. Fig. 6-5: Referential integrity constraints (Pine Valley Furniture) pk fk pk ck/pk fk fk pk

  13. Referential Integrity (Addition and Deletion) PK FK

  14. Referential Integrity (Summary) • Addition - you can’t add (insert) an ORDER record if Customer_ID (FK) does not exist (or does not match) a Customer_ID (PK) in the CUSTOMER table. • Deletion- you can’t delete a CUSTOMER record if there is (are) related Customer_ID in the ORDER table.

  15. Integrity Constraints • Referential Integrity • A rule that states that either each foreign key value must match a primary key value in the other relation or else the foreign key value must be null. • For example: Delete Rules • Restrict • Cascade • Set-to-Null

  16. Referential Integrity (Conclusion) pk common field fk pk

  17. Referential Integrity (Conclusion)

  18. Referential Integrity (Conclusion)

  19. Referential Integrity (Conclusion)

  20. Order of Entering Data and Referential Integrity LOCATION FACULTY STUDENT

  21. Well-Structured Relations • A well-structured relation contains minimal redundancy and allows users to insert, modify, and delete the rows in a table without errors or inconsistencies. • The following anomalies should be removed for a well-structured relation: • Insertion Anomaly • Deletion Anomaly • Modification Anomaly

  22. Emp_ID Name Dept_Name Salary Course_ Date_ Title Completed 100 Margaret Simpson Marketing 48,000 SPSS 6/19/199X 100 Margaret Simpson Marketing 48,000 Surveys 10/7/199X 140 Allen Beeton Accounting 52,000 Tax Acc 12/8/199X 110 Chris Lucero Info. System 43,000 SPSS 1/12/199X 110 Chris Lucero Info. System 43,000 C++ 4/22/199X 190 Lorenzo Davis Finance 55,000 150 Susan Martin Marketing 42,000 SPSS 6/16/199X 150 Susan Martin Marketing 42,000 Java 8/12/199X Is EMPLOYEE2 a Well-Structured relation? EMPLOYEE2

  23. Is EMPLOYEE2 a Well-Structured relation? NO! WHY?

  24. Insertion Anomaly: Inserting a new row, the user must supply values for both Emp_ID (PK) and Course_Title (CK and FK). This is an (insertion) anomaly, since the user should be able to enter employee data without knowing (supplying) course (title) data. EMPLOYEE2 Emp_ID Name Dept_Name Salary Course_ Date_ Title Completed 100 Margaret Simpson Marketing 48,000 SPSS 6/19/199X 100 Margaret Simpson Marketing 48,000 Surveys 10/7/199X 140 Allen Beeton Accounting 52,000 Tax Acc 12/8/199X 110 Chris Lucero Info. System 43,000 SPSS 1/12/199X 110 Chris Lucero Info. System 43,000 C++ 4/22/199X 190 Lorenzo Davis Finance 55,000 150 Susan Martin Marketing 42,000 SPSS 6/16/199X 150 Susan Martin Marketing 42,000 Java 8/12/199X

  25. Deletion Anomaly: Deleting the employee number 140, it results in losing not only the employee’s information but also the course had an offering that completed on that date. EMPLOYEE2 Emp_ID Name Dept_Name Salary Course_ Date_ Title Completed 100 Margaret Simpson Marketing 48,000 SPSS 6/19/199X 100 Margaret Simpson Marketing 48,000 Surveys 10/7/199X 140 Allen Beeton Accounting 52,000 Tax Acc 12/8/199X 110 Chris Lucero Info. System 43,000 SPSS 1/12/199X 110 Chris Lucero Info. System 43,000 C++ 4/22/199X 190 Lorenzo Davis Finance 55,000 150 Susan Martin Marketing 42,000 SPSS 6/16/199X 150 Susan Martin Marketing 42,000 Java 8/12/199X

  26. Modification Anomaly: If the employee number 100 gets a salary increase, we must record the increase in each of the rows for that employee (two occurences); otherwise the data will be inconsistent. EMPLOYEE2 Emp_ID Name Dept_Name Salary Course_ Date_ Title Completed 100 Margaret Simpson Marketing 48,000 SPSS 6/19/199X 100 Margaret Simpson Marketing 48,000 Surveys 10/7/199X 140 Allen Beeton Accounting 52,000 Tax Acc 12/8/199X 110 Chris Lucero Info. System 43,000 SPSS 1/12/199X 110 Chris Lucero Info. System 43,000 C++ 4/22/199X 190 Lorenzo Davis Finance 55,000 150 Susan Martin Marketing 42,000 SPSS 6/16/199X 150 Susan Martin Marketing 42,000 Java 8/12/199X

  27. Emp_ID Name Dept_Name Salary Course_ Date_ Title Completed 100 Margaret Simpson Marketing 48,000 SPSS 6/19/199X 100 Margaret Simpson Marketing 48,000 Surveys 10/7/199X 140 Allen Beeton Accounting 52,000 Tax Acc 12/8/199X 110 Chris Lucero Info. System 43,000 SPSS 1/12/199X 110 Chris Lucero Info. System 43,000 C++ 4/22/199X 190 Lorenzo Davis Finance 55,000 150 Susan Martin Marketing 42,000 SPSS 6/16/199X 150 Susan Martin Marketing 42,000 Java 8/12/199X Emp_ID Name Dept_Name Salary 100 Margaret Simpson Marketing 48,000 140 Allen Beet Accounting 52,000 110 Chris Lucero Info. System 43,000 190 Lorenzo Davis Finance 55,000 150 Sususan Martin Marketing 42,000 Emp_IDCourse_ Date_ Title Completed 100 SPSS 6/19/199X 100 Surveys 10/7/199X 140 Tax Acc 12/8/199X 110 SPSS 1/12/199X 110 C++ 4/22/199X 150 SPSS 6/19/199X 150 Java 8/12/199X Fig. 6-7: Normalized Relations from EMPLOYEE2 EMPLOYEE2 EMPLOYEE1 EMP_COURSE

  28. Normalization • Normalization is the process of decomposing relations with anomalies to produce smaller, well-structured and stable relations.

  29. Data Normalization • Normalization is a formal process for deciding which attributes should be grouped together in a relation (see Fig.6-7 next). • Primarily a tool to validate and improve a logical design so that it satisfies certain constraints that avoid unnecessary duplication of data.

  30. Emp_ID Name Dept_Name Salary Course_ Date_ Title Completed 100 Margaret Simpson Marketing 48,000 SPSS 6/19/199X 100 Margaret Simpson Marketing 48,000 Surveys 10/7/199X 140 Allen Beeton Accounting 52,000 Tax Acc 12/8/199X 110 Chris Lucero Info. System 43,000 SPSS 1/12/199X 110 Chris Lucero Info. System 43,000 C++ 4/22/199X 190 Lorenzo Davis Finance 55,000 150 Susan Martin Marketing 42,000 SPSS 6/16/199X 150 Susan Martin Marketing 42,000 Java 8/12/199X Emp_ID Name Dept_Name Salary 100 Margaret Simpson Marketing 48,000 140 Allen Beet Accounting 52,000 110 Chris Lucero Info. System 43,000 190 Lorenzo Davis Finance 55,000 150 Sususan Martin Marketing 42,000 Emp_IDCourse_ Date_ Title Completed 100 SPSS 6/19/199X 100 Surveys 10/7/199X 140 Tax Acc 12/8/199X 110 SPSS 1/12/199X 110 C++ 4/22/199X 150 SPSS 6/19/199X 150 Java 8/12/199X Fig. 6-7: Normalized Relations from EMPLOYEE2 EMPLOYEE2 EMPLOYEE1 EMP_COURSE

  31. Functional Dependencies and Keys • Functional Dependency: The value of one attribute (the determinant) determines the value of another attribute. • Candidate Key:An attribute, or combination of attributes, that uniquely identifies a row in a relation. • A candidate key is always a determinant, while a determinant may or may not be a candidate key. • Each non-key field is functionally dependent on every candidate key.

  32. Fig. 6-22: Steps in normalization

  33. First Normal Form • Normal form is a state of a relation that results from applying simple rules regarding functional dependencies (or relationships between attributes) to that relation. • No multi-valued attributes. • Every attribute value is atomic. • Fig. 6-2a, b.

  34. Emp_ID Name Dept_Name Salary Course_ Date_ Title Completed 100 Margaret Simpson Marketing 48,000 SPSS 6/19/199X Surveys 10/7/199X 140 Allen Beeton Accounting 52,000 Tax Acc 12/8/199X 110 Chris Lucero Info. System 43,000 SPSS 1/12/199X C++ 4/22/199X 190 Lorenzo Davis Finance 55,000 150 Susan Martin Marketing 42,000 SPSS 6/16/199X Java 8/12/199X Emp_ID Name Dept_Name Salary Course_ Date_ Title Completed 100 Margaret Simpson Marketing 48,000 SPSS 6/19/199X 100 Margaret Simpson Marketing 48,000 Surveys 10/7/199X 140 Allen Beeton Accounting 52,000 Tax Acc 12/8/199X 110 Chris Lucero Info. System 43,000 SPSS 1/12/199X 110 Chris Lucero Info. System 43,000 C++ 4/22/199X 190 Lorenzo Davis Finance 55,000 150 Susan Martin Marketing 42,000 SPSS 6/16/199X 150 Susan Martin Marketing 42,000 Java 8/12/199X

  35. Second Normal Form • 1NF and every non-key attribute is fully functionally dependent on the primary key. • Every non-key attribute must be defined by the entire key (either a single PK or a CK), not by only part of the key. • No partial functional dependencies. • Fig. 6-1,6-23a

  36. Fig. 6-23: A Process of 1NF to 2NF (EMPLOYEE2 - - 1NF) Emp_ID Course_Title Name Dept_Name Salary Date_Completed Emp_ID Name Dept_Name Salary Course_ Date_ Title Completed 100 Margaret Simpson Marketing 48,000 SPSS 6/19/199X 100 Margaret Simpson Marketing 48,000 Surveys 10/7/199X 140 Allen Beeton Accounting 52,000 Tax Acc 12/8/199X 110 Chris Lucero Info. System 43,000 SPSS 1/12/199X 110 Chris Lucero Info. System 43,000 C++ 4/22/199X 190 Lorenzo Davis Finance 55,000 150 Susan Martin Marketing 42,000 SPSS 6/16/199X 150 Susan Martin Marketing 42,000 Java 8/12/199X (b) Functional Dependencies in EMPLOYEE2 Partial Depend.

  37. Fig. 6-23: A Process of 1NF to 2NF Emp_ID Name Dept_Name Salary 100 Margaret Simpson Marketing 48,000 140 Allen Beeton Accounting 52,000 110 Chris Lucero Info. System 43,000 190 Lorenzo Davis Finance 55,000 150 Susan Martin Marketing 42,000 Emp_ID Name Dept_Name Salary EMPLOYEE1 (a) Functional dependencies in EMPLOYEE1 (2NF)

  38. Emp_ID Name Dept_Name Salary Emp_ID Course_Title Date_Completed Fig. 6-23: Summary on Normalization EMPLOYEE2 Emp_ID Course_Title Name Dept_Name Salary Date_Completed Partial Depend. EMPLOYEE1 EMP_COURSE 2NF 3NF ?

  39. Emp_ID Name Dept_Name Salary Course_ Date_ Title Completed 100 Margaret Simpson Marketing 48,000 SPSS 6/19/199X 100 Margaret Simpson Marketing 48,000 Surveys 10/7/199X 140 Allen Beeton Accounting 52,000 Tax Acc 12/8/199X 110 Chris Lucero Info. System 43,000 SPSS 1/12/199X 110 Chris Lucero Info. System 43,000 C++ 4/22/199X 190 Lorenzo Davis Finance 55,000 150 Susan Martin Marketing 42,000 SPSS 6/16/199X 150 Susan Martin Marketing 42,000 Java 8/12/199X Emp_IDCourse_ Date_ Title Completed 100 SPSS 6/19/199X 100 Surveys 10/7/199X 140 Tax Acc 12/8/199X 110 SPSS 1/12/199X 110 C++ 4/22/199X 150 SPSS 6/19/199X 150 Java 8/12/199X Emp_ID Name Dept_Name Salary 100 Margaret Simpson Marketing 48,000 140 Allen Beet Accounting 52,000 110 Chris Lucero Info. System 43,000 190 Lorenzo Davis Finance 55,000 150 Sususan Martin Marketing 42,000 Fig. 6-23: Summary on Normalization EMPLOYEE2 (1NF) EMPLOYEE1 (3NF) EMP_COURSE (3NF)

  40. Third Normal Form • 2NF and no transitive dependencies (functional dependency between non-key attributes.) • Fig. 6-23, 6-24, 6-25.

  41. Fig. 6-24: Relation with transitive dependency (a) SALES relation with simple data

  42. What Anomalies might be in SALES relation? WHY? Because it is … Not in the 3NF (transitive dependency) • Insertion anomaly ? • Deletion anomaly ? • Modification anomaly ?

  43. Fig. 6-24: (b) Transitive dependency in SALES relation Cust_ID ---> Name,Salesperson, Region and Salesperson ---> Region therefore ... Cust_ID ---> Region

  44. Fig. 6-25: Removing a transitive dependency (a) Decomposing the SALES relation

  45. Fig. 6-25: (b) Relations in 3NF

  46. Snum Origin Destination Distance 409 Seattle Denver 1,537 618 Chicago Dallas 1,058 723 Boston Atlanta 1,214 824 Denver Las Angeles 1,150 629 Minneapolis St. Louis 587 353 Seattle Denver 1,537 Fig. 6-26: Another example with transitive dependencies Insertion anomaly? Deletion anomaly? Modification anomaly?

  47. Break !

  48. SHIPMENT Snum Origin Destination Distance Snum Origin Destination 409 Seattle Denver 618 Chicago Dallas 723 Boston Atlanta 824 Denver Las Angeles 629 Minneapolis St. Louis 353 Seattle Denver Origin Destination Distance Seattle Denver 1,537 Chicago Dallas 1,058 Boston Atlanta 1,214 Denver Las Angeles 1,150 Minneapolis St. Louis 587 Seattle Denver 1,537 Fig. 6-26: Another example with transitive dependencies

  49. Fig. 6-22: Steps in normalization

More Related