1 / 33

IT 20303

IT 20303. The Relational DBMS Section 05. Relational Database Theory. Normalization for Logical Database Design. Relational Database Theory. Normalization Process of analyzing a grouping of data items Based on inherent characteristics Often applied to existing files or databases.

selima
Download Presentation

IT 20303

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. IT 20303 • The Relational DBMS • Section 05

  2. Relational Database Theory • Normalization for Logical Database Design

  3. Relational Database Theory • Normalization • Process of analyzing a grouping of data items • Based on inherent characteristics • Often applied to existing files or databases

  4. Relational Database Theory • Normalization • Principles • Data items belong together in a logical group • Group of items can be identified by own unique identifier

  5. Relational Database Theory • Normalization • Data in the group describes one, and only one, thing • A Bottom-Up approach

  6. Relational Database Theory • Why Normalize • Avoid update anomalies • Nasty side effects • Minimize storage of redundant data • Support simpler logic for manipulating data

  7. Relational Database Theory • Why Not Normalize • Data is never (very rarely) updated • Data warehouse system is seldom normalized

  8. Relational Database Theory • Sample Data Not Normalized

  9. Relational Database Theory • How to Normalize Data using Functional Dependencies • Definition of Functional Dependency • Given a relation R, attribute Y of R is functionally dependent on attribute X of R, if and only if each X value in R has associated with it precisely one Y-value in R (at any one time)

  10. Relational Database Theory • Y of R is Dependent on X of R • X (-->)functionally determines Y X Y

  11. Relational Database Theory • Functional Dependency Diagram of Hospital Ward Example Patient Name Patient No Date of Birth Ward Type Ward Name No of Beds Senior Nurse

  12. Relational Database Theory • Table structure based on FD Diagram WARD PATIENT

  13. Relational Database Theory • Normalization using Codd’s Rules • Origin • Early enthusiasts wanted to use relational theory • Sought rules for structuring data in relational model

  14. Relational Database Theory • Normalization using Codd’s Rules • Codd and contemporaries developed rules for “Normal Forms” • 1NF • 2NF • 3NF • Normal levels to do in database design • Boyce/Codd NF – 3.5NF • 4NF • 5NF

  15. Relational Database Theory • Customer-Order-Line Item Example • Assume an existing order-entry program and data file:

  16. Relational Database Theory • 1NF – Break out repeating groups ORDER ORDER LINEITEM

  17. Relational Database Theory • 2NF- Break out attributes dependent on part of the primary key LINEITEM LINEITEM PRODUCT ORDER

  18. Relational Database Theory • 3NF- Break out attributes wholly dependent on another key ORDER CUSTOMER ORDER LINEITEM PRODUCT

  19. Relational Database Theory • Rules for 1NF, 2NF, & 3NF • 1NF • Break out repeating groups into a separate entity • 2NF • Break out attributes that are dependent on part of the primary key into a separate entity • Called Partial Dependency • 3NF • Break out attributes that are wholly dependent on another key (not PK) into a separate entity • Called Transitive Dependency

  20. Relational Database Theory • Normalization • A relation R is in 3rd Normal Form (3NF) if and only if the non-key attributes of R (if any) are: • Mutually independent, and • Fully dependent on the primary key of R

  21. Relational Database Theory • Normalization Cont’d • A relation is in 3NF if all the attributes are functionally dependent • On the Key • On the Whole Key, and • On Nothing but the Key • (So Help Me Codd)

  22. Relational Database Theory • Reconcile differences between the Data Model and Normalized Data Structures • Data model and normalized data structures must be reconciled • Discard data items from old files that are no longer needed • Calculation fields • Redundant fields • Resolve discrepancies in data item names • Ensure that new fields are really necessary • Use standard naming conventions

  23. Relational Database Theory • Example 01: • What happens when a part has more than four suppliers? • What happens when a supplier is dropped? • How do you query the parts with two or more suppliers? • Normalized Table:

  24. Relational Database Theory • Example 02: Normalize this table

  25. End Section

  26. Relational Database Theory • Multiply ways to Normalize Data • Normalization can be accomplished in different ways • Well-formed E-R model is normalized • Functional dependencies • Codd’s Rules for 1NF, 2NF, & 3NF • Discrepancies indicate something is missing or changed • One approach validates or checks another approach

  27. Relational Database Theory • Impact of Normalization • Improve the integrity of data • Purpose is to eliminate update anomalies • Minimize storage of redundant data • Reduce the complexity of programming logic • Emphasis now is on maintainability, simplicity of program • Normalized data can minimize complexity of code that manipulates the data • Enhance the stability, “goodness” of database design • Normalized data tends to be easier to understand • Normalized data can be used by many different applications more easily

  28. Relational Database Theory • Impact of Normalization on Performance • Concern that a large number of tables-and table joins-will result in poor performance • Join can be a very expensive operation • Test to determine frequency of joins, number of tables joined • After database is created and available

  29. Relational Database Theory • Impact of Normalization on Performance Cont’d • Requirements for application performance, response time dictate corrective actions • Performance addressed in section on physical database design • There are alternatives to de-normalizing data to improve performance

  30. Relational Database Theory • Recommendations for Data that is Updated • First Normalize • Don’t be dismayed by too many tables • Normalization increases number of tables but improves logic • Normalization is a helpful logical database design technique…for any DBMS

  31. Relational Database Theory • Objective of the design process is a “Good” design • The logical database design process • Is well understood • Uses complementary techniques • Can be automated with CASE tools

  32. Relational Database Theory • Objective of the design process is a “Good” design cont’d • A “Good” database design • Contains all the important entities and data items • Has stable primary keys • Identifies clearly all relationships • Has table structures in 3NF • Is understood by designers and users • Accurately models the real world, as described in the requirements

  33. Relational Database Theory • Questions?

More Related