week 11 november 7 l.
Download
Skip this Video
Loading SlideShow in 5 Seconds..
Week 11 November 7 PowerPoint Presentation
Download Presentation
Week 11 November 7

Loading in 2 Seconds...

play fullscreen
1 / 76

Week 11 November 7 - PowerPoint PPT Presentation


  • 302 Views
  • Uploaded on

Week 11 November 7. Data Normalization and ERD Conceptual, Logical and Physical Database Design. Data Normalization. The purpose of normalization is to produce a stable set of relations that is a faithful model of the operations of the enterprise. Achieve a design that is highly flexible

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'Week 11 November 7' - issac


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
week 11 november 7

Week 11November 7

Data Normalization and ERD

Conceptual, Logical and Physical Database Design

data normalization
Data Normalization
  • The purpose of normalization is to produce a stable set of relations that is a faithful model of the operations of the enterprise.
    • Achieve a design that is highly flexible
    • Reduce redundancy
    • Ensure that the design is free of certain update, insertion and deletion anomaliesCatherine Richardo, 1990
normalization
Normalization

1NF

Flat file

2NF

Partial dependencies removed

3NF

Transitive dependencies removed

BCNF

Every determinant is a candidate key

Non-tivial multi-valued dependencies removed

4NF

slide4

0000 000 0000 0

John Smith

10001

Order No.

Stereos To Go

Invoice

6 15 99

Date: / /

Go, Hogs

0000-000-0000-0

Account No.

John Smith

Customer:

2036-26 Street

1/05

Address:

Sacramento CA 95819

City

State

Zip Code

6 18 99

Date Shipped: / /

Product

Item

Product Description/Manufacturer

Price

Qty

Code

Number

SAGX730 Pioneer Remote A/V Receiver

1

1

1

1

56995

35995

39995

2

AT10 Cervwin Vega Loudspeakers

CDPC725 Sony Disc-Jockey CD Changer

3

4

5

132985

10000

10306

153291

Subtotal

Shipping & Handling

Sales Tax

Total

unnormalized relation
Unnormalized Relation

(Invoice_number, Invoice_date, Date_delivered, Cust_account Cust_name Cust_addr Cust_city Cust_state Zip_code,

Item1 Item1_descrip Item1_qty Item1_price,

Item2 Item2_descrip Item2_qty Item2_price, . . . ,

Item7 Item7_descrip Item7_qty Item7_price)

How would a program process the data to recreate the invoice?

unnormalized to 1nf
Unnormalized to 1NF

(Invoice_number, Invoice_date, Date_delivered, Cust_account Cust_name Cust_addr Cust_city Cust_state Zip_code,

Item1, Item1_descrip, Item1_qty, Item1_price,

Item2, Item2_descrip, Item2_qty, Item2_price, . . . ,

Item7, Item7_descrip, Item7_qty, Item7_price)

Repeating groups

A flat file places all the data of a transaction into a single record.

This is reminiscent of a COBOL or BASIC program processing a single transaction with one read statement.

unnormalized to 1nf7
Unnormalized to 1NF

(Invoice_number, Invoice_date, Date_delivered, Cust_account, Cust_name, Cust_addr, Cust_city, Cust_state, Zip_code,

Item, Item_descrip, Item_qty, Item_price)

Nominated group of attributes to serve as the key

(form a unique combination)

  • Eliminate the repeating groups.
  • Each row retains data for one item.
  • If a person bought 5 items, we would have five tuples
slide8

10001 123456 John Smith ••• SAGX730 Pioneer Remote A/V Rec 1 569.95

10001 123456 John Smith ••• AT10 Cerwin Vega Loudspeakers 1 359.95

10001 123456 John Smith ••• CDPC725 Sony Disc Jockey CD 1 399.95

10001 123456 John Smith ••• S/H Shipping 1 100.00

10001 123456 John Smith ••• Tax Sales Tax 1 103.06

1NF

Flat File

Invoice number

Account number

Customer name

Item Quantity

Item Price

Item

Description

from 1nf
From 1NF

(Invoice_number, Invoice_date, Date_delivered, Cust_account, Cust_name, Cust_addr, Cust_city, Cust_state, Zip_code,

Item, Item_descrip, Item_qty, Item_price)

Functional dependencies and determinants

Example: item_descrip is functionally dependent on item, such that item is the determinant of item_descript.

from 1nf to 2nf
From 1NF to 2NF

(Invoice_number, Invoice_date, Date_delivered, Cust_account, Cust_name, Cust_addr, Cust_city, Cust_state, Zip_code)

(Item, Item_descrip, Item_qty, Item_price)

Is this unique by itself?

What happens if the item is purchased more than once?

from 1nf to 2nf11
From 1NF to 2NF

(Invoice_number, Invoice_date, Date_delivered, Cust_account, Cust_name, Cust_addr, Cust_city, Cust_state, Zip_code)

Partial dependency

(Invoice_number,Item, Item_descrip, Item_qty, Item_price)

Composite key (forms a unique combination)

from 1nf to 2nf12
From 1NF to 2NF

(Invoice_number, Invoice_date, Date_delivered, Cust_account, Cust_name, Cust_addr, Cust_city, Cust_state, Zip_code)

(Invoice_number,Item, Item_qty, Item_price)

(Item, Item_descrip)

from 2nf to 3nf
From 2NF to 3NF

(Invoice_number, Invoice_date, Date_delivered, Cust_account, Cust_name, Cust_addr, Cust_city, Cust_state, Zip_code)

(Invoice_number,Item, Item_qty, Item_price)

(Item, Item_descrip)

Which attributes are dependent on others?

Is there a problem?

transitive dependencies and anomalies
Transitive Dependencies and Anomalies
  • Insertion anomalies
    • To add a new row, all customer (name, address, city, state, zip code, phone) and products (description) must be consistent with previous entries
  • Deletion anomalies
    • By deleting a row, a customer or product may cease to exist
  • Modification anomalies
    • To modify a customer’s or product’s data in one row, all modifications must be carried out to all others
insertion and modification anomalies for example
Insertion and Modification AnomaliesFor example…

Insert a new Panasonic product

Product_code

Manufacturer_name

DVD-A110 Panasonic

PV-4210 Panasonic

PV-4250 Panasonic

CT-32S35 PAN

Inconsistency

DVD-A110 Panasonic

PV-4210 PanaSonic

PV-4250 Pana Sonic

CT-32S35 PAN

Change all Panasonic products’ manufacturer name to “Panasonic USA”

deletion anomaly for example
Deletion AnomalyFor Example…

4377182 John Smith  Sacramento CA 95831

4398711 Arnold S  Davis CA 95691

4578461 Gray Davis  Sacramento CA 95831

4873179 Lisa Carr  Reno NV 89557

By deleting customer Arnold S, we would also be deleting Davis, California.

slide17

Invoice_number

Invoice_date

Date_delivered

Cust_account

Cust_name

Cust_addr

Cust_city

Cust_state

Zip_code

Item

Item_descrip

Invoice_number+Item

Item_qty

Item_price

Transitive Dependencies

  • A condition where A, B, C are attributes of a relation such that if A  B and B  C, then C is transitively dependent on A via B (provided that A is not functionally dependent on B or C).
why should city and state be separated from customer relation
Why Should City and State Be Separated from Customer Relation?
  • City and state are dependent on zip code for their values and not the customer’s identifier (i.e., key). Zip_code  City, State
  • Otherwise, Cust_account  Cust_addr, Zip_code  City, StateIn which case, you have transitive dependency.
slide19
3NF

Invoice Relation

(Invoice_number, Invoice_date, Date_delivered, Cust_account)

Customer Relation

(Cust_account, Cust_name, Cust_addr, Zip_code)

Zip_code Relation

(Zip_code, City, State)

Invoice_items Relation

(Invoice_number, Item, Item_qty, Item_price)

Items Relation

(Item, Item_descrip)

slide20
3NF

Invoice Relation

(Invoice_number, Invoice_date, Date_delivered, Cust_account)

Customer Relation

(Cust_account, Cust_name, Cust_addr, Zip_code)

Zip_code Relation

(Zip_code, City, State)

Invoice_items Relation

(Invoice_number, Item, Item_qty, Item_price)

Items Relation

(Item, Item_descrip)

Manufacturers Relation

(Manuf_code, Manuf_name)

Since the Items relation contains the manufacturer’s name in the description, a separate Manufacturers relation can be created

first to third normal form 1nf 3nf
First to Third Normal Form(1NF - 3NF)
  • 1NF: A relation is in first normal form if and only if every attribute is single-valued for each tuple (remove the repeating or multi-value attributes and create a flat file)
  • 2NF: A relation is in second normal form if and only if it is in first normal form and the nonkey attributes are fully functionally dependent on the key (remove partial dependencies)
  • 3NF: A relation is in third normal form if it is in second normal form and no nonkey attribute is transitively dependent on the key (remove transitive dependencies)
putting it together

Putting It Together

ERD of the Normalized Data Model

slide24
3NF

Invoice Relation

(Invoice_number, Invoice_date, Date_delivered, Cust_account)

Customer Relation

(Cust_account, Cust_name, Cust_addr, Zip_code)

Zip_code Relation

(Zip_code, City, State)

Invoice_items Relation

(Invoice_number, Item, Item_qty, Item_price)

Items Relation

(Item, Item_descrip, Manuf_code)

Manufacturers Relation

(Manuf_code, Manuf_name)

slide25
ERD

Invoices

Customers

Zip_Codes

Cust_account

Cust_name

Cust_addr

Zip_code

Zip_code

City

State

Invoice_number

Invoice_date

Date_delivered

Cust_account

Invoice_items

Manufacturers

Items

Item

Item_descrip

Manuf_code

Invoice_number

Item

Item_qty

Item_price

Manuf_code

Manuf_name

slide26
ERD

Invoices

Customers

Zip_Codes

Order

Locate

Invoice_number

Invoice_date

Date_delivered

Cust_account

Cust_account

Cust_name

Cust_addr

Zip_code

Zip_code

City

State

(0..*)

(1..1)

(0..*)

(1..1)

(1..1)

 Have

(1..*)

Invoice_items

Manufacturers

Items

Appear on

Produce

Item

Item_descrip

Manuf_code

Invoice_number

Item

Item_qty

Item_price

Manuf_code

Manuf_name

(0..*)

(1..1)

(0..*)

(1..1)

slide27
ERD

Invoices

Customers

Zip_Codes

Order

Locate

Invoice_number

Invoice_date

Date_delivered

Cust_account

Cust_account

Cust_name

Cust_addr

Zip_code

Zip_code

City

State

(0..*)

(1..1)

(0..*)

(1..1)

Partial

(1..1)

Zip codes locate Customers.

() A zip code can be related to a minimum of zero and a maximum of many customers.

() A customer can be related to a minimum and maximum of one zip code.

 Have

(1..*)

Invoice_items

Manufacturers

Items

Appear on

Produce

Item

Item_descrip

Manuf_code

Invoice_number

Item

Item_qty

Item_price

Manuf_code

Manuf_name

(0..*)

(1..1)

(0..*)

(1..1)

slide28
ERD

Invoices

Customers

Zip_Codes

Order

Locate

Invoice_number

Invoice_date

Date_delivered

Cust_account

Cust_account

Cust_name

Cust_addr

Zip_code

Zip_code

City

State

(0..*)

(1..1)

(0..*)

(1..1)

Partial

(1..1)

Customers order (items) on invoices.

() A customer can be related to a minimum of zero and a maximum of many invoices.

() An invoice can be related to a minimum and maximum of one customer.

 Have

(1..*)

Invoice_items

Manufacturers

Items

Appear on

Produce

Item

Item_descrip

Manuf_code

Invoice_number

Item

Item_qty

Item_price

Manuf_code

Manuf_name

(0..*)

(1..1)

(0..*)

(1..1)

slide29
ERD

Invoices

Customers

Zip_Codes

Order

Locate

Invoice_number

Invoice_date

Date_delivered

Cust_account

Cust_account

Cust_name

Cust_addr

Zip_code

Zip_code

City

State

(0..*)

(1..1)

(0..*)

(1..1)

Invoices possess invoice items

() An invoice can be related to a minimum of one and a maximum of many invoice items.

()An invoice item can be related to a minimum and maximum of one invoice.

(1..1)

Mandatory

 Have

(1..*)

Invoice_items

Manufacturers

Items

Appear on

Produce

Item

Item_descrip

Manuf_code

Invoice_number

Item

Item_qty

Item_price

Manuf_code

Manuf_name

(0..*)

(1..1)

(0..*)

(1..1)

slide30
ERD

Invoices

Customers

Zip_Codes

Items are sold on invoice items.

() An item can be related to a minimum of zero and a maximum of many invoice items.

() An invoice item can be related to a minimum and maximum of one item.

Order

Locate

Invoice_number

Invoice_date

Date_delivered

Cust_account

Cust_account

Cust_name

Cust_addr

Zip_code

Zip_code

City

State

(0..*)

(1..1)

(0..*)

(1..1)

(1..1)

 Have

Partial

(1..*)

Invoice_items

Manufacturers

Items

Appear on

Produce

Item

Item_descrip

Manuf_code

Invoice_number

Item

Item_qty

Item_price

Manuf_code

Manuf_name

(0..*)

(1..1)

(0..*)

(1..1)

slide31
ERD

Manufacturers produce items.

() A manufacturer can be related to a minimum of zero and a maximum of many items.

() An item can be related to a minimum and maximum of one manufacturer.

Invoices

Customers

Zip_Codes

Order

Locate

Invoice_number

Invoice_date

Date_delivered

Cust_account

Cust_account

Cust_name

Cust_addr

Zip_code

Zip_code

City

State

(0..*)

(1..1)

(0..*)

(1..1)

(1..1)

 Have

Partial

(1..*)

Invoice_items

Manufacturers

Items

Appear on

Produce

Item

Item_descrip

Manuf_code

Invoice_number

Item

Item_qty

Item_price

Manuf_code

Manuf_name

(0..*)

(1..1)

(0..*)

(1..1)

higher forms of data normalization

Higher Forms of Data Normalization

Boyce-Codd Normal Form (BCNF)

Fourth Normal Form (4NF)

Fifth Normal Form (5NF)

Domain Key Normal Form (DKNF)

boyce codd normal form bcnf
Boyce-Codd Normal Form (BCNF)
  • A relation is in Boyce-Codd normal form if and only if every determinant is a candidate key
  • For a relation with only one candidate key, 3NF and BCNF are equivalent.
  • Usually occurs when keys of different relations overlap

A determines B

Attribute A

Attribute B

Determinant

(B is functionally dependent on A)

bcnf example
BCNF Example

User (UserID, Dept, Name, ComputerID, EmpClassification)

ComputerID  Dept (a department issues a computer)

UserID, Dept  ComputerID, Name, EmpCassification

(Employees may have the same name and UserIDs are unique within the department only)

UserID, ComputerID  Dept, Name, EmpClassification

BCNF

UserComputer (ComputerID, Dept)

User (UserID, ComputerID, Name, EmpClassification)

from 3nf to bcnf
From 3NF to BCNF

Invoice Relation

(Invoice_number, Invoice_date, Date_delivered, Cust_account)

Customer Relation

(Cust_account, Cust_name, Cust_addr, Zip_code)

Zip_code Relation

(Zip_code, City, State)

Candidate keys?

Invoice_items Relation

(Invoice_number, Item, Item_qty, Item_price)

Items Relation

(Item, Item_descrip)

Manufacturers Relation

(Manuf_code, Manuf_name)

fourth normal form 4nf
Fourth Normal Form (4NF)
  • A relation is in fourth normal form if and only if it is in Boyce-Codd normal form and there are no nontrivial dependencies.
    • Identify all determinants and make sure they are candidate keys
4nf example

100 Finance F177-99

100 Marketing F177-99

100 Finance F288-00

102 Finance F288-00

102 Marketing F177-99

102 Finance F177-99

4NF Example

Employee (EmployeeID, Dept, Project)

Matrix management

Multivalued dependencies

4NF

Employee (EmployeeID, Dept)

Projects (EmployeeID, Project)

4nf example38

100 Finance F177-99

100 Marketing F177-99

100 Finance F288-00

102 Finance F288-00

102 Marketing F177-99

102 Finance F177-99

4NF Example

Matrix management

Multivalued dependencies

4NF

100 Finance

100 Marketing

102 Finance

102 Marketing

100 F177-99

100 F288-00

102 F288-00

102 F177-99

fifth normal form 5nf aka project join nf
Fifth Normal Form (5NF)aka Project-Join NF
  • A relation is in fifth normal form if no remaining nonloss projections (i.e., all projects preserve all information contained in the original relation)are possible, except the trivial one in which the key appears in each project.
    • The join of all projects will result in the original relation
    • No systematic method exists for obtaining 5NF or for ensuring that a set of relations is indeed 5NFRicardo, 1990
domain key normal form dknf
Domain-Key Normal Form (DKNF)
  • A relation is in domain-key normal form if every constraint is a logical consequence of domain constraints or key constraints (i.e., all possible values are a result of an imposed constraint)
    • There is no proven method of converting a design to DKNF, so it remains an ideal rather than a state that can readily be achievedRicardo, 1990
slide41
DKNF

For example:

Emp_ID, Emp_name, Classification, Position, Salary

  • Domain for Position
  • Strategic Planner
  • CIO
  • Vice President
  • Domain for Classification:
  • Executive
  • Manager
  • Staff
  • Domain for Position
  • Programmer/Analyst I
  • Programmer/Analyst II
  • Database/Analyst I
database design methodology
Database Design Methodology

Conceptual database design

  • Build conceptual representation of the database

Logical database design

  • Translate conceptual representation to logical structure of the database

Physical database design

  • Operatioanlize logical structure in a physical implementation
conceptual database design
Conceptual Database Design
  • The process of constructing a model of the data used in an enterprise, independent of all physical considerations
  • What’s involved…
    • Identify entity types, relationship types
    • Identify and associate attributes with entity or relationship types
    • Determine attribute domains
    • Determine candidate, primary and alternate key attributes
    • Consider use of enhanced modeling concepts
    • Check model for redundancies
    • Validate conceptual model against user transactions
    • Review conceptual data model with the users
logical database design
Logical Database Design
  • The process of constructing a model of the data used in an enterprise based on a specific data model, but independent of a particular DBMS and other physical considerations
  • What’s involved…
    • Derive relations for logical data model
    • Validate relations using data normalization
    • Validate relations against user transactions
    • Check integrity constraints
    • Review logical data model (ERD) with the users
    • Merge logical data models into global data model
    • Check for future growth
gather information
Gather Information
  • Meet with the users to get gather information
    • Interviews
    • Documents
derive relations
Derive Relations

 Invoices have invoice items

 One-to-many relationship

  • Strong and weak entity types
  • Relationship types (cardinality)
  • Participation (mandatory vs. partial)

Invoice

Invoice number (pk)

Invoice date

Delivery date

Sales type

Customer account

Invoice Items

Invoice number (pk)

Product code (pk)

Manufacture code

Quantity

Sales Price

Have

1..1

1..*

 Mandatory (all invoices must have at least one invoice item

 Weak entity type (Invoice number is part of key)

 Strong entity type

validate relations
Validate Relations
  • Normalize relations
  • Validate against transactions - Can a transaction be recreated given the data retained in the relations?)
  • Check integrity constraints
    • Required data (not null)
    • Domain constraints (in, references)
    • Multiplicity
    • Entity integrity (primary key)
    • Referential integrity (foreign key)
    • General constraints (business rules)
review data model with the users
Review Data Model with the Users
  • Be pleasant and professional, not arrogant, challenging or condescending
    • Not everyone is receptive to change
    • Your role is to facilitate change
  • The user is always “right” – It’s his/her data
  • Document all change requests (CYA)
  • Listen, listen, listen… (Even if you don’t agree)
logical global data model
Logical Global Data Model

Invoice

Records Transactions

Inventory

Counts and retail prices

Local Data Models

Cust Accounts

Global Data Model

Customer credit accounts

Cust Billing

Customer credit sales

Local data models are merged to create a (near) normalized global data model

Vendor History

Vendor performance

Product Sales

Sales history

physical database design
Physical Database Design
  • The process of producing a description of the implementation of the database on secondary storage
  • It describes the base relations, fle organizations and indexes used to achieve efficient access to the data and nay associated integrity constraints and security measures
  • What’s involved…
    • Translate logical data model for target DBMS: Design base relations, representation of derived data and general constraints
    • Design file organizations and indexes: Analyze transactions, choose file organizations, choose indexes, estimate disk space requirements
    • Design user views and security mechanisms
    • Consider the introduction of controlled redundancy
    • Monitor and tune the operational system

Dictated by the DB product

logical vs physical database design
Logical vs. Physical Database Design
  • LogicalThe process of constructing a model of the information use the enterprise based on one model of data, BUT independent of a particular DBMS and other physical aspects.
  • PhysicalThe process of producing a description of the implementation of the database on secondary storage; it describes the storage structures and access methods used to gain access effectively.

Whereas the logical database design is concerned with the what, physical database design is concerned with the how.

physical database design52
Physical Database Design

Five steps:

  • Translate the global (enterprise) logical data model for the target DBMS
  • Design files organizations and indexes, estimate database space (disk space requirements)
  • Design and implement user views and security mechanisms
  • Consider the introduction of controlled redundancy (denormalization)
  • Monitor and tune the operational system
translate the global logical database model for the target dbms
Translate the Global Logical Database Model for the Target DBMS
  • Design the relations for the target DBMS
    • Decide how to represent the base relations in the global logical data model in the target DBMS
      • Specify keys (primary, foreign), default values, integrity constraints (table, column), and indexes
  • Design integrity rules for the target DBMS
    • Design the enterprise constraints for the target DBMS
      • Applies to updates and inserts
design and implement the physical representation
Design and Implement the Physical Representation
  • Determine the file organizations and access methods that will be used to store the base relations (i.e., the way in which relations and tuples will be held in secondary storage)
    • Understand the system resources
      • Understand the capabilities of the hardware (CPU, memory, disk I-O)
      • Analyze the software’s performance and limitations on the network (client/server) and Internet

Depends on the vendor!

design and implement the physical representation55
Design and Implement the Physical Representation
  • Analyze the transactions - understand the functionality of the transactions that will run on the database, and analyze the import transactions
  • Choose file organization
  • Choose secondary indexes - determine whether secondary indexes will enhance performance
    • Index the primary key (if it is not the key of the file organization)
    • Do not index small relations
    • Add a secondary index to a heavily used secondary key
    • Add a secondary index to a frequently used foreign key
design and implement the physical representation56
Design and Implement the Physical Representation

(cont.)

  • AVOID INDEXING AN ATTRIBUTE OR RELATION THAT IS FREQUENTLY UPDATED
  • Avoid indexing an attribute if the query will retrieve a large portion of the tuples in a relation
  • Avoid indexing attributes that consist of long character strings
design and implement the physical representation57
Design and Implement the Physical Representation
  • Consider the introduction of controlled redundancy
    • Determine whether introducing redundancy in a controlled manner by relaxing the normalization rules will enhance performance
      • Denormalize only when necessary
        • However, denormalizing
          • Makes implementation more complex
          • Sacrifices flexibility
          • May slow down updates (although retrievals may be increased)
3nf logical database design
3NF (Logical Database Design)

Invoice Relation

Invoice_number Invoice_date Date_delivered Cust_account

Customer Relation

Cust_account Cust_name Cust_addr Zip_code

Zip_code Relation

Zip_code City State

Invoice_items Relation

Invoice_numberItem Item_qty Item_price

Items Relation

Manufacturers Relation

Manuf_code Manuf_name

Item Item_descrip Manuf_code

denormalization
Denormalization
  • Duplicating attributes or combining relations
    • Combining 1:1 relationships

Customers Relation

Cust_account Cust_name Cust_addr Zip_code

Customer_accounts Relation

Cust_account Account_type Credit_limit Current_balance Pay_history

Customers Relation

Cust_account Cust_name Cust_addr Zip_code Account_type Credit_limit Current_balance Pay_history

denomalization
Denomalization
  • Duplicating attributes or combining relations
    • Duplicating nonkey attributes in 1:M relationships to reduce joins (creating partial or transitive dependencies)

Customers Relation

Cust_account Cust_name Cust_addr Zip_code Account_type Credit_limit Current_balance Pay_history

Zip_codes Relation

Zip_code City State

Customers Relation

Cust_account Cust_name Cust_addr City State Zip_code Account_type Credit_limit Current_balance Pay_history

denomalization61
Denomalization

(cont.)

  • Reference tables (introducing transitive dependencies)

Invoice_items Relation

Invoice_number Item Item_qty Item_price

Items Relation

Manufacturers Relation

Manuf_code Manuf_name

Item Item_descrip Manuf_code

Problem: In order to know the manufacturer’s name of a customer’s purchased item, a join between Items and Manufacturers must be performed

denomalization62
Denomalization

(cont.)

  • Reference tables (introducing transitive dependencies)

Invoice_items Relation

Invoice_number Item Item_qty Item_price

Items Relation

Manufacturers Relation

Manuf_code Manuf_name

Item Item_descrip Manuf_code

Invoice_number Item Manuf_codeManuf_name Item_qty Item_price

denomalization63
Denomalization

(cont.)

  • Duplicating foreign key attributes in 1:M relationships to reduce joins

Invoice_items Relation

Invoice_number Item Item_qty Item_price

Items Relation

Manufacturers Relation

Manuf_code Manuf_name

Item Item_descrip Manuf_code

  • Problem: To find the manufacturer’s name of a product (e.g., Sony CDP-525) from line_items (relation), two joins must be made: manufacturers to products, and products to manufacturers.
denomalization64
Denomalization

(cont.)

Invoice_items Relation

Invoice_number Item Item_qty Item_price

Items Relation

Manufacturers Relation

Manuf_code Manuf_name

Item Item_descrip Manuf_code

Invoice_number Item Manuf_code Item_qty Item_price

denomalization65
Denomalization

(cont.)

  • Duplicating attributes in M:N relationships to reduce joinsIf joint accounts are allowed and different types of accounts (i.e., long term, revolving) are available:

Customers Relation

Cust_account Cust_name Cust_addr Zip_code Soc_Sec_Num

M:N

Customer_accounts Relation

Cust_account Account_type Credit_limit Current_balance Pay_history

Soc_Sec_Num

denormalization66
Denormalization

123456789 John Smith … 123-45-6789

123456789 Jane Smith ... 987-65-4321

112233445 John Doe … 567-32-1234

A customer can have several accounts...

123456789 … 123-45-6789

123456789 … 987-65-7321

543219876 … 123-45-6789

678901234 … 987-65-7321

548794133 … 567-32-1234

An account can have several owners...

denormalization67
Denormalization

(cont.)

Customers Relation

Cust_account Cust_name Cust_addr Zip_code Soc_Sec_Num

M:N

Customer_accounts Relation

Cust_account Account_type Credit_limit Current_balance Pay_history

Soc_Sec_Num

denormalization68
Denormalization

(cont.)

  • Duplicating attributes in M:N relationships to reduce joins

Customers Relation

Cust_account Cust_name Cust_addr Zip_code Soc_Sec_Num

Customer_accounts Relation

Cust_account Account_type Credit_limit Current_balance Pay_history

Soc_Sec_Num

Cust_account Account_type Credit_limit Current_balance Pay_history

Soc_Sec_Num Cust_name

denomalization69
Denomalization

(cont.)

  • Introducing repeating groups (if the number of occurrences is known and/or constant)
  • Creating extract tables (in an extreme case, an unnormalized relation) - frees computing resources

Cust_account Account_type Credit_limit Current_balance Pay_history

Soc_Sec_Num1 Cust_name1 Soc_Sec_num2 Cust_name2

denomalization70
Denomalization

(cont.)

  • Introduction of “codes” to
    • Simplify the composite key
    • Retain the original sequence

Invoice_items Relation

Invoice_number Item Manuf_code Item_qty Item_price

Invoice_number Item_number Item Manuf_code Item_qty Item_price

denormalization72
Denormalization

Invoice_number

Item_description

Manuf_code

Retail_price

Item

Qty

10001 AT10 CV Loudspeakers 2 359.95

10001 CDPC725 SON Disc-Jockey CD Changer 1 399.95

10001 SAGX730 PIO Remote A/V Receiver 1 569.95

Key

Problem: These items are not in the sequence as they appear on the original document when retrieved from the table.

denormalization73
Denormalization

Invoice_number

Item_description

Item_number

Manuf_code

Retail_price

Item

Qty

10001 01 SAGX730 PIO Remote A/V Receiver 1 569.95

10001 02 AT10 CV Loudspeakers 2 359.95

10001 03 CDPC725 SON Disc-Jockey CD Changer 1 399.95

Key

denomalization74
Denomalization

(cont.)

  • Introducing calculated attributes
    • Simplify processing

Invoice_items Relation

Invoice_number Item Manuf_code Item_qty Item_price

Invoice_number Item_number Item Manuf_code Item_qty Item_price

Extended_price

Item_qty x Item_price

denormalization75
Denormalization

Item_description

Invoice_number

Extended_price

Item_number

Manuf_code

Retail_price

Item

Qty

10001 01 SAGX730 PIO Remote A/V Receiver 1 569.95 569.95

10001 02 AT10 CV Loudspeakers 2 359.95 719.90

10001 03 CDPC725 SON Disc-Jockey CD Changer 1 399.95 399.95

Calculation