Denormalization

1 / 10

# Denormalization - PowerPoint PPT Presentation

Denormalization. - Causes redundancy, but fast performance &amp; no referential integrity - Denormalize when • specific queries occur frequently, • a strict performance is required and • it is not heavily updated

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.

## PowerPoint Slideshow about ' Denormalization' - fallon-waller

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
Denormalization

- Causes redundancy, but fast performance & no referential integrity

- Denormalize when

• specific queries occur frequently,

• a strict performance is required and

• it is not heavily updated

-So, denormalize only when there is a very clear advantage to doing so and document carefully the reason for doing so

typical denormalization techniques
• Flatten a repeating group in one table

EMP (E#, Ename)

SKILL (E#, Skill)

Use EMP (E#, Skill, Ename) when Emp has a smaller # of attributes.

- This means use Method 2 of 1NF algorithm. But know the danger of this method as we discussed in MVD.

Cont’

(2) Embed stable Code-Interpretation (Reference) Table.

FLIGHT (F#, Departs, From_Code, To_Code)

CODE (Code, Airport_Name)

Use FLIGHT (F#, Departs, From_AP, From_Code, To_AP, To_Code)

Cont’

Combine1:1 or 1:N (a) when N is small and (b) the record on the "one" side is small (thus the amount of redundancy will be small)

SALE (S#, SPName, SaleDate), SALE_ITEMS (S#, Line#, Code, Qty)

Use

SALE(S#, Line#, SPName, SaleDate, Code, Qty)

-- "How many T179\'s did we sell yeaterday?" can be answered without join.

• Another example:

Order_Item(O#, I#, C#, Cname, I_Desc, Qty, I_Price)

Cont’

(4) When the other entity in is not interesting by itself

Order(O#, ODate, OShipTerms, PmtTerms, Cname, CAddr)

(5) Replicate non-frequently updated attributes to avoid JOIN

WORK_ON (ESSN, P_NUM, PName, Hours)

Problems of denormalization
• Makes row longer
• Makes data transfer longer
• Needs more memory for memory processing
• Cause redundancy and expensive update

- Add summary attributes or derived attributes

- Redundant relationships can improve performance with the cost of update overhead

Schema translation
• Reduce #of relations for JOIN by using mapped translation
• Handling null values
• Combine 1:1 relationships
• Relax participation constraints
• Divide the big table into two, if A & B are distinct in R(A, B)
• Ignore FDs based on co-occurring attributes, which are not updated

ZIP --> CITY

Primary key

- Most frequently used attributes

- Prefer small sized attributes (used in indexes, Ref. integrity)

Index

- Create a set of appropriate indexes optimzing queries (This will be discussed more in physical DB chapters.)