1 / 10

Denormalization

Denormalization. - Causes redundancy, but fast performance & no referential integrity - Denormalize when • specific queries occur frequently, • a strict performance is required and • it is not heavily updated

sovann
Download Presentation

Denormalization

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Denormalization - Causes redundancy, but fast performance & no referential integrity - Denormalize when • specific queries occur frequently, • a strict performance is required and • it is not heavily updated -So, denormalize only when there is a very clear advantage to doing so and document carefully the reason for doing so

  2. typical denormalization techniques • Flatten a repeating group in one table Instead of EMP (E#, Ename) SKILL (E#, Skill) Use EMP (E#, Skill, Ename) when Emp has a smaller # of attributes. - This means use Method 2 of 1NF algorithm. But know the danger of this method as we discussed in MVD.

  3. Cont’ (2) Embed stable Code-Interpretation (Reference) Table. Instead of FLIGHT (F#, Departs, From_Code, To_Code) CODE (Code, Airport_Name) Use FLIGHT (F#, Departs, From_AP, From_Code, To_AP, To_Code)

  4. Cont’ Combine1:1 or 1:N (a) when N is small and (b) the record on the "one" side is small (thus the amount of redundancy will be small) Instead of SALE (S#, SPName, SaleDate), SALE_ITEMS (S#, Line#, Code, Qty) Use SALE(S#, Line#, SPName, SaleDate, Code, Qty) -- "How many T179's did we sell yeaterday?" can be answered without join. • Another example: Order_Item(O#, I#, C#, Cname, I_Desc, Qty, I_Price)

  5. Cont’ (4) When the other entity in is not interesting by itself Order(O#, ODate, OShipTerms, PmtTerms, Cname, CAddr) (5) Replicate non-frequently updated attributes to avoid JOIN WORK_ON (ESSN, P_NUM, PName, Hours)

  6. Problems of denormalization • Makes row longer • Makes data transfer longer • Needs more memory for memory processing • Cause redundancy and expensive update

  7. Adding redundant data - Add summary attributes or derived attributes - Redundant relationships can improve performance with the cost of update overhead

  8. Schema translation • Reduce #of relations for JOIN by using mapped translation • Handling null values • Combine 1:1 relationships • Relax participation constraints • Divide the big table into two, if A & B are distinct in R(A, B) • Ignore FDs based on co-occurring attributes, which are not updated ZIP --> CITY

  9. Primary key - Most frequently used attributes - Prefer small sized attributes (used in indexes, Ref. integrity)

  10. Index - Create a set of appropriate indexes optimzing queries (This will be discussed more in physical DB chapters.)

More Related