1 / 10

Denormalization Techniques: Balancing Performance and Data Redundancy

Denormalization involves intentionally introducing redundancy into a database to enhance performance, particularly for frequently executed queries where strict performance is critical. It is essential to denormalize cautiously, only when a clear benefit is evident, while thoroughly documenting the rationale. Common techniques include flattening tables, embedding reference codes, and combining 1:1 or 1:N relationships. Although denormalization can lead to faster query responses, it may also result in longer row lengths, increased memory usage, and potential update overhead, necessitating careful analysis and planning.

sovann
Download Presentation

Denormalization Techniques: Balancing Performance and Data Redundancy

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Denormalization - Causes redundancy, but fast performance & no referential integrity - Denormalize when • specific queries occur frequently, • a strict performance is required and • it is not heavily updated -So, denormalize only when there is a very clear advantage to doing so and document carefully the reason for doing so

  2. typical denormalization techniques • Flatten a repeating group in one table Instead of EMP (E#, Ename) SKILL (E#, Skill) Use EMP (E#, Skill, Ename) when Emp has a smaller # of attributes. - This means use Method 2 of 1NF algorithm. But know the danger of this method as we discussed in MVD.

  3. Cont’ (2) Embed stable Code-Interpretation (Reference) Table. Instead of FLIGHT (F#, Departs, From_Code, To_Code) CODE (Code, Airport_Name) Use FLIGHT (F#, Departs, From_AP, From_Code, To_AP, To_Code)

  4. Cont’ Combine1:1 or 1:N (a) when N is small and (b) the record on the "one" side is small (thus the amount of redundancy will be small) Instead of SALE (S#, SPName, SaleDate), SALE_ITEMS (S#, Line#, Code, Qty) Use SALE(S#, Line#, SPName, SaleDate, Code, Qty) -- "How many T179's did we sell yeaterday?" can be answered without join. • Another example: Order_Item(O#, I#, C#, Cname, I_Desc, Qty, I_Price)

  5. Cont’ (4) When the other entity in is not interesting by itself Order(O#, ODate, OShipTerms, PmtTerms, Cname, CAddr) (5) Replicate non-frequently updated attributes to avoid JOIN WORK_ON (ESSN, P_NUM, PName, Hours)

  6. Problems of denormalization • Makes row longer • Makes data transfer longer • Needs more memory for memory processing • Cause redundancy and expensive update

  7. Adding redundant data - Add summary attributes or derived attributes - Redundant relationships can improve performance with the cost of update overhead

  8. Schema translation • Reduce #of relations for JOIN by using mapped translation • Handling null values • Combine 1:1 relationships • Relax participation constraints • Divide the big table into two, if A & B are distinct in R(A, B) • Ignore FDs based on co-occurring attributes, which are not updated ZIP --> CITY

  9. Primary key - Most frequently used attributes - Prefer small sized attributes (used in indexes, Ref. integrity)

  10. Index - Create a set of appropriate indexes optimzing queries (This will be discussed more in physical DB chapters.)

More Related