1 / 13

INFS 6510 – Competitive Intelligence Systems

INFS 6510 – Competitive Intelligence Systems. Normalization vs. Denormalization. Relational Databases (RDBMS). Collection of linked tables Tables linked by Primary Key / Foreign Key relationships (Referential Integrity) Primary Key – column whose values make each record unique

kumiko
Download Presentation

INFS 6510 – Competitive Intelligence Systems

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. INFS 6510 – Competitive Intelligence Systems Normalization vs. Denormalization

  2. Relational Databases (RDBMS) • Collection of linked tables • Tables linked by Primary Key / Foreign Key relationships (Referential Integrity) • Primary Key – column whose values make each record unique • Foreign Key – value in column that links to Primary Key in another table • SQL – Structured Query Language (language to access data in relational tables)

  3. Relational DB Example Cust #Cust Name 100 Bob 101 Sue 102 Juan Order #Prod#QtyCust# 1 QR22 1 100 2 QR22 25 100 3 SB56 3 102 CUSTOMER TABLE ORDER TABLE Foreign Key Primary Key

  4. Database Structure & Design • 2 Approaches: Conflict 1. Optimize for Data Capture i.e., CapturingTransactions 2. Optimize for Data Access i.e., Queries & Reporting

  5. Approach #1: Optimize for Data Capture • To optimize for data capture, you must: • Eliminate redundancy of data (or else wasted space & processing occurs) • Ensure data integrity (or else data anomalies) • Ensure that changes in data (modifications, deletions, etc. only have to happen in one place) • Normalization – process in which a DBMS is optimized for data capture • All data “redundancy” is removed from Database • Has multiple forms (0, 1st, 2nd, 3rd, et al.)

  6. Moving from 0NF to 1NFRule: Make a separate table for each set of related attributes, and give each table a primary key of unique values. Cust # CustName 100, 100, 101 Bob, Sue, Juan CUSTOMER TABLE ONF Cust #Cust Name 100 Bob 101 Sue 102 Juan CUSTOMER TABLE 1NF Primary Key Created with Unique values

  7. Dependency on Primary Key 100 Bob 100 Bob Moving from 1NF to 2NFRule: Eliminate any repeating values caused by a dependency on a “keyed” column (i.e., either Primary or Foreign) Cust #Cust NameOrder# 100 Bob 1 100 Bob 2 101 Sue 3 TABLE X 1NF Cust #Cust Name 100 Bob 101 Sue Order #Cust# 1 100 2 100 3 101 CUSTOMER TABLE ORDER TABLE 2NF

  8. Dependency b/t 2 non-key columns PGH 2 days PGH 2 days Moving from 2NF to 3NFRule: Eliminate any repeating values caused by a dependency on a “non-keyed” column (i.e., dependency on ANY column) Cust #CityOrder#ShipTime 100 PGH 1 2 days 101 PGH 2 2 days 102 LA 3 5 days TABLE X 2NF City #CityShipTime 10 PGH 2 days 20 LA 5 days Cust #City# 100 10 101 10 102 20 SHIP TIME TABLE CUSTOMER TABLE 3NF

  9. Normalized DB Example

  10. Approach #2: Optimize for Data Access(in a separate, read-only Data Warehouse) • To optimize for data access, you must: • Allow data redundancy • Reduce the number of table joins (links among tables) • Denormalizing – Adding redundancy & reducing joins in a DBMS

  11. Denormalizing – Most Common Approach • Star Schema (Clustering) • Fact (core or transaction) Tables in middle of star • Dimensional (structural or “lookup”) Tables around “points” of star Cust #CustName 100 Bob 101 Sue 102 Juan Prod #ProdName QR22 Rake SR56 Spade TW43 Mulch CUSTOMER TABLE PRODUCT TABLE Order #DateCust#Prod#Rep# 1 06/15/XX 100 QR22 1000 2 07/19/XX 100 QR22 1000 3 08/30/XX 101 SR56 2000 ORDER TABLE Rep #RepName 1000 Lee 2000 James 3000 Natasha Date Quarter 06/29/XX 2 Bob 06/30/XX 2 Sue 07/01/XX 3 Juan REP TABLE DATE/TIME

  12. Denormalizing (continued)• Stars are linked via common (i.e., Conformed) Dimensions to form Data Warehouse Cust #CustName 100 Bob 101 Sue 102 Juan CUSTOMER TABLE Order #DateCust#Prod#Rep# 1 06/15/XX 100 QR22 1000 2 07/19/XX 100 QR22 1000 3 08/30/XX 101 SR56 2000 ORDER TABLE ORDER TABLE Prod #ProdName QR22 Rake SR56 Spade TW43 Mulch PRODUCT TABLE Date Quarter 06/29/XX 2 Bob 06/30/XX 2 Sue 07/01/XX 3 Juan CUSTOMER TABLE TIME DATE/TIME Rep #RepName 1000 Lee 2000 James 3000 Natasha REP TABLE Prod#ProdName Stock DateUnits QR22 Rake 03/23/XX 150 TW43 Mulch 04/15/XX 1452 SR56 Spade 05/01/XX 997 INVENTORY TABLE

  13. Mapping Normalized Tables to Denormalized (Data Warehouse) TablesUsing ETL Tools

More Related