normalization n.
Download
Skip this Video
Loading SlideShow in 5 Seconds..
Normalization PowerPoint Presentation
Download Presentation
Normalization

Loading in 2 Seconds...

play fullscreen
1 / 35

Normalization - PowerPoint PPT Presentation


  • 134 Views
  • Uploaded on

Normalization. Normalization. Normalization is the process of efficiently organizing data in a database with two goals in mind First goal: eliminate redundant data for example, storing the same data in more than one table Second Goal: ensure data dependencies make sense

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

Normalization


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
normalization1
Normalization
  • Normalization is the process of efficiently organizing data in a database with two goals in mind
  • First goal: eliminate redundant data
    • for example, storing the same data in more than one table
  • Second Goal: ensure data dependencies make sense
    • for example, only storing related data in a table
benefits of normalization
Benefits of Normalization

Bad database designs results in: redundancy: inefficient storage.

anomalies: data inconsistency, difficulties in maintenance

  • Less storage space
  • Quicker updates
  • Less data inconsistency
  • Clearer data relationships
  • Easier to add data
  • Flexible Structure
first normal form 1nf
First Normal Form (1NF)
  • A database schema is in First Normal Form if all tables are flat

Student

Student

Takes

Course

May needto add keys

functional dependencies
Functional Dependencies
  • A form of constraint
    • hence, part of the schema
  • Finding them is part of the database design
  • Also used in normalizing the relations
  • Warning: this is the most abstract, and “hardest” part of the database design.
functional dependencies1

Functional dependency between A and B

Functional Dependencies

Definition:

If two tuples agree on the attributes

A1, A2, …, An

then they must also agree on the attributes

B1, B2, …, Bm

Formally:

A1, A2, …, An B1, B2, …, Bm

examples
Examples
  • EmpID  Name, Phone, Position
  • Position  Phone
  • but Phone  Position

EmpID

Name

Phone

Position

E0045

Smith

1234

Clerk

E1847

John

9876

Salesrep

E1111

Smith

9876

Salesrep

E9999

Mary

1234

Lawyer

in general
In General
  • To check A  B, erase all other columns
example
Example

EmpID

Name

Phone

Position

E0045

Smith

1234

Clerk

E1847

John

9876

Salesrep

E1111

Smith

9876

Salesrep

E9999

Mary

1234

Lawyer

Position  Phone

functional dependencies2
Functional Dependencies
  • Important concept in differentiating good database designs from bad ones
  • FD is a generalization of the notion of keys
  • An FD is a set of attributes whose values uniquely determine the values of the remaining attributes.

Emp(eno, ename, sal) key FDs:eno => ename

Dept(dno, dname, floor) eno => sal

Works-in(eno,dno, hours) (eno,dno) => hours

dno => dname

dno => floor

functional dependencies3
Functional Dependencies
  • Trivial dependencies:  => 

 =>  if   

  • Closure
    • we need to consider all FDs
    • some are implied by others; e.g., FDs are transitive; if A=>B and B=>C, then A=>C
    • Given F = set of FDs, we want to find F’ (the closure of all FDs logically implied by F)
inference rules for fd s
Inference Rules for FD’s

A1, A2, …, An B1, B2, …, Bm

Splitting rule

and

Combining rule

Is equivalent to

A1, A2, …, An B1

A1, A2, …, An B2

. . . . .

A1, A2, …, An Bm

inference rules for fd s continued
Inference Rules for FD’s(continued)

Trivial Rule

A1, A2, …, An Ai

where i = 1, 2, ..., n

Why ?

inference rules for fd s continued1
Inference Rules for FD’s(continued)

Transitive Closure Rule

A1, A2, …, An B1, B2, …, Bm

If

and

B1, B2, …, Bm  C1, C2, …, Cp

A1, A2, …, An C1, C2, …, Cp

then

Why ?

closure of a set of fds
Closure of a set of FDs
  • It is not suff. to consider just the given set of FDs
  • We need to consider all FDs that hold
  • Given F, more FDs can be inferred
  • Such FDs are said to be logically implied by F
  • F+ is the set of all FDs logically implied by F
  • We can compute F+using formal defn. of FD
  • If F were large, this process would be lengthy & cumbersome
  • Axioms or Rules of Inference provide simpler technique
  • Armstrong;s Axioms
inference rules for fds
Inference Rules for FDs

Armstrong's inference rules:

IR1. (Reflexive) If Y  X, then X Y

IR2. (Augmentation) If X Y, then XZ YZ

(Notation: XZ stands for X U Z)

IR3. (Transitive) If X Y and Y Z, then X Z

IR1, IR2, IR3 form a sound & complete set of inference rules

Never generates any wrong FD

Generate all FDs that hold

inference rules for fds1
Inference Rules for FDs

Some additional inference rules that are useful:

Decomposition: If XYZ, then XY & XZ

Union: If XY & XZ, then XYZ

Psuedotransitivity: If XY & WYZ,then WXZ

  • The last three inference rules, as well as any other inference rules, can be deduced from IR1, IR2, and IR3 (completeness property)
example1
Example
  • R = (A, B, C, G, H, I)F = { A BA CCG HCG IB H}
  • some members of F+
    • A H
      • by transitivity from A B and B H
    • AG I
      • by augmenting A C with G, to get AG CG and then transitivity with CG I
    • CG HI
      • By union rule
2nf example 1
2NF – Example - 1
  • Inventory (Item, Supplier, Cost, Supplier Address)
  • We first check if Cost is fully functionally dependent upon the ENTIRE Primary-Key
  • If I know just Item, can I find out Cost?
    • No. We can have > 1 supplier for the same product.
  • If I know just Supplier, and I find out Cost?
    • No. We need to know what the Item is as well.
  • So, Cost is fully functionally dependent upon the ENTIRE Primary-Key
2nf example 2
2NF – Example - 2
  • Inventory (Item, Supplier, Cost, Supplier Address)
  • We then check if Supplier Address is fully functionally dependent upon the ENTIRE Primary-Key
  • If I know just Item, can I find out Supplier Address?
    • No. We can have > 1 supplier for the same product.
  • If I know just Supplier, and I find out Supplier Address?
    • Yes. The supplier’s address does not depend on the Item.
  • So, Supplier Address is NOT fully functionally dependent upon the ENTIRE Primary-Key  NOT 2NF
so putting things together
So putting things together

The above relation is now in 2NF since the relation has no non-key attributes.

transitive dependence
Transitive Dependence

Give a relation R,

Assume the following FD hold:

Note : Both Ename and Address attributes are non-key attributes in R, and since

Address depends on a non-Prime attribute Name, which depends on the primary

key(EmpNo), a transitive dependency exists

R2

R1

database normalization
Database Normalization
  • Boyce-Codd Normal Form (BCNF)
    • A relation is in Boyce-Codd normal form (BCNF) if every determinant in the table is a candidate key.

(A determinant is any attribute whose value determines other values with a row.)

    • If a table contains only one candidate key, the 3NF and the BCNF are equivalent.
    • BCNF is a special case of 3NF.
slide29
BCNF
  • Based on FDs that take into account all candidate keys of a relation
  • For a relation with only 1 CK, 3NF & BCNF are equivalent
  • A relation is said to be in BCNF if every determinant is a CK
  • Is PLOTS in BCNF?
  • NO
bcnf vs 3nf
BCNF vs 3NF
  • BCNF: For every functional dependency X->Y in a set F of functional dependencies over relation R, either:
    • Y is a subset of X or,
    • X is a superkey of R
  • 3NF: For every functional dependency X->Y in a set F of functional dependencies over relation R, either:
    • Y is a subset of X or,
    • X is a superkey of R, or
    • Y is a subset of K for some key K of R
      • N.b., no subset of a key is a key
3nf schema
3NF Schema

For every functional

dependency X->Y in a set F

of functional dependencies

over relation R, either:

  • Y is a subset of X or,
  • X is a superkey of R, or
  • Y is a subset of K for some key K of R

Client, Office -> Client, Office, Account

Account -> Office

3nf schema1
3NF Schema

For every functional

dependency X->Y in a set F

of functional dependencies

over relation R, either:

  • Y is a subset of X or,
  • X is a superkey of R, or
  • Y is a subset of K for some key K of R

Client, Office -> Client, Office, Account

Account -> Office

bcnf vs 3nf1

Lossless decomposition

Account -> Office

No non-trivial FDs

BCNF vs 3NF

3NF has some redundancy

BCNF does not

Unfortunately, BCNF is not dependency preserving, but 3NF is

For every functional

dependency X->Y in a set F

of functional dependencies

over relation R, either:

  • Y is a subset of X or,
  • X is a superkey of R
  • Y is a subset of K for some key K of R

Client, Office -> Client, Office, Account

Account -> Office

closure of a set of fds1
Closure of a set of FDs
  • Given a set of FDs F on a relation R, it may be possible that several other FDs must also hold for R
  • For Example, R=(A,B,C) & FDs, AB & BC hold in R, then FD AC also holds on R
  • For a given value of A, there can be only one corresponding value of B, & for that value of B, there can be only one corresponding value for C
  • The closure of F is the set of all FDs that can be inferred from F, & is denoted by F+
problem 1
Problem 1
  • Consider the relation R(A,B,C) with functional dependencies ABC and CB.
      • Is R in 2NF?
      • Is R in 3NF?
      • Is R in BCNF?