- By
**fay** - Follow User

- 134 Views
- Uploaded on

Download Presentation
## Normalization

**An Image/Link below is provided (as is) to download presentation**

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript

Normalization

- Normalization is the process of efficiently organizing data in a database with two goals in mind
- First goal: eliminate redundant data
- for example, storing the same data in more than one table
- Second Goal: ensure data dependencies make sense
- for example, only storing related data in a table

Benefits of Normalization

Bad database designs results in: redundancy: inefficient storage.

anomalies: data inconsistency, difficulties in maintenance

- Less storage space
- Quicker updates
- Less data inconsistency
- Clearer data relationships
- Easier to add data
- Flexible Structure

First Normal Form (1NF)

- A database schema is in First Normal Form if all tables are flat

Student

Student

Takes

Course

May needto add keys

Functional Dependencies

- A form of constraint
- hence, part of the schema
- Finding them is part of the database design
- Also used in normalizing the relations
- Warning: this is the most abstract, and “hardest” part of the database design.

Functional dependency between A and B

Functional DependenciesDefinition:

If two tuples agree on the attributes

A1, A2, …, An

then they must also agree on the attributes

B1, B2, …, Bm

Formally:

A1, A2, …, An B1, B2, …, Bm

Examples

- EmpID Name, Phone, Position
- Position Phone
- but Phone Position

EmpID

Name

Phone

Position

E0045

Smith

1234

Clerk

E1847

John

9876

Salesrep

E1111

Smith

9876

Salesrep

E9999

Mary

1234

Lawyer

In General

- To check A B, erase all other columns

Example

EmpID

Name

Phone

Position

E0045

Smith

1234

Clerk

E1847

John

9876

Salesrep

E1111

Smith

9876

Salesrep

E9999

Mary

1234

Lawyer

Position Phone

Functional Dependencies

- Important concept in differentiating good database designs from bad ones
- FD is a generalization of the notion of keys
- An FD is a set of attributes whose values uniquely determine the values of the remaining attributes.

Emp(eno, ename, sal) key FDs:eno => ename

Dept(dno, dname, floor) eno => sal

Works-in(eno,dno, hours) (eno,dno) => hours

dno => dname

dno => floor

Functional Dependencies

- Trivial dependencies: =>

=> if

- Closure
- we need to consider all FDs
- some are implied by others; e.g., FDs are transitive; if A=>B and B=>C, then A=>C
- Given F = set of FDs, we want to find F’ (the closure of all FDs logically implied by F)

Inference Rules for FD’s

A1, A2, …, An B1, B2, …, Bm

Splitting rule

and

Combining rule

Is equivalent to

A1, A2, …, An B1

A1, A2, …, An B2

. . . . .

A1, A2, …, An Bm

Inference Rules for FD’s(continued)

Transitive Closure Rule

A1, A2, …, An B1, B2, …, Bm

If

and

B1, B2, …, Bm C1, C2, …, Cp

A1, A2, …, An C1, C2, …, Cp

then

Why ?

Closure of a set of FDs

- It is not suff. to consider just the given set of FDs
- We need to consider all FDs that hold
- Given F, more FDs can be inferred
- Such FDs are said to be logically implied by F
- F+ is the set of all FDs logically implied by F
- We can compute F+using formal defn. of FD
- If F were large, this process would be lengthy & cumbersome
- Axioms or Rules of Inference provide simpler technique
- Armstrong;s Axioms

Inference Rules for FDs

Armstrong's inference rules:

IR1. (Reflexive) If Y X, then X Y

IR2. (Augmentation) If X Y, then XZ YZ

(Notation: XZ stands for X U Z)

IR3. (Transitive) If X Y and Y Z, then X Z

IR1, IR2, IR3 form a sound & complete set of inference rules

Never generates any wrong FD

Generate all FDs that hold

Inference Rules for FDs

Some additional inference rules that are useful:

Decomposition: If XYZ, then XY & XZ

Union: If XY & XZ, then XYZ

Psuedotransitivity: If XY & WYZ,then WXZ

- The last three inference rules, as well as any other inference rules, can be deduced from IR1, IR2, and IR3 (completeness property)

Example

- R = (A, B, C, G, H, I)F = { A BA CCG HCG IB H}
- some members of F+
- A H
- by transitivity from A B and B H
- AG I
- by augmenting A C with G, to get AG CG and then transitivity with CG I
- CG HI
- By union rule

2NF – Example - 1

- Inventory (Item, Supplier, Cost, Supplier Address)
- We first check if Cost is fully functionally dependent upon the ENTIRE Primary-Key
- If I know just Item, can I find out Cost?
- No. We can have > 1 supplier for the same product.
- If I know just Supplier, and I find out Cost?
- No. We need to know what the Item is as well.
- So, Cost is fully functionally dependent upon the ENTIRE Primary-Key

2NF – Example - 2

- Inventory (Item, Supplier, Cost, Supplier Address)
- We then check if Supplier Address is fully functionally dependent upon the ENTIRE Primary-Key
- If I know just Item, can I find out Supplier Address?
- No. We can have > 1 supplier for the same product.
- If I know just Supplier, and I find out Supplier Address?
- Yes. The supplier’s address does not depend on the Item.
- So, Supplier Address is NOT fully functionally dependent upon the ENTIRE Primary-Key NOT 2NF

So putting things together

The above relation is now in 2NF since the relation has no non-key attributes.

Transitive Dependence

Give a relation R,

Assume the following FD hold:

Note : Both Ename and Address attributes are non-key attributes in R, and since

Address depends on a non-Prime attribute Name, which depends on the primary

key(EmpNo), a transitive dependency exists

R2

R1

Database Normalization

- Boyce-Codd Normal Form (BCNF)
- A relation is in Boyce-Codd normal form (BCNF) if every determinant in the table is a candidate key.

(A determinant is any attribute whose value determines other values with a row.)

- If a table contains only one candidate key, the 3NF and the BCNF are equivalent.
- BCNF is a special case of 3NF.

The Decomposition of a Table Structure to Meet

BCNF Requirements

BCNF

- Based on FDs that take into account all candidate keys of a relation
- For a relation with only 1 CK, 3NF & BCNF are equivalent
- A relation is said to be in BCNF if every determinant is a CK
- Is PLOTS in BCNF?
- NO

BCNF vs 3NF

- BCNF: For every functional dependency X->Y in a set F of functional dependencies over relation R, either:
- Y is a subset of X or,
- X is a superkey of R
- 3NF: For every functional dependency X->Y in a set F of functional dependencies over relation R, either:
- Y is a subset of X or,
- X is a superkey of R, or
- Y is a subset of K for some key K of R
- N.b., no subset of a key is a key

3NF Schema

For every functional

dependency X->Y in a set F

of functional dependencies

over relation R, either:

- Y is a subset of X or,
- X is a superkey of R, or
- Y is a subset of K for some key K of R

Client, Office -> Client, Office, Account

Account -> Office

3NF Schema

For every functional

dependency X->Y in a set F

of functional dependencies

over relation R, either:

- Y is a subset of X or,
- X is a superkey of R, or
- Y is a subset of K for some key K of R

Client, Office -> Client, Office, Account

Account -> Office

Account -> Office

No non-trivial FDs

BCNF vs 3NF3NF has some redundancy

BCNF does not

Unfortunately, BCNF is not dependency preserving, but 3NF is

For every functional

dependency X->Y in a set F

of functional dependencies

over relation R, either:

- Y is a subset of X or,
- X is a superkey of R
- Y is a subset of K for some key K of R

Client, Office -> Client, Office, Account

Account -> Office

Closure of a set of FDs

- Given a set of FDs F on a relation R, it may be possible that several other FDs must also hold for R
- For Example, R=(A,B,C) & FDs, AB & BC hold in R, then FD AC also holds on R
- For a given value of A, there can be only one corresponding value of B, & for that value of B, there can be only one corresponding value for C
- The closure of F is the set of all FDs that can be inferred from F, & is denoted by F+

Problem 1

- Consider the relation R(A,B,C) with functional dependencies ABC and CB.
- Is R in 2NF?
- Is R in 3NF?
- Is R in BCNF?

Download Presentation

Connecting to Server..