Loading in 2 Seconds...

Lecture 13: Relational Decomposition and Relational Algebra

Loading in 2 Seconds...

- 261 Views
- Uploaded on

Download Presentation
## Lecture 13: Relational Decomposition and Relational Algebra

**An Image/Link below is provided (as is) to download presentation**

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript

### Lecture 13:Relational Decomposition and Relational Algebra

February 5th, 2003

Summary of Previous Discussion

- FD’s are given as part of the schema. You derived keys from them. You can also specify keys directly (they’re just another FD)
- First, compute the keys and the FD’s that hold on the relation you’re considering.
- Then, look for violations of BCNF. There may be a few. Choose one.

Summary (continued)

- You may need to decompose the decomposed relations:
- To decide that, you need to project the FD’s on the decomposed relations.
- The end result may differ, depending on which violation you chose to decompose by. They’re all correct.
- Relations with 2 attributes are in BCNF.

Summary (continued –2)

- If you have only a single FD representing a key, then the relation is in BCNF.
- But, you can still have another FD that forces you to decompose.
- This is all really pretty simple if you think about it long enough.

Boyce-Codd Normal Form

A simple condition for removing anomalies from relations:

A relation R is in BCNF if:

Whenever there is a nontrivial dependency A1, ..., An B

in R , {A1, ..., An} is a key for R

In English (though a bit vague):

Whenever a set of attributes of R is determining another attribute,

should determine all the attributes of R.

Example Decomposition

Person(name, SSN, age, hairColor, phoneNumber)

SSN name, age, hairColor

age hairColor

Decompose in BCNF (in class):

Step 1: find all keys ssn, phoneNumber

SSN, name, age, hairColor

SSN, phoneNumber

Step 2: now decompose

Other Example

- R(A,B,C,D) A B, B C
- Keys:
- Violations of BCNF:

Correct Decompositions

A decomposition is lossless if we can recover:

R(A,B,C)

R1(A,B) R2(A,C)

R’(A,B,C) should be the same as R(A,B,C)

Decompose

Recover

R’ is in general larger than R. Must ensure R’ = R

Correct Decompositions

- Given R(A,B,C) s.t. AB, the decomposition into R1(A,B), R2(A,C) is lossless

3NF: A Problem with BCNF

Unit Company Product

FD’s: Unit Company; Company, Product Unit

So, there is a BCNF violation, and we decompose.

Unit Company

Unit Company

Unit Product

No FDs

So What’s the Problem?

Unit Company

Unit Product

Galaga99 UW Galaga99 databases

Bingo UW Bingo databases

No problem so far. All local FD’s are satisfied.

Let’s put all the data back into a single table again:

Unit Company Product

Galaga99 UW databases

Bingo UW databases

Violates the dependency: company, product -> unit!

Solution: 3rd Normal Form (3NF)

A simple condition for removing anomalies from relations:

A relation R is in 3rd normal form if :

Whenever there is a nontrivial dependency A1, A2, ..., An Bfor R , then {A1, A2, ..., An } a super-key for R,

or B is part of a key.

Relational Algebra

- Formalism for creating new relations from existing ones
- Its place in the big picture:

Declartivequerylanguage

Algebra

Implementation

Relational algebraRelational bag algebra

SQL,relational calculus

Relational Algebra

- Five operators:
- Union:
- Difference: -
- Selection: s
- Projection: P
- Cartesian Product:
- Derived or auxiliary operators:
- Intersection, complement
- Joins (natural,equi-join, theta join, semi-join)
- Renaming: r

1. Union and 2. Difference

- R1 R2
- Example:
- ActiveEmployees RetiredEmployees
- R1 – R2
- Example:
- AllEmployees -- RetiredEmployees

What about Intersection ?

- It is a derived operator
- R1 R2 = R1 – (R1 – R2)
- Also expressed as a join (will see later)
- Example
- UnionizedEmployees RetiredEmployees

3. Selection

- Returns all tuples which satisfy a condition
- Notation: sc(R)
- Examples
- sSalary > 40000(Employee)
- sname = “Smithh”(Employee)
- The condition c can be =, <, , >,, <>

Find all employees with salary more than $40,000.

s Salary > 40000(Employee)

4. Projection

- Eliminates columns, then removes duplicates
- Notation: P A1,…,An(R)
- Example: project social-security number and names:
- PSSN, Name (Employee)
- Output schema: Answer(SSN, Name)

5. Cartesian Product

- Each tuple in R1 with each tuple in R2
- Notation: R1 R2
- Example:
- Employee Dependents
- Very rare in practice; mainly used to express joins

Relational Algebra

- Five operators:
- Union:
- Difference: -
- Selection: s
- Projection: P
- Cartesian Product:
- Derived or auxiliary operators:
- Intersection, complement
- Joins (natural,equi-join, theta join, semi-join)
- Renaming: r

Renaming

- Changes the schema, not the instance
- Notation: rB1,…,Bn (R)
- Example:
- rLastName, SocSocNo (Employee)
- Output schema: Answer(LastName, SocSocNo)

Renaming Example

Employee

Name

SSN

John

999999999

Tony

777777777

- LastName, SocSocNo (Employee)

LastName

SocSocNo

John

999999999

Tony

777777777

Natural Join

- Notation: R1 ⋈ R2
- Meaning: R1 ⋈ R2 = PA(sC(R1 R2))
- Where:
- The selection sC checks equality of all common attributes
- The projection eliminates the duplicate common attributes

PName, SSN, Dname(sSSN=SSN2(Employee x rSSN2, Dname(Dependents))

Natural Join Example

Employee

Name

SSN

John

999999999

Tony

777777777

Dependents

SSN

Dname

999999999

Emily

777777777

Joe

Name

SSN

Dname

John

999999999

Emily

Tony

777777777

Joe

Natural Join

- R= S=
- R ⋈ S=

Natural Join

- Given the schemas R(A, B, C, D), S(A, C, E), what is the schema of R ⋈ S ?
- Given R(A, B, C), S(D, E), what is R ⋈ S ?
- Given R(A, B), S(A, B), what is R ⋈ S ?

Theta Join

- A join that involves a predicate
- R1 ⋈q R2 = sq (R1 R2)
- Here q can be any condition

Eq-join

- A theta join where q is an equality
- R1 ⋈A=B R2 = s A=B (R1 R2)
- Example:
- Employee ⋈SSN=SSN Dependents
- Most useful join in practice

Semijoin

- R ⋉ S = PA1,…,An (R ⋈ S)
- Where A1, …, An are the attributes in R
- Example:
- Employee ⋉ Dependents

Semijoins in Distributed Databases

- Semijoins are used in distributed databases

Dependents

Employee

network

Employee ⋈ssn=ssn (s age>71 (Dependents))

T = PSSNs age>71 (Dependents)

R = Employee ⋉ T

Answer = R ⋈ Dependents

pid=pid

buyer-ssn=ssn

Complex RA ExpressionsPname

Person Purchase Person Product

Pssn

Ppid

sname=fred

sname=gizmo

Operations on Bags

A bag = a set with repeated elements

All operations need to be defined carefully on bags

- {a,b,b,c}{a,b,b,b,e,f,f}={a,a,b,b,b,b,b,c,e,f,f}
- {a,b,b,b,c,c} – {b,c,c,c,d} = {a,b,b,d}
- sC(R): preserve the number of occurrences
- PA(R): no duplicate elimination
- Cartesian product, join: no duplicate elimination

Important ! Relational Engines work on bags, not sets !

Reading assignment: 5.3 – 5.4

Finally: RA has Limitations !

- Cannot compute “transitive closure”
- Find all direct and indirect relatives of Fred
- Cannot express in RA !!! Need to write C program

Download Presentation

Connecting to Server..