lecture 13 relational decomposition and relational algebra l.
Download
Skip this Video
Loading SlideShow in 5 Seconds..
Lecture 13: Relational Decomposition and Relational Algebra PowerPoint Presentation
Download Presentation
Lecture 13: Relational Decomposition and Relational Algebra

Loading in 2 Seconds...

play fullscreen
1 / 36

Lecture 13: Relational Decomposition and Relational Algebra - PowerPoint PPT Presentation


  • 261 Views
  • Uploaded on

Lecture 13: Relational Decomposition and Relational Algebra. February 5 th , 2003. Summary of Previous Discussion. FD’s are given as part of the schema. You derived keys from them. You can also specify keys directly (they’re just another FD)

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'Lecture 13: Relational Decomposition and Relational Algebra' - shania


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
summary of previous discussion
Summary of Previous Discussion
  • FD’s are given as part of the schema. You derived keys from them. You can also specify keys directly (they’re just another FD)
  • First, compute the keys and the FD’s that hold on the relation you’re considering.
  • Then, look for violations of BCNF. There may be a few. Choose one.
summary continued
Summary (continued)
  • You may need to decompose the decomposed relations:
    • To decide that, you need to project the FD’s on the decomposed relations.
    • The end result may differ, depending on which violation you chose to decompose by. They’re all correct.
  • Relations with 2 attributes are in BCNF.
summary continued 2
Summary (continued –2)
  • If you have only a single FD representing a key, then the relation is in BCNF.
  • But, you can still have another FD that forces you to decompose.
  • This is all really pretty simple if you think about it long enough.
boyce codd normal form
Boyce-Codd Normal Form

A simple condition for removing anomalies from relations:

A relation R is in BCNF if:

Whenever there is a nontrivial dependency A1, ..., An B

in R , {A1, ..., An} is a key for R

In English (though a bit vague):

Whenever a set of attributes of R is determining another attribute,

should determine all the attributes of R.

example decomposition
Example Decomposition

Person(name, SSN, age, hairColor, phoneNumber)

SSN  name, age, hairColor

age  hairColor

Decompose in BCNF (in class):

Step 1: find all keys ssn, phoneNumber

SSN, name, age, hairColor

SSN, phoneNumber

Step 2: now decompose

other example
Other Example
  • R(A,B,C,D) A B, B C
  • Keys:
  • Violations of BCNF:
correct decompositions
Correct Decompositions

A decomposition is lossless if we can recover:

R(A,B,C)

R1(A,B) R2(A,C)

R’(A,B,C) should be the same as R(A,B,C)

Decompose

Recover

R’ is in general larger than R. Must ensure R’ = R

correct decompositions9
Correct Decompositions
  • Given R(A,B,C) s.t. AB, the decomposition into R1(A,B), R2(A,C) is lossless
3nf a problem with bcnf
3NF: A Problem with BCNF

Unit Company Product

FD’s: Unit  Company; Company, Product  Unit

So, there is a BCNF violation, and we decompose.

Unit Company

Unit  Company

Unit Product

No FDs

so what s the problem
So What’s the Problem?

Unit Company

Unit Product

Galaga99 UW Galaga99 databases

Bingo UW Bingo databases

No problem so far. All local FD’s are satisfied.

Let’s put all the data back into a single table again:

Unit Company Product

Galaga99 UW databases

Bingo UW databases

Violates the dependency: company, product -> unit!

solution 3rd normal form 3nf
Solution: 3rd Normal Form (3NF)

A simple condition for removing anomalies from relations:

A relation R is in 3rd normal form if :

Whenever there is a nontrivial dependency A1, A2, ..., An Bfor R , then {A1, A2, ..., An } a super-key for R,

or B is part of a key.

relational algebra
Relational Algebra
  • Formalism for creating new relations from existing ones
  • Its place in the big picture:

Declartivequerylanguage

Algebra

Implementation

Relational algebraRelational bag algebra

SQL,relational calculus

relational algebra14
Relational Algebra
  • Five operators:
    • Union: 
    • Difference: -
    • Selection: s
    • Projection: P
    • Cartesian Product: 
  • Derived or auxiliary operators:
    • Intersection, complement
    • Joins (natural,equi-join, theta join, semi-join)
    • Renaming: r
1 union and 2 difference
1. Union and 2. Difference
  • R1  R2
  • Example:
    • ActiveEmployees  RetiredEmployees
  • R1 – R2
  • Example:
    • AllEmployees -- RetiredEmployees
what about intersection
What about Intersection ?
  • It is a derived operator
  • R1  R2 = R1 – (R1 – R2)
  • Also expressed as a join (will see later)
  • Example
    • UnionizedEmployees  RetiredEmployees
3 selection
3. Selection
  • Returns all tuples which satisfy a condition
  • Notation: sc(R)
  • Examples
    • sSalary > 40000(Employee)
    • sname = “Smithh”(Employee)
  • The condition c can be =, <, , >,, <>
4 projection
4. Projection
  • Eliminates columns, then removes duplicates
  • Notation: P A1,…,An(R)
  • Example: project social-security number and names:
    • PSSN, Name (Employee)
    • Output schema: Answer(SSN, Name)
5 cartesian product
5. Cartesian Product
  • Each tuple in R1 with each tuple in R2
  • Notation: R1  R2
  • Example:
    • Employee  Dependents
  • Very rare in practice; mainly used to express joins
relational algebra23
Relational Algebra
  • Five operators:
    • Union: 
    • Difference: -
    • Selection: s
    • Projection: P
    • Cartesian Product: 
  • Derived or auxiliary operators:
    • Intersection, complement
    • Joins (natural,equi-join, theta join, semi-join)
    • Renaming: r
renaming
Renaming
  • Changes the schema, not the instance
  • Notation: rB1,…,Bn (R)
  • Example:
    • rLastName, SocSocNo (Employee)
    • Output schema: Answer(LastName, SocSocNo)
renaming example
Renaming Example

Employee

Name

SSN

John

999999999

Tony

777777777

  • LastName, SocSocNo (Employee)

LastName

SocSocNo

John

999999999

Tony

777777777

natural join
Natural Join
  • Notation: R1 ⋈ R2
  • Meaning: R1 ⋈ R2 = PA(sC(R1  R2))
  • Where:
    • The selection sC checks equality of all common attributes
    • The projection eliminates the duplicate common attributes
slide27

Employee Dependents =

PName, SSN, Dname(sSSN=SSN2(Employee x rSSN2, Dname(Dependents))

Natural Join Example

Employee

Name

SSN

John

999999999

Tony

777777777

Dependents

SSN

Dname

999999999

Emily

777777777

Joe

Name

SSN

Dname

John

999999999

Emily

Tony

777777777

Joe

natural join28
Natural Join
  • R= S=
  • R ⋈ S=
natural join29
Natural Join
  • Given the schemas R(A, B, C, D), S(A, C, E), what is the schema of R ⋈ S ?
  • Given R(A, B, C), S(D, E), what is R ⋈ S ?
  • Given R(A, B), S(A, B), what is R ⋈ S ?
theta join
Theta Join
  • A join that involves a predicate
  • R1 ⋈q R2 = sq (R1  R2)
  • Here q can be any condition
eq join
Eq-join
  • A theta join where q is an equality
  • R1 ⋈A=B R2 = s A=B (R1  R2)
  • Example:
    • Employee ⋈SSN=SSN Dependents
  • Most useful join in practice
semijoin
Semijoin
  • R ⋉ S = PA1,…,An (R ⋈ S)
  • Where A1, …, An are the attributes in R
  • Example:
    • Employee ⋉ Dependents
semijoins in distributed databases
Semijoins in Distributed Databases
  • Semijoins are used in distributed databases

Dependents

Employee

network

Employee ⋈ssn=ssn (s age>71 (Dependents))

T = PSSNs age>71 (Dependents)

R = Employee ⋉ T

Answer = R ⋈ Dependents

complex ra expressions

seller-ssn=ssn

pid=pid

buyer-ssn=ssn

Complex RA Expressions

Pname

Person Purchase Person Product

Pssn

Ppid

sname=fred

sname=gizmo

operations on bags
Operations on Bags

A bag = a set with repeated elements

All operations need to be defined carefully on bags

  • {a,b,b,c}{a,b,b,b,e,f,f}={a,a,b,b,b,b,b,c,e,f,f}
  • {a,b,b,b,c,c} – {b,c,c,c,d} = {a,b,b,d}
  • sC(R): preserve the number of occurrences
  • PA(R): no duplicate elimination
  • Cartesian product, join: no duplicate elimination

Important ! Relational Engines work on bags, not sets !

Reading assignment: 5.3 – 5.4

finally ra has limitations
Finally: RA has Limitations !
  • Cannot compute “transitive closure”
  • Find all direct and indirect relatives of Fred
  • Cannot express in RA !!! Need to write C program