This presentation is the property of its rightful owner.
1 / 32

# Discussion #23 Relational Algebra PowerPoint PPT Presentation

Discussion #23 Relational Algebra. Topics. Algebras Relational Algebra use of standard notation set operators , ,  renaming  selection  projection  cross product  join |  | Queries (from English) Query optimization SQL. Relational Algebra. What is an algebra?

Discussion #23 Relational Algebra

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

### Topics

• Algebras

• Relational Algebra

• use of standard notation

• set operators , , 

• renaming 

• selection 

• projection 

• cross product 

• join ||

• Queries (from English)

• Query optimization

• SQL

### Relational Algebra

• What is an algebra?

• a pair: (set of values, set of operations)

•  ADT  type  Class  Object

e.g. stack: (set of all stacks, {pop, push, top, …})

integer: (set of all integers, {+, -, *, })

• What is relational algebra?

• (set of relations, set of relational operators)

• {, , , , , , , ||}

### Relational Algebra is Closed

• Closed: all operations produce values in the value set

• (reals, {+, *, })  closed

• (reals, {+, *, , })  not closed (divide by 0)

• (reals, {+, *, >})  not closed (T/F not in value set)

• (computer reals, {+, *, })  not closed (overflow, roundoff)

• (relations, relational operators)  closed

• Implication: we can always nest relational operators; can’t for algebras that are not closed.

• e.g. after overflow, can do nothing

• e.g. can’t always nest: (2 < 3) + 5 = ?

### Set Operations: , , and 

• Relations are sets; thus set operations should work.

• Examples:

S = A B

2 2

2 3

4 2

5 5

R = A B

1 2

2 2

2 3

RS = A B

2 2

2 3

RS = A B

1 2

2 2

2 3

4 2

5 5

SR = A B

4 2

5 5

RS = A B

1 2

### Set Operations (continued …)

• Definition: schema(R) = {A, B} = AB, i.e. the set of attributes

• We sometimes write R(AB) to mean the relation R with schema AB.

• Definition: union compatible

• schema(R) = schema(S)

• required precondition for , , 

• Definitions:

• R  S = { t | t  R  t  S}

• R  S = { t | t  R  t  S}

• R  S = { t | t  R  t  S}

### Tuple Restriction: [X]

• Restriction is a tuple operator (not a relational operator).

• t[X] restricts tuple t to the attributes in X.

A B C

t = 1 2 3

t[A] = (1)

t[AC] = (1,3)

t= (1,2,3)

t[A]= (1,2,3)[A]

= {(A,1), (B, 2), (C,3)}[A]

= {(A,1)}

= (1)

### Renaming: 

• ABR renames attribute A to be B.

• A must be in schema(R)

• B must not be in schema(R)

• Example: let

R =A B

1 2

2 2

2 3

Q =A C

2 2

3 2

RQ = ?

Not union compatible

• But with :

RCBQ= A B

1 2

2 2

2 3

3 2

CBQ =A B

2 2

3 2

### Renaming (continued…)

• Q = ABR renames attribute A to B; the result is Q.

• Precondition:

• A  schema(R)

• B  schema(R)

• Postcondition:

• schema(Q) = (schema(R)  {A})  {B}

• Q = {t' | t (tR  t' = (t – {(A, t[A])})  {(B, t[A])})}

Q = ABR = {{(B,1), (C,2)}

{(B,2), (C,2)}}

R = {{(A,1), (C,2)}

{(A,2), (C,2)}}

### Selection: 

• The selection operation selects the tuples that satisfy a condition.

R =A B

1 2

2 2

2 3

B=2R =A B

1 2

2 2

A=1R =A B

1 2

A=2B2R =A B

2 2

2 3

PR = { t | t  R  P(t) }

A=3R =A B

Note: empty, but

still retain the schema

Meaning: apply predicate P to tuple t by substituting into P appropriate t values.

• Precondition: each attribute mentioned in P must be in schema(R).

• Postcondition: PR = { t | t  R  P(t) }

• schema(PR) = schema(R)

### Projection: 

The projection operation restricts tuples in a relation to those designated in the operation.

R =A B

1 2

2 2

2 3

AR =A

1

2

BR =B

2

3

ABR =R=A,BR ={A,B}R

Q =A B C

1 1 1

2 1 1

3 4 5

BCQ =B C

1 1

4 5

Precondition: X  schema(R)

Postcondition: XR = { t' | t (t  R  t' = t[X]) }

schema(XR) = X

### Cross Product: 

Standard cartesian product adapted for relational algebra

R  S =A B C D

1 2 1 1

1 2 2 2

1 2 3 3

2 2 1 1

2 2 2 2

2 2 3 3

S =C D

1 1

2 2

3 3

R =A B

1 2

2 2

### Cross Product (continued…)

Precondition: schema(R)  schema(S) = 

Postcondition: R  S = { t | t' t''(t' R  t'' S  t = t'  t'')}

schema(R  S) = schema(R)  schema(S)

R =A B

1 2 = t'

2 2

t' = { (A,1), (B,2) }

S =C D

1 1

2 2

3 3 = t''

t'' = { (C,3), (D,3) }

t'  t'' = { (A,1), (B,2), (C,3), (D,3) }

### Cross Product (continued…)

What if R and S have the same attribute, e.g. A?

S =C A

1 1 = t'' = { (C,1), (A,1) }

2 2

3 3 = t''' = { (C,3), (A,3) }

R =A B

1 2 = t' = { (A,1), (B,2) }

2 2

Can’t do cross product

Solution: Rename

AAS

t'  t'' = { (A,1), (B,2), (C,1), (A,1) }

R  AAS =A B C A

1 2 1 1

1 2 2 2

1 2 3 3

2 2 1 1

2 2 2 2

2 2 3 3

1

1

2 2

1 2

A B

1 2

1 2

1 2

2 2 1 2

2 2 2 1

2 2 3 2

Cross Product

B' C

1 2

2 1

3 2

(R  )

B=B'

BB'S

R || S = ABC

Selection

Projection

Renaming

R || S =A B C

1 2 1 2 2 1

R =A B

1 2

2 2

S =B C

1 2

2 1

3 2

### Join (continued …)

• In general, we can equate 0, 1, 2, or more attributes using || .

• A join is defined as:

schema (R || S) = schema(R)  schema(S)

R || S = {t | t[schema(R)]  R

 t[schema(S)]  S}

• There are no preconditions  join always works.

### Join (continued…)

0 attributes in common (full cross product)

R || S =A B C D

1 1 1 1

1 1 1 5

2 3 1 1

2 3 1 5

4 1 1 1

4 1 1 5

R =A B

1 1

2 3

4 1

S =C D

1 1

1 5

1 attribute in common

R || S =A B C

1 2 2

2 2 2

2 3 3

R =A B

1 2

2 2

2 3

S =B C

1 1

2 2

3 3

2 attributes in common

R || S =A B C D

2 2 4 2

2 2 4 1

R =A B C

1 2 3

2 2 4

2 3 5

S =A B D

1 1 1

2 2 2

2 2 1

### Join (continued…)

• We can use renaming to control the ||

R || CAS =A B

1 2

R =A B

1 2

2 2

S =B C

1 2

2 1

3 2

S' = B A

1 2

2 1

3 2

= A B

2 1

1 2

2 3

R || S' =A B

1 2

• BTW, observe equivalence with intersection

### Relational Algebra Expressions

• Relational operators are closed. Thus we can nest expressions:

R =A B

1 2

3 4

S =B C D

2 5 1

2 7 2

3 2 3

4 5 4

DC=5(R || S) =A B C D

1 2 5 1

1 2 7 2 3 4 5 4

= D

1

4

• Unary operators have precedence over binary operators; binary operators are left associative.

• We can now do something very useful: ask and answer with relational algebra (almost) any query we can dream up.

### Relational Algebra Queries

PrerequisiteCourse='EE200'cp =Prerequisite

EE005

CS100

• List the prerequisites for EE200.

• When does CS101 meet?

Day,HourCourse='CS101'cdh =Day Hour

M 9AM

W 9AM

F 9AM

• When and where does EE200 meet?

Day,Hour,RoomCourse='EE200'(cdh || cr) =Day Hour Room

Tu 10AM 25 Ohm Hall W 1PM 25 Ohm Hall Th 10AM 25 Ohm Hall

• Our answers are in (cdh || cr).

• We select Course to be EE200.

• Then, project on Day, Hour, Room.

### Queries (continued…)

• Where can I find Snoopy at 9 am on Monday?

• Can we rewrite the query more optimally?

• What rules should we use?

• Associativity and commutivity of join

• Distributive laws for select and project

• What strategy should we use?

• Eliminate unnecessary operations

• Make joins as small as possible before execution

Course Room*

Course Day'M' Hour'9AM'

RoomName='Snoopy'  Day='M'  Hour='9AM' (snap || csg || cr || cdh)

=Room

Turing Aud.

### Query Optimization

RoomName='Snoopy'  Day='M'  Hour='9AM' (snap || csg || cr || cdh)

• “Intuitively” we can write

as

Room(Name='Snoopy'snap || csg || cr || Day='M'  Hour='9AM'cdh)

• Why does this execute faster?

• What laws hold that will let us do this?

• R || S = S || R

• P1P2E = P1P2E

• P(R |×| S) = R || PS (if all the attributes of P are in S)

• How do we know they hold?

### Proofs for Laws

• To prove P1P2E = P1P2E,we need to prove that two sets are equal. We prove A = B by showing AB  BA. We show that AB by showing that xA  xB.

• Thus, we can do two proofs to prove P1P2E = P1P2Eas follows:

• t  P1P2E premise

• t  E  (P1P2)(t)def.: PR = {t | tR  P(t)}

• t  E  P1(t)  P2(t)identical substitutions & operations

• t  E  P2(t)  P1(t)commutative

• t  P2E  P1(t)def. of 

• t  P1P2Edef. of 

• t  P1P2Epremise

• …just go backwards from 6 to 1 in the proof above

### Alternate Proof

(Derive the right-hand side from the left-hand side.)

Thus, we can prove P1P2E = P1P2Eas follows:

P1P2E

= {t | t  E  (P1P2)(t)}def.: PR = {t | tR  P(t)}

= {t | t  E  P1(t)  P2(t)}identical substitutions & operations

= {t | t  E  P2(t)  P1(t)}commutative

= {t | t  P2E  P1(t)}def. of 

= {t | t  P1P2E}def. of 

= P1P2Edef. of a relation

### Proofs for Laws (continued …)

• To prove P(R || S) = R || PS, where all attributes of P are in S, we again need to prove that two sets are equal.

• As before, we can convert the lhs to the rhs.

P(R || S)

= {t | t  P(R || S)}def. of a relation

= {t | t  R || S  P(t)}def.: PR={t | tRP(t)}

= {t | t[schema(R)]  R  t[schema(S)]  S  P(t)}

def.: R||S={t | t[schema(R)]Rt[schema(S)]S}

= {t | t[schema(R)]  R 

t[schema(S)]  S  P(t[schema(S)])}

all attributes of P are in S

= {t | t[schema(R)]  R  t[schema(S)]  PS}def. of 

= {t | t  R || PS}def. of ||

= R || PSdef. of a relation

### SQL

Correspondence with Relational Algebra

Assume we have relations R(AB) and S(BC).

AB = 1 R

select A

from R

where B = 1

B R B S

select B from R

except

select B from S

A, R.B, CR.B = S.B (R  S)

= R || S

select A, R.B, C

from R, S

where R.B = S.B

### SQL

Correspondence with Relational Algebra

Assume we have relations R(AB) and S(BC).

AB = 1 R

select A

from R

where B = 1

B R B S

select R.B from R

where R.B not in

(select S.B from S)

R || S

select *

from R natural join S

### SQL Queries

select PrerequisitePrerequisite

from cp EE005

where Course='EE200'CS100

• List the prerequisites for EE200.

• When does CS101 meet?

select Day, HourDay Hour

from cdhM 9AM

where Course= 'CS101'W 9AM

F 9AM

• When and where does EE200 meet?

select cdh.Course, Day, Hour, Room Course Day Hour Room

from cdh, cr EE200 Tu 10AM 25 Ohm Hall where cdh.Course= 'EE200' EE200 W 1PM 25 Ohm Hall

and cdh.Course=cr.Course EE200 Th 10AM 25 Ohm Hall

### SQL Queries

select PrerequisitePrerequisite

from cp EE005

where Course='EE200'CS100

• List the prerequisites for EE200.

• When does CS101 meet?

select Day, HourDay Hour

from cdhM 9AM

where Course= 'CS101'W 9AM

F 9AM

• When and where does EE200 meet?

select Course, Day, Hour, RoomCourse Day Hour Room

from cdh natural join cr EE200 Tu 10AM 25 Ohm Hall where cdh.Course= 'EE200' EE200 W 1PM 25 Ohm Hall

EE200 Th 10AM 25 Ohm Hall

### SQL Queries

select PrerequisitePrerequisite

from cp CS100

EE005

CS100

CS101

CS120

CS101

CS121

CS205

select distinct PrerequisitePrerequisite

from cpCS100

CS101

CS120

CS121

CS205

EE005

• List all prerequisite courses.

### SQL Queries

• Where can I find Snoopy at 9 am on Monday?

select RoomRoom

from snap, csg, cr, cdhTuring Aud.

where Name='Snoopy' and Day='M'

and Hour='9AM' and snap.StudentID=csg.StudentID

and csg.Course=cr.Course and cr.Course=cdh.Course

• List all prereqs of CS750 (including prereqs of prereqs.)

• Not possible with standard SQL (unless nesting depth is known)

• Is possible with Datalog

• Rules:prereqOf(x, y) :- cp(y, x).

• prereqOf(x, y) :- prereqOf(x, z), cp(y, z).

• Query:prereqOf(x, 'CS750')?

• To gain more power and flexibility, we typically embed SQL in a high-level language.

### SQL Queries

• List all prereqs of CS750 (including prereqs of prereqs.)

select cp.Prerequisite

from cp

where cp.Course = 'CS750'

union

select cp1.Prerequisite

from cp cp1, cp cp2

where cp1.Course = cp2.Prerequisite

and cp2.Course = 'CS750'

union