Discussion 23 relational algebra
This presentation is the property of its rightful owner.
Sponsored Links
1 / 32

Discussion #23 Relational Algebra PowerPoint PPT Presentation


  • 120 Views
  • Uploaded on
  • Presentation posted in: General

Discussion #23 Relational Algebra. Topics. Algebras Relational Algebra use of standard notation set operators , ,  renaming  selection  projection  cross product  join |  | Queries (from English) Query optimization SQL. Relational Algebra. What is an algebra?

Download Presentation

Discussion #23 Relational Algebra

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


Discussion 23 relational algebra

Discussion #23Relational Algebra


Topics

Topics

  • Algebras

  • Relational Algebra

    • use of standard notation

    • set operators , , 

    • renaming 

    • selection 

    • projection 

    • cross product 

    • join ||

  • Queries (from English)

  • Query optimization

  • SQL


Relational algebra

Relational Algebra

  • What is an algebra?

    • a pair: (set of values, set of operations)

    •  ADT  type  Class  Object

      e.g. stack: (set of all stacks, {pop, push, top, …})

      integer: (set of all integers, {+, -, *, })

  • What is relational algebra?

    • (set of relations, set of relational operators)

    • {, , , , , , , ||}


Relational algebra is closed

Relational Algebra is Closed

  • Closed: all operations produce values in the value set

    • (reals, {+, *, })  closed

    • (reals, {+, *, , })  not closed (divide by 0)

    • (reals, {+, *, >})  not closed (T/F not in value set)

    • (computer reals, {+, *, })  not closed (overflow, roundoff)

    • (relations, relational operators)  closed

  • Implication: we can always nest relational operators; can’t for algebras that are not closed.

    • e.g. after overflow, can do nothing

    • e.g. can’t always nest: (2 < 3) + 5 = ?


Set operations and

Set Operations: , , and 

  • Relations are sets; thus set operations should work.

  • Examples:

S = A B

2 2

2 3

4 2

5 5

R = A B

1 2

2 2

2 3

RS = A B

2 2

2 3

RS = A B

1 2

2 2

2 3

4 2

5 5

SR = A B

4 2

5 5

RS = A B

1 2


Set operations continued

Set Operations (continued …)

  • Definition: schema(R) = {A, B} = AB, i.e. the set of attributes

  • We sometimes write R(AB) to mean the relation R with schema AB.

  • Definition: union compatible

    • schema(R) = schema(S)

    • required precondition for , , 

  • Definitions:

    • R  S = { t | t  R  t  S}

    • R  S = { t | t  R  t  S}

    • R  S = { t | t  R  t  S}


Tuple restriction x

Tuple Restriction: [X]

  • Restriction is a tuple operator (not a relational operator).

  • t[X] restricts tuple t to the attributes in X.

A B C

t = 1 2 3

t[A] = (1)

t[AC] = (1,3)

t= (1,2,3)

t[A]= (1,2,3)[A]

= {(A,1), (B, 2), (C,3)}[A]

= {(A,1)}

= (1)


Renaming

Renaming: 

  • ABR renames attribute A to be B.

    • A must be in schema(R)

    • B must not be in schema(R)

  • Example: let

R =A B

1 2

2 2

2 3

Q =A C

2 2

3 2

RQ = ?

Not union compatible

  • But with :

RCBQ= A B

1 2

2 2

2 3

3 2

CBQ =A B

2 2

3 2


Renaming continued

Renaming (continued…)

  • Q = ABR renames attribute A to B; the result is Q.

  • Precondition:

    • A  schema(R)

    • B  schema(R)

  • Postcondition:

    • schema(Q) = (schema(R)  {A})  {B}

    • Q = {t' | t (tR  t' = (t – {(A, t[A])})  {(B, t[A])})}

Q = ABR = {{(B,1), (C,2)}

{(B,2), (C,2)}}

R = {{(A,1), (C,2)}

{(A,2), (C,2)}}


Selection

Selection: 

  • The selection operation selects the tuples that satisfy a condition.

R =A B

1 2

2 2

2 3

B=2R =A B

1 2

2 2

A=1R =A B

1 2

A=2B2R =A B

2 2

2 3

PR = { t | t  R  P(t) }

A=3R =A B

Note: empty, but

still retain the schema

Meaning: apply predicate P to tuple t by substituting into P appropriate t values.

  • Precondition: each attribute mentioned in P must be in schema(R).

  • Postcondition: PR = { t | t  R  P(t) }

  • schema(PR) = schema(R)


Projection

Projection: 

The projection operation restricts tuples in a relation to those designated in the operation.

R =A B

1 2

2 2

2 3

AR =A

1

2

BR =B

2

3

ABR =R=A,BR ={A,B}R

Q =A B C

1 1 1

2 1 1

3 4 5

BCQ =B C

1 1

4 5

Precondition: X  schema(R)

Postcondition: XR = { t' | t (t  R  t' = t[X]) }

schema(XR) = X


Cross product

Cross Product: 

Standard cartesian product adapted for relational algebra

R  S =A B C D

1 2 1 1

1 2 2 2

1 2 3 3

2 2 1 1

2 2 2 2

2 2 3 3

S =C D

1 1

2 2

3 3

R =A B

1 2

2 2


Cross product continued

Cross Product (continued…)

Precondition: schema(R)  schema(S) = 

Postcondition: R  S = { t | t' t''(t' R  t'' S  t = t'  t'')}

schema(R  S) = schema(R)  schema(S)

R =A B

1 2 = t'

2 2

t' = { (A,1), (B,2) }

S =C D

1 1

2 2

3 3 = t''

t'' = { (C,3), (D,3) }

t'  t'' = { (A,1), (B,2), (C,3), (D,3) }


Cross product continued1

Cross Product (continued…)

What if R and S have the same attribute, e.g. A?

S =C A

1 1 = t'' = { (C,1), (A,1) }

2 2

3 3 = t''' = { (C,3), (A,3) }

R =A B

1 2 = t' = { (A,1), (B,2) }

2 2

Can’t do cross product

Solution: Rename

AAS

t'  t'' = { (A,1), (B,2), (C,1), (A,1) }

R  AAS =A B C A

1 2 1 1

1 2 2 2

1 2 3 3

2 2 1 1

2 2 2 2

2 2 3 3


Natural join

1

1

2 2

1 2

A B

1 2

1 2

1 2

2 2 1 2

2 2 2 1

2 2 3 2

Cross Product

B' C

1 2

2 1

3 2

(R  )

B=B'

BB'S

R || S = ABC

Selection

Projection

Renaming

Natural Join: ||

R || S =A B C

1 2 1 2 2 1

R =A B

1 2

2 2

S =B C

1 2

2 1

3 2


Join continued

Join (continued …)

  • In general, we can equate 0, 1, 2, or more attributes using || .

  • A join is defined as:

    schema (R || S) = schema(R)  schema(S)

    R || S = {t | t[schema(R)]  R

     t[schema(S)]  S}

  • There are no preconditions  join always works.


Join continued1

Join (continued…)

0 attributes in common (full cross product)

R || S =A B C D

1 1 1 1

1 1 1 5

2 3 1 1

2 3 1 5

4 1 1 1

4 1 1 5

R =A B

1 1

2 3

4 1

S =C D

1 1

1 5

1 attribute in common

R || S =A B C

1 2 2

2 2 2

2 3 3

R =A B

1 2

2 2

2 3

S =B C

1 1

2 2

3 3

2 attributes in common

R || S =A B C D

2 2 4 2

2 2 4 1

R =A B C

1 2 3

2 2 4

2 3 5

S =A B D

1 1 1

2 2 2

2 2 1


Join continued2

Join (continued…)

  • We can use renaming to control the ||

R || CAS =A B

1 2

R =A B

1 2

2 2

S =B C

1 2

2 1

3 2

S' = B A

1 2

2 1

3 2

= A B

2 1

1 2

2 3

R || S' =A B

1 2

  • BTW, observe equivalence with intersection


Relational algebra expressions

Relational Algebra Expressions

  • Relational operators are closed. Thus we can nest expressions:

R =A B

1 2

3 4

S =B C D

2 5 1

2 7 2

3 2 3

4 5 4

DC=5(R || S) =A B C D

1 2 5 1

1 2 7 2 3 4 5 4

= D

1

4

  • Unary operators have precedence over binary operators; binary operators are left associative.

  • We can now do something very useful: ask and answer with relational algebra (almost) any query we can dream up.


Relational algebra queries

Relational Algebra Queries

PrerequisiteCourse='EE200'cp =Prerequisite

EE005

CS100

  • List the prerequisites for EE200.

  • When does CS101 meet?

Day,HourCourse='CS101'cdh =Day Hour

M 9AM

W 9AM

F 9AM

  • When and where does EE200 meet?

Day,Hour,RoomCourse='EE200'(cdh || cr) =Day Hour Room

Tu 10AM 25 Ohm Hall W 1PM 25 Ohm Hall Th 10AM 25 Ohm Hall

  • Our answers are in (cdh || cr).

  • We select Course to be EE200.

  • Then, project on Day, Hour, Room.


Queries continued

Queries (continued…)

  • Where can I find Snoopy at 9 am on Monday?

  • Can we rewrite the query more optimally?

  • What rules should we use?

    • Associativity and commutivity of join

    • Distributive laws for select and project

  • What strategy should we use?

    • Eliminate unnecessary operations

    • Make joins as small as possible before execution

StudentID Name'Snoopy' Address Phone

CourseStudentID Grade

Course Room*

Course Day'M' Hour'9AM'

RoomName='Snoopy'  Day='M'  Hour='9AM' (snap || csg || cr || cdh)

=Room

Turing Aud.


Query optimization

Query Optimization

RoomName='Snoopy'  Day='M'  Hour='9AM' (snap || csg || cr || cdh)

  • “Intuitively” we can write

as

Room(Name='Snoopy'snap || csg || cr || Day='M'  Hour='9AM'cdh)

  • Why does this execute faster?

  • What laws hold that will let us do this?

  • R || S = S || R

  • P1P2E = P1P2E

  • P(R |×| S) = R || PS (if all the attributes of P are in S)

  • How do we know they hold?


Proofs for laws

Proofs for Laws

  • To prove P1P2E = P1P2E,we need to prove that two sets are equal. We prove A = B by showing AB  BA. We show that AB by showing that xA  xB.

  • Thus, we can do two proofs to prove P1P2E = P1P2Eas follows:

    • t  P1P2E premise

    • t  E  (P1P2)(t)def.: PR = {t | tR  P(t)}

    • t  E  P1(t)  P2(t)identical substitutions & operations

    • t  E  P2(t)  P1(t)commutative

    • t  P2E  P1(t)def. of 

    • t  P1P2Edef. of 

    • t  P1P2Epremise

    • …just go backwards from 6 to 1 in the proof above


Alternate proof

Alternate Proof

(Derive the right-hand side from the left-hand side.)

Thus, we can prove P1P2E = P1P2Eas follows:

P1P2E

= {t | t  E  (P1P2)(t)}def.: PR = {t | tR  P(t)}

= {t | t  E  P1(t)  P2(t)}identical substitutions & operations

= {t | t  E  P2(t)  P1(t)}commutative

= {t | t  P2E  P1(t)}def. of 

= {t | t  P1P2E}def. of 

= P1P2Edef. of a relation


Proofs for laws continued

Proofs for Laws (continued …)

  • To prove P(R || S) = R || PS, where all attributes of P are in S, we again need to prove that two sets are equal.

  • As before, we can convert the lhs to the rhs.

    P(R || S)

    = {t | t  P(R || S)}def. of a relation

    = {t | t  R || S  P(t)}def.: PR={t | tRP(t)}

    = {t | t[schema(R)]  R  t[schema(S)]  S  P(t)}

    def.: R||S={t | t[schema(R)]Rt[schema(S)]S}

    = {t | t[schema(R)]  R 

    t[schema(S)]  S  P(t[schema(S)])}

    all attributes of P are in S

    = {t | t[schema(R)]  R  t[schema(S)]  PS}def. of 

    = {t | t  R || PS}def. of ||

    = R || PSdef. of a relation


Discussion 23 relational algebra

SQL

Correspondence with Relational Algebra

Assume we have relations R(AB) and S(BC).

AB = 1 R

select A

from R

where B = 1

B R B S

select B from R

except

select B from S

A, R.B, CR.B = S.B (R  S)

= R || S

select A, R.B, C

from R, S

where R.B = S.B


Discussion 23 relational algebra

SQL

Correspondence with Relational Algebra

Assume we have relations R(AB) and S(BC).

AB = 1 R

select A

from R

where B = 1

B R B S

select R.B from R

where R.B not in

(select S.B from S)

R || S

select *

from R natural join S


Sql queries

SQL Queries

select PrerequisitePrerequisite

from cp EE005

where Course='EE200'CS100

  • List the prerequisites for EE200.

  • When does CS101 meet?

select Day, HourDay Hour

from cdhM 9AM

where Course= 'CS101'W 9AM

F 9AM

  • When and where does EE200 meet?

select cdh.Course, Day, Hour, Room Course Day Hour Room

from cdh, cr EE200 Tu 10AM 25 Ohm Hall where cdh.Course= 'EE200' EE200 W 1PM 25 Ohm Hall

and cdh.Course=cr.Course EE200 Th 10AM 25 Ohm Hall


Sql queries1

SQL Queries

select PrerequisitePrerequisite

from cp EE005

where Course='EE200'CS100

  • List the prerequisites for EE200.

  • When does CS101 meet?

select Day, HourDay Hour

from cdhM 9AM

where Course= 'CS101'W 9AM

F 9AM

  • When and where does EE200 meet?

select Course, Day, Hour, RoomCourse Day Hour Room

from cdh natural join cr EE200 Tu 10AM 25 Ohm Hall where cdh.Course= 'EE200' EE200 W 1PM 25 Ohm Hall

EE200 Th 10AM 25 Ohm Hall


Sql queries2

SQL Queries

select PrerequisitePrerequisite

from cp CS100

EE005

CS100

CS101

CS120

CS101

CS121

CS205

select distinct PrerequisitePrerequisite

from cpCS100

CS101

CS120

CS121

CS205

EE005

  • List all prerequisite courses.


Sql queries3

SQL Queries

  • Where can I find Snoopy at 9 am on Monday?

select RoomRoom

from snap, csg, cr, cdhTuring Aud.

where Name='Snoopy' and Day='M'

and Hour='9AM' and snap.StudentID=csg.StudentID

and csg.Course=cr.Course and cr.Course=cdh.Course

  • List all prereqs of CS750 (including prereqs of prereqs.)

    • Not possible with standard SQL (unless nesting depth is known)

    • Is possible with Datalog

    • Rules:prereqOf(x, y) :- cp(y, x).

    • prereqOf(x, y) :- prereqOf(x, z), cp(y, z).

    • Query:prereqOf(x, 'CS750')?

    • To gain more power and flexibility, we typically embed SQL in a high-level language.


Sql queries4

SQL Queries

  • List all prereqs of CS750 (including prereqs of prereqs.)

select cp.Prerequisite

from cp

where cp.Course = 'CS750'

union

select cp1.Prerequisite

from cp cp1, cp cp2

where cp1.Course = cp2.Prerequisite

and cp2.Course = 'CS750'

union


  • Login