 Download Presentation Lecture 07: Relational Algebra # Lecture 07: Relational Algebra - PowerPoint PPT Presentation

Download Presentation ##### Lecture 07: Relational Algebra

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - - -
##### Presentation Transcript

1. Lecture 07:Relational Algebra

2. Outline • Relational Algebra (Section 6.1)

3. Relational Algebra • Formalism for creating new relations from existing ones • Its place in the big picture: Declarativequerylanguage Algebra Implementation Relational algebra SQL,relational calculus

4. Relational Algebra • Five operators: • Union:  • Difference: - • Selection: s • Projection: P • Cartesian Product:  • Derived or auxiliary operators: • Intersection, complement • Joins (natural,equi-join, theta join, semi-join) • Renaming: r

5. 1. Union and 2. Difference • R1  R2Example: • ActiveEmployees  RetiredEmployees • R1 – R2Example: • AllEmployees − RetiredEmployees

6. What about Intersection ? • It is a derived operatorR1  R2 = R1 – (R1 – R2) • Also expressed as a join (will see later)Example • UnionizedEmployees  RetiredEmployees

7. 3. Selection • Returns all tuples which satisfy a condition • Notation: sc(R) • Examples • sSalary > 40000(Employee) • sname = “Smith”(Employee) • The condition c can be =, <, , >,, <> [in SQL: SELECT * FROM Employee WHERE Salary > 40000]

8. Find all employees with salary more than \$40,000. s Salary > 40000(Employee)

9. 4. Projection • Eliminates columns, then removes duplicates • Notation: P A1,…,An(R) • Example: project to social-security number and names: • PSSN, Name (Employee) • Output schema: Answer(SSN, Name) [In SQL: SELECT DISTINCT SSN, Name FROM Employee]

10. PSSN, Name (Employee)

11. 5. Cartesian Product • Combine each tuple in R1 with each tuple in R2 • Notation: R1  R2 • Example: • Employee  Dependents • Very rare in practice; mainly used to express joins [In SQL: SELECT * FROM R1, R2]

12. Relational Algebra • Five operators: • Union:  • Difference: - • Selection: s • Projection: P • Cartesian Product:  • Derived or auxiliary operators: • Intersection, complement • Joins (natural,equi-join, theta join, semi-join) • Renaming: r

13. Renaming • Changes the schema, not the instance • Schema: R(A1, …, An ) • Notation: rB1,…,Bn (R) • Example: • rLastName, SocSocNo (Employee) • Output schema: Answer(LastName, SocSocNo) [in SQL: SELECT Name AS LastName, SSN AS SocSocNo FROM Employee]

14. Renaming Example Employee Name SSN John 999999999 Tony 777777777 • LastName, SocSocNo (Employee) LastName SocSocNo John 999999999 Tony 777777777

15. Natural Join • Notation: R1 ⋈ R2 • Meaning: R1 ⋈ R2 = PA(sC(R1  R2)) • Where: • The selection sC checks equality of all common attributes • The projection eliminates the duplicate common attributes [in SQL: SELECT DISTINCT R1.A, R1. B, R2.C FROM R1, R2 WHERE R1.B = R2.B Schema: R1(A,B), R2(B,C)]

16. Employee Dependents = PName, SSN, Dname(sSSN=SSN2(Employee x rSSN2, Dname(Dependents)) Natural Join Example Employee Name SSN John 999999999 Tony 777777777 Dependents SSN Dname 999999999 Emily 777777777 Joe Name SSN Dname John 999999999 Emily Tony 777777777 Joe

17. Natural Join • R= S= • R ⋈ S=

18. Natural Join • Given the schemas R(A, B, C, D), S(A, C, E), what is the schema of R ⋈ S ? • Given R(A, B, C), S(D, E), what is R ⋈ S ? • Given R(A, B), S(A, B), what is R ⋈ S ?

19. Theta Join • A join that involves a predicate • R1 ⋈q R2 = sq (R1  R2) • Here q can be any condition

20. Eq-join • A theta join where q is an equality R1 ⋈A=B R2 = s A=B (R1  R2) • Example: • Employee ⋈SSN=SSN Dependents • Most useful join in practice(difference to natural join?)

21. Semijoin • R ⋉ S = PA1,…,An (R ⋈ S) • Where A1, …, An are the attributes in R • Example: • Employee ⋉ Dependents

22. Semijoins in Distributed Databases • Semijoins are used in distributed databases Dependents Employee network Employee ⋈ssn=ssn (s age>71 (Dependents)) T = PSSNs age>71 (Dependents) R = Employee ⋉ T Answer = R ⋈ Dependents

23. seller-ssn=ssn pid=pid buyer-ssn=ssn Complex RA Expressions Pname Person Purchase Person Product Pssn Ppid sname=fred sname=gizmo

24. sname sid=sid (Scan; (Scan; write to write to rating > 5 bid=100 temp T2) temp T1) Reserves Sailors sname rating > 5 bid=100 sid=sid Sailors Reserves Application: Query Rewriting for Optimization The earlier we process selections, less tuples we need to manipulate higher up in the tree (predicate pushdown) Disadvantages?

25. Algebraic Laws (Examples) • Commutative and Associative Laws • R ∩ S = S ∩ R, R ∩ (S ∩ T) = (R ∩ S) ∩ T • R S = S R, R (S T) = (R S) T • Laws involving selection • sC AND C’(R) = sC(sC’(R)) = sC(R) ∩ sC’(R) • sC (R S) = sC (R) S • When C involves only attributes of R • Laws involving projections • PM(PN(R)) = PM,N(R) ⋈ ⋈ ⋈ ⋈ ⋈ ⋈ ⋈ ⋈

26. Operations on Bags A bag = a set with repeated elements All operations need to be defined carefully on bags • {a,b,b,c}{a,b,b,b,e,f,f}={a,a,b,b,b,b,b,c,e,f,f} • {a,b,b,b,c,c} – {b,c,c,c,d} = {a,b,b} • sC(R): preserve the number of occurrences • PA(R): no duplicate elimination • Cartesian product, join: no duplicate elimination Important ! Relational Engines work on bags, not sets !

27. Finally: RA has Limitations ! • Cannot compute “transitive closure” • Find all direct and indirect relatives of Fred • Cannot express in RA !!! Need to write C program