1 / 61

Database Systems Chapter 6 The Relational Algebra

Database Systems Chapter 6 The Relational Algebra. Relation Schema. A relation schema is used to describe a relation Denoted by S ( A 1 , A 2 , A 3 , …, A n ), where S : Relation schema name A 1 , …, A n : attributes of S

badu
Download Presentation

Database Systems Chapter 6 The Relational Algebra

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Database SystemsChapter 6The Relational Algebra

  2. Relation Schema • A relation schema is used to describe a relation • Denoted by S(A1, A2, A3, …, An), where • S: Relation schema name • A1, …, An: attributes of S • The degree of a relation is the number of attributes in a relation schema

  3. Relation Instance (1) • A relation state (or relation instance) r of the relation schema S(A1, A2, …, An), denoted by r(S), is a set of n-tuples • r(S) = {t1, t2, …, tm) • each n-tuple ti is an ordered list of n values, • ti = <v1, v2, …, vn> where • each value vi, 1 in, is an element of dom(Ai) or a special null value

  4. Basic Relational Algebra Operations • Is a collection of operators that are used to manipulate entire relations. The result of each operation is a new relation which can be further manipulated.

  5. SELECT Operation •  --- Select (sigma) • Format: selection-condition(R) • Semantics: • returns all tuples of relation R that satisfy the selection-condition • Select operation is unary. It applies to a single relation • is used to select a subset of the tuples in a relation that satisfy a selection condition.

  6. Formats of Selection Conditions • (a) A op v: A is an attribute, op is an operator (=, , <, , >, ), and v is a constant. • Age  20, Name = `Bill' • (b) A op B: A and B are two attributes in R. • Persons(SSN, Name, Birthplace, Residence) • Birthplace = Residence • (c) Combinations of (a) and (b) connected by and, or or not. • Age  20 and Birthplace = Residence

  7. An Example of SELECT (1) • Example: Find all students who are 20 years old or younger, and whose birthplace is the same as his/her residence. • Age  20 and Birthplace = Residence(Students)

  8. An Example of Select (2) If the current Students is: SSN Name Age GPA Birthplace Residence 123456789 John 20 3.2 Vestal Vestal 234567891 Mary 18 2.9 Binghamton Vestal 345678912 Bill 19 2.7 Endwell Endwell 456789123 Nancy 24 3.6 Binghamton NYC then the result is a new relation: SSN Name Age GPA Birthplace Residence 123456789 John 20 3.2 Vestal Vestal 345678912 Bill 19 2.7 Endwell Endwell

  9. SELECT Operation • Commutativity of select: • condition-1(condition-2(R)) = condition-2(condition-1(R)) = condition-1 and condition-2(R) • city = “Irbid” AND GPA > 65 (STUDENT) or • city = “Irbid” (GPA > 65 (STUDENT)

  10. PROJECT Operation •  --- project (pi) • Format: • attribute-list(R), • where attribute-list is a subset of all attributes in R • Semantics: • Returns all tuples of relation R but for each tuple, only values under attribute-list are returned • Project removes duplicate tuples automatically

  11. Project (2) • Selects certain columns from the table and discards other columns •  <attribute-list> (<relation-name>) • degree of resulting relation is equal to the number of attributes in the <attribute-list> • The number of tuples of the result of project is less than or equal to the number of tuples in the original relation. (It removes the duplicates)

  12. Project (3) Example: Find the name and GPA of all students. Name,GPA(Students) Students SSNNameAgeGPAName GPA 123456789 John 20 3.2 John 3.2 234567891 Mary 18 2.9 Mary 2.9 345678912 John 19 3.2 Input Relation Output Relation

  13. Project (4) • If attribute-list-1attribute-list-2, then • attribute-list-1(attribute-list-2(R)) = attribute-list-1(R) • The Project operation is notcommutative • Retrieve all student numbers and names who live in Amman. • STNO, ST-Name(City = “Amman”(STUDENT))

  14. Project (5) • In complex queries, it becomes necessary to store intermediate results, therefore we should know how to give names to relations and attributes • Amman-students =city = “Amman” (STUDENT) • Result =  STNO, ST-Name (Amman-Students) • or • Result(Number, Name) =  STNO, ST-Name (Amman-Students) Renaming of attributes

  15. Select and Project • Example: • Find the name and GPA of all students who are 20 years oldor younger and whose birthplace and residence are the same • Name, GPA(Age20 and Birthplace=Residence(Students))

  16. RENAME Operation • --- rename (rho) • Format: S(R) • Semantics: • Make a copy of relation R and name the copy as S • S(R): • Rename R only • S(B1, B2, …, Bn)(R): • Rename R and its attributes • (B1, B2, …, Bn)(R): • Rename attributes only

  17. Set Theoretic Operations • UNION, INTERSECTION, DIFFERENCE. • binary (applied to two relations at a time) • To apply any of these operators to relations, relations should be union-compatible. • Two relations R(A1, A2, …, An) and S(B1, B2, …, Bm) are said to be union-compatible if: • they have the same degree (n = m) and • dom(Ai) = dom(Bi) for 1  i n. • Both R and S have the same number of attributes and the corresponding attributes have the same domain

  18. Union (1) •  --- union • Format: • R1 R2 • Semantics: • Returns all tuples that belong to either R1 or R2. • Formally: • R1 R2 = { t | t R1 or t  R2 } • Condition of union: • R1 and R2 must be union compatible. • The union operator removes duplicate tuples automatically.

  19. Union (2) Example: R1 R2 R1  R2 A B C A B C A B C a1 b1 c1 a0 b0 c0 a1 b1 c1 a2 b2 c2a1 b1 c1 a2 b2 c2 a3 b3 c3a2 b2 c2 a3 b3 c3 a4 b4 c4 a0 b0 c0 a4 b4 c4

  20. Set Difference (1) • - set difference • Format: • R1 - R2 • Semantics: • Returns all tuples that belong to R1 but not R2. • Formally: • R1 - R2 = { t | t  R1 and t R2 } • Set difference also requires union compatibility between R1 and R2

  21. Set Difference (2) Example: R1 R2 R1 - R2 A B C A B C A B C a1 b1 c1 a0 b0 c0 a3 b3 c3 a2 b2 c2a1 b1 c1 a3 b3 c3a2 b2 c2 a4 b4 c4

  22. INTERSECTION • --- set intersection • Format: • R1  R2 • Semantics: • Returns all tuples that belong to both R1 and R2. • Formally: • R1 R2 = { t | t R1 and t R2 } • Derivation from existing operators: • R1 R2 = R1 - (R1 - R2) = R2 - (R2 - R1)

  23. INTERSECTION • Union and Intersection are commutative operations • R  S = S  R and • R  S = S  R • Union and Intersection can be applied to any number of relations and both are associative: • R  (S  Q) = (S  R)  Q • R  (S  Q) = (S  R)  Q • Difference operator is not commutative: • R - S  S - R in general.

  24. Cartesian Product (1) •  --- Cartesian product • Format: • R1  R2 • Semantics: • Returns every tuple that can be formed by concatenating a tuple in R1 with a tuple in R2 • Binary operation, but the relations on which it is applied do not have to be union compatible

  25. Cartesian Product (2) Example: R1 A B C R2 B D E a1 b1 c1 b1 d1 e1 a2 b2 c2 b2 d2 e2 a3 b3 c3 R1 R2 AR1.BC R2.BD E a1 b1 c1 b1 d1 e1 a1 b1 c1 b2 d2 e2 a2 b2 c2 b1 d1 e1 a2 b2 c2 b2 d2 e2 a3 b3 c3 b1 d1 e1 a3 b3 c3 b2 d2 e2

  26. Cartesian Product (3) • If R1 and R2 have common attributes, then the full names of these attributes must be used • Example: • Use R.A instead of A • To prevent identical attribute names from occurring in the same relation schema, R R is not allowed. However, R S(R) is allowed • Commutativity: R1  R2 = R2 R1

  27. Cartesian Product (4) • Given R(A1, A2, …, An) and S(B1, B2, …, Bm) • RS = Q(A1, A2, …, An, B1, B2, …, Bm) • degree of Q = n + m • If R1 has N tuples and R2 has M tuples, then • R1 R2 has N*M tuples • Cartesian product is extremely expensive • If R1 and R2 are both large, then each relation may need to be scanned many times to perform the Cartesian product. • Writing out the result can be very expensive due to the large size of the result

  28. Example Retrieve for each female employee a list of names of her dependents FEMALE-EMPS=  Sex = “F” (EMPLOYEE) EMP-NAMES = Fname,Lname,SSN(FEMALE-EMPS) EMP-DEPENDENTS = EMP-NAMES  DEPENDENT ACTUAL-DEP= SSN = ESSN (EMP-DEPENDENTS) RESULT = Fname,Lname,Dependent-name(ACTUAL-DEP)

  29. Relational Algebra Example (1) Example: Find the names of each employee and his/her manager Employees: SSNNameAgeDept-Name 123456789 John 34 Sales 234567891 Mary 42 Service 345678912 Bill 39 null Departments: Name Location Manager Sales XYZ Bill Inventory YZX Charles Service ZXY Maria

  30. Relational Algebra Example (2) • A relational algebra expression is: Employees.Name, Departments.Manager (Employees.Dept_Name = Departments.Name (Employees  Departments)) • A simplified version (don't use full name when you don't have to): Employees.Name, Manager (Dept_Name =Departments.Name(Employees Departments))

  31. Relational Algebra Example (3) • Use assignment operator ( = ) to save the intermediate result into a temporary relation • Example: The following expression Employees.Name, Manager(Dept_Name = Departments.Name (Employees  Departments) ) • is equivalent to the following series of expressions: TEMP1 = Employees Departments TEMP2 = Dept_Name=Departments.Name(TEMP1) RESULT = Employees.Name, Manager (TEMP2)

  32. Relational Algebra Example (4) Example: Find the names of all students who have the highest GPA STUDENTS SSN Name GPA 123456789 John 3.8 234567891 Maria 3.2 345678912 Mike 3.0

  33. Relational Algebra Example (5) • Step 1: Find the GPAs that are not the highest • TEMP1 = Students.GPA(Students.GPA <S2.GPA (Students S2(Students))) Students S2(Students) SSN Name GPA S2.SSN S2.Name S2.GPA 123456789 John 3.8 123456789 John 3.8 123456789 John 3.8 234567891 Maria 3.2 123456789 John 3.8 345678912 Mike 3.0 234567891 Maria 3.2 123456789 John 3.8 234567891 Maria 3.2 234567891 Maria 3.2 234567891 Maria 3.2 345678912 Mike 3.0 345678912 Mike 3.0 123456789 John 3.8 345678912 Mike 3.0 234567891 Maria 3.2 345678912 Mike 3.0 345678912 Mike 3.0

  34. Relational Algebra Example (6) • Step 2: Find the highest GPA • TEMP2 = GPA(Students) - TEMP1 • Step 3: Find the names of students who have the highest GPA • RESULT = Name(Students.GPA = EMP2.GPA(StudentsTEMP2))

  35. Join (1) • --- join • Format: • R1 join-condition R2 • Semantics: • Returns all tuples in R1 R2 which satisfy the join condition • Derivation from existing operators: • R1 join-condition R2 = join-condition(R1 R2) • Format of join condition: • R1.A op R2.B • R1.A1 op R2.B1and R1.A2 op R2.B2 . . . • Tuples whose join attributes are NULL do not appear in the result

  36. Join (2) Example: Find the names of all employees and theirdepartment locations Employees: SSNNameAgeDept-Name 123456789 John 34 Sales 234567891 Mary 42 Service 345678912 Bill 39 null Departments: NameLocationManager Sales Binghamton Bill Inventory Endicott Charles Service Vestal Maria

  37. Join (3) Employees.Name, Location(Employees Dept-Name = Departments.Name Departments) Result Name Location John Binghamton Mary Vestal

  38. Join (4) Example: Find the names of all employees who earn more than his/her manager Employees: SSNNameSalary Manager-SSN 123456789 John 34k 234567891 234567891 Bill 40k null 345678912 Mary 38k null 456789123 Mike 41k 345678912 Employees.Name (Employees Employees.Manager-SSN = EMP.SSN and Employees.Salary > EMP.Salary EMP(Employees))

  39. Equijoin • Definition: A join is called an equijoin if only equality operator is used in all join conditions. R1 R2 R1 R1.B = R2.BR2 A B B C A R1.B R2.B C a b b c a b b c d b c d d b b c b c a d b c c d • Most joins in practice are equijoins.

  40. Natural Join (1) • Definition: • A join between R1 and R2 is a natural join if • There is an equality comparison between every pair of identically named attributes from the two relations • Among each pair of identically named attributes from the two relations, only one remains in the result • Natural join is denoted by  with no join conditions explicitly specified

  41. Natural Join (2) • Example: • R1(A, B, C) R2(A, C, D) • has attributes (A, B, C, D) in the result • Questions: • How to express natural join in terms of equijoin and other relational operator? • R1  R2 = R1.A, B, R2.C, D (R1R1.A=R2.Aand R1.C= R2.C R2)

  42. A Complete Set of Relational Algebra Operations • The relational algebra is a set of expressions as defined below: • A relation is an expression. • If E1 and E2 are expressions, so are P(E1), A(E1), S(E1), E1E2, E1-E2, E1E2 • That is, any expression that can be formed from base relations and the six relational operators is a relational algebra expression

  43. Aggregate Functions & Grouping • Well know Aggregate Functions: • SUM, AVERAGE, MAXIMUM, MINIMUM, COUNT • Format:  (Script F) • <grouping attributes>  <function list> (R) • <grouping attributes> is a list of attributes in R • <function list> is a list of (<function> <attribute>) pairs • <function> is one of aggregate functions • <attribute> is an attribute in R

  44. Examples • Retrieve each department number, the number of employees in the department, and their average salary: • DNOCOUNT( SSN), AVERAGE(SALARY) (EMPLOYEE) DNO COUNT_SSN AVERAGE_SALARY 1 1 55000 4 3 31000 5 4 33250

  45. Examples • Retrieve for each department number, the number of employees in the department, and their average salary: R(DNO, NO_OF_EMP, AVERAGE_SAL)(DNO  COUNT( SSN), AVERAGE(SALARY) (EMPLOYEE)) DNO NO_OF_EMP AVERAGE_SAL 1 1 55000 4 3 31000 5 4 33250

  46. Examples • If no grouping attributes are specified, the functions are applied to attribute values of all the tuples in the relation • Retrieve the number of employees, and their average salary: • COUNT( SSN), AVERAGE(SALARY) (EMPLOYEE)) COUNT_SSN AVERAGE_SALARY 8 35125

  47. Outerjoin (1) R1 R2 R1  R2 A B C C D E A B C D E a1 b1 c1c1 d1 e1 a1 b1 c1 d1 e1 a4 b3 c2 c6 d3 e2 • The second tuples of R1 and R2 are not present in the result (called dangling tuples) • Applications exist that require to retain dangling tuples

  48. Outerjoin (2) O --- outer join Format: R1 O R2 Semantics: like join except • it retains dangling tuples from both R1 and R2 • it uses null to fill out missing entries R1 O R2 A B C D E a1 b1 c1 d1 e1 a4 b3 c2 null null null null c6 d3 e2

  49. Left Outerjoin and Right Outerjoin • LO --- left outer join • Format: R1 LO R2 • Semantics: like outerjoin but retains only dangling tuples of the relation on the left • RO --- right outer join • Format: R1 RO R2 • Semantics: like outerjoin but retains only dangling tuples of the relation on the right

  50. Left Outerjoin and Right Outerjoin (2) R1 R2 R1  R2 A B C C D E A B C D E a1 b1 c1c1 d1 e1 a1 b1 c1 d1 e1 a4 b3 c2 c6 d3 e2

More Related