200 Views

Download Presentation
##### Relational Algebra - Chapter 6.1-6.5

**An Image/Link below is provided (as is) to download presentation**

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - - -

**Relational Algebra**• Theoretical basis for SQL (E.F. Codd) • Relational algebra (algebraic notation) and relational calculus (logical notation) • Created to demonstrate the potential for a query language of the relational model • Algebra and calculus are equivalent in expressive power • Can represent complex queries compactly, but too mathematical for the average person**What does it do?**• Provide DML and DDL • In relational algebra, a series of operations are combined to form a relational algebra expression (query)**Set theoretic operations**• Union, Intersection, Difference - Binary • Applied to 2 sets (relations) - no duplicates in result - mathematical set • Must be same type of tuples - Compatibility • same degree n • dom(Ai) = dom(Bi) • Fig. 6.4 • Resulting relation - same attribute names as first relation • Which operations are: • Commutative ? • R U S = S U R • Associative ? • R U (S U T) = (R U S) U T**Cartesian Product X - Binary**• Also binary, but does not require union compatibility • R(A1, A2, … An) X S(B1, B2, …, Bm) • Creates a tuple with the combined attributes of 2 tables • Q(A1, A2, …, An, B1, B2, …, Bm) Fig. 6.5 • Degree of resulting relation? • n+m**Select Operation s**• s - unary operation (Where in SQL) • A subset of tuples satisfying a selection condition • Selects rows • Equivalent to select condition in WHERE clause s<selection condition> (<Relation name>) sdno=4(Employee) s salary>30000(Employee)**Select Operation**• Select condition is a Boolean expression <attr_name> <comparison op> <constant value> <attr_name> <comparison op> <attr_name> comparison op - =, <, <=, etc. • You can use boolean conditions to connect clauses • Can combine cascade of selects into single select with ANDs s(dno=4 and salary>25000) or (dno=5 and salary>30000)(Employee) Fig. 6.1**Select Operation**• Degree of resulting relation? • Selectivity - the fraction of tuples selected • number of tuples total tuples • Is Select is commutative?**Project Operation p or Õ**• p - unary operation • Equivalent to the SELECT clause in SQL(Select) • Keeps only certain attributes (columns) from a relation • Selects columns • Form of operation: p<attr_list> (<relation name>) pfname, lname, salary Employee Fig 6.1 • Resulting relation has only those attributes specified • Degree of relation ? • attributes in attr_list**Project Operation**• The project operation eliminates duplicate tuples in the resulting relation so that it remains a mathematical set psex, salary Employee Fig 6.1 • If several male employees have salary $30,000 only single tuple <M, 30000> is kept in the resulting relation. • Is the project operation commutative?**Operations**• DML Operations: • set theory operations • relational DB operations • Relational DB operations: • Select • Project • Join**Sequences of operations**• Several relation algebra operations can be combined to form a relational algebra expression (query). • Retrieve the names and salaries of employees who work in department 5. Q ←pfname, lname, salary (sdno=5 Employee) • Alternately, explicit intermediate relations can be specified for each step: Dept5 ← sdno=5 Employee R ← pfname,lname,salary Dept5**Write the following in Relational Algebra**• Select * from Employee • Select bdate from Employee • Select * from Employee where sex='F'**Write the following in Relational Algebra**Select ssn from Employee where not exists (select * from Dependent where ssn=essn) (Select ssn from employee) minus (Select essn from dependent)**Renaming**• Attributes can optionally be renamed in the resulting relation: Dept5 ¬sdno=5 Employee T(firstname,lastname,salary) ¬pfname,lname,salaryDept5 Fig. 6.2 • Alternative notation in textbook: rR(firstname, lastname, salary) pfname,lname,salaryDept5**The Join operation |X|**• Similar to a Cartesian Product followed by a select • Form of operation: R |X|<join condition> S • Result is: Q (A1, A2, …, An, B1, B2, …, Bm) A1, A2 … are the attributes of R B1, B2, .. are the attributes of S • For all tuples that satisfy the join condition • join condition: <cond> and <cond> and … Fig. 6.6 • Resulting number of tuples? • Different types of joins - theta join, natural join, equijoin**Theta Join**• R |X|Ai q Bi S • where the join condition is of the form: Ai q Bi q is =, < , £ , etc. • Example: Scholarship(SNameGPA_ReqDesc) Student (Name CWID GPA Major) Select Name, SName From Student, Scholarship Where GPA >= GPA_Req**Natural join**• We will use the * notation (some others use |X| without subscript) • Like an equijoin, except attributes for the equijoin in the second relation are deleted from result • (Why have 2 columns with the same value?) Q ¬ R * (<list1>),(<list2>) S Fig 6.7 • Equivalent to equijoin but keep only list1 • If attributes have the same name in both relations, list1 and list2 are not needed. • In the original definition of natural join, the join attributes required to have the same names in both relations.**Equijoin**• R |X|Ai=Bi S • requires identical values in every tuple for each pair of join attributes • (one or more equality comparisons) • Join conditions are all of the form Ai = Bi and Aj = Bj … • Retrieve each department’s name and manager’s name. T¬ Department |X|mgrssn=ssnEmployee Result ¬pdname,fname,lname (T)**Order of precedence**• Unary: Select and project (highest precedence) • Binary: • Joins, Cartesian product • Intersection • Union, minus • Use lots of parenthesis!**Write the following in Relational Algebra**• Select * from Employee, Department where dno=dnumber • List employee SSNs who are female and work for the research department • Select *From Employee, dept_locations Where dno = dnumber and dlocation = 'Houston'**Renaming attributes**• A relation can have a set of join attributes with itself • List all employee names and their supervisor names S(soc, first, last)¬pssn,fname,lname Employee Temp ¬ Employee |X|superssn=socS Result ¬pfname,lname,first,last(Temp) • Usually, don't see qualification of attributes in relational algebra**Complete Set of Relational Algebra Operations**{ s , p , È , - , ×} • All other relational algebra operations can be expressed as a sequence of operations from this set. • Other operations are for convenience. R |X| S = s<cond> (R X S) R Ç S = (RÈ S) - ((R - S) È (S - R))**Do we need anymore relational algebra operations to satisfy**queries?**How about?**Select COUNT(*) From Project Select pname, COUNT(ssn) From Project, Works_on Where pnumber=pno**Additional relation algebra Operations**• Aggregate function - SUM, COUNT, AVG, MIN, MAX [<grouping attribute>] Á <function list> (<relation name>) R ¬Ácount ssn, avg salary(Employee) The following uses the optional grouping attribute R ¬dnoÁcount ssn, avg salary(Employee) Fig. 6.9 • The attributes returned from an aggregate function are the attributes in the function listand any grouping attributes listed**Outer Join**• Extension of join and union • In a regular equijoin or natural join, tuples in R1 or R2 that do not have matching tuples in the other relation do not appear in the result. • Some queries require all tuples in R1 (or R2 or both) to appear in the result • When no matching tuples are found, nulls are placed for the missing attributes.**Outer Join**• Left outer join: R1 ]X| R2 keeps every tuple in R1 in result. • List all employees and if they are a manager, list dname Temp <- (Employee ]X| ssn=mgrssn Department) R <- pfname, minit, lname, dname (Temp) Fig. 6.11 • Right outer join: R1 |X[ R2 keeps every tuple in R2 in result. • Full outer join: R1 ]X[ R2 keeps every tuple in R1 and R2 in result. • Think about how this is different from R1 X R2.**Division operation**• Part of original relational algebra • T(Y) = R(Z) ¸ S(X) • tuple t is in result if t is in R for every tuple in S • More generally, result is a relation T(Y) that includes t if t appears in R with the value of X for every tuple in S. Fig. 6.8 • The attributes Y in table T = attributes of R in Z - attributes S in X, where Y is the set of attributes in R not in S. Result <- R ¸ S**Division operation**• For example, Retrieve names of all employees who work on all projects John smith works on. smith <- sfname='John' and lname='Smith'(Employee)smith_pnos <- Works_on |X| ssn=essn smithssn_pnos <- pessn,pnoWorks_onssns <- ssn_pnos¸ smith_pnos results <- pfname, lname (ssns * employee)**Write the following in Relational Algebra**• Compute the average number of dependents over employees with dependents • Select * From Employee Where salary > all (Select salary From Employee Where sex = 'F')**Write in Relational Algebra**• For each project on which more than two employees work, retrieve the project number, project name, and the number of employees who work on that project. Select pnumber, pname, COUNT(*) From Project, Works_on Where pnumber =pno Group By pnumber, pname Having COUNT(*) > 2**DDL - Also provided**• Declare Schema for database • Declare Relation for Schema • Insert <values> into Relation • Delete Relation tuple with specified condition • Modify col. of Relation tuple with specified condition