1 / 108

CIS 550, Fall 2001 Handout 2.

CIS 550, Fall 2001 Handout 2. SQL, Relational Calculus and Datalog. What we cannot compute with RA. Recursive queries . Given a relation Parent(Parent, Child) compute the Ancestor relation. (Can do this in Datalog.)

elata
Download Presentation

CIS 550, Fall 2001 Handout 2.

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. CIS 550, Fall 2001Handout 2. SQL, Relational Calculus and Datalog CIS 550, Fall 2001

  2. What we cannot compute with RA • Recursive queries. Given a relation Parent(Parent, Child) compute the Ancestor relation. (Can do this in Datalog.) • Aggregate operations. E.g. ``The number of climbers who have climbed `Last Tango' '' or ``The average age of • Computing with non 1NF relationse.g. lists, arrays, multisets, nested relations. CIS 550, Fall 2001

  3. SELECT [DISTINCT] target-list FROMrelation-list WHERE qualification Basic Query • relation-list A list of relation names (possibly with a range-variable after each name). • target-list A list of attributes of relations in relation-list. * can be used to denote all atts. • qualification Comparisons (Attr op const or Attr1 op Attr2, where op is one of combined using AND, OR and NOT. • DISTINCT(optional) keyword indicates that the answer should not contain duplicates. Default is that duplicates are not eliminated! CIS 550, Fall 2001

  4. Conceptual Evaluation Strategy • Compute the product of relation-list • Discard tuples that fail qualification • Project over attributes in target-list • If DISTINCT then eliminate duplicates This is probably a very bad way of executing the query, and a good query optimizer will use all sorts of tricks to find efficient strategies to compute the same answer. CIS 550, Fall 2001

  5. Sample tables Routes: RId RName Grade Rating Height 1 Last Tango II 12 100 2 Garden Path I 2 60 3 The Sluice I 8 60 4 Picnic III 3 400 Climbers: Climbs: Cid CName Skill Age CId RId Date Duration 123 Edmund EXP 80 123 1 10/10/88 5 214 Arnold BEG 25 123 3 11/08/87 1 313 Bridget EXP 33 313 1 12/08/89 5 212 James MED 27 214 2 08/07/92 2 313 1 06/07/94 3 CIS 550, Fall 2001

  6. SELECT * FROM Routes WHERE Height < 200; RID RNAME GRADE RATING HEIGHT 1 Last Tango II 12 100 2 Garden Path I 2 60 3 The Sluice I 8 60 GRADE HEIGHT I 100 I 60 I 60 III 400 Select/project queries SELECT Grade, Height FROM Routes; CIS 550, Fall 2001

  7. GRAD HEIGHT I 60 II 100 III 400 Distinct • Note that SQL did not eliminate duplicates. We need to request this explicitly. SELECT DISTINCT Grade, Height FROM Routes; CIS 550, Fall 2001

  8. RId RName Grade Rating Height 1 Last Tango II 12 100 String Matching • Can be used in where clause. “_” denotes any character, “%” 0 or more characters. SELECT * FROM Routes WHERE RName LIKE 'L_%o' CIS 550, Fall 2001

  9. Arithmetic • “as” can be used to label columns in the output; arithmetic can be used to compute results SELECT DISTINCT Grade, Height/10 as H FROM Routes; Grade H II 10 I 6 III 40 CIS 550, Fall 2001

  10. CID 123 212 214 313 Set operations -- union SELECT CId FROM Climbers WHERE Age < 40 UNION SELECT CId FROM Climbs WHERE RID = 1 ; • Duplicates do not occur in the union. CIS 550, Fall 2001

  11. CID 214 313 212 123 313 313 The UNION ALL operator preserves duplicates SELECT Cid FROM Climbers WHERE Age < 40 UNION ALL SELECT Cid FROM Climbs WHERE RID = 1 ; CIS 550, Fall 2001

  12. What does “union compatible” mean? SELECT CId FROM Climbers UNION SELECT RId FROM Routes; Ok SELECT CName FROM Climbers UNION SELECT RId FROM Routes; Error CIS 550, Fall 2001

  13. CID 123 CID 212 214 Intersection and difference SELECT CId FROM Climbers WHERE Age > 40 INTERSECT SELECT CId FROM Climbs WHERE RId = 1 ; SELECT CId FROM Climbers WHERE Age < 40 MINUS SELECT CId FROM Climbs WHERE RId = 1 ; CIS 550, Fall 2001

  14. Nested queries • We could also have written the previous queries as follows: SELECT CId FROM Climbers WHERE Age > 40 AND CId IN (SELECT CId FROM Climbs WHERE RId = 1) ; SELECT CId FROM Climbers WHERE Age < 40 AND CId NOT IN (SELECT CId FROM Climbs WHERE RId = 1) ; CIS 550, Fall 2001

  15. Nested queries with correlation SELECT CId FROM Climbers c WHERE EXISTS (SELECT * FROM Climbs b WHERE c.CId=b.CId AND b.RID = 1); SELECT CId FROM Climbers c WHERE NOT EXISTS (SELECT * FROM Climbs b WHERE c.CId=b.CId); SELECT CId FROM Climbers c WHERE EXISTS UNIQUE (SELECT * FROM Climbs b WHERE c.CId=b.CId AND RID = 1); CIS 550, Fall 2001

  16. CName Age Edmund 80 More on set comparison ops • Besides IN, NOT IN, EXISTS, NOT EXISTS, UNIQUE and NOT UNIQUE we can also say: <op> ANY, <op> ALL, where <op> is any of • What does the following mean in English? SELECT CName, Age FROM Climbers WHERE Age >= ALL (SELECT Age FROM Climbers) CIS 550, Fall 2001

  17. Cid CName Skill Age 123 Edmund EXP 80 313 Bridget EXP 33 212 James MED 27 Set comparison ops, cont. • What does the following mean in English? SELECT CName, Age FROM Climbers WHERE Age > ANY (SELECT Age FROM Climbers WHERE CName='Arnold') CIS 550, Fall 2001

  18. CNAME James Using expressions for relation names • Consider the following query: “Find the names of climbers who have not climbed any route.” SELECT CName FROM (SELECT CId FROM Climbers MINUS SELECT CId FROM Climbs) Temp, Climbers WHERE Temp.CId = Climbers.CId; CIS 550, Fall 2001

  19. CID CNAME SKILL AGE CID RID DAY DURATION 123 Edmund EXP 80 123 1 10-OCT-88 5 214 Arnold BEG 25 123 1 10-OCT-88 5 313 Bridget EXP 33 123 1 10-OCT-88 5 212 James MED 27 123 1 10-OCT-88 5 123 Edmund EXP 80 123 3 08-NOV-87 1 214 Arnold BEG 25 123 3 08-NOV-87 1 ... Products SELECT * FROM Climbers,Climbs; • Note that the CID column name is duplicated in the output. CIS 550, Fall 2001

  20. Conditional join SELECT * FROM Climbers,Climbs WHERE Climbers.CId = Climbs.CId; CID CNAME SKIL AGE CID RID DAY DURATION 123 Edmund EXP 80 123 1 10-OCT-88 5 123 Edmund EXP 80 123 3 08-NOV-87 1 313 Bridget EXP 33 313 1 08-DEC-89 5 214 Arnold BEG 25 214 2 07-AUG-92 2 313 Bridget EXP 33 313 1 07-JUN-94 3 CIS 550, Fall 2001

  21. CNAME Edmund Bridget Bridget Example 1 • The names of climbers who have climbed route 1. SELECT CName FROM Climbers, Climbs WHERE Climbers.CId = Climbs.CId AND RId= 1; CIS 550, Fall 2001

  22. CNAME Edmund Bridget Bridget Example 2 • The names of climbers who have climbed the route named “Last Tango”. SELECT CName FROM Climbers, Climbs, Routes WHERE Climbers.CId = Climbs.CId AND Routes.RId = Climbs.RID AND RName = 'Last Tango'; CIS 550, Fall 2001

  23. CID 313 313 Example 3 • The IDs of climbers who have climbed the same route twice. • Note the use of aliases for relations. SELECT C1.CId FROM Climbs C1, Climbs C2 WHERE C1.CId = C2.CId AND C1.RId = C2.RId AND (C1.Day <> C2.Day OR C1.DURATION <> C2.DURATION)); CIS 550, Fall 2001

  24. CNAME James Example 4 • Recall: The names of climbers who have not climbed any route SELECT CName FROM (SELECT CId FROM Climbers MINUS SELECT CId FROM Climbs) Temp, Climbers WHERE Temp.CId = Climbers.CId; CIS 550, Fall 2001

  25. CNAME James Example 4, cont. • A simpler alternative: SELECT CName FROM Climbers WHERE CId NOT IN (SELECT CId FROM Climbs); CIS 550, Fall 2001

  26. Universal Quantification • The IDs of climbers who have climbed all routes. SELECT CId FROM Climbs c1 WHERE NOT EXISTS (SELECT RId  Routes not climbed FROM Routes r by c1. WHERE NOT EXISTS (SELECT * FROM Climbs c2 WHERE c1.CId=c2.CId and c2.RId=r.RId) CIS 550, Fall 2001

  27. RNAME DIFFICULTY Last Tango 1200 Garden Path 120 The Sluice 480 Picnic 1200 Non-algebraic operations • SQL has a number of operations that cannot be expressed in relational algebra. The first is to express arithmetic in queries. SELECT RName, Rating * Height AS Difficulty FROM Routes; CIS 550, Fall 2001

  28. Arithmetic, cont • Arithmetic (and other expressions) cannot be used at the top level. E.g. 2+2 isn't an SQL query. • Question -- how would you get SQL to compute 2+2? CIS 550, Fall 2001

  29. COUNT(GRADE) 4 Counting SELECT COUNT(RId) FROM Routes; • Surprisingly, the answer to both of these is the following: SELECT COUNT(Grade) FROM Routes; CIS 550, Fall 2001

  30. COUNT(GRADE) 3 Counting, cont. • To fix this, we use the keyword “DISTINCT”: • Can also use SUM, AVG, MIN and MAX. SELECT COUNT(DISTINCT Grade) FROM Routes; CIS 550, Fall 2001

  31. Group by • So far, these aggregate operators have been applied to all qualifying tuples. Sometimes we want to apply them to each of several groups of tuples. • For example: “Print the number of routes in each grade.” CIS 550, Fall 2001

  32. GRADE COUNT(*) I 2 II 1 III 1 Group by SELECT Grade, COUNT(*) FROM Routes GROUP BY Grade; • Note that only the columns that appear in the GROUP BY statement and “aggregated” columns can appear in the output. So the following would generate an error. SELECT Grade, RName, COUNT(*) FROM Routes GROUP BY Grade; CIS 550, Fall 2001

  33. HEIGHT AVG(RATING) 60 5 100 12 Group by … having • HAVING is to GROUP BY as WHERE is to FROM • “HAVING” is used to restrict the groups that appear in the result. SELECT Height, AVG(Rating) FROM Routes GROUP BY Height HAVING Height < 300; CIS 550, Fall 2001

  34. HEIGHT AVG(RATING) 60 5 400 3 Another example SELECT Height, AVG(Rating) FROM Routes GROUP BY Height HAVING MAX(Rating) < 10; CIS 550, Fall 2001

  35. Null Values • The value of an attribute can be unknown (e.g., a rating has not been assigned) or inapplicable (e.g., no spouse). • SQL provides a special valuenullfor such situations. • The presence of nullcomplicates many issues. E.g.: • Special operators needed to check if value is/is not null. • Is rating>8 true or false when rating is equal to null? What about AND, OR and NOT connectives? 3-valued logic (true, false and unknown). • Meaning of constructs must be defined carefully. (e.g., WHERE clause eliminates rows that don’t evaluate to true.) CIS 550, Fall 2001

  36. Outer Join • A variant of the join that relies on null values: • Tuples of Climbers that do not match some tuple in Climbs would normally be excluded from the result; the “left” outer join preserves them with null values for the missing Climbs attributes. SELECT Climbers.CId, Climbs.RId FROM Climbers NATURAL LEFT OUTER JOIN Climbs CIS 550, Fall 2001

  37. CId CName Skill Age RId Date Duration 123 Edmund EXP 80 1 10/10/88 5 123 Edmund EXP 80 3 11/08/87 1 214 Arnold BEG 25 2 08/07/92 2 313 Bridget EXP 33 1 12/08/89 5 313 Bridget EXP 33 1 06/07/94 3 212 James MED 27   Result of left outer join • Null values can be disallowed in a query result • by specifying NOT NULL. CIS 550, Fall 2001

  38. Summary • SQL is “relationally complete”: all of the operators of the relational algebra can be simulated. • Additional features: string comparisons, set membership, arithmetic and grouping. CIS 550, Fall 2001

  39. Views in SQL • A view is a query with a name that can be used in SELECT statements. • Note that ExpClimbers is not a stored relation! CREATE VIEW ExpClimbers AS SELECT CId, CName, Age FROM Climbers WHERE Skill=‘EXP’; SELECT CName FROM ExpClimbers WHERE Age<50; CIS 550, Fall 2001

  40. Querying views • The system would perform the following translation: • This is done using the relational algebra “operator tree” representation of the query, and relational algebra equivalences. SELECT CName FROM ExpClimbers WHERE Age<50; is translated to SELECT CName FROM Climbers WHERE Skilll=‘EXP’ and Age<50; CIS 550, Fall 2001

  41. ExpClimbers Climbers Climbers The “how” of translation • The operator tree for is expanded to  SELECT CName FROM ExpClimbers WHERE Age<50; CIS 550, Fall 2001

  42. Changing the database • How do we initialize the database? How do we update and modify the database state? • SQL supports an update language for insertions, deletions and modifications of tuples. • INSERT INTO R(A1,…,An) VALUES (V1,…,Vn); • DELETE FROM R WHERE <condition>; • UPDATE R SET <new-value assignments> WHERE <condition>; CIS 550, Fall 2001

  43. RId RName Grade Rating Height 1 Last Tango II 12 100 2 Garden Path I 2 60 3 The Sluice I 8 60 4 Picnic III 3 400 Tuple insertion • Recall our rock climbing database, with the following instance of Routes: • To insert a new tuple into Routes: INSERT INTO Routes(RId, Rname, Grade, Rating, Height) VALUES (5, “Desperation”, III,12,600); CIS 550, Fall 2001

  44. RId RName Grade Rating Height 1 Last Tango II 12 100 2 Garden Path I 2 60 3 The Sluice I 8 60 4 Picnic III 3 400 5 Desperation III 12 600 Tuple insertion, cont. • Alternatively, we could omit the attributes since the order given matches the DDL for Routes: INSERT INTO Routes VALUES (5, “Desperation”, III,12,600); CIS 550, Fall 2001

  45. HardClimbs: Route Rating FeetHigh SlimyClimb 9 200 The Sluice 8 60 Set insertion • Suppose we had the following relation and wanted to add all the routes with rating > 8: INSERT INTO HardClimbs(Route,Rating,FeetHigh) SELECT DISTINCT Rname, Grade, Rating, Height FROM Routes WHERE rating>8; Route Rating FeetHigh SlimyClimb 9 200 The Sluice 8 60 Last Tango 12 100 CIS 550, Fall 2001

  46. HardClimbs: Route Rating FeetHigh SlimyClimb 9 200 Deletion • Deletion is set-oriented: the only way to delete a single tuple is to specify its key. • Suppose we wanted to get rid of all tuples in HardClimbs that are in Routes: DELETE FROM HardClimbs WHERE Route in (SELECT Name FROM Routes) CIS 550, Fall 2001

  47. Modifying tuples • Non-key values of a relation can be changed using UPDATE. • Suppose we want to increase the age of all experienced climbers by 1: • NOTE: SQL uses an “old-value” semantics. New values are calculated using the old state, not a partially modified state. UPDATE Climbers SET Age = Age+1 WHERE Skill = “EXP”; CIS 550, Fall 2001

  48. Emp Manager Salary 1 3 20,000 2 3 21,500 3  21,000 Old-value semantics • “Give a $1000 raise to every employee who earns less than their manager.” • Old-value semantics: employees 1 and 3 are given a raise. • Otherwise: employee 2 will get a raise if they are considered after employee 3 receives a raise! CIS 550, Fall 2001

  49. Modifying views • Since the view definition is not stored, the view “changes” as the relations in the FROM clause change. • We could also think of making changes to the view itself: • Unfortunately, this particular view definition is not updatable! INSERT INTO ExpClimbers VALUES (7,‘Jean’, 48); CIS 550, Fall 2001

  50. Modifying views, cont. INSERT INTO ExpClimbers VALUES (7,‘Jean’, 48); • This would imply the following insertion, since we are not given a value for skill: • If the view were computed after this update, the new tuple would not appear because ‘EXP’= does not evaluate to true! INSERT INTO Climbers VALUES (7,‘Jean’, , 48); CIS 550, Fall 2001

More Related