1 / 46

CS4432: Database Systems II

CS4432: Database Systems II. Query Processing. Query in SQL  Query Plan in Algebra. Example. Data: relation R (A, B, C) relation S (C, D, E) Query: SELECT B, D FROM R, S WHERE R.A = “c” and S.E = 2 and R.C=S.C.

dard
Download Presentation

CS4432: Database Systems II

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. CS4432: Database Systems II query processing

  2. Query Processing Query in SQL  Query Plan in Algebra query processing

  3. Example Data: relation R (A, B, C) relation S (C, D, E) Query: SELECT B, D FROM R, S WHERE R.A = “c” and S.E = 2 and R.C=S.C query processing

  4. SELECT B, D FROM R, S WHERE R.A = “c” and S.E = 2 and R.C=S.C Answer B D 2 x R A B C S C D E a 1 10 10 x 2 b 1 20 20 y 2 c 2 10 30 z 2 d 2 35 40 x 1 e 3 45 50 y 3 query processing

  5. How do we execute query? - Form Cartesian product of all tables in FROM-clause - Select tuples that match WHERE-clause - Project columns that occur in SELECT-clause One idea query processing

  6. Bingo! Got one... R X S R.A R.B R.C S.C S.D S.E a 1 10 10 x 2 a 1 10 20 y 2 . . C 2 10 10 x 2 . . query processing

  7. But ? • Performance would be unacceptable! • We need a better approach to reasoning about queries, their execution orders and their respective costs query processing

  8. Formal Relational Query Languages • Two mathematical Query Languages form basis for “real” languages (e.g. SQL), and for implementation: • Relational Calculus: Lets users describe what they want, rather than how to compute it. (Non-operational, declarative.) • Relational Algebra: More operational, very useful for representing execution plans. query processing

  9. Relational Algebra • Tuple : ordered set of data values • Relation: a set of tuples • Algebra: formal mathematical system consisting of a set of objects and operations on those objects • Relational algebra: Algebra whose objects are relations and operators transform relations into other relations query processing

  10. Relational Algebra ? Query: SELECT B, D FROM R, S WHERE R.A = “c” and S.E = 2 and R.C=S.C query processing

  11. Relational Algebra - can be used to describe plans... Ex: Plan I B,D sR.A=“c” S.E=2  R.C=S.C X R S OR: B,D [sR.A=“c” S.E=2  R.C = S.C (RXS)] query processing

  12. R1 Example Instances “Sailors” and “Reserves” relations S1 S2 query processing

  13. Relational Algebra • Basic operations: • Selection ( ) Selects a subset of rows from relation. • Projection ( ) Deletes unwanted columns from relation. • Cross-product( ) Allows us to combine two relations. • Set-difference ( ) Tuples in reln. 1, but not in reln. 2. • Union( ) Tuples in reln. 1 and in reln. 2. • Additional operations: • Intersection, join, division, renaming: Not essential ! • Algebra is “closed”: Since each operation returns a relation, operationscan be composed ! query processing

  14. Projection query processing

  15. Selection query processing

  16. Union, Intersection, Set Difference • Operate on two union-compatible relations: • Same number of fields. • `Corresponding’ fields have same type. query processing

  17. Cross-Product • Each row of S1 is paired with each row of R1. • Conflict: Both S1 and R1 have a field called sid. • Renaming operator: query processing

  18. Joins • Condition Join : • Result schema same as that of cross-product. query processing

  19. Joins • Equi-Join: condition contains only equalities. • Result schema only one copy of fields for which equality is specified. • Natural Join: Equijoin on all common fields. query processing

  20. Division • Not primitive operator, but useful: Find sailors who have reserved allboats. • A has 2 fields x and y; B has only field y: • A/B = • i.e., A/B contains all x tuples (sailors) such that for everyy tuple (boat) in B, there is an xy tuple in A. query processing

  21. B1 B2 B3 A Examples of Division A/B A/B1 A/B2 A/B3 query processing

  22. Disqualified x values: A/B: all disqualified tuples Expressing A/B Using Basic Operators • Division is useful shorthand. • Idea: For A/B, compute all x values that are not `disqualified’ by some y value in B. query processing

  23. Solution 1: • Solution 2: • Solution 3: Find names of sailors who’ve reserved boat #103 query processing

  24. A more efficient solution: Find names of sailors who’ve reserved a red boat • Information about boat color only available in Boats; so need an extra join: A query optimizer can find this, given the first solution! query processing

  25. What happens if is replaced by in this query? Find sailors who’ve reserved a red or a green boat • Can identify all red or green boats, then find sailors who’ve reserved one of these boats: • Can also define Tempboats using union! (How?) query processing

  26. Find sailors who’ve reserved a red and a green boat • Must identify sailors who’ve reserved red boats, sailors who’ve reserved green boats, then find the intersection (sid is a key for Sailors): query processing

  27. Find names of sailors who’ve reserved all boats • Uses division; schemas of input relations to / must be carefully chosen: • To find sailors who’ve reserved all ‘Interlake’ boats: ..... query processing

  28. Relational Algebra representation used to describe plans... query processing

  29. Example Data: relation R (A, B, C) relation S (C, D, E) Query: SELECT B, D FROM R, S WHERE R.A = “c” and S.E = 2 and R.C=S.C query processing

  30. Relational Algebra - to describe plan Ex: Plan I B,D sR.A=“c” S.E=2  R.C=S.C X R S OR: B,D [sR.A=“c” S.E=2  R.C = S.C (RXS)] query processing

  31. Another idea: Plan II B,D sR.A = “c”sS.E = 2 R S natural join query processing

  32. R S A B C s (R) s(S) C D E a 1 10 A B C C D E 10 x 2 b 1 20 c 2 10 10 x 2 20 y 2 c 2 10 20 y 2 30 z 2 d 2 35 30 z 2 40 x 1 e 3 45 50 y 3 SELECT B,D FROM R,S WHERE R.A = “c” and S.E = 2 and R.C=S.C query processing

  33. Yet another idea: Plan III B,D sS.E = 2 sR.A = “c” R S natural join query processing

  34. Plan III Use R.A and S.C Indexes (1) Use R.A index to select R tuples with R.A = “c” (2) For each R.C value found, use S.C index to find matching tuples (3) Eliminate S tuples S.E  2 (4) Join matching R,S tuples, project B,D attributes and place in result query processing

  35. =“c” <c,2,10> <10,x,2> check=2? output: <2,x> next tuple: <c,7,15> R S A B C C D E a 1 10 10 x 2 b 1 20 20 y 2 c 2 10 30 z 2 d 2 35 40 x 1 e 3 45 50 y 3 A C I1 I2 query processing

  36. Overview of Query Optimization query processing

  37. SQL query parse parse tree convert answer logical query plan execute apply laws statistics Pi “improved” l.q.p pick best estimate result sizes {(P1,C1),(P2,C2)...} l.q.p. +sizes estimate costs consider physical plans {P1,P2,…..} query processing

  38. Example: SQL query Query : Find the movies with stars born in 1960 SELECT title FROM StarsIn WHERE starName IN ( SELECT name FROM MovieStar WHERE birthdate LIKE ‘%1960’ ); query processing

  39. <Query> <SFW> SELECT <SelList> FROM <FromList> WHERE <Condition> <Attribute> <RelName> <Tuple> IN <Query> title StarsIn <Attribute> ( <Query> ) starName <SFW> SELECT <SelList> FROM <FromList> WHERE <Condition> <Attribute> <RelName> <Attribute> LIKE <Pattern> name MovieStar birthDate ‘%1960’ Example: Parse Tree query processing

  40. Example: Generating Relational Algebra title  StarsIn <condition> <tuple> IN name <attribute> birthdate LIKE ‘%1960’ starName MovieStar Fig. 16.14: An expression using a two-argument , midway between a parse tree and relational algebra query processing

  41. Example: Logical Query Plan title starName=name  StarsIn name birthdate LIKE ‘%1960’ MovieStar Fig. 16.16: Applying the rule for IN conditions query processing

  42. Example: Improved Logical Query Plan title Question: Push project to StarsIn? starName=name StarsIn name birthdate LIKE ‘%1960’ MovieStar Fig. 16.16: An improvement on prev fig query processing

  43. Example: Estimate Result Sizes Need expected size StarsIn MovieStar P s query processing

  44. Example: One Physical Plan Parameters: join order, memory size, project attributes,... Hash join SEQ scan index scan Parameters: Select Condition,... StarsIn MovieStar query processing

  45. Example: Estimate costs L.Q.P P1 P2 …. Pn C1 C2 …. Cn Pick best! query processing

  46. Query Optimization • Relational algebra level … • Detailed query plan level … • Estimate costs • Generate and compare plans query processing

More Related