1 / 34

CS 405G: Introduction to Database Systems

CS 405G: Introduction to Database Systems. Lecture 6: Relational Algebra Instructor: Chen Qian Spring 2014. p. Review. Update Operations on Relations. Update operations INSERT a tuple. DELETE a tuple. MODIFY a tuple. Constraints should not be violated in updates. Example.

tauret
Download Presentation

CS 405G: Introduction to Database Systems

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. CS 405G: Introduction to Database Systems Lecture 6: Relational Algebra Instructor: Chen Qian Spring 2014 p

  2. Review Chen Qian, University of Kentucky

  3. Update Operations on Relations • Update operations • INSERT a tuple. • DELETE a tuple. • MODIFY a tuple. • Constraints should not be violated in updates Chen Qian, University of Kentucky

  4. Example • We have the following relational schemas • Student(sid: string, name: string, gpa: float) • Course(cid: string, department: string) • Enrolled(sid: string, cid: string, grade: character) • We have the following sequence of database update operations. (assume all tables are empty before we apply any operations) • INSERT<‘1234’, ‘John Smith’, ‘3.5> into Student Chen Qian, University of Kentucky

  5. Example (Cont.) • INSERT<‘647’, ‘EECS’> into Courses • INSERT<‘1234’, ‘647’, ‘B’> into Enrolled • UPDATE the grade in the Enrolled tuple with sid = 1234 and cid = 647 to ‘A’. • DELETE the Enrolled tuple with sid 1234 and cid 647 Chen Qian, University of Kentucky

  6. Exercise • INSERT<‘108’, ‘MATH’> into Courses • INSERT<‘1234’, ‘108’, ‘B’> into Enrolled • INSERT<‘1123’, ‘Mary Carter’, ‘3.8’> into Student Chen Qian, University of Kentucky

  7. Exercise (cont.) • A little bit tricky • INSERT<‘1125’, ‘Bob Lee’, ‘good’> into Student • Fail due to domain constraint • INSERT<‘1123’, NULL, ‘B’> into Enrolled • Fail due to entity integrity • INSERT <‘1233’,’647’, ‘A’> into Enrolled • Failed due to referential integrity Chen Qian, University of Kentucky

  8. Exercise (cont.) • A more tricky one • UPDATE the cid in the tuple from Course where cid = 108 to 109 Chen Qian, University of Kentucky

  9. Update Operations on Relations • In case of integrity violation, several actions can be taken: • Cancel the operation that causes the violation (REJECT option) • Perform the operation but inform the user of the violation • Trigger additional updates so the violation is corrected (CASCADE option, SET NULL option) • Execute a user-specified error-correction routine Chen Qian, University of Kentucky

  10. Relational Query Languages • Query languages: Allow manipulation and retrieval of data from a database. • Relational model supports simple, powerful QLs: • Strong formal foundation based on logic. • Allows for much optimization. • Query Languages != programming languages! • QLs not intended to be used for complex calculations and inference (e.g. logical reasoning) • QLs support easy, efficient access to large data sets.

  11. Formal Relational Query Languages Two mathematical Query Languages form the basis for “real” languages (e.g. SQL), and for implementation: Relational Algebra: More operational, very useful for representing execution plans. Relational Calculus: Lets users describe what they want, rather than how to compute it. (Non-procedural, declarative.) • Understanding Algebra & Calculus is key to understanding SQL, query processing!

  12. RelOp RelOp Relational algebra • Core set of operators: • Selection, projection, cross product, union, difference, and renaming • Additional, derived operators: • Join, natural join, intersection, etc. • Compose operators to make complex queries A language for querying relational databases based on operators: Chen Qian, University of Kentucky

  13. Selection • Input: a table R • Notation: pR • p is called a selection condition/predicate • Purpose: filter rows according to some criteria • Output: same columns as R, but only rows of R that satisfy p Chen Qian, University of Kentucky

  14. GPA > 3.0 Selection example • Students with GPA higher than 3.0  GPA > 3.0Student Chen Qian, University of Kentucky

  15. More on selection • Selection predicate in general can include any column of R, constants, comparisons (=, >, etc.), and Boolean connectives (: and, : or, and ¬ : negation (not) ) • Example: straight A students under 18 or over 21 GPA= 4.0  (age < 18 age > 21)Student • But you must be able to evaluate the predicate over a single row of the input table • Example: student with the highest GPA GPA>= all GPA in Student tableStudent Chen Qian, University of Kentucky

  16. Projection • Input: a table R • Notation: πLR • L is a list of columns in R • Purpose: select columns to output • Output: same rows, but only the columns in L • Order of the rows is preserved • Number of rows may be less (depends on where we have duplicates or not) Chen Qian, University of Kentucky

  17. πSID, name Projection example • ID’s and names of all students πSID, nameStudent Chen Qian, University of Kentucky

  18. πage More on projection • Duplicate output rows are removed (by definition) • Example: student ages πageStudent Chen Qian, University of Kentucky

  19. Cross product • Input: two tables R and S • Notation: R×S • Purpose: pairs rows from two tables • Output: for each row r in R and each row s in S, output a row rs (concatenation of r and s) Chen Qian, University of Kentucky

  20. × Cross product example • Student × Enroll Chen Qian, University of Kentucky

  21. A note on column ordering • The ordering of columns in a table is considered unimportant (as is the ordering of rows) = • That means cross product is commutative, i.e.,R × S = S × R for any R and S Chen Qian, University of Kentucky

  22. Derived operator: join • Input: two tables R and S • Notation: R pS • p is called a join condition/predicate • Purpose: relate rows from two tables according to some criteria • Output: for each row r in R and each row s in S, output a row rs if r and s satisfy p • Shorthand for σp ( RXS )

  23. Student.SID = Enroll.SID Join example • Info about students, plus CID’s of their courses Student(Student.SID = Enroll.SID)Enroll Use table_name. column_name syntaxto disambiguate identically named columns from different input tables

  24. Derived operator: natural join • Input: two tables R and S • Notation: R S • Purpose: relate rows from two tables, and • Enforce equality on all common attributes • Eliminate one copy of common attributes • Shorthand for πL ( R pS ), where • p equates all attributes common to R and S • L is the union of all attributes from R and S, with duplicate attributes removed

  25. Natural join example • Student Enroll = πL ( Student pEnroll ) = πSID, name, age, GPA, CID ( Student Student.SID = Enroll.SIDEnroll )

  26. Union • Input: two tables R and S • Notation: R S • R and S must have identical schema • Output: • Has the same schema as R and S • Contains all rows in R and all rows in S, with duplicate rows eliminated

  27. Difference • Input: two tables R and S • Notation: R-S • R and S must have identical schema • Output: • Has the same schema as R and S • Contains all rows in R that are not found in S

  28. Derived operator: intersection • Input: two tables R and S • Notation: R\S • R and S must have identical schema • Output: • Has the same schema as R and S • Contains all rows that are in both R and S • Shorthand for R- ( R-S ) • Also equivalent to S - ( S-R ) • And to RS Jinze Liu @ University of Kentucky

  29. Renaming • Input: a table R • Notation: ρSR, ρ(A1, A2, …)R or ρS(A1, A2, …)R • Purpose: rename a table and/or its columns • Output: a renamed table with the same rows as R • Used to • Avoid confusion caused by identical column names • Create identical columns names for natural joins

  30. Enroll1(SID1, CID1,Grade1) Renaming Example • Enroll1(SID1, CID1,Grade1) Enroll

  31. Review: Summary of core operators • Selection: • Projection: • Cross product: • Union: • Difference: • Renaming: • Does not really add “processing” power σpR πLR RXS RS R- S ρS(A1, A2, …)R

  32. Review Summary of derived operators • Join: • Natural join: • Intersection: RpS RS RS • Many more • Outer join, Division, • Semijoin, anti-semijoin, …

  33. Red parts pid of red parts Catalog having red parts

  34. sid of suppliers who support Red parts names of suppliers who support Red parts

More Related