1 / 39

C20.0046: Database Management Systems Lecture #10

C20.0046: Database Management Systems Lecture #10. Matthew P. Johnson Stern School of Business, NYU Spring, 2004. Agenda. Last time: R.A., Bags This time: Finish R.A. Begin SQL Project Part 2 due now Something else assigned soon. Relational Algebra Review. Five basic operators:

adriel
Download Presentation

C20.0046: Database Management Systems Lecture #10

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. C20.0046: Database Management SystemsLecture #10 Matthew P. Johnson Stern School of Business, NYU Spring, 2004 M.P. Johnson, DBMS, Stern/NYU, Sp2004

  2. Agenda • Last time: R.A., Bags • This time: • Finish R.A. • Begin SQL • Project Part 2 due now • Something else assigned soon M.P. Johnson, DBMS, Stern/NYU, Sp2004

  3. Relational Algebra Review • Five basic operators: • Union:  Intersection: Difference: - • Selection: s • Projection: P • Cartesian Product:  • Extended operators: • Joins (equijoin, theta join, semijoin, outerjoin) • Renaming: r • Extended projection P • Sorting t • Grouping-and-aggregation op g M.P. Johnson, DBMS, Stern/NYU, Sp2004

  4. Sorting • So far, everything’s an unordered bag • But sometimes order is nice • Sort op tL(R) produces a list, not a bag • No operators operate on lists •  if sort called, generally last op • Subscript L = a1,a2,… in op is the list of attributes to sort on • Rows sorted by attributes • Rows with same a1 value sorted by a2, etc. M.P. Johnson, DBMS, Stern/NYU, Sp2004

  5. Outerjoin • Like L ⋈ R except that dangling tuples are included, padded with nulls • Left outerjoin: dangling tuples from L are include • Nulls appear “on the right” • Right outerjoin: dangling tuples from R are included • Nulls appear “on the left” M.P. Johnson, DBMS, Stern/NYU, Sp2004

  6. Constraints on Relations (5.5) • Ref. integ., FDs, other constraints are expressible in RA • Two basic tools: • R = Æ • Assert R is empty • R Í S • Assert R is a subset of S • NB: They’re equivalent • R Í S iff R – S = Æ • R = Æ iff R Í S-S (for arbitrary S) M.P. Johnson, DBMS, Stern/NYU, Sp2004

  7. Expressing referential integrity • Relations: • Reps(ssn, name, etc.) • Clients(ssn, name, rssn) • Suppose we require: each client gets a sales rep • a client’s row contains an rssn  must have a rep with that ssn • How to require this in RA? • Every Clients.rssn must be in the set of Reps.ssns • Prssn(Clients) ÍPssn(Reps) • Or: Prssn(Clients) – Pssn(Reps) = Æ M.P. Johnson, DBMS, Stern/NYU, Sp2004

  8. Expressing referential integrity • Also works for multiple attributes • Relations: • StarsIn(SName,Title,Year) • Movies(Title, Year, Length, Studio) • Require: every movie referenced by StarsIn to exist • Write: PTitle,Year(StarsIn) ÍPTitle,Year(Movies) M.P. Johnson, DBMS, Stern/NYU, Sp2004

  9. Expressing FDs • Relation: • Employees(name,ssn,address,gender, etc.) • Has FD: ssnaddress • What does the FD mean? • No matter how we choose two rows, if they agree on ssn, then they agree on address • So, strategy: choose pairs all possible ways; • Select pairs that agree on ssn but not address; • Check how many we get • First, rename one copy to E1 and one to E2 • rE1(Employees), rE2(Employees) • Then: sE1.ssn=E2.ssn AND E1.address != E2.address(E1 x E2) = Æ M.P. Johnson, DBMS, Stern/NYU, Sp2004

  10. Expressing domain constraints • Constraint on legal values for attributes • Employees(name,ssn,address,gender etc.) • Gender should be M/F • Select bad ones and check count • sgender!=‘F’ AND gender!=‘M’(Employees) = Æ M.P. Johnson, DBMS, Stern/NYU, Sp2004

  11. Expressing other constraints • Relations: • MovieExecs(name, address, ssn, netWorth) • Studios(name, address, presSsn) • Constraint: Studio presidents must be worth at leat $10,000,000 • First, theta-join presSsn to ssn, then select ones w/ < $10M, then check count: • snetWorth<10000000(Studio ⋈presSsn=ssn MEs) = Æ • Or: Select MEs w/ >= $10M, then check that they contain all studio presidents: • PpresSsn(Studios) Í Pssn(snetWorth<10000000(MEs)) M.P. Johnson, DBMS, Stern/NYU, Sp2004

  12. Recap: You are here • First part of course is done: conceptual foundations • You now know: • E/R Model • Relational Model • Relational Algebra • You now know how to: • Capture part of world as an E/R model • Convert E/R models to relational models • Convert relational models to good (normal) forms • Express queries in relational algebra • Next: • Create, update, query SQL tables • Write SQL/DB-connected applications M.P. Johnson, DBMS, Stern/NYU, Sp2004

  13. Next topic: SQL (6.1) • Standard language for querying and manipulating data Structured Query Language • Many standards: ANSI SQL, SQL92/SQL2, SQL3/SQL99 • Vendors support various subsets/extensions • We’ll do SQL99/Oracle • Basic form (many more bells and whistles in addition): SELECT attributes FROM relations (possibly multiple, joined) WHERE conditions (selections) M.P. Johnson, DBMS, Stern/NYU, Sp2004

  14. Data Types in SQL • Characters: • CHAR(20) -- fixed length • VARCHAR(40) -- variable length • Numbers: • BIGINT, INT, SMALLINT, TINYINT • REAL, FLOAT -- differ in precision • MONEY • Times and dates: • DATE • DATETIME -- SQL Server M.P. Johnson, DBMS, Stern/NYU, Sp2004

  15. Table name “Tables” Attribute names Product Tuples or rows M.P. Johnson, DBMS, Stern/NYU, Sp2004

  16. Simple SQL Query Product SELECT *FROM ProductWHERE category=‘Gadgets’ “selection” M.P. Johnson, DBMS, Stern/NYU, Sp2004

  17. Simple SQL Query Product SELECT PName, Price, ManufacturerFROM ProductWHERE Price > 100 “selection” and “projection” M.P. Johnson, DBMS, Stern/NYU, Sp2004

  18. A Notation for SQL Queries Input Schema Product(PName, Price, Category, Manfacturer) SELECT Name, Price, ManufacturerFROM ProductWHERE Price > 100 Answer(PName, Price, Manfacturer) Output Schema M.P. Johnson, DBMS, Stern/NYU, Sp2004

  19. R.A.  SQL • R.A. Projection P SQL SELECT • R.A. Selection s  SQL WHERE • R.A. Join  SQL FROM • Comma-separated list… • What goes in the WHERE clause: • x = y, x < y, x <= y, etc. • For number, they have the usual meanings • For CHAR and VARCHAR: lexicographic ordering • Expected conversion between CHAR and VARCHAR • For dates and times, what you expect M.P. Johnson, DBMS, Stern/NYU, Sp2004

  20. R.A.  SQL • Movies(Title,Year,Length,inColor,Studio,Prdcr#) • Q: How long was Star Wars (1977), in R.A.? • Q: In SQL? • Q: Which Fox movies are are at least 100 minutes long, in R.A.? • Q: In SQL? M.P. Johnson, DBMS, Stern/NYU, Sp2004

  21. R.A.  SQL • Reps(ssn, name, etc.) • Clients(ssn, name, rssn) • Q: Who are George’s clients, in R.A.? • Second answer from last time: • PClients.name(sReps.name=“George” and Reps.ssn=rssn(Reps x Clients)) • In SQL? M.P. Johnson, DBMS, Stern/NYU, Sp2004

  22. The LIKE operator • s LIKE p: pattern matching on strings • p may contain two special symbols: • _ = any single character • % = zero or more chars • Product(Name, Price, Category, Manufacturer) • Find all products whose name contains ‘gizmo’: SELECT *FROM ProductsWHERE PName LIKE ‘%gizmo%’ M.P. Johnson, DBMS, Stern/NYU, Sp2004

  23. The LIKE operator • Q: What it want to search for values containing a ‘%’? • PName LIKE ‘%%%’ won’t work • Instead, must use escape chars • In C/C++/J, prepend ‘\’ • In SQL, prepend an arbitrary escape char: • PName LIKE ‘x%%x%’ ESCAPE ‘x’ M.P. Johnson, DBMS, Stern/NYU, Sp2004

  24. Eliminating Duplicates SELECT category FROM Product Compare to: SELECTDISTINCT category FROM Product M.P. Johnson, DBMS, Stern/NYU, Sp2004

  25. Ordering the Results • Ordering is ascending, unless you specify the DESC keyword per attribute. • Ordering is ascending, unless you specify the DESC keyword per attribute. SELECT pname, price, manufacturer FROM Product WHERE category=‘gizmo’ AND price > 50 ORDER BY price, pname SELECT pname, price, manufacturer FROM Product WHERE category=‘gizmo’ AND price > 50 ORDER BY price DESC, pname ASC M.P. Johnson, DBMS, Stern/NYU, Sp2004

  26. Ordering the Results SELECT Category FROM Product ORDER BY PName ? M.P. Johnson, DBMS, Stern/NYU, Sp2004

  27. Ordering the Results SELECT DISTINCT category FROM Product ORDER BY category Compare to: ? SELECT DISTINCT category FROM Product ORDER BY PName M.P. Johnson, DBMS, Stern/NYU, Sp2004

  28. Joins in SQL (6.2) • Connect two or more tables: Product Company What is the connection between them? M.P. Johnson, DBMS, Stern/NYU, Sp2004

  29. Joinbetween Productand Company Joins in SQL Product (pname, price, category, manufacturer) Company (cname, stockPrice, country) Find all products under $200 manufactured in Japan;return their names and prices. SELECT PName, PriceFROM Product, CompanyWHEREManufacturer=CName AND Country=‘Japan’ AND Price <= 200 M.P. Johnson, DBMS, Stern/NYU, Sp2004

  30. Joins in SQL Product Company SELECT PName, PriceFROM Product, CompanyWHEREManufacturer=CName AND Country=‘Japan’ AND Price <= 200 M.P. Johnson, DBMS, Stern/NYU, Sp2004

  31. Joins in SQL Product (pname, price, category, manufacturer) Company (cname, stockPrice, country) Find all countries that manufacture some product in the ‘Gadgets’ category. SELECT CountryFROM Product, CompanyWHEREManufacturer=CName AND Category=‘Gadgets’ M.P. Johnson, DBMS, Stern/NYU, Sp2004

  32. Joins in SQL Product Company SELECT CountryFROM Product, CompanyWHEREManufacturer=CName AND Category=‘Gadgets’ What is the problem? What’s thesolution? M.P. Johnson, DBMS, Stern/NYU, Sp2004

  33. Joins Product (pname, price, category, manufacturer) Purchase (buyer, seller, store, product) Person(name, phone, city) Find names of Seattleites who bought Gadgets, and the names of the stores they bought such product from. SELECT DISTINCT name, storeFROM Person, Purchase, ProductWHEREpersname=buyer AND product = pname AND city=‘Seattle’ AND category=‘Gadgets’ M.P. Johnson, DBMS, Stern/NYU, Sp2004

  34. Disambiguating Attributes • Sometimes two relations have the same attr:Person(pname, address, worksfor)Company(cname, address) Whichaddress ? SELECT DISTINCT pname, addressFROM Person, CompanyWHEREworksfor = cname SELECT DISTINCT Person.pname, Company.addressFROM Person, CompanyWHEREPerson.worksfor = Company.cname M.P. Johnson, DBMS, Stern/NYU, Sp2004

  35. Tuple Variables Product (pname, price, category, manufacturer) Purchase (buyer, seller, store, product) Person(persname, phoneNumber, city) Find all stores that sold at least one product that the store‘BestBuy’ also sold: SELECT DISTINCTx.store FROM Purchase AS x, Purchase AS y WHERE x.product = y.product AND y.store = ‘BestBuy’ Answer (store) M.P. Johnson, DBMS, Stern/NYU, Sp2004

  36. Tuple Variables • Tuple variables introduced automatically: • Product ( name, price, category, manufacturer) • Becomes: • Doesn’t work when Product occurs more than once • In that case the user needs to define variables explicitly SELECTname FROMProduct WHEREprice > 100 SELECTProduct.name FROMProduct AS Product WHEREProduct.price > 100 M.P. Johnson, DBMS, Stern/NYU, Sp2004

  37. SQL Query Semantics SELECTa1, a2, …, ak FROMR1 AS x1, R2 AS x2, …, Rn AS xn WHEREConditions 1. Nested loops: Answer = {} for x1 in R1 do for x2 in R2 do ….. for xn in Rn do if Conditions then Answer = Answer  {(a1,…,ak)} return Answer M.P. Johnson, DBMS, Stern/NYU, Sp2004

  38. SQL Query Semantics SELECTa1, a2, …, ak FROMR1 AS x1, R2 AS x2, …, Rn AS xn WHEREConditions 2. Parallel assignment Doesn’t impose any order! Answer = {} for all assignments x1 in R1, …, xn in Rn do if Conditions then Answer = Answer  {(a1,…,ak)} return Answer M.P. Johnson, DBMS, Stern/NYU, Sp2004

  39. First Unintuitive SQLism • SELECT R.A FROM R, S, T WHERE R.A=S.A OR R.A=T.A • Looking for R  (S  T) • But what happens if T is empty? • See transcript of this in Oracle on sales M.P. Johnson, DBMS, Stern/NYU, Sp2004

More Related