1 / 36

Schedule

Schedule. Today Oct. 9 (T) Multivalued Dependencies, Relational Algebra Read Sections 3.7, 5.1-5.2. Assignment 2 due. Oct. 11 (TH) SQL Queries. Read Sections 6.1-6.3. Project Part 2 due. Oct. 16 (T) Duplicates, Aggregation, Modifications. Read Sections 5.3-5.4, 6.4-6.5. Assignment 3 due.

xia
Download Presentation

Schedule

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Schedule • Today Oct. 9 (T) Multivalued Dependencies, Relational Algebra • Read Sections 3.7, 5.1-5.2. Assignment 2 due. • Oct. 11 (TH) SQL Queries. • Read Sections 6.1-6.3. Project Part 2 due. • Oct. 16 (T) Duplicates, Aggregation, Modifications. • Read Sections 5.3-5.4, 6.4-6.5. Assignment 3 due. Arthur Keller – CS 180

  2. Multivalued Dependencies The multivalued dependencyXY holds in a relation R if whenever we have two tuples of R that agree in all the attributes of X, then we can swap their Y components and get two new tuples that are also in R. Arthur Keller – CS 180

  3. Example Drinkers(name,addr,phones,beersLiked) with MVD Namephones. If Drinkers has the two tuples: it must also have the same tuples with phones components swapped: Note: we must check this condition for all pairs of tuples that agree on name, not just one pair. Arthur Keller – CS 180

  4. MVD Rules 1. Every FD is an MVD. • Because if X Y, then swapping Y’s between tuples that agree on X doesn’t create new tuples. • Example, in Drinkers: nameaddr. 2. Complementation: if XY, then XZ, where Z is all attributes not in X or Y. • Example: since namephonesholds in Drinkers, so doesNameaddr beersLiked. Arthur Keller – CS 180

  5. Splitting Doesn’t Hold Sometimes you need to have several attributes on the right of an MVD. For example: Drinkers(name, areaCode, phones, beersLiked, beerManf) • nameareaCode phones holds, but neithernameareaCode nor namephones do. Arthur Keller – CS 180

  6. 4NF Eliminate redundancy due to multiplicative effect of MVD's. • Roughly: treat MVD's as FD's for decomposition, but not for finding keys. • Formally: R is in Fourth Normal Form if whenever MVD XY is nontrivial (Y is not a subset of X, and X Y is not all attributes), then X is a superkey. • Remember, X Y implies XY, so 4NF is more stringent than BCNF. • Decompose R, using4NF violation XY,into XY and X  (R—Y). Arthur Keller – CS 180

  7. Example Drinkers(name,addr,phones,beersLiked) • FD: nameaddr • Nontrivial MVD’s: namephones andnamebeersLiked. • Only key: {name, phones, beersLiked} • All three dependencies above violate 4NF. • Successive decomposition yields 4NF relations: D1(name, addr) D2(name, phones) D3(name, beersLiked) Arthur Keller – CS 180

  8. “Core” Relational Algebra A small set of operators that allow us to manipulate relations in limited but useful ways. The operators are: 1. Union, intersection, and difference: the usual set operators. • But the relation schemas must be the same. 2. Selection: Picking certain rows from a relation. 3. Projection: Picking certain columns. 4. Products and joins: Composing relations in useful ways. 5. Renaming of relations and their attributes. Arthur Keller – CS 180

  9. Relational Algebra • limited expressive power (subset of possible queries) • good optimizer possible • rich enough language to express enough useful things Finiteness  SELECT π PROJECT X CARTESIAN PRODUCT FUNDAMENTAL U UNION BINARY – SET-DIFFERENCE  SET-INTERSECTION THETA-JOIN CAN BE DEFINED NATURAL JOIN IN TERMS OF ÷ DIVISION or QUOTIENT FUNDAMENTAL OPS UNARY Arthur Keller – CS 180

  10. Extra Example Relations DEPOSIT(branch-name, acct-no,cust-name,balance) CUSTOMER(cust-name,street,cust-city) BORROW(branch-name,loan-no,cust-name,amount) BRANCH(branch-name,assets, branch-city) CLIENT(cust-name,empl-name) Borrow B-N L-# C-N AMT T1 Midtown 123 Fred 600 T2 Midtown 234 Sally 1200 T3 Midtown 235 Sally 1500 T4 Downtown 612 Tom 2000 Arthur Keller – CS 180

  11. Selection R1 = C(R2) where C is a condition involving the attributes of relation R2. Example Relation Sells: JoeMenu = bar=Joe's(Sells) Arthur Keller – CS 180

  12. SELECT () arity((R)) = arity(R) 0  card((R))  card(R) c (R) c (R) (R) c is selection condition: terms of form: attr op value attr op attr op is one of < = >  ≠ ≥ example of term: branch-name = "Midtown" terms are connected by  branch-name = "Midtown"  amount > 1000 (Borrow) cust-name = emp-name (client) Arthur Keller – CS 180

  13. Projection R1 = L(R2) where L is a list of attributes from the schema of R2. Example beer,price(Sells) • Notice elimination of duplicate tuples. Arthur Keller – CS 180

  14. Projection (π) 0  card (π A (R))  card (R) arity (π A (R)) = m  arity(R) = k π i1,...,im (R) 1  ij k distinct produces set of m-tuples a 1 ,...,am such that k-tuple b1,...,bk in R where aj = bij for j = 1,...,m π branch-name, cust-name(Borrow) Midtown Fred Midtown Sally Downtown Tom Arthur Keller – CS 180

  15. Product R = R1R2 pairs each tuple t1 of R1 with each tuple t2 of R2 and puts in R a tuple t1t2. Arthur Keller – CS 180

  16. Cartesian Product () arity(R) = k1 arity(R  S) = k1 + k2 arity(S) = k2 card(R  S) = card(R)  card(S) R  S is the set all possible (k1 + k2)-tuples whose first k1 attributes are a tuple in R last k2 attributes are a tuple in S R S R  S A B C D D E F A B C D D' E F Arthur Keller – CS 180

  17. Theta-Join R = R1CR2is equivalent to R = C(R1R2). Arthur Keller – CS 180

  18. Example Sells = Bars = BarInfo = Sells Sells.Bar=Bars.Name Bars Arthur Keller – CS 180

  19. Theta-Join R S arity(R) = r arity(S) = s arity (R S) = r + s 0  card(R S) card(R)  card(S) i j R(A B C) S(C D E) result has schema T(A B C C' D E)  $i $rj)R  S)  R S 1 . . . r 1 . . . s i j  can be < > = ≠  If equal (=), then it is anEQUIJOIN  (R  S) R S = c R(ABC) S(CDE) T(ABCC’DE) 1 3 5 2 1 1 1 3 5 1 2 2 2 4 6 1 2 2 1 3 5 3 3 4 3 5 7 3 3 4 1 3 5 4 4 3 4 6 8 4 4 3 2 4 6 3 3 4 2 4 6 4 4 3 3 5 7 4 4 3 c R.A<S.D Arthur Keller – CS 180

  20. Natural Join R = R1R2 calls for the theta-join of R1 and R2 with the condition that all attributes of the same name be equated. Then, one column for each pair of equated attributes is projected out. Example Suppose the attribute name in relation Bars was changed to bar, to match the bar name in Sells. BarInfo = Sells Bars Arthur Keller – CS 180

  21. Renaming S(A1,…,An)(R) produces a relation identical to R but named S and with attributes, in order, named A1,…,An. Example Bars = R(bar,addr) (Bars) = • The name of the second relation is R. Arthur Keller – CS 180

  22. Union (R  S) arity(R) = arity(S) = arity(R  S) max(card(R),card(S)) card(R  S)card(R) + card(S) set of tuples in R or S or both R R  S S R  S Find customers of Perryridge Branch πCust-Name (Branch-Name = "Perryridge" (BORROW  DEPOSIT) ) Arthur Keller – CS 180

  23. Difference(R S) arity(R) = arity(S) = arity(R–S) 0 card(R –S)card(R) R – S R is the tuples in R not in S Depositors of Perryridge who aren't borrowers of Perryridge πCust-Name (Branch-Name = "Perryridge" (DEPOSIT – BORROW) ) Deposit < Perryridge, 36, Pat, 500 > Borrow < Perryridge, 72, Pat, 10000 > πCust-Name (Branch-Name = "Perryridge" (DEPOSIT) ) —πCust-Name (Branch-Name = "Perryridge" (BORROW) ) Does (π (D)  π (B) ) work? Arthur Keller – CS 180

  24. Combining Operations Algebra = 1. Basis arguments + 2. Ways of constructing expressions. For relational algebra: 1. Arguments = variables standing for relations + finite, constant relations. 2. Expressions constructed by applying one of the operators + parentheses. • Query = expression of relational algebra. Arthur Keller – CS 180

  25. πCust-Name,Cust-City (CLIENT.Banker-Name = "Johnson" (CLIENT  CUSTOMER) ) = π Cust-Name,Cust-City (CUSTOMER) • Is this always true? πCLIENT.Cust-Name, CUSTOMER.Cust-City (CLIENT.Banker-Name = "Johnson"  CLIENT.Cust-Name = CUSTOMER.Cust-Name (CLIENT  CUSTOMER) ) πCLIENT.Cust-Name, CUSTOMERCust-City (CLIENT.Cust-Name =CUSTOMER.Cust-Name (CUSTOMER  πCust-Name CLIENT.Banker-Name="Johnson" (CLIENT) ) ) ) Arthur Keller – CS 180

  26. SET INTERSECTION arity(R) = arity(S) = arity (R S) (R S) 0 card (R S) min (card(R), card(S)) tuples both in R and in S R  (R S) = R S R  S R R  S S R S Arthur Keller – CS 180

  27. Operator Precedence The normal way to group operators is: 1. Unary operators , , and  have highest precedence. 2. Next highest are the “multiplicative” operators, , C , and . 3. Lowest are the “additive” operators, , , and —. • But there is no universal agreement, so we always put parentheses around the argument of a unary operator, and it is a good idea to group all binary operators with parentheses enclosing their arguments. Example Group R ST as R  ((S ) T ). Arthur Keller – CS 180

  28. Each Expression Needs a Schema • If , , — applied, schemas are the same, so use this schema. • Projection: use the attributes listed in the projection. • Selection: no change in schema. • Product RS: use attributes of R and S. • But if they share an attribute A, prefix it with the relation name, as R.A, S.A. • Theta-join: same as product. • Natural join: use attributes from each relation; common attributes are merged anyway. • Renaming: whatever it says. Arthur Keller – CS 180

  29. Example • Find the bars that are either on Maple Street or sell Bud for less than $3. Sells(bar, beer, price) Bars(name, addr) Arthur Keller – CS 180

  30. Example Find the bars that sell two different beers at the same price. Sells(bar, beer, price) Arthur Keller – CS 180

  31. Linear Notation for Expressions • Invent new names for intermediate relations, and assign them values that are algebraic expressions. • Renaming of attributes implicit in schema of new relation. Example Find the bars that are either on Maple Street or sell Bud for less than $3. Sells(bar, beer, price) Bars(name, addr) R1(name) := name(addr = Maple St.(Bars)) R2(name) := bar(beer=Bud AND price<$3(Sells)) R3(name) := R1  R2 Arthur Keller – CS 180

  32. Why Decomposition “Works”? What does it mean to “work”? Why can’t we just tear sets of attributes apart as we like? • Answer: the decomposed relations need to represent the same information as the original. • We must be able to reconstruct the original from the decomposed relations. Projection and Join Connect the Original and Decomposed Relations • Suppose R is decomposed into S and T. We project R onto S and onto T. Arthur Keller – CS 180

  33. Example R = • Recall we decomposed this relation as: Arthur Keller – CS 180

  34. Project onto Drinkers1(name,addr,favoriteBeer): Project onto Drinkers3(beersLiked, manf): Project onto Drinkers4(name, beersLiked): Arthur Keller – CS 180

  35. Reconstruction of Original Can we figure out the original relation from the decomposed relations? • Sometimes, if we natural join the relations. Example Drinkers3 Drinkers4 = • Join of above with Drinkers1 = original R. Arthur Keller – CS 180

  36. Theorem Suppose we decompose a relation with schema XYZ into XY and XZ and project the relation for XYZ onto XY and XZ. Then XYXZ is guaranteed to reconstruct XYZ if and only if XY (or equivalently, XZ). • Usually, the MVD is really a FD, XY or XZ. • BCNF: When we decompose XYZ into XY and XZ, it is because there is a FD X Y or X Z that violates BCNF. • Thus, we can always reconstruct XYZ from its projections onto XY and XZ. • 4NF: when we decompose XYZ into XY and XZ, it is because there is an MVD XY or XZ that violates 4NF. • Again, we can reconstruct XYZ from its projections onto XY and XZ. Arthur Keller – CS 180

More Related