1 / 67

CS411 Database Systems

shelley
Download Presentation

CS411 Database Systems

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


    1. CS411 Database Systems Kazuhiro Minami

    2. Given a relation constructed from an ER diagram, what is its key? 2

    3. Given a relation constructed from an ER diagram, what is its key? 3

    4. Given a relation constructed from an ER diagram, what is its key? 4

    5. 5 The textbook has more rules on finding keys: Many-one, one-many, one-one relationships Multi-way relationships Weak entity sets (but there is no substitute for common sense)

    6. Reminder: redundancy causes trouble

    7. Non-solution: multiple values in one field

    8. Your common sense will tell you how to fix this schema

    9. What if you dont have common sense? There is a theory to tell you what to do!

    10. Functional Dependencies

    11. Functional dependencies generalize the idea of a key

    12. EmpID ? Name, Phone, Office Office ? Phone Phone ? Office Name ? EmpID isnt likely to hold in all instances of this schema, though it holds in this instance More generally, an instance can tell you many FDs that dont hold, but not all those that do.

    13. Use your common sense to find the FDs in the world around you

    14. We can define keys in terms of FDs Key of a relation R is a set of attributes that functionally determines all attributes of R none of its proper subsets have this property. Superkey = set of attributes that contains a key.

    15. Reasoning with FDs 1) Closure of a set of FDs 2) Closure of a set of attributes Slide numberSlide number

    16. The closure S+ of a set S of FDs is the set of all FDs logically implied by S. R = {A, B, C, G, H, I} S = {A ? B, A ? C, CG ? H, CG ? I, B ? H} Does A ? H hold? You can prove whether it does!

    17. Compute the closure S+ of S using Armstrongs Axioms 1. Reflexivity A1 ... An ? every subset of A1 ... An 2. Augmentation If A1 ... An ? B1 ... Bm, then A1 ... An C1 ... Ck ? B1 ... Bm C1 ... Ck 3. Transitivity If A1 ... An ? B1 ... Bm and B1 ... Bm ? C1 ... Ck, then A1 ... An ? C1 ... Ck

    18. How to compute S+ using Armstrong's Axioms S+ = S; loop { For each f in S, apply the reflexivity and augmentation rules and add the new FDs to S+. For each pair of FDs in S, apply the transitivity rule and add the new FDs to S+ } until S+ does not change any more.

    19. You can infer additional rules from Armstrongs Axioms Union If X ? Y and X ? Z, then X ? YZ (X, Y, Z are sets of attributes) Decomposition X ? YZ, then X ? Y and X ? Z Pseudo-transitivity X ? Y and YZ ? U, then XZ ? U - How to infer: e.g., the union rule: 1. X? Y; XX ? XY or X ? XY --- augmentation 2. X? Z; XY ? YZ augmentation 3. X ? XY ? YZ transitivity -- e.g., Decomposition 1. YZ ? Y, YZ ? Z, reflexivity 2. X ? YZ ? Y, X ? YZ ? Z transitivity - How to infer: e.g., the union rule: 1. X? Y; XX ? XY or X ? XY --- augmentation 2. X? Z; XY ? YZ augmentation 3. X ? XY ? YZ transitivity -- e.g., Decomposition 1. YZ ? Y, YZ ? Z, reflexivity 2. X ? YZ ? Y, X ? YZ ? Z transitivity

    20. The closure of a set of attributes contains everything they functionally determine

    21. It is easy to compute the closure of a set of attributes

    23. What is the attribute closure good for? Test if X is a superkey compute X+, and check if X+ contains all attrs of R Check if X ? Y holds by checking if Y is contained in X+ Another (not so clever) way to compute closure S+ of FDs for each subset of attributes X in relation R, compute X+ with respect to S for each subset of attributes Y in X+, output the FD X ? Y

    24. Reminder: intended goals of schema refinement Minimize redundancy Avoid information loss Easy to check dependencies Ensure good query performance

    25. Normal Forms First Normal Form = all attributes are atomic Second Normal Form (2NF) = obsolete Boyce Codd Normal Form (BCNF) Third Normal Form (3NF) Fourth Normal Form (4NF) Others...

    26. Boyce-Codd Normal Form

    28. What if we are in a situation where Phone Number ? SSN?

    29. What about that alternative schema we recommended earlier---are they in BCNF?

    30. What about that alternative schema we recommended earlier---are they in BCNF?

    31. What about that alternative schema we recommended earlier---are they in BCNF?

    32. Announcements Course project Stage 1 is due today Stage 2 (E/R modeling) is due on Sep. 16 Stage 3 (Relational schema design & implementation plan) is due on Oct. 2 Homework 1 will be posted today Due is at 3:15pm on Sep. 23

    33. Tips for Working Successfully in a Group (by Randy Pausch) Meet people properly Find things in common Try for optimal meeting conditions Let everyone talk Check egos at the door Praise each other Phrase alternatives as questions

    34. Boyce-Codd Normal Form

    36. What about that alternative schema we recommended earlier---are they in BCNF?

    37. Name ? Price, Category What are the keys for this one? Is it in BCNF?

    38. Name ? Price, Category What are the keys for this one? Is it in BCNF?

    39. If relation R is not in BCNF, you can pull out the violating part(s) until it is.

    40. 2. Break R into R1 and R2 as follows.

    41. 3. Repeat until all relations are in BCNF.

    42. Can you turn this one into BCNF?

    43. One more split needed to reach BCNF

    44. An Official BCNF Decomposition Algorithm Input: relation R, set S of FDs over R. Output: a set of relations in BCNF. 1. Compute S+. 2. Compute keys for R (from ER or from S+). 3. Use S+ and keys to check if R is in BCNF. If not: a. Pick a violation FD A ? B. b. Expand B as much as possible, by computing A+. c. Create R1 = A+, and R2 = A ? (R ? A+). d. Find the FDs over R1, using S+. Repeat for R2. e. Recurse on R1 & its set of FDs. Repeat for R2. 4. Else R is already in BCNF; add R to the output.

    45. Any good schema decomposition should be lossless. 45

    46. Natural Join R= S= R S =

    47. A lossy decomposition gives you too many tuples! 47

    48. BCNF decompositions are always lossless.

    49. Why dont we get garbage? 49

    50. Why dont we get garbage?

    51. BCNF doesnt always have a dependency-preserving decomposition. Third normal form may be preferable to having to take a join to check dependencies after an update.

    52. Normal Forms First Normal Form = all attributes are atomic Second Normal Form (2NF) = old and obsolete Boyce Codd Normal Form (BCNF) Third Normal Form (3NF) Fourth Normal Form (4NF) Others...

    53. A schema doesnt preserve dependencies if you have to do a join to check an FD

    54. A schema doesnt preserve dependencies if you have to do a join to check an FD

    55. Synthesis Algorithm for 3NF Schemas 1. Find a minimal basis G of the set of FDs for relation R 2. For each FD X?A in G, add a relation with attributes XA 3. If none of the relation schemas from Step 2 is a superkey for R, add a relation whose schema is a key for R

    56. If a BCNF decomposition doesnt preserve dependencies, use 3rd Normal Form instead.

    57. Youll rarely need to use 3NF. Normally, youll be able to get your schema into BCNF while preserving dependencies. If you cant, then use 3NF unless you have a good reason not to.

    58. Minimal Basis A set of FDs F is a minimal basis of a set of dependencies E if E = F+ Every dependency in F has a single attribute for its right-hand side Cannot remove any dependency from F or remove attributes from the left side of any FD in F (minimality) Example: E = {A?B, A?C, B?A, B?C, C?A, C?B} F = {A?B, B?C, C?A}

    59. Normal Forms First Normal Form = all attributes are atomic Second Normal Form (2NF) = old and obsolete Boyce Codd Normal Form (BCNF) Third Normal Form (3NF) Fourth Normal Form (4NF) Others...

    60. BCNF doesnt catch every kind of redundancy (much less every bad schema)

    61. Definition of Multi-valued Dependency 61

    62. You can tear apart a relation R with an MVD. If A1 An ? B1 Bm holds in R, then the decomposition R1(A1, , An, B1,, Bm) R2(A1, , An, C1 ,, Ck) is lossless. Note: an MVD A1 An ? B1 Bm implicitly talks about the other attributes C1, , Ck.

    63. The inference rules for MVDs are not the same as the ones for FDs. The most basic one: If A1 An ? B1 Bm, then A1 An ? B1 Bm. Other rules in the book.

    64. 4th Normal Form (4NF) R is in 4NF if for every nontrivial MVD A1,,An ? B1,, Bm, {A1,,An} is a superkey.

    65. MVD Summary: Parent ? Child X ? Y means that given X, there is a unique set of possible Y values (which do not depend on other attributes of the relation) MVD problems arise if there are two independent 1:N relationships in a relation. An FD is also a MVD. Theres lots more MVD theory, but we wont go there.

    66. Confused by Normal Forms ?

    67. What people actually do Object DBs and object-relational DBs Generally permit non-atomic attributes 1st normal form not required Data warehouses Huge append-only historical databases Joins expensive or impractical Argues against normalization Everyday relational DBs Aim for BCNF Use your understanding of the application to identify other problematic redundancies

More Related