1 / 33

C20.0046: Database Management Systems Lecture #3

C20.0046: Database Management Systems Lecture #3. Matthew P. Johnson Stern School of Business, NYU Spring, 2005. Admin. Textbooks? This afternoon. Agenda. Last time: E/R models, some design issues This time: More design “carving at the joints” Redundancy

shen
Download Presentation

C20.0046: Database Management Systems Lecture #3

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. C20.0046: Database Management SystemsLecture #3 Matthew P. Johnson Stern School of Business, NYU Spring, 2005 M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  2. Admin • Textbooks? • This afternoon M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  3. Agenda • Last time: E/R models, some design issues • This time: More design “carving at the joints” • Redundancy • Whether an element should be an attribute or entity set • Replacing a relationships with entity sets • Constraints • Identifying & specifying key attributes to an entity set • Recognizing other types of single-valued constraints • Representing referential integrity constraints • Identifying & representing general constraints • Weak entity sets M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  4. Review • Multiplicity review: • Square-of? (e.g., (3,9)) • Cube-of? (e.g., (-3,-27)) • Wife-of? • Wife-of-in-certain-other-cultures? M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  5. Design Principles • Faithfulness • Simplicity • Avoiding redundancy • Choice of relationships • Picking elements M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  6. Owned-by Owns Movies Ownings Studios Simplicity • Einstein: Theories should be as simple as possible, but not simpler. • Use as few elements as possible • Minimum required relations • No unnecessary attributes (will you be using this attribute?) • Eliminate “spinning wheels” • Example: how can we simplify this? M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  7. Name Own Studios Movies Length Address StudioName Name Avoiding redundancy • Say everything exactly once • Minimize database storage requirements • More important: prevent possible update errors • simplest but not only e.g.: modify data one place but not the other – more later • Example: Spot the redundancy M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  8. Avoiding redundancy • Say everything exactly once • Minimize database storage requirements • More important: prevent possible update errors • simplest but not only e.g.: modify data one place but not the other – more later • Example: Spot the redundancy Name Own Studios Movies Length Address Phone StudioName Name Redundancy: Movies “knows” the studio two ways M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  9. Name Length Studio SAddress SPhone Pulp Fiction … Miramax NYC 212-… Sylvia … Miramax NYC 212-… Jay & Sil. Bob … Miramax NYC 212-… … Spot more redundancy Length SPhone Name Movies SAddress StudioName Different redundancy: studio info listed for every movie! M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  10. Enrolls Students Courses Assist TA-of TAs Don’t add relships that are implied Suppose each course again has <=1 TA Q: Is the following good design? A: If TAs other than the course’s TA can help students, then yes; if not, then no: we can connect Students and TAs by going through Courses; redundant! M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  11. Correct E/R models may contain loops name category • Person plays multiple roles: • employee of company • buyer of product name price makes Company Product stockprice buys employs Person name ssn address M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  12. More design Q: What’s wrong with this design? • Repeating TA names & IDs – redundant • TA is not TAing any course now  lose TA’s data! • TA should get its own ES Course-ID CName Enrolls Students Courses TA-Email TA-Name TA-ID A: M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  13. Opposite problem: Entity or attribute? • Some E/Rs improved by removing entities • Can convert Entity E into attributes of F if • R:FE is many-one • one-one counts because special case • Attributes for E are independent of each other • knowing one att val doesn’t tell us another att val • Then • remove E • add all attributes of E to F M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  14. Course-ID CName Enrolls Students Courses Room TA-Name Entity  attribute Course-ID CName Enrolls Students Courses Room Assists TA TA-Name M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  15. CName CID Room TA-Name DBMS 46 123 Howard DBMS 46 123 Wesley … Convert TA entity again? Course-ID Enrolls • No! Multiple TAs allowed • Violates condition (1) • Redundant course data Students Courses CName Assists Room TA TA-Name M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  16. CName TA-Name TA-ID TA-Color DBMS Ralph 678 Green A.Soft. Ralph 678 Green … Convert TA entity again? Course-ID • No! TA has dependent fields • Violates condition (2) • How can it tell? • Redundant TA data Enrolls Students Courses CName Assists Room TA TA-Name TA-ID TA-Favorite-Color M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  17. Entity or attributes? • Should student address be an entity or an attribute? • If student may have multiple addresses, must be entity • campus address, permanent address • attributes cannot be set-valued • If we need to examine structure of address, must be entity • find all students from NYS but not NYC • If attribute, then it’s probably a simple string • no structure! • NB: this choice is a microcosm of entire miniworld • (much) power of a DB comes from the structure imposed on the data M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  18. Larger example DB design • Application: library database. Authors have written books about various subjects; different libraries in the system may carry these books. • Entities (with attributes in parentheses): • Authors (ssn, name, phone, birthdate) • Books (ISDN, title) • Subjects (sname, sid) • Libraries (lname) • Relationships [associating entities in square brackets]: • Wrote-on [Authors, Subjects] • Cover [Libraries, Subjects] • On [Books, Subjects] M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  19. E/R of DB design Name ssn phone birthdate Author ISBN wrote-on On Book SName Subject Title Carries LName Library M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  20. Poor initial design • First design is a poor model of this system • Some info not captured: • How many copies does a lib. have of a given book? • What edition of a book does the library have? • Design problems: • no direct relship associating authors and books • no direct relship associating libraries and books • Common queries complex and difficult/expensive • What libraries carry books by a given author? • What books has a given author written? • Who is the author of a given book? M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  21. Larger example DB design 2 • Application: library database as before • Entities (with attributes in parentheses): • Authors (ssn, name, phone, birthdate) • Books (ISDN, title) • Subjects (sname, sid) • Libraries (lname) • Relations [associating entities in square brackets] (attributes in parentheses): • Wrote [Authors, Books] • Carries [Libraries, Books] (quantity, edition) • On [Books , Subjects] M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  22. E/R of improved DB design Name ssn phone birthdate Author SName wrote On Subject ISBN Book • Rule of thumb: often queried together  make closely connected Edition Title Carries Quantity LName Library M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  23. Next topic: Constraints • Review: programmer-defined rules stating what should always be true about consistent databases • Restrictions on data: • Keys (e.g. SSNs uniquely identify people) • Single value constraints (e.g. everyone has 1 father) • Referential Integrity (e.g. person’s record refers to father  father must exist) • Domain constraints (e.g. gender in M/F, age in 0..150) • General constraints (e.g. no more than 10 customers per sales rep) • Can’t infer constraints from data • may hold “accidentally” • they are a part of the schema M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  24. E/R keys • Uniquely identifies entity in ES • Attribute or set of attributes • Two entities cannot agree on all key attributes • These attributes determine all others • Every ES should have a key • possibly including all attributes • Primary key attributes underlined • More than one possible key: • Candidate keys, primary key • Practical tip: create intentional key attribute • E.g. SSN, course-id, employee-id, etc. • SSN likely shorter than (name,address) • Prevents quasi-redundancy Person name ssn address M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  25. Single-valued constraints • “at most one” value • sharp arrows • E.g. attributes: could be null or one • Many-one relationships: the “one” part is single-valued. • Can think of key atts as (non-null) single-valued Assists Course TA M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  26. Referential integrity • “Exactly one value” • NOT NULL attributes • Relationships • Non-null value refers to entity that exists • Refer to entity with foreign key • HTML analogy: no broken links • Programming analogy: no dangling pointers • Ways of handling deletion: • Prevent deletion as long as referrer exist • Enforce deletion of all referrers Taught Course Instructor M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  27. Referential integrity – E/R e.g. Enrolls • Insertion – must refer to existing entity • Suppose need to add • course: “Oracle” • instructor: MPJ • Q: Which order? • Q: What if relship were exactly-exactly, say, M(Hs,Ws)? • i.e., referential integrity in both directions? • A: Put both inserts in one xact – later Students Courses Taught Instructor M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  28. Other kinds of constraints • Domain constraints • E.g. date: must be after 1980 • Enumerated type: grades A through F, no E • No specific E/R notation: mention with attribute or relationship • General constraints: • A class may have no more than 100 students; a student may not have more than 6 courses: Enroll Students <=100 <=6 Courses M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  29. Next topic: Weak entity sets • Definition: • Some or all key attributes belong to another ES • Why: • An entity set is part of a hierarchy (not ISA) • Connecting entity sets • The key consists of • 0, 1 or more of its own attributes • Key attributes of entity sets from supporting relationships M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  30. Conditions of Supporting relationships • Supporting relationship R:EF • R is many-one (E-F) (or one-one) • R is binary • Referential integrity from E to F • a rounded arrow • Those atts supplied to E are thekey attributes of F • F itself may be weak • Another entity set G, and so on recursively R  A1 F E A2 M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  31. Requirements for weak entity sets • For several supporting relships from E to F • Keys of each F role appear as foreign key of E • Other many-one relationships • Not necessarily supporting From  Purchases A1 People  A2 By At-store Stores A3 M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  32. Weak entity sets • Example: Hierarchy – species & genus • Idea: species name unique per genus only  Species Belongs-to Genus name name M.P. Johnson, DBMS, Stern/NYU, Spring 2005

  33. Next time • We’ll finish E/R models and begin the relational model • Read chapter 3 through section 3.4 • Info on project, hw likely posted soon M.P. Johnson, DBMS, Stern/NYU, Spring 2005

More Related