C20 0046 database management systems lecture 3
Download
1 / 33

C20.0046: Database Management Systems Lecture #3 - PowerPoint PPT Presentation


  • 113 Views
  • Uploaded on

C20.0046: Database Management Systems Lecture #3. Matthew P. Johnson Stern School of Business, NYU Spring, 2005. Admin. Textbooks? This afternoon. Agenda. Last time: E/R models, some design issues This time: More design “carving at the joints” Redundancy

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about ' C20.0046: Database Management Systems Lecture #3' - shen


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
C20 0046 database management systems lecture 3

C20.0046: Database Management SystemsLecture #3

Matthew P. Johnson

Stern School of Business, NYU

Spring, 2005

M.P. Johnson, DBMS, Stern/NYU, Spring 2005


Admin
Admin

  • Textbooks?

    • This afternoon

M.P. Johnson, DBMS, Stern/NYU, Spring 2005


Agenda
Agenda

  • Last time: E/R models, some design issues

  • This time: More design “carving at the joints”

    • Redundancy

    • Whether an element should be an attribute or entity set

    • Replacing a relationships with entity sets

  • Constraints

    • Identifying & specifying key attributes to an entity set

    • Recognizing other types of single-valued constraints

    • Representing referential integrity constraints

    • Identifying & representing general constraints

  • Weak entity sets

M.P. Johnson, DBMS, Stern/NYU, Spring 2005


Review
Review

  • Multiplicity review:

    • Square-of? (e.g., (3,9))

    • Cube-of? (e.g., (-3,-27))

    • Wife-of?

    • Wife-of-in-certain-other-cultures?

M.P. Johnson, DBMS, Stern/NYU, Spring 2005


Design principles
Design Principles

  • Faithfulness

  • Simplicity

  • Avoiding redundancy

  • Choice of relationships

  • Picking elements

M.P. Johnson, DBMS, Stern/NYU, Spring 2005


Simplicity

Owned-by

Owns

Movies

Ownings

Studios

Simplicity

  • Einstein: Theories should be as simple as possible, but not simpler.

  • Use as few elements as possible

    • Minimum required relations

    • No unnecessary attributes (will you be using this attribute?)

    • Eliminate “spinning wheels”

  • Example: how can we simplify this?

M.P. Johnson, DBMS, Stern/NYU, Spring 2005


Avoiding redundancy

Name

Own

Studios

Movies

Length

Address

StudioName

Name

Avoiding redundancy

  • Say everything exactly once

    • Minimize database storage requirements

    • More important: prevent possible update errors

      • simplest but not only e.g.: modify data one place but not the other – more later

  • Example: Spot the redundancy

M.P. Johnson, DBMS, Stern/NYU, Spring 2005


Avoiding redundancy1
Avoiding redundancy

  • Say everything exactly once

    • Minimize database storage requirements

    • More important: prevent possible update errors

      • simplest but not only e.g.: modify data one place but not the other – more later

  • Example: Spot the redundancy

Name

Own

Studios

Movies

Length

Address

Phone

StudioName

Name

Redundancy: Movies “knows” the studio two ways

M.P. Johnson, DBMS, Stern/NYU, Spring 2005


Spot more redundancy

Name Length Studio SAddress SPhone

Pulp Fiction … Miramax NYC 212-…

Sylvia … Miramax NYC 212-…

Jay & Sil. Bob … Miramax NYC 212-…

Spot more redundancy

Length

SPhone

Name

Movies

SAddress

StudioName

Different redundancy: studio info listed for every movie!

M.P. Johnson, DBMS, Stern/NYU, Spring 2005


Don t add relships that are implied

Enrolls

Students

Courses

Assist

TA-of

TAs

Don’t add relships that are implied

Suppose each course again has <=1 TA

Q: Is the following good design?

A: If TAs other than the course’s TA can help students, then yes;

if not, then no: we can connect Students and TAs by going through Courses; redundant!

M.P. Johnson, DBMS, Stern/NYU, Spring 2005


Correct e r models may contain loops
Correct E/R models may contain loops

name

category

  • Person plays multiple roles:

    • employee of company

    • buyer of product

name

price

makes

Company

Product

stockprice

buys

employs

Person

name

ssn

address

M.P. Johnson, DBMS, Stern/NYU, Spring 2005


More design
More design

Q: What’s wrong with this design?

  • Repeating TA names & IDs – redundant

  • TA is not TAing any course now  lose TA’s data!

  • TA should get its own ES

Course-ID

CName

Enrolls

Students

Courses

TA-Email

TA-Name

TA-ID

A:

M.P. Johnson, DBMS, Stern/NYU, Spring 2005


Opposite problem entity or attribute
Opposite problem: Entity or attribute?

  • Some E/Rs improved by removing entities

  • Can convert Entity E into attributes of F if

    • R:FE is many-one

      • one-one counts because special case

    • Attributes for E are independent of each other

      • knowing one att val doesn’t tell us another att val

  • Then

    • remove E

    • add all attributes of E to F

M.P. Johnson, DBMS, Stern/NYU, Spring 2005


Course-ID

CName

Enrolls

Students

Courses

Room

TA-Name

Entity  attribute

Course-ID

CName

Enrolls

Students

Courses

Room

Assists

TA

TA-Name

M.P. Johnson, DBMS, Stern/NYU, Spring 2005


Convert ta entity again

CName CID Room TA-Name

DBMS 46 123 Howard

DBMS 46 123 Wesley

Convert TA entity again?

Course-ID

Enrolls

  • No! Multiple TAs allowed

  • Violates condition (1)

  • Redundant course data

Students

Courses

CName

Assists

Room

TA

TA-Name

M.P. Johnson, DBMS, Stern/NYU, Spring 2005


Convert ta entity again1

CName TA-Name TA-ID TA-Color

DBMS Ralph 678 Green

A.Soft. Ralph 678 Green

Convert TA entity again?

Course-ID

  • No! TA has dependent fields

  • Violates condition (2)

    • How can it tell?

  • Redundant TA data

Enrolls

Students

Courses

CName

Assists

Room

TA

TA-Name

TA-ID

TA-Favorite-Color

M.P. Johnson, DBMS, Stern/NYU, Spring 2005


Entity or attributes
Entity or attributes?

  • Should student address be an entity or an attribute?

  • If student may have multiple addresses, must be entity

    • campus address, permanent address

    • attributes cannot be set-valued

  • If we need to examine structure of address, must be entity

    • find all students from NYS but not NYC

  • If attribute, then it’s probably a simple string

    • no structure!

    • NB: this choice is a microcosm of entire miniworld

    • (much) power of a DB comes from the structure imposed on the data

M.P. Johnson, DBMS, Stern/NYU, Spring 2005


Larger example db design
Larger example DB design

  • Application: library database. Authors have written books about various subjects; different libraries in the system may carry these books.

  • Entities (with attributes in parentheses):

    • Authors (ssn, name, phone, birthdate)

    • Books (ISDN, title)

    • Subjects (sname, sid)

    • Libraries (lname)

  • Relationships [associating entities in square brackets]:

    • Wrote-on [Authors, Subjects]

    • Cover [Libraries, Subjects]

    • On [Books, Subjects]

M.P. Johnson, DBMS, Stern/NYU, Spring 2005


E r of db design
E/R of DB design

Name

ssn

phone

birthdate

Author

ISBN

wrote-on

On

Book

SName

Subject

Title

Carries

LName

Library

M.P. Johnson, DBMS, Stern/NYU, Spring 2005


Poor initial design
Poor initial design

  • First design is a poor model of this system

  • Some info not captured:

    • How many copies does a lib. have of a given book?

    • What edition of a book does the library have?

  • Design problems:

    • no direct relship associating authors and books

    • no direct relship associating libraries and books

  • Common queries complex and difficult/expensive

    • What libraries carry books by a given author?

    • What books has a given author written?

    • Who is the author of a given book?

M.P. Johnson, DBMS, Stern/NYU, Spring 2005


Larger example db design 2
Larger example DB design 2

  • Application: library database as before

  • Entities (with attributes in parentheses):

    • Authors (ssn, name, phone, birthdate)

    • Books (ISDN, title)

    • Subjects (sname, sid)

    • Libraries (lname)

  • Relations [associating entities in square brackets] (attributes in parentheses):

    • Wrote [Authors, Books]

    • Carries [Libraries, Books] (quantity, edition)

    • On [Books , Subjects]

M.P. Johnson, DBMS, Stern/NYU, Spring 2005


E r of improved db design
E/R of improved DB design

Name

ssn

phone

birthdate

Author

SName

wrote

On

Subject

ISBN

Book

  • Rule of thumb: often queried together  make closely connected

Edition

Title

Carries

Quantity

LName

Library

M.P. Johnson, DBMS, Stern/NYU, Spring 2005


Next topic constraints
Next topic: Constraints

  • Review: programmer-defined rules stating what should always be true about consistent databases

  • Restrictions on data:

    • Keys (e.g. SSNs uniquely identify people)

    • Single value constraints (e.g. everyone has 1 father)

    • Referential Integrity (e.g. person’s record refers to father  father must exist)

    • Domain constraints (e.g. gender in M/F, age in 0..150)

    • General constraints (e.g. no more than 10 customers per sales rep)

  • Can’t infer constraints from data

    • may hold “accidentally”

    • they are a part of the schema

M.P. Johnson, DBMS, Stern/NYU, Spring 2005


E r keys
E/R keys

  • Uniquely identifies entity in ES

  • Attribute or set of attributes

    • Two entities cannot agree on all key attributes

    • These attributes determine all others

  • Every ES should have a key

    • possibly including all attributes

  • Primary key attributes underlined

  • More than one possible key:

    • Candidate keys, primary key

  • Practical tip: create intentional key attribute

    • E.g. SSN, course-id, employee-id, etc.

    • SSN likely shorter than (name,address)

    • Prevents quasi-redundancy

Person

name

ssn

address

M.P. Johnson, DBMS, Stern/NYU, Spring 2005


Single valued constraints
Single-valued constraints

  • “at most one” value

    • sharp arrows

  • E.g. attributes: could be null or one

  • Many-one relationships: the “one” part is single-valued.

  • Can think of key atts as (non-null) single-valued

Assists

Course

TA

M.P. Johnson, DBMS, Stern/NYU, Spring 2005


Referential integrity
Referential integrity

  • “Exactly one value”

  • NOT NULL attributes

  • Relationships

    • Non-null value refers to entity that exists

    • Refer to entity with foreign key

    • HTML analogy: no broken links

    • Programming analogy: no dangling pointers

    • Ways of handling deletion:

      • Prevent deletion as long as referrer exist

      • Enforce deletion of all referrers

Taught

Course

Instructor

M.P. Johnson, DBMS, Stern/NYU, Spring 2005


Referential integrity e r e g
Referential integrity – E/R e.g.

Enrolls

  • Insertion – must refer to existing entity

  • Suppose need to add

    • course: “Oracle”

    • instructor: MPJ

  • Q: Which order?

  • Q: What if relship were exactly-exactly, say, M(Hs,Ws)?

    • i.e., referential integrity in both directions?

  • A: Put both inserts in one xact – later

Students

Courses

Taught

Instructor

M.P. Johnson, DBMS, Stern/NYU, Spring 2005


Other kinds of constraints
Other kinds of constraints

  • Domain constraints

    • E.g. date: must be after 1980

    • Enumerated type: grades A through F, no E

    • No specific E/R notation: mention with attribute or relationship

  • General constraints:

    • A class may have no more than 100 students; a student may not have more than 6 courses:

Enroll

Students

<=100

<=6

Courses

M.P. Johnson, DBMS, Stern/NYU, Spring 2005


Next topic weak entity sets
Next topic: Weak entity sets

  • Definition:

    • Some or all key attributes belong to another ES

  • Why:

    • An entity set is part of a hierarchy (not ISA)

    • Connecting entity sets

  • The key consists of

    • 0, 1 or more of its own attributes

    • Key attributes of entity sets from supporting relationships

M.P. Johnson, DBMS, Stern/NYU, Spring 2005


Conditions of supporting relationships
Conditions of Supporting relationships

  • Supporting relationship R:EF

    • R is many-one (E-F) (or one-one)

    • R is binary

    • Referential integrity from E to F

      • a rounded arrow

    • Those atts supplied to E are thekey attributes of F

    • F itself may be weak

      • Another entity set G, and so on recursively

R

A1

F

E

A2

M.P. Johnson, DBMS, Stern/NYU, Spring 2005


Requirements for weak entity sets
Requirements for weak entity sets

  • For several supporting relships from E to F

    • Keys of each F role appear as foreign key of E

  • Other many-one relationships

    • Not necessarily supporting

From

Purchases

A1

People

A2

By

At-store

Stores

A3

M.P. Johnson, DBMS, Stern/NYU, Spring 2005


Weak entity sets
Weak entity sets

  • Example: Hierarchy – species & genus

  • Idea: species name unique per genus only

Species

Belongs-to

Genus

name

name

M.P. Johnson, DBMS, Stern/NYU, Spring 2005


Next time
Next time

  • We’ll finish E/R models and begin the relational model

  • Read chapter 3 through section 3.4

  • Info on project, hw likely posted soon

M.P. Johnson, DBMS, Stern/NYU, Spring 2005


ad