databases and database design
Download
Skip this Video
Download Presentation
Databases and Database Design

Loading in 2 Seconds...

play fullscreen
1 / 35

Databases and Database Design - PowerPoint PPT Presentation


  • 125 Views
  • Uploaded on

Databases and Database Design. William A. Yasnoff, MD, PhD Oregon Health Division. Databases & Database Design. Introduction Relational Databases Database Design: Data Models. Introduction. Database = large collection of information Stored on hard disk Retrieval

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'Databases and Database Design' - onofre


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
databases and database design

Databases and Database Design

William A. Yasnoff, MD, PhD

Oregon Health Division

databases database design
Databases & Database Design
  • Introduction
  • Relational Databases
  • Database Design: Data Models
introduction
Introduction
  • Database = large collection of information
  • Stored on hard disk
  • Retrieval
    • 100,000 items X 10 ms = 17 minutes
    • Need to organize data to improve retrieval speed
    • How do you organize your paper files?
introduction continued
Introduction (continued)
  • Data Organization
    • sorting data: phonebook in alpha v. random order
    • “tabs” at key points: dictionary with tabs for each letter
    • direct pointers to information = index
      • table of key element value and address of other data
      • e.g. name & mailbox number --> letters in mailbox
relational database model6
Relational Database Model
  • Developed by EF Codd, CJ Date (70s)
  • Table = Entity = Relation
  • Table row = tuple = instance
  • Table column = attribute
  • Table linkage by values
  • Entity-Relationship Model
attributes
Attributes
  • Key
    • uniquely identify row/tuple
    • may be one or more attributes
  • Non-key
    • other properties of instance
    • dependent on key
retrieval in rdms
Retrieval in RDMS
  • SQL
    • Select ...
    • From ...
      • optional: can be computed
    • Where ...
  • Query by example
    • Fill in the blanks
    • WONDER
advantages of rdms
Advantages of RDMS
  • Very strong theoretical basis
    • storage
    • retrieval
  • Easy to implement
  • Conflicts and anomalies can be avoided
  • Intuitive appeal
  • Easy retrieval
relational model summary
Relational Model: Summary
  • popular method of organizing data
  • strong theoretical properties facilitate retrieval
  • relation = table
  • attribute = column
  • tuple = row
  • key = {attributes} that uniquely identify each row
relational model summary 2
Relational Model: Summary 2
  • Allows complex data relationships with multiple tables:
    • {patient id, patient demographics}
    • {patient id, patient visit date, blood pressure}
  • Structured Query Language (SQL) retrieval
      • Select patient _name where blood_lead_level > 10
data models
Data Models
  • Definition: A representation of the data and and data relationships of an activity
    • data: case report of serious E. Coli illness
    • data: E. Coli serotype O157
    • relationship: case report “contains” serotype (semantic relationship)
  • Database Design = Development of a quality data model
data model quality
Correctness

Conceptual (concepts represented properly)

real world representation

Syntactic (relationships represented properly)

real world language

Completeness (wholeness)

Conceptual (all concepts represented)

Syntactic (all relationships represented)

Data Model Quality
entity
Entity
  • “Any distinguishable object that is to be represented in a database” [C.J. Date]
  • Properties
    • within scope of model
    • single concept
    • a set of distinguishable “instances”
    • satisfies normalization rules
  • Also called a relation
entity guidelines
Entity Guidelines
  • Relationship with at least one other entity
  • Unique, descriptive name
  • Class or set of things (not just one)
  • Single meaning (no homonyms)
  • No synonyms (two entities describing same class)
attribute
Attribute
  • “a type or characteristic of an entity” (e.g. “gender” is an attribute of the entity “patient”)
  • Domain = the set of values from which an attribute may be selected (e.g. the domain of the attribute “gender” is [male, female])
  • An entity typically has many attributes
attribute characteristics
Attribute Characteristics
  • Key
    • value uniquely identifies entity
      • e.g. “Lab test ID” is key attribute of “specimen”
  • Non-Key
    • value does not uniquely identify entity
      • e.g. “author” does not uniquely identify “publication”
attribute characteristics19
Attribute Characteristics
  • Atomic = individual data value (one and only one fact)
  • Description = complete and clear definition
    • e.g. “professional privilege date” = The date on which a health care professional is granted privilege to practice in a particular health care facility, establishing the provider’s eligibility for patient care assignments and liability coverage
attribute guidelines
Attribute Guidelines
  • Unique name
    • no plurals, possessives, articles, conjunctions, verbs, or prepositions
  • Clear, complete, unambiguous description
  • Atomic (no positional information)
  • Domain with 2 or more values
  • Originates in only one entity
primary key
Primary Key
  • Attribute of an entity whose values uniquely determine its occurrences
    • {birth certificate number, person name, birth date, mother name, physician name}
    • {facility, patient name, physician name, date, temperature, pulse, blood pressure}
primary key characteristics
Primary Key Characteristics
  • Stable: does not change over time
  • Minimal: fewest attributes necessary
  • Factless: no hidden information
  • Definitive: value always exists
  • Accessible: available when data created
  • Unique: absolutely no duplicates
relationship
Relationship
  • Semantic: “contains”, “is part of”, “belongs to”
  • One-to-one
    • serotype of an organism
    • immunization status of a child
  • One-to-many
    • antibiotic resistances of an organism
    • vaccines administered to a child
relationship guidelines
Relationship Guidelines
  • No circular references (e.g. “health plans -> markets -> products -> health plans” should be “health plans -> health plan markets -> market products <- health plan products <- health plans”)
  • Single relationship between two entities
  • No recursive relationships
single relationship of entities
Single Relationship of Entities
  • Problem: employee <--> job assignment <--> job [double circular]
  • Solution: person --> job assignment <-- job [“person” includes “status” as employee or contractor]
recursive relationships
Recursive Relationships
  • Problem: supervisor <--> employee [may be circular]
  • Solution: person {person ID } --> employee-supervisor relationship {employee ID, supervisor ID}
    • multiple supervisory roles for each person
normalization
Normalization
  • Formalization of common sense rules of information organization
  • An attribute is functionally dependent on X only if each of its values is determined by the value of X (X may be composite)
  • Example: DOB is functionally dependent on Driver’s License number
key normalization concept
Key Normalization Concept
  • Functional dependence of each entity must be
    • based on entire primary key
    • NOT based on any other attributes
benefits of normalization
Benefits of Normalization
  • Aids in database design, integration
  • Ensures precise capture of business logic
  • Minimizes redundancy
  • Minimizes need for null values
  • Prevents
    • information loss
    • unintentional results
summary of normalization
Summary of Normalization
  • One Fact in One Place
common db design errors
Common DB Design Errors
  • Multiple instances in same row of table, e.g. first_value, second_value, third_value [Problem: what to do with 4th value?]
  • Same data item repeated in multiple places, e.g. address appears in two different tables [Problem: how to keep two values synchronized?]
data model quality32
Correctness

Conceptual (concepts represented properly)

real world representation

Syntactic (relationships represented properly)

real world language

Completeness (wholeness)

Conceptual (all concepts represented)

Syntactic (all relationships represented)

Data Model Quality
database design pearls
Database Design Pearls
  • Accommodate all data needed
  • Correct relationships between data items
  • No duplicate representation
  • Anticipated retrievals use indexes
  • Meet confidentiality requirements
references 1
References - 1
  • Reingruber MC & Gregory WW: The Data Modeling Handbook (New York: John Wiley & Sons, 1994)
  • Montgomery SL: Object-Oriented Information Engineering (Boston: AP Professional/Harcourt Brace, 1994)
references 2
References - 2
  • Codd EF: The Relational Model for Database Management (Reading, MA: Addison-Wesley, 1990)
  • Date CJ: An Introduction to Database Systems, 5th ed. (Reading, MA: Addison-Wesley, 1990)
  • Duncan KA: Health Information and Health Reform (San Francisco: Jossey-Bass, 1994)
ad