
The Relational Model

Our Mathematical Foundation: the relational model. Origins: first proposed by E. F. Codd in 1969-70 and subsequently modified and extended. It is an abstract theory of data based on aspects of mathematical set theory. It is the basis of most modern DBMSs; none implement it entirely, but we can compare them with the idea.

Presentation Transcript


  1. Our Mathematical Foundation: The Relational Model

  2. Origins: First proposed by E. F. Codd, 1969-70; subsequently modified and extended. An abstract theory of data, based on aspects of mathematical set theory. Basis of most modern DBMSs: none implement it entirely, but we can compare them with the idea.

  3. 3 Aspects of the Model: It concerns 1) data objects (storing it), 2) data integrity (making sure it corresponds to reality), 3) data manipulation (working with it).

  4. Relational Data Objects: storing information

  5. Tables – or is it? We say that databases have tables and that data are stored in them. This is a simplification: it helps user understanding but may be misleading.

  6. Two sets of nomenclature

  7. Domain: A pool of possible values for an attribute. Each tuple has one of these values for the attribute. Domains allow meaningful comparisons. They are data types, traditionally supported poorly in most systems, and an area of recent development.

  8. Relation: Based on a collection of domains. Heading: a set of attribute:domain pairs, such that each attribute Ai has its own domain Di: {<A1:D1>, <A2:D2>, …, <Ai:Di>, …, <An:Dn>}. Body: a set of tuples, where each tuple is a set of name:value pairs: {<A1:V1>, <A2:V2>, …, <Ai:Vi>, …, <An:Vn>}.

  9. Illustration of notation: the same data as a table and as a relation. Heading: {<UID:String(3)>, <Tel:Digit(4)>}. Body: { {<UID:xxh>, <Tel:7659>}, {<Tel:2436>, <UID:hoh>}, {<UID:nwh>, <Tel:2434>} }.
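The heading/body notation on slide 9 maps fairly directly onto Python sets. The following is a minimal sketch, not part of the original slides: `make_tuple`, `heading`, `body` and `same_body` are illustrative names, and domains are written as plain strings rather than real types.

```python
# Heading: attribute name -> domain (domains are just labels in this sketch).
heading = {"UID": "String(3)", "Tel": "Digit(4)"}


def make_tuple(**pairs):
    """A tuple is a set of name:value pairs; frozenset makes it hashable."""
    return frozenset(pairs.items())


# Body: a set of tuples, written as on the slide (attribute order varies).
body = {
    make_tuple(UID="xxh", Tel="7659"),
    make_tuple(Tel="2436", UID="hoh"),
    make_tuple(UID="nwh", Tel="2434"),
}

# The same tuples in a different order, with attributes swapped around.
same_body = {
    make_tuple(Tel="2434", UID="nwh"),
    make_tuple(UID="hoh", Tel="2436"),
    make_tuple(Tel="7659", UID="xxh"),
}

# Sets carry no ordering, so the two bodies compare equal.
print(body == same_body)
```

Because both the body and each tuple are sets, neither "first row" nor "first column" has any meaning here, and adding a tuple that is already present changes nothing.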

  10. Venn diagram notation: the heading set contains <UID:String(3)> and <Tel:Digit(4)>; the body set contains {<UID:xxh>, <Tel:7659>}, {<Tel:2436>, <UID:hoh>} and {<UID:nwh>, <Tel:2434>}.

  11. Are Relations Tables? A table is a practical way to write down a relation. Relations are defined on sets: a set of attributes in the heading and a set of tuples in the body. Sets have no ordering: attributes come in no particular sequence, but columns do have a sequence; tuples come in no particular sequence, but rows do have a sequence.

  12. Properties of relations: Tuples and attributes are unordered. There are no duplicate tuples. All attribute values are atomic.

  13. Relations/attributes are unordered: Tables seem to be ordered. Don’t believe it! We do not work in terms of “next row”, “previous record” or “first tuple”. We do not rely on “next attribute” or “first column”. Do not try to write for loops. Output can be made to be ordered.

  14. No duplicate tuples (= rows): i.e. no two tuples in a relation have all the same values for corresponding attributes. This is a crucial point. It can seem like a weakness, but it is a strength. Learn to exploit it.

  15. Example of duplication: We do not want two records for Jones. The DBMS will prevent this silly duplication. This is a simple example of exploiting “no duplication”.
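How a DBMS prevents the duplicate can be sketched with Python's built-in `sqlite3` module. This is only an illustration under assumptions: the `staff` table and its values are hypothetical (the slide's data are not in the transcript), and the key is declared over all attributes, which is the "no duplicate tuples" rule expressed as a constraint.

```python
import sqlite3


def insert_twice():
    """Insert the same row twice; return (second insert rejected?, row count)."""
    conn = sqlite3.connect(":memory:")
    # A key over every attribute means no two rows can be entirely equal.
    conn.execute(
        "CREATE TABLE staff (name TEXT, dept TEXT, PRIMARY KEY (name, dept))"
    )
    conn.execute("INSERT INTO staff VALUES ('Jones', 'Sales')")
    try:
        conn.execute("INSERT INTO staff VALUES ('Jones', 'Sales')")
        rejected = False
    except sqlite3.IntegrityError:
        # The DBMS refuses the duplicate instead of storing it.
        rejected = True
    count = conn.execute("SELECT COUNT(*) FROM staff").fetchone()[0]
    conn.close()
    return rejected, count


print(insert_twice())
```

Without the key declaration, SQLite (like most SQL systems) would happily store both rows, which is one of the places where real products depart from the model.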

  16. All attribute values are atomic. Not atomic: (Jackie Chan; Acting, Filming, Computing), (Bob Dylan; Singing, Dancing). Atomic: (Jackie Chan; Acting), (Jackie Chan; Filming), (Jackie Chan; Computing), (Bob Dylan; Singing), (Bob Dylan; Dancing).
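The step from the non-atomic table to the atomic one on slide 16 can be sketched as a small function. The names `non_atomic` and `to_atomic` are illustrative, and the sketch assumes the activities are stored as a single comma-separated string.

```python
# Non-atomic form: one row per person, several activities packed in one value.
non_atomic = [
    ("Jackie Chan", "Acting, Filming, Computing"),
    ("Bob Dylan", "Singing, Dancing"),
]


def to_atomic(rows):
    """Split each packed activity list into one (person, activity) tuple."""
    return {
        (person, activity.strip())
        for person, activities in rows
        for activity in activities.split(",")
    }


for row in sorted(to_atomic(non_atomic)):
    print(row)
```

In the atomic form, a question such as "who does Filming?" becomes a simple equality test on one attribute instead of a substring search.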

  17. Atomic Values: Strings have characters, including spaces; that is not the problem. The problem is a value that holds several separate data items in one attribute.

  18. Bad solution: Which is “day 1”? How do we search this? We could get the day and activity separated. Think about what we are modelling.

  19. Solution: It looks like adding more lines, so it is not complicated. It can seem “cosmetic”. We will return to this.

  20. End: Relational Data Objects

  21. Relational Data Integrity: making sure the data correspond to reality

  22. A database as a model: A DB “models” a real-world enterprise, i.e. the DB must abstract from reality. The attribute values and their combinations must reflect the true state of the world. We try to enforce plausibility. We do this by implementing integrity rules (constraints).

  23. Data-specific integrity rules: Domain-specific: employee age is between 20 and 70; we only sell CDs in multiples of 10; car registrations must be of the form A DDD AAA, where A = alphabetic and D = digit; a temperature cannot be lower than -273.15 deg C. Inter-attribute (inter-relational): only senior managers and sales reps can have cars over 2000cc. “Cardinality” constraints: a team has 11 players.
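Each rule on slide 23 can be read as a predicate that a value (or combination of values) must satisfy. The sketch below is one possible reading, not the slides' own code: all function names are illustrative, the registration format is assumed to use upper-case letters separated by single spaces, the CD quantity is assumed to be positive, and the temperature bound is taken to be absolute zero.

```python
import re

# Assumed reading of "A DDD AAA": upper-case letters, single spaces.
REG_PATTERN = re.compile(r"[A-Z] \d{3} [A-Z]{3}")
ABSOLUTE_ZERO_C = -273.15


def valid_age(age):
    """Domain rule: employee age is between 20 and 70 inclusive."""
    return 20 <= age <= 70


def valid_cd_quantity(quantity):
    """Domain rule: CDs are sold in (positive) multiples of 10."""
    return quantity > 0 and quantity % 10 == 0


def valid_registration(reg):
    """Domain rule: the whole string must match A DDD AAA."""
    return REG_PATTERN.fullmatch(reg) is not None


def valid_temperature_c(temp):
    """Domain rule: nothing is colder than absolute zero."""
    return temp >= ABSOLUTE_ZERO_C


def valid_car(post, engine_cc):
    """Inter-attribute rule: big engines only for certain posts."""
    return engine_cc <= 2000 or post in {"senior manager", "sales rep"}


def valid_team(players):
    """Cardinality rule: a team has exactly 11 distinct players."""
    return len(set(players)) == 11


print(valid_registration("A 123 BCD"), valid_car("clerk", 2500))
```

In a real system these checks would ideally live in the DBMS as constraints rather than in application code, so that every program updating the data is subject to them.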

  24. General Integrity Rules: There are two general integrity features that are part of the relational model: entity integrity (tuple identification through candidate/primary keys) and referential integrity (foreign keys). There may also be application-specific rules; these must be identified and implemented, and it may be possible to use the DBMS for this. We will return to this.

  25. Candidate Keys (CKs): A candidate key for relation R is a subset K of the attributes of R such that (a) no two tuples of R have the same value of K (the “uniqueness property”), and (b) no proper subset of K has the uniqueness property (the “irreducibility property”). All relations have at least one candidate key (in the worst case, all the attributes together) and may have several. Uniqueness applies to all possible tuples, not just the current ones.
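The two properties can be tested mechanically against a sample of data. The sketch below uses illustrative names (`is_unique`, `candidate_keys`) and made-up rows. One important caveat: it checks only the tuples it is given, whereas the definition requires uniqueness over all possible tuples, so the result is at best a list of keys consistent with the sample, not a proof.

```python
from itertools import combinations


def is_unique(rows, attrs):
    """Uniqueness: no two rows share the same values for attrs."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True


def candidate_keys(rows, attributes):
    """Return attribute subsets that are unique and irreducible for rows."""
    keys = []
    # Try small subsets first, so every smaller key is known before a
    # larger subset is examined.
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            # Irreducibility: skip any subset that contains a smaller key.
            if any(set(k) <= set(attrs) for k in keys):
                continue
            if is_unique(rows, attrs):
                keys.append(attrs)
    return keys


# Hypothetical sample data.
rows = [
    {"uid": "xxh", "tel": "7659", "dept": "A"},
    {"uid": "hoh", "tel": "2436", "dept": "A"},
    {"uid": "nwh", "tel": "2434", "dept": "B"},
]
print(candidate_keys(rows, ["uid", "tel", "dept"]))
```

Here `tel` happens to be unique in the sample, but whether two people may share a telephone is a question about the real world, which is why keys are declared by the designer rather than inferred from data.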

  26. What are candidate keys for? Tuple-level addressing: a candidate key allows unambiguous identification of one tuple. This is content addressing: “the tuple where X has value Y” is not unique unless X is a candidate key. Access mechanisms such as indexes are more general, although CKs may be implemented using them.

  27. The primary key (PK): A specified candidate key. The choice of PK is arbitrary; there may be only one candidate, and practical factors may help decide. It is common to think only of “the” key (i.e. the primary key), but remember there may be other candidates.

  28. Why bother with CKs? Suppose that we have attributes: uid, ucas, national health, surname, first name, d.o.b., city. Let us declare UCAS to be the primary key. This constrains UCAS numbers to be distinct for different people, but still permits the same UID or national health (NH) number to appear for different people!

  29. Penalty of not declaring CKs A real-world model would not want repetitions of UID and NH for distinct persons. We want UID to be unique for each person. Similarly for NH. Remedy: declare UID to be a candidate key. Similarly for NH. Then, either of UID, NH gives unique identification. This captures two constraints. Continue to use UCAS as the PK. UCAS, UID, NH are 3 candidate keys.
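The penalty and the remedy can be shown side by side with `sqlite3`. This is a sketch under assumptions: the `student` table, the helper names `build` and `try_insert`, and all values are hypothetical, and SQL's `UNIQUE` is used as the way to declare a non-primary candidate key.

```python
import sqlite3


def build(declare_all_keys):
    """Create a student table; optionally declare UID and NH as keys too."""
    extra = "UNIQUE" if declare_all_keys else ""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        f"""CREATE TABLE student (
            ucas TEXT PRIMARY KEY,
            uid TEXT NOT NULL {extra},
            nh TEXT NOT NULL {extra},
            surname TEXT
        )"""
    )
    return conn


def try_insert(conn, row):
    """Return True if the DBMS accepts the row, False if it is rejected."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


jones = ("U001", "abc", "NH1", "Jones")
smith = ("U002", "abc", "NH2", "Smith")  # different person, same UID

pk_only = build(declare_all_keys=False)
print(try_insert(pk_only, jones), try_insert(pk_only, smith))

all_keys = build(declare_all_keys=True)
print(try_insert(all_keys, jones), try_insert(all_keys, smith))
```

With only the primary key declared, the clashing UID slips through; once UID and NH are declared as keys as well, the DBMS enforces all three constraints.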

  30. Foreign Keys: A reference mechanism between relations. The target of a reference must exist in the referenced relation: no “dangling references”. This is referential integrity. Example: consider a table of employees and a table of car allocations.

  31. Foreign Keys: An employee with no car: OK. A car allocation that references a non-existent employee: violates referential integrity.

  32. Foreign Keys (FKs) - definition: Linking relation R2 to relation R1: a foreign key FK in a relation R2 is a subset of its attributes such that there is a relation R1 with a candidate key CK, and for each value of FK in R2 there exists an identical value of CK in some tuple of R1.

  33. Foreign keys - notes: All keys are sets of attributes. A candidate key can contain a value not currently found in the foreign key. Chains of references can build up. Relations can reference themselves: a personnel relation can have a “manager” attribute, since managers are personnel.

  34. Foreign key - examples: Earlier example: PERSON{Name(PK), Post}, CAR_ALLOC{Car(PK), Allocated*}, FK {Allocated} references PERSON. Self-referential example (a surgeon is supervised by a senior surgeon, called a “consultant”): SURGEON{Surgeon(PK), Consultant*}, FK {Consultant} references SURGEON. Notation: (PK) marks the primary key and * marks a foreign key attribute.
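The PERSON / CAR_ALLOC example can be tried out with `sqlite3`. The sketch below keeps the relation and attribute names from the slide but invents the data and the function name `demo_foreign_key`. Note that SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON` has been issued on the connection.

```python
import sqlite3


def demo_foreign_key():
    """Return (insert outcomes, people with no car) for PERSON/CAR_ALLOC."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
    conn.execute("CREATE TABLE person (name TEXT PRIMARY KEY, post TEXT)")
    conn.execute(
        """CREATE TABLE car_alloc (
            car TEXT PRIMARY KEY,
            allocated TEXT REFERENCES person (name)
        )"""
    )
    # Hypothetical data: Smith exists but is never allocated a car.
    conn.execute("INSERT INTO person VALUES ('Jones', 'Sales rep')")
    conn.execute("INSERT INTO person VALUES ('Smith', 'Clerk')")

    outcomes = {}
    attempts = {"matched": ("Car1", "Jones"), "dangling": ("Car2", "Nobody")}
    for label, row in attempts.items():
        try:
            conn.execute("INSERT INTO car_alloc VALUES (?, ?)", row)
            outcomes[label] = "accepted"
        except sqlite3.IntegrityError:
            # The referenced person does not exist: a dangling reference.
            outcomes[label] = "rejected"

    carless = [
        name
        for (name,) in conn.execute(
            "SELECT name FROM person "
            "WHERE name NOT IN (SELECT allocated FROM car_alloc) "
            "ORDER BY name"
        )
    ]
    conn.close()
    return outcomes, carless


print(demo_foreign_key())
```

A person who is never referenced is perfectly legal, which matches the note on slide 33 that a candidate key may contain values not found in the foreign key.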

  35. Referential Integrity: The database must not hold any unmatched foreign keys. The DBMS should prevent the situation arising, and most do today. The DBMS can reject an operation that would compromise integrity, or make other changes to retain integrity.

  36. Maintaining R. Integrity: On an attempt to delete the target of a foreign key: only allow it if there is no matching FK value, or cascade-delete the tuples with FK matches. On an attempt to update the candidate key: only allow it if there is no matching FK value, or cascade-update the FK in the matching tuples.
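In SQL these two policies correspond roughly to the `RESTRICT` and `CASCADE` referential actions. The following `sqlite3` sketch reuses the PERSON / CAR_ALLOC relations with invented data; the function name `attempt` and its parameters are illustrative, and the same action is applied to both deletes and updates purely to keep the example short.

```python
import sqlite3


def attempt(action, statement):
    """Run statement against a small database whose FK uses the given action.

    Returns (outcome, remaining car_alloc rows).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE person (name TEXT PRIMARY KEY, post TEXT)")
    conn.execute(
        f"""CREATE TABLE car_alloc (
            car TEXT PRIMARY KEY,
            allocated TEXT REFERENCES person (name)
                ON DELETE {action} ON UPDATE {action}
        )"""
    )
    conn.execute("INSERT INTO person VALUES ('Jones', 'Sales rep')")
    conn.execute("INSERT INTO car_alloc VALUES ('Car1', 'Jones')")
    try:
        conn.execute(statement)
        outcome = "allowed"
    except sqlite3.IntegrityError:
        outcome = "rejected"
    rows = conn.execute("SELECT car, allocated FROM car_alloc").fetchall()
    conn.close()
    return outcome, rows


DELETE = "DELETE FROM person WHERE name = 'Jones'"
UPDATE = "UPDATE person SET name = 'Jonas' WHERE name = 'Jones'"

print(attempt("RESTRICT", DELETE))  # a matching FK value exists
print(attempt("CASCADE", DELETE))   # matching tuples are deleted too
print(attempt("CASCADE", UPDATE))   # the FK value follows the key
```

Which policy is right depends on what is being modelled: cascading a delete quietly removes the car allocation, which may or may not reflect what happened in the real world.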

  37. Beware Autogenerated Keys! Some systems readily offer to generate key values for you, e.g. every time another tuple is entered, a numeric key value is automatically allocated. This permits the same information to be inserted twice with distinct autokey values!

  38. Beware Autogenerated Keys! Remedy: make Autokey, UID, UCAS, NH candidate keys and select one as PK. Remedy: avoid introducing unnecessary new keys. Do we need an AutoKey in this example? Caution: if you do introduce a new key, you still need to identify other candidate keys (or risk bad modelling).
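The pitfall and the first remedy can be reproduced with `sqlite3`. The table, the function name `insert_same_person_twice` and the data are hypothetical; `INTEGER PRIMARY KEY AUTOINCREMENT` is SQLite's way of generating key values automatically.

```python
import sqlite3


def insert_same_person_twice(declare_uid_unique):
    """Insert identical information twice; return how many rows were stored."""
    uid_constraint = "UNIQUE" if declare_uid_unique else ""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        f"""CREATE TABLE person (
            autokey INTEGER PRIMARY KEY AUTOINCREMENT,
            uid TEXT NOT NULL {uid_constraint},
            surname TEXT
        )"""
    )
    stored = 0
    for _ in range(2):
        try:
            # No autokey is supplied: the DBMS allocates a fresh one each time.
            conn.execute(
                "INSERT INTO person (uid, surname) VALUES ('xxh', 'Jones')"
            )
            stored += 1
        except sqlite3.IntegrityError:
            pass
    conn.close()
    return stored


print(insert_same_person_twice(declare_uid_unique=False))
print(insert_same_person_twice(declare_uid_unique=True))
```

The generated key is always unique, so on its own it constrains nothing about the real-world data; the other candidate keys still have to be declared.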

  39. Multiple multi-attribute keys: Consider a timetable, with entries for Day, Time, Module, Room, Building, LecturerID. Assume that a Room label such as A6 can appear in different buildings. What are the candidate keys? There might be several.

  40. Multiple multi-attribute keys: With business rules (e.g. a module has only one lecturer, a lecturer lectures in 1 room at a time, only 1 module in a room at a time ...) there are 3 candidate keys in this case: {Day, Time, Room, Building} – 4 attributes; {Day, Time, Lecturer} – 3 attributes; {Day, Time, Module} – 3 attributes. Choose one to be the PK.
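All three candidate keys can be declared at once: one as the primary key and the others as `UNIQUE` constraints. This `sqlite3` sketch uses the attribute names from the slide, while the helper `load` and the timetable rows are invented for illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE timetable (
    day TEXT, time TEXT, module TEXT, room TEXT, building TEXT, lecturer TEXT,
    PRIMARY KEY (day, time, room, building),
    UNIQUE (day, time, lecturer),
    UNIQUE (day, time, module)
)
"""


def load(rows):
    """Insert rows one by one and report which the DBMS accepts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    outcomes = []
    for row in rows:
        try:
            conn.execute("INSERT INTO timetable VALUES (?, ?, ?, ?, ?, ?)", row)
            outcomes.append("accepted")
        except sqlite3.IntegrityError:
            outcomes.append("rejected")
    conn.close()
    return outcomes


# Hypothetical entries: (day, time, module, room, building, lecturer).
entries = [
    ("Mon", "09:00", "DB1", "A6", "Main", "L01"),
    ("Mon", "09:00", "AI1", "A6", "Annex", "L02"),  # same room label, other building
    ("Mon", "09:00", "OS1", "B2", "Main", "L01"),   # lecturer in two rooms at once
    ("Mon", "09:00", "DB1", "C3", "Main", "L03"),   # module twice at the same time
    ("Tue", "09:00", "DB1", "A6", "Main", "L01"),   # different day: no clash
]
print(load(entries))
```

Each declared key encodes one business rule, so leaving a candidate key undeclared means leaving a rule unenforced.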

  41. Multi-attribute Foreign Keys: A multi-attribute PK may be referenced from another relation. The referencing foreign key then needs to be declared with the same structure as that PK (or, more generally, the CK). {Room, Building} above could be declared to be an FK that references a relation with PK {Room, Building} describing facilities.
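A two-attribute foreign key is declared as a unit, matching the two-attribute key it references. In this `sqlite3` sketch the `facility` relation, its `seats` attribute, the function name `demo_composite_fk` and all values are assumptions made for illustration.

```python
import sqlite3


def demo_composite_fk():
    """Show that {room, building} must match a facility as a pair."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        """CREATE TABLE facility (
            room TEXT, building TEXT, seats INTEGER,
            PRIMARY KEY (room, building)
        )"""
    )
    conn.execute(
        """CREATE TABLE timetable (
            day TEXT, time TEXT, module TEXT, room TEXT, building TEXT,
            PRIMARY KEY (day, time, room, building),
            FOREIGN KEY (room, building) REFERENCES facility (room, building)
        )"""
    )
    # Only room A6 in the Main building is a known facility.
    conn.execute("INSERT INTO facility VALUES ('A6', 'Main', 40)")

    outcomes = {}
    for building in ("Main", "Annex"):
        try:
            conn.execute(
                "INSERT INTO timetable VALUES ('Mon', '09:00', 'DB1', 'A6', ?)",
                (building,),
            )
            outcomes[building] = "accepted"
        except sqlite3.IntegrityError:
            # Room label A6 exists, but not in this building.
            outcomes[building] = "rejected"
    conn.close()
    return outcomes


print(demo_composite_fk())
```

Matching on the room label alone would not be enough, since the same label can appear in different buildings.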

  42. End: Data Integrity
