1 / 77

The Relational Model

The Relational Model. Relations. S.S.N. street. A relation is a more concrete construction, of something we have seen before, the ER diagram. A relation is (just!) a table! We will use table and relation interchangeably, except where there is a possibility of confusion. name. city.

eara
Download Presentation

The Relational Model

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. The Relational Model

  2. Relations S.S.N street A relation is a more concrete construction, of something we have seen before, the ER diagram. A relation is (just!) a table! We will use table and relation interchangeably, except where there is a possibility of confusion. name city students The students relation

  3. A relation consists of a relational schema and a relational instance. A relation schema is essentially a list of column names with their data types. In this case… students(name : string, S.S.N : string, street : string, city : string) • An relation instance is made up of zero of more tuples (rows, records)

  4. A schema specifies a relation’s name. students(name : string, S.S.N : string, street : string, city : string) A schema also specifies the name of each field, and its domain. Fields are often referred to as columns, attributes, dimensions

  5. A minor, but important point about relations, they are unordered. This is not a problem, since we refer to fields by name. However sometimes, we refer to the fields by their column number, in which case the ordering becomes important. I will point this out when we get there. Also, the tuples are unordered too!

  6. Note that every tuple in our instance is unique. This is not a coincidence. The definition of relation demands it. Later we will see how we can represent weak entities in relations.

  7. The number of fields is called the degree (or arity, or dimensionality of the relation). Below we have a table of degree 4. The number of tuples cardinality of the relation Of course, we don’t count the row that has the labels! To the right we have a table of cardinality 3.

  8. students(name : string, S.S.N : string, street : string, city : string) Note that relations have primary keys, just like ER diagrams. Remember that the primary key might not be one field, it may be a combination of two or more fields.

  9. Teaches Name PID days Professor Translating ER diagrams into Relations • We need to figure out how to translate ER diagrams into relations. • There are only three cases to worry about. • Strong entity sets • Weak entity sets • Relationship sets Name Number Course

  10. Name Number Teaches Course Name PID days Professor • Strong entity sets professor(PID : string, name : string) This is trivial, the primary key of the ER diagram becomes the primary key of the relation. All other fields are copied in (in any order)

  11. Name Number Teaches Course Name PID days Professor • Weak entity sets course(PID : string, number : string, name : string) The primary key of the relation consists of the union of the primary key of the strong entity set and the discriminator of the weak entity set. The “imported” key from the strong entity set is called the foreign key. All other fields are copied in (in any order)

  12. Name Number Teaches Course Name PID days Professor • Relationship entity sets teaches(PID : string, days : string ) • For one-to-one relationship sets, the relation’s primary key can be that of either entity set. • For many-to-many relationship sets, the union of the primary keys becomes the relation’s primary key • For the other cases, the the relation’s primary key is taken from the strong entity set.

  13. Name Number Teaches Course Name PID days Professor So, this ER Model… … maps to this database schema professor(PID : string, name : string) course(PID : string, number : string, name : string) teaches(PID : string, days : string)

  14. We have seen how to create a database schema, how do we create an actual database on our computers? professor(PID : string, name : string) course(PID : string, number : string, name : string) teaches(PID : string, days : string)

  15. …how do we create an actual database on our computers? We use SQL, a language that allows us to build, modify and query databases. professor(PID : string, name : string)

  16. SQL (Structured Query Language) • SQL is a language that allows us to build, modify and query databases. • SQL is an ANSI standard language. American National Standards Institute • SQL is the “engine” behind Oracle, Sybase, Microsoft SQL Server, Informix,Access, Ingres, etc. • Most of these systems have build GUIs on top of the command line interface, so you don’t normally write statements directly in SQL (although you can).

  17. Important Note • In our textbook, the authors introduce SQL at the same time as they introduce the relational model (Chapter 3). • My plan is a little different. I plan to discuss operations on databases (using relational algebra) in a more abstract way, and revisit SQL later in the course. • I encourage you to glance at the SQL material as you read about the relational model in Chapter 3, but don’t worry about the details of SQL just yet.

  18. Relational Algebra • Procedural language • Five basic operators • selectionselect • projectionproject • union (why no intersection?) • set differencedifference • Cross productCartesian product • The are some other operators which are composed of the above operators. These show up so often that we give them special names. • The operators take one or two relations as inputs and give a new relation as a result. SQL is closely based on relational algebra.

  19. Select Operation – Example A B C D • Relation r Intuition: The select operation allows us to retrieve some rows of a relation (by “some” I mean anywhere from none of them to all of them) Here I have retrieved all the rows of the relation r where either the value in field A equals the value in field B, or the value in field D is greater than 5.         1 5 12 23 7 7 3 10 • A=B ^ D > 5(r) A B C D     1 23 7 10 lowercase Greek sigma

  20. Select Operation • Notation: p(r) lowercase Greek sigma • p is called the selection predicate • Defined as: p(r) = {t | t  rand p(t)} Where p is a formula in propositional calculus consisting of terms connected by :  (and),  (or),  (not)Each term is one of: <attribute> op <attribute> or <constant> where op is one of: =, , >, . <.  • Example of selection:name=“Keogh(professor)

  21. Project Operation – Example I A B C • Relation r:     10 20 30 40 7 1 1 2 Intuition: The project operation allows us to retrieve some columns of a relation (by “some” I mean anywhere from none of them to all of them) Here I have retrieved columns A and C. • A,C (r) A C     7 1 1 2 Greek capital letter pi

  22. Project Operation – Example II A B C • Relation r:     10 20 30 40 1 1 1 2 Intuition: The project operation removes duplicate rows, since relations are sets. Here there are two rows with A =  and C = 1. So one was discarded. A C A C • A,C (r)     1 1 1 2    1 1 2 =

  23. Project Operation • Notation:A1, A2, …, Ak (r) Greek capital letter pi where A1, A2 are attribute names and r is a relation name. • The result is defined as the relation of k columns obtained by erasing the columns that are not listed • Duplicate rows removed from result, since relations are sets.

  24. Union Operation – Example A B A B    1 2 1   2 3 Relations r, s: s Intuition: The union operation concatenates two relations, and removes duplicate rows (since relations are sets). Here there are two rows with A =  and B = 2. So one was discarded. r A B     1 2 1 3 r  s:

  25. Union Operation • Notation: r s • Defined as: r s = {t | t  r or t  s} For r s to be valid. 1. r,s must have the same arity (same number of attributes) 2. The attribute domains must be compatible (e.g., 2nd column of r deals with the same type of values as does the 2nd column of s). Although the field types must be the same, the names can be different. For example I can union professorand lecturer where: professor(PID : string, name : string) lecturer(LID : string, first_name : string)

  26. Set Difference Operation – Example A B A B    1 2 1   2 3 Relations r, s: s r Intuition: The set difference operation returns all the rows that are in r but not in s. r – s: A B   1 1

  27. Set Difference Operation • Notation r – s • Defined as: r – s = {t | t rand t  s} • Set differences must be taken between compatible relations. • r and s must have the same arity • attribute domains of r and s must be compatible • Note that in general r – s  s – r

  28. Cross-Product Operation -Example A B C D E Relations r, s:   1 2     10 10 20 10 a a b b r s Intuition: The cross product operation returns all possible combinations of rows in r with rows in s. In other words the result is every possible pairing of the rows of r and s. r xs: A B C D E         1 1 1 1 2 2 2 2         10 19 20 10 10 10 20 10 a a b b a a b b

  29. A B C D E         1 1 1 1 2 2 2 2         10 19 20 10 10 10 20 10 a a b b a a b b Cross-Product Operation-Example A B C D E Relations r, s:   1 2     10 10 20 10 a a b b r s Intuition: The cross product operation returns all possible combinations of rows in r with rows in s. In other words the result is every possible pairing of the rows of r and s. r xs:

  30. Cross-Product Operation • Notation r x s • Defined as: r x s = {t q | t  r and q  s} • Assume that attributes of r(R) and s(S) are disjoint. (That is, R  S = ). • If attributes names of r(R) and s(S) are not disjoint, then renaming must be used.

  31. Composition of Operations r xs: • We can build expressions using multiple operations • Example: A= C(r x s) A B C D E         1 1 1 1 2 2 2 2         10 10 20 10 10 10 20 10 a a b b a a b b A B C D E   1 2     10 10 20 10 a a b b r s A B C D E       “take the cross product of r and s, then return only the rows where A equals B” 10 20 20 a a b 1 2 2 A=C(r x s)

  32. Rename Operation A B A B • Allows us to name, and therefore to refer to, the results of relational-algebra expressions. Example: myRelation (r – s)    1 2 1   2 3 s r Take the set difference of r and s, and call the result myRelation Renaming in relational algebra is essentiality the same as assignment in a programming language A B   1 1 myRelation

  33. Rename Operation A B A B    1 2 1   2 3 If a relational-algebra expression E has arity n, then x(A1, A2, …, An)(E) returns the result of expression E under the name X, and with the attributes renamed to A1, A2, …., An. Example myRelation(E,K) (r – s) s r E K   1 1 Take the set difference of r and s, and call the result myRelation, while renaming the first field E and the second field K. myRelation

  34. Banking Examples branch (branch-name, branch-city, assets) customer (customer-name, customer-street, customer-only) account (account-number, branch-name, balance) loan (loan-number, branch-name, amount) depositor (customer-name, account-number) borrower (customer-name, loan-number) Note that I have not indicated primary keys here for simplicity.

  35. Quick note on notation bad_customers good_customers • If we have two or more relations which feature the same attribute names, we could confuse them. To prevent this we can use dot notation. • For example • good_customers.loan-number

  36. Example Queries “select from the relation loan, only the rows which have a amount greater than 1200” • Find all loans of over $1200 amount> 1200 (loan) loan amount> 1200 (loan)

  37. Example Queries • Find the loan number for each loan of an amount greater than $1200 loan-number (amount> 1200 (loan)) “select from the relation loan, only the rows which have a amount greater than 1200, then project out just the loan_number” loan amount> 1200 (loan) loan-number (amount> 1200 (loan))

  38. Example Queries • Find all loans greater than $1200 or less than $75 amount> 1000 oramount< 75(loan) “select from the relation loan, only the rows which have a amount greater than 1000 or an amount less than 75 loan amount> 1000 oramount< 75(loan)

  39. Example Queries • Find the names of all customers who have a loan, an account, or both, from the bank customer-name (borrower)  customer-name (depositor) borrower depositor customer-name (borrower) • customer-name (depositor)

  40. Note this example is split over two slides! Example Queries Find the names of all customers who have a loan at the Riverside branch. customer-name (branch-name=“Riverside ” (borrower.loan-number = loan.loan-number(borrower x loan))) borrower loan We retrieve borrower and loan… …we calculate their cross product…

  41. customer-name (branch-name=“Riverside ” (borrower.loan-number = loan.loan-number(borrower x loan))) …we calculate their cross product… …we select the rows where borrower.loan-number is equal to loan.loan-number… …we select the rows where branch-name is equal to “Riverside” …we project out the customer-name.

  42. Note this example is split over three slides! Example Queries Find the largest account balance ...we will need to rename account relation as d... balance(account) - account.balance(account.balance < d.balance(account xrd (account))) d account We do a rename to get a “copy” of account which we call d… … next we will do a cross product…

  43. balance(account) - account.balance(account.balance < d.balance(account xrd (account))) … do a cross product… …select out all rows where account.balance is less than d.balance… .. next we project…

  44. balance(account) - account.balance(account.balance < d.balance(account xrd (account))) .. next we project out account.balance… …then we do a set difference between it an the original account.balance from the account relation… … the set difference leaves us with one number, the largest value! account

  45. Formal Definition • A basic expression in the relational algebra consists of either one of the following: • A relation in the database • A constant relation • Let E1 and E2 be relational-algebra expressions; the following are all relational-algebra expressions: • E1 E2 • E1 - E2 • E1 x E2 • p (E1), P is a predicate on attributes in E1 • s(E1), S is a list consisting of some of the attributes in E1 • x(E1), x is the new name for the result of E1

  46. Additional Operations We define additional operations that do not add any power to the relational algebra, but that simplify common queries. • Natural join • Conditional Join • Equi join • Division • Set intersection All joins are really special cases of conditional join

  47. Natural-Join Operation: Motivation borrower loan Very often we have a query and the answer is not contained in a single relation. For example, I might wish to know where Apu banks. The classic relational algebra way to do such queries is a cross product, followed by a selection which tests for equality on some pair of fields. borrower.l-number = loan.l-number(borrower x loan))) • While this works… • it is unintuitive • it requires a lot of memory • the notation is cumbersome Note that is this example the two relations are the same size (2 by 2), this does not have to be the case. So we have a more intuitive way of achieving the same effect, the natural join, denoted by the symbol

  48. borrower loan Natural-Join Operation: Intuition Natural join combines a cross product and a selection into one operation. It performs a selection forcing equality on those attributes that appear in both relation schemes. Duplicates are removed as in all relation operations. So if the relations have one attribute in common, as in the last slide (“l-number”), for example, we have… = borrower.l-number = loan.l-number(borrower x loan))) • There are two special cases: • If the two relations have no attributes in common, then their natural join is simply their cross product. • If the two relations have more than one attribute in common, then the natural join selects only the rows where all pairs of matching attributes match. (lets see an example on the next slide).

  49. B A Both the l-name and the f-name match, so select. Only the f-names match, so don’t select. Only the l-names match, so don’t select. We remove duplicate attributes… The natural join of A and B AB = Note that this is just a way to visualize the natural join, we don’t really have to do the cross product as in this example

  50. Natural-Join Operation • Notation: rs • Let r and s be relations on schemas R and S respectively.The result is a relation on schema RS which is obtained by considering each pair of tuples tr from r and ts from s. • If tr and ts have the same value on each of the attributes in RS, a tuple t is added to the result, where • t has the same value as tr on r • t has the same value as ts on s • Example: R = (A, B, C, D) S = (E, B, D) • Result schema = (A, B, C, D, E) • rs is defined as: r.A, r.B, r.C, r.D, s.E (r.B = s.B r.D = s.D (r x s))

More Related