1 / 77

Database Systems

Database Systems. Major Content & Grade. Introduction * The Relational Model *** SQL **** Transaction Management *** Database Design (E-R) *** Database Design (Normalization) ***. Unit 2 The Relational Model. Relational Model Relational Algebra( 关系代数)

chas
Download Presentation

Database Systems

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Database Systems

  2. Major Content & Grade • Introduction * • The Relational Model *** • SQL **** • Transaction Management *** • Database Design (E-R) *** • Database Design (Normalization) ***

  3. Unit 2 The Relational Model • Relational Model • Relational Algebra(关系代数) • Relational Calculus(关系演算)

  4. Relational Model • Laid down in 1969-1970 by E.F.CODD “ A Relational Model of Data for Large Shared Data Bank” C ACM 1970 • Mathematical Bases ( Relational Theory) • Developed in 1980s • Most commercial DBMS are Relational

  5. Mathematical Relations • Formally, given sets D1, D2, …. Dn, a relation r is a subset of Cartesian product(笛卡尔积) D1×D2 × … ×DnThus a relation(关系) is a set of n-tuples (a1, a2, …, an) where ai Di • Example: if customer-name = {Jones, Smith, Curry, Lindsay}customer-street = {Main, North, Park}customer-city = {Harrison, Rye, Pittsfield} Then r = {(Jones, Main, Harrison), (Smith, North, Rye), (Curry, North, Rye), (Lindsay, Park, Pittsfield)} is a relation over customer-name × customer-street × customer-city.

  6. Mathematical Relations • Each attribute of a relation has a name • The set of allowed values for each attribute is called the domain(域)of the attribute • Attribute values are (normally) required to be atomic; that is, indivisible • Note: multivalued attribute(多值属性) values are not atomic • Note: composite attribute (组合属性) values are not atomic • The special value null is a member of every domain • The null value causes complications in the definition of many operations

  7. Mathematical Relations • A1, A2, …, Anare attributes • R = (A1, A2, …, An ) is a relation schema(关系模式) E.g. Customer-schema = (customer-name, customer-street, customer-city) • r(R) is a relation on the relation schema R E.g. customer (Customer-schema)

  8. customer_name customer_street customer_city Jones Smith Curry Lindsay Main North North Park Harrison Rye Rye Pittsfield customer Mathematical Relations • The current values (relation instance, 关系实例) of a relation are specified by a table • An element t of r is a tuple(元组), represented by a row in a table • Order of tuples is irrelevant (元组是无序的 tuples may be stored in an arbitrary order) attributes (or columns) tuples (or rows)

  9. Relational Database • A Relational database consists of multiple relations • Information about an enterprise is broken up into parts, with each relation storing one part of the information • Example: The customer table stores information about customers

  10. (b) The account table (c) The depositor table Relational Database The account table stores information about accounts. The depositortable containing information about which customer owns which account • Storing all information as a single relation is NOT a good idea!

  11. Key • Let K  R • K is a superkey(超码)of R if values for K are sufficient to identify a unique tuple of each possible relation r(R). • Example: {customer_id, customer_name} and {customer_id} are both superkeys of Customer. • by “possible r” we mean a relation r that could exist in the enterprise we are modeling.

  12. Key • Let K  R • K is a superkey(超码)of R if values for K are sufficient to identify a unique tuple of each possible relation r(R). • K is a candidate key(候选码) if K is minimal. • Example: {customer_id} is a candidate key for Customer, since it is a superkey, and no subset of it is a superkey. • {customer_name} is a candidate key also, assuming no two customers can possibly have the same name.

  13. Key • Primary key(主码): The candidate key that is selected to identify tuples uniquely within the relation. And the others candidate key is called Alternate Keys(替换码). • Example: for relation customer (customer_id, customer_name, customer_street, customer_city) • Candidate key : {customer_id}, {customer_name} • Primary key: {customer_id}

  14. Key • For the banking database, we have several relations to store data: • Foreign Key(外码): An attribute, or set of attributes,within one relation R1 that matches the candidate key of some(possibly the same) relation R2. • Example: Attribute customer_name in table depositor is a Foreign Key references to customer_name in customer.

  15. Key • Usually, Primary Key and Foreign Key was indicated using undirected line and dashed line. • Example: Find out PK and FK of the following schema: Student(sno, sname, ssex, sbirth, sdept) Course(cno, cname, cpno, ccredit) SC(sno, cno, grade)

  16. Integrity of Relational Model • Entity integrity(实体完整性约束): In a base relation, NO attribute of a primary key can be null. • Referential integrity(参照完整性约束): If a foreign key exists in a relation, either the foreign key value must match a candidate key value of some tuple in its home relation(被参照关系) or the foreign key value must be wholly null. • Example: customer (customer_name, customer_street, customer_city) depositor (customer_name, account_number) account (account_number, branch_name, balance)

  17. Integrity of Relational Model • Null(空值): Represents a value for an attribute that is currently unknown or is not applicable for this tuple. • It is possible for tuples to have a null value, denoted by null, for some of their attributes • The result of any arithmetic expression involving null is null. • Aggregate functions simply ignore null values (as in SQL) • For duplicate elimination and grouping, null is treated like any other value, and two nulls are assumed to be the same (as in SQL) • Null value is NOT 0 of integer, NOT empty string , it is JUST NULL.

  18. Integrity of Relational Model • Comparisons with null values return the special truth value: unknown • If false was used instead of unknown, then not (A < 5) would not be equivalent to A >= 5 • Three-valued logic using the truth value unknown: • OR: (unknownortrue) = true, (unknownorfalse) = unknown (unknown or unknown) = unknown • AND:(true and unknown) = unknown, (false and unknown) = false,(unknown and unknown) = unknown • NOT: (not unknown) = unknown • Result of select predicate is treated as false if it evaluates to unknown

  19. Unit 2 The Relational Model • Relational Model • Relational Algebra(关系代数) • Relational Calculus(关系演算)

  20. Manipulation of Relational Model • Query Language is a Language in which user requests information from the database. • Categories of languages • Procedural(过程化的) • Non-procedural, or declarative(非过程化的,声明性的) • Mathematical languages: • Relational algebra(关系代数) • Tuple relational calculus(元组演算) • Domain relational calculus (域演算) • Characteristics of Relational languages • all set-at-a-time (一次一集合) operands & results are whole tables • Closure property(封闭性) the output from one operation can become input to another.

  21. Relational Algebra • Relational algebra is a Procedural language • Basic operators • select(选择):  • project(投影): π • union(并): ∪ • set difference(集合差): – • Cartesian product(笛卡尔积): × • rename:  • The operators take two or more relations as inputs and give a new relation as a result.

  22. Basic operators • Select Operation(选择操作) • Notation: p(r) • p is called the selection predicate(条件谓词) • Defined as: p(r) = {t | t  r  p(t)} Where p is a formula in propositional calculus(命题演算) consisting of terms connected by :  (and),  (or),  (not)Each term is one of: <attribute> op <attribute> (or <constant>) where op is one of: =, , >, , <,  (comparison operator) • Example of selection:branch_name=“Perryridge”(account)

  23. A B C D         1 5 12 23 7 7 3 10 A B C D     1 23 7 10 Basic operators • Select Operation(选择操作) • Relation r • A=B  D > ‘5’ (r)

  24. Basic operators • Project Operation(投影操作) • Notation:πA1, A2, …, Ak (r) where A1, A2 are attribute names and r is a relation name. • The result is defined as the relation of k columns obtained by erasing the columns that are not listed • Duplicate rows removed from result, since relations are sets • Example: For relation account (account_number, branch_name, balance) To eliminate the branch_name attribute of account πaccount_number, balance (account)

  25. A B C     10 20 30 40 1 1 1 2 A C A C     1 1 1 2    1 1 2 Basic operators • Project Operation(投影操作) Relation r: πA,C (r) =

  26. Basic operators • Union Operation(并操作) • Notation: r s r union s • Defined as: r s = {t | t  r t  s} • For r s to be valid: 1. r,s must have the same degree(列数相同) 2. The attribute domains must be compatible (域相容) • Example: find all customers with either an account or a loanπcustomer_name (depositor)  πcustomer_name (borrower)

  27. A B A B    1 2 1   2 3 s r A B     1 2 1 3 Basic operators • Union Operation(并操作) Relations r, s: r s: The 1st column of r deals with the same type of values(domain) as does the 1st column of s ,and 2nd column also, and also for all attributes.

  28. Basic operators • Difference Operation(差操作) • Notation: r – sr minus s • Defined as: r – s = {t | t r t  s} • Set differences must be taken between compatible relations. • r and s must have the same degree • attribute domains of r and s must be compatible

  29. A B A B    1 2 1   2 3 s r A B   1 1 Basic operators • Difference Operation (差操作) Relations r, s: r – s :

  30. Basic operators • Cartesian-Product Operation(笛卡尔积操作) • Notation: r × s r times s • Defined as: r×s = {tq | t  r q  s} • Assume that attributes of r(R) and s(S) are disjoint. (That is, R  S =  , NO same attribute in R and S). • If attributes of r(R) and s(S) are not disjoint, then renaming must be used.

  31. A B C D E   1 2     10 10 20 10 a a b b r s A B C D E         1 1 1 1 2 2 2 2         10 19 20 10 10 10 20 10 a a b b a a b b Basic operators • Cartesian-Product Operation(笛卡尔积操作) Relations r, s: r×s:

  32. Basic operators • Rename Operation(重命名操作) • Allows us to name, and therefore to refer to, the results of relational-algebra expressions. • Allows us to refer to a relation by more than one name. • Example: x (E) returns the expression E under the name x • If a relational-algebra expression E has degree n, then x(A1, A2, …, An) (E) A1, A2, …., Anreturns the result of expression E under the name x, and with the attributes renamed to A1, A2, …., An.

  33. Example Queries • Find the largest account balance.  A1.balance >A2.balance (A1(account)×A2(account) ) ? 最小值不被选中! πbalance(account) – πA1.balance (A1.balance <A2.balance(A1(account)×A2(account) ) )

  34. Database Systems

  35. Unit 2 The Relational Model • Relational Model • Relational Algebra(关系代数) • Relational Calculus(关系演算)

  36. Relational Algebra expression • A basic expression in the relational algebra consists of either one of the following: • A relation in the database • A constant relation • Let E1 and E2 be relational-algebra expressions; the following are all relational-algebra expressions: • E1 E2 • E1–E2 • E1×E2 • p(E1), P is a predicate on attributes in E1 • πs(E1), S is a list consisting of some of the attributes in E1 • x(E1), x is the new name for the result of E1

  37. Example Queries • Banking DataBase: branch (branch_name, branch_city, assets) customer (customer_name, customer_street, customer_city) depositor (customer_name, account_number) account (account_number, branch_name, balance) borrower(customer_name, loan_number) loan (loan_number, branch_name, amount)

  38. Example Queries • Find the loan number for each loan of an amount greater than $1200. πloan_number( ) amount> ‘1200’ (loan) 注:除特别要求外,写查询表达式时不需要写出结果集!

  39. Example Queries • Find the customers name who have at least one deposit of a balance greater than $700. πCustomer_name(account.account_number = depositor.account_number( ) ) A1: balance>’700’(account) ×depositor πaccount_number() πCustomer_name(account.account_number = depositor.account_number  balance>’700’( account ×depositor ) ) A2:

  40. Example Queries • Find all customers who have at least two deposits. πD1.cname(D1.cname=D2.cname ∧ D1.a#<>D2.a#( ) ) D1(cname,a#)(depositor) × D2(cname,a#)( depositor)

  41. Example Queries • Find the largest account balance. A.b > B.b (A ×B)? A.b < B.b (A ×B)? πbalance(account) – πA1.balance (A1.balance <A2.balance(A1(account)×A2(account) ) )

  42. Additional Operations • We define additional operations that do not add any power to the relational algebra, but that simplify common queries. • Set intersection(集合交) • Natural join(自然连接) • Division(除法)

  43. Additional Operations • Intersection Operation(交操作) • Notation: r s • Assume: • r, s have the same degree • attributes of r and s are compatible • Defined as: rs ={ t | trts } • Note: rs = r - (r - s) • Find the names of all customers who have a loan and an account at bank. πcustomer_name (borrower )  πcustomer_name (depositor )

  44. R S= { | trR tsS tr[A] ts[B] } tr ts AθB A=B R S= { | trR tsS tr[A]= ts[B] } tr ts Additional Operations • -Join Operation(-连接) • -Join is a general join which don’t need there are same attribute for both operators, and the join condition can be any compare operations. That is: R times S WHERE F • Equi-join is another join. Its join condition is equal operator.

  45. A R.B C S.B E a1 b1 5 b2 7 a1 b1 5 b3 10 a1 b2 6 b2 7 R S RS a1 b2 6 b3 10 B = B C < E a2 b3 8 b3 10 Additional Operations • -Join Operation(-连接) Relations r, s: R S

  46. Additional Operations • Natural join Operation(自然连接) • Notation: r s • Let r and s be relations on schemas R and S respectively.The result is a relation on schema R S which is obtained by considering each pair of tuples tr from r and ts from s. • If tr and ts have the same value on each of the attributes in RS, a tuple t is added to the result, where • t has the same value as tr on r • t has the same value as ts on s • Example: R = (A, B, C, D) S = (E, B, D) • Result schema = (A, B, C, D, E) • rs is defined as: πr.A, r.B, r.C, r.D, s.E (r.B = s.B r.D = s.D (r ×s))

  47. B D E A B C D 1 3 1 2 3 a a a b b           1 2 4 1 2      a a b a b r s r s A B C D E Additional Operations • Natural join Operation(自然连接) Relations r, s:

  48. πCustomer_name(account.account_number = depositor.account_number( ) ) A1: balance>’700’(account) ×depositor πaccount_number() πCustomer_name(account.account_number = depositor.account_number  balance>’700’( account ×depositor ) ) A2: account depositor Example Queries Additional Operations simplify the relational algebra expressions ! • Find the customers name who have at least one deposit of a balance greater than $700. πCustomer_name(balance>’700’()) A3:

  49. × D2(depositoraccount) D1(depositoraccount) Example Queries • (exercise)Find all customers who have at least two deposits in different branch. πD1.Customer_name( D1.Customer_name =D2. Customer_name  D1. Account_number<>D2. Account_number  D1.branch_name<>D2. branch_name( )

  50. Example Queries • (exercise)Find all customers who have at least two deposits in different branch. πD1.cname(D1.cname=D2.cname  D1.a#<>D2.a# D1.bname<>D2.bname( ) D2(cname,a#,bname)(πdepositor.customer_name,account_number,branch_name depositor.account_number=account.account_number(depositor× account))× D1(cname,a#,bname)(πdepositor.customer_name,account_number,branch_name depositor.account_number=account.account_number(depositor× account))

More Related