CS 157B Midterm 1 Revision

CS 157B Midterm 1 Revision Prof. Sin Min Lee

Q # 1 (01) 1. Characterize the difference between the following pairs of terms a. Entity and entity class b. Relationship and relationship type c. Attribute value and attribute d. Strong entity class and weak entity class 2. Create an ER model for the following enterprise: Each building in an organization has a different building name and address. The meeting rooms in each building have their own room numbers and seating capacities. Rooms may be reserved for meetings, and each meeting must start on the hour. The hour and length of use are recorded. Each reservation is made by a group in the company. Each group has a group number and a contact phone.

Q# 1 (01): Solutions a. Entity and entity class An entity is an object and an entity class is a collection of objects. b. Relationship and relationship type A relationship is a specific instance of two related entities. A relationship type represents the possibility that two entities are related. c. Attribute value and attribute An attribute is a characteristic shared by entities of a class. An attribute value is the particular value of the attribute for a specific entity. d. Strong entity class and weak entity class A strong entity class is one with a key and a weak entity class has no key of its own.

Q # 1 (01): Solutions

Q #1 (02) 1. Consider the following E-R diagram and attributes Buyer: id, name, address Property: id, askingPrice, salePrice, address StaffMember: id, lastName, firstName, phone Branch: id, name, address, phone a. Write sentences to describe the roles of staff members in the diagram. b. Write sentences to describe the roles of property in the diagram. 2. Explain the following constraints. Domain constraint Key constraint Cardinality constraint Participation constraint

Q#1 (02) Solution 1. a. Write sentences to describe the roles of staff members in the diagram. A staff member may sell many properties. A staff member may work at a branch. A staff member may mange a branch b. Write sentences to describe the roles of property in the diagram. A property may be bought by one buyer A property may be sold by many staff members 2. Domain constraint : restrictions on the allowable values of attributes - the type and the range of values possible for each attribute Key constraint : unique value in an entity class-most entity classes have one or more attributes that form a key of the class Cardinality constraint : the number of relationships that may exist at a time – (to-one and to-many) Participation constraint : whether at least one relationship must exist for each entity of a class.

Q #2 -01Create a relational database schema for the following ER model. hour length number capacity type date name M M 1 Is Room Uses Meeting In M 1 Has Building contact 1 phone Group address name number

Q#2 -01 : solution Building (name, address) Room (number, capacity, type, buildingName) Meeting (name, date, hour, length, roomNo, byGroup) Group (number, contact, phone)

Q # 2 (02) 1. Create a relational database schema for the following EER model. ssn firstName lastName

Q # 2 (02) 2. Suppose R(A,B,C,D,E,F,G,H) is a relation. Suppose {A,B}{C,D,E,F,G,H}, C{B,E,F}, and GH. Which sets of attributes are the keys of R? Identify and eliminate any 2NF violations. Identify and eliminate any 3NF violations.

Q # 2 (02): solution One of the possible solutions Vehicle (vehicleId, weight, capacity, numberWheels) Car (vehicleId, numberDoors, amenities) Truck (vehicleId, weightCapacity, numberAxles) Motorcycle (vehicleId, engineType, usage) Owner (ssn, firstName, lastName) Has (ssn, vehicleId)

Q # 2 (02): solution 2. a. Which sets of attributes are the keys of R? {A, B} and {A, C} b. Identify and eliminate any 2NF violations. C  {E, F} is a 2NF violation The revised schema is R9: (A, B, C, D, G, H), {A, C} is also a key R10: (C, E, F) c. Identify and eliminate any 3NF violations. G  H is a 3NF violation The revised schema is R10, R11: (A, B, C, D, G), {A, C} is also a key R12: (G, H)

Q# 3 (01) 1. Write SQL statements to create the following two tables (including key constraints and referential constraints) 2. Write a SQL select statement to find the names of Employees who work in ‘Research’ department. 3. Write a SQL statement to insert a new department with Dnumber=6 and Dname=“sales”.

Q # 3 (01): Solutions 1. Create table Department ( Dnumber int primary key, Dname varchar(20), MgrSSN int references Employee ); Create table Employee ( SSN int primary key, Fname varchar(20), Lname varchar(20), Sex char(1), Salary int, SuperSSN int references Employee, Dno int references Department ); 2. Select Fname, Lname from Employee, Department where Dname=‘Research’ and Dnumber = Dno; 3. Insert into Department (Dnumber, Dname) values (6, “Sales”);

Q # 3 (02) 1. Write a SQL select statement to find the names of all employees whose salaries are greater than 30000. 2. Write a SQL select statement to find the names of departments for which female employees work. 3. Write a SQL statement to delete all the employees who work at Dno 4.

QUIZ # 3 (02): Solution 1. Select Fname, Lname From Employee Where Salary > 30000; 2. Select Dname From Employee, Department Where Sex=‘F’ and Dno=Dnumber; 3. Delete * From Employee Where Dno=4;

Manipulating Information with the Relational Algebra [Ch. 6.1] • Relation is a set of tuples and that each tuple in a relation has the same number and types of attributes. Relational algebra includes : • Selection Operators • Projection Operators • Set Operators • Join and product Operations

Selection Operators () • Reduce the number of tuples in a set by selecting those that satisfy some criteria. • Example : lastName = ‘Doe’ (Customer) [ Select from Customer where lastName = ‘Doe’ ] Customer

Projection Operators () • Reduce the size of each tuple in a set by eliminating specific attributes. • Example : lastName, firstNAme (Customer) [ project customer onto (lastName, firstName) ] Customer

Set Operators (  -) • Manipulate two similar sets of tuples by combining or comparing. • Example : Rental  PreviousRental Rental PreviousRental

Set Operators (  -) ...con’t • The union of two relations is a relation that contains the set of each tuple that is in at least one of the input relations. • Partial result of the Rental  PreviousRental

Set Operators (  -) ...con’t • The intersection of two relations is the set of all tuples that occur in both input relations. • The intersection of the relations Rental  PreviousRental in the previous example will return an empty set. • Another example would be the intersection between the video IDs of the two tables. •  videoId (Rental)   videoId (PrevioutsRental) = Videotapes that are currently rented as well as those that have been rented before. • The set of all videotapes that have been rented previously but are not currently rented is expressed as follows:  videoId (PreviousRental) -  videoId (Rental)

Join and Product Operations () • Increase the size of each tuple by adding attributes • The Cartesian product produces a tuple of the new realtion for each combination of one tuple from the left operand and one tuple from the right operand. Example : Employee  TimeCard Employee TimeCard

Join and Product Operations () ...con’t • The result of this operation has 30 tuples because there are 5 Employee and 6 TimeCard. • Partial result of Cartesian product Employee  TimeCard

Join and Product Operations () ...con’t • A selection of those tuples where Employee.ssn equals TimeCard.ssn can be expressed by : Employee.ssn = TimeCard.ssn (Employee  TimeCard) • This type of product is called a join. The join operation puts together related objects from two relations. • A Natural Join however is defined so that the shared attribute appears only once in the output table. • Ref. textbook Table 6.6 [natural join] vs Table 6.7 [join]

Using reflexivity, we can generate all trivial dependencies R = ( A, B, C ) F = { A  B, B  C } F+ = { A  A, B  B, C  C, AB  AB, BC  BC, AC  AC, ABC  ABC, AB  A, AB  B, BC  B, BC  C, AC  A, AC  C, ABC  AB, ABC  BC, ABC  AC, ABC  A, ABC  B, ABC  C, A  B, … (1) ( given ) B  C, … (2) ( given ) A  C, … (3) ( transitivity on (1) and (2) ) AC  BC, … (4) ( augmentation on (1) ) AC  B, … (5) ( decomposition on (4) ) A  AB, … (6) ( augmentation on (1) ) AB  AC, AB  C, B  BC, A  AC, AB  BC, AB  ABC, AC  ABC, A  BC, A  ABC }

Example: F = { A  B, B  C } A+ = ABC B+ = BC C+ = C AB+ = ABC

R = ( A, B, C, G, H, I ) F = { A  B, A  C, CG  H, CG  I, B  H } To compute AG+ closure = AG closure = ABG ( A  B ) closure = ABCG ( A  C ) closure = ABCGH ( CG  H ) closure = ABCGHI ( CG  I ) Is AG a candidate key? AG R A+ R ? G+ R ?

In general, suppose X  A violates BCNF, then one of the following holds • X is a subset of some key K: we store ( X, A ) pairs redundantly. • X is not a subset of any key: there is a chain K  X  A ( transitive dependency )

Third Normal Form A relation R is in 3NF if, for all X  A that holds over R • A  X ( i.e., X  A is a trivial FD ), or • X is a superkey, or • A is part of some key for R • The definition of 3NF is similar to that of BCNF, with the only difference being the third condition. • Recall that a key for a relation is a minimal set of attributes that uniquely determines all other attributes. • A must be part of a key (any key, if there are several). • It is not enough for A to be part of a superkey, because this condition is satisfied by every attribute. If R is in BCNF, obviously it is in 3NF.

Suppose that a dependency X  A causes a violation of 3NF. There are two cases: • X is a proper subset of some key K. Such a dependency is sometimes called a partial dependency. In this case, we store (X,A) pairs redundantly. • X is not a proper subset of any key. Such a dependency is sometimes called a transitive dependency, because it means we have a chain of dependencies K  XA.

Key Attributes X Attributes A A not in a key Partial Dependencies Key Attributes X Attributes A A not in a key Key Attributes A Attributes X A in a key Transitive Dependencies

Motivation of 3NF • By making an exception for certain dependencies involving key attributes, we can ensure that every relation schema can be decomposed into a collection of 3NF relations using only decompositions. • Such a guarantee does not exist for BCNF relations. • It weaken the BCNF requirements just enough to make this guarantee possible. • Unlike BCNF, some redundancy is possible with 3NF. • The problems associate with partial and transitive dependencies persist if there is a nontrivial dependency XA and X is not a superkey, even if the relation is in 3NF because A is part of a key.

Reserves • Assume: sid  cardno (a sailor uses a unique credit card to pay for reservations). • Reserves is not in 3NF • sid is not a key and cardno is not part of a key • In fact, (sid, bid, day) is the only key. • (sid, cardno) pairs are redundantly.

Reserves • Assume: sid  cardno, and cardno  sid (we know that credit cards also uniquely identify the owner). • Reserves is in 3NF • (cardno, sid, bid) is also a key for Reserves. • sid  cardno does not violate 3NF.

Decomposition • Decomposition is a tool that allows us to eliminate redundancy. • It is important to check that a decomposition does not introduce new problems. • A decomposition allows us to recover the original relation? • Can we check integrity constraints efficiently?

Supply sid status city part_id qty A set of relation schemas { R1, R2, …, Rn }, with n  2 is a decomposition of R if R1  R2  …  Rn = R Supplier sid status city and SP sid part_id qty

Problems with decomposition • Some queries become more expensive. • Given instances of the decomposed relations, we may not be able to reconstruct the corresponding instance of the original relation – information loss. • Checking some dependencies may require joining the instances of the decomposed relations.

Lossless Join Decomposition The relation schemas { R1, R2, …, Rn } is a lossless-join decomposition of R if: for all possible relations r on schema R, r = R1( r )  R2( r ) …  Rn( r )

Example: a lossless join decomposition IN sid sname sid sname major Student IM sid major Student IN ‘Student’ can be recovered by joining the instances of IN and IM IM

Example: a non-lossless join decomposition IN sid major sid sname major Student sname major IM Student IN IM Student = IN  IM????

IN IM IN  IM Student The instance of ‘Student’ cannot be recovered by joining the instances of IM and NM. Therefore, such a decomposition is not a lossless join decomposition.

Theorem: R - a relation schema F - set of functional dependencies on R The decomposition of R into relations with attribute sets R1, R2is a lossless-join decomposition iff ( R1  R2 )  R1  F+ OR ( R1  R2 )  R2  F+ i.e., R1  R2 is a superkey for R1 or R2. (the attributes common to R1 and R2 must contain a key for either R1 or R2 ).

Example • R = ( A, B, C ) • F = { A  B } • R = { A, B } + { A, C } is a lossless join decomposition • R = { A, B } + { B, C } is not a lossless join decomposition

N = 5 R(A1, A2, A3, A4, A5) M=3 R1(A1, A3) R2(A2, A4) R3(A1, A2, A5) # of FD P=5 FD1, A1 -> A2 FD2, A2, A3 -> A4 FD3, A5 ->A1 FD4, A2, A4 -> A1, A3 FD5, A3 -> A2, A4

S-matrix 5x3-matrix A1 A2 A3 A4 A5 R1 a1 b(1,2) a3 b(1,4) b(1,5) R2 b(2,1) a2 b(2,3) a4 b(2,5) R3 a1 a2 b(3,3) b(3,4) a5

FD1. A1->A2 a1 a2 a3 b(1,4) b(1,5) b(2,1) a2 b(2,3) a4 b(3,5) a1 a2 b(3,3) b(3,4) a5 FD2 A2,A3 -> A4 a1 a2 a3 b(1,4) b(1,5) b(2,1) a2 b(2,3) a4 b(3,5) a1 a2 b(3,3) b(3,4) a5

FD3 A5 -> A1 a1 a2 a3 b(1,4) b(1,5) b(2,1) a2 b(2,3) a4 b(3,5) a1 a2 b(3,3) b(3,4) a5 FD4 A2,A4 -> A1,A3 a1 a2 a3 b(1,4) b(1,5) b(2,1) a2 b(2,3) a4 b(3,5) a1 a2 b(3,3) b(3,4) a5

FD5 A3 -> A2, A4 a1 a2 a3 b(1,4) b(1,5) b(2,1) a2 b(2,3) a4 b(3,5) a1 a2 b(3,3) b(3,4) a5 // check if any single row has all “a” value // if not it is not a good decomposition

CS 157B Midterm 1 Revision

CS 157B Midterm 1 Revision

Presentation Transcript

CS 280 Midterm Review

Midterm 2 Revision 2

Relational Data Model CS 157B

Midterm 3 Revision 2

CS 2800 Midterm Review

CS 135 Midterm 2

CS 499 Midterm Presentation

Midterm 1

CS 3843 Midterm Review

CS 61B Midterm Review

Midterm 3 Revision and ID3

CS 490JL Midterm Review

Revision of Midterm 2

Midterm 3 Revision and ID3

Midterm 2 Revision

Midterm 3 Revision

Midterm 2 Revision

Midterm 3 Revision

CS 231 Midterm 1

CS 157B Spring 2008

Midterm 1

CS/SE 157B Database Management Systems II February 1 Class Meeting