Schema refinement
Download
1 / 24

Schema Refinement - PowerPoint PPT Presentation


  • 79 Views
  • Uploaded on

Schema Refinement. Learning Objectives. Identify update, insertion and deletion anomalies Identify possible keys given an instance Identify possible functional dependencies in a relation Determine all keys in a schema. What is Schema Refinement?.

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about ' Schema Refinement' - omar-frederick


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
Schema refinement

Schema Refinement

SHIRAJ MOHAMED M | MIS


Learning objectives
Learning Objectives

  • Identify update, insertion and deletion anomalies

  • Identify possible keys given an instance

  • Identify possible functional dependencies in a relation

  • Determine all keys in a schema

SHIRAJ MOHAMED M | MIS


What is schema refinement
What is Schema Refinement?

  • Schema Refinement is the study of what should go where in a DBMS, or, which schemas are best to describe an application.

  • For example, consider this schema

  • Versus this one:

  • Which schema do you think is best? Why?

EID Name DeptID DeptName

A01 Ali 12 Wing

A12 Eric 10 Tail

A13 Eric 12 Wing

A03 Tyler 12 Wing

EmpDept

EID Name DeptID

A01 Ali 12

A12 Eric 10

A13 Eric 12

A03 Tyler 12

DeptID DeptName

12 Wing

10 Tail

Emp

Dept

SHIRAJ MOHAMED M | MIS


What s wrong
What’s wrong?*

  • The first problem students usually identify with the EmpDept schema is that it combines two different ideas: employee information and department information. But what is wrong with this?

  • If we separated the two concepts we could save space.

  • Combining the two ideas leads to some badanomalies.

  • These two problems occur because DeptID determines DeptName, but DeptID is not a key. Let’s look into the anomalies further.

SHIRAJ MOHAMED M | MIS


Anomalies redundancy
Anomalies, Redundancy*

EID Name DeptID DeptName

A01 Ali 12 Wing

A12 Eric 10 Tail

A13 Eric 12 Wing

A03 Tyler 12 Wing

EmpDept

  • What anomalies are associated with EmpDept?

  • Update Anomalies:

    If the Wing department changes its name, we must change multiple rows in EmpDept

  • Insertion Anomalies:

    If a department has no employees, where do we store its name?

  • Deletion Anomalies:

    If A12 Eric quits, the information about the Tail department will be lost.

SHIRAJ MOHAMED M | MIS


Practice anomalies redundancies
Practice Anomalies, Redundancies*

  • Identify anomalies associated with this schema. Include update, insertion and deletion anomalies.

    EnrollStud(StudID, ClassID, Grade, ProfID, StudName)

  • Why do these anomalies occur?

SHIRAJ MOHAMED M | MIS


Practice anomalies redundancies1
Practice Anomalies, Redundancies*

  • Update Anomaly: If a student changes his name, we must change each row for which the student has taken a class. If a class changes the profID, we must change it for every row in which the class appears.

  • Insertion Anomaly: If a student has not taken a class, where do we store her name? If a class has no student grades recorded yet, where do we store its ProfID?

  • Deletion Anomaly: If a student drops her last course, the information about the student’s name will be lost. If the last student drops the course, the info about the ProfID will be lost.

SHIRAJ MOHAMED M | MIS


Decomposition a good solution
Decomposition: A good solution

  • The intergalactic standard solution to the redundancy problem is to decompose redundant schemas, e.g., EmpDept becomes

  • The secret to understanding when and how to decompose schemas is Functional Dependencies, a generalization of keys.

  • When we say "X determines Y" we are stating a functional dependency.

EID Name DeptID

A01 Ali 12

A12 Eric 10

A13 Eric 12

A03 Tyler 12

DeptID DeptName

12 Wing

10 Tail

Emp

Dept

SHIRAJ MOHAMED M | MIS


Review keys
Review Keys

EID Name DeptID DeptName

A01 Ali 12 Wing

A12 Eric 10 Tail

A13 Eric 12 Wing

A03 Tyler 12 Wing

EmpDept

  • Note that EID being a key* of EmpDept means that the values of EID are unique, and EID is minimal.

  • Remember: you cannot determine keys from an instance, only from “natural” information or from a domain expert.

  • Let’s practice keys by identifying possible keys in an instance.

    *sometimes called a candidate key

SHIRAJ MOHAMED M | MIS


Identify possible keys
Identify Possible Keys*

  • Identify all possible Keys based on this instance:

Time Flight Plane Origin Destination

9:57AM 157 abc SEA PDX

10:42AM 233 def PDX SEA

11:44AM 155 des ORD ATL

12:44PM 244 xdy ATL PDX

1:43PM 074 xyz SEA ATL

2:44PM 233 def PDX ATL

3:55PM 455 eff MSP SEA

5:44PM 120 ikk MSP PDX

7:55PM 233 abf CHI SEA

SHIRAJ MOHAMED M | MIS


Identify possible keys1
Identify Possible Keys*

  • Identify all possible Keys based on this instance:

Time Flight Plane Origin Destination

9:57AM 157 abc SEA PDX

10:42AM 233 def PDX SEA

11:44AM 155 des ORD ATL

12:44PM 244 xdy ATL PDX

1:43PM 074 xyz SEA ATL

2:44PM 233 def PDX ATL

3:55PM 455 eff MSP SEA

5:44PM 120 ikk MSP PDX

7:55PM 233 abf CHI SEA

SHIRAJ MOHAMED M | MIS

Possible keys are:

{Time}, {Plane, Dest}, {Origin, Dest}


Functional dependencies
Functional Dependencies

EID Name DeptID DeptName

A01 Ali 12 Wing

A12 Eric 10 Tail

A13 Eric 12 Wing

A03 Tyler 12 Wing

EmpDept

  • A key like EID has another property: If two rows have the same EID, then they have the same value of every other attribute. We say EID functionally determines all other attributes and write this Functional Dependency (FD):

    EID Name, DeptID, DeptName

  • Is Name DeptID true?

    • No, because rows 2 and 3 have the same Name but not the same DeptID.

SHIRAJ MOHAMED M | MIS


Functional dependencies ctd
Functional Dependencies, ctd.

EID Name DeptID DeptName

A01 Ali 12 Wing

A12 Eric 10 Tail

A13 Eric 12 Wing

A03 Tyler 12 Wing

EmpDept

  • Do you see any more FDs in EmpDept?

    • Yes, the FD DeptID  DeptName

  • DEFINITION: If A and B are sets of attributes in a relation, we say that A (functionally) determines B, or AB is a Functional Dependency (FD) if whenever two rows agree on A, they agree on B. In other words, the value of a row on A functionally determines its value on B.

  • There are two special kinds of FDs:

    • Key FDs, XA where X contains a key

    • Trivial FDs, such as NameName, or Name,DeptIDDeptID

SHIRAJ MOHAMED M | MIS


Identify natural fds
Identify (natural) FDs*

  • What are the (natural) FDs in these relations? Identify the key FDs but ignore trivial FDs

    Customer(CustID, Address, City, Zip, State)

    EnrollStud(StudID, ClassID, Grade, ProfID, StudName, ProfName)

SHIRAJ MOHAMED M | MIS


Identify natural fds1
Identify (natural) FDs*

  • What are the (natural) FDs in these relations? Identify the key FDs but ignore trivial FDs

    Customer(CustID, Address, City, Zip, State)

  • CustID -> Address, City, Zip, State. This is a key FD

  • Address, City, State -> Zip

  • Zip -> State

    EnrollStud(StudID, ClassID, Grade, ProfID, StudName, ProfName)

  • {studID,ClassID}->grade, ProfID, StudName,ProfName. This is a key FD

  • StudID -> StudName

  • ClassID -> ProfID,ProfName

  • ProfID -> ProfName

SHIRAJ MOHAMED M | MIS


What are fds
What are FDs?

  • An FD is a generalization of the concept of key.

  • FDs, like keys and foreign keys, are a kind of integrity constraint (IC).

  • Like other ICs, FDs are part of a relation’s schema.

  • For example, a schema might be:

    Assigned(EmpID Int,

    JobID Int,

    EmpName varchar(20),

    percent real,

    EmpID references… , JobID references…,

    PRIMARY KEY (EmpID, JobID))

    FDs: EmpIDEmpName

SHIRAJ MOHAMED M | MIS


How to determine fds
How to determine FDs

  • So far we have dealt with “natural” FDs. Sometimes it’s not clear what FDs apply in a relation, e.g., zip codes vs cities, or

    Supplier(Name, Address, Crating, Discount) – unclear what are the FDs.

  • There are two ways to determine FDs

    • Infer them as “natural” FDs from your experience

    • You may be given them as part of the schema, by the instructor or by the customer.

  • As with keys, you cannot determine FDs from an instance!

    • But you can tell if something is not an FD

SHIRAJ MOHAMED M | MIS


Lo8 3 identify possible fds
LO8.3:Identify Possible FDs*

  • Identify two possible non-key FDs based on this instance (identical to slide 10). Remember the possible keys for this instance are {Time}, {Plane, Dest}, {Origin, Dest}

Time Flight Plane Origin Destination

9:57AM 157 abc SEA PDX

10:42AM 233 def PDX SEA

11:44AM 155 des ORD ATL

12:44PM 244 xdy ATL PDX

1:43PM 074 xyz SEA ATL

2:44PM 233 def PDX ATL

3:55PM 455 eff MSP SEA

5:44PM 120 ikk MSP PDX

7:55PM 233 abf CHI SEA

SHIRAJ MOHAMED M | MIS


Lo8 3 identify possible fds1
LO8.3:Identify Possible FDs*

  • Identify two possible non-key FDs based on this instance (identical to slide 10). Remember the possible keys for this instance are {Time}, {Plane, Dest}, {Origin, Dest}

Time Flight Plane Origin Destination

9:57AM 157 abc SEA PDX

10:42AM 233 def PDX SEA

11:44AM 155 des ORD ATL

12:44PM 244 xdy ATL PDX

1:43PM 074 xyz SEA ATL

2:44PM 233 def PDX ATL

3:55PM 455 eff MSP SEA

5:44PM 120 ikk MSP PDX

7:55PM 233 abf CHI SEA

SHIRAJ MOHAMED M | MIS

Possible FDs are Plane -> Flight and Plane -> Orig


Reasoning about fds
Reasoning about FDs

EmpDept(EID, Name, DeptID, DeptName)

  • Two natural FDs are

    EIDDeptID and DeptIDDeptName

  • These two FDs imply the FD EIDDeptName

    • Because if two tuples agree on EID, then by the first FD they agree on DeptID, then by the second FD they agree on DeptName.

  • The set of FDs implied by a given set F of FDs is called the closure of F and is denoted F+

SHIRAJ MOHAMED M | MIS


Armstrong s axioms
Armstrong’s Axioms

  • The closure of F can be computed using these axioms

    • Reflexivity: If X  Y, then XY

    • Augmentation: If XY, then XZYZ for any Z

    • Transitivity: If XY and YZ then XZ

  • Armstrong’s axioms are sound (they generate only FDs in F+ when applied to FDs in F) and complete (repeated application of these axioms will generate all FDs in F+).

SHIRAJ MOHAMED M | MIS


Determining keys
Determining Keys

  • In order to determine if X is a key of a relation R, use this algorithm, which computes the attribute closure of X:

    AttClos = X; // Note: X is a set of attributes

    Repeat until there is no change 

    If there is an FD UV with U  AttClos, then set AttClos = AttClos ∪ V

    AttClos=R if and only if X is a key

SHIRAJ MOHAMED M | MIS


Determining the keys of r
Determining the keys of R*

  • Given the schema: R(A,B,C,D,E) BCA, DEC .

  • What are all the keys of this schema?

  • Hint: any key must include A, BC or DE. Why?

SHIRAJ MOHAMED M | MIS


Determining the keys of r1
Determining the keys of R*

Any key must include A, BC or DE because otherwise the Attribute Closure algorithm will never get started.

Determining the keys will be done in three steps, one for A, one for BC and one for DE.

1. A: it is already a key so we are done with this step, A is a key..

2. BC determines A which determines everything else so we are done, BC is a key

3. DE->DEC, dead end, so DE is not a key. Let’s try adding attributes to DE, in alphabetical order.

We can’t add A to DE, since the result would not be minimal (A is a key)

DEB->DEBC contains a key so DEB is a key

DEC, dead end, so DEC is not a key, and we can’t add anything to it to make a key (A and B would make it a key we have already seen).

Conclusion: The keys are A, BC and DEB. Notice how systematic we were. You’ll need it for the exercises and homework.

SHIRAJ MOHAMED M | MIS

R(A,B,C,D,E) BCA, DEC .


ad