keys and functional dependency
Download
Skip this Video
Download Presentation
Keys and Functional Dependency

Loading in 2 Seconds...

play fullscreen
1 / 60

Keys and Functional Dependency - PowerPoint PPT Presentation


  • 100 Views
  • Uploaded on

CS157A. Lecture 14. Keys and Functional Dependency. Prof. Sin-Min Lee Department of Computer Science San Jose State University. Data Normalization. Primarily a tool to validate and improve a logical design so that it satisfies certain constraints that avoid unnecessary duplication of data.

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'Keys and Functional Dependency' - maj


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
keys and functional dependency

CS157A

Lecture 14

Keys and Functional Dependency

Prof. Sin-Min Lee

Department of Computer Science

San Jose State University

data normalization
Data Normalization
  • Primarily a tool to validate and improve a logical design so that it satisfies certain constraints that avoid unnecessary duplication of data.
  • The process of decomposing relations with anomalies to produce smaller, well-structured relations.
  • Primary Objective: Reduce Redundancy,Reduce nulls,
  • Improve “modify” activities:
    • insert,
    • update,
    • delete,
    • but not read
  • Price: degraded query, display, reporting
functional dependency and keys
Functional Dependency and Keys
  • Functional Dependency: The value of one attribute (the determinant) determines the value of another attribute.
  • Candidate Key: Each non-key field is functionally dependent on every candidate key.
functional dependency
Functional dependency
  • a constraint between two attributes (columns) or two sets of columns
  • A  B if “for every valid instance of A, that value of A uniquely determines the value of B”
  • or …A B if “there exists at most one value of B for every value of A”
functional dependencies
Functional Dependencies

R

X Y Z

  • FDs defined over two sets of attributes: X, Y Ì R
  • Notation: X à Y reads as “X determines Y”
  • If X à Y, then all tuples that agree on X must also agree on Y

1 2 3

2 4 5

1 2 4

1 2 7

2 4 8

3 7 9

slide7

Functional Dependencies (example)

X Y Z

X Y Z

1 2 3

2 4 5

1 2 4

1 2 7

2 4 8

3 7 9

functional dependency1
… functional dependency
  • some examples
  • SSN  Name, Address, Birthdate
  • VIN  Make, Model, Color
  • note: the LHS is the determinant
  • so functional dependency is the technical term for determines
candidate keys
Candidate Keys
  • an attribute (or set of attributes) that uniquely identifies a row
  • primary key is a special candidate key
    • values cannot be null
  • e.g.
    • ENROLL (Student_ID, Name, Address, …)
      • PK = Student_ID
      • candidate key = Name, Address
candidate key
… candidate key
  • a candidate key must satisfy:
    • unique identification.
      • implies that each nonkey attribute is functionally dependent on the key (for not(A  B) to be true, A must occur more than once (with a different B), or A must map to more than one B in a given row)
    • nonredundancy
      • no attribute in the key can be deleted and still be unique
      • minimal set of columns (Simsion)
keys and dependencies
keys and dependencies

EMPLOYEE1 (Emp_ID, Name, Dept_Name, Salary)

determinant

functional dependency

slide12

EMPLOYEE2 (Emp_ID, Course_Title, Name, Dept_Name, Salary, Date_Completed)

not fully functionally dependant on the primary key

determinants candidate keys
determinants & candidate keys
  • candidate key is always a determinant (one way to find a determinant)
  • determinant may or may not be a candidate key
  •  candidate key is a determinant that uniquely identifies the remaining (nonkey) attributes
  • determinant may be
    • a candidate key
    • part of a composite candidate key
    • nonkey attribute
introduction
Introduction
  • Data integrity maintained by various constraints on data
  • Functional dependencies are application constraints that help DB model real-world entity
  • Join dependencies are a further constraint that help resolve some FD constraint limitations
normal forms provide database designers with
Normal Forms provide database designers with:
  • A formal framework for analyzing relation schemas based on their keys and on the functional dependencies among their attributes.
  • A series of tests that can be carried out on individual relation schemas so that the relational database can be normalized to any degree.
slide21
Keys
  • superkey:a superkey is a set of attributes S  R={A1,A2,….An} with the property that no two tuples t1 and t2 in any relation state r of R will have t1[S] = t2[S].
  • A key K is a superkey with the additional property that removal of any attribute from K will cause K not to be a superkey anymore.
slide22
Keys
  • The difference between a key and a superkey is that a key has to be “minimal”.
  • Example:
  • {SSN} is a key for EMPLOYEE, whereas {SSN}, {SSN,ENAME}, {SSN, ENAME, BDATE} are all superkeys.
slide23
Keys
  • If a relation schema has more than one “minimal” key, each is called a candidate key.
slide24
Keys
  • one of the candidate keys is designated to be the primary key.
  • Each relation schema must have a primary key.
  • For example, {SSN} is the only candidate key for EMPLOYEE, so it is also the primary key.
slide25

R(A B C D E)

  • FD1. A -> C
  • FD2. BC ->D
  • FD3. E ->AB
  • result = A
  • By FD1. A -> C A result
  • result = {A, C}

By FD2. BC -> D BC result

result = {A, C}

By FD3. E ->AB E result

result = {A, C}

 {A}+ = {A, C}

slide26

Similarly {B}+ = {B}

  • {C}+ = {C}
  • {D}+ = {D}
  • {E}+ = {E,A,B,C,D}
  • E is a candidate key

Now, we see

{AB}+ = {ABCD} {AC}+ = {AC} {AD}+ = {ACD}

{BC}+ = {BCD} {BD}+ = {BD} {CD}+ = {CD}

{ABC}+ = {ABCD} {ABD}+ = {ABCD} {BCD}+ = {BCD}

{ACD}+ = {ACD}

slide27

What is the largest normal form of this table?

R(A B C D E)

FD1. A ->C

FD2. BC ->D

FD3. E ->AB

Answer: {E} is the only candidate key of R

The non-prime attributes are: A, B, C, D

As FD!. A->C, we have transitive dependency.

Thus R(ABCD) is 2NF but not 3NF

what is normalization
What is Normalization?
  • The purpose of normalization is to produce a stable set of relations that is a faithful model of the operations of the enterprise. By following the principles of normalization, we can achieve a design that is highly flexible, allowing the model to be extended when needed to account for new attributes, entity sets, and relationships.
normal forms
Normal Forms
  • A relation is in specific normal form if it satisfies the set of requirements or constraints for that form. All of the normal forms are nested in that each satisfies the constraints of the previous one but is a "better" form because each eliminates flaws found in the previous
slide32
1NF
  • relation is in first normal form if it contains no multivalued attributes
  • remove repeating groups to a new table as already demonstrated, “carrying” the PK as a FK
first normal form 1nf
First Normal Form ( 1NF )
  • the domains of attributes must include only atomic(simple, indivisible) valuesand the value of any attribute in a tuple must be a single value from the domain of the attribute.
first normal form 1nf1
First Normal Form ( 1NF )
  • example:

Department

DNAME DNUMBER DMGRSSN DLOCATIONS

research 5 333445555 {Bellaire ,

Sugarland Houston}

Administration 4 987654321 {Stafford}

Headquarters 1 888665555 {Houston}

  • the domain of DLOCATIONS contains atomic values, but some tuples can have a set of these values. In this case,
    • DNUMBER x->DLOCATIONS.
  • The domain of DLOCATIONS contains sets of values and hence in non-atomic.
slide35

Our Example in 1NF

PROJ_NUM

PROJ_NAME

EMP_NUM

EMP_NAME

JOB_CLASS

CHG_HOUR

HOURS

  • Key (PROJ_NUM, EMP_NUM)
  • Given PROJ_NUM
    • PROJ_NAME is determined
  • Given EMP_NUM
    • EMP_NAME, JOB_CLASS, and CHG_HOUR are determined
slide36
2NF
  • a relation is in second normal form if it is in first normal form AND every nonkey attribute is fully functionally dependant on the primary key
  • i.e. remove partial functional dependencies, so no nonkey attribute depends on just part of the key
slide37

EMPLOYEE2 (Emp_ID, Course_Title, Name, Dept_Name, Salary, Date_Completed)

not fully functionally dependant on the primary key

second normal form 2nf
Second Normal Form ( 2NF )
  • it is based on the concept of full functional dependency.
  • A functional dependency XY is a full functional dependency , for any attribute A  X, {X - {A}}  Y.
second normal form
Second Normal Form
  • A relation is in second normal form (2NF) if and only if it is in first normal form and all the nonkey attributes are fully functionally dependent on the key.
slide40

Second Normal Form

  • A table is in second normal form (2NF) if:
    • It is in 1NF
    • It includes no partial dependencies. No attribute is dependent on only a portion of the primary key.
slide41
2NF
  • a relation is in 2NF if it is in 1NF and any one of these is true:
    • the PK consists of only 1 attribute
    • all attributes are part of the PK (no nonkey attributes)
    • every nonkey attribute is functionally dependant on the whole PK
slide42

2NF (Example)

A B C D

2 Candidate Keys

R with key{AB} is NOT 2NF

R with key{AC} is NOT 2NF

second normal form 2nf1
Second Normal Form ( 2NF )

fd1

fd2

  • {SSN, PNUMBER}HOURS is a fully dependency (neither SSNHOURS nor PNUMBERHOURS holds).

fd3

second normal form 2nf2
Second Normal Form ( 2NF )

EMP_PROJ

  • The functional dependencies fd1,fd2,fd3 lead to the decomposition of EMP_PROJ into the three relation schemas EP1,EP2,EP3, each of which is in 2NF.

fd1

fd2

fd3

2NF NORMALIZATION

EP2

EP3

EP1

fd2

fd1

fd3

1nf 2nf
1NF 2NF
  • EMPLOYEE2 (Emp_ID, Course_Title, Name, Dept_Name, Salary, Date_Completed)
  • EMPLOYEE1 (Emp_ID, Name, Dept_Name, Salary)
  • and
  • EMP_COURSE (Emp_ID, Course_Title, Date_Completed)
  • EMPLOYEE1 satisfies condition1
  • EMP_COURSE satisfies condition3
slide47
3NF
  • a relation is in third normal form if it is in 2NF, AND no transitive dependencies exist
  • transitive dependency is a functional dependency between nonkey attributes
transitive dependency
transitive dependency

transitive

dependency

transitive dependency1
… transitive dependency
  • same problems
    • insertion anomaly (no salesman without a customer)
    • deletion anomaly (if a salesman is assigned to only 1 customer, and the customer is deleted, we lose the salesman!)
    • modification (update) anomaly (reassign salesperson to region)
slide51

Converting to 2NF

  • To convert from 1NF to 2NF, list each key component and then the key itself.
  • Each component will become the key in a new table.
slide52

Our Example in 2NF

Table Name: PROJECT

PROJ_NUM

PROJ_NAME

Table Name: EMPLOYEE

EMP_NUM

EMP_NAME

JOB_CLASS

CHG_HOUR

Table Name: ASSIGN

PROJ_NUM

EMP_NUM

HOURS

slide53

Problems with 2NF

  • Transitive Dependency
    • An attribute that is dependent on a non-prime attribute exhibits transitive dependency.
    • Still leads to data anomalies.

EMP_NUM

EMP_NAME

JOB_CLASS

CHG_HOUR

Our example contains the transitive dependency:

JOB_CLASS -----> CHG_HOUR

second normal form1
Second Normal Form

Second normal form:

Let R’ be a relation, and let F be the set of governing FDs. An attribute belongs to R’ is prime if a key of R’ contains A. In other words, A is prime in R’ if there exists K<R’ such that (1) K->R’,

(2) for all B belongs to K, (K-B)->R’ not belongs to F+, and

(3) A belongs to K

third normal form
Third Normal Form

Third normal form:

Let R’ be a relation, a subset of the universal relation, in the context of a set of FDs F. R’ satisfies third normal form if for every nontrival X->A belong to F+, either

(1). X is superkey for R’ or

(2). A is a prime attribute in R’.

third normal form 3 nf
Third Normal Form ( 3 NF )
  • Third normal form is based on the concept of transitive dependency.
  • A functional dependency XY in a relation schema R is a transitive dependency if there is a set of attributes Z that is not a subset of any key of R, and both XZ and ZY hold.
third normal form 3 nf1
Third Normal Form ( 3 NF )

EMP_DEPT

ENAME SSN BDATE ADDRESS DNUMBER DNAME DMGRSSN

  • example:
  • the dependency SSNDMGRSSN is transitive through DNUMBER in EMP_DEPT, because both the dependencies SSNDNUMBER and DNUMBERDMGRSSN hold and DNUMBER is not a subset of the key of EMP_DEPT.
general definitions of second and third normal forms
General Definitions of Second and Third Normal Forms
  • A relation schema R is in second normal form (2NF) if every nonprime attribute A in R is fully functionally dependent on every key of R.
  • A relation schema R is in 3NF if, whenever a functional dependency X A holds in R, either
    • (a) X is a superkey of R.
    • (b) A is a prime attribute of R.
ad