Missing Information

1 / 24

# Missing Information - PowerPoint PPT Presentation

Missing Information. Stanislava Armstrong http://www.cs.nott.ac.uk/~saw/teaching/G64DBS/lecture_slides.html. Missing Information. Sometimes we don’t know what the value is for an entry in a relation We know that there is a value, but don’t know what it is

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.

## PowerPoint Slideshow about ' Missing Information' - ford

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript

### Missing Information

Stanislava Armstrong

http://www.cs.nott.ac.uk/~saw/teaching/G64DBS/lecture_slides.html

SQL Data Definition II

Missing Information
• Sometimes we don’t know what the value is for an entry in a relation
• We know that there is a value, but don’t know what it is
• There is no value at all that makes any sense
• The value is intentionally undisclosed
• Two main methods have been proposed to deal with this
• NULLs can be used as markers to show that information is missing
• A default value can be used to represent the missing value

Missing Information

NULLs, \' \' and 0
• A NULL is a placeholder for a missing value (unknown, inapplicable or withheld)
• NULL is not an actual value
• An extra set of rules have been devised to accommodate NULL when using:
• Arithmetic expressions
• Comparison expressions
• GROUP BY and ORDER BY
• Aggregate functions
• Fro comparision: \' \' and 0 are actual values
• They will be treated as normal values at all times.
• One suggestion is to use \' \' and 0 when you know that the value does not exist and NULL when you don’t know what the value is

Missing Information

Problems with NULLs
• Problems with extending relational algebra operations to NULLs:
• Defining selection operation:

if we check tuples for some property like “Mark > 40” and one of our tuples is {Mark, NULL}, do we include it?

• Defining intersection or difference of two relations:

are two tuples {John,NULL} and {John,NULL} the same or not?

• Additional problems for SQL: do we treat NULLs as duplicates? Do we include them in count, sum, average and if so, how? How do arithmetic operations behave when an argument is NULL?

Missing Information

NULLS and Arithmetic Expressions

Fruit

ID

1

2

3

4

Name

kiwi

NULL

\' \'

lime

Price

NULL

34

10

0

Fruit

• Arithmetic expressions with NULL in them evaluate to NULL

e.g.

ID

1

2

3

4

Name

kiwi

NULL

\' \'

lime

Price

NULL

39

15

5

UPDATE Fruit

SET

price = price + 5;

Fruit

ID

1

2

3

4

Name

a kiwi

NULL

a

a lime

Price

NULL

34

10

0

UPDATE Fruit

SET name =

CONCAT

(\'a \', name);

Missing Information

NULLS and Comparing Expressions
• 1 > NULL can neither be evaluated to true nor to false. In fact it just cannot be evaluated.
• If we allow NULLs in the database we need to have a way to handle comparison of expressions.
• An expression which involves NULL theoretically evaluates to unknown.
• In MySQL that is usually represented by NULL

Exceptions and specifics:

• True OR NULL evaluates to True
• NULL is not equivalent to NULL, so NULL = NULL will evaluate to NULL and not to True.
• WHERE clause like Price >= 120 OR Price < 120 will not display rows which include NULL for the price

Missing Information

Example

Fruit

Fruit

ID

1

2

3

4

ID

3

4

Name

kiwi

NULL

apple

lime

Name

apple

lime

Price

NULL

34

0

0

Price

0

0

SELECT * FROM Fruit

WHERE price <20;

SELECT * FROM Fruit

WHERE price = NULL;

The empty set.

Missing Information

ISNULL, ISNOTNULL and ISNULL()
• As NULL could take any value, one NULL is not necessarily equal to another NULL.
• To deal with the problem of extracting rows that contain the NULL marker, MySQL has introduced number of keywords and functions.
• IS NOT NULL - returns all values that are not NULL
• IS NULL can be used to return all rows which contain the NULL marker for the tested column
• ISNULL(<expr>) – returns true if the <expr> evaluates to NULL, false otherwise.

Missing Information

Example

Fruit

Fruit

Fruit

Fruit

ID

1

ID

1

ID

2

3

4

ID

1

2

3

4

Name

kiwi

Name

kiwi

NULL

apple

lime

Name

NULL

apple

lime

Name

kiwi

Price

34

0

0

Price

NULL

34

0

0

Price

NULL

Price

NULL

SELECT * FROM Fruit WHERE price IS NOT NULL;

SELECT * FROM Fruit WHERE price IS NULL;

SELECT * FROM Fruit WHERE ISNULL(Price) = TRUE;

Missing Information

NULLs and aggregate functions
• Aggregate functions remove NULLs before performing the required operations
• COUNT(*) is the only one which takes rows containing NULLs into account
• Note that 0 unlike NULL will be taken in account in the calculations

Missing Information

Example

Fruit

Fruit

ID

1

2

3

4

Name

kiwi

kiwi

lime

lime

COUNT(*)

4

COUNT(Price)

3

Name

kiwi

lime

Price

NULL

34

0

34

AVG (Price)

34

17

SELECT COUNT(*) FROM Fruit;

SELECT COUNT(Price) FROM Fruit;

SELECT Name, AVG(price) FROM Fruit GROUP BY Name;

Missing Information

NULL with GROUP BY, ORDER BY AND DISTINCT

Fruit

Fruit

Fruit

Fruit

ID

4

2

1

3

ID

1

2

3

4

Name

kiwi

NULL

kiwi

NULL

Name

kiwi

Name

NULL

NULL

kiwi

kiwi

Name

NULL

kiwi

Price

NULL

34

100

0

Price

17

100

Price

34

0

NULL

100

• GROUP BY, ORDER BY and DISTINCT all treat NULLs as equivalent

SELECT Name, AVG(Price) FROM Fruit GROUP BY Name;

SELECT ID, Name, Price FROM Fruit ORDER BY Name, Price;

SELECT DISTINCT Name FROM Fruit;

Missing Information

Outer Joins
• When we take the join of two relations we match up tuples which share values
• Some tuples have no match, and are ‘lost’
• These are called ‘dangles’
• Outer joins include dangles in the result and use NULLs to fill in the blanks
• Left outer join
• Right outer join
• Full outer join – not available in MySQL

Missing Information

Outer Join Syntax in MySQL

SELECT <cols>

FROM <table1> <type> OUTER JOIN <table2>

ON <condition>

Where <type>is one of LEFTor RIGHT

• You can then use any of the normal SELECT attributes, e.g. a WHERE clause, GROUP BY, etc.
• You could also have a natural outer, the syntax is:

SELECT <cols>

FROM <table1> NATURAL <type> OUTER JOIN <table2>

Missing Information

Example

Fruit

ID

1

NULL

ID

1

2

ID

1

2

ID

1

3

Kingdom

Fairlyland

Trolland

Name

kiwi

lime

Name

kiwi

NULL

Name

kiwi

lime

Price

10

NULL

Price

10

20

Price

10

20

SELECT * FROM

Kingdom

Fairlyland

NULL

Kingdom

Fairlyland

Trolland

ID

1

3

ID

1

NULL

SELECT * FROM

Fruit RIGHT OUTER JOIN Buyers ON

Missing Information

Default Values
• Default values are an alternative to the use of NULLs
• If a value is not known a particular placeholder value - the default - is used
• These are actual values, so no special rules have to be introduced for them.
• Problem: since defaults are real values
• They can be updated like any other value
• You need to use a value that won’t appear in any other circumstances
• They might not be interpreted properly

Missing Information

DEFAULTS
• A default value can be assigned for each column
• You can use a meaningful value like -1 for price, but be careful with UPDATE
• The default value in MySQL has to be an exact value and not a function or an expression
• Exceptions:
• CURRENT_TIMESTAMP for TIMESTAMP
• AUTO_INCREMENT for numerical data types - the DEFAULT is the next value in the sequence
• In MySQL the TEXT and BLOB data types cannot be assigned a DEFAULT value
• If a column does not have a specific DEFAULT value, MySQL will assume DEFAULT NULL
• If the column does not have a DEFAULT value and it has been declared NOT NULL, the DEFAULT will be automatically set to 0 for numerical types and ‘ ‘ for character types

Missing Information

Splitting Tables
• NULLs and defaults both try to fill entries with missing data
• NULLs mark the data as missing
• Defaults give some indication as to what sort of missing information we are dealing with
• Often you can remove entries that have missing data
• You can split the table up so that columns which might have NULLs are in separate tables
• Entries that would be NULL are not present in these tables

Missing Information

Splitting Tables Example

Fruit

ID Name Fairyland Trolland

1 kiwi 10 20

2 lime 15 NULL

3 orange 3 100

grape NULL 30

5 cherry NULLNULL

6 Fig 150 0

Fruit

ID Name Fairyland

1 kiwi 10

2 lime 15

3 orange 3

6 Fig 150

Fruit

ID Name Trolland

1 kiwi 20

3 orange 100

grape 30

6 Fig 0

Missing Information

Problems with Splitting Tables
• Splitting tables has its own problems
• We might introduce many extra tables
• Information gets spread out over the database
• Queries become more complex and require many joins
• We can recover the original table, but
• We need to do an outer join to do so
• This introduces NULLs, which brings in all the associated problems again
• In MySQL full outer joins are not possible, which means that we will never be able to recover the original table exactly.

Missing Information

Which Method to Use?
• Often a matter of personal choice, but
• Default values should not be used when they might be confused with ‘real’ values
• Splitting tables shouldn’t be used too much or you’ll have lots of tables
• NULLs can (and often are) used where the other approaches seem inappropriate
• You don’t have to always use the same method - you can mix and match as needed

Missing Information

Exercise

Determine the outcome from executing each of the following statements on the Holidays table below:

• SELECT Destination FROM Holiday WHERE Destination LIKE \'C%\';
• SELECT Destination FROM Holiday WHERE Destination LIKE \'N%\';
• SELECT * FROM Holiday WHERE Price <> 500;
• SELECT * FROM Holiday WHERE ISNULL(Price) = TRUE;
• SELECT COUNT(*) FROM Holiday WHERE ISNULL(Price) = TRUE;

Missing Information

Exercise

Determine the outcome from executing each of the following statements on the Holidays table below:

• SELECT Destination, COUNT(*) FROM Holiday WHERE ISNULL(Price) = TRUE GROUP BY Destination;
• SELECT COUNT(*), AVG(Price) FROM Holiday;
• UPDATE Holiday SET Price = Price*0.80;
• SELECT * FROM Holiday;
• UPDATE Holiday SET Price = Price*0.80;
• SELECT * FROM Holiday;

Missing Information