Create Presentation
Download Presentation

Download Presentation
## SQL Unit 7 Set Operations

- - - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - - -

**SQL Unit 7Set Operations**Kirk Scott**7.1 Introduction**• 7.2 UNION Queries • 7.3 Queries with IN (Intersection) • 7.4 Queries with NOT IN (Set Subtraction) • 7.5 Unions, Joins, and Outer Joins**1. The technical term for a table is a relation. A**relation is like a set. • The technical term for a row in a table is a tuple. • A tuple is like an element in a set.**The fundamental difference between an element of a set and a**tuple in a relation is that the tuple may be a composite. • It may contain values for more than one different attribute.**The similarity between sets and relations explains some of**the aspects of relations. • The order of elements in a set is immaterial; • likewise, the order of tuples in a relation is immaterial.**A set can't contain duplicate elements;**• likewise, a relation can't contain duplicate tuples. • (Although a query result can.)**2. Recall the logical operator OR.**• This allowed you to make conditions on the values of attributes. • There is also a set level operation, union, which is related in meaning. • Union applies not to attribute values, but to collections of tuples in relations.**Given two sets, A and B, you may recall this definition of**union from math class: • = the union of A and B • = the set of elements in A or in B or in both A and B**A**B • In this Venn diagram, both A and B and the area where they overlap are shaded, indicating that they all are included in the union.**Microsoft Access SQL has the keyword UNION, which implements**the idea behind a logical union.**3. Here is a simple example illustrating the use of the**union operator. • Suppose that table A and table B have each been defined with the same number of fields, of the same type, in the same order. • The names of the fields don't have to be the same.**Then consider this query:**• SELECT * • FROM A • UNION • SELECT * • FROM B**The results will contain the set of rows that were in A or B**or both • Because the query uses a set operator, and duplicates are not allowed in sets, any row that might have occurred in both A and B will only appear once in the results. • It’s not a major thing, but to emphasize, you can observe the following: • In general, query results may contain duplicate rows, but the use of set operators has a similar effect to the keyword DISTINCT.**4. The previous example specified that the tables in the**two parts of the query had to have the same number of fields of the same type and in the same order. • It wouldn't do to have records in the same result table which varied in the number of fields they contained. • It also wouldn't do for numeric fields to hold non-numeric values, and vice-versa.**Having a correspondence between fields in the two parts of**the query is known as union compatibility. • The specific requirements for this are: • A. The corresponding fields in the two parts of the query should mean the same thing. • This may be referred to as semantic equivalence.**B. If the corresponding fields are of exactly the same type**and size, there is no problem at all. • The formal requirements are less stringent though: • i. All numeric fields are union compatible with each other. • ii. All text fields are union compatible with each other. • iii. All date fields are union compatible with each other.**In cases where the types of the fields are not the same, but**they are union compatible, the "larger" of the two types will be used in the results. • Given two union compatible types, the "larger" kind of one type can always hold values of the other type.**A text field with a large width can hold the values of a**text field with a smaller width. • A numeric type that can have decimal points can hold integer values. • Since the one that can hold the other is used in the results, no data will be lost when a union is done.**1. Here is a concrete example of a union query using tables**and fields from the cardealership database: • SELECT * • FROM Car • WHERE make = 'Chevrolet' • UNION • SELECT * • FROM Car • WHERE make = 'Toyota'**This query illustrates the relationship between the UNION**operator and the OR operator. • Because the two parts of the query are on the same table, the following query would accomplish the same thing: • SELECT * • FROM Car • WHERE make = 'Chevrolet' • OR make = 'Toyota'**In this query the Car table is the "universe", and the query**finds the union of two disjoint subsets of the Car table, because no car could have two different makes at the same time. • This is the Venn diagram for the query:**Car**make = 'Chevro-let' make = 'Toyota'**2. Here is another example of a union query.**• The two parts of the query are based on two different tables: • SELECT name, addr, city, state • FROM Customer • UNION • SELECT name, addr, city, state • FROM Salesperson**Because two tables are involved, it would not be possible to**accomplish this with the OR operator. • Notice that there is no problem with union compatibility because the corresponding fields in the two tables were defined in exactly the same way.**The Venn diagram for this query is more typical than the**previous diagram. • The same person could be both a salesperson and a customer. • The results of the query would include the names, addresses, cities, and states of all customers, all salespeople, and anybody who fell into both categories.**Custom-ers**Sales- people**A union can be thought of as a vertical combination of two**tables:**3. As noted previously, the union operator eliminates**duplicates from the results of a query. • If by chance you would like to do a union and not have the duplicates eliminated, you would use the keywords UNION ALL: • SELECT city • FROM Customer • UNION ALL • SELECT city • FROM Salesperson**There is a side effect related to eliminating or keeping**duplicates in the results. • When plain UNION is used, the duplicates will be eliminated and the results will typically be sorted in some order. • The explanation is that the system uses the following approach to eliminate duplicates: • First it sorts the records. • After sorting, duplicates should be next to each other. • Then the system finds and eliminates them.**4. It is possible to do unions where one part of the query**doesn't have fields corresponding to the fields in the other part. • Those fields that correspond have to be union compatible. • For those fields without corresponding fields, nulls have to be used.**Recall that the schemas for Customer and Salesperson look**like this: • Customer(custnopk, name, addr, city, state, phone) • Salesperson(spnopk, name, addr, city, state, phone, bossno, commrate)**Here is an example of a union query where all of the fields**of the Customer table are matched with the explicitly listed corresponding fields of the Salesperson table: • SELECT * • FROM Customer • UNION • SELECT spno, name, addr, city, state, phone • FROM Salesperson**If you would like to keep all of the fields from the**Salesperson table while also including all of the records from the Customer table in the results, you could do this: • SELECT *, NULL, NULL • FROM Customer • UNION • SELECT * • FROM Salesperson**1. Among the concepts of set theory, along with union,**there is intersection and there is the idea of set containment. • Given two sets, A and B, here is the definition of union again, along with the definitions of intersection and containment:**= the union of A and B**• = the set of elements in A or in B or in both A and B**= the intersection of A and B**• = the set of elements that A and B have in common**= A is contained in B;**• as a proposition this is either true or false, either the elements of A are also in B, or they're not**A**B • In this Venn diagram, the area where A and B overlap is crosshatched, indicating that this is the area in the intersection.**B**A • This Venn diagram signifies that A is contained in B:**Microsoft Access SQL does not have keywords for intersection**or containment, but it does have this operator: • IN • Using IN it is possible to write expressions that check whether or not a given set of tuples is included in another set. • This makes it possible to find the intersection of two sets.**2. It is possible to specify a set of values in SQL by**enclosing the values in parentheses (not curly braces) and separating them with commas. • This first example of the use of the keyword IN involves such a set: • SELECT * • FROM Car • WHERE make IN ('Chevrolet', 'Toyota')**This query is equivalent in results to the following query**already seen above: • SELECT * • FROM Car • WHERE make = 'Chevrolet' • OR make = 'Toyota' • The results of the query are the union of two sets.**3. The more general use of the keyword IN occurs when a set**of values in a query is defined by a subquery rather than a set listed in parentheses. • An example is shown below. • Notice that its structure is similar to the foregoing examples.**The outer query selects from a table where some field value**is in or is not in the set specified by the subquery: • SELECT name • FROM Salesperson • WHERE spno IN • (SELECT spno • FROM Carsale)**This query illustrates the ideas of intersection and**containment. • You're selecting the names of salespeople whose spno's appear in the Carsale table. • Because of referential integrity, every spno in the Carsale table has to appear in the Salesperson table. • That means that the set of spno's from the Carsale table is a subset of the spno's in the Salesperson table.**Not every salesperson has to have sold a car, so not**necessarily every spno in Salesperson appears in the Carsale table. • When you find the intersection between the two, it is simply the set of spno's from Carsale.