1 / 61

# SQL Unit 5 Aggregation, GROUP BY, and HAVING - PowerPoint PPT Presentation

SQL Unit 5 Aggregation, GROUP BY, and HAVING. Kirk Scott. 5.1 Grouping By One Field 5.2 Grouping By More than One Field 5.3 GROUP BY with HAVING 5.4 More on Nulls. 5.1 Grouping By One Field. 1. Recall that the term aggregation referred to built-in functions like these:

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.

## PowerPoint Slideshow about ' SQL Unit 5 Aggregation, GROUP BY, and HAVING' - lowri

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript

### SQL Unit 5Aggregation, GROUP BY, and HAVING

Kirk Scott

• 2. functions like these: Remember also that the records in the Carsale table include the spno, and it is possible to write a query that orders the results of a query by that field:

• SELECT *

• FROM Carsale

• ORDER BY spno

• 3. What if you would like the functions like these: subtotals of the sums of the salespricesfor the cars sold by each salesperson?

• This would involve finding a SUM, and it would also depend on the spno

• Both of these fields arein the Carsale table.

• Here is a query that accomplishes this:

• SELECT spno, SUM(salesprice)

• FROM Carsale

• GROUP BY spno

• This query will give the subtotal for each functions like these: spno in the Carsale table.

• There will be only one row for each spno in the results of the query.

• In a sense when you GROUP BY, it is like having the keyword DISTINCT in the query.

• The aggregate functions ignore nulls, but GROUP BY does not functions like these:

• If any sales records had null spno's, the query results would also include a row where the sum of the salesprices for such records appeared.

• However, in calculating the sums, null values for salesprice would still be ignored

• Recall that the meaning of COUNT(*) is to count all of the records where any of the fields are non-null.

• None of the records can be all null, so this counts all records.

• GROUP BY will include in the results a group that counts how many records had a null spno, if there were any such records.

• The reason is simple. be included in the results of a GROUP BY query.

• By definition, there will only be one row per spno in the results of the query.

• However, it is possible that there would be more than one custno per salesperson.

• It would not be possible to show the multiple be included in the results of a GROUP BY query. custno's belonging to a single spno, so this is not allowed.

• It's true that in some cases there may only be one custno for a given spno, but even so, the syntax will not support exceptions like these.

• 6. It is possible to use GROUP BY and ORDER BY together in a single query.

• This is a simple, practical example.

• It illustrates the fact that you can order the results by the aggregate if you want to.

• Recall that the default order is by the GROUP BY field.

• SELECT spno, COUNT(*)

• FROM Carsale

• GROUP BY spno

• ORDER BY COUNT(*)

• This query will give the sum of the time in a query. stickerprices for every possible combination of make and model.

• Each of these combinations will appear only once in the results.

• Again, the effect is similar to having the keyword DISTINCT in a query.

• The results time in a query. would also include rows for the three cases where either the make, model, or both fields were null in the original records in the Car table.

• No fields other than make and model (and the aggregate) could be included in the select clause.

• Also, both make and model are optional in the SELECT, although in most cases the query results would probably be more useful if they were included.

• In this query the primary sort key is make and the secondary sort key is model.

• The results of the query will show every combination of make and model that occurs in the Car table sorted first by make, and within make by model.

• The corresponding GROUP BY query will show the sums of the stickerprices for every combination of make and model in the table and the results will be given in the same order as the ORDER BY query.

• 3. Observe that sort key is model. it would also be possible to write queries where the order that the fields are selected is changed.

• The sums for the various combinations of make and model wouldn't change, but the orders of the columns and rows in the results would change.

• The first example would put the model column before the make column, but the sort order of the rows would be the same as in the previous example.

• The second example would put the make column first and the model column second, but the sort order has been changed to sort first by model and than by make.

• It seems unlikely that anyone would write the query in this way intentionally, but it is possible that all they're interested in is the sum for each combination of make and model and the sort order doesn't make a difference.

• In any case, it's syntactically OK: model column second, but the sort order has been changed to sort first by model and than by make.

• SELECT make, model, SUM(stickerprice)

• FROM Car

• GROUP BY model, make

• 4. It bears repeating that including a GROUP BY field in the SELECT is optional.

• For example, the following example would be OK.

• The results will only show the make and sum in each row, but there will be a row for each combination of make and model:

• SELECT make, SUM(stickerprice)

• FROM Car

• GROUP BY make, model

• For example, this query is wrong: in the SELECT any fields except for the aggregate field and the fields in the GROUP BY.

• SELECT make, model, year, SUM(stickerprice)

• FROM Car

• GROUP BY make, model

• 5. It is always possible to specify an order for the results of a query in addition to doing GROUP BY.

• This example is kind of silly, because it simply accomplishes what could be accomplished by putting the fields in the GROUP BY in the other order.

• But it does illustrate how the syntax for ORDER BY will override the ordering that otherwise would be used by GROUP BY:

• SELECT make, model, SUM(stickerprice)

• FROM Car

• GROUP BY model, make

• ORDER BY make, model

• 1. In a simple query, a WHERE clause causes the SELECT to pick out only certain sets of records in a table based on a condition on the value of an individual field.

• This is known as a selection or a restriction.

• It might also be called a refinement of the query's results.

• A query with a WHERE clause will potentially give as its results a subset of the results that would be returned by the same query without the WHERE clause.

• For example, this query will show the achieve similar results as the WHERE clause in a simple query. spno's and the sums of the salesprices of cars that they sold, but only for those salespeople who sold a total of at least 50000 dollars worth of cars overall:

• SELECT spno, SUM(salesprice)

• FROM Carsale

• GROUP BY spno HAVING SUM(salesprice) >= 50000

• 3. It is possible to have both a condition on a GROUP BY field (a non-aggregate field) and the aggregate field in a query.

• Again, it may be helpful to keep them straight by using WHERE for the condition on the GROUP BY field.

• You have to use HAVING on the aggregate field in any case.

• So, for example, this query will find the makes and the sums of their stickerprices for makes that appear after Chevrolet in alphabetical order, and whose stickerprice sums are greater than or equal to 50000.

• Notice that even though the word "and" appears in the verbal description, the keyword AND does not belong in the syntax of a correct query implementing this:

• SELECT make, SUM( of their stickerprice)

• FROM Car

• WHERE make > 'Chevrolet'

• GROUP BY make HAVING SUM(stickerprice) >= 50000

• For example: conditions on the group by fields or the aggregate.

• SELECT make, SUM(stickerprice)

• FROM Car

• WHERE make > 'Chevrolet‘

• AND year > 2005

• GROUP BY make HAVING SUM(stickerprice) >= 50000

• 5. The ability to mix and match extends to joins. conditions on the group by fields or the aggregate.

• It is possible to have a join query where the grouping is done on the field of one table, while the aggregate is done on a field of the other table.

• Such a query could also include the keyword HAVING as well as other elements of SQL queries unrelated to grouping.

5.4 More on Nulls except for the joining condition in order to clearly illustrate doing a join and GROUP BY together.

1. For the purposes of the following discussion, here are the contents of the Salesperson table:

• If you want to make sure that nulls are included, you have to use the NZ function.

• For sums, if nulls are treated as zero, this won't make a difference, but for counts and averages, it will.

• In the query below the average will be calculated by dividing by 5 rather than 4:

• SELECT AVG(NZ(commrate, 0))

• FROM Salesperson

This is what the results look like: group for null values even though the aggregate functions ignore nulls.

• There is nothing surprising here. group for null values even though the aggregate functions ignore nulls.

• GROUP BY returns a row for the case where bossno is null.

• There is only one record that meets this condition, and the commrate for that salesperson is 0.05.

• That means that the average is also 0.05.

• GROUP BY also returns a row for the case where group for null values even though the aggregate functions ignore nulls. bossno equals 333, which happens to be a group where 4 records in the Salesperson table have that value.

• 3 of those 4 records in the Salesperson table have non-null commrates.

• The average for them is calculated as (0.03 + 0.05 + 0.03) / 3, giving the value shown above.

• The null value is ignored both in the sum in the numerator and in the count in the denominator.

• Other examples could be devised. group for null values even though the aggregate functions ignore nulls.

• The point simply is that you need to keep this in mind:

• GROUP BY will return rows for those cases where the GROUP BY fields are null.

• However, the aggregate functions still do ignore nulls in the aggregate fiels, unless you include NZ in the expression.

The End group for null values even though the aggregate functions ignore nulls.