Aggregation

1 / 61

# Aggregation - PowerPoint PPT Presentation

Aggregation . Functions that operate on a single column of a table and return a single value. Five aggregation functions defined in SQL: COUNT returns the number of rows in a specified column. SUM returns the sum of the values in a specified column.

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.

## PowerPoint Slideshow about 'Aggregation' - art

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
Aggregation
• Functions that operate on a single column of a table and return a single value. Five aggregation functions defined in SQL:
• COUNT returns the number of rows in a specified column.
• SUM returns the sum of the values in a specified column.
• AVG returns the average of the values in a specified column.
• MIN returns the smallest value in a specified column.
• MAX returns the largest value in a specified column.
Aggregation

max

min

avg

9000

30000

17000

Examples: Find the minimum, maximum, and average staff salary.

SELECT MIN(salary) AS min, MAX(salary) AS max, AVG(salary) AS avg

FROM staff;

Aggregation

count

2

Examples: How many properties cost more than 350 per month to rent?

SELECTCOUNT(*) AS count

FROM property

WHERE rent > 350;

Simple QueriesAggregation

count

2

Examples: How many different properties were viewed in May 1998?

SELECTCOUNT(DISTINCT pno) AS count

FROM viewing

WHERE date BETWEEN ‘1-May-98’ AND ‘31-May-98’;

Simple QueriesAggregation

sum

count

54000

2

Examples: Find the total number of Managers and the sum of their salaries.

SELECT COUNT(sno) AS count, SUM(salary) AS sum

FROM staff

WHERE position = ‘Manager’;

GROUPING

In many cases, we want to apply the aggregate functions to subgroups of tuples in a relation.

Each subgroup of tuples consists of the set of tuples that have the same value for the grouping attribute(s).

The function is applied to each subgroup independently.

SQL has a GROUP BY-clause for specifying the grouping attributes, which must also appear in the SELECT-clause.

GROUPING

Or COUNT(*)

count

bno

sum

B003

3

54000

B005

2

39000

B007

1

9000

Groups the data from the SELECT table(s) and produces a single summary row for each group.

Example: Find the number of staff working in each branch and the sum of their salaries

SELECT bno, COUNT(sno) AS count, SUM(salary) AS sum

FROM staff

GROUP BY bno

ORDER BY bno;

GROUPING

sno

bno

salary

B003

SG37

12000

count

sum

B003

SG14

18000

B003

SG5

24000

3

54000

B005

SL21

30000

2

39000

9000

1

9000

B005

SL41

SA9

9000

B007

• The STAFF tuples are divided into groups; each group having the same value for the grouping attribute bno.
• The COUNT and SUM functions are applied to each such group of tuples separately.
• The SELECT-clause includes only the grouping attribute and the functions to be applied on each group of tuples.
GROUPING

Course_attend

SELECT course, COUNT(student)AS count

FROM course_attend

GROUP BY course;

GROUPING

Course_attend

SELECT course, semester, COUNT(student)AS count

FROM course_attend

GROUP BY course, semester;

Simple QueriesHAVING clause

Sometimes we want to retrieve the values of these functions for only those groups that satisfy certain conditions.

The HAVING-clause is used for specifying a selection condition on groups (rather than on individual tuples).

Simple QueriesHAVING clause

count

bno

sum

B003

3

54000

B005

2

39000

Designed for use with the GROUP BY clause to restrict the groups that appear in the final result table.

WHERE clause filters individual rows going into the final result table

HAVING clause filters groups going into the final result table

Example: For each branch office with more than one member of staff, find the number of staff working in each branch and the sum of their salaries.

SELECT bno, COUNT(sno) AS count, SUM(salary) AS sum

FROM staff

GROUP BY bno

HAVING COUNT(sno) > 1

ORDER BY bno;

Multi-Table Queries: JOIN
• So far, the columns that are to appear in the result table must all come from a single table.
• To combine columns from several tables into a result table, we need to use a join operation.
• To perform a join, we include more than one table name in the FROM clause. WHERE clause to specify the join columns.

SELECT [DISTINCT|ALL] {* |column |[column_expression

[AS new_name]] [,…]}

FROM table_name [alias] [, … ]

[WHERE condition];

JOIN

Example: Retrieve the name and address of all employees who work for the 'Research' department.

SELECTFname, Lname, AddressFROM EMPLOYEE, DEPARTMENT WHERE Dname='Research' AND Dnumber=Dno;

• (Dname='Research') is a selection condition .
• (Dnumber=Dno) is a join condition.
JOIN
• Multiple Joins:
• Example: For every project located in 'Stafford', list the project number, the controlling department number, and the department manager's last name, address, and birth date.

SELECT Pnumber, Dnum, Lname, Bdate, Address

FROM PROJECT, DEPARTMENT, EMPLOYE

WHERE Dnum=Dnumber AND Mgr_ssn=Ssn AND

Plocation='Stafford‘;

• There are two join conditions.
• The join condition Dnum=Dnumber relates a project to its controlling department.
• The join condition Mgr_ssn=Ssn relates the controlling department to the employee who manages that department.
JOIN

The functions are applied after the joining of the two relations.

Example: Find the maximum salary, the minimum salary, and the average salary among employees who work for the 'Research' department.

SELECT MAX(Salary)AS min, MAX(salary) AS max, AVG(salary) AS avg

FROM EMPLOYEE, DEPARTMENT

WHERE Dno=Dnumber AND Dname='Research‘;

Example: and the number of employees in the 'Research' department.

SELECT COUNT (*) AS count FROM EMPLOYEE, DEPARTMENTWHERE Dno=Dnumber AND Dname='Research’;

JOIN

Example: Retrieve a list of employees and the projects each works in, ordered by the employee's department, and within each department ordered alphabetically by employee last name, first name.

SELECT Dname, Lname, Fname, Pname

FROM DEPARTMENT,EMPLOYEE,WORKS_ON,PROJECT

WHERE Dnumber=Dno AND Ssn=Essn AND Pno=Pnumber ORDER BY Dname, Lname, Fname;

JOIN

Example: For each project, retrieve the project number, project name, and the number of employees who work on that project.

SELECT Pnumber, Pname, COUNT (*)As count FROM PROJECT, WORKS_ON WHERE Pnumber=Pno GROUP BY Pnumber, Pname;

• The grouping is applied after the joining of the two relations.
JOIN

Example: For each project on which more than two employees work, retrieve the project number, project name, and the number of employees who work on that project.

SELECT Pnumber, Pname, COUNT(*) FROM PROJECT, WORKS_ON WHERE Pnumber=Pno GROUP BY Pnumber, Pname HAVING COUNT (*) > 2;

JOIN

PROPERTYFORRENT(pno, street, area, city,pcode, type, rooms,rent,sno)

STAFF(sno, fname, lname, position, sex, DOB, salary, bno)

BRANCH(bno, street, city, postcode)

Examples: For each branch office, list the names of staff who manage properties, and the properties they manage, ordered by branch number, staff number and property number.

SELECT s.bno, s.sno, fname, lname, pno

FROM staff s, property_for_rent p

WHERE s.sno = p.sno

ORDER BY s.bno, s.sno, p.pno;

JOIN

BranchNo

StaffNo

LName

PropertyNo

FName

B003

SG14

David

Ford

PG16

B003

SG37

Ann

Beech

PG21

B003

SG37

Ann

Beech

PG36

B005

SL41

Julie

Lee

PL94

B007

SA9

Mary

Howe

PA14

JOIN

PROPERTYFORRENT(pno, street, area, city,pcode, type, rooms,rent,sno)

STAFF(sno, fname, lname, position, sex, DOB, salary, bno)

BRANCH(bno, street, city, postcode)

Exmaple: Find the number of properties handled by each staff member and branch

SELECT s.bno, s.sno, COUNT(*) AS count

FROM staff s, propertyForRent p

WHERE s.sno = p.sno

GROUP BY s.bno, s.sno

ORDER BY s.bno, s.sno;

JOIN

BranchNo

StaffNo

count

B003

SG14

1

B003

2

SG37

B005

SL41

1

B007

SA9

1

SET OPERATIONS

SQL has directly incorporated some set operations.

There is a union operation (UNION), and in some versions of SQL there are set difference (MINUS or EXCEPT) and intersection (INTERSECT) operations.

The resulting relations of these set operations are sets of tuples; duplicate tuples are eliminatedfrom the result.

The set operations apply only to union compatible relations; the two relations must have the same attributes and the attributes must appear in the same order.

UNION

Example: Make a list of all project numbers for projects that involve an employee whose last name is 'Smith' as a worker or as a manager of the department that controls the project.

(SELECT PnumberFROM PROJECT, DEPARTMENT, EMPLOYEEWHERE Dnum=Dnumber AND Mgr_ssn=Ssn AND Lname='Smith')

UNION

(SELECT PnumberFROM WORKS_ON, EMPLOYEEWHERE Essn=Ssn AND Lname='Smith');

UNION
• Example: Construct a list of all cities where there is either a branch office or a rental property.

(SELECT city FROM branch)

UNION

(SELECT city FROM property_for_rent);

INTERSECT
• Example : Construct a list of all cities where there is both a branch office and a rental property.

(SELECT city FROM branch)

INTERSECT

(SELECT city FROM property_for_rent);

OR

SELECT b.city

FROM branch b, property_for_rent p

WHERE b.city=p.city;

EXCEPT

Example: Construct a list of all cities where there is a branch office but no rental property.

(SELECT city FROM branch)

EXCEPT

(SELECT city FROM property_for_rent);

NESTING OF QUERIES

A complete SELECT query, called a nested query, can be specified within the WHERE-clause of another query, called the outer query.

In general, we can have several levels of nested queries.

NESTING OF QUERIES
• Example: Retrieve the name and address of all employees who work for the 'Research' department.

SELECT Fname, Lname, Address FROM EMPLOYEE WHERE Dno IN ( SELECT Dnumber FROM DEPARTMENT WHERE Dname='Research' );

NESTING OF QUERIES

STAFF(sno, fname, lname, position, sex, DOB, salary, bno)

BRANCH(bno, street, city, postcode)

Example: List the staff who work in the branch at ‘163 Main St’.

SELECT sno, fname, lname, position

FROM staff

WHERE bno =

(SELECT bno

FROM branch

WHERE street = ‘163 Main St’);

NESTING OF QUERIES

Example: List the staff whose salary is greater than the average salary, and list by how much their salary is greater than the average.

SELECT sno, fname, lname, position, salary – (SELECT avg(salary) FROM staff) AS sal_diff

FROM staff

WHERE salary >

(SELECT avg(salary)

FROM staff);

NESTING OF QUERIES

PROPERTYFORRENT(pno, street, area, city,pcode, type, rooms,rent,sno)

STAFF(sno, fname, lname, position, sex, DOB, salary, bno)

BRANCH(bno, street, city, postcode)

Example: List the properties that are handled by staff who work in the branch at ‘163 Main St’.

SELECT pno, street, area, city, pcode, type, rooms, rent

FROM property_for_rent

WHERE sno IN

(SELECT sno

FROM staff

WHERE bno =

(SELECT bno

FROM branch

WHERE street = ‘163 MainSt’));

CORRELATED NESTED QUERIES
• If a condition in the WHERE-clause of a nested query references an attribute of a relation declared in the outer query, the two queries are said to be correlated
• The result of a correlated nested query is different for each tuple (or combination of tuples) of the relation(s) the outer query.
• Example: Retrieve the name of each employee who has a dependent with the same first name as the employee.

SELECT Fname, Lname FROM EMPLOYEE WHERE SsnIN (SELECT Essn FROM DEPENDENT WHERE Essn=Ssn AND Fname=Dependent_name);

CORRELATED NESTED QUERIES
• A query written with nested SELECT... FROM... WHERE... blocks and using the = or IN comparison operators can always be expressed as a single block query. The prevoius query can be written as:

SELECT Fname, Lname FROM EMPLOYEE, DEPENDENT WHERE Ssn= EssnAND Fname= dependent_name;

THE EXISTS FUNCTION

EXISTS is used to check whether the result of a correlated nested query is empty (contains no tuples) or not.

Used only with sub-queries.

EXISTS is true if and only if there exists at least one row in the result table returned by the sub-query. It is false if the sub-query returns an empty result table.

Example: Retrieve the name of each employee who has a dependent with the same first name as the employee.

SELECT Fname, Lname FROM EMPLOYEE WHERE EXISTS (SELECT * FROM DEPENDENT WHERE Ssn=Essn AND Fname=Dependent_name);

THE EXISTS FUNCTION

Example: Retrieve the names of employees who have no dependents.

SELECT Fname, Lname FROM EMPLOYEE WHERE NOT EXISTS (SELECT * FROM DEPENDENT WHERE Ssn=Essn);

The correlated nested query retrieves all DEPENDENT tuples related to an EMPLOYEE tuple. If none exist, the EMPLOYEE tuple is selected.

THE EXISTS FUNCTION

STAFF(sno, fname, lname, position, sex, DOB, salary, bno)

BRANCH(bno, street, city, postcode)

Example: Find all staff who work in a London branch

SELECT sno, fname, lname, position

FROM staff s

WHERE EXISTS

(SELECT *

FROM branch b

WHERE s.bno = b.bno AND city = ‘London’);

NESTING OF QUERIES RULES

A complete SELECT statement can be embedded (subselect) within another SELECT statement

A subselect can be used in the WHERE and HAVING clauses of the outer SELECT statement (nested query).

A subquery can be used immediately following a relational operator

Subquery always enclosed in parentheses.

The ORDER BY clause may not be used in a subquery.

The subquery SELECT list must consist of a single column name or expression,except for subqueries that use the keyword EXISTS.

When a subquery is one of the two operands involved in a comparison, the subquery must appear on the right-hand side of the comparison.

Ambiguous Attribute Names, Aliasing, and Tuple Variables

Aliasing:

In SQL, we can use the same name for two (or more) attributes as long as the attributes are in different relations

A query that refers to two or more attributes with the same name must qualify the attribute name with the relation name by prefixing the relation name to the attribute name

Example:

EMPLOYEE.Lname, DEPARTMENT.Dname

Ambiguous Attribute Names, Aliasing, and Tuple Variables

STAFF(sno, fname, lname, position, sex, DOB, salary, bno)

BRANCH(bno, street, city, postcode)

Example: Find all staff who work in a London branch

SELECT sno, fname, lname, position

FROM staff, branch

WHERE staff.bno = branch.bno AND city = ‘London’;

SELECT sno, fname, lname, position

FROM staff AS s, branch AS b

WHERE s.bno = b.bno AND city = ‘London’;

SELECT sno, fname, lname, position

FROM staff s, branch b

WHERE s.bno = b.bno AND city = ‘London’;

Ambiguous Attribute Names, Aliasing, and Tuple Variables
• Some queries need to refer to the same relation twice.
• In this case, aliases are given to the relation name.
• Example: For each employee, retrieve the employee's name, and the name of his or her immediate supervisor.SELECT E.Fname, E.Lname, S.Fname, S.Lname FROM EMPLOYEE E, EMPLOYEE S WHERE E.Super_ssn=S.Ssn ;
• The alternate relation names E and S are called aliases or tuple variables for the EMPLOYEE relation.
• We can think of E and S as two different copies of EMPLOYEE; E represents employees in role of supervisees and S represents employees in role of supervisors.
Summary of SQL Queries

A query in SQL can consist of up to six clauses, but only the first two, SELECT and FROM, are mandatory. The clauses are specified in the following order:SELECT <attribute list>FROM <table list>[WHERE <condition>][GROUP BY <grouping attribute(s)>][HAVING <group condition>][ORDER BY <attribute list>]

There are three SQL commands to modify the database(DML): INSERT, DELETE, and UPDATE.

INSERT:

In its simplest form, it is used to add one or more tuples to a relation.

Attribute values should be listed in the same order as the attributes were specified in the CREATE TABLE command.

INSERT
• Syntax

INSERT INTO table_name [(column (,…))]

{ VALUES (date_value (,…)) | subquery };

• table_name may be either a base table or an updatable view
• column_list represents a list of one or more column names separated by commas.
• If omitted, SQL assumes a list of all columns in their original CREATE TABLE order.
• If specified, then any columns that are omitted from the list must have been declared as NULL column.
• data_value must match the column_list as follows:
• The number of items in each list must be same
• There must be a direct correspondence in the position of items in the two lists, so that the first item in the data_value_list applies to the second item in the column_list, and so on.
• The data type of each item in the data_value_list must be compatible with the data type of the correspon.ding column
INSERT
• Example:

INSERT INTO EMPLOYEEVALUES ('Richard','K','Marini', '653298653', '30-DEC-52','98 Oak Forest,Katy,TX', 'M', 37000,'987654321', 4);

• An alternate form of INSERT specifies explicitly the attribute names that correspond to the values in the new tuple
• Attributes with NULL values can be left out.
• Example: Insert a tuple for a new EMPLOYEE for whom we only know the FNAME, LNAME, and SSN attributes.

INSERT INTO EMPLOYEE (Fname, Lname, Ssn)VALUES ('Richard', 'Marini', '653298653');

INSERT

STAFF(sno, fname, lname, position, sex, DOB, salary, bno)

Example: Insert a new row into the staff table supplying data for all mandatory columns

INSERT INTO staff (Sno, fname, lname, position, salary, bno)

VALUES (‘SG16’, ‘Alan’, ‘Brown’, ‘Assistant’, 8300, ‘B003’);

Alternative:

INSERT INTO staff

VALUES (‘SG16’, ‘Alan’, ‘Brown’, ‘Assistant’, NULL, NULL, 8300, ‘B003’);

INSERT

Important Note: Only the constraints specified in the DDL commands are automatically enforced by the DBMS when updates are applied to the database

Example:

INSERT INTO EMPLOYEE (Fname, Lname, Ssn, Dno) VALUES ('Richard', 'Marini', '653298653‘, 2 );

Insert is rejected if referential integrity checking is provided by DBMS.

INSERT
• CREATE TABLE DEPTS_INFO (Dept_name VARCHAR(10) NOT NULL,No_of_emps INTEGER NOT NULL,Total_sal INTEGER );
• INSERT INTO DEPTS_INFO (Dept_name, No_of_emps,Total_sal)SELECT Dname, COUNT(*), SUM (Salary)

FROM DEPARTMENT, EMPLOYEE

WHERE Dnumber=Dno

GROUP BY Dname;

DELETE
• Syntax:

DELETE FROM table_name

[WHERE search_condition];

• table_namemay be either a base table or an updatable view.
• Only rows that satisfy the search_condition are deleted.
• If search_conditionis omitted, all rows are deleted from the table. The table then becomes an empty table.
• DELETE does not delete the table itself, only rows in the table.
• Referential integrity should be enforced.
• Tuples are deleted from only one table at a time (unless CASCADE is specified on a referential integrity constraint).
DELETE
• Examples:
• Delete all employees with last name Brown.

DELETE FROM EMPLOYEEWHERE Lname='Brown’;

• Delete the employee with SSN equal to ‘123456789’.

DELETE FROM EMPLOYEEWHERE Ssn='123456789’;

• Delete all employees who are working in Research department.

DELETE FROM EMPLOYEEWHERE Dno IN (SELECT Dnumber FROM DEPARTMENT WHERE Dname='Research');

• Delete all employees

DELETE FROM EMPLOYEE;

UPDATE
• Syntax:

UPDATE table_name

SET column_name1 = data_value1 [, column_namei=data_valuei ...]

WHERE search_condition]

• table_namemay be either a base table or an updatable view.
• The SET clause specifies the names of one or more columns that are updated for all rows in the table.
• Only rows that satisfy the search_condition are updated
• data_values must be compatible with the data types for the corresponding columns.
• Each command modifies tuples in the same relation.
• Referential integrity should be enforced.
UPDATE

STAFF(sno, fname, lname, position, sex, DOB, salary, bno)

Example: Give all staff a 3% pay increase

UPDATE staff

SET salary = salary * 1.03;

Example: Give all managers a 3% pay increase

UPDATE staff

SET salary = salary * 1.03

WHERE position = ‘Manager’;

UPDATE

STAFF(sno, fname, lname, position, sex, DOB, salary, bno)

Example: Promote David Ford (sno = ‘SG14’) to Manager and change his salary to \$18,000

UPDATE staff

SET position=‘Manager’, salary = 18000

WHERE sno=‘SG14’;

UPDATE
• Example: Give all employees in the 'Research' department a 10% raise in salary.

UPDATE EMPLOYEESET Salary = Salary *1.1WHERE Dno IN (SELECT Dnumber FROM DEPARTMENT WHERE Dname='Research');

• In this request, the modified SALARY value depends on the original SALARY value in each tuple
• The reference to the SALARY attribute on the right of = refers to the old SALARY value before modification
• The reference to the SALARY attribute on the left of = refers to the new SALARY value after modification
Specification of Views in SQL
• The command to specify a view is CREATE VIEW.
• The view is:
• given a virtual table name (or view name).
• List of attribute names.
• A query to specify the contents of the view.
• Ifwe did not specify the view attributes names next to the view name, they will have the same attributes names as in base tables.

Emp10

Ssn

Fname

Salary

Specification of Views in SQL
• Example: create a view returning a subset of the set of tuples of the Employees relation; the view must include the columns SSN, frist name and salary for the employees of department #10?

CREATE VIEW Emp10

AS SELECT Ssn, Fname, Salary

FROM EMPLOYEE

WHERE Dno=10;

• The names of the columns of view Emp10 are: SSN, Fname and Salary “same as Employee table”

WORKS_ON1

Fname

Lname

Pname

Hours

Specification of Views in SQL

CREATE VIEW WORKS_ON1

AS SELECT Fname, Lname,Pname, Hours

FROM EMPLOYEE, PROJECT,WORKS_ON

WHERE Ssn=Essn AND Pno=Pnumber;

Specification of Views in SQL
• Now we can specify SQL queries on view in the same way we do with base tables.
• Example: to retrive the last name and first name of all employees who work on ‘ProjectX’.

SELECT Fname, Lname

FROM WORKS_ON1

WHERE Pname=‘ProjectX’;

• A view is up to date, if you modify a tuple in the base tables on which the view is defined, the view must automatically reflect these changes.
• If you do not need a view any more, you can drop it by using:

DROP VIEW WORKS_ON1;

Specification of Views in SQL

CREATE VIEW DEPT_INFO (Dept_name,No_of_emps,Total_sal)

AS SELECT Dname, COUNT(*), SUM (Salary)

FROM DEPARTMENT, EMPLOYEE

WHERE Dnumber= Dno

GROUP BY Dname;

• Here we specify new attrbuites names for the view DEPT_INFO

Material for this lecture was obtained from:

• Fundamentals of Database Systems, Elmasri and Navathe,  5th edition, Addison-Wesley, 2007.
• Database Systems: A Practical Approach to Design, Implementation and Management, Thomas Connolly, Carolyn Begg, 3rd edition, 2002.
• The slides of dr. Lilac Safadi, King Saud University.
• The slides of T. Abeer Al-Nafjan, Imam University.