1.2k likes | 1.34k Views
Chapter 6. Structured Query Language (SQL). Sample Database for the following examples. Publishing company has three subsidiary publishers. Need to keep info about books, authors, editors, & publishers. Assume: an author may write more than one book a book may have more than one author
E N D
Chapter 6 Structured Query Language (SQL)
Sample Database for the following examples • Publishing company has three subsidiary publishers. • Need to keep info about books, authors, editors, & publishers. • Assume: • an author may write more than one book • a book may have more than one author • each book has one publisher • each book has one editor (may need to revise this assumption) • each editor may edit more than one book.
BookBiz Database titles authors publishers editors title_id title price type pubdate ytd_sales advance pub_id ed_id pub_id pub_name address city state zip au_id au_lname au_fname phone address city state zip ed_id ed_lname ed_fname phone address city state zip titleauthors title_id au_id au_ord royaltyshare
SELECT Queries SELECT [ALL| DISTINCT] Select_list FROM Table_list [WHERE Search_conditions] [GROUP BY Group_by_list] [HAVING Search_conditions] [ORDER BY Order_by_list] • Order of clauses is important • keywords are case insensitive. Everything else is (depends on system) • Attribute and table names may have to be qualified: owner.database.table.attribute
Eliminating Duplicate Row: DISTINCT SELECT [ALL| DISTINCT] Desired Attributes …… • ALL returns all qualified rows (default) • DISTINCT returns only unique rows Example: Select title Select DISTINCT title from titles from titles order by title order by title
WHERE Clause: Search Conditions • Relational operators • Logical: AND, OR, NOT • Special • BETWEEN - defines limits • IN - checks for value in a set • LIKE - checks for similar string • IS NULL - checks for nulls • EXISTS - used with subqueries SELECT * FROM PublishersWHERE zip = 77002 OR V_CODE = 77077;
Between Select title_id, ytd_sales From titles Where ytd_slates BETWEEN 4000 AND 12000; Select title_id, ytd_sales From titles Where ytd_slates NOT (BETWEEN 4000 AND 12000);
IN • List all authors who reside in the states of CA, TX or NY Select au_lname From authors where state in ('CA', 'TX', 'NY'); • List must be in parentheses • list elements separated by commas
IS [NOT] NULL • Null is a place holder for unknown info. Select title_id, advance from titles where advance is not null
LIKE • Match for embedded pattern in string • Syntax: WHERE attribute [NOT] LIKE 'pattern' • pattern must include a wildcard: • % matches a string of zero or more chars • - matches a single char E.g. find city of some author whose last name starts with 'Mc' or 'Mac' E.g. list all editors in the 713 area code
ORDER BY Clause SELECT Select_List FROM Table_list [WHERE Search_Conditions] [ORDER BY Order_by_list] • Order by allows you to sort results by any expression or mixture of expressions from the select_list
ORDER BY Clause • Ordering is ascending, unless you specify the DESC keyword. • Ties are broken by the second attribute on the ORDERBY list, etc. • Generally, items in orderby_list must also appear in select_list • attribute name • arithmetic expr of attributes • display label (alias) • a position number in select_list
ORDER BY Clause Select title_id, pub_id, price*ytd_sales as income From titles Where type in (‘business’ , ‘psychology’) Order by pub_id, income desc
ORDER BY Clause Select title_id, pub_id, price*ytd_sales as income From titles Where type in (‘business’ , ‘psychology’) Order by pub_id, 3 desc
Aggregate Functions • Aggregate functions are used to get summary values: sum, avg, min, max, count • aggregates are applied to sets of rows • to all the rows in a table • to those rows specified in WHERE clause • to groups of rows set up by the GROUP BY clause • Outcome is always a single value for each set of rows
Aggregate Syntax aggregate_function([DISTINCT] expression) • sum( [Distinct] expression) • avg ( [Distinct] expression) • count ([Distinct] expression) • count(*) //count number of selected rows • Max (expression) • Min (expression) • All aggregate functions, except count(*), ignore rows where expression is null
Aggregate fiunctions • Compute total ytd sales for all books Select sum(ytd_sales) From titles Note: Select price, sum(price) From titles NOT ALLOWED
Aggregate fiunctions • Compute total ytd sales for all books Select sum(ytd_sales) as Total From titles • Total 87654
Aggregate fiunctions • Total number of books: Select count(title_id) From titles • May use any other attribute: Select count(price) From titles • Same results as long as there are no NULL Select count(price), count(*) From titles Count distinct values? Select count(DISTINCT price) from titles
Aggregate Functions--Count • Count(expression) • expression is generally a column name • returns the number of non-null occurrences of the expression • Count(distinct expression) • Each occurrence is counted only once • Count(*) • counts numbers of rows (includes NULLs)
GROUP BY Clause SELECT [ALL| DISTINCT] Select_list FROM Table_list [WHERE Search_conditions] [GROUP BY Group_by_list] [HAVING Search_conditions] [ORDER BY Order_by_list]
Grouping Data & Aggregates • The Group by clause partitions a table into sets of rows • Aggregate functions (if any) produce summary values for each set ==> vector aggregates • Having clause puts conditions on the group by results much like the Where clause qualifies the individual rows
Grouping Data & Aggregates • How many books each publisher has? Select pub_id , count(title_id) as count from titles group by pub_id pub_idcount 1380 6 2733 17 1322 21
Groups within groups Select pub_id , type, count(title_id) as count from titles group by pub_id, type pub_idtypecount 1380 math 2 1380 cs 4 2733 math 5 2733 Business 12 1322 cs 5 1322 biology 3 1322 physics 6 • Divide the rows in the table by publishers • Separate the rows in each publisher group by type • Apply aggregate function(s) to each set
Avoid confusing queries Select pub_id, type , count(title_id) as count from titles group by pub_id pub_idtypecount 1380 math 6 1380 cs 6 2733 math 17 2733 Business 17 1322 cs 14 1322 biology 14 1322 physics 14
Groups • All NULLs are include in same group • Order_by clause always goes after Group_by • E.g. list avg price for each book type in ascend order. Include only books with more than $500 advance Select type, avg(price) as avg_price from titles where advance > 500 group by type order by avg_price
Having Clause • Puts conditions on the group_by results I.e. on groups formed by the group_by • Having conditions must be either • from select_list • aggregate function Order of execution: 1. Where clause selects rows 2. Group_by partitions resulting rows from (1) 3. Aggregates on groups are calculated 4. Having clause checks results of group rows to see which qualify for output.
Having Clause • Get number of books and average price of each book type whose average price is above $50 Select type, count(*), avg(price) from titles group by type having avg(price) > 50
Where vs Having Select type, count(*), avg(price) from titles where price > 50 group by type • Get number of books and average price of each book type. Include only books whose price is above $50
Where vs Having • Can't use aggregates in WHERE clause • WHERE clause applies to individual rows • HAVING clause applies to groups of rows
Where vs Having Select type, count(*) from titles group by type having type like 'B%' Equivalent to: Select type, count(*) from titles where type like 'B%' group by type
BookBiz Database titles authors publishers editors title_id title price type pubdate ytd_sales advance pub_id ed_id pub_id pub_name address city state zip au_id au_lname au_fname phone address city state zip ed_id ed_lname ed_fname phone address city state zip Salesline Sonum title_id qty_ordered qty_shipped date_shipped titleauthors Sales title_id au_id au_ord royaltyshare Sonum store_id sdate
Joining Tables • The join operation lets you retrieve data from more than one table in a SELECT statement. • Joins are needed because during the normalization process, data is split in multiple tables to minimize redundancy. • There can be more than one join op in a SELECT statement • Each join operation is done on two tables, using one column from each table as a connecting column • Join column should have compatible data types
Joining Tables--Example • List the names of the authors of the book titled “Database Systems” SELECT au_lname, au_fname FROM authors a, titles t, titleauthors tl WHERE t.title_id = tl.title_id AND tl.au_id = a.au_id AND t.title = ‘Database Systems’ ;
Joining Tables – SYNTAX-1 Select Select_list From table-1, table_2 [, table_3,…] Where [table_1.]Col join_op [table_2.]Col • Tables are joined pairwise • join_op: a relational operator • Specify a join on two tables using a column from each table as a connecting column. • connecting columns must be join compatible. Best if they have same data type. • Connecting columns DO NOT have to have the same name.
Joining Tables • Most often: connecting column is a primary key of one table and foreign key of the second table. • Example: Find the name of the editor of book title "Database Systems" Select ed_fname, ed_lname From editors e, titles t Where t.ed_id = e.ed_id AND t.title =‘Database Systems’ ;
Joining Tables – SYNTAX-2 SELECT Select_list FROM table-1 [NATURAL|CROSS]JOIN table_2 [ON [table_1.]Col join_op [table_2].Col] | USING (column) ] • join_op: a relational operator • If keywords NATURAL or CROSS are used, the ON and USING clauses are not allowed • NATURAL JOIN works only when the two tables have a single identical column. The join is done on this column • CROSS JOIN produces the Cartesian product need to use a WHERE clause with it.
Joining Tables • Example: Find the name of the editor of book title "Database Systems“ SELECT ed_fname, ed_lname FROM editors JOIN titles ON editors.ed_id = titles.ed_id WHERE title.title ='Database Systems’ SELECT ed_fname, ed_lname FROM editors NATURAL JOIN titles WHERE title.title = ‘Database Systems’ SELECT ed_fname, ed_lname FROM editors CROSS JOIN titles WHERE editors.ed_id = titles.ed_id AND title.title ='Database Systems’
Joining Tables • e.g. find out if any editor is also an author select ed_lname, ed_fname from editors, authors where editors.ed_id = authors.au_id Note: • if connecting columns have NULLs, the NULLs won't join: NULLs are not equal to each other
Joining Tables--Examples • Find all authors who live in same city as any publisher • Select a.au_lname, a.city, p.pub_name From authors a, publishers p where a.city = p.city
Joining more than two tables • example: find titles of all business books and the names of their authors Select au_lname, au_fname, title from authors a, titles b, tiltesauthors ba where a.au_id = ba.au_id and b.title_id = ba.title_id and b.type = 'business'
Self_joins • Find authors who live in same city as "Steven King" Select au_lname, au_fname, city from authors a1, authors a2 where a1.city = a2.city and a2.au_lname = 'King' and a2.au_fname = 'Steven'
Non-equijoins • Equijoin: join on matching (equal) values • Join can be done on other relational operations: > >= <= <= <>
Joins not based on equality • Example: list all authors with same last name Select a1.au_id, a1.au_fname, a1.au_lname from authors a1, authors a2 where a1.au_lname = a2.au_lname and a1.au_id <> a2.au_id
Joins not based on equality • Find all sales orders that were on date later than the sales date (company tries to ship same day) and the store id and date they were sold at. SELECT DISTINCT s.sonum, s.store_id, s.sdate FROM sales s, salesline sl WHERE s.sdate < sl.date_shipped AND s.sonum = sl.sonum;
Subqueries A subquery is a SELECT statement that nests inside the WHERE or HAVING clause of another SELECT, INSERT, UPDATE or DELETE statement SELECT [DISTINCT] Select_list FROM Table_list WHERE {expression [not] IN | comparison_op [ANY|ALL] | EXISTS} (SELECT [DISTINCT] Select_list FROM Table_list WHERE Conditions ) [GROUP BY Group_by_list] [HAVING Search_conditions] [ORDER BY Order_by_list]
Simple Subqueries • Find names of publishers who publish CS books Select distinct pub_name from publishers where pub_id IN (Select pub_id from titles where type = 'CS') • Inner and outer queries are evaluated in two steps. • First the inner subquery is evaluated. Then the result (a set of pub_ids) are passed to outer subquery.
Subqueries & Joins Select distinct pub_name from publishers p, titles b where p.pub_id = b.pub_id and type = 'CS' • Subqueries and joins can often be used interchangeably • In some cases, one is preferred to other
Subqueries & Joins • Find all books with prices equal to the minimum book price. • Subquery: select title, price from titles where price = ( select min (price) from titles) • Can NOT be done with a join.
Subqueries & Joins • Find name of the authors whose address is in the same city as the home office of a publisher (list pub_name, authors name & city) • Join: select pub_name, au_fname, au_lname, authors.city From publishers, authors where publishers.city = authors.city