MGS 4020 Business Intelligence Relational Algebra and Structured Query Language Jul 13, 2010 - PowerPoint PPT Presentation

Mgs 4020 business intelligence relational algebra and structured query language jul 13 2010 l.jpg
Download
1 / 38

MGS 4020 Business Intelligence Relational Algebra and Structured Query Language Jul 13, 2010. SQL. Agenda. Queries. Join. Set Operations. Unary. Restriction. Projection. Binary. Union. Intersection. ?. ?. Difference. UNION. SQL Building Blocks. CREATE, ALTER, DROP

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.

Download Presentation

MGS 4020 Business Intelligence Relational Algebra and Structured Query Language Jul 13, 2010

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


Mgs 4020 business intelligence relational algebra and structured query language jul 13 2010 l.jpg

MGS 4020Business Intelligence Relational Algebra and Structured Query Language Jul 13, 2010


Agenda l.jpg

SQL

Agenda

Queries

Join


Set operations l.jpg

Set Operations

Unary

Restriction

Projection

Binary

Union

Intersection

?

?

Difference

UNION


Sql building blocks l.jpg

SQL Building Blocks

  • CREATE, ALTER, DROP

  • INSERT, DELETE, UPDATE

  • SELECT

  • UNION, INTERSECT, MINUS

  • JOIN

  • INDEX

  • VIEWS

  • Utilities (introduced throughout the examples).

  • Transaction Management Features

  • Additional Features


Horizontal slices l.jpg

Horizontal Slices

  • Restriction

    • Specifying Conditions

Algebra: selection

or restriction

(R)

  • Unconditional

  • List all students

  • select*

  • fromSTUDENT;

  • (Student)

  • Conditional

  • List all student with GPA > 3.0

  • select*

  • fromSTUDENT

  • whereGPA > 3.0;

  • GPA > 3.0(Student)


Pattern matching l.jpg

List all CIS courses.

select*

fromCOURSE

wherecourse# like CIS%’;

Pattern Matching

‘%’any string with n characters, n>=0

‘_’any single character.

x exact sequence of string x.

List all CIS 3200 levelcourses.

select*

fromCOURSE

wherecourse# like ? ;


Specifying conditions l.jpg

Specifying Conditions

List all students in ...

select*

fromSTUDENT

wherecity in (‘Boston’,’Atlanta’);

List all students in ...

select*

fromSTUDENT

wherezip not between 60115 and 60123;


Vertical slices l.jpg

Vertical Slices

  • Projection

    • Specifying Elements

Algebra: projection

<A1,A2,...Am> (R)

  • No Specification

  • List all information about Students

  • select*

  • fromSTUDENT;

  • (Student)

  • Conditional

  • List IDs, names, and addresses of all students

  • selectID, name, address

  • fromSTUDENT;

  • ID, name, address(Student)


Does sql treat relations as sets l.jpg

Does SQL treat Relations as ‘Sets’

  • What are the different salaries we pay to our employees?

  • selectsalary

  • fromEMPLOYEE;

  • OR is the following better?

  • selectDISTINCT salary

  • fromEMPLOYEE;

  • Is the following necessary?


Horizontal and vertical l.jpg

Horizontal and Vertical

Query:

Lista all student ID, names and addresses who have

GPA > 3.0 and age >20.

selectID, Name, Address

fromSTUDENT

whereGPA > 3.0 and DOB < ‘1-Jan-6’

order byName DESC;

Algebra:

ID,name, address ( GPA > 3.0 and DOB < ‘1-Jan-74’ (STUDENT)

Order by sorts result in descending (DESC) order.

Note: The defauld order is ascending (ASC) as in:

order by Name;


Agenda11 l.jpg

SQL

Agenda

Queries

Join


Relational database l.jpg

Relational Database

  • A relational database is a collection of data items organized as a set of formally-described tables from which data can be accessed or reassembled in many different ways without having to reorganize the database tables. The relational database was invented by E. F. Codd at IBM in 1970.

  • The standard user and application program interface to a relational database is the structured query language (SQL). SQL statements are used both for interactive queries for information from a relational database and for gathering data for reports.

  • A relational database is a set of tables containing data fitted into predefined categories. Each table (which is sometimes called a relation) contains one or more data categories in columns. Each row contains a unique instance of data for the categories defined by the columns. For example, a typical business order entry database would include a table that described a customer with columns for name, address, phone number, and so forth. Another table would describe an order: product, customer, date, sales price, and so forth. A user of the database could obtain a view of the database that fitted the user's needs. For example, a branch office manager might like a view or report on all customers that had bought products after a certain date. A financial services manager in the same company could, from the same tables, obtain a report on accounts that needed to be paid.


Relational database13 l.jpg

Relational Database

When creating a relational database, you can define the domain of possible values in a data column and further constraints that may apply to that data value. For example, a domain of possible customers could allow up to ten possible customer names but be constrained in one table to allowing only three of these customer names to be specifiable.

The definition of a relational database results in a table of metadata or formal descriptions of

the tables, columns, domains, and constraints. Meta is a prefix that in most information technology usages means "an underlying definition or description." Thus, metadata is a definition or description of data and metalanguage is a definition or description of language.

A database is a collection of data that is organized so that its contents can easily be accessed, managed, and updated. The most prevalent type of database is the relational database, a tabular database in which data is defined so that it can be reorganized and accessed in a number of different ways. A distributed database is one that can be dispersed or replicated among different points in a network. An object-oriented programming database is one that is congruent with the data defined in object classes and subclasses.

SQL (Structured Query Language) is a standard interactive and programming language for getting information from and updating a database. Although SQL is both an ANSI and an ISO standard, many database products support SQL with proprietary extensions to the standard language. Queries take the form of a command language that lets you select, insert, update, find out the location of data, and so forth.


Nesting queries l.jpg

Nesting Queries

  • SELECTattribute(s)

  • FROMrelation(S)

  • WHEREattr [not] {in | comparison operator | exists }

  • ( query statement(s) );

  • List names of students who are taking “BA201”

  • select Name

  • from Student

  • whereID in

  • ( selectID

  • fromREGISTRATION

  • where course#=‘BA201’);


Sub queries l.jpg

Sub Queries

List all students enrolled in CIS courses

selectname

fromSTUDENT

where studentnum in

(selectStudentId

fromREGISTRATION

wherecno like ‘CIS%’);

List all courses taken by Student (Id 1011)

selectcname

fromCOURSE

wherecnum _ any

(selectcno

fromREGISTRATION

whereStudentId = 1011);


Sub queries16 l.jpg

Sub Queries

Who received the highies grade in CIS 814

selectStudentId

fromGRADE

wherecnum = ‘CIS 814’ and

grade >=all

(selectgrade

fromGRADE

wherecno = ‘CIS 814’);

List all students enrolled in CIS courses.

selectname

fromSTUDENT S

whereexists

(select*

fromREGISTRATION

whereStudentId = S.Studentnum

and cno like “CIS%’);


Recursive queries l.jpg

Recursive Queries

List all employees who earn more than their

immediate supervisor.

selectE.Emp#, E.title, E.salary

fromEMPLOYEE, EMPLOYEE M

whereE.salary > M.salary and

E.ManagerEmp# = M.Wmp#;


Summaries and aggregates l.jpg

Summaries and Aggregates

Calculate the average GPA selectavg. (GPA)

from STUDENT,

Find the lowest GPAselectmin (GPA) as minGPA

from STUDENT,

How many CIS majors?selectcount (StudentId)

fromSTUDENT

wheremajor=‘CIS’;

Discarding duplicatesselectavg (distinct GPA)

fromSTUDENT

where major=‘CIS’


Aggregate functions l.jpg

Aggregate Functions

  • COUNT (attr)- a simple count of values in attr

  • SUM (attr)- sum of values in attr

  • AVG (attr)- average of values in attr

  • MAX (attr)- macimum value in attr

  • MIN (attr)- minimum value in attr

    • Take effect after data are retrieved from the database

    • Applied to either the entire resulting relation or groups

    • Can’t be involved in any query qualifications (where clause)

  • Would the following query be permitted?

  • selectStudentId

  • fromSTUDENT

  • whereGPA = max (GPA);


Missing or incomplete information l.jpg

Missing or Incomplete Information

  • List all students whose address or telephone number is missing:

  • select*

  • fromSTUDENT

  • whereAddress is null or GPA is null;

  • Truth Table T T T F F F U U U

  • T F U T F U T F U

  • ~ a

  • a & b

  • a or b


Groupin results obtained l.jpg

Groupin Results Obtained

  • Show all students enrolled in each course.

  • selectcno, stno

  • fromREGISTRATION

  • group by cno;Is this grouping OK?

    Calculate the average GPA of students by county.

  • selectcounty, avg (GPA) as County GPA

  • fromSTUDENT

  • group by county;

    Calculate the average GPA of each class.

  • selectcno, term, year, count (stno) as enrol

  • fromREGISTRATION

  • group by cno, year, term;


Selections on groups l.jpg

Selections on Groups

  • Show all CIS courses that are full.

  • selectcno, count (stno)

  • fromREGISTRATION

  • group by cno

  • havingcount (stno) > 29;


Union l.jpg

Union

List students who live in Atlanta or GPA > 3.0

selectID, Name, DOB, Address

fromSTUDENT

where Address = ‘Atlanta’

union

selectID, Name, DOB, Address

fromSTUDENT

where GPA > 3.0;

Can we perform a Union on any two Relations ?


Union compatibility l.jpg

Union Compatibility

Two relations, A and B, are union-compatible

  • if

    1) A and B contain a same number of attributes, and

    2) The corresponding attributes of the two have the same domains

  • Examples

  • CIS=Student (ID: Did; Name: Dname; Address: Daddr; Grade: Dgrade);

  • Senior-Student (SName: Dname; S#: Did; Home: Daddr; Grade: Dgrade);

  • Course (C#: Dnumber; Title: Dstr; Credits: Dnumber)

  • Are CIS-Student and Senior-Student union compatible?

  • Are CIS-Student and Couse union compatible?

  • What happens if we have duplicate tuples?

  • What will be the column names in the resulting Relation?


Union intersect minus l.jpg

B

A

Union, Intersect, Minus

selectCUSTNAME, ZIP

fromCUSTOMER

where STATE = ‘MA’

UNION

selectSUPNAME, ZIP

fromSUPPLIER

whereSTATE = ‘MA’

ORDER BY 2;

selectCUSTNAME, ZIP

fromCUSTOMER

where STATE = ‘MA’

INTERSECT

selectSUPNAME, ZIP

fromSUPPLIER

whereSTATE = ‘MA’

ORDER BY 2;

selectCUSTNAME, ZIP

fromCUSTOMER

where STATE = ‘MA’

MINUS

selectSUPNAME, ZIP

fromSUPPLIER

whereSTATE = ‘MA’

ORDER BY 2;

B

A

A

B

A


Union intersect minus26 l.jpg

Union, Intersect, Minus


Agenda27 l.jpg

SQL

Agenda

Queries

Join


Connecting linking relations l.jpg

Connecting/Linking Relations

  • List information about all students and the classes they are taking

Student

Class

What can we use to connect/link Relations?

Join: Connecting relations so that relevant tuples can be retrieved.


Slide29 l.jpg

R1

R2

Join

Cartesian

Product

Student: 30 tuples

Class: 4 tuples

Total Number of Tuples in the Cartesian Product. ?

(match each tuple of student to every tuple of class)

Select tuples having identical Student Ids.

Expected number of such Tuples: Join Selectivity


Join forms l.jpg

Join Forms

R1

R2

  • General Join Forms

    • Equijoin

    • Operator Dependent

  • Natural Join

  • Outer Join

    • Left

    • Right

    • Full

selects.*.c.*

fromSTUDENT s, CLASS c

wheres.ID = c. ID;

R1

R2

=

x > y

<>

...

selects.*.c.*

fromSTUDENT s, CLASS c

wheres.ID = c. ID (+);


Grouping results after join l.jpg

Grouping Results after Join

Calculate the average GPA of each class

selectcourse#, avg (GPA)

fromSTUDENT S, CLASS C

whereS.ID = C.ID

group by course#,


Index l.jpg

Index

  • There is no order among tuples

  • Indexes speed up data retrieval

  • Find all students who live in Atlanta.

  • How many tuples would you have to search for this query?

  • What if the table was ‘indexed’?

  • Index TableStudent


Creating and deleting indices l.jpg

Creating and Deleting Indices

CREATE [UNIQUE] INDEX index name

ON base-relation-name

( attr-name [order], attr-name[order] ...)

[CLUSTER];

create unique index student-id

on STUDENT ( ID ASC );

create index Address-index

on Student (Address);

create unique index Name-Age-Index

on STUDENT ( Name DESC, Age );

  • What are the advantages & disadvantages of Indexing?


Relational views l.jpg

View 1

View 2

View N

Base

Relation 1

Base

Relation 2

Relational Views

  • Relations derived from other relations.

  • Views have no stored tuples.

  • Are useful to provide multiple user views.


View creation l.jpg

View Creation

  • Create View view-name [ ( attr [ , attr ] ...) ]

  • AS subquery

  • [ with check option ] ;

  • DROP VIEW view-name;

    • Create a view containing the student ID, Name, Age and GPA for those who are qualified to take 300 level courses, i.e., GPA >=2.0.


View options l.jpg

View Options

  • With Check Option enforces the query condition for insertion or update

  • To enforce the GPA >=2.0 condition on all new student tuples inserted into the view

  • A view may be derived from multiple base relations

  • Create a view that includes student IDs, student names and their instructors’ names for all CIS 300 students.


View retrieval l.jpg

View Retrieval

Queries on views are the same as that on base relations.

Queries on views are expanded into queries on their base relations.

selectName, Instructor-Name

fromCIS300-Student

whereName = Instructor-Name;

?


View update l.jpg

View: Update

Update on a view actually changes its base relation(s)!

update Qualified-Student

setGPA = GPA-0.1

whereSID = ‘s3’;

insert intoQualified-Student

values( ‘s9’, ‘Lisa’, 4.0 )

insert intoQualified-Student

values( ‘s10’, ‘Peter’, 1.7 )

Why are some views not updateable?

What type of views are updateable?


  • Login