390 likes | 631 Views
Κεφάλαιο 5. SQL Data Definition Language και άλλες γλώσσες Σχεσιακών Βάσεων Δεδομένων ( QBE). SQL Data Definition. The Data Definition Language (DDL) of SQL is used to CREATE, DROP and ALTER the descriptions of the relations in the database CREATE TABLE DEPARTMENT
E N D
Κεφάλαιο 5 SQL Data Definition Language και άλλες γλώσσες Σχεσιακών Βάσεων Δεδομένων (QBE)
SQL Data Definition • The Data Definition Language (DDL) of SQL is used to CREATE, DROP and ALTER the descriptions of the relations in the database CREATE TABLE DEPARTMENT ( DNumber integer keymember 0 not null, DName varchar(12) keymember 1 not null, MgrSSN char(9) references EMPLOYEE.SSN, MgrSD char(9) ); • In several (older) SQL systems, there is no support for REFERENCES (foreign key) and KEYMEMBER (key)
SQL DDL - continued • The command: DROP TABLE DEPENDENT deletes the full table DEPENDENT and its definition (can no longer be used in queries, updates, etc.) • The command: ALTER TABLE EMPLOYEE ADD JOB VARCHAR(15) adds a new attribute, named JOB, in relation EMPLOYEE • All values for this new attribute will initially be NULL • Ôhey can be changed in each tuple with the UPDATE command.
Creating Indexes in SQL • In most cases, a base relation corresponds to a stored file • The CREATE INDEX command is used to specify an index which always has a name (index name) CREATE INDEX LN_INDEX on EMPLOYEE (Name); • To specify a key constraint on the indexing attribute (or combination of attributes) the keyword UNIQUE is used CREATE UNIQUE INDEX S_IND on EMPLOYEE (SSN); • To specify that an index is a clustering index, the keyword CLUSTER is used CREATE INDEX D_IN on EMPLOYEE(DNumber) CLUSTER; • In most DBMSs, a variation of the B+-tree is used as the implementation data structure for indexes
Relational Views and SQL • The three-level database architecture implies that different users see different external schemas - this is accomplished with the concept of VIEWin the relational model. RELATIONAL VIEWS External Schema 1 External Schema 2 External Schema N ..... RELATIONAL DATABASE SCHEMA (RELATIONS) Conceptual Schema Internal Schema
Relational Views - Definition • A VIEW is a relation which is not part of the conceptual schema (the base relations) but is made visible to the user as a virtual relation • A user is not aware if a relation is a view or a base relation • A view is not stored (materialized) in the database • The contents of the view are determined by its stored definition as a function of the contents of the stored database • A view is defined on base tables or other views via a query • Queries and updates on views are translated into queries and updates on their defining base relations • There are no limitations on querying a view • Only few updates on views are allowed • A view changes dynamically with the database
Relational Views - Format CREATE VIEW table_name [ ( view_column_list ) ] AS query_expression [ WITH CHECK OPTION ] • V1: Project workers CREATE VIEW PROJWORKER(EName, Address, Project) ASselect Name, Address, PName from EMPLOYEE, PROJECT, WORKS_ON where PROJECT.PNumber = WORKS_ON.PNumber and WORKS_ON.SSN=EMPLOYEE.SSN ;
Relational Views - Examples • When the view_column_list is missing, the view attributes are inherited from the attribute names in the defining query • V2: Employees that are well-paid CREATE VIEW BIGSHARKS AS select * from EMPLOYEE where Salary > 50000 and BirthDate > 31.12.65; • The user may ask questions directly on this new table called BIGSHARKS, which has exactly the same attribute names as the base relation EMPLOYEE
View Examples (2) • A view can have different units (attributes) than the conceptual schema • V3: Information data on departments (average salary, etc.) CREATE VIEW DEPTINFO (Name, NoOfEmpl, AvgSalary) ASselect DName, COUNT(*), AVG(Salary) from DEPARTMENT d, EMPLOYEE e where d.DNumber = e.DNumber group by DName ;
View Examples (3) • V4: Find the number of employees and average salary in the research department (also print the department name) select * from DEPTINFO where Name = “research” • V5: Increase by 1 the number of employees in the research department • Such an update is NOT allowed (why?)
View Execution • As said before, the system translates retrievals (updates) on views into retrievals (updates) on the base relation(s). HOW is this done? • NAIVE Effort 1: Materialize the view (temporary table) and then run the query on it. • NAIVE Effort 2: Maintain the view at all times -- that is, keep it materialized (called, a snapshot) and whenever relevant updates on base relations are made then propagate them on the view. Run the query on the snapshot. • CORRECT Effort:Modify the query by replacing the view with its defining query on base relations, and run the query on these base relations.
View Execution Example • For instance, given the query: select Name, Address from BIGSHARKS where DNumber = 5 we modify it accordingly and we execute the query: select Name, Address from EMPLOYEE where Salary > 50000 and BirthDate > 31.12.65 and DNumber = 5
Updating on Views • In general, a view has restrictions on updates • A view update is unambiguous if one update on the base relations can accomplish the desired update effect on the view • OBSERVATIONS: • A view with a single defining table is updatable if the view attributes contain the primary key • Views defined on multiple tables are generally not updatable • Views defined with aggregate functions are not updatable • View updating is still an open research area
VIEWS - Comments • The “WITH CHECK OPTION” is used with updatable views to deal with the vanishing rows problem For instance consider the view: CREATE VIEW V AS select * from R where A = “X” Now, assume the update: UPDATE V set A = “Y” The update will go on without a problem, but the rows (tuples) that were previously seen in V will disappear! (they no longer meet the WHERE-clause condition) • Views should be used whenever certain subqueries or expressions are used very frequently • Views are also used as a security and authorization mechanism
Integrity Constraints • The relational model allows for the specification of 6 types of integrity constraints: (a) entity integrity, (b) referential integrity, (c) primary key integrity, (d) column integrity, (e) domain integrity, and(f) user-defined. • Most commercial relational database systemsdo not enforce all integrity constraints (enforcing means that if an updateviolates a constraint, it isnot allowed to execute) • We have seen how entity, referential and key integrity are specified - they are theinherent to the model constraints ---The other three constraints areexplicitly specified. Modern DBMSs enforce many of these constraints, with notable example the system: Access 2.0 of Microsoft.
Relational Model Constraints • When enforcing (supporting) referential integrity, severalactions must be taken by the DBMS. These are usually thecascade delete and thecascade update. • Some DBMSs supportdirectlythe cascade options (Access) • Other DBMSs require from the user to writetriggers (procedures) in order to support the options (SQL Server) • Column and Domain constraints are partially supported (not always bystrong-typing or other powerful mechanisms) • User-defined constraints were introduced in the newest SQL standard (SQL-92) with ASSERTIONS (stand-alone rules)
Assertions -- Examples • Every employee is either male or female ASSERT gender ON EMPLOYEE: Sex=“m” OR Sex=“f” • Salaries have to be above 10000 ASSERT salary_bound ON EMPLOYEE: Salary > 10000 If someone attempts to insert a tuple in EMPLOYEE such as: insert into EMPLOYEE (select Name=“tom”,.., Salary=12000..) the update is modified from the assert rule to: insert into EMPLOYEE (select Name=“tom”,.., Salary=12000.. where 12000 > 10000 )
Embedding SQL in a Programming Language • DML commands are oftenembeddedintohost language programs (these commands amount to calls to the DBMS) • The embedded SQL statement is distinguished from the programming language statements with a special “prefix” • There are two ways to “embed” a DML: 1.- Extend the Programming Language (Compiler) For example, RIGEL, MODULA-R, Gemstone, Orion, etc.) These are called database programming languages 2.- Use a pre-processor for the DML statements The pre-processor replaces all DML commands with calls to the host language, which are thenexecuted (the usual case)
Example with C as Host • When SQL is embedded in C, all DML statements and all declarations of variables are preceded by a “##”. ## char ename[30] ## int salary; ... prog () { ## sql companydb ## select ename=Name, salary=Salary from EMPLOYEE ## where DNumber=5 ## { /* process returns values */ ## } }
Embedded SQL • SQL commands can be called from within a host language (e.g., C or COBOL) program. • SQL statements can refer to host variables(including special variables used to return status). • Must include a statement to connect to the right database. • Requires compiler preprocessing • SQL relations are (multi-) sets of records, with no a priori bound on the number of records. No such data structure in C. • SQL supports a mechanism called a cursor to handle this.
Cursors • Can declare a cursor on a relation or query statement (which generates a relation). • Can open a cursor, and repeatedly fetch a tuple then move the cursor, until all tuples have been retrieved. • Can use ORDER BY to control the order in which tuples are returned. • Fields in ORDER BY clause must also appear in SELECT clause. • Can also modify/delete tuple pointed to by a cursor.
Cursor that gets names of sailors who’ve reserved a red boat, in alphabetical order • DECLARE sinfo CURSOR FOR SELECT S.sname FROM Sailors S, Boats B, Reserves R WHERE S.sid=R.sid AND R.bid=B.bid AND B.color=‘red’ ORDER BY S.sname; FETCH 5 IN sinfo; • Note that it is illegal to replace S.sname by, say, S.sid in the ORDER BY clause! (Why?) • Can we add S.sid to the SELECT clause and replace S.sname by S.sid in the ORDER BY clause???
Embedding SQL in C -- An Example char SQLSTATE[6]; EXEC SQL BEGIN DECLARE SECTION char c_sname[20]; short c_minrating; float c_age; EXEC SQL END DECLARE SECTION c_minrating = random(); EXEC SQL DECLARE sinfo CURSOR FOR SELECT S.sname, S.age FROM Sailors S WHERE S.rating > :c_minrating ORDER BY S.sname; do { EXEC SQL FETCH sinfo INTO :c_sname, :c_age; printf(“%s is %d years old\n”, c_sname, c_age); } while (SQLSTATE != ‘02000’); EXEC SQL CLOSE sinfo;
Database APIs: alternative to embedding • Rather than modify compiler, add library with database calls (API) • special procedures/objects • passes SQL strings from language, presents result sets in a language-friendly way • Microsoft’s ODBC becoming C/C++ standard on Windows • Sun’s JDBC a Java equivalent • Supposedly DBMS-neutral • a “driver” traps the calls and translates them into DBMS-specific code • database can be across a network
SQL API in Java (JDBC) Connection con = // connect DriverManager.getConnection(url, ”login", ”pass"); Statement stmt = con.createStatement(); // set up stmt String query = "SELECT COF_NAME, PRICE FROM COFFEES"; ResultSet rs = stmt.executeQuery(query); try { // handle exceptions // loop through result tuples while (rs.next()) { String s = rs.getString("COF_NAME"); Float n = rs.getFloat("PRICE"); System.out.println(s + " " + n); } } catch(SQLException ex) { System.out.println(ex.getMessage () + ex.getSQLState () + ex.getErrorCode ()); }
Disadvantages of having Host Programming Languages • Mixing Procedural and Declarative languages(the famouslanguage mismatchproblem) • Differentpre-processors are required fordifferentlanguages • Relations are not treated as 1st class objects in the language (e.g., cannot pass a relation as a parameter to a procedure) • The host languagemay not support required constructs (e.g., FORTRAN does not support records) • The alternative, that isdatabase programming languagesare gaining ground, especially in object-oriented DBMSs
Shortcomings of SQL • SQL does not support strong typing, inheritance, etc. • SQL tables are NOT relations (they allow duplicate rows) • SQL tables can not be nested (be values of another table) • SQL does not support several relational operators, like: generalized restriction, division, forall • SQL does not support 3-V logic (at least, not correctly) • SQL does not support transition constraints • SQL does not provide support for declaring (and using) functional dependencies • SQL does not support integrity constraints on views DESPITE ALL THAT, SQL is the STANDARD LANGUAGE
Query-by-Example (QBE) • A Query Language developed at IBM (by Moshe Zloof) found in a product called QMF (interface option to DB2) • Easier to use than SQL for the average end-user (VISUAL and TWO-DIMENSIONAL) • IDEA: The system provides the user with askeleton of the relations in the database and the user fills-in the tables with examples of what the answer should look like • QBE1: Find the names of employees in department 4 EMPLOYEE
QBE Summary • PRINCIPLES • user does not have to remembernames of columns/relations • in query specification, no rigid rules apply • based on domainrelationalcalculus(columns are variables) • relationally complete specification of queries, but also support of transitiveclosure • HOW IT WORKS • Tokens with an underscore preceding “_”, are VARIABLES • Tokens with no underscore are CONSTANT VALUES (they signify an equalityselection-condition) • The prefix “P.” is used to indicate which attributes will be printed (it signifies a projection)
QBE Summary - The Process • QUERY PROCESS • the user first chooses the tables needed to specify the query • templates of the chosen tables are displayed • user moves to the appropriate columns using special keys • example values (variables), constants, etc. are typed-in • other comparison operators than equality (automatic with a constant value) must be typed-in (like, >, <, etc.) • more complex conditions are entered in a condition box • conditions in the same row, specify the Boolean AND • conditions in distinct rows, specify the Boolean OR • negation (Boolean NOT) is specified by the symbol “ Ø“ • JOINS are expressed by using common example values in multiple tables
QBE -- Examples • QBE2: Find the names of employees who work in a department with a manager different than 3334 EMPLOYEE DEPARTMENT
QBE -- Examples (2) • QBE3: Find names of employees who earn more than 30000 AND work in department 5 • QBE4: Find names of employees who earn more than 30000 OR work in department 5
QBE -- Examples (3) • QBE5: Find names and addresses of employees in department 4 who earn more than someone whoworks in department 5 EMPLOYEE CONDITION BOX __10 > __20
QBE -- Examples (4) • QBE6: Find names of employees who earn more than their department manager EMPLOYEE DEPARTMENT CONDITION BOX __10>__15
QBE -- Examples (5) • QBE7: Find all DEPENDENT tuples for employees who have a dependent born in 1972 while no other dependents were born before 1975 DEPENDENT P. Ø
QBE -- Examples (6) • QBE8: Find the total salary of all the employees EMPLOYEE • QBE9: Insert a new employee EMPLOYEE I.
QBE -- Examples (7) • QBE10: Give a 15% salary increase to all employees in department 5 EMPLOYEE U.
Summary • QBE is an elegant, user-friendly query language based on DRC. • It is quite expressive (relationally complete, if the update features are taken into account). • Simple queries are especially easy to write in QBE, and there is a minimum of syntax to learn. • Has influenced the graphical query facilities offered in many products, including Borland’s Paradox and Microsoft’s Access.