1 / 82

Multidatabase manipulations Part 2

Multidatabase manipulations Part 2. Witold Litwin. http://ceria.dauphine.fr/witold.html. Multidatabase manipulations (Kandinsky: Ligne avec Accompagnement, 1937 ). Multidatabase manipulations. MSQL (Litwin, Abdellatif, Nicolas, Zeroual, 1989 L. Suardi, M. Rusinkiewicz, 1992).

laban
Download Presentation

Multidatabase manipulations Part 2

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Multidatabase manipulations Part 2 Witold Litwin http://ceria.dauphine.fr/witold.html

  2. Multidatabase manipulations(Kandinsky: Ligne avec Accompagnement, 1937 )

  3. Multidatabase manipulations

  4. MSQL(Litwin, Abdellatif, Nicolas, Zeroual, 1989L. Suardi, M. Rusinkiewicz, 1992) • An extension to SQL • Contains by definition every SQL-x • Allows for non-procedural multidatabase base manipulations •  MSQL queries impossible to formulate in SQL • An MSQL query may replace several SQL queries • Developed in 1986-89 • INRIA, projet B A BA, • initialement sous projet du projet-pilote SIRIUS (J. Le Bihan, puis W. Litwin) • Dr. Thesis of MM. Abdellatif, Nicolas, Zeroual • Compiler implemented at Houston University • Team od Prof. M. Rusinkiewicz, 1990-1993

  5. MSQL(Litwin, Abdellatif, Nicolas, Zeroual, 1989L. Suardi, M. Rusinkiewicz, 1992) • Research vehicle for functions for the MBD environment • to address relations in different databases • to manipulate semantically heterogeneous data • to create MDB views • to transfer data (and schemas) between DBs • to define MDB dependencies • Present to limited extent in most of commercial DBMSs &DAMSs

  6. MSQL(Basic new properties) • SQL Query • Uses 1st order predicate calculus • Is compiled for optimization into the relational algebra • Result is a table • MSQL Query • May use higher-order predicate calculus • Is compiled for optimization into the multirelational algebra • Result is a multitable • A set of relations (tables) • May be constituted from one or no tables

  7. MSQL(More on functions specific to MDB env.) • Addressing of tables in different DBs • Implicitly or by qualification by (multi)database names • Introduced around 1985 by relational multibase system prototype MRDSM • B A BA project at INRIA • Unknown at that time of any relational language • See the overview of relational DBMSs existing in 1987 (M. Brodie)

  8. MSQL(More on functions specific to MDB env.) • Manipulation of semantically heterogeneous data • Multiple Queries • With multiples identifiers • With semantic variables • Ranging over data names • Scale and Precision • Units of measure • Implicit joins • Capabilities still unknown of SQL • Capabilities known at present to some dialects • Limited with respect to MSQL

  9. MSQL : example View View SIL = Internal Logical Schema

  10. Conceptual Schemas (the multischema) DB bnp : br (br#, brname, street, street#, city, zipcode, tel)account (acc#, cl#, balance, br#)client (cl#, clname, cltel, cltype, street, street#, city, zipcode)spe-acc (acc#, br#, cl#, balance, curr) DB sg : branch (bra#, braname, street, s#, town, zip, t#, class)acc (acc#, bra#, c#, balance)client (c#, cname, ct#, ctype, street, s#, town, zip) DB cic : br (br#, brname, street, street#, city, zipcode, tel)account (ac#, br#, cl#, balance, open_date)client(cl#, clname, cltel, cltype, street, street#, city, zipcode)

  11. Semantic Heterogeneity In Banks • Same names can designate different data • Different names can designate same data • same client, same town.. • The value of a primary key is valid only in one DB • how to identify same client in diff. banks ?

  12. MSQL Commands • CREATE TABLE CREATE DATABASE • CREATE MULTIDATABASE CREATE VIEW • ALTER TABLE ALTER VIEW • ALTER MULTIDATABASE • DROP TABLE DROP DATABASE • DROP MULTIDATABASE DROP VIEW

  13. MSQLCREATE DATABASE • > MSQL • CREATE DATABASE boulogne ; • CREATE DB |.com.org.user.boulogne ; • CREATE MULTIDATABASE Banks (bnp cic sg ); • USE Banks; • CREATE DATABASE boulogne FROM bnp ; Query scope

  14. MSQLCREATE MULTIDATABASE • MSQL • CREATE MDB EC-Banks (f-banks-i-banks, s-banks, g-banks, e-banks ); CREATE MULTIDATABASE can create : • flat MDBs (only contain DBs) • nested MDBs (DBs or MDBs) • can be potentially any network of DBs or MDBs • like through the links on the WEB • what about cycles ?

  15. MSQLCREATE TABLE Import use banks ; CREATE TABLE boulogne.loan FROM bnp.loan ; CREATE TABLE fake_checks (Chq# INT, Montant_Euro CURRENCY [EURO] .... ); One has created four (empty) tables :bnp. fake_checks , cic. fake_checks ... boulogne.fake_checks CREATE TABLE boulogne.client (c#, cn, ct#) FROM bnp.client (cl#, clname, cltel) PRIMARY KEY (c#) (cn, ct#) OUTER REFERENCES (clname, cltel); Unit of mesure

  16. MSQLCREATE TABLE with References USE AuPrintemps /* MDB AuPrintemps CREATE TABLE MusicDep.Inventory …. FOREIGN KEY (Item#) REFERENCES Central.Stock(I#); No unauthorized Item# in the inventory of the Music Department • Other options • PRIMARY KEY (…) REFERENCES T(…) ; • [T1(A)] [LEFT|RIGHT] REFERENCES T2(B) ; • Generates implicit equijoin, or left or right implicit outerjoins when a query selects attributes A and B.

  17. MSQLALTER MULTIDATABASE use banks ; alter banks include vernesremove cic • Alter MDB can create • flat MDBs (only contain DBs) • nested MDBs

  18. MSQL Elementary queries Prefixing with DB names was unknown to SQL - and is in DB2 SQL since last year only

  19. MSQL Default DB Tables of the default database are not prefixed

  20. MSQL Elementary queries without prefixed names Table names are unique within the query scope

  21. Updates USE (bnp b) sg ; UPDATE account SET account.balance = account.balance + 500 WHERE account.balance > acc.balance AND b.client.clname = sg.client.cname AND b.client.street = sg.client.street ; • What does it mean ?

  22. Multiples Queries

  23. Multiple Queries

  24. Results (a multitable)

  25. Multiple Updates Begin Use BanksUpdate cl*set street = 'Charles de Gaulle"where street = 'Etoile' If SQLCODE <> 0 then Rollback ; Commit Use Banks vital cicUpdate cl*set street = 'Charles de Gaulle"where street = 'Etoile' • MSQL transaction semantics is more general than ACID • may include COMP (compensation) statement, list of accept. states....

  26. Semantic Variables in MSQL use bnp sglet x be town cityselect * from b%where x = 'Paris' and street = 'r. de Rivoli'

  27. Semantic Variables in MSQL use bnp sglet x be town cityselect * from b%where x = 'Paris' and street = 'r. de Rivoli' use bnp select * from br where town = 'Paris' and street = 'r. de Rivoli' use sg select * from branch where city = 'Paris' and street = 'r. de Rivoli'

  28. Semantic Variables in MSQL use bnp sglet x be town cityselect * from b%where x = 'Paris' and street = 'r. de Rivoli' Alternatively: use bnp sglet x be to% cityselect * from b%where x = 'Paris' and street = 'r. de Rivoli'

  29. Semantic Variables in MSQL use banks let X be banks.* select a%, balance, c%namefrom X.a% a, X.c% c where a. a%= c. c% • The query illustrates the multitable pair-wise join • Semantic variable a over relation name account is not necessary, but simplifies the typing of the query

  30. banks FD Semantic Variables in MSQL • Semantic variables can be compound and with values selected by queries from some dictionaries use bankslet (x, y) be :select X.attr Y.attr from FD X, FD Y where X.mean = tel and Y.mean = city select * from clientwhere x = '123' and y = 'Paris' FD mean attr tel t# tel tel city city city town city burgh

  31. Semantic Variables in MSQL • Can be applied to MSQL DD statements use banks create database cic2 ; let x be a% b% c% create table cic2.x from cic.x ; • Copies cic schema except for one table

  32. Name homogenizationThe labels USE Banks ; LET t BE tel t#SELECT %name branch_name, t tel#, s%# street#FROM br% brWHERE street = ‘Champs Elysées’ ; The result : multitable: {( bnp.br.branch_name, bnp.br.tel#, bnp.br.street# ), ( sg.br.branch_name, sg.br.tel#, sg.br.street# )( cic.br. branch_name, cic.br.tel#, cic.br.street# )}

  33. Multidatabase Views bnp my_bank sg A partial view of DBs bnp and sg in DB my_bank The views in my_bank can be considered Import Schemes

  34. Multidatabase Union Views Use Banks Create View bnp.all-banks as Use banks let x be town city let y be banks.* Select y.br% ( y, br#, br%name branch, street, street#, x city, zip% zip, t% tel) Union * • Union * unions all the tables of the selected multitable • It scales to all the tables named br% of Banks, if new banks enters the MDB Banks in the future • Current DBMS, e.g., SQL Server, require to alter the union view definition in such a case

  35. Key words and Aggregate Functions in MSQL • Key words and Aggregate Functions of SQL • par definition • DISTINCT, GROUP BY, ORDER BY • COUNT, AVG, SUM… • operate at each table of a multitable • Their extensions to multitables • MDISTINCT, MCOUNT, MGROUP BY, MORDER BYMAVG, MSUM... • operate at whole multitable • important for warehousing

  36. Example USE BanksSELECT COUNT (*)FROM br% brWHERE street = 'champs elysées' ;

  37. Example USE BanksSELECT COUNT (*)FROM br% brWHERE street = 'champs elysées' ; bnp.br2 cic.br2 sg.br2

  38. Example USE BanksSELECT MCOUNT (*)FROM br% brWHERE street = 'champs elysées' ;

  39. Example USE BanksSELECT MCOUNT (*)FROM br% brWHERE street = 'champs elysées' ; br6 • Exercises in warehousing : • -Average balance per client in each bank • Average balance per client in BANKS • Sum of client assets per bank • Sum of client assets in BANKS

  40. Aggregate Functions IMPLEMENTATION ISSUES • All-in-one (traditional computation) • Possibly in parallel • The calculus can take long time. • Successive approximations • Some kind of sampling • result1, from any 1st DB to come • (result1 + result2) / 2 • … • sampling within each database • several ACM-Sigmod & VLDB papers dealt with query evaluation using sampling • Precomputing • Incremental evaluation using interdatabase dependencies • Common to warehousing

  41. Aggregate Functions MERGE ON • form a single tuple from all the tuples of the same objet in the multitable • Uses outer jointures • Find millionaires in Banks and form the tuple for each millionaire USE Banks ;LET x.y BE clname.cltel cname.ct#LET z BE Banks.*SELECT *FROM z.a% WHERE z.a%.c%# = z.client.c%#AND z.a%.balance > 1 000 000MERGE ON x y ;

  42. Aggregate Functions MERGE ON nulls nulls nulls • USE Banks ;LET x.y BE clname.cltel cname.ct#LET z BE Banks.*SELECT *FROM z.a% WHERE z.a%.c%# = z.client.c%#AND z.a%.balance > 1 000 000MERGE ON x y ;

  43. Aggregate Functions NAME • Transform a name (table, attribute..) into attribute value USE Banks ;LET x.y BE br.city branch.townSELECT %name branch_name, NAME (.x) bankFROM xWHERE y = 'Nice' UNION * ; Note: Union * unions all the tables of the selected multitable • the result is the table :branch_name bank Jaures CIC DeGaulle BNP

  44. Aggregate Functions CHOOSE • Chooses at most n tuples among the selected ones • the 1st found as does the function TOP (default) in any or some order, specified by ORDER BY (default) • strictly random (RND) • these that were not chosen by the previous execution of the query in the same transaction (NEW) • preferably in the DBs listed, and in the listed order • at most j per DB • selecting at most m tuples sharing the values of the attr. in the list A, supposed global key of some objet. CHOOSE (n, (m, <A>), [<B>]| j, [<B>], [RND | NEW] <A> ::= <list of attr.> <B> ::= <list of DBs>

  45. Aggregate Functions CHOOSE • Choose a millionaire randomly USE Banks ;SELECT c.*FROM c% c, a% aWHERE c.c%# = a.c%# AND a.ba% > 1.000.000CHOOSE (1) RND ; • Function very important in MBD environment • information overload

  46. Aggregate Functions TIMEOUT • Fix time limit of a query • the system should possibly deliver all the relevant tuples • however, any query arriving to timeout is considered executed successfully TIMEOUT (t [unit]) ;<unit> := ms | s | m | h | ds - seconds (default) USE BanksSELECT * FROM br%WHERE street = 'champs elysées' TIMEOUT (10) ;

  47. Aggregate Functions POST • Make a query continuous • One manipulates each tuple found during the life time of the query • Even those created after the query start • TIMEOUT may be used to limit the life time USE Immo LaCentrale Orpi ;SELECT * FROM logem%WHERE prix < 1,000,000 AND Ville = 'Paris' POST ;

  48. Aggregate Functions ESTIMATE • Compute the cost of a query before the execution and can start the execution after an authorizationESTIMATE (type, price, time, count, size, report) [WITH EXEC_PROMPT] • type of estimate : • exact (can be long to compute) • approximate • price of the query (in $, FF...). • completion time • number of tuples • size of the resultant, in bytes • report on the estimate itself • precision...

  49. Privileges in MSQL USE bnp sg cic ; GRANT SELECT ON client TO Nicolas Abdellatif ; client is a multitable : client = (bnp.client, sg.client, cic.client) GRANT ALL ON etoile.account TO Nicolas Abdellatif FROM bnp.account ; GRANT ALL ON etoile.account TO Nicolas FROM Zeroual ON bnp.account ;

  50. Interdatabase Queries • Transfer data between DBs • Source and target are multitables INSERT...

More Related