This document delves into the interplay between XML and views in databases, highlighting their significance in structured and semistructured data. Emphasizing the utility of XML in e-commerce and data exchange, it draws connections to existing work in views, highlighting topics like query optimization and incomplete information. With a focus on practical applications, the text provides insights into both current technologies and future research directions. It aims to motivate new explorations into XML views while recycling foundational concepts from earlier works.
On Views and XML Serge Abiteboul INRIA PODS 1999
Organization
• Introduction
• XML View := Query
  +:= Change Control
  +:= Objects
  +:= Structured & Semistructured Data
  +:= Active Features
  +:= Incomplete Information
  +:= more...
Many Facets!
Views and XML - Serge Abiteboul
Warning
• This is not a survey on database views
• This is not a tutorial on XML
• This uses XML and e-commerce as excuses to survey some work on views, cast in a fashionable context: O2Views, views of OEM, ActiveViews, Lorel/Ozone... (and also to motivate future work)
Executive Summary: Database folks should be interested in XML views, and more and more are.
Footnote: this is a great way to recycle your old results on views, incomplete information, deductive databases, universal instance assumption, dependency theory, etc.
Introduction: XML in short
• Document mark-up language; descendant of SGML
• Standard for data exchange on the Web
• We are interested here in data exchange and not in document editing and retrieval
EXAMPLE: EDI (Electronic Data Interchange)
• Standard for business data exchange
• 2 standards:
  • ANSI X12 in the US: all B2G by end of 1999
  • EDIFACT worldwide: UN committee
[Figure labels: translate, EDI, transmit]
<!DOCTYPE Book-Order PUBLIC "-//Editor//DTD Book Order Message//EN">
<Book-Order Supplier="4012345000094" Send-to="http://www.bic.org/order.in">
  <title>Editor Lite-EDI Book Ordering</title>
  <Order-No>967634</Order-No>
  <Message-Date>19961002</Message-Date>
  <Buyer-EAN>5412345000176</Buyer-EAN>
  <Order-Line Reference-No="0528837">
    <ISBN>0316907235</ISBN>
    <Author-Title>Labaln, Brian/Chrome</Author-Title>
    <Quantity>2</Quantity>
  </Order-Line>
  <Order-Line Reference-No="0528838">
    <ISBN>0856674427</ISBN>
    <Author-Title>Parry, Linda (ed)/William Morris</Author-Title>
    <Quantity>1</Quantity>
  </Order-Line>
  <input type="checkbox" name="partial" value="allowed"/>
  <text>Tick here if a delayed/partial supply of order is acceptable</text>
  <input type="checkbox" name="confirmation" value="requested"/>
  <text>Tick here if Confirmation of Acceptance of Order is to be returned by e-mail</text>
  <input type="checkbox" name="DeliveryNote" value="required"/>
  <text>Tick here if e-mail Delivery Note is required to confirm details of delivery</text>
  <E-Address>E-mail address: <input name="e-address" size="25"></input></E-Address>
  <Language>Please respond in:
    <select name="response-language">
      <option value="EN" selected="selected">English</option>
      <option value="FR">Français</option>
      <option value="DE">Deutsch</option>
      <option value="ES">Espagnol</option>
      <option value="IT">Italian</option>
    </select>
  </Language>
  <input type="submit" value="Press here to send completed form to supplier"/>
</Book-Order>
data in XML/EDI
I personally prefer:
XML
• Some noise and confusion
• Is the syntax important? No
• What is XML?
  • the means to exchange tree/graph data on the Web
  • an object-oriented API for it
  • more
A (simplified) model for XML
XML-tree :- list(node)
node :- string | element | ref node
element :- label list(att : string) list(node)
• label :- string
• att :- string
• an attribute occurs at most once
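The grammar above can be written down as a small data structure. The following is a minimal sketch in Python, not part of the original slides; the class names `Element` and `Ref` and the helper `labels` are illustrative assumptions. A dictionary is used for attributes so that "an attribute occurs at most once" holds by construction.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union


@dataclass
class Ref:
    """The 'ref node' alternative: a pointer to another node."""
    target: "Node"


@dataclass
class Element:
    """element :- label list(att : string) list(node)"""
    label: str
    # A dict keeps each attribute name unique (hypothetical encoding choice).
    attributes: Dict[str, str] = field(default_factory=dict)
    children: List["Node"] = field(default_factory=list)


# node :- string | element | ref node
Node = Union[str, Element, Ref]


def labels(node):
    """Yield element labels in document order; strings and refs yield nothing."""
    if isinstance(node, Element):
        yield node.label
        for child in node.children:
            yield from labels(child)


# An XML-tree is a list of nodes; here it holds one 'person' element.
city = Element("city", children=["Le Chesnay"])
person = Element("person", children=[
    Element("name", children=["Serge Abiteboul"]),
    "PODS invited speaker",
    Element("a", {"xml:link": "simple", "href": "gif/serge.gif"}, ["old picture"]),
    Element("address", children=[city, Element("zip", children=["92310"])]),
    Ref(city),  # a reference is not traversed by labels()
])
xml_tree = [person]
```

Not following references in `labels` is a deliberate simplification: it keeps the traversal finite even when references form cycles.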
XML in short
<person>
  <name>Serge Abiteboul</name>PODS invited speaker
  <a xml:link="simple" href="gif/serge.gif">old picture</a>
  <address><city>Le Chesnay</city><zip>92310</zip></address>
  <a xml:link="simple" href="www-rocq.inria.fr/~abitebou">Web</a>
</person>
• DTD: grammar
• DCD: some typing
• DOM: object API
• RDF: metadata
• XPOINTER/XLINK ...
XML Views
[Architecture figure. Labels: query, publish & subscribe, crawler & filter engine, security manager, request broker, business intelligence, output/report/delivery, data warehouse, OLAP, view server, image/video, reports, information repository, Web browsers]
View := Query
What databases can bring to XML is query optimization and query rewriting
View = Query
• like for relational model
• use of query optimization techniques
• use of query rewriting techniques
• processing queries using views
• main issue: virtual vs. materialized
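The virtual vs. materialized distinction can be illustrated with a toy example. This is a sketch under stated assumptions, not the author's system: the catalog rows, the query `sound_products`, and the two classes are hypothetical. A virtual view re-runs its query on every read; a materialized view stores the result and is stale until refreshed.

```python
catalog = [
    {"name": "Gismo223", "subcategory": "sound", "price": 1200},
    {"name": "Gismo78", "subcategory": "video", "price": 234},
]


def sound_products(source):
    """The view definition: a query over the source data."""
    return [(p["name"], p["price"]) for p in source if p["subcategory"] == "sound"]


class VirtualView:
    """Stores only the query; every read evaluates it against the source."""

    def __init__(self, query, source):
        self.query, self.source = query, source

    def read(self):
        return self.query(self.source)


class MaterializedView:
    """Stores the query result; reads are cheap but may be out of date."""

    def __init__(self, query, source):
        self.query, self.source = query, source
        self.rows = query(source)

    def read(self):
        return self.rows

    def refresh(self):
        # Full recomputation; an incremental scheme would apply deltas instead.
        self.rows = self.query(self.source)


virtual = VirtualView(sound_products, catalog)
materialized = MaterializedView(sound_products, catalog)
catalog[0]["price"] = 1100  # the source changes after both views were created
```

After the price change, `virtual.read()` reflects the new price immediately, while `materialized.read()` still returns the old one until `refresh()` is called.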
B2C: Comparative Shopping
http://www.addall.com
• 24 bookstores searched in about 10 seconds
• between $42 and $78
• that’s why people will use them!
View +:= Change Control
What databases can bring to XML is the control of changes
Some of the most studied problems for relational views concern update propagation:
• incremental updates
• view update problem
D2V: Incremental Updates
• a customer has loaded portions of the catalog
• some prices change
• no need to reload the entire catalog
• many such examples on the Web
• updates
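A minimal sketch of the idea, assuming the source can ship a list of price deltas rather than the whole catalog. The function name `apply_deltas`, the product codes, and the `(code, new_price)` delta format are illustrative assumptions.

```python
def apply_deltas(local_copy, deltas):
    """Apply price changes to the portion of the catalog the customer loaded.

    Deltas for products that were never loaded are ignored.
    Returns the number of deltas actually applied.
    """
    applied = 0
    for code, new_price in deltas:
        if code in local_copy:
            local_copy[code] = new_price
            applied += 1
    return applied


# The customer loaded only two products (hypothetical codes and prices).
local_copy = {"F2GHYYRF": 1200, "A1": 50}
# The source reports two price changes; only one concerns loaded data.
deltas = [("F2GHYYRF", 1100), ("ZZ9", 10)]
applied_count = apply_deltas(local_copy, deltas)
```

The cost is proportional to the number of changes, not to the size of the catalog, which is the point of the slide.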
V2D: View Update
• Sometimes considered less of an issue: the Web is read only!
• Many Web applications involve updates
• We may be able to annotate the products of the catalog
  • some of the data is in read mode
  • some data is not visible (this is only a view!)
  • some data may be updated
Example: Change Detection
A customer (self) is in a department (self.department) and may want to see only the current promotions of products in this department (MyPromotions)
let MyPromotions be
select I.*
from I in Catalog.promotions.item
where I.department = self.department
Query Subscription: Changes [from Chawathe’s thesis]
Changes in labeled graphs, as in DOEM
[Figure: labeled graph of a Catalog with change annotations. Recoverable labels and values: name, Gismos78, item, promotion, department, electronic, price, £234, £278, description, super sale, self, 99/02/01, 01/05/03]
Query Subscription: Changes
• Change of the value of an atomic vertex
• Creation of a new vertex
• Addition/removal of an edge
• Change of the label on an edge: add/remove
• Move of a vertex: add/remove
• Annotations on edges and vertices
Query Subscription: Queries
select P.code, P.description
from P in Catalog.product
where P.price <changed>Q                      (vertex annotation)
where P.<added>description                    (edge annotation)
where P.price <changed <old=Q’, date T>>Q
  and Q - Q’ > 100 and T > “99/04/03”         (data in annotation)
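The last form of the query, which reads data stored in the annotation, can be mimicked over plain Python objects. This is a sketch under assumptions: the `Changed` record, the sample products, and the function `big_recent_increases` are hypothetical, and real dates replace the two-digit-year strings so that comparisons behave correctly.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Changed:
    """Stand-in for a <changed old=Q', date T> annotation on a price vertex."""
    old: int    # Q', the value before the change
    new: int    # Q, the value after the change
    when: date  # T, the date of the change


# Hypothetical catalog; each product keeps the change annotations of its price.
products = [
    {"code": "F2GHYYRF", "description": "Gismo223",
     "price_changes": [Changed(1000, 1200, date(1999, 5, 1))]},
    {"code": "A1", "description": "cable",
     "price_changes": [Changed(50, 60, date(1999, 5, 2))]},
    {"code": "B2", "description": "amplifier",
     "price_changes": [Changed(500, 900, date(1999, 1, 15))]},
]


def big_recent_increases(items, min_delta=100, after=date(1999, 4, 3)):
    """select P.code, P.description ... where Q - Q' > min_delta and T > after."""
    return [
        (p["code"], p["description"])
        for p in items
        if any(c.new - c.old > min_delta and c.when > after
               for c in p["price_changes"])
    ]


result = big_recent_increases(products)
```

Only the first product qualifies: the second changed by too little and the third changed before the cutoff date.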
Query Subscription: Examples
• On the first of each month, send me the list of all products in my interest list such that their price increased by more than 10%
• Each time there are ten new employees, send me their names and departments
• Notify me if the price of this house decreases
• Similar to active rules: on event, when condition, do action
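The "ten new employees" subscription fits the on event / when condition / do action pattern directly. The sketch below is illustrative only; the class `BatchSubscription` and its method names are assumptions, not an API from the systems cited in the talk.

```python
class BatchSubscription:
    """on: a new employee is added; when: batch_size are pending; do: action(batch)."""

    def __init__(self, batch_size, action):
        self.batch_size = batch_size
        self.action = action
        self.pending = []

    def on_new_employee(self, name, department):
        # Event: record the new employee.
        self.pending.append((name, department))
        # Condition: enough new employees have accumulated.
        if len(self.pending) >= self.batch_size:
            batch, self.pending = self.pending, []
            # Action: deliver names and departments to the subscriber.
            self.action(batch)


sent = []  # stands in for "send me"; each entry is one delivered batch
subscription = BatchSubscription(10, sent.append)
for i in range(25):
    subscription.on_new_employee(f"emp{i}", "sales")
```

With 25 insertions, two batches of ten are delivered and five employees remain pending until the next batch fills.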
XML +:= World of Objects
The underlying model for XML is object-based, and XML views should be based on OO(DB) technology
Views +:= World of objects
• API for XML: Document Object Model (DOM)
• Views XML as object-oriented
• Allows designing C++ or Java applications
• E.g.:
  • use subclass Promotion of XMLNode
  • Catalog.promotions is only a set of virtual elements
  • the list of promotions is generated on demand based on the nature of customers
Views in OODB: O2Views
• Virtual values
  • like for relational views
  • entirely virtual XML document, e.g., view of relational data
  • virtual attributes, e.g., product: code, name, price,… alternatives = the set of products that are “similar” and are on promotion
Views in OODB: O2Views
• Virtual class: a set of database objects that are grouped together and as such acquire a new interface
  • catalog1/DTD1,…,catalog17/DTD17
  • products are represented differently in each catalog
  • a single DTD through which all products can be viewed
  • each product can be “viewed” with that DTD
Views in OODB: O2Views
• Imaginary class: groups objects that are all virtual, e.g., join of two relations
• For more: see Souza’s thesis
XML data/views +:= semistructured + structured data
XML should also allow the exchange of structured data as in relational/ODMG models
Semistructured + Structured Data
• If we know about the structure of data, not using it may damage performance
• The use of structure facilitates the programming of applications, e.g., in Java
• Structure may be useful to explain data to users
• For more: see Lahiri’s thesis [and Ozone = OQL + Lorel]
Web catalog - continued
• Product-basic (all products; regular data): category=electronic, subcategory=sound, name=Gismo223, code=F2GHYYRF, selling-price=1200FF
• Product-specific (for Gismos only; semistructured data): voltage=list(110,220), Gismo-norm=GHTF333
• External resources (external data): description=http://m.ec.fr/cat/Gismo, reviews=http://reviews.com/Gismo
• Private data (other regular data): buying-price=100$, quantity-in-stock=20000, supplier=Sears, authorized-discount=30%
This data in XML
<product>
  <basic>
    <cat>electronic<subcat>sound</subcat></cat>
    <n>Gismo223</n><c>F2GHYYRF</c>
    <sp currency="French-franc">1200</sp>
  </basic>
  <specific>
    <v>110</v><v>220</v>
    <Gismo-norm>GHTF333</Gismo-norm>
  </specific>
  <external> … </external>
  <private>
    <bp currency="dollar">100</bp>
    <qis>20000</qis><s>Sears</s>
    <ad>30</ad>
  </private>
</product>
What is such data exactly?
• A mix of structured and semistructured data, with pointers between the two worlds
• Purely XML. Then:
  • use a relation as a materialized view: Product(name, code, category, subcategory, price, rest)
  • index on name and subcategory
select P.name, P.price
from P in Product
where P.subcategory = “sound”
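One way to picture the "relation as a materialized view" option is to extract the regular part of each product into a relational table and leave the rest as XML text. The sketch below uses Python's standard `sqlite3` and `xml.etree.ElementTree` modules; the table layout follows the slide, while the sample document, the function `to_row`, and the choice of two separate indexes are assumptions made for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

# A well-formed rendering of the product fragment (illustrative assumption).
PRODUCT_XML = """
<product>
  <basic>
    <cat>electronic<subcat>sound</subcat></cat>
    <n>Gismo223</n><c>F2GHYYRF</c>
    <sp currency="French-franc">1200</sp>
  </basic>
  <specific><v>110</v><v>220</v><Gismo-norm>GHTF333</Gismo-norm></specific>
</product>
"""


def to_row(product):
    """Map the regular part to columns; keep the irregular part as XML in 'rest'."""
    basic = product.find("basic")
    cat = basic.find("cat")
    rest = ET.tostring(product.find("specific"), encoding="unicode").strip()
    return (
        basic.findtext("n").strip(),
        basic.findtext("c").strip(),
        (cat.text or "").strip(),
        cat.findtext("subcat").strip(),
        int(basic.findtext("sp")),
        rest,
    )


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE Product "
    "(name TEXT, code TEXT, category TEXT, subcategory TEXT, price INTEGER, rest TEXT)"
)
# "Index on name and subcategory": rendered here as two single-column indexes.
db.execute("CREATE INDEX product_name ON Product(name)")
db.execute("CREATE INDEX product_subcategory ON Product(subcategory)")
db.execute("INSERT INTO Product VALUES (?, ?, ?, ?, ?, ?)",
           to_row(ET.fromstring(PRODUCT_XML)))

rows = db.execute(
    "SELECT name, price FROM Product WHERE subcategory = ?", ("sound",)
).fetchall()
```

The query on `subcategory` can then use the relational index, while the `rest` column still carries the semistructured portion for applications that need it.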
Digression: storage of XML
• as blobs
• generic mapping: ignore the structure
• specific mapping
  • relational
  • object
• hybrid
As blobs
<product><basic>
<cat>electronic<subcat>sound</subcat></cat>
<n>Gismo223</n><c>F2GHYYRF</c>
<sp currency="French-franc">1200</sp>
</basic><specific><v>110</v><v>220</v>
<Gismo-norm>GHTF333</Gismo-norm></specific>
<external> … </external>
<private><bp currency="dollar">100</bp>
<qis>20000</qis><s>Sears</s>
<ad>30</ad></private></product>
+ full-text index
Generic mapping
Element graph (source, label, target):
  root  product  o1
  o1    basic    o2
  o2    cat      o3
  o2    subcat   o4
  o2    n        o5
  o2    c        o6
  o2    sp       o7
  ...
Atomic objects (oid, value):
  o3  electronic
  o4  sound
  o5  Gismo223
  o6  F2GHYYRF
  o7  1200
  ...
Attributes (oid, name, value):
  o7   currency  French-franc
  o12  currency  dollar
  ...
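The generic mapping can be sketched as a function that shreds any XML document into three relations, without looking at a DTD. This is an illustration, not the storage scheme of a particular system: the function `shred` and the oid numbering are assumptions, the sample input lists `cat` and `subcat` as siblings as in the table, and mixed content (text following a child element) is ignored for brevity.

```python
import itertools
import xml.etree.ElementTree as ET


def shred(xml_text):
    """Return (edges, atoms, attrs) for a document, ignoring its specific structure.

    edges: (source oid, label, target oid)
    atoms: (oid, text value)
    attrs: (oid, attribute name, attribute value)
    """
    counter = itertools.count(1)
    edges, atoms, attrs = [], [], []

    def visit(parent_oid, elem):
        oid = f"o{next(counter)}"  # oids are assigned in document order
        edges.append((parent_oid, elem.tag, oid))
        for name, value in elem.attrib.items():
            attrs.append((oid, name, value))
        text = (elem.text or "").strip()
        if text:
            atoms.append((oid, text))
        for child in elem:
            visit(oid, child)

    visit("root", ET.fromstring(xml_text))
    return edges, atoms, attrs


SAMPLE = (
    "<product><basic>"
    "<cat>electronic</cat><subcat>sound</subcat>"
    "<n>Gismo223</n><c>F2GHYYRF</c>"
    '<sp currency="French-franc">1200</sp>'
    "</basic></product>"
)
edges, atoms, attrs = shred(SAMPLE)
```

Because the same three relations hold any document, no schema design is needed, at the price of many joins when a query has to reassemble a product.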
Specific
Class Product type tuple(
  cat: string;
  subcat: set(string);
  n: string, c: string;
  price: Price;
  specific: OEM;
  external: list(tuple(label: string; val: URL));
  private pr: tuple(
    bp: Price;
    qis: integer;
    supplier: Company;
  )
)
type Price: tuple(sum: int, currency: Currency);
What is better? Hybrid?
• Need for comparative studies
• My feeling/common sense?:
  • Use structure for very structured portions of data
  • Use semistructured for less so or portions with very evolving structures
  • Use blobs for components accessed mostly via full-text indexing, e.g., paragraphs in a document
Active Views
• System developed at INRIA
• Long term goals:
  • Declarative specification of data intensive applications with cooperation between partners
  • Ease of use and fast deployment
  • (Automatic) verification
Architecture
[Figure labels: Java, DOM, AVApi, O2, Java application, O2 Notification, Java RMI, XML repository, ActiveViews manager, Web browser, Java client]
Motivations
• Database applications:
  • passive behavior
  • closed systems
  • persistence, concurrency, access control
• New needs:
  • interactions between clients, e.g., notification
  • change control
  • reactive behavior
• E.g.: e-commerce, cooperative work
Illustration of Interactions: Notification
In the vendor view:
when Customer.entersDept(dept)
if dept = self.dept
then notify me
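The vendor rule can be read as a callback registered with a server that relays customer events. The sketch below is a toy rendering of that reading; `ViewServer`, `VendorView`, and their methods are hypothetical names and do not reproduce the ActiveViews API.

```python
class VendorView:
    """A vendor's view; carries the rule 'when entersDept(dept) if dept = self.dept'."""

    def __init__(self, name, dept):
        self.name = name
        self.dept = dept
        self.inbox = []  # notifications delivered to this vendor

    def on_enters_dept(self, customer, dept):
        # Condition of the rule: the customer entered this vendor's department.
        if dept == self.dept:
            # Action of the rule: notify me.
            self.inbox.append(f"{customer} entered {dept}")


class ViewServer:
    """Relays customer events to every registered vendor view."""

    def __init__(self):
        self.vendors = []

    def register(self, vendor):
        self.vendors.append(vendor)

    def enters_dept(self, customer, dept):
        # Event of the rule: Customer.entersDept(dept).
        for vendor in self.vendors:
            vendor.on_enters_dept(customer, dept)


server = ViewServer()
book_vendor = VendorView("v1", "book")
music_vendor = VendorView("v2", "music")
server.register(book_vendor)
server.register(music_vendor)
server.enters_dept("alice", "book")
```

Only the vendor of the book department is notified; the condition filters the event for everyone else.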
Notification
[Figure labels: AVServer, AVClient (customer), entersDept book, notify, AVServer, AVClient (vendor in book dept)]
Illustration of Interaction: Change Control
In the customer view:
let monitored MyPromotions be
select I.name, I.price
from I in Catalog.promotions.item
where I.department = self.department
read, write, append, monitored, refresh, deferred…
Simpler case: monitoring of the catalog
Change control
[Figure: AVClients and AVServers exchanging numbered messages: 1 Read, 2 Read, 3 Modification, 4 Write, 5 Notification, 6 Notification, 7 Read]
Choices
• All XML
  • XML repository
  • XML query language
  • XML views
• Declarative specification
  • almost no code to write
  • compilation to an executable application
  • active rules
Important Aspects
• workflow, e.g., customization: to search for a biblio ref, look first in my own files, otherwise look in dblp, otherwise look…
• activities (search, buy, accounting, chat…)
• active rules
• logical traces
• notifications
View +:= Incomplete Information
Use something like Imielinski-Lipski tables