
XML-λ: an extendible framework for manipulating XML data




  1. XML-λ: an extendible framework for manipulating XML data. Jaroslav Pokorny, Charles University, Praha. XML-KSI, 2004

  2. Two approaches to XML: logical or physical
     Idea: XML as a database
     • DB of XML documents
     • a "mix" of (relational) DB and XML data
     • XML views (over non-XML and/or XML data)
     Advantages:
     • independence from the original platforms and from the data models of the processed data
     • more flexibility in design and manipulation (integration, updates, querying)

  3. Two approaches to XML: implications
     • implementations: XML DBs (native, or via relational, OO, or OR systems)
     • special demands on query languages:
       • how to make them powerful
       • how to describe their semantics
       • how to implement them
     • new types of software: wrappers, mediators
     • (personal) goal: to develop a powerful formal approach appropriate for manipulating both XML and non-XML data

  4. Outline
     • XML in brief
     • XML: a functional data model
     • functional typing of XML (and non-XML) data
     • LT language
     • XML-schema, XML-database
     • XML-λ framework
     • Conclusions

  5. XML: an example (DTD)
     <!DOCTYPE biblio [
       <!ELEMENT biblio (book | monograph)*>
       <!ELEMENT book (title, author*)>
       <!ELEMENT title (#PCDATA)>
       <!ELEMENT monograph (title, author, editor)>
       <!ATTLIST monograph year CDATA #REQUIRED>
       <!ELEMENT editor (monograph*)>
       <!ELEMENT author (name, address?)>
       <!ELEMENT name (firstname?, surname)>
       <!ELEMENT firstname (#PCDATA)>
       <!ELEMENT surname (#PCDATA)>
       <!ELEMENT address (locality, ZIP)>
       <!ELEMENT locality (#PCDATA)>
       <!ELEMENT ZIP (#PCDATA)>
     ]>

  6. XML: an example (document)
     <book>
       <title>Fundamentals of DBS</title>
       <author>
         <name>
           <firstname>Ramez</firstname>
           <surname>Elmasri</surname>
         </name>
         <address>
           <locality>Arlington</locality>
           <ZIP>76019</ZIP>
         </address>
       </author>
       <author>
         <name>
           <firstname>Shamkant</firstname>
           <surname>Navathe</surname>
         </name>
       </author>
     </book>

  7. XML model
     • Usually: tree- or graph-oriented
     • Here: inspired by the functional approach to conceptual modelling, for example the HIT data model from the 1980s
     [Figure: DEPARTMENT with MEMBER* and PROJECT*]

  8. Synopsis of the approach
     • Typing XML data
       Background: a functional type system (a base of primitive types + functions, tuples, and unions)
       Extensions to:
       • typing XML regular expressions
       • typing XML elements
     • Querying XML elements
       • a general typed λ-calculus (functional variables and constants, tuples, applications of functions, λ-abstractions)
       • an XML-database schema as a set of typed variables
       • an XML-database as any valuation of these variables
       • XML-λ: a syntactic variant of the typed λ-calculus over XML data

  9. Typing XML data - informally
     E … a set of abstract elements. The content of an abstract element is either a string from PCDATA (the simplest case), a sequence of abstract subelements (or groups), or empty.
     Ex.: <phone>781 7090</phone> is an instance of a phone element object. The phone element object is conceived as a (partial) function from E into PCDATA: for an e ∈ E, phone(e) returns, e.g., the phone number '781 7090'.

  10. Typing XML data - informally
      Ex.: <!ELEMENT name (firstname?, surname)> is conceived as a set of functions from E into E × E. The current name element object, i.e. the one stored in a given XML database, is a function assigning to each abstract element e ∈ E at most a couple of abstract elements.
      Hierarchy of notions: element type, element object, element
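
A minimal Python sketch of the two informal examples above, assuming abstract elements are represented by opaque string identifiers and partial functions by dictionaries. The identifiers `e1` … `e4` and the helper `apply_partial` are hypothetical and only serve the illustration.

```python
# Abstract elements are opaque identifiers (an assumption of this sketch).
E = {"e1", "e2", "e3", "e4"}

# The phone element object: a partial function from E into PCDATA,
# modelled as a dict that is defined only on some members of E.
phone = {"e1": "781 7090"}

# The name element object: assigns to an abstract element at most a couple
# (firstname element, surname element).
name = {"e2": ("e3", "e4")}


def apply_partial(f, e):
    """Apply a partial function f to e; None means 'undefined at e'."""
    return f.get(e)


print(apply_partial(phone, "e1"))  # the phone number stored for e1
print(apply_partial(phone, "e2"))  # None: phone is undefined at e2
print(apply_partial(name, "e2"))   # the couple of subelements of e2
```

Using dictionaries keeps partiality explicit: an element outside the dictionary's keys simply has no value under that element object.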

  11. Functional typing
      B … a set of symbols (the base)
      T ::= S                primitive type, where S ∈ B
          | (T1 → T2)        functional type
          | (T1,...,Tn)      tuple type
          | (T1 + T2)        union type
      Remark: relations are ((T1,...,Tn) → BOOL)-objects!

  12. Functional typing
      Interpretation:
      • members of B … mutually disjoint non-empty sets
      • (T1 → T2) … the set of all (total or partial) functions from T1 into T2
      • (T1,...,Tn) … T1 × ... × Tn
      • (T1 + … + Tn) … T1 ∪ … ∪ Tn
      Exs.:
      • arithmetic operations: +, -, *, / are ((NUMBER, NUMBER) → NUMBER)-objects
      • logic:
        • and/((BOOL, BOOL) → BOOL)
        • the universal R-quantifier ∀R and the existential R-quantifier ∃R are ((R → BOOL) → BOOL)-objects
        • the R-identity =R is an ((R, R) → BOOL)-object
      • aggregation functions: COUNTR/((R → BOOL) → NUMBER)
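
The examples above can be sketched in Python as higher-order functions. This is only an illustration under stated assumptions: the type R is taken to be a finite collection so that quantifiers and COUNT can be computed by enumeration, and the names `forall`, `exists`, `count`, `plus`, `identity`, and `NUMBER` are chosen here, not taken from the slides.

```python
def forall(domain):
    """Universal R-quantifier: an ((R -> BOOL) -> BOOL)-object for a finite R."""
    items = list(domain)
    return lambda p: all(p(x) for x in items)


def exists(domain):
    """Existential R-quantifier: an ((R -> BOOL) -> BOOL)-object for a finite R."""
    items = list(domain)
    return lambda p: any(p(x) for x in items)


def count(domain):
    """COUNT_R: an ((R -> BOOL) -> NUMBER)-object; counts members satisfying p."""
    items = list(domain)
    return lambda p: sum(1 for x in items if p(x))


def plus(pair):
    """Addition as a ((NUMBER, NUMBER) -> NUMBER)-object: takes one tuple."""
    a, b = pair
    return a + b


def identity(pair):
    """R-identity as an ((R, R) -> BOOL)-object."""
    return pair[0] == pair[1]


NUMBER = range(1, 6)  # a small finite stand-in for the NUMBER type

print(forall(NUMBER)(lambda n: n > 0))
print(exists(NUMBER)(lambda n: n > 5))
print(count(NUMBER)(lambda n: n % 2 == 0))
```

Note that a set of R-objects is itself represented by its characteristic function (R → BOOL), which is why the quantifiers and COUNT take a predicate as their argument.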

  13. Typing XML regular expressions
      Let B = {PCDATA, BOOL, NAME}. The type system Treg over B is recursively defined as follows.
      T ::= tag:PCDATA | tag:    elementary regular expression, where tag ∈ NAME
          | T*                   zero or more
          | T+                   one or more
          | T?                   zero or one
          | (T1 | T2)            alternative
      In T*, T+ and T?, T is an alternative or an elementary regular expression.

  14. Typing XML regular expressions
      Interpretation, e.g.:
      • (T1 | T2) … a set of objects, each of type T1 or of type T2
      • T* … (T → BOOL) /partially ordered model/
      • T* … ((T, NUMBER) → BOOL) /ordered model/
        Consider a function f of this type. For a couple (t, i), f(t, i) = TRUE iff t is the i-th object in an (ordered) set of T-objects.
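
A minimal sketch of the two interpretations of T*, assuming positions are counted from 1 (the slides do not say whether counting starts at 0 or 1). The helpers `as_set_model` and `as_ordered_model` and the identifiers `a1`, `a2` are hypothetical.

```python
def as_set_model(items):
    """T* as a (T -> BOOL)-object: the characteristic function of the collection."""
    members = set(items)
    return lambda t: t in members


def as_ordered_model(items):
    """T* as a ((T, NUMBER) -> BOOL)-object: f(t, i) is True iff t is the i-th item.

    Positions are 1-based, which is an assumption of this sketch.
    """
    sequence = list(items)
    return lambda t, i: 1 <= i <= len(sequence) and sequence[i - 1] == t


authors = ["a1", "a2"]  # hypothetical abstract elements in document order

is_author = as_set_model(authors)
f = as_ordered_model(authors)

print(is_author("a2"))  # membership only, order is lost
print(f("a1", 1))       # a1 is the first object
print(f("a1", 2))       # a1 is not the second object
```

The ordered model carries strictly more information: the set model can be recovered from it by asking whether some position i satisfies f(t, i).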

  15. Typing XML elements and attributes
      Let Treg be a type system over B, and let E be the set of abstract elements. The type system TE induced by Treg (or just TE if Treg is understood) contains the regular element expressions given by the following rules:
      E ::= TAG:T | TAG:         elementary element types, where tag:T and tag: are elementary regular expressions over B
          | E*
          | E+
          | E?
          | (E1 | E2)
          | TAG:(E1,...,En)      where tag ∈ NAME
      Elementary element types and regular element expressions TAG:(E1,...,En) are called element types.

  16. Typing XML elements and attributes
      Semantics of element types:
      • TAG:PCDATA … the set of all (partial) functions from E to tag:PCDATA
      • … etc.
      Attributes are also functions.
      Ex.: year (of monograph) is a function assigning to each monograph its year (of issue).
      Notation: E_MONOGRAPH → CDATA

  17. Example: BIBLIO element types
      TITLE:PCDATA
      FIRSTNAME:PCDATA
      SURNAME:PCDATA
      LOCALITY:PCDATA
      ZIP:PCDATA
      ADDRESS:(LOCALITY, ZIP)
      NAME:(FIRSTNAME, SURNAME)
      AUTHOR:(NAME, ADDRESS?)
      BOOK:(TITLE, AUTHOR*)
      MONOGRAPH:(TITLE, AUTHOR, EDITOR)
      EDITOR:MONOGRAPH*
      BIBLIO:(BOOK | MONOGRAPH)*
      YEAR/(MONOGRAPH → CDATA)

  18. LT language (Language of Terms)
      Func ... constants, each of a fixed type; variables for each type from T.
      Let the types T, T1, ..., Tn (n ≥ 1) be members of T. Typed constants and variables are terms. Further terms:
      • M(M1,...,Mn) … application
      • λx1,...,xn(M) … λ-abstraction, where x1,...,xn are distinct variables
      • (M1,...,Mn) … tuple
      • Mi … projections, for a term M of the form (M1,...,Mn)
      • K:M … tagged term, where K/NAME. If M/T, then K:M/(E → T).
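
To make the term constructors concrete, here is a small untyped evaluator sketch in Python. It is not the LT semantics from the talk: type checking and tagged terms are omitted, projections are assumed to be 1-based, and the class names (`Const`, `Var`, `App`, `Lam`, `Tup`, `Proj`) and `evaluate` are invented for this illustration.

```python
from dataclasses import dataclass
from typing import Any, Tuple


@dataclass(frozen=True)
class Const:
    value: Any


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class App:            # M(M1,...,Mn)
    fn: Any
    args: Tuple[Any, ...]


@dataclass(frozen=True)
class Lam:            # lambda x1,...,xn (M)
    params: Tuple[str, ...]
    body: Any


@dataclass(frozen=True)
class Tup:            # (M1,...,Mn)
    items: Tuple[Any, ...]


@dataclass(frozen=True)
class Proj:           # i-th component of a tuple term (1-based, an assumption)
    term: Any
    index: int


def evaluate(term, env):
    """Evaluate a term under a valuation env of its free variables."""
    if isinstance(term, Const):
        return term.value
    if isinstance(term, Var):
        return env[term.name]
    if isinstance(term, App):
        fn = evaluate(term.fn, env)
        return fn(*(evaluate(a, env) for a in term.args))
    if isinstance(term, Lam):
        # The abstraction denotes a function; bound variables shadow env.
        return lambda *vals: evaluate(term.body, {**env, **dict(zip(term.params, vals))})
    if isinstance(term, Tup):
        return tuple(evaluate(t, env) for t in term.items)
    if isinstance(term, Proj):
        return evaluate(term.term, env)[term.index - 1]
    raise TypeError(f"not a term: {term!r}")


plus = Const(lambda a, b: a + b)
add_term = App(Lam(("x", "y"), App(plus, (Var("x"), Var("y")))), (Const(2), Const(3)))
print(evaluate(add_term, {}))
print(evaluate(Proj(Tup((Const("a"), Const("b"))), 2), {}))
```

The `env` argument plays the role of a valuation: in the database setting, the schema variables would be bound there.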

  19. Schema and DB
      • An XML-database schema, S_XML, is a set of variables of types from TE.
      • Given a database schema S_XML, an XML-database is any valuation of these variables.
      Ex.: SURNAME, AUTHOR

  20. XML-λ framework
      What is it? The XML-λ framework is a subset of LT + syntactic sugar.
      Features:
      • queries are expressed by terms
        Ex.: AUTHOR (1)
        (RESULT: AUTHOR … more "XML-like")
        Typically: λ .. (λ .. …(expression)…), where expression/BOOL
        λx (AUTHOR(x)) does the same as (1)
      • paths as compositions of functions
        Ex.: SURNAME(NAME(AUTHOR(m))), where m is a monograph abstract element object
        Notation: m.AUTHOR.NAME.SURNAME
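
The path notation can be sketched in Python as repeated application of partial functions. This is a deliberate simplification: each schema variable is modelled as a dictionary that maps a parent element directly to the relevant child (or to its text), whereas the typed model maps elements to tuples of subelements. The identifiers `m1`, `a1`, `n1`, the dictionary `DB`, and the helper `path` are hypothetical.

```python
# A tiny XML-database: one valuation of the schema variables AUTHOR, NAME, SURNAME.
DB = {
    "AUTHOR": {"m1": "a1"},        # the (single) author of monograph m1
    "NAME": {"a1": "n1"},          # the name element of author a1
    "SURNAME": {"n1": "Elmasri"},  # the surname text under name n1
}


def path(start, *steps):
    """Evaluate start.STEP1.STEP2...; None as soon as a step is undefined."""
    current = start
    for step in steps:
        if current is None:
            return None
        current = DB[step].get(current)
    return current


# m.AUTHOR.NAME.SURNAME, i.e. SURNAME(NAME(AUTHOR(m)))
print(path("m1", "AUTHOR", "NAME", "SURNAME"))
```

Because every step is a partial function, a path over an element that lacks one of the subelements evaluates to "undefined" rather than raising an error.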

  21. XML-λ framework
      • applications of logic, arithmetic, … functions
        λe (b.AUTHOR(e) and e.NAME.SURNAME = 'Smith'), where b is a book abstract element object
        ∃b ∃e (b.AUTHOR(e) and e.NAME.SURNAME = 'Smith') is a YES/NO query.
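
A hedged Python sketch of the two queries, assuming the first one abstracts over e (so it returns the matching authors of a given book b) and the YES/NO variant quantifies existentially over both b and e. The data and the names `BOOKS`, `AUTHOR`, `SURNAME_OF`, `authors_named`, and `some_book_has_author` are made up for the illustration; `SURNAME_OF` stands in for the path e.NAME.SURNAME.

```python
BOOKS = ["b1", "b2"]

# b.AUTHOR as a set of abstract elements, so b.AUTHOR(e) is a membership test.
AUTHOR = {"b1": {"a1", "a2"}, "b2": {"a3"}}

# Shortcut for the path e.NAME.SURNAME.
SURNAME_OF = {"a1": "Elmasri", "a2": "Navathe", "a3": "Smith"}


def authors_named(b, surname):
    """The set of e with b.AUTHOR(e) and e.NAME.SURNAME = surname."""
    return {e for e in AUTHOR.get(b, set()) if SURNAME_OF.get(e) == surname}


def some_book_has_author(surname):
    """YES/NO query: is there a book b and an element e satisfying the condition?"""
    return any(authors_named(b, surname) for b in BOOKS)


print(authors_named("b2", "Smith"))
print(some_book_has_author("Smith"))
print(some_book_has_author("Jones"))
```

The only difference between the two queries is how the variables are bound: abstraction yields the satisfying elements, quantification collapses them to a truth value.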

  22. XML-λ framework
      • restructuring
        λ name:x.NAME (λ title:y (.BOOK.(AUTHOR(x) and TITLE = y)))
        λ title:y (λ name:x.NAME (.BOOK.(AUTHOR(x) and TITLE = y)))
        Notation: tagged variables, content of abstract elements by y, x
      • aggregations + nesting
        D. For each book, find the number of its authors.
        λ x, n (.BOOK..(TITLE = x and COUNT(AUTHOR) = n))
        Notation: dots .. for omitting parts of paths and prefixes
      • possibility to embed any user-defined function

  23. XML-λ framework
      D (XQuery):
      FOR $x IN distinct(document("biblio1.xml")//book)
      LET $n := count($x/author)
      RETURN
        <book>
          <name>$x/title/text()</name>
          <numb_of_auth>$n</numb_of_auth>
        </book>
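
For comparison, query D ("for each book, find the number of its authors") can be reproduced with Python's standard `xml.etree.ElementTree`. This is an illustrative equivalent, not part of the talk; the inline document, including its second book, and the function name `authors_per_book` are assumptions of the sketch.

```python
import xml.etree.ElementTree as ET

BIBLIO_XML = """
<biblio>
  <book>
    <title>Fundamentals of DBS</title>
    <author><name><firstname>Ramez</firstname><surname>Elmasri</surname></name></author>
    <author><name><firstname>Shamkant</firstname><surname>Navathe</surname></name></author>
  </book>
  <book>
    <title>Hypothetical second book</title>
    <author><name><surname>Smith</surname></name></author>
  </book>
</biblio>
"""


def authors_per_book(xml_text):
    """Return a list of (title, number of authors), one pair per book element."""
    root = ET.fromstring(xml_text)
    result = []
    for book in root.iter("book"):
        title = book.findtext("title", default="").strip()
        n = len(book.findall("author"))  # direct author children only
        result.append((title, n))
    return result


for title, n in authors_per_book(BIBLIO_XML):
    print(title, n)
```

As in the λ-term for D, the result pairs each title with a count obtained by aggregating over the book's author subelements.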

  24. Integration of heterogeneous information sources
      [Figure: a user sends a query and receives an answer; typed objects sit over relational schemes, DTDs, ADTs, and classes in OO]

  25. Conclusions
      Issues:
      • finding appropriate restrictions of XML-λ for querying
      • implementation is in progress
      The forthcoming paper:
      • cleaning the model (ordered and unordered)
      • formal semantics of types
      • extensions to tagged variables
      Future:
      • XML-λ with tag variables
      • semantics of XQuery in the XML-λ framework
