1 / 44

9 Querying XML Data and Documents

9 Querying XML Data and Documents. XQuery , W3C XML Query Language "work in progress", Working Draft, 30 April 2002 joint work by XML Query and XSL Working Groups together with XPath 2.0 ( <- XPath 1.0 and XQuery) influenced by many research groups and query languages

taffy
Download Presentation

9 Querying XML Data and Documents

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. 9 Querying XML Data and Documents • XQuery, W3C XML Query Language • "work in progress", Working Draft, 30 April 2002 • joint work by XML Query and XSL Working Groups • together with XPath 2.0 (<- XPath 1.0 and XQuery) • influenced by many research groups and query languages • Quilt, XPath, XQL, XML-QL, SQL, OQL, Lorel, ... • Goal: a query language applicable to any XML-represented data: both documents and databases Notes 9: XQuery

  2. Functional Requirements (1) • Support operations (selection, projection, aggregation, sorting, etc.) on all data types: • Choose parts of data based on content or structure • Also operations on document hierarchy and order • Structural preservation and transformation: • Preserve the relative hierarchy and sequence of input document structures in the query results • Transform XML structures and create new XML structures • Combining and joining: • Combine related information from different parts of a given document or from multiple documents Notes 9: XQuery

  3. Functional Requirements (2) • Cross-references: • must be able to traverse intra- and inter-document references • Closure property: • The result of an XML query is also XML (usually not a valid document, but a well-formed document fragment) • The results of a query can be used as input to another query • Extensibility: • The query language should support the use of externally defined functions on all data types of the data model Notes 9: XQuery

  4. XQuery in a Nutshell • Functional expression language • Strongly-typed: (XML Schema) types may be assigned to expressions statically • Includes XPath 2.0 (says Draft, but not all XPath axes included!) • XQuery 1.0 and XPath 2.0 share extensive functionality: • XQuery 1.0 and XPath 2.0 Functions and Operators, WD 30/4/2002 • Roughly: XQuery  XPath' + XSLT' + SQL' Notes 9: XQuery

  5. XQuery: Basics • A query is represented as an expression • Expressions operate on and return sequences of • atomic values (of XML Schema simple types) and • nodes (same 7 node types as in XPath/XSLT) • an item  a singleton sequence • sequences are flat: no sequences as items • (1, (2, 3), (), 1) = (1, 2, 3, 1) • sequences are ordered, and can contain duplicates • Unlimited combination of expressions, often with automatic type conversions (e.g. for arithmetics) Notes 9: XQuery

  6. XQuery: Basics • Query applied to an input sequence,accessed through functions ... • input() • implementation-dependent input sequence • collection("URI") • sequence of nodes from URI • document("URI") • root of the XML document available at URI • this also in XSLT Notes 9: XQuery

  7. Some Central XQuery Expressions • path expressions • creation and combination of sequences (union, intersect, except) • element constructors ( XSLT templates) • FLWR (”flower”; for-let-where-return) expressions • sort expressions • quantified expressions (some/every … satisfies …) • testing and modifying datatypes; validation Notes 9: XQuery

  8. Path Expressions • Extend the syntax of XPath: [/]Expr/… /Expr • arbitrary (node-sequence-valued) expressions as steps • only 6 (of 13) axes: child, descendant, attribute, self, descendant-or-self, parent • combined with order comparison operators (precedes, follows) sufficient for queries (?) • produce ordered sequence of nodes in document order, without duplicate nodes • dereferencing for ID/IDREF values: Figure(s) referenced by refid attribute of the first figref child:figref[1]/@refid => figure Notes 9: XQuery

  9. Element Constructors • Similar to XSLT templates: • start and end tag enclosing the content • literal fragments written directly, expressions enclosed in braces { and } • often used inside another expression that binds variables used in the element constructor • See next Notes 9: XQuery

  10. Example • Generate an emp element containing an empid attribute and nested name and job elements; values of the attribute and nested elements given in variables $id, $n, and $j: <emp empid="{$id}"> <name>{$n} </name> <job> {$j} </job> </emp> Notes 9: XQuery

  11. FLWR ("flower") Expressions • Constructed from for, let, where and return clauses (~SQL select-from-where) • Form: (ForClause | LetClause)+ WhereClause? "return" Expr • a FLWR expression binds values to one or more variables, and uses these variables to construct a result (in general, an ordered sequence of nodes) Notes 9: XQuery

  12. Flow of data in a FLWR expression Notes 9: XQuery

  13. for clauses • for $V1inExp1 (, …) associates each variable Vi with an expression Expi that returns a list of items (e.g. a path expression) • Result: list of tuples, each containing a binding for each of the variables • can be though of as loops iterating over the items returned by respective expressions Notes 9: XQuery

  14. let clauses • let also binds one or more variables to one or more expressions • each variable to the entire value of its expression (without iterating through items of the sequence) • results in binding a single sequence for each variable • Compare: • for $x in /library/book-> many bindings (books) • let $x := /library/book-> single binding (to a sequence of books) Notes 9: XQuery

  15. for/let clauses • A FLWR expr may contain several fors and lets • each of these clauses may refere to variables bound in previous clauses • the result of the for/let sequence: • an ordered list of tuples of bound variables • number of tuples = product of the cardinalities of the node-lists returned by the expressions in the for clauses Notes 9: XQuery

  16. for/let clauses - Example 1 for$iin (1,2), $jin (3,4)return <tuple> <i>{$i}</i> <j>{$j}</j></tuple> Result: <tuple><i>1</i><j>3</j></tuple> <tuple><i>1</i><j>4</j></tuple> <tuple><i>2</i><j>3</j></tuple> <tuple><i>2</i><j>4</j></tuple> Notes 9: XQuery

  17. for/let clauses - Example 2 let $s := (<one/>, <two/>, <three/>) return <out>{$s}</out> Result: <out> <one/> <two/> <three/> </out> Notes 9: XQuery

  18. where clause • binding tuples generated by for and let clauses filtered by an optional where clause • those with a true condition are used to instantiate the return clause • the where clause may contain several predicates connected by and, or, and not() • usually refer to the bound variables • sequences treated as Booleans similarly to node-lists in XPath: empty ~ false; non-empty ~ true Notes 9: XQuery

  19. where clause • Variables bound by a for clause represent a single item -> scalar predicates, e.g. $p/color = ”Red” • Variables bound by a let clause may represent lists of nodes -> list-oriented predicates, e.g. avg($p/price) > 100 • a number of aggregation functions available: avg(), sum(), count(), max(), min() (also in XPath 1.0) Notes 9: XQuery

  20. return clause • The return clause generates the output of the FLWR expression • instantiated once for each binding tuple • often contains element constuctors, references to bound variables, and nested subexpressions Notes 9: XQuery

  21. Examples (from "XML Query Use Cases") • Assume: a document named ”bib.xml” containing of a list of books:<book>+ <title> <author>+ <publisher> <year> <price> Notes 9: XQuery

  22. List the titles of books published by Morgan Kaufmann after 1998 <recent-MK-books> { } </recent-MK-books> for $b indocument(”bib.xml”)//book where $b/publisher = ”Morgan Kaufmann” and $b/year gt 1998 return <book year="{$b/year}"> {$b/title} </book> Notes 9: XQuery

  23. Result could be... <recent-MK-books> <book year=”1999”> <title>TCP/IP Illustrated</title> </book> <book year=”2000”> <title>Advanced Programming in the Unix environment</title> </book> </recent-MK-books> Notes 9: XQuery

  24. List each publisher with the average price of its books for $p indistinct-value(document(”bib.xml”)//publisher) let $a := avg(document(”bib.xml”)/book[ publisher = $p]/price) return <publisher> { <name> {$p/text()}</name> , <avgprice> {$a} </avgprice> } </publisher> Notes 9: XQuery

  25. Invert the book-list structure <author_list>{ {-- group books by authors --} for $a indistinct-values(document(”bib.xml”)//author ) return <author> <name> {$a/text()} </name> for $b indocument(”bib.xml”)//book[ author = $a] return $b/title </author> } </author_list> Notes 9: XQuery

  26. Make an alphabetic list of publishers; for each publisher, list books (title & price) in descending order of price for $p indistinct-values(document(”bib.xml”)//publisher ) return <publisher> <name> {$p/text()} </name> {for $b indocument(”bib.xml”)/book[ publisher = $p] return <book> {$b/title , $b/price} </book> sortby(price descending)} </publisher> sortby(name) Notes 9: XQuery

  27. Queries on Document Order • Operators precedes and follows to express conditions on document order • The next example involves a surgical report with elements procedure, incisionand anesthesia • return a "critical sequence" that contains all contents between the 1st and 2nd incisions of the 1st procedure Notes 9: XQuery

  28. Computing a "critical sequence" <critical_sequence> { let $p := //procedure[1] for $n in $p/node() where $n follows ($p//incision)[1]and $n precedes ($p//incision)[2] return $n } </critical_sequence> Notes 9: XQuery

  29. Filtering • Function filter(Expr) • in XQuery core function library • returns shallow copies of the nodes selected by Expr • mutual order and hierarchy btw nodes the same as in input sequence Notes 9: XQuery

  30. Hierarchical Filtering • Only selected nodes remain; other levels are collapsed Notes 9: XQuery

  31. Table of contents for”cookbook.xml”, consisting of section titles (at any level of nesting) let $b := document(”cookbook.xml”) return <toc> { filter($b//(section | section/title | section/title/text()) ) } </toc> Notes 9: XQuery

  32. Querying relational data • Lots of data is stored in relational databases • should be able to access also this data • Example: suppliers and parts • Table S: supplier numbers (sno) and names (sname) • Table P: part numbers (pno) and descriptions (descrip) • Table SP: relationships between suppliers and the parts they supply, including the price (price) of each part from each supplier Notes 9: XQuery

  33. * * * A possible XML form for relations Notes 9: XQuery

  34. SQL vs. XQuery • SQL: • XQuery: SELECT pno FROM p WHERE descrip LIKE ’Gear%’ ORDER BY pno; for $p indocument(”p.xml”)//p_tuple wherestarts-with($p/descrip, ”Gear”) return $p/pno sortby(.) Notes 9: XQuery

  35. Grouping • Many relational queries involve forming data into groups and applying some aggregation function such as count or avg to each group • in SQL: GROUP BY and HAVING clauses • Example: Find the part number and average price for parts that have at least 3 suppliers Notes 9: XQuery

  36. Grouping: SQL SELECT pno, avg(price) AS avgprice FROM sp GROUP BY pno HAVING count(*) >= 3 ORDER BY pno; Notes 9: XQuery

  37. Grouping: XQuery for $pn in distinct-values(document(”sp.xml”)//pno) let $sp := document(”sp.xml”)//sp_tuple[ pno = $pn] where count($sp) >= 3 return <well_supplied_item> { $pn, <avgprice> {avg($sp/price)} </avgprice> } <well_supplied_item> sortby(pno) Notes 9: XQuery

  38. Joins • Example: Return a ”flat” list of supplier names and their part descriptions, in alphabetic order for $sp indocument(”sp.xml”)//sp_tuple, $p indocument(”p.xml”)//p_tuple[ pno = $sp/pno], $s in document(”s.xml”)//s_tuple[ sno = $sp/sno] return <sp_pair>{ $s/sname , $p/descrip }<sp_pair> sortby (sname, descrip) Notes 9: XQuery

  39. XQuery vs. XSLT • Could we express XQuery queries with XSLT? • In principle yes, always, (but could be tedious) • Ex: Partial XSLT simulation of FLWR expressions XQuery:for $x inExpr … can be expressed in XSLT as:<xsl:for-each select="Expr"> <xsl:variable name="x" select="."> … </xsl:for-each> Notes 9: XQuery

  40. XQuery vs. XSLT XQuery:let $y := Expr … corresponds directly to:<xsl:variable name="y" select="Expr" /> andwhereCondition …can be expressed as: <xsl:if test="Condition"> … </xsl:if> Notes 9: XQuery

  41. XQuery vs. XSLT • XQuery:returnElemConstructor • can be simulated with a corresponding XSLT template: • static fragments as such • enclosed expressions in element content, e.g. {$s/sname}become <xsl:copy-ofselect="$s/sname" /> Notes 9: XQuery

  42. XQuery vs. XSLT: Example XQuery: for $b indocument(”bib.xml”)//book where $b/publisher = ”MK” and $b/year > 1998 return <book year="{$b/year}"> {$b/title} </book> <xsl:template match="/"> <xsl:for-each select="document('bib.xml')//book"> <xsl:variable name="b" select="." /> <xsl:if test="$b/publisher='MK' and $b/year > 1998”> <book year="{$b/year}"> <xsl:copy-of select="$b/title" /> </book></xsl:if> </xsl:for-each></xsl:template>  Notes 9: XQuery

  43. XSLT for XQuery FLWR Expressions • NB: The sketched simulation is not complete: • Only two things can be done in XSLT for result tree fragments produced by templates: • insertion in result tree, and conversion to a string (with xsl:value-of) • Not possible to apply further operations to results (like, e.g., sorting in XQuery) Notes 9: XQuery

  44. Summary • XQuery • W3C working draft of an XML query language • vendor support?? • http://www.w3.org/XML/Querymentions about 20 prototype implementations or products that support (or will support) some version of XQuery • future?? • promising confluence of document and database research Notes 9: XQuery

More Related