
XML Data Management XQuery




  1. XML Data Management XQuery Werner Nutt

  2. Requirements for an XML Query Language David Maier, W3C XML Query Requirements: • Closedness: output must be XML • Composability: wherever a set of XML elements is required, a subquery is allowed as well • Support for key operations: • selection • extraction, projection • restructuring • combination, join • fusion of elements

  3. Requirements for an XML Query Language • Can benefit from a schema, but should also be applicable without one • Retains the order of nodes • Formal semantics: • structure of results should be derivable from query • defines equivalence of queries • Queries should be representable in XML, so that documents can have embedded queries

  4. How Does One Design a Query Language? • In most query languages, there are two aspects to a query: • Retrieving data (e.g., from … where … in SQL) • Creating output (e.g., select … in SQL) • Retrieval consists of • Pattern matching (e.g., from … ) • Filtering (e.g., where … ) … although these cannot always be clearly distinguished

  5. XQuery Principles • Data Model identical with the XPath data model • documents are ordered, labeled trees • nodes have identity • nodes can have simple or complex types (defined in XML Schema) • A query result is an ordered list/sequence of items (nodes, values, attributes, etc., but not lists) • special case: the empty list ()

  6. XQuery Principles (cntd) • XQuery can be used without schemas, but can be checked against DTDs and XML schemas • XQuery is a functional language • no statements • evaluation of expressions • function definitions • modules

  7. The Recipes DTD (Reminder) <!ELEMENT recipes (recipe*)> <!ELEMENT recipe (title, ingredient+, preparation, nutrition)> <!ELEMENT title (#PCDATA)> <!ELEMENT ingredient (ingredient*, preparation?)> <!ATTLIST ingredient name CDATA #REQUIRED amount CDATA #IMPLIED unit CDATA #IMPLIED> <!ELEMENT preparation (step+)> <!ELEMENT step (#PCDATA)> <!ELEMENT nutrition EMPTY> <!ATTLIST nutrition calories CDATA #REQUIRED fat CDATA #REQUIRED>

  8. A Query over the Recipes Document <titles> {for $r in doc("recipes.xml")//recipe return $r/title} </titles> returns <titles> <title>Beef Parmesan with Garlic Angel Hair Pasta</title> <title>Ricotta Pie</title> … </titles>

  9. Query Features <titles> {for $r in doc("recipes.xml")//recipe return $r/title} </titles> Annotations: • <titles> … </titles>: part to be returned as it is given • {…}: to be evaluated • for: iteration • $var: variables • //recipe and $r/title: XPath • doc(String) returns the input document • return: sequence of results, one for each variable binding

  10. Features: Summary • The result is a new XML document • A query consists of parts that are returned as is • ... and others that are evaluated (everything in {...}) • Calling the function doc(String) returns an input document • XPath is used to retrieve node sets and values • Iteration over node sets: for binds a variable to all nodes in a node set • Variables can be used in XPath expressions • return returns a sequence of results, one for each binding of a variable
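The interplay of literal parts, iteration, and return can be mimicked outside XQuery. The following is a minimal Python sketch, not part of the original slides: the two-recipe document and the helper name `titles_document` are assumptions made for illustration, and `xml.etree.ElementTree` only approximates the XQuery data model.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-recipe stand-in for recipes.xml (not the course file).
RECIPES_XML = """
<recipes>
  <recipe><title>Ricotta Pie</title></recipe>
  <recipe><title>Zuppa Inglese</title></recipe>
</recipes>
"""

def titles_document(xml_text: str) -> ET.Element:
    """Mimic <titles>{for $r in doc(...)//recipe return $r/title}</titles>."""
    root = ET.fromstring(xml_text)
    titles = ET.Element("titles")              # the part returned as is
    for recipe in root.iter("recipe"):         # for $r in ...//recipe
        for title in recipe.findall("title"):  # return $r/title
            titles.append(title)
    return titles

result = titles_document(RECIPES_XML)
print(ET.tostring(result, encoding="unicode"))
```

The loop yields one result per binding of the variable, in document order, which is what the return clause does in the query.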

  11. XPath is a Fragment of XQuery • doc("recipes.xml")//recipe[1]/title returns an element: <title>Beef Parmesan with Garlic Angel Hair Pasta</title> • doc("recipes.xml")//recipe[position()<=3]/title returns a list of elements: <title>Beef Parmesan with Garlic Angel Hair Pasta</title>, <title>Ricotta Pie</title>, <title>Linguine alla Pescadora</title>

  12. Beware: Attributes in XPath • doc("recipes.xml")//recipe[1]/ingredient[1]/@name • → attribute name {"beef cube steak"} (a constructor for an attribute node) • string(doc("recipes.xml")//recipe[1]/ingredient[1]/@name) • → "beef cube steak" (a value of type string)

  13. Beware: Attributes in XPath (cntd.) • <first-ingredient>{string(doc("recipes.xml")//recipe[1]/ingredient[1]/@name)}</first-ingredient> • → <first-ingredient>beef cube steak</first-ingredient> (an element with string content)

  14. Beware: Attributes in XPath (cntd.) • <first-ingredient>{doc("recipes.xml")//recipe[1]/ingredient[1]/@name}</first-ingredient> • → <first-ingredient name="beef cube steak"/> (an element with an attribute) • Note: The XML that we write down is only the surface structure of the data model that is underlying XQuery

  15. Beware: Attributes in XPath (cntd.) • <first-ingredient oldName="{doc("recipes.xml")//recipe[1]/ingredient[1]/@name}">Beef</first-ingredient> • → <first-ingredient oldName="beef cube steak">Beef</first-ingredient> • Inside an attribute value, an attribute is cast as a string

  16. Constructor Syntax For all constituents of documents, there are constructors: element first-ingredient { attribute oldName {string(doc("recipes.xml")//recipe[1]/ingredient[1]/@name)}, "Beef" } Here element first-ingredient {…} is an element constructor and attribute oldName {…} is an attribute constructor. The expression is equivalent to the notation on the previous slide

  17. Iteration with the For-Clause Syntax: for $var in xpath-expr Example: for $r in doc("recipes.xml")//recipe return string($r) • The expression creates a list of bindings for a variable $var • If $var occurs in an expression exp, then exp is evaluated for each binding • For-clauses can be nested: for $r in doc("recipes.xml")//recipe for $v in doc("vegetables.xml")//vegetable return ...

  18. What Does This Return? for $i in (1,2,3) for $j in (1,2,3) return element {concat("x",$i * $j)} {$i * $j}
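One way to reason about the question is to replay the nested iteration in another language. The Python sketch below is an illustration, not the slide's answer key: it models each constructed element as a (name, content) pair, and the function name `nested_for` is hypothetical.

```python
def nested_for():
    """Replay: for $i in (1,2,3) for $j in (1,2,3) return element x{i*j} {i*j}.

    Each constructed element is modelled as a (name, content) pair.
    The outer variable changes slowest, as in nested for-clauses.
    """
    return [(f"x{i * j}", i * j) for i in (1, 2, 3) for j in (1, 2, 3)]

pairs = nested_for()
for name, content in pairs:
    print(f"<{name}>{content}</{name}>")
```

The sketch produces nine pairs, one per binding of ($i, $j), and some element names occur more than once because different bindings give the same product.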

  19. Nested For-clauses: Example <my-recipes> {for $r in doc("recipes.xml")//recipe return <my-recipe title="{$r/title}"> {for $i in $r//ingredient return <my-ingredient> {string($i/@name)} </my-ingredient> } </my-recipe> } </my-recipes> Returns my-recipes with titles as attributes and my-ingredients with names as text content

  20. The Let Clause Syntax: let $var := xpath-expr • binds variable $var to a list of nodes, with the nodes in document order • does not iterate over the list • allows one to keep intermediate results for reuse (not possible in SQL) Example: let $ooreps := doc("recipes.xml")//recipe[.//ingredient/@name="olive oil"]

  21. Let Clause: Example Calories of recipes with olive oil: <calory-content> {let $ooreps := doc("recipes.xml")//recipe[.//ingredient/@name="olive oil"] for $r in $ooreps return <calories>{$r/title/text()} {": "} {string($r/nutrition/@calories)}</calories>} </calory-content> Note the implicit string concatenation

  22. Let Clause: Example (cntd.) The query returns: <calory-content> <calories>Beef Parmesan: 1167</calories> <calories>Linguine alla Pescadora: 532</calories> </calory-content>
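The difference between let (bind the whole list once) and for (iterate over it) can be sketched in Python. This is an illustrative approximation with the standard library; the sample document, its titles, and its calorie values are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature of recipes.xml; titles and calories are made up.
SAMPLE = """
<recipes>
  <recipe><title>Pasta</title>
    <ingredient name="olive oil"/><ingredient name="garlic"/>
    <nutrition calories="500"/></recipe>
  <recipe><title>Pie</title>
    <ingredient name="butter"/>
    <nutrition calories="350"/></recipe>
</recipes>
"""

root = ET.fromstring(SAMPLE)

# let $ooreps := ...//recipe[.//ingredient/@name="olive oil"]
# The variable is bound once to the whole list; nothing is iterated yet.
ooreps = [r for r in root.iter("recipe")
          if any(i.get("name") == "olive oil" for i in r.iter("ingredient"))]

# for $r in $ooreps return <calories>title: calories</calories>
lines = [f'{r.findtext("title")}: {r.find("nutrition").get("calories")}'
         for r in ooreps]
print(lines)
```

The intermediate list `ooreps` could be reused in further expressions, which is the point of the let clause.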

  23. The Where Clause Syntax: where <condition> • occurs before the return clause • similar to predicates in XPath • comparisons on nodes: “=” compares values, “is” tests node identity, “<<” and “>>” test document order • Example: for $r in doc("recipes.xml")//recipe where $r//ingredient/@name="olive oil" return ...

  24. Quantifiers • Syntax: some/every $var in <node-set> satisfies <expr> • $var is bound to all nodes in <node-set> • Test succeeds if <expr> is true for some/every binding • Note: if <node-set> is empty, then “some” is false and “every” is true

  25. Quantifiers (Example) • Recipes that have some compound ingredient: for $r in doc("recipes.xml")//recipe where some $i in $r/ingredient satisfies $i/ingredient return $r/title • Recipes where every top-level ingredient is non-compound: for $r in doc("recipes.xml")//recipe where every $i in $r/ingredient satisfies not($i/ingredient) return $r/title
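The two quantifiers behave like Python's `any` and `all`, including on empty input. The sketch below is an illustration under assumptions: the sample document is hypothetical, and recipe C has no ingredients (which the Recipes DTD would not allow) purely to show the empty case.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample. Recipe C has no ingredients, which the Recipes DTD
# forbids; it is included only to exercise the empty-sequence case.
SAMPLE = """
<recipes>
  <recipe><title>A</title>
    <ingredient name="dough"><ingredient name="flour"/></ingredient>
    <ingredient name="salt"/>
  </recipe>
  <recipe><title>B</title><ingredient name="salt"/></recipe>
  <recipe><title>C</title></recipe>
</recipes>
"""

def is_compound(ingredient: ET.Element) -> bool:
    """True if the ingredient has at least one nested ingredient child."""
    return ingredient.find("ingredient") is not None

root = ET.fromstring(SAMPLE)

# where some $i in $r/ingredient satisfies $i/ingredient
some_compound = [r.findtext("title") for r in root.iter("recipe")
                 if any(is_compound(i) for i in r.findall("ingredient"))]

# where every $i in $r/ingredient satisfies not($i/ingredient)
all_simple = [r.findtext("title") for r in root.iter("recipe")
              if all(not is_compound(i) for i in r.findall("ingredient"))]

print(some_compound, all_simple)
```

Recipe C appears in the second list but not the first, mirroring the note that on an empty node set "some" is false and "every" is true.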

  26. Element Fusion “To every recipe, add the attribute calories!” <result> {let $rs := doc("recipes.xml")//recipe for $r in $rs return <recipe> {$r/nutrition/@calories} {$r/title} </recipe>} </result> Here {$r/nutrition/@calories} yields an attribute and {$r/title} yields an element

  27. Element Fusion (cntd.) The query result: <result> <recipe calories="1167"> <title>Beef Parmesan with Garlic Angel Hair Pasta</title> </recipe> <recipe calories="349"><title>Ricotta Pie</title></recipe> <recipe calories="532"><title>Linguine Pescadoro</title></recipe> <recipe calories="612"><title>Zuppa Inglese</title></recipe> <recipe calories="8892"> <title>Cailles en Sarcophages</title> </recipe> </result>

  28. Fusion with Mixed Syntax We mix constructor and XML–Syntax: element result {let $rs := doc("recipes.xml")//recipe for $r in $rs return <recipe> {attribute calories {$r/nutrition/@calories}} {$r/title} </recipe>}

  29. The Same with Constructor Syntax Only element result {let $rs := doc("recipes.xml")//recipe for $r in $rs return element recipe { attribute calories{$r/nutrition/@calories}, $r/title } }

  30. Join “Pair every ingredient with the recipes where it is used!” let $rs := doc("recipes.xml")//recipe for $i in $rs//ingredient for $r in $rs where $r//ingredient/@name=$i/@name return <usedin> {$i/@name} {$r/title} </usedin> The where clause is the join condition

  31. Join (cntd.) The query result: <usedin name="beef cube steak"> <title>Beef Parmesan with Garlic Angel Hair Pasta</title> </usedin>, <usedin name="onion, sliced into thin rings"> <title>Beef Parmesan with Garlic Angel Hair Pasta</title> </usedin>, <usedin name="green bell pepper, sliced in rings"> <title>Beef Parmesan with Garlic Angel Hair Pasta</title> </usedin>
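The join is a pair of nested iterations filtered by the join condition. A minimal Python sketch follows, assuming a hypothetical two-recipe document; each `<usedin>` element is modelled as a (name, title) pair.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; recipe titles and ingredient names are made up.
SAMPLE = """
<recipes>
  <recipe><title>R1</title><ingredient name="salt"/><ingredient name="oil"/></recipe>
  <recipe><title>R2</title><ingredient name="salt"/></recipe>
</recipes>
"""

root = ET.fromstring(SAMPLE)
recipes = list(root.iter("recipe"))          # let $rs := ...//recipe

def names(recipe: ET.Element) -> set:
    """All ingredient names used anywhere inside one recipe."""
    return {i.get("name") for i in recipe.iter("ingredient")}

pairs = [
    (i.get("name"), r.findtext("title"))
    for owner in recipes
    for i in owner.iter("ingredient")        # for $i in $rs//ingredient
    for r in recipes                         # for $r in $rs
    if i.get("name") in names(r)             # where ... (join condition)
]
print(pairs)
```

Because "salt" occurs in both recipes, its pairs are produced twice, once per occurrence of the ingredient. That duplication is inherent in iterating over ingredient nodes rather than over distinct names.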

  32. Join Exercise Return the ingredients that occur with different amounts in different contexts, and return • the recipes where they are used • together with the amount used in those recipes, while returning every pair only once. Could a query for these ingredients be expressed in XPath?

  33. Document Inversion “For every ingredient, return all the recipes where it is used!” <result> {let $rs := doc("recipes.xml")//recipe for $i in $rs//ingredient return <ingredient> {$i/@*} {$rs[.//ingredient/@name=$i/@name]/title} </ingredient>} </result> The predicate [.//ingredient/@name=$i/@name] is the join condition

  34. Document Inversion (cntd.) The query result: <result> <ingredient amount="1" name="Alchermes liquor" unit="cup"> <title>Zuppa Inglese</title> </ingredient> … <ingredient amount="2" name="olive oil" unit="tablespoon"> <title>Beef Parmesan with Garlic Angel Hair Pasta</title> <title>Linguine Pescadoro</title> </ingredient> …

  35. Eliminating Duplicates The function distinct-values(Node Set) • extracts the values of a sequence of nodes • creates a duplicate-free list of values Note the coercion: nodes are cast as values! Example: let $rs := doc("recipes.xml")//recipe return distinct-values($rs//ingredient/@name) yields xdt:untypedAtomic("beef cube steak"), xdt:untypedAtomic("onion, sliced into thin rings"), ...

  36. Avoiding Multiple Results in a Join We want every ingredient to be listed only once: eliminate duplicates using distinct-values! <result> {let $rs := doc("recipes.xml")//recipe for $in in distinct-values($rs//ingredient/@name) return <recipes with="{$in}"> {$rs[.//ingredient/@name=$in]/title} </recipes>} </result>

  37. Avoiding Multiple Results (cntd.) The query result: <result> <recipes with="beef cube steak"> <title>Beef Parmesan with Garlic Angel Hair Pasta</title> </recipes> <recipes with="onion, sliced into thin rings"> <title>Beef Parmesan with Garlic Angel Hair Pasta</title> </recipes>... <recipes with="salt"> <title>Linguine Pescadoro</title> <title>Cailles en Sarcophages</title> </recipes> ...
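Iterating over distinct values instead of nodes can be sketched as follows. This Python example is an assumption-laden illustration: the sample document is hypothetical, and `dict.fromkeys` keeps first-occurrence order, whereas the order returned by distinct-values in XQuery is implementation-dependent.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; recipe titles and ingredient names are made up.
SAMPLE = """
<recipes>
  <recipe><title>R1</title><ingredient name="salt"/><ingredient name="oil"/></recipe>
  <recipe><title>R2</title><ingredient name="salt"/></recipe>
</recipes>
"""

root = ET.fromstring(SAMPLE)
recipes = list(root.iter("recipe"))

# distinct-values($rs//ingredient/@name): attribute nodes become plain values,
# and duplicates are dropped (here keeping the first occurrence).
all_names = [i.get("name") for i in root.iter("ingredient")]
distinct = list(dict.fromkeys(all_names))

# for $in in distinct-values(...) return titles of the recipes using $in
used_in = {
    name: [r.findtext("title") for r in recipes
           if any(i.get("name") == name for i in r.iter("ingredient"))]
    for name in distinct
}
print(used_in)
```

Each ingredient name now appears exactly once as a key, with all recipes that use it grouped under it.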

  38. The Order By Clause Syntax: order by expr [ascending|descending] Example: for $iname in doc("recipes.xml")//@name order by $iname descending return string($iname) yields "whole peppercorns", "whole baby clams", "white sugar", ...

  39. The Order By Clause (cntd.) The interpreter must be told whether the values should be regarded as numbers or as strings (alphanumerical sorting is the default): for $r in $rs order by number($r/nutrition/@calories) return $r/title Note: • The query returns titles ... • but the ordering is according to calories, which do not appear in the output. Also possible in SQL! What if combined with distinct-values?
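Why the conversion with number() matters can be seen by sorting the same values both ways. The Python sketch below is an illustration that reuses the titles and calorie values shown in the element-fusion result earlier in the deck; the variable names are hypothetical.

```python
# (title, calories) pairs taken from the element-fusion result in the deck;
# attribute values arrive as strings.
recipes = [
    ("Beef Parmesan with Garlic Angel Hair Pasta", "1167"),
    ("Ricotta Pie", "349"),
    ("Linguine Pescadoro", "532"),
    ("Zuppa Inglese", "612"),
    ("Cailles en Sarcophages", "8892"),
]

# order by $r/nutrition/@calories            (compared as strings)
by_string = [t for t, c in sorted(recipes, key=lambda rc: rc[1])]

# order by number($r/nutrition/@calories)    (compared as numbers)
by_number = [t for t, c in sorted(recipes, key=lambda rc: float(rc[1]))]

print(by_string)
print(by_number)
```

With string comparison "1167" sorts before "349", so the 1167-calorie recipe comes first; with numeric comparison it moves to fourth place. In both cases only titles are returned, while the sort key stays out of the output.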

  40. FLWOR Expressions (pronounced “flower”) We have now seen the main ingredients of XQuery: • For and Let clauses, which can be mixed • a Where clause imposing conditions • an Order By clause, which determines the order of results • a Return clause, which constructs the output. Combining these yields FLWOR expressions.

  41. Conditionals if (expr) then expr else expr Example: let $is := doc("recipes.xml")//ingredient for $i in $is[not(ingredient)] let $u := if (not($i/@unit)) then attribute unit {"pieces"} else () creates an attribute unit="pieces" if none exists and an empty item list otherwise

  42. Conditionals (cntd.) We use the conditional to construct variants of ingredients: let $is := doc("recipes.xml")//ingredient for $i in $is[not(ingredient)] let $u := if (not($i/@unit)) then attribute {"unit"} {"pieces"} else () return <ingredient> {$i/@* | $u} </ingredient> The expression $i/@* | $u collects all attributes in a list and adds a unit if needed

  43. Conditionals (cntd.) The query result: <ingredient name="beef cube steak" amount="1.5" unit="pound"/>, ... <ingredient name="eggs" amount="12" unit="pieces"/>,…
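The conditional default can be sketched in Python as follows. This is a rough analogue, not the slide's query: the sample fragment is hypothetical apart from the two ingredients shown in the result above, and `with_unit` is a helper name invented for the example.

```python
import xml.etree.ElementTree as ET

# Sample fragment: "beef cube steak" and "eggs" follow the result shown in
# the deck; "filling" and "sugar" are made up to include a compound ingredient.
SAMPLE = """
<recipe>
  <ingredient name="beef cube steak" amount="1.5" unit="pound"/>
  <ingredient name="eggs" amount="12"/>
  <ingredient name="filling">
    <ingredient name="sugar" amount="1" unit="cup"/>
  </ingredient>
</recipe>
"""

def with_unit(ingredient: ET.Element) -> ET.Element:
    """Copy the attributes and add unit="pieces" only if no unit exists."""
    # let $u := if (not($i/@unit)) then attribute unit {"pieces"} else ()
    extra = {} if "unit" in ingredient.attrib else {"unit": "pieces"}
    # <ingredient>{$i/@* | $u}</ingredient>
    return ET.Element("ingredient", {**ingredient.attrib, **extra})

root = ET.fromstring(SAMPLE)

# $is[not(ingredient)]: keep only ingredients without nested ingredients
simple = [i for i in root.iter("ingredient") if i.find("ingredient") is None]
variants = [with_unit(i) for i in simple]

for v in variants:
    print(ET.tostring(v, encoding="unicode"))
```

The empty dictionary plays the role of the empty list (): merging it changes nothing, just as the union with () adds no attribute.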

  44. Grouping and Aggregation Aggregation functions: count, sum, avg, min, max Example: The number of simple ingredients per recipe for $r in doc("recipes.xml")//recipe return <number> {attribute title {$r/title/text()}} {count($r//ingredient[not(ingredient)])} </number>

  45. Grouping and Aggregation (cntd.) The query result: <number title="Beef Parmesan with Garlic Angel Hair Pasta"> 11</number>, <number title="Ricotta Pie">12</number>, <number title="Linguine Pescadoro">15</number>, <number title="Zuppa Inglese">8</number>, <number title="Cailles en Sarcophages">30</number>

  46. Nested Aggregation “The recipe with the maximal number of calories!” let $rs := doc("recipes.xml")//recipe let $maxCal := max($rs//@calories) for $r in $rs where $r//@calories = $maxCal return string($r/title) returns "Cailles en Sarcophages"
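The two-step pattern (aggregate first, then select the items that match the aggregate) looks like this in Python. The sketch is illustrative and reuses the titles and calorie values from the element-fusion result earlier in the deck; converting with `float` stands in for the numeric comparison that max performs.

```python
# (title, calories) pairs taken from the element-fusion result in the deck.
recipes = [
    ("Beef Parmesan with Garlic Angel Hair Pasta", "1167"),
    ("Ricotta Pie", "349"),
    ("Linguine Pescadoro", "532"),
    ("Zuppa Inglese", "612"),
    ("Cailles en Sarcophages", "8892"),
]

# let $maxCal := max($rs//@calories)
max_cal = max(float(c) for _, c in recipes)

# for $r in $rs where $r//@calories = $maxCal return string($r/title)
winners = [title for title, c in recipes if float(c) == max_cal]
print(winners)
```

The result is a list because several recipes could share the maximum; here there is exactly one.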

  47. Exercises Write queries that produce • A list, containing for every recipe the recipe's title element and an element with the number of calories • The same, ordered according to calories • The same, alphabetically ordered according to title • The same, ordered according to the fat content • The same, with title as attribute and calories as content. • A list, containing for every recipe the top level ingredients, dropping the lower level ingredients

  48. Sample Solution <results> {for $r in doc("recipes.xml")//recipe return <recipe> {attribute title {$r/title}, for $i in $r/ingredient return if (not($i/ingredient)) then $i else <ingredient> {$i/@*} </ingredient>} </recipe>} </results>

  49. User-defined Functions Function declaration: declare function local:fac($n as xs:integer) as xs:integer { if ($n = 0) then 1 else $n * local:fac($n - 1) }; Function call: local:fac(10)
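For comparison, the same recursive definition can be written in Python. This is only a side-by-side sketch: the type hints play the role of the `as xs:integer` annotations, but Python does not enforce them at run time.

```python
def fac(n: int) -> int:
    """Recursive factorial, mirroring local:fac from the slide."""
    # if ($n = 0) then 1 else $n * local:fac($n - 1)
    return 1 if n == 0 else n * fac(n - 1)

result = fac(10)   # corresponds to the call local:fac(10)
print(result)
```

As in the XQuery version, the body is a single conditional expression rather than a sequence of statements.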

  50. Example: Nested Ingredients declare function local:nest($n as xs:integer, $content as xs:string) as element() { if ($n = 0) then element ingredient{$content} else element ingredient{local:nest($n - 1,$content)} }; local:nest(3,"Stuff")
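The recursion that builds nested elements can be traced with a small Python analogue. This sketch uses `xml.etree.ElementTree` in place of the computed element constructor; the function name `nest` simply mirrors local:nest.

```python
import xml.etree.ElementTree as ET

def nest(n: int, content: str) -> ET.Element:
    """Build an ingredient element wrapped in n further ingredient elements."""
    element = ET.Element("ingredient")      # element ingredient { ... }
    if n == 0:
        element.text = content              # base case: text content
    else:
        element.append(nest(n - 1, content))  # recursive case: nested element
    return element

nested = nest(3, "Stuff")
serialized = ET.tostring(nested, encoding="unicode")
print(serialized)
```

A call with n = 3 produces four ingredient elements in total: three wrappers around the innermost element that holds the text.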
