1 / 41

XQuery

XQuery. In this lecture. Summary of XQuery FLWOR expressions – For, Let, Where, Order by, Return FOR and LET expressions Collections and sorting. Sample Data for Queries.

Download Presentation

XQuery

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. XQuery

  2. In this lecture • Summary of XQuery • FLWOR expressions – For, Let, Where, Order by, Return • FOR and LET expressions • Collections and sorting

  3. Sample Data for Queries <bib><book> <publisher> Addison-Wesley </publisher> <author> Serge Abiteboul </author> <author> <first-name> Rick </first-name> <last-name> Hull </last-name> </author> <author> Victor Vianu </author> <title> Foundations of Databases </title> <year> 1995 </year></book><bookprice=“55”> <publisher> Freeman </publisher> <author> Jeffrey D. Ullman </author> <title> Principles of Database and Knowledge Base Systems </title> <year> 1998 </year></book> </bib>

  4. bib Data Model for XPath The root The root element book book publisher author . . . . Addison-Wesley Serge Abiteboul

  5. FLWOR (“Flower”) Expressions • FLWOR expression supports iteration and binding of variables to intermediate results • useful for computing joins between two or more documents and for restructuring data FOR ... LET... WHERE...ORDER BY…RETURN... • The for and let clauses generate an ordered sequence of tuples of bound variables, called the tuple stream • The optional where clause serves to filter the tuple stream • The optional order by clause can be used to reorder the tuple stream • The return clause constructs the result of the FLWOR expression

  6. FOR-WHERE-RETURN Find all book titles published after 1995: FOR$xINdocument("bib.xml")/bib/book WHERE$x/year/text() > 1995 RETURN$x/title Result: <title> abc </title> <title> def </title> <title> ghi </title>

  7. Equivalently (perhaps more geekish) FOR-WHERE-RETURN FOR$xINdocument("bib.xml")/bib/book[year/text() > 1995] /title RETURN$x And even shorter: document("bib.xml")/bib/book[year/text() > 1995] /title

  8. FOR-WHERE-RETURN • Find all book titles and the year when they were published: FOR$xINdocument("bib.xml")/ bib/bookRETURN <answer> <title>{ $x/title/text() } </title> <year>{ $x/year/text() } </year> </answer> Result: <answer> <title> abc </title> <year> 1995 </ year > </answer> <answer> <title> def </title> < year > 2002 </ year > </answer> <answer> <title> ghk </title> < year > 1980 </ year > </answer>

  9. Nesting For each author of a book by Morgan Kaufmann, list all books she published: FOR$b IN document(“bib.xml”)/bib,$aIN$b/book[publisher /text()=“Morgan Kaufmann”]/author RETURN <result> { $a, FOR$tIN$b/book[author/text()=$a/text()]/title RETURN$t} </result> In the RETURN clause comma concatenates XML fragments

  10. Result <result> <author>Jones</author> <title> abc </title> <title> def </title> </result> <result> <author> Smith </author> <title> ghi </title> </result>

  11. Aggregates Find all books with more than 3 authors: FOR$x IN document("bib.xml")/bib/bookWHEREcount($x/author)>3 RETURN$x count = a function that counts avg = computes the averagesum = computes the sumdistinct-values = eliminates duplicates

  12. Aggregates Same thing: FOR$x IN document("bib.xml")/bib/book[count(author)>3] RETURN$x

  13. Aggregates Print all authors who published more than 3 books – be aware of duplicates ! FOR$b IN document("bib.xml")/bib, $aINdistinct-values($b/book/author/text())WHEREcount($b/book[author/text()=$a])>3 RETURN <author> { $a } </author>

  14. Aggregates <big_publishers> FOR$pINdistinct(document("bib.xml")//publisher) LET$b := document("bib.xml")/book[publisher = $p] WHEREcount($b) > 100 RETURN$p </big_publishers> count = a (aggregate) function that returns the number of elms

  15. Aggregates Find books whose price is larger than average: LET$a=avg(document("bib.xml")/bib/book/@price) FOR$b in document("bib.xml")/bib/book WHERE$b/@price > $a RETURN$b

  16. Flattening • “Flatten” the authors, i.e. return a list of (author, title) pairs FOR$bINdocument("bib.xml")/bib/book,$xIN$b/title/text(),$yIN$b/author/text()RETURN <answer> <title> { $x } </title> <author> { $y } </author> </answer> Result:<answer> <title> abc </title> <author> efg </author></answer><answer> <title> abc </title> <author> hkj </author></answer>

  17. Re-grouping • For each author, return all titles of her/his books Result:<answer> <author> efg </author> <title> abc </title> <title> klm </title> . . . . </answer> FOR$b IN document("bib.xml")/bib,$xIN$b/book/author/text()RETURN <answer> <author> { $x } </author> { FOR$yIN$b/book[author/text()=$x]/titleRETURN$y } </answer> What aboutduplicateauthors ?

  18. Re-grouping • Same, but eliminate duplicate authors: FOR$b IN document("bib.xml")/bibLET$a := distinct-values($b/book/author/text())FOR$xIN$aRETURN <answer> <author> {$x} </author> { FOR$yIN$b/book[author/text()=$x]/titleRETURN$y } </answer>

  19. Re-grouping • Same thing: FOR$b IN document("bib.xml")/bib,$xINdistinct-values($b/book/author/text())RETURN <answer> <author> {$x} </author> { FOR$yIN$b/book[author/text()=$x]/titleRETURN$y } </answer>

  20. Another Example Find book titles by the coauthors of “Database Theory”: FOR$b IN document("bib.xml")/bib,$xIN$b/book[title/text() = “Database Theory”],$yIN$b/book[author/text() = $x/author/text()]RETURN <answer> { $y/title/text() } </answer> Result: <answer> abc </ answer > < answer > def </ answer > < answer > abc </ answer > < answer > ghk </ answer >

  21. Distinct-values Same as before, but eliminate duplicates: FOR$b IN document("bib.xml")/bib,$xIN$b/book[title/text() = “Database Theory”]/author/text(),$yINdistinct-values($b/book[author/text() = $x] /title/text()) RETURN <answer> { $y } </answer> Result: <answer> abc </ answer > < answer > def </ answer > < answer > ghk </ answer > distinct-values = a function that eliminates duplicates

  22. FOR$x in document(“db.xml”)/db/Product/rowORDER BY$x/price/text()RETURN <answer> { $x/name, $x/price } </answer> SELECT x.name, x.priceFROM Product xORDER BY x.price SQL XQuery SQL and XQuery Side-by-side Find all product names, prices,sort by price Product(pid, name, maker, price)

  23. Xquery’s Answer <answer> <name> abc </name> <price> 7 </price></answer> <answer> <name> def </name> <price> 23 </price></answer> . . . . Notice: this is NOT awell-formed document !(WHY ???)

  24. Producing a Well-Formed Answer <myQuery> { FOR$x in document(“db.xml”)/db/Product/rowORDER BY$x/price/text()RETURN <answer> { $x/name, $x/price } </answer> }</myQuery>

  25. Xquery’s Answer <myQuery> <answer> <name> abc </name> <price> 7 </price> </answer> <answer> <name> def </name> <price> 23 </price> </answer> . . . .</myQuery> Now it is well-formed !

  26. FOR$r in document(“db.xml”)/db,$x in $r/Product/row,$y in $r/Company/rowWHERE$x/maker/text()=$y/cid/text() and $y/city/text() = “Seattle”RETURN { $x/name } SELECT x.nameFROM Product x, Company yWHERE x.maker=y.cid and y.city=“Seattle” SQL XQuery SQL and XQuery Side-by-side Product(pid, name, maker, price)Company(cid, name, city, revenues) Find all products made in Seattle FOR$y in /db/Company/row[city/text()=“Seattle”],$x in /db/Product/row[maker/text()=$y/cid/text()]RETURN { $x/name }

  27. <product> <row> <pid> 123 </pid> <name> abc </name> <maker> efg </maker> </row> <row> …. </row> … </product><product> . . . </product>. . . .

  28. SQL and XQuery Side-by-side For each company with revenues < 1M count the products over $100 SELECT y.name, count(*)FROM Product x, Company yWHERE x.price > 100 and x.maker=y.cid and y.revenue < 1000000GROUP BY y.cid, y.name FOR$r in document(“db.xml”)/db,$y in $r/Company/row[revenue/text()<1000000]RETURN <proudCompany> <companyName> { $y/name/text() } </companyName> <numberOfExpensiveProducts> { count($r/Product/row[maker/text()=$y/cid/text()][price/text()>100]) } </numberOfExpensiveProducts> </proudCompany>

  29. SQL and XQuery Side-by-side Find companies with at least 30 products, and their average price SELECT y.name, avg(x.price)FROM Product x, Company yWHERE x.maker=y.cidGROUP BY y.cid, y.nameHAVING count(*) > 30 An element FOR$r in document(“db.xml”)/db,$y in $r/Company/rowLET$p := $r/Product/row[maker/text()=$y/cid/text()]WHEREcount($p) > 30RETURN <theCompany> <companyName> { $y/name/text() } </companyName> <avgPrice> avg($p/price/text()) </avgPrice> </theCompany> A collection

  30. Sorting in XQuery <publisher_list> FOR$bIN document("bib.xml")//book[publisher = ‘M.K’] ORDERBY$b/price/text() RETURN <book> { $b/title , $b/price } </book> </publisher_list>

  31. If-Then-Else FOR$h IN //holding RETURN <holding> $h/title, IF$h/@type = "Journal" THEN$h/editor ELSE$h/author </holding> SORTBY (title)

  32. Existential Quantifiers FOR$b IN //book WHERESOME$p IN $b//paraSATISFIES contains($p, "sailing") AND contains($p, "windsurfing") RETURN$b/title

  33. Universal Quantifiers FOR$b IN //book WHEREEVERY$p IN $b//paraSATISFIES contains($p, "sailing") RETURN$b/title

  34. Duplicate Elimination • distinct-values(list-of-text-values) <row> <a>3</a> <b>100</b> </row> <row> <a>8</a> <b>500</b> </row> <row> <a>3</a> <b>100</b> </row> <row> <a>3</a> <b>200</b> </row> <row> <a>8</a> <b>500</b> </row> <row> <a>3</a> <b>100</b> </row> <row> <a>8</a> <b>500</b> </row> <row> <a>3</a> <b>200</b> </row>

  35. FOR v.s. LET • FOR$x in expr -- binds $x to each element in the list expr • LET$x = expr -- binds $x to the entire list expr • Useful for common subexpressions and for aggregations

  36. FOR v.s. LET Summary: • FOR-LET-WHERE-RETURN = FLWR FOR/LET Clauses List of tuples WHERE Clause List of tuples RETURN Clause Instance of Xquery data model

  37. FOR v.s. LET FOR • Binds node variables iteration LET • Binds collection variables one value

  38. FOR v.s. LET Returns: <result> <book>...</book></result> <result> <book>...</book></result> <result> <book>...</book></result> ... FOR$xINdocument("bib.xml")/bib/book RETURN <result> $x </result> LET$x:=document("bib.xml")/bib/book RETURN <result> $x </result> Returns: <result> <book>...</book> <book>...</book> <book>...</book> ... </result>

  39. Collections in XQuery • Ordered and unordered collections • /bib/book/author = an ordered collection • Distinct(/bib/book/author) = an unordered collection • LET$a = /bib/book $a is a collection • $b/author  a collection (several authors...) Returns: <result> <author>...</author> <author>...</author> <author>...</author> ... </result> RETURN <result> $b/author </result>

  40. Collections in XQuery What about collections in expressions ? • $b/@price list of n prices • $b/@price * 0.7  list of n numbers • $b/@price * $b/@quantity  list of n x m numbers ?? • $b/@price * ($b/@quant1 + $b/@quant2)  $b/@price * $b/@quant1 + $b/@price * $b/@quant2 !!

  41. Sorting in XQuery <publisher_list> FOR$pINdistinct(document("bib.xml")//publisher) RETURN <publisher> <name> $p/text() </name> , FOR$bIN document("bib.xml")//book[publisher = $p] RETURN <book> $b/title , $b/@price </book> SORTBY(priceDESCENDING) </publisher> SORTBY(name) </publisher_list>

More Related