about XML/Xquery/RDF



  1. about XML/Xquery/RDF

  2. HTML vs. XML
      HTML:
      <h1> Bibliography </h1>
      <p> <i> Foundations of Databases </i> Abiteboul, Hull, Vianu <br> Addison Wesley, 1995
      <p> <i> Data on the Web </i> Abiteboul, Buneman, Suciu <br> Morgan Kaufmann, 1999
      XML:
      <bibliography>
        <book>
          <title> Foundations… </title>
          <author> Abiteboul </author>
          <author> Hull </author>
          <author> Vianu </author>
          <publisher> Addison Wesley </publisher>
          <year> 1995 </year>
        </book>
        …
      </bibliography>
      XML is “self-describing”: schema information is part of the data. Good for data exchange (albeit baroque for storage).

  3. Why are Database folks so excited about XML? • XML is just a syntax for (self-describing) data • This is still exciting because • No standard syntax for relational data • With XML, we can • Translate any legacy data to XML • Can exchange data in XML format • Ship over the web, input to any application

  4. XML machine accessible meaning (Jim Hendler): This is what a web page in natural language looks like to a machine.

  5. XML machine accessible meaning (Jim Hendler): XML allows “meaningful tags” to be added to parts of the text. [Figure: a web page with parts tagged name, education, CV, work, private]

  6. XML machine accessible meaning (Jim Hendler): But to your machine, the tags look like this… [Figure: the same tagged page as a machine sees it]

  7. XML machine accessible meaning (Jim Hendler): Schemas help… by relating common terms between documents. [Figure: the CV and private tags linked through a schema]

  8. But other people use other schemas (Jim Hendler): Someone else has one like this… [Figure: a second page tagged name, educ, CV, work, private]

  9. But other people use other schemas (Jim Hendler): …which don’t fit in. Moral: There is still a need for ontology mapping.

  10. 11/18

  11. The X-standards… • XML: an on-the-wire representation for data • Xquery: a query language for XML • Xschema: a schema description language for XML data • RDF: a language for meta-data description • WSDL/SOAP/UDDI: languages for describing services

  12. XML Terminology • tags: book, title, author, … • start tag: <book>, end tag: </book> • elements: <book>…</book>, <author>…</author> • elements are nested • empty element: <red></red>, abbreviated <red/> • an XML document has a single root element • a well-formed XML document has matching, properly nested tags
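The terminology above can be checked mechanically. The following is a minimal sketch in Python's standard `xml.etree.ElementTree` (not part of the original slides); the sample strings and the helper name `is_well_formed` are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if text parses as one properly nested XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample documents built from the tags on the slide.
nested = "<book><title>Foundations</title><author>Hull</author></book>"
mismatched = "<book><title>Foundations</book></title>"
two_roots = "<book></book><book></book>"

print(is_well_formed(nested))      # properly nested, single root
print(is_well_formed(mismatched))  # end tags in the wrong order
print(is_well_formed(two_roots))   # more than one root element

# <red></red> and <red/> denote the same empty element.
same_empty = (
    ET.tostring(ET.fromstring("<red></red>"))
    == ET.tostring(ET.fromstring("<red/>"))
)
print(same_empty)
```

The parser only checks well-formedness here; it says nothing about whether the document follows any schema.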

  13. HTML describes presentation, XML describes content
      HTML:
      <h1> Bibliography </h1>
      <p> <i> Foundations of Databases </i> Abiteboul, Hull, Vianu <br> Addison Wesley, 1995
      <p> <i> Data on the Web </i> Abiteboul, Buneman, Suciu <br> Morgan Kaufmann, 1999
      XML:
      <bibliography>
        <book>
          <title> Foundations… </title>
          <author> Abiteboul </author>
          <author> Hull </author>
          <author> Vianu </author>
          <publisher> Addison Wesley </publisher>
          <year> 1995 </year>
        </book>
        …
      </bibliography>

  14. XML Terminology • tags: book, title, author, … • start tag: <book>, end tag: </book> • elements: <book>…</book>, <author>…</author> • elements are nested • empty element: <red></red>, abbreviated <red/> • an XML document has a single root element • a well-formed XML document has matching, properly nested tags

  15. More XML: Attributes
      <book price="55" currency="USD">
        <title> Foundations of Databases </title>
        <author> Abiteboul </author>
        …
        <year> 1995 </year>
      </book>
      Attributes are single-valued. There is no guidance on when to use them.
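To make "single-valued" concrete, here is a small Python sketch using the standard `xml.etree.ElementTree` module (an illustration, not something from the slides). The extra authors are taken from the bibliography slide; the duplicate-attribute string is a made-up counterexample.

```python
import xml.etree.ElementTree as ET

book = ET.fromstring(
    '<book price="55" currency="USD">'
    "<title>Foundations of Databases</title>"
    "<author>Abiteboul</author><author>Hull</author><author>Vianu</author>"
    "<year>1995</year>"
    "</book>"
)

# Each attribute name maps to exactly one string value.
price = book.get("price")
currency = book.get("currency")

# Repeating an attribute on one element is rejected by the parser.
try:
    ET.fromstring('<book price="55" price="60"/>')
    duplicate_accepted = True
except ET.ParseError:
    duplicate_accepted = False

# Multi-valued data has to go into repeated child elements instead.
authors = [a.text for a in book.findall("author")]
print(price, currency, duplicate_accepted, authors)
```

This is one practical rule of thumb that follows from the slide: anything that can repeat (such as authors) cannot be an attribute.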

  16. More XML: Oids and References (object identifiers)
      <person id="o555"> <name> Jane </name> </person>
      <person id="o456"> <name> Mary </name> <children idref="o123 o555"/> </person>
      <person id="o123" mother="o456"> <name> John </name> </person>
      Oids and references in XML are just syntax.
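Because oids and references are "just syntax", the application has to resolve them itself. A minimal Python sketch of that lookup follows, assuming the three `person` elements are wrapped in a hypothetical `<people>` root (the slide shows no root) and using a hypothetical helper `children_of`.

```python
import xml.etree.ElementTree as ET

# The <people> wrapper is an assumption; the slide shows only the persons.
doc = ET.fromstring(
    "<people>"
    '<person id="o555"><name>Jane</name></person>'
    '<person id="o456"><name>Mary</name><children idref="o123 o555"/></person>'
    '<person id="o123" mother="o456"><name>John</name></person>'
    "</people>"
)

# Build an index from oid to element; the parser does not do this for us.
by_id = {p.get("id"): p for p in doc.iter("person")}


def children_of(person):
    """Follow the space-separated idref list to the referenced persons."""
    ref = person.find("children")
    if ref is None:
        return []
    return [by_id[oid].findtext("name") for oid in ref.get("idref").split()]


mary_children = children_of(by_id["o456"])
john_mother = by_id[by_id["o123"].get("mother")].findtext("name")
print(mary_children, john_mother)
```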

  17. XML vs. Relational Data [Figure: text, XML, and structured (relational) data placed on an axis from less structure to more structure] • XML is meant as a language that supports both text and structured data • Conflicting demands... • XML supports semi-structured data • In essence, the schema can be a union of multiple schemas • Easy to represent books with or without prices, books with any number of authors, etc. • XML supports free mixing of text and data • using the #PCDATA type • XML is ordered (while relational data is unordered)

  18. DTDs
      <!DOCTYPE paper [
        <!ELEMENT paper (section*)>
        <!ELEMENT section ((title,section*) | text)>
        <!ELEMENT title (#PCDATA)>
        <!ELEMENT text (#PCDATA)>
      ]>
      Notice that a DTD is not in XML syntax. The content model is semi-structured: a section is either a title followed by any number of subsections, or plain text.
      <paper>
        <section> <text> </text> </section>
        <section>
          <title> </title>
          <section> … </section>
          <section> … </section>
        </section>
      </paper>

  19. XML Schemas • More recent proposal (with XML syntax) • unifies previous schema proposals • generalizes DTDs • uses XML syntax • two documents: structure and datatypes • http://www.w3.org/TR/xmlschema-1 • http://www.w3.org/TR/xmlschema-2

  20. RDF: Meta-data Standard for Web
      <rdf:Description about="www.mypage.com">
        <about> birds, butterflies, snakes </about>
        <author>
          <rdf:Description>
            <firstname> John </firstname>
            <lastname> Smith </lastname>
          </rdf:Description>
        </author>
      </rdf:Description>
      Good ol’ semantic networks..?

  21. Querying XML • Requirements: • Need to handle lack of schema. • We may not know much about the data, so we need to navigate the XML. • Need to support both “information retrieval” and “SQL-style” queries. • Ordered vs. un-ordered XML • “Human readable” • like SQL?  • Candidates • Many… based on conflicting requirements • XSL: Makes IR folks happy • XML-QL: Makes DB folks happy • Xquery : W3C’s attempt to make everybody (un)happy

  22. 11/20 Agenda: Xquery examples Information Integration

  23. Xquery Resources • XQuery 1.0: An XML Query Language • W3C Working Draft 20 December 2001 • XML Query Use Cases • W3C Working Draft 20 December 2001 • Microsoft .Net Xquery Language Demo • http://131.107.228.20/ • Supports querying on the documents described in the W3C Use Cases • Xquery Tutorial by Fankhauser & Wadler • www.research.avayalabs.com/user/wadler/papers/xquery-tutorial/xquery-tutorial.pdf

  24. FLoWeR Expressions Xquery queries are made up of FLWR expressions that work on “paths” • For binds variables to nodes • Let computes aggregates • Where applies a formula to find matching elements • Return constructs the output elements Path expressions are of the form: element//element/element[attrib=value]
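The four FLWR clauses map closely onto an ordinary loop. The sketch below imitates them in Python with the standard `xml.etree.ElementTree` module; it is an analogy rather than an XQuery engine, and the two-book sample, the `authors` attribute, and the XQuery shown in the comments are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-book sample in the shape of the bib.xml use case.
bib = ET.fromstring(
    "<bib>"
    '<book year="1994"><title>TCP/IP Illustrated</title>'
    "<author><last>Stevens</last></author></book>"
    '<book year="2000"><title>Data on the Web</title>'
    "<author><last>Abiteboul</last></author>"
    "<author><last>Buneman</last></author>"
    "<author><last>Suciu</last></author></book>"
    "</bib>"
)

results = []
for b in bib.findall("book"):                     # for   $b in /bib/book
    n = len(b.findall("author"))                  # let   $n := count($b/author)
    if n > 1:                                     # where $n > 1
        out = ET.Element("book", authors=str(n))  # return <book authors=...>
        out.append(b.find("title"))               #          { $b/title } </book>
        results.append(out)

serialized = [ET.tostring(r, encoding="unicode") for r in results]
print(serialized)

# A path with an attribute test, like element//element[attrib=value].
titles_2000 = [t.text for t in bib.findall(".//book[@year='2000']/title")]
print(titles_2000)
```

ElementTree implements only a small XPath subset, so the `where` test is written as a plain Python condition.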

  25. Comparison to SQL • Look at the use case description on Xquery manual • Supports all (?) SQL style queries (with different syntax of course) [default queries in the demo] • Has support for • “construction”—outputting the answers in arbitrary XML formats (use case XMP ) • “path expressions” --- navigating the XML tree (use case seq) • Simple text queries [use case text] • Allows queries on “Tag” elements • Removes the “data/meta-data” barrier in queries • For each book that has at least one author, list the title and first two authors, and an empty "et-al" element if the book has additional authors. [XMP use case 6]

  26. DTD for http://www.bn.com/bib.xml
      <!ELEMENT bib (book* )>
      <!ELEMENT book (title, (author+ | editor+ ), publisher, price )>
      <!ATTLIST book year CDATA #REQUIRED >
      <!ELEMENT author (last, first )>
      <!ELEMENT editor (last, first, affiliation )>
      <!ELEMENT title (#PCDATA )>
      <!ELEMENT last (#PCDATA )>
      <!ELEMENT first (#PCDATA )>
      <!ELEMENT affiliation (#PCDATA )>
      <!ELEMENT publisher (#PCDATA )>
      <!ELEMENT price (#PCDATA )>

  27. Example Query
      “For all Addison-Wesley books published after 1991, return the title, keeping the year as an attribute”
      Query:
      <bib> {
        for $b in /bib/book
        where $b/publisher = "Addison-Wesley" and $b/@year > 1991
        return <book year={ $b/@year }> { $b/title } </book>
      } </bib>
      Result:
      <bib>
        <book year="1994"> <title>TCP/IP Illustrated</title> </book>
        <book year="1992"> <title>Advanced Programming in the Unix environment</title> </book>
      </bib>

  28. Example Query (2): Join
      • Return the books that cost more at amazon than fatbrain
      let $amazon := document("http://www.amazon.com/books.xml")
      let $fatbrain := document("http://www.fatbrain.com/books.xml")
      for $am in $amazon/books/book,
          $fat in $fatbrain/books/book
      where $am/isbn = $fat/isbn and $am/price > $fat/price
      return <book>{ $am/title, $am/price, $fat/price }</book>
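The join can be mimicked in Python to see what the query computes. This is a sketch under stated assumptions: the two inline documents stand in for the amazon and fatbrain URLs, and their ISBNs and prices are invented sample data. It uses a dictionary lookup on `isbn` instead of the nested loop the query implies.

```python
import xml.etree.ElementTree as ET

# Invented stand-ins for the two remote books.xml documents.
amazon = ET.fromstring(
    "<books>"
    "<book><isbn>111</isbn><title>TCP/IP Illustrated</title><price>65.95</price></book>"
    "<book><isbn>222</isbn><title>Data on the Web</title><price>39.95</price></book>"
    "</books>"
)
fatbrain = ET.fromstring(
    "<books>"
    "<book><isbn>111</isbn><title>TCP/IP Illustrated</title><price>59.95</price></book>"
    "<book><isbn>222</isbn><title>Data on the Web</title><price>42.00</price></book>"
    "</books>"
)

# Index one side by the join key (isbn), then probe it from the other side.
fat_price = {
    b.findtext("isbn"): float(b.findtext("price"))
    for b in fatbrain.findall("book")
}

pricier_at_amazon = []
for am in amazon.findall("book"):
    isbn = am.findtext("isbn")
    am_price = float(am.findtext("price"))
    # where $am/isbn = $fat/isbn and $am/price > $fat/price
    if isbn in fat_price and am_price > fat_price[isbn]:
        pricier_at_amazon.append((am.findtext("title"), am_price, fat_price[isbn]))

print(pricier_at_amazon)
```

Prices are converted to numbers before comparing; comparing the raw strings would order them alphabetically.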

  29. XML frenzy in the DB Community • Now that XML is there, what can we do with it? • Convert all databases from Relational to XML? • Or provide XML views of relational databases? • Develop theory of native XML databases? • Or assume that XML data will be stored in relational databases.. • Issues: What sort of storage mechanisms? What sort of indices?

  30. XML middleware for Databases • XML adapters (middle-ware) received significant attention in DB community • SilkRoute (AT&T) • Xperanto (IBM) • Issues: • Need to convert relational data into XML • Tagging (easy) • Need to convert Xquery queries into equivalent SQL queries • Trickier as Xquery supports schema querying

  31. Xquery Tutorial Craig Knoblock University of Southern California

  32. References • XQuery 1.0: An XML Query Language • W3C Working Draft 20 December 2001 • XML Query Use Cases • W3C Working Draft 20 December 2001 • Microsoft .Net Xquery Language Demo • http://131.107.228.20/ • Supports querying on the documents described in the W3C Use Cases • Xquery Tutorial by Fankhauser & Wadler • www.research.avayalabs.com/user/wadler/papers/xquery-tutorial/xquery-tutorial.pdf

  33. DTD for http://www.bn.com/bib.xml
      <!ELEMENT bib (book* )>
      <!ELEMENT book (title, (author+ | editor+ ), publisher, price )>
      <!ATTLIST book year CDATA #REQUIRED >
      <!ELEMENT author (last, first )>
      <!ELEMENT editor (last, first, affiliation )>
      <!ELEMENT title (#PCDATA )>
      <!ELEMENT last (#PCDATA )>
      <!ELEMENT first (#PCDATA )>
      <!ELEMENT affiliation (#PCDATA )>
      <!ELEMENT publisher (#PCDATA )>
      <!ELEMENT price (#PCDATA )>

  34. Data for www.bn.com/bib.xml
      <bib>
        <book year="1994">
          <title>TCP/IP Illustrated</title>
          <author><last>Stevens</last><first>W.</first></author>
          <publisher>Addison-Wesley</publisher>
          <price> 65.95</price>
        </book>
        <book year="1992">
          <title>Advanced Programming in the Unix environment</title>
          <author><last>Stevens</last><first>W.</first></author>
          <publisher>Addison-Wesley</publisher>
          <price>65.95</price>
        </book>

  35. Data for www.bn.com/bib.xml (cont.)
        <book year="2000">
          <title>Data on the Web</title>
          <author><last>Abiteboul</last><first>Serge</first></author>
          <author><last>Buneman</last><first>Peter</first></author>
          <author><last>Suciu</last><first>Dan</first></author>
          <publisher>Morgan Kaufmann Publishers</publisher>
          <price> 39.95</price>
        </book>
        <book year="1999">
          <title>The Economics of Technology and Content for Digital TV</title>
          <editor>
            <last>Gerbarg</last><first>Darcy</first>
            <affiliation>CITI</affiliation>
          </editor>
          <publisher>Kluwer Academic Publishers</publisher>
          <price>129.95</price>
        </book>
      </bib>

  36. Document References • A document can be referenced either explicitly or in the default namespace • In the Microsoft demo, /bib = document("http://www.bn.com/bib.xml")/bib • We will use /bib throughout, but you must use the expansion to run the demo • In Theseus, the document for the XQuery is passed as input

  37. Projection
      • Return the names of all authors of books
      /bib/book/author
      =
      <author><last>Stevens</last><first>W.</first></author>
      <author><last>Stevens</last><first>W.</first></author>
      <author><last>Abiteboul</last><first>Serge</first></author>
      <author><last>Buneman</last><first>Peter</first></author>
      <author><last>Suciu</last><first>Dan</first></author>

  38. Projection (cont.)
      • The same query can also be written as a for loop
      /bib/book/author
      =
      for $bk in /bib/book return
        for $aut in $bk/author return $aut
      =
      <author><last>Stevens</last><first>W.</first></author>
      <author><last>Stevens</last><first>W.</first></author>
      <author><last>Abiteboul</last><first>Serge</first></author>
      <author><last>Buneman</last><first>Peter</first></author>
      <author><last>Suciu</last><first>Dan</first></author>
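The equivalence of the path form and the nested-loop form can be checked outside XQuery as well. The following Python sketch uses the standard `xml.etree.ElementTree` module on a trimmed copy of the bib.xml data (publisher and price omitted for brevity); it illustrates the idea and is not the tutorial's own tooling.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the bib.xml sample: titles, authors and the one editor.
bib = ET.fromstring(
    "<bib>"
    '<book year="1994"><title>TCP/IP Illustrated</title>'
    "<author><last>Stevens</last><first>W.</first></author></book>"
    '<book year="1992"><title>Advanced Programming in the Unix environment</title>'
    "<author><last>Stevens</last><first>W.</first></author></book>"
    '<book year="2000"><title>Data on the Web</title>'
    "<author><last>Abiteboul</last><first>Serge</first></author>"
    "<author><last>Buneman</last><first>Peter</first></author>"
    "<author><last>Suciu</last><first>Dan</first></author></book>"
    '<book year="1999"><title>The Economics of Technology and Content for Digital TV</title>'
    "<editor><last>Gerbarg</last><first>Darcy</first></editor></book>"
    "</bib>"
)

# Path form: /bib/book/author (the root element here is already <bib>).
by_path = [a.findtext("last") for a in bib.findall("book/author")]

# Loop form: for $bk in /bib/book return for $aut in $bk/author return $aut
by_loops = [
    aut.findtext("last")
    for bk in bib.findall("book")
    for aut in bk.findall("author")
]

print(by_path)
print(by_path == by_loops)
```

Both forms keep document order and keep duplicates, which is why Stevens appears twice.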

  39. Selection
      • Return the titles of all books published before 1997
      /bib/book[@year < "1997"]/title
      =
      <title>TCP/IP Illustrated</title>
      <title>Advanced Programming in the Unix environment</title>

  40. Selection (cont.)
      • Return the titles of all books published before 1997
      /bib/book[@year < "1997"]/title
      =
      for $bk in /bib/book
      where $bk/@year < "1997"
      return $bk/title
      =
      <title>TCP/IP Illustrated</title>
      <title>Advanced Programming in the Unix environment</title>
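As a cross-check of the selection, here is a Python sketch over a trimmed copy of the bib.xml data (titles, years and prices only). The standard `xml.etree.ElementTree` path syntax has no `<` comparison in predicates, so the filter is written as a Python condition; that substitution is a workaround in this sketch, not something the slides prescribe.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the bib.xml sample data.
bib = ET.fromstring(
    "<bib>"
    '<book year="1994"><title>TCP/IP Illustrated</title><price> 65.95</price></book>'
    '<book year="1992"><title>Advanced Programming in the Unix environment</title>'
    "<price>65.95</price></book>"
    '<book year="2000"><title>Data on the Web</title><price> 39.95</price></book>'
    '<book year="1999"><title>The Economics of Technology and Content for Digital TV</title>'
    "<price>129.95</price></book>"
    "</bib>"
)

# Numeric comparison on the year attribute.
before_1997 = [
    bk.findtext("title")
    for bk in bib.findall("book")
    if int(bk.get("year")) < 1997
]

# String comparison, as written on the slide; it agrees here because
# every year in the sample has four digits.
before_1997_as_text = [
    bk.findtext("title")
    for bk in bib.findall("book")
    if bk.get("year") < "1997"
]

print(before_1997)
print(before_1997 == before_1997_as_text)
```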

  41. Selection (cont.)
      • Return the book with the title “Data on the Web”
      /bib/book[title = "Data on the Web"]
      =
      <book year="2000">
        <title>Data on the Web</title>
        <author><last>Abiteboul</last><first>Serge</first></author>
        <author><last>Buneman</last><first>Peter</first></author>
        <author><last>Suciu</last><first>Dan</first></author>
        <publisher>Morgan Kaufmann Publishers</publisher>
        <price> 39.95</price>
      </book>

  42. Selection (cont.)
      • Return the price of the book “Data on the Web”
      /bib/book[title = "Data on the Web"]/price
      =
      <price> 39.95</price>
      How would you return the book with a price of $39.95?

  43. Selection (cont.)
      • Return the book with a price of $39.95
      for $bk in /bib/book
      where $bk/price = " 39.95"
      return $bk
      =
      <book year="2000">
        <title>Data on the Web</title>
        <author><last>Abiteboul</last><first>Serge</first></author>
        <author><last>Buneman</last><first>Peter</first></author>
        <author><last>Suciu</last><first>Dan</first></author>
        <publisher>Morgan Kaufmann Publishers</publisher>
        <price> 39.95</price>
      </book>
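The query above matches only because its string literal repeats the leading space stored in the data. The Python sketch below (standard `xml.etree.ElementTree`, trimmed sample data) shows the pitfall and one possible way around it, comparing prices as numbers; the numeric variant is a suggestion of this sketch, not part of the tutorial.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the bib.xml sample; note the leading space in two prices.
bib = ET.fromstring(
    "<bib>"
    '<book year="1994"><title>TCP/IP Illustrated</title><price> 65.95</price></book>'
    '<book year="1992"><title>Advanced Programming in the Unix environment</title>'
    "<price>65.95</price></book>"
    '<book year="2000"><title>Data on the Web</title><price> 39.95</price></book>'
    '<book year="1999"><title>The Economics of Technology and Content for Digital TV</title>'
    "<price>129.95</price></book>"
    "</bib>"
)


def titles_where(predicate):
    """Titles of the books whose raw <price> text satisfies predicate."""
    return [
        bk.findtext("title")
        for bk in bib.findall("book")
        if predicate(bk.findtext("price"))
    ]


exact_without_space = titles_where(lambda p: p == "39.95")
exact_with_space = titles_where(lambda p: p == " 39.95")
numeric = titles_where(lambda p: float(p) == 39.95)  # float() ignores the space

print(exact_without_space, exact_with_space, numeric)
```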

  44. Construction
      • Return year and title of all books published before 1997
      for $bk in /bib/book
      where $bk/@year < "1997"
      return <book>{ $bk/@year, $bk/title }</book>
      =
      <book year="1994"> <title>TCP/IP Illustrated</title> </book>
      <book year="1992"> <title>Advanced Programming in the Unix environment</title> </book>

  45. Grouping
      • Return titles for each author
      for $author in distinct(/bib/book/author/last)
      return <author name={ $author/text() }>
               { /bib/book[author/last = $author]/title }
             </author>
      =
      <author name="Stevens">
        <title>TCP/IP Illustrated</title>
        <title>Advanced Programming in the Unix environment</title>
      </author>
      <author name="Abiteboul">
        <title>Data on the Web</title>
      </author>
      …
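The grouping query has two steps: collect the distinct last names, then gather the titles for each one. A rough Python equivalent on a trimmed copy of the bib.xml data is sketched below; `dict.fromkeys` stands in for `distinct(...)`, which is a choice made for this sketch only.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the bib.xml sample: titles, author last names, one editor.
bib = ET.fromstring(
    "<bib>"
    '<book year="1994"><title>TCP/IP Illustrated</title>'
    "<author><last>Stevens</last></author></book>"
    '<book year="1992"><title>Advanced Programming in the Unix environment</title>'
    "<author><last>Stevens</last></author></book>"
    '<book year="2000"><title>Data on the Web</title>'
    "<author><last>Abiteboul</last></author>"
    "<author><last>Buneman</last></author>"
    "<author><last>Suciu</last></author></book>"
    '<book year="1999"><title>The Economics of Technology and Content for Digital TV</title>'
    "<editor><last>Gerbarg</last></editor></book>"
    "</bib>"
)

# distinct(/bib/book/author/last): drop duplicates, keep first-seen order.
last_names = list(dict.fromkeys(l.text for l in bib.findall("book/author/last")))

# /bib/book[author/last = $author]/title, evaluated once per distinct name.
titles_by_author = {
    name: [
        bk.findtext("title")
        for bk in bib.findall("book")
        if any(l.text == name for l in bk.findall("author/last"))
    ]
    for name in last_names
}

print(titles_by_author)
```

The edited volume contributes nothing because its `last` element sits under `editor`, not `author`.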

  46. Join
      • Return the books that cost more at amazon than fatbrain
      let $amazon := document("http://www.amazon.com/books.xml")
      let $fatbrain := document("http://www.fatbrain.com/books.xml")
      for $am in $amazon/books/book,
          $fat in $fatbrain/books/book
      where $am/isbn = $fat/isbn and $am/price > $fat/price
      return <book>{ $am/title, $am/price, $fat/price }</book>

  47. Example Query 1
      <bib> {
        for $b in /bib/book
        where $b/publisher = "Addison-Wesley" and $b/@year > 1991
        return <book year={ $b/@year }> { $b/title } </book>
      } </bib>
      What does this do?

  48. Result Query 1
      <bib>
        <book year="1994"> <title>TCP/IP Illustrated</title> </book>
        <book year="1992"> <title>Advanced Programming in the Unix environment</title> </book>
      </bib>

  49. Example Query 2
      <results> {
        for $b in document("http://www.bn.com/bib.xml")/bib/book,
            $t in $b/title,
            $a in $b/author
        return <result> { $t } { $a } </result>
      } </results>
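Query 2 iterates over three variables at once, so it emits one `result` per (title, author) pair within each book. The Python sketch below reproduces that on a trimmed copy of the bib.xml data with the standard `xml.etree.ElementTree` module; it is an illustration of the iteration order, not output from the demo.

```python
import copy
import xml.etree.ElementTree as ET

# Trimmed copy of the bib.xml sample: titles, author last names, one editor.
bib = ET.fromstring(
    "<bib>"
    '<book year="1994"><title>TCP/IP Illustrated</title>'
    "<author><last>Stevens</last></author></book>"
    '<book year="1992"><title>Advanced Programming in the Unix environment</title>'
    "<author><last>Stevens</last></author></book>"
    '<book year="2000"><title>Data on the Web</title>'
    "<author><last>Abiteboul</last></author>"
    "<author><last>Buneman</last></author>"
    "<author><last>Suciu</last></author></book>"
    '<book year="1999"><title>The Economics of Technology and Content for Digital TV</title>'
    "<editor><last>Gerbarg</last></editor></book>"
    "</bib>"
)

results = ET.Element("results")
for b in bib.findall("book"):          # for $b in .../bib/book,
    for t in b.findall("title"):       #     $t in $b/title,
        for a in b.findall("author"):  #     $a in $b/author
            result = ET.SubElement(results, "result")
            # Copy the nodes so the source tree is left untouched.
            result.append(copy.deepcopy(t))
            result.append(copy.deepcopy(a))

pairs = [(r.findtext("title"), r.findtext("author/last")) for r in results]
print(pairs)
```

A book with three authors yields three results, and the edited volume yields none because it has no `author` children.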
