XML and XPath

EXtensible Markup Language (XML)
• a W3C standard to complement HTML
• a markup language much like HTML
• origins: structured text, SGML
• motivation:
  • HTML describes presentation
  • XML describes content
• related languages: HTML, XML, SGML
• http://www.w3.org/TR/2000/REC-xml-20001006 (XML 1.0, Second Edition, 10/2000)
Web Services: XML+XPath

HTML Describes the Presentation
  <h1>Bibliography</h1>
  <p><i>Foundations of Databases</i> Abiteboul, Hull, Vianu
     <br>Addison Wesley, 1995
  <p><i>Data on the Web</i> Abiteboul, Buneman, Suciu
     <br>Morgan Kaufmann, 1999

XML Describes the Contents
  <bibliography>
    <book>
      <title> Foundations… </title>
      <author> Abiteboul </author>
      <author> Hull </author>
      <author> Vianu </author>
      <publisher> Addison Wesley </publisher>
      <year> 1995 </year>
    </book>
    …
  </bibliography>

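Because the tags name the content rather than its layout, a program can pull out fields by name. The following is a minimal sketch using Python's standard `xml.etree.ElementTree`; the function name `describe_books` is an illustrative assumption, and the abbreviated title from the slide is written out in full.

```python
import xml.etree.ElementTree as ET

# Sample document modeled on the slide above.
BIBLIOGRAPHY = """
<bibliography>
  <book>
    <title> Foundations of Databases </title>
    <author> Abiteboul </author>
    <author> Hull </author>
    <author> Vianu </author>
    <publisher> Addison Wesley </publisher>
    <year> 1995 </year>
  </book>
</bibliography>
"""


def describe_books(xml_text):
    """Return one dict per <book>, with fields looked up by tag name."""
    root = ET.fromstring(xml_text)
    books = []
    for book in root.findall("book"):
        books.append({
            "title": book.findtext("title").strip(),
            "authors": [a.text.strip() for a in book.findall("author")],
            "publisher": book.findtext("publisher").strip(),
            "year": int(book.findtext("year")),
        })
    return books


books = describe_books(BIBLIOGRAPHY)
print(books)
```

The same extraction from the HTML version would have to guess which italic run is a title and which comma-separated words are authors.
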
More on XML
• XML was designed to describe data contents
• XML tags are not predefined in XML. You must "invent" your own tags
• XML uses a Document Type Definition (DTD) or an XML Schema to describe the data
• XML with a DTD or XML Schema is designed to be self-descriptive

Applications and Impact of XML
• Data can be stored outside HTML
• Data can be exchanged between incompatible systems
• Financial information can be exchanged over the Internet
• Plain text files can be used to share data
• Data is available to more users
• Used to create new languages
  • WML, WSDL, BPEL, …

Benefits of Using XML
• XML is structured
• XML documents are easily committed to a persistence layer
• XML is platform-independent, textual information
• XML is an open standard
• XML is language independent
• DOM and SAX are open, language-independent sets of interfaces
• XML is web enabled
• XML is totally extensible
• XML supports shareable structure (using DTDs)
• XML enables interoperability

XML Terminology
• tags: book, title, author, …
• start tag: <book>, end tag: </book>
• elements: <book>…</book>, <author>…</author>
• elements are nested
• empty element: <red></red>, abbreviated <red/>
• an XML document: a single root element
• well-formed XML document: every start tag has a matching end tag and elements are properly nested

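Well-formedness can be checked mechanically: a conforming parser rejects any document that violates it. A minimal sketch, assuming Python's standard `xml.etree.ElementTree` (the helper name `is_well_formed` is illustrative):

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts the text as a well-formed document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed("<book><title>Data on the Web</title></book>"))  # True
print(is_well_formed("<book><title>Data on the Web</book></title>"))  # False: bad nesting
print(is_well_formed("<red/>"))                                       # True: empty element
print(is_well_formed("<a/><b/>"))                                     # False: two root elements
```
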
Attributes are Alternative Ways to Represent Data
  <book price="55" currency="USD">
    <title>Foundations of Databases</title>
    <author>Abiteboul</author>
    …
    <year> 1995 </year>
  </book>

CDATA Section
• Syntax: <![CDATA[ .....any text here...]]>
• Example:
  <script><![CDATA[
    function matchwo(a, b) {
      if (a < b && a < 0) { return 1 } else { return 0 }
    }
  ]]></script>

Entity References
• Syntax: &entityname;
• Example: <element> this is less than &lt; </element>
• Some entities (the five predefined in XML):
  &lt;    <
  &gt;    >
  &amp;   &
  &apos;  '
  &quot;  "

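CDATA sections and entity references are two spellings of the same character data: after parsing, the application sees identical text. A small sketch with Python's standard library illustrates this; the sample string reuses the condition from the CDATA example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# The same text written once with entity references, once in a CDATA section.
with_entities = ET.fromstring("<element>a &lt; b &amp;&amp; a &lt; 0</element>")
with_cdata = ET.fromstring("<element><![CDATA[a < b && a < 0]]></element>")

print(with_entities.text)
print(with_cdata.text)

# Going the other way: escape() replaces &, < and > with entity references
# so that arbitrary text can be placed inside an element.
escaped = escape("a < b && a < 0")
print(escaped)
```
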
Processing Instructions
• Syntax: <?target argument?>
• Example:
  <product>
    <name> Alarm Clock </name>
    <?ringBell 20?>
    <price>19.99</price>
  </product>
• What do they mean?

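XML itself assigns no meaning to a processing instruction; the parser only hands the target and its argument to the application, which decides what to do. A minimal sketch, assuming Python's standard `xml.dom.minidom`, shows what the application receives for the example above:

```python
from xml.dom import minidom, Node

PRODUCT = (
    "<product><name> Alarm Clock </name>"
    "<?ringBell 20?><price>19.99</price></product>"
)

doc = minidom.parseString(PRODUCT)

# Collect (target, argument) pairs for every processing instruction
# that is a child of the <product> element.
instructions = [
    (node.target, node.data)
    for node in doc.documentElement.childNodes
    if node.nodeType == Node.PROCESSING_INSTRUCTION_NODE
]
print(instructions)
```
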
XML Comments
• Syntax: <!-- .... comment text... -->
• Yes, they are part of the data model!

Name Conflicts
  <table>
    <tr> <td>Apples</td> <td>Bananas</td> </tr>
  </table>

  <table>
    <name>African Coffee Table</name>
    <width>80</width>
    <length>120</length>
  </table>
• Solution: prefix
  <h:table>
    <h:tr> <h:td>Apples</h:td> <h:td>Bananas</h:td> </h:tr>
  </h:table>

  <f:table>
    <f:name>African Coffee Table</f:name>
    <f:width>80</f:width>
    <f:length>120</f:length>
  </f:table>

XML Namespaces
• syntactic: <number>, <isbn:number>
• semantic: provide URL for schema
  <tag xmlns:mystyle="http://…">
    …
    <mystyle:title> … </mystyle:title>
    <mystyle:number> … </mystyle:number>
  </tag>

XML Namespaces
• http://www.w3.org/TR/REC-xml-names (1/99)
• name ::= [prefix:]localpart
  <book xmlns:isbn="www.isbn-org.org/def">
    <title> … </title>
    <number> 15 </number>
    <isbn:number> … </isbn:number>
  </book>

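To a namespace-aware parser, the prefix is only shorthand: `number` and `isbn:number` are different names because the second is qualified by the namespace URI. A sketch with Python's standard `xml.etree.ElementTree`; the title and ISBN values are placeholders invented for this example, since the slide elides them.

```python
import xml.etree.ElementTree as ET

# Placeholder content where the slide shows "…".
BOOK = """
<book xmlns:isbn="www.isbn-org.org/def">
  <title>Example Title</title>
  <number>15</number>
  <isbn:number>0-000-00000-0</isbn:number>
</book>
"""

root = ET.fromstring(BOOK)

# Map a prefix (any prefix we like) to the namespace URI for lookups.
namespaces = {"isbn": "www.isbn-org.org/def"}

plain = root.find("number")
qualified = root.find("isbn:number", namespaces)

print(plain.tag)      # number
print(qualified.tag)  # {www.isbn-org.org/def}number
```

ElementTree stores the expanded name `{URI}localpart`, so the prefix chosen in the document does not matter once it is parsed.
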
XPath
• http://www.w3.org/TR/xpath (11/99)
• Building block for other W3C standards:
  • XSL Transformations (XSLT)
  • XML Link (XLink)
  • XML Pointer (XPointer)
  • XML Query
• Was originally part of XSL

Example for XPath Queries
  <bib>
    <book>
      <publisher> Addison-Wesley </publisher>
      <author> Serge Abiteboul </author>
      <author> <first-name> Rick </first-name>
               <last-name> Hull </last-name>
      </author>
      <author> Victor Vianu </author>
      <title> Foundations of Databases </title>
      <year> 1995 </year>
    </book>
    <book price="55">
      <publisher> Freeman </publisher>
      <author> Jeffrey D. Ullman </author>
      <title> Principles of Database and Knowledge Base Systems </title>
      <year> 1998 </year>
    </book>
  </bib>

Data Model for XPath
(tree diagram)
  The root
    processing instruction
    comment
    bib (the root element)
      book
        publisher: Addison-Wesley
        author: Serge Abiteboul
        . . . .
      book

Simple Expressions
  /bib/book/year
  Result: <year> 1995 </year>
          <year> 1998 </year>

  /bib/paper/year
  Result: empty (there were no papers)

Restricted Kleene Closure
  //author
  Result: <author> Serge Abiteboul </author>
          <author> <first-name> Rick </first-name>
                   <last-name> Hull </last-name> </author>
          <author> Victor Vianu </author>
          <author> Jeffrey D. Ullman </author>

  /bib//first-name
  Result: <first-name> Rick </first-name>

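These path queries can be tried with Python's standard `xml.etree.ElementTree`, which supports a limited subset of XPath. One assumption to note: ElementTree evaluates paths relative to an element, so with `bib` as the starting element `/bib/book/year` is written `book/year` and `//author` is written `.//author`.

```python
import xml.etree.ElementTree as ET

BIB = """<bib>
<book>
  <publisher>Addison-Wesley</publisher>
  <author>Serge Abiteboul</author>
  <author><first-name>Rick</first-name><last-name>Hull</last-name></author>
  <author>Victor Vianu</author>
  <title>Foundations of Databases</title>
  <year>1995</year>
</book>
<book price="55">
  <publisher>Freeman</publisher>
  <author>Jeffrey D. Ullman</author>
  <title>Principles of Database and Knowledge Base Systems</title>
  <year>1998</year>
</book>
</bib>"""

bib = ET.fromstring(BIB)

# /bib/book/year
years = [year.text for year in bib.findall("book/year")]

# /bib/paper/year: no <paper> elements, so the result is empty
paper_years = bib.findall("paper/year")

# //author: authors at any depth
authors = bib.findall(".//author")

# /bib//first-name
first_names = [f.text for f in bib.findall(".//first-name")]

print(years, paper_years, len(authors), first_names)
```
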
Functions
  /bib/book/author/text()
  Result: Serge Abiteboul
          Victor Vianu
          Jeffrey D. Ullman
  Rick Hull doesn't appear because his author element contains first-name and last-name elements rather than text
• Functions in XPath:
  • text() = matches the text value
  • node() = matches any node (= * or @* or text())
  • name() = returns the name of the current tag

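The standard library's ElementTree does not implement `text()`, so the sketch below emulates it by hand: the text children of an element are its `.text` plus the `.tail` of each child element. The helper `direct_text` is an illustrative name, and whitespace-only text is dropped, which is a simplification relative to a strict XPath engine.

```python
import xml.etree.ElementTree as ET

BIB = """<bib>
<book>
  <author>Serge Abiteboul</author>
  <author><first-name>Rick</first-name><last-name>Hull</last-name></author>
  <author>Victor Vianu</author>
</book>
<book price="55">
  <author>Jeffrey D. Ullman</author>
</book>
</bib>"""

bib = ET.fromstring(BIB)


def direct_text(elem):
    """Non-blank text nodes that are direct children of elem."""
    parts = [elem.text] + [child.tail for child in elem]
    return [p.strip() for p in parts if p and p.strip()]


# Emulates /bib/book/author/text()
author_texts = [t for a in bib.findall("book/author") for t in direct_text(a)]
print(author_texts)

# name() corresponds to the .tag of an element
tag_names = sorted({elem.tag for elem in bib.iter()})
print(tag_names)
```
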
Wildcard
  //author/*
  Result: <first-name> Rick </first-name>
          <last-name> Hull </last-name>
  * matches any element

Attribute Nodes
  /bib/book/@price
  Result: "55"
  @price means that price has to be an attribute

Qualifiers
  /bib/book/author[first-name]
  Result: <author> <first-name> Rick </first-name>
                   <last-name> Hull </last-name> </author>

  /bib/book/author[firstname][address[//zip][city]]/lastname
  Result: <lastname> … </lastname>
          <lastname> … </lastname>

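Attribute tests and existence qualifiers are within the XPath subset that Python's `xml.etree.ElementTree` understands. The sketch below assumes that library; since it cannot return attribute nodes directly, `/bib/book/@price` is approximated by selecting the books that carry the attribute and reading its value.

```python
import xml.etree.ElementTree as ET

BIB = """<bib>
<book>
  <author>Serge Abiteboul</author>
  <author><first-name>Rick</first-name><last-name>Hull</last-name></author>
  <title>Foundations of Databases</title>
</book>
<book price="55">
  <author>Jeffrey D. Ullman</author>
  <title>Principles of Database and Knowledge Base Systems</title>
</book>
</bib>"""

bib = ET.fromstring(BIB)

# Approximates /bib/book/@price
prices = [book.get("price") for book in bib.findall("book[@price]")]

# /bib/book/author[first-name]: authors that have a first-name child
structured_authors = bib.findall("book/author[first-name]")
last_names = [a.findtext("last-name") for a in structured_authors]

print(prices, last_names)
```
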
More Qualifiers
  /bib/book[@price < "60"]
  /bib/book[author/@age < "25"]
  /bib/book[author/text()]

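Comparison qualifiers such as `@price < "60"` are outside the XPath subset of Python's standard ElementTree, so this sketch applies the comparison in Python after selecting candidates. The helper `price_below` is an illustrative name; a book without a `price` attribute fails the test, as it would in XPath.

```python
import xml.etree.ElementTree as ET

BIB = """<bib>
<book>
  <author>Serge Abiteboul</author>
  <title>Foundations of Databases</title>
</book>
<book price="55">
  <author>Jeffrey D. Ullman</author>
  <title>Principles of Database and Knowledge Base Systems</title>
</book>
</bib>"""

bib = ET.fromstring(BIB)


def price_below(book, limit):
    """True if the book has a price attribute numerically below limit."""
    price = book.get("price")
    return price is not None and float(price) < limit


# Emulates /bib/book[@price < "60"]
cheap_titles = [b.findtext("title") for b in bib.findall("book") if price_below(b, 60)]

# Emulates /bib/book[author/text()], ignoring whitespace-only text
# (a strict XPath engine would count whitespace text nodes as well).
with_text_author = [
    b.findtext("title")
    for b in bib.findall("book")
    if any((a.text or "").strip() for a in b.findall("author"))
]

print(cheap_titles)
print(with_text_author)
```
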
XPath Summary
  bib                   matches a bib element
  *                     matches any element
  /                     matches the root
  /bib                  matches a bib element under root
  bib/paper             matches a paper in bib
  bib//paper            matches a paper in bib, at any depth
  //paper               matches a paper at any depth
  paper|book            matches a paper or a book
  @price                matches a price attribute
  bib/book/@price       matches price attribute in book, in bib
  bib/book[@price<"55"]/author/lastname   matches…

XPath: More Details
• An XPath expression, p, establishes a relation between:
  • A context node, and
  • A node in the answer set
• In other words, p denotes a function:
  • S[p] : Nodes -> {Nodes}
• Examples:
  • author/firstname
  • . = self
  • .. = parent
  • part/*/*/subpart/../name = part/*/*[subpart]/name

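The last equivalence can be checked on a small document: stepping down to `subpart` and back up with `..` selects the same nodes as filtering with `[subpart]`. The sketch assumes Python's `xml.etree.ElementTree`, which supports both forms; the element names `root`, `a`, and `b` are invented for this example.

```python
import xml.etree.ElementTree as ET

DOC = """<root>
  <part>
    <a>
      <b><subpart/><name>has subpart</name></b>
      <b><name>no subpart</name></b>
    </a>
  </part>
</root>"""

root = ET.fromstring(DOC)

# The context node is <root>; each expression maps it to a set of nodes.
via_parent = [n.text for n in root.findall("part/*/*/subpart/../name")]
via_qualifier = [n.text for n in root.findall("part/*/*[subpart]/name")]

print(via_parent)
print(via_qualifier)
```
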
The Root
  <bib><paper>1</paper><paper>2</paper></bib>
• bib is the document element
• The root is above bib
• /bib = returns the document element
• / = returns the root
• Why? Because we may have comments before and after <bib>; they become siblings of <bib>
• This is advanced xmlogy

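The distinction is visible in a DOM tree. In the sketch below, which assumes Python's standard `xml.dom.minidom`, the document node plays the role of the XPath root and the comments (added here for illustration) appear as siblings of the `bib` element.

```python
from xml.dom import minidom, Node

DOC = "<!-- before --><bib><paper>1</paper><paper>2</paper></bib><!-- after -->"

document = minidom.parseString(DOC)

# Children of the root: a comment, the document element, another comment.
child_kinds = [child.nodeType for child in document.childNodes]
document_element = document.documentElement

print(child_kinds == [Node.COMMENT_NODE, Node.ELEMENT_NODE, Node.COMMENT_NODE])
print(document_element.tagName)
```
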
XPath Navigation
• We can navigate along 13 axes:
  ancestor, ancestor-or-self, attribute, child, descendant, descendant-or-self, following, following-sibling, namespace, parent, preceding, preceding-sibling, self
• So far we have only seen some of these, through the abbreviations /, //, @, . and ..

Examples
• Examples:
  • child::author/child::lastname = author/lastname
  • child::author/descendant::zip = author//zip
  • child::author/parent::* = author/..
  • child::author/attribute::age = author/@age
• What does this mean?
  • paper/publisher/parent::*/author
  • /bib//address[ancestor::book]
  • /bib//author/ancestor::*//zip

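As one reading of `/bib//address[ancestor::book]`: select every address below bib that has a book somewhere above it. Python's standard ElementTree does not implement the ancestor axis, so the sketch below emulates it with a hand-built parent map; the document and zip values are invented for illustration.

```python
import xml.etree.ElementTree as ET

DOC = """<bib>
  <book><author><address><zip>19104</zip></address></author></book>
  <paper><author><address><zip>98195</zip></address></author></paper>
</bib>"""

bib = ET.fromstring(DOC)

# ElementTree elements do not know their parent, so record it explicitly.
parent_of = {child: parent for parent in bib.iter() for child in parent}


def ancestors(elem):
    """Yield the parent, grandparent, ... of elem (the ancestor axis)."""
    while elem in parent_of:
        elem = parent_of[elem]
        yield elem


# Emulates /bib//address[ancestor::book]
addresses_in_books = [
    address
    for address in bib.iter("address")
    if any(a.tag == "book" for a in ancestors(address))
]
zips = [address.findtext("zip") for address in addresses_in_books]
print(zips)
```
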
More Examples
• name() = the name of the current node
  • /bib//*[name()='book'] same as /bib//book
• What does this mean?
  /bib//*[ancestor::*[name()!='book']]
• Navigation axes give us strictly more power!