
XPath

XPath. Tao Wan, March 04, 2002. What is XPath? A language designed to be used by XSL Transformations (XSLT), XLink, XPointer and XML Query. Primary purpose: to address parts of an XML document, and to provide basic facilities for manipulating strings, numbers and booleans.


Presentation Transcript


  1. XPath Tao Wan March 04, 2002

  2. What is XPath? • A language designed to be used by XSL Transformations (XSLT), XLink, XPointer and XML Query. • Primary purpose: to address parts of an XML document, and to provide basic facilities for manipulating strings, numbers and booleans.

  3. Outline • Introduction • Data Model • XPath Syntax • Location Path • General XPath Expressions • Core Function Library • XPath Utilities • Conclusion

  4. Introduction • W3C Recommendation, November 16, 1999 • Latest version: http://www.w3.org/TR/xpath • XPath uses a compact, string-based syntax rather than an XML element-based syntax. • It operates on the abstract, logical structure of an XML document rather than its surface syntax. • It uses a path notation (as in URLs) to navigate through the hierarchical tree structure of the document.

  5. Introduction Cont. • XPath models an XML document as a tree of nodes and defines a way to compute a string-value for each type of node. • Supports namespaces. • The expression (Expr) is the primary syntactic construct of XPath.

  6. Data Model • The way XPath represents an XML document: as a tree. • The tree consists of 7 types of nodes: • Root Node • Element Nodes • Attribute Nodes • Namespace Nodes • Processing Instruction Nodes • Comment Nodes • Text Nodes • The nodes of the tree are ordered in document order: for elements, the order in which their start-tags occur in the XML document.

  7. Data Model Example
     <?xml version="1.0"?>
     <?xml-stylesheet type="text/xsl" href="bib.xsl"?>
     <!-- simple XML document -->
     <bib>
       <book price="25.00" pages="400">
         <publisher>IDG books</publisher>
         <author>
           <first-name>Rick</first-name>
           <last-name>Hull</last-name>
         </author>
         <author>Simon North</author>
         <title>XML complete</title>
         <year>1997</year>
       </book>
       <book>
         <publisher>Freeman</publisher>
         <author>Jeffrey D. Ullman</author>
         <title>Principles of Database</title>
         <year>1998</year>
       </book>
     </bib>
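
To make the node types of the data model concrete, here is a minimal Python sketch that parses a shortened version of the example document and lists the nodes it finds in document order. It uses the standard library's `xml.dom.minidom`, whose DOM tree is close to, but not identical with, the XPath data model (for instance, it has no namespace nodes); the helper name `outline` and the shortened document are assumptions made for illustration.

```python
from xml.dom import Node, minidom

# Shortened version of the example document from the slide.
DOC = """<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="bib.xsl"?>
<!-- simple XML document -->
<bib><book price="25.00" pages="400"><title>XML complete</title></book></bib>"""

KIND = {
    Node.ELEMENT_NODE: "element",
    Node.TEXT_NODE: "text",
    Node.COMMENT_NODE: "comment",
    Node.PROCESSING_INSTRUCTION_NODE: "processing-instruction",
}


def outline(node, depth=0):
    """Return (depth, kind, label) tuples for all nodes below `node`, in document order."""
    rows = []
    for child in node.childNodes:
        kind = KIND.get(child.nodeType, "other")
        if kind in ("element", "processing-instruction"):
            label = child.nodeName
        else:
            label = getattr(child, "data", "").strip()
        rows.append((depth, kind, label))
        if kind == "element":
            # Attributes are attached to their element; they are not child nodes.
            for name, value in sorted(child.attributes.items()):
                rows.append((depth + 1, "attribute", f"{name}={value}"))
            rows.extend(outline(child, depth + 1))
    return rows


document = minidom.parseString(DOC)  # the document object plays the role of the root node
rows = outline(document)
for depth, kind, label in rows:
    print("  " * depth + f"{kind}: {label}")
```

Note that the XML declaration on the first line does not show up as a node, while the `xml-stylesheet` processing instruction and the comment do.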

  8. XPath Syntax • The expression is the primary syntactic construct in XPath. • An expression is evaluated to yield an object of one of 4 basic types: • node-set (unordered collection of nodes without duplicates) • boolean (true/false) • number (a floating-point number) • string (sequence of UCS characters) • Expression evaluation occurs with respect to a context (XSLT and XPointer specify how the context is determined). • The location path is one important kind of expression. • Location paths select a set of nodes relative to the context node.

  9. Location Path • A location path provides the mechanism for addressing parts of an XML document, similar to file system addressing. Ex: /book/year (selects the year element children of the book document element) • Every location path can be expressed using a straightforward but rather verbose syntax; abbreviations exist for common cases: • unabbreviated (verbose) syntax Ex: child::* (selects all element children of the context node) • abbreviated syntax Ex: * (equivalent to the unabbreviated child::* above)

  10. Location Path Cont. • Two types of paths: relative and absolute • Relative location path: a sequence of one or more location steps separated by / • Absolute location path: / optionally followed by a relative location path Ex. child::bib/child::book (selects the book element children of the bib element children of the context node) Ex. / (selects the root node of the document containing the context node)

  11. Location Path Examples • Examples in the verbose (unabbreviated) syntax, which has syntactic abbreviations for common cases: • child::book selects the book element children of the context node • child::* selects all element children of the context node • attribute::price selects the price attribute of the context node • descendant::book selects all book descendants of the context node • self::book selects the context node if it is a book element (otherwise selects nothing) • child::*/child::book selects all book grandchildren of the context node • / selects the document root (which is always the parent of the document element)
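
The location paths above can be tried out with a short Python sketch. It uses the standard library's `xml.etree.ElementTree`, which supports only a limited subset of XPath in abbreviated form, so each verbose path from the slide is shown as a comment next to its nearest ElementTree equivalent. The context node is assumed to be the `bib` element of the example document.

```python
import xml.etree.ElementTree as ET

# The example document; `bib` is used as the context node throughout.
bib = ET.fromstring(
    "<bib>"
    "<book price='25.00' pages='400'><publisher>IDG books</publisher>"
    "<author><first-name>Rick</first-name><last-name>Hull</last-name></author>"
    "<author>Simon North</author><title>XML complete</title><year>1997</year></book>"
    "<book><publisher>Freeman</publisher><author>Jeffrey D. Ullman</author>"
    "<title>Principles of Database</title><year>1998</year></book>"
    "</bib>"
)

books = bib.findall("book")                  # child::book
all_children = bib.findall("*")              # child::*
grandchild_titles = bib.findall("*/title")   # child::*/child::title
years = bib.findall(".//year")               # selects the same nodes as descendant::year here
price = books[0].get("price")                # attribute::price, with the first book as context

print([t.text for t in grandchild_titles])
print([y.text for y in years], price)
```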

  12. Location Steps • A location step has 3 parts: • axis (specifies the relationship between the selected nodes and the context node) • node test (specifies the node type and expanded-name of the selected nodes) • predicates (arbitrary expressions that refine the selected set of nodes) • The syntax of a location step is the axis name and node test separated by a double colon, followed by zero or more expressions, each in square brackets. • Evaluating a location step means generating an initial node-set from the axis (relationship to the context node) and the node test (node type and expanded-name), then filtering that node-set by each of the predicates in turn. Ex: child::book[position()=1], where child is the name of the axis, book is the node test, and [position()=1] is a predicate • Ex: descendant::book[position()=1] first selects all book element descendants of the context node, then keeps only the one that is the first book descendant of the context node.
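
The three parts of a location step can be illustrated with a small Python sketch that splits an unabbreviated step into axis, node test and predicates. This is not a full XPath parser: the function name `parse_step` is hypothetical, and the regular expression assumes that predicates contain no nested square brackets.

```python
import re

STEP = re.compile(
    r"^(?:(?P<axis>[a-z-]+)::)?"      # optional axis name followed by '::'
    r"(?P<test>[^\[\]]+)"             # node test, e.g. book, * or text()
    r"(?P<preds>(?:\[[^\[\]]*\])*)$"  # zero or more [predicate] parts
)


def parse_step(step):
    """Split one location step into (axis, node_test, [predicates])."""
    match = STEP.match(step.strip())
    if match is None:
        raise ValueError(f"not a simple location step: {step!r}")
    predicates = re.findall(r"\[([^\[\]]*)\]", match.group("preds"))
    # When no axis is written, child is the default axis.
    return (match.group("axis") or "child", match.group("test"), predicates)


print(parse_step("child::book[position()=1]"))
print(parse_step("child::*[self::author or self::year][position()=last()]"))
```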

  13. Location Steps • Axes • 13 axes are defined in XPath (only some of them have appeared in the examples so far): • ancestor, ancestor-or-self • attribute • child • descendant, descendant-or-self • self • following • preceding • following-sibling, preceding-sibling • namespace • parent • Node test • Identifies the type and expanded-name of a node. • Can use a name, a wildcard or a node-type test such as text() to check type and name. Ex. child::text() selects the text node children of the context node. child::book selects the book element children of the context node. attribute::* selects all attributes of the context node.
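
As a rough illustration of what some of these axes mean, the following Python sketch computes four of them over an `xml.etree.ElementTree` tree. ElementTree elements do not know their parent, so a parent map is built first; the function names simply mirror the axis names and are not part of any library. Results are returned as plain lists, with `ancestor` listing the nearest ancestor first.

```python
import xml.etree.ElementTree as ET

bib = ET.fromstring(
    "<bib>"
    "<book price='25.00' pages='400'><publisher>IDG books</publisher>"
    "<author><first-name>Rick</first-name><last-name>Hull</last-name></author>"
    "<author>Simon North</author><title>XML complete</title><year>1997</year></book>"
    "<book><publisher>Freeman</publisher><author>Jeffrey D. Ullman</author>"
    "<title>Principles of Database</title><year>1998</year></book>"
    "</bib>"
)

# ElementTree has no parent pointers, so record each element's parent once.
PARENT = {child: parent for parent in bib.iter() for child in parent}


def ancestor(node):
    """Elements on the path from `node` up to the document element."""
    result = []
    while node in PARENT:
        node = PARENT[node]
        result.append(node)
    return result


def descendant(node):
    """All elements below `node`, in document order."""
    return [n for n in node.iter() if n is not node]


def following_sibling(node):
    """Siblings that come after `node` under the same parent."""
    if node not in PARENT:
        return []
    siblings = list(PARENT[node])
    return siblings[siblings.index(node) + 1:]


def preceding_sibling(node):
    """Siblings that come before `node` under the same parent."""
    if node not in PARENT:
        return []
    siblings = list(PARENT[node])
    return siblings[:siblings.index(node)]


first_name = bib.find("book/author/first-name")
publisher = bib.find("book/publisher")
print([n.tag for n in ancestor(first_name)])
print([n.tag for n in following_sibling(publisher)])
```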

  14. Location Step Cont. • Predicate • A predicate filters a node-set with respect to an axis to produce a new node-set. • Predicates are XPath expressions (normally boolean expressions) written in square brackets after the basis (axis and node test). Ex. child::book[attribute::price="25"] (selects all book children of the context node that have a price attribute with value 25) • A PredicateExpr is evaluated by evaluating the Expr and converting the result to a boolean (true or false).
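
An attribute predicate of this kind can be tried with Python's `xml.etree.ElementTree`, which supports the abbreviated `[@attr='value']` and `[@attr]` forms. The sketch below is only an approximation of full XPath: ElementTree compares the attribute as a string, so `'25'` does not match the stored value `25.00`. In XPath 1.0 a comparison against the string literal "25" is also a string comparison, whereas a comparison against the number 25 would convert the attribute value to a number first.

```python
import xml.etree.ElementTree as ET

bib = ET.fromstring(
    "<bib>"
    "<book price='25.00' pages='400'><title>XML complete</title></book>"
    "<book><title>Principles of Database</title></book>"
    "</bib>"
)

exact_match = bib.findall("book[@price='25.00']")  # string equality on the attribute value
short_form = bib.findall("book[@price='25']")      # no match: '25' != '25.00' as strings
has_price = bib.findall("book[@price]")            # books that have a price attribute at all

print(len(exact_match), len(short_form), len(has_price))
```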

  15. Examples • Axis and node test: descendant::publisher (selects the publisher elements that are descendants of the context node) attribute::* (selects all attributes of the context node) • Basis and predicate: child::book[3] (selects the 3rd book child of the context node) child::*[self::author or self::year][position()=last()] (selects the last author or year child of the context node) child::book[attribute::page="400"][5] (selects the fifth book child of the context node that has a page attribute with value 400)
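
Because each predicate filters the node-set produced by the one before it, the order of predicates matters. The following Python sketch mimics this by hand over a list of child elements; `apply_predicates` and the predicate functions are hypothetical helpers written for illustration, each receiving the node, its position (starting at 1) and the size of the current node list.

```python
import xml.etree.ElementTree as ET

book = ET.fromstring(
    "<book><publisher>IDG books</publisher><author>Rick Hull</author>"
    "<author>Simon North</author><title>XML complete</title><year>1997</year></book>"
)


def apply_predicates(nodes, predicates):
    """Filter `nodes` by each predicate in turn, renumbering positions every time."""
    for predicate in predicates:
        size = len(nodes)
        nodes = [n for pos, n in enumerate(nodes, start=1) if predicate(n, pos, size)]
    return nodes


def is_author_or_year(node, pos, size):   # [self::author or self::year]
    return node.tag in ("author", "year")


def is_author(node, pos, size):           # [self::author]
    return node.tag == "author"


def is_last(node, pos, size):             # [position()=last()]
    return pos == size


def is_second(node, pos, size):           # [2]
    return pos == 2


children = list(book)  # child::*

# child::*[self::author or self::year][position()=last()]
last_author_or_year = apply_predicates(children, [is_author_or_year, is_last])
# child::*[self::author][2]: the second author
second_author = apply_predicates(children, [is_author, is_second])
# child::*[2][self::author]: the second child, if it is an author
second_child_if_author = apply_predicates(children, [is_second, is_author])

print([n.tag for n in last_author_or_year])
print(second_author[0].text, second_child_if_author[0].text)
```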

  16. Abbreviated Syntax • The abbreviated syntax is a simpler way to express location paths. • Abbreviations exist for the common cases, not for every case. • Each abbreviation can be converted to its unabbreviated form. • child:: can be omitted from a location step (child is the default axis). Ex. bib/book is equivalent to child::bib/child::book • attribute:: can be abbreviated to @. Ex. book[@price="25"] is short for child::book[attribute::price="25"] • // is short for /descendant-or-self::node()/. Ex. book//author is short for child::book/descendant-or-self::node()/child::author • A location step of . is short for self::node(). Ex. .//book is short for self::node()/descendant-or-self::node()/child::book • A location step of .. is short for parent::node(). Ex. ../title is short for parent::node()/child::title
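
The abbreviation rules are mechanical enough to sketch as a string rewrite. The Python function below, with the hypothetical name `expand`, applies the five rules from the slide. It is deliberately naive: it assumes that predicates and string literals contain no `/` or `@` characters other than the attribute abbreviation itself, so it should be read as an illustration of the rules rather than a real XPath rewriter.

```python
def expand(path: str) -> str:
    """Expand an abbreviated location path into its unabbreviated form."""
    # '//' is short for '/descendant-or-self::node()/'
    path = path.replace("//", "/descendant-or-self::node()/")
    expanded = []
    for step in path.split("/"):
        head = step.split("[", 1)[0]  # the part of the step before any predicate
        if step == "":
            pass  # empty piece produced by a leading '/'
        elif head == ".":
            step = "self::node()" + step[1:]
        elif head == "..":
            step = "parent::node()" + step[2:]
        elif "::" not in head and not head.startswith("@"):
            step = "child::" + step  # child is the default axis
        # '@' is short for 'attribute::' (as a step or inside a predicate)
        expanded.append(step.replace("@", "attribute::"))
    return "/".join(expanded)


for abbreviated in ["bib/book", 'book[@price="25"]', "book//author", ".//book", "../title"]:
    print(abbreviated, "->", expand(abbreviated))
```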

  17. Expressions • Function Calls • Node-sets • Booleans • Numbers • Strings

  18. Function Calls • A function call expression is evaluated by using the FunctionName to identify a function in the function library of the expression evaluation context. • Each argument is converted to the type the function requires: • to type string (as if calling the string function), • to type boolean (as if calling the boolean function), • to type number (as if calling the number function). • An argument that is not of type node-set cannot be converted to a node-set. Ex. The position() function returns the current node's position in the context node list as a number.
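
The argument conversions can be sketched in Python for the non-node-set types. The functions `xpath_boolean` and `xpath_number` below are illustrative stand-ins for the XPath 1.0 boolean() and number() functions, following the conversion rules of the XPath 1.0 Recommendation; Python `bool`, `float` and `str` values are assumed to stand in for XPath booleans, numbers and strings, a plain list is assumed to model a node-set for boolean(), and number() of a node-set is left out.

```python
import math
import re

# XPath 1.0 Number syntax with an optional leading minus sign.
NUMBER = re.compile(r"-?(\d+(\.\d*)?|\.\d+)")


def xpath_boolean(value):
    """Convert a value as the XPath boolean() function would."""
    if isinstance(value, bool):
        return value
    if isinstance(value, (int, float)):
        return not (value == 0 or math.isnan(value))  # zero and NaN are false
    if isinstance(value, (str, list)):
        return len(value) > 0  # non-empty string or non-empty node-set is true
    raise TypeError(f"unsupported type: {type(value).__name__}")


def xpath_number(value):
    """Convert a boolean, number or string as the XPath number() function would."""
    if isinstance(value, bool):
        return 1.0 if value else 0.0
    if isinstance(value, (int, float)):
        return float(value)
    if isinstance(value, str):
        text = value.strip(" \t\r\n")
        return float(text) if NUMBER.fullmatch(text) else math.nan
    raise TypeError(f"unsupported type: {type(value).__name__}")


print(xpath_boolean("false"))    # a non-empty string is true, whatever it says
print(xpath_number(" 25.00 "))
print(xpath_number("1e3"))       # exponent notation is not a valid XPath 1.0 number
```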

  19. Expressions • Function Calls • Node-sets • Booleans • Numbers • Strings

  20. Node-sets • A location path can be used as an expression. • The expression returns the set of nodes selected by the path.

  21. Expressions • Function Calls • Node-sets • Booleans • Numbers • Strings

  22. Booleans • A boolean can have only two values: true or false. • The following operators can be used in boolean expressions, or to combine two boolean expressions according to the usual rules of boolean logic: • or • and • =, != • <=, <, >=, > Ex. book='XML complete' or book='Principles of Database'

  23. Expressions • Function Calls • Node-sets • Booleans • Numbers • Strings

  24. Numbers • A number represents a floating-point number; there is no separate integer type in XPath. • The basic arithmetic operators are: +, -, *, div and mod. Ex. @id div 10
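
Since every XPath 1.0 number is a double-precision floating-point value, `div` and `mod` behave slightly differently from Python's `/` and `%`. The sketch below, with the hypothetical helpers `xpath_div` and `xpath_mod`, follows the XPath 1.0 rules: division by zero yields an infinity or NaN instead of raising an error, and `mod` returns the remainder of a truncating division, so the result takes the sign of the left operand.

```python
import math


def xpath_div(a: float, b: float) -> float:
    """Floating-point division in the IEEE 754 style that XPath 1.0 uses."""
    try:
        return a / b
    except ZeroDivisionError:
        if a == 0 or math.isnan(a):
            return math.nan
        # Non-zero divided by zero gives a signed infinity.
        return math.copysign(math.inf, a) * math.copysign(1.0, b)


def xpath_mod(a: float, b: float) -> float:
    """Remainder of a truncating division (sign follows the left operand)."""
    try:
        return math.fmod(a, b)
    except ValueError:
        return math.nan  # e.g. a mod 0


print(xpath_div(25.0, 10.0))   # like '@id div 10' with @id = 25
print(xpath_mod(-5.0, 2.0))    # Python's -5 % 2 would give 1 instead
```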

  25. Expressions • Function Calls • Node-sets • Booleans • Numbers • Strings

  26. Strings • A string consists of a sequence of zero or more characters. • String literals may be enclosed in either single or double quotes. • Comparison operators: =, !=

  27. Core Function Library • XPath defines a core set of functions for use in expressions. • All implementations of XPath must implement the core function library. • Four types of functions: • Node-set functions: operate on or return information about node-sets. • String functions: used for basic string operations. Ex. substring("12345", 0, 3) returns "12" • Boolean functions: all return true or false. • Number functions: used for basic number operations.
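
The substring example may look surprising, because XPath counts character positions from 1, not 0. The Python sketch below reproduces the rule from the XPath 1.0 Recommendation: a character at position p is kept when p is at least round(start) and less than round(start) + round(length). The names `xpath_round` and `xpath_substring` are illustrative, not a library API.

```python
import math


def xpath_round(x: float) -> float:
    """Round to the nearest integer, with halves rounded towards positive infinity."""
    if math.isnan(x) or math.isinf(x):
        return x
    return math.floor(x + 0.5)


def xpath_substring(text: str, start: float, length: float = None) -> str:
    """Sketch of the XPath 1.0 substring() function (positions start at 1)."""
    first = xpath_round(start)
    end = math.inf if length is None else first + xpath_round(length)
    # Comparisons with NaN are false, so NaN arguments give an empty string.
    return "".join(ch for pos, ch in enumerate(text, start=1) if first <= pos < end)


print(xpath_substring("12345", 0, 3))      # positions 0, 1, 2 requested; only 1 and 2 exist
print(xpath_substring("12345", 1.5, 2.6))
```

With start 0 and length 3, the requested positions are 0, 1 and 2; position 0 does not exist, which is why only "12" is returned.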

  28. XPath Utilities • Miscellaneous utilities related to XPath: http://www.xmlsoftware.com/xpath/ • XPath Visualiser: • A tool for evaluating an XPath expression and visually presenting the resulting node-set. • Lets you experiment with XPath to find the correct expression. • The display of the XML source document is similar to the default IE display, with the same syntax coloring and collapsible and expandable container nodes. • Makes learning XPath very straightforward.

  29. XPath Visualiser (screenshot; labelled areas: context node, XPath input, tree view of the XML document, XPath evaluation result, with the result highlighted)

  30. Conclusion • XPath is a complete pattern-matching language. • It provides a concise way of addressing parts of an XML document. • It is the basis for XSLT, XPointer and the work of the XML Query WG, and is supported by the W3C. • Using XPath mainly requires learning the abbreviated syntax of location path expressions and the functions of the core library.

  31. Reference • XML Path Language (XPath) V1.0: http://www.w3.org/TR/xpath • XML in a Nutshell: http://www.oreilly.com/catalog/xmlnut/chapter/ch09.html • Managing XML and Semistructured Data: http://www.cs.washington.edu/homes/suciu/COURSES/590DS/06xpath.htm • XPath utilities: http://www.xmlsoftware.com/xpath/
