
Introduction to XPath

Introduction to XPath. Bun Yue Professor, CS/CIS UHCL. Resources. XPath 1.0: http://www.w3.org/TR/xpath XPath 2.0: http://www.w3.org/TR/xpath20/ EditiX (free edition): http://free.editix.com/


Presentation Transcript


  1. Introduction to XPath Bun Yue Professor, CS/CIS UHCL

  2. Resources • XPath 1.0: http://www.w3.org/TR/xpath • XPath 2.0: http://www.w3.org/TR/xpath20/ • EditiX (free edition): http://free.editix.com/ • XPath 1.0 testbed by whitebeam: http://www.whitebeam.org/library/guide/TechNotes/xpathtestbed.rhtm

  3. Introduction to XPath 1.0 • XPath is used to address parts of an XML document. • XPath is a W3C recommendation. • Version 2.0 is largely backward compatible with 1.0. • XPath is used by XPointer, XSLT and XQuery. • XPath is designed to select existing nodes, not to create new ones. • It is designed to be embedded in a host language, such as XSLT or XQuery.

  4. Location Path • XPath uses path expressions, called location paths, to address parts of a document. • A location path is a sequence of location steps separated by '/'.

  5. Location Path • A location path can be absolute or relative. • An absolute location path starts with '/', which denotes the document root. • A relative location path does not start with '/'; it is evaluated relative to a context node.

  6. XPath 1.0 Results • The result of an XPath 1.0 expression is one of the following four types: • Number • String • Boolean • Node-set: a set of nodes • As a set, it contains no duplicate nodes. • Not the same as a document fragment. • Replaced by sequences in XPath 2.0.

  7. Example /stocks/stock matches all element nodes stock that are children of the root element stocks.

  8. EditiX • In EditiX, use "View > Windows > XPath View" to execute XPath expressions. • Either XPath 1.0 or 2.0 may be selected.

  9. Location Step • A location step is composed of three parts: • an axis (required, although the abbreviated syntax lets it be omitted, in which case it defaults to child): describes the direction of navigation. • a node test (required): specifies the node type or name, and • zero or more predicates (optional): specify additional conditions for inclusion.

  10. Example //stocks/child::stock[@symbol="IBM"]/lastprice Consider the location step: child::stock[@symbol="IBM"] • axis: child • node test: stock • predicate: [@symbol="IBM"]
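
The reading of this location step can be tried out in Python. This is a minimal sketch, assuming a made-up stocks document (the symbols and prices are hypothetical). Python's standard xml.etree.ElementTree implements only a subset of XPath 1.0: it has no explicit axis syntax such as child::, so the abbreviated form stock[@symbol='IBM'] stands in for child::stock[@symbol="IBM"], and the path is evaluated relative to the stocks root element rather than from the document root.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; only the element and attribute names come from the slide.
XML = """
<stocks>
  <stock symbol="IBM"><lastprice>120.50</lastprice></stock>
  <stock symbol="MSFT"><lastprice>30.10</lastprice></stock>
</stocks>
"""

stocks = ET.fromstring(XML)  # context node: the <stocks> root element

# Step 1: child axis (implicit), node test "stock", predicate [@symbol='IBM'].
# Step 2: child axis (implicit), node test "lastprice".
matches = stocks.findall("stock[@symbol='IBM']/lastprice")
prices = [node.text for node in matches]
print(prices)
```

Only the lastprice under the stock whose symbol attribute equals IBM passes the predicate; the MSFT stock is filtered out at the first step.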

  11. Axis • An axis is the first part of the location step and is followed by :: before the node test and predicates. • There are 13 axes in XPath 1.0. • The default axis is the child axis. • The symbol @ can be used for the attribute axis.

  12. Axes in XPath 1.0 • child: the children of the context node (not including attribute nodes). • descendant: the descendants of the context node. • parent: the parent of the context node, if there is one. • ancestor: the ancestors of the context node; this always includes the root node unless the context node is itself the root node. • following-sibling: all the following siblings of the context node. • preceding-sibling: all the preceding siblings of the context node.

  13. Axes in XPath 1.0 • following: all nodes in the same document as the context node that are after the context node in document order, excluding any descendants and excluding attribute nodes and namespace nodes. • preceding: all nodes in the same document as the context node that are before the context node in document order, excluding any ancestors and excluding attribute nodes and namespace nodes. • attribute: the attributes of the context node; the axis will be empty unless the context node is an element.

  14. Axes in XPath 1.0 • namespace: the namespace nodes of the context node; the axis will be empty unless the context node is an element. • self: just the context node itself. • descendant-or-self: the context node and the descendants of the context node. • ancestor-or-self: the context node and the ancestors of the context node; thus, the ancestor-or-self axis will always include the root node.
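
To make a few of the axes concrete, here is a rough Python model of them. It is a sketch under stated assumptions, not an XPath engine: the tiny document and all function names are hypothetical, and ElementTree represents only element nodes (there is no separate root node, and text, comment and attribute nodes are not modelled), so the ancestor list stops at the root element.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: <a> contains <b> (with children <c>, <d>) and <e>.
root = ET.fromstring("<a><b><c/><d/></b><e/></a>")

# ElementTree elements do not know their parent, so build a child -> parent map.
parent_of = {child: parent for parent in root.iter() for child in parent}

def ancestor(node):
    """ancestor axis: parent, grandparent, ... (nearest ancestor first)."""
    result = []
    while node in parent_of:
        node = parent_of[node]
        result.append(node)
    return result

def descendant(node):
    """descendant axis: all elements below node, in document order."""
    return list(node.iter())[1:]  # iter() yields node itself first

def following_sibling(node):
    """following-sibling axis: later children of the same parent."""
    if node not in parent_of:
        return []
    siblings = list(parent_of[node])
    return siblings[siblings.index(node) + 1:]

def preceding_sibling(node):
    """preceding-sibling axis: earlier children of the same parent."""
    if node not in parent_of:
        return []
    siblings = list(parent_of[node])
    return siblings[:siblings.index(node)]

c = root.find("b/c")
d = root.find("b/d")
print([n.tag for n in ancestor(c)])
print([n.tag for n in descendant(root)])
print([n.tag for n in following_sibling(c)])
print([n.tag for n in preceding_sibling(d)])
```

The "-or-self" variants would simply prepend the node itself to the corresponding list.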

  15. Shorthand • . is the shorthand for self::node() • .. is the shorthand for parent::node(). • // is the shorthand for /descendant-or-self::node()/
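
The three shorthands can be exercised with Python's standard xml.etree.ElementTree, which accepts ., .. and // in its XPath subset. This is a minimal sketch on a hypothetical document; note that ElementTree requires // to follow a context (hence .//) and cannot step above the element the search started from.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; one stock is nested one level deeper than the other.
XML = """
<stocks>
  <stock symbol="IBM"><lastprice>120.50</lastprice></stock>
  <fund><stock symbol="MSFT"/></fund>
</stocks>
"""

stocks = ET.fromstring(XML)

self_node = stocks.findall(".")             # .  selects the context node itself
all_stocks = stocks.findall(".//stock")     # // reaches stock elements at any depth
priced = stocks.findall(".//lastprice/..")  # .. steps up to the parent of each lastprice

print([s.get("symbol") for s in all_stocks])
print([s.get("symbol") for s in priced])
```

Only the IBM stock has a lastprice child, so it is the only parent reached through "..".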

  16. Node tests in XPath 1.0 • The second part of a location step. It is required. • There are three kinds of node tests: • Name test: the name of the node; * is a wildcard that matches any name. • Node type test: • node(): matches any node of any type; which nodes are actually selected depends on the axis (the child axis, for example, never contains attribute or namespace nodes). • text() • comment() • processing-instruction() • Processing instruction test with a target name: processing-instruction('pi-name')

  17. Predicate tests • Predicate tests are the last part of a location step. • They are enclosed in [] and are optional. • There may be more than one predicate test. • XPath built-in functions can be used to construct a predicate (boolean) expression as an added condition for inclusion. • Boolean operators: and, or.

  18. Example • //text() matches all text nodes. • //@p[.='1'] selects all attributes named p whose value is 1. • //person[first][last] selects all person elements that have both a first and a last child element.
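
These three expressions can be approximated with Python's standard xml.etree.ElementTree. This is a sketch on a hypothetical document, with two caveats that are limitations of ElementTree rather than of XPath: it cannot return attribute or text nodes, so .//*[@p='1'] returns the elements that carry the attribute instead of the attributes themselves, and itertext() stands in for //text().

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the id attributes exist only to label the results.
XML = """
<people>
  <person id="a"><first>Ann</first><last>Lee</last></person>
  <person id="b"><first>Bob</first></person>
  <person id="c" p="1"><last>Cruz</last></person>
</people>
"""

people = ET.fromstring(XML)

# //person[first][last]: both predicates must hold for a person to be kept.
both_names = [e.get("id") for e in people.findall(".//person[first][last]")]

# Closest ElementTree equivalent of //@p[.='1']: elements whose p attribute is '1'.
carries_p = [e.get("id") for e in people.findall(".//*[@p='1']")]

# Rough stand-in for //text(), skipping whitespace-only text between elements.
texts = [t for t in people.itertext() if t.strip()]

print(both_names, carries_p, texts)
```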

  19. XPath Functions • There are many XPath 1.0 functions for testing and other purposes. • Many of them are obvious. The non-obvious ones are explained below.

  20. XPath 1.0 Functions • boolean(arg): converts arg to the boolean data type. • false(): always returns false. • lang(arg): returns true iff the language of the context node, as specified by xml:lang attributes, is the same as or a sublanguage of the language specified by the argument string arg. • not(arg): negation of arg. • true(): always returns true. • count(arg): number of nodes in the node-set argument arg.

  21. XPath Functions • id(arg): selects elements by their unique ID, given in arg. • last(): returns the context size of the expression evaluation context. • local-name(arg): returns the local name of the first node in the node-set argument arg; returns the local name of the context node if arg is missing. • name() • namespace-uri() • position(): returns the proximity position (starting from one) of the context node with respect to the axis.
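
The one-based numbering behind position() and last() shows up in positional predicates. The sketch below uses Python's standard xml.etree.ElementTree on a hypothetical document; ElementTree does not implement position() as a function, but it accepts the abbreviated forms [1], [last()] and [last()-1], which in full XPath 1.0 mean [position()=1], [position()=last()] and [position()=last()-1].

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data with three stock children.
stocks = ET.fromstring(
    "<stocks>"
    "<stock symbol='IBM'/><stock symbol='MSFT'/><stock symbol='ORCL'/>"
    "</stocks>"
)

first = stocks.findall("stock[1]")                  # positions start at 1, not 0
last = stocks.findall("stock[last()]")              # last() is the context size (3 here)
second_to_last = stocks.findall("stock[last()-1]")

print([s.get("symbol") for s in first])
print([s.get("symbol") for s in last])
print([s.get("symbol") for s in second_to_last])
```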

  22. XPath 1.0 Functions • ceiling(arg): ceiling of the number argument arg. • floor(arg) • number(arg): converts arg to a number. • round(arg) • sum(arg): sum of the values of the nodes in the node-set argument arg. • concat(): string concatenation of the arguments. • contains(arg1, arg2): true iff arg1 contains arg2.

  23. XPath 1.0 Functions • normalize-space(arg): returns the string argument arg with leading and trailing white space stripped and each internal run of white space replaced by a single space. • starts-with(arg1, arg2): whether arg1 starts with arg2. • string(): converts to string. • string-length(arg): the number of characters in the string arg. • substring(arg1, arg2, arg3): returns the substring of arg1 starting at position arg2 (the first character is at position 1) with length arg3.

  24. XPath 1.0 Functions • substring-after(arg1, arg2): the substring of arg1 that follows the first occurrence of arg2. • substring-before(arg1, arg2): the substring of arg1 that precedes the first occurrence of arg2. • translate(arg1, arg2, arg3): returns arg1 with each character that occurs in arg2 replaced by the character at the corresponding position in arg3.
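
The less obvious string functions are easier to follow next to a plain-Python model of their behaviour. The functions below are hypothetical helpers written for illustration, not part of any XPath library; they assume integer arguments for substring (XPath itself rounds non-integer numbers) and follow the XPath 1.0 rules for one-based positions, first-occurrence matching, and deletion of characters that have no counterpart in translate's third argument.

```python
import re

XPATH_WS = " \t\r\n"  # the white space characters of XML: space, tab, CR, LF

def normalize_space(s):
    """Strip leading/trailing white space and collapse internal runs to one space."""
    return re.sub(r"[ \t\r\n]+", " ", s.strip(XPATH_WS))

def substring(s, start, length):
    """Characters at 1-based positions p with start <= p < start + length."""
    first = max(start, 1)
    end = max(start + length, first)
    return s[first - 1:end - 1]

def substring_before(s, sep):
    """Part of s before the first occurrence of sep, or '' if sep is absent."""
    i = s.find(sep)
    return s[:i] if i >= 0 else ""

def substring_after(s, sep):
    """Part of s after the first occurrence of sep, or '' if sep is absent."""
    i = s.find(sep)
    return s[i + len(sep):] if i >= 0 else ""

def translate(s, src, dst):
    """Map each character of src to the character at the same index in dst."""
    table = {}
    for i, ch in enumerate(src):
        if ord(ch) not in table:  # the first occurrence in src wins
            # Characters of src with no counterpart in dst are deleted.
            table[ord(ch)] = dst[i] if i < len(dst) else None
    return s.translate(table)

print(substring("12345", 2, 3))
print(substring_before("1999/04/01", "/"), substring_after("1999/04/01", "/"))
print(translate("--aaa--", "abc-", "ABC"))
print(normalize_space("  a \n b  "))
```

The sample calls mirror the examples given in the XPath 1.0 recommendation for substring, substring-before, substring-after and translate.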

  25. XPath 1.0 Classwork • To be handed in during class. • Use Familytree.xml

  26. XPath 2.0 • W3C related specifications: • XQuery 1.0 and XPath 2.0 Data Model • XQuery 1.0 and XPath 2.0 Functions and Operators • XQuery 1.0 and XPath 2.0 Formal Semantics • XML Path Language (XPath) 2.0 • XSL Transformations (XSLT) Version 2.0 • XSLT 2.0 and XQuery 1.0 Serialization • XQuery 1.0: An XML Query Language

  27. Major Changes in XPath 2.0 • Sequences to replace node-sets as the main data model. • XML Schema data types • Variable binding • A rich set of functions • Richer expressions • New comment styles • …

  28. Sequences and items • A sequence is an ordered, heterogeneous collection of items. • An item can be • A node • An atomic value

  29. Sequences Example: • (1, 5 to 8, "Bun Yue", 2.1) • (1+2, 5) • (1 to 50)[. mod 3 = 1] • /* | //person • (1, 2, (3, (4, 5))) is the same as (1, 2, 3, 4, 5)

  30. Sequences • Items within a sequence • Can be in any arbitrary order. • Can be heterogeneous. • Can be repeated. • Sequences cannot be nested; a sequence placed inside another is flattened. • XPath 2.0 results are sequences. An atomic value is considered to be a sequence with a single item.
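
The flattening and filtering behaviour of sequences can be imitated with Python lists. This is only an analogy, sketched with a hypothetical helper named seq; Python tuples and ranges play the role of nested sequence expressions and of the "to" range operator.

```python
def seq(*items):
    """Build a flat list, the way XPath 2.0 flattens nested sequences."""
    flat = []
    for item in items:
        if isinstance(item, (list, tuple, range)):
            flat.extend(seq(*item))  # nested sequences are merged, not kept
        else:
            flat.append(item)        # atomic values (strings included) stay whole
    return flat

# (1, 2, (3, (4, 5))) is the same as (1, 2, 3, 4, 5)
nested = seq(1, 2, (3, (4, 5)))

# (1, 5 to 8, "Bun Yue", 2.1): heterogeneous items; "5 to 8" expands to 5, 6, 7, 8
mixed = seq(1, range(5, 9), "Bun Yue", 2.1)

# (1 to 50)[. mod 3 = 1]: a predicate filters the items of a sequence
filtered = [n for n in seq(range(1, 51)) if n % 3 == 1]

print(nested)
print(mixed)
print(filtered[:4], len(filtered))
```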

  31. For expression & variable binding • for $varname in (expression) return (expression) Example: • for $person in //person return count($person/email) • for $person in //person return fn:count($person/email)
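
A for expression binds a variable to each item of a sequence in turn and collects the results, much like a Python list comprehension. The sketch below is an analogy using Python's standard xml.etree.ElementTree on a hypothetical document, since the standard library has no XPath 2.0 engine.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: one person with two email children, one with none.
XML = """
<people>
  <person><first>Ann</first><email>a@example.org</email><email>ann@example.org</email></person>
  <person><first>Boris</first></person>
</people>
"""

people = ET.fromstring(XML)

# for $person in //person return count($person/email)
email_counts = [len(person.findall("email")) for person in people.iter("person")]

print(email_counts)
```

As in XPath 2.0, the result is one value per person, in document order.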

  32. If expression Example: if (//person[first/text()='Boris']) then 'found Boris' else 'no Boris'

  33. XPath 2.0 Functions • Many new functions: http://www.w3schools.com/XPath/xpath_functions.asp • Some categories: • Sequences • Aggregate functions • Nodes • Numeric • String, with regular expressions

  34. Quantified Expressions • Applied to a sequence: • some • every • Format: • some $v in sequence satisfies condition • every $v in sequence satisfies condition

  35. Example if (every $person in //person satisfies $person/email) then "everyone has email address" else "oh oh"
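
The quantified expressions behave like Python's all() and any(), and the if expression like a conditional expression. The sketch below is an analogy on a hypothetical document using the standard xml.etree.ElementTree; like every in XPath 2.0, all() yields true for an empty sequence.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: Boris has no email child.
XML = """
<people>
  <person><first>Ann</first><email>a@example.org</email></person>
  <person><first>Boris</first></person>
</people>
"""

people = ET.fromstring(XML)
persons = people.findall(".//person")

# every $person in //person satisfies $person/email
# ("is not None" is used because a childless element is falsy in ElementTree)
every_has_email = all(p.find("email") is not None for p in persons)

# some $person in //person satisfies $person/email
some_has_email = any(p.find("email") is not None for p in persons)

# if (...) then "everyone has email address" else "oh oh"
message = "everyone has email address" if every_has_email else "oh oh"

print(every_has_email, some_has_email, message)
```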

  36. Classwork • To be handed in during class. • Use Familytree.xml

  37. Questions
