XPath Query Evaluation

XPath Query Evaluation: A Top-Down Approach
Mohammed Pithapurwala (mp66@cse.buffalo.edu)
Pejus Das (pejusdas@cse.buffalo.edu)

Introduction
• XPath query evaluation
• Uses:
  • Selecting nodes in an XML document
  • XSLT, XQuery
• Polynomial vs. exponential evaluation time
• Top-down algorithm

XPath
• What is XPath?
• child::section[position()<6] / descendant::cite / attribute::href
  • Selects all href attributes in cite elements in the first 5 sections of an article document
• Structure of an XPath expression
  • Axes
  • Node types
  • Node test
• Returns
  • Number, node set, string, boolean
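
The example query above can be tried with the XPath 1.0 engine that ships with the JDK (`javax.xml.xpath`). The following is only a sketch: the sample document, the class name `HrefQueryDemo`, and the method `hrefs` are assumptions made for illustration and do not come from the slides.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class HrefQueryDemo {
    // Hypothetical sample document: six sections, cite elements in sections 1, 2 and 6.
    static final String XML =
        "<article>"
      + "<section><cite href='a.html'/></section>"
      + "<section><p><cite href='b.html'/></p></section>"
      + "<section/><section/><section/>"
      + "<section><cite href='late.html'/></section>"
      + "</article>";

    static final String QUERY =
        "child::section[position()<6]/descendant::cite/attribute::href";

    /** Evaluates QUERY with the <article> element as the context node. */
    static List<String> hrefs() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(QUERY, doc.getDocumentElement(), XPathConstants.NODESET);
            List<String> out = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                out.add(nodes.item(i).getNodeValue()); // value of each href attribute
            }
            return out;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(hrefs());
    }
}
```

On this sample document the query selects the href values a.html and b.html; late.html is excluded because its section is the sixth one and fails `position()<6`.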

Implementation
• XPath axes
  • Child
  • Parent
  • Descendant
• Axis functions
  • firstchild
  • nextsibling
  • $\mathit{child} := \mathit{firstchild}.\mathit{nextsibling}^{*}$
  • $\mathit{parent} := (\mathit{nextsibling}^{-1})^{*}.\mathit{firstchild}^{-1}$
  • $\mathit{descendant} := \mathit{firstchild}.(\mathit{firstchild} \cup \mathit{nextsibling})^{*}$
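
The three axis definitions above can be read directly as pointer walks over a first-child/next-sibling tree. The sketch below assumes a hypothetical `Node` type that stores the two primitive relations and their inverses; it is not the `Element` class used in the deck's own code snippet.

```java
import java.util.ArrayList;
import java.util.List;

class AxisSketch {
    /** Hypothetical node storing firstchild, nextsibling and their inverses. */
    static final class Node {
        final String label;
        Node firstChild, nextSibling;   // primitive relations
        Node prevSibling, firstChildOf; // inverses: nextsibling^-1 and firstchild^-1
        Node(String label) { this.label = label; }

        /** Appends a child and maintains all four pointers. */
        Node add(Node child) {
            if (firstChild == null) {
                firstChild = child;
                child.firstChildOf = this;
            } else {
                Node last = firstChild;
                while (last.nextSibling != null) last = last.nextSibling;
                last.nextSibling = child;
                child.prevSibling = last;
            }
            return this;
        }
    }

    // child := firstchild.nextsibling*
    static List<Node> child(Node x) {
        List<Node> out = new ArrayList<>();
        for (Node y = x.firstChild; y != null; y = y.nextSibling) out.add(y);
        return out;
    }

    // parent := (nextsibling^-1)*.firstchild^-1
    static Node parent(Node x) {
        Node y = x;
        while (y.prevSibling != null) y = y.prevSibling; // walk back to the first sibling
        return y.firstChildOf;                           // null for the root
    }

    // descendant := firstchild.(firstchild union nextsibling)*
    static List<Node> descendant(Node x) {
        List<Node> out = new ArrayList<>();
        if (x.firstChild != null) collect(x.firstChild, out);
        return out;
    }

    // Visits y, then y's subtree, then y's following siblings (document order).
    private static void collect(Node y, List<Node> out) {
        out.add(y);
        if (y.firstChild != null) collect(y.firstChild, out);
        if (y.nextSibling != null) collect(y.nextSibling, out);
    }
}
```

Storing the inverse pointers is one possible design choice; a parent pointer per node would work equally well, but the version above mirrors the slide's definition of parent more literally.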

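Code Snippet
    // Needs java.util.List, java.util.Iterator, and an Element type whose
    // getChildren() returns a List (for example JDOM's Element; this is an
    // assumption, the slides do not name the library).
    public static Element firstChild(Element currNode) {
        Element fChild = null;
        List childNode = currNode.getChildren();
        Iterator iterator = childNode.iterator();
        // The first child, if any, is the first entry of the child list.
        if (iterator.hasNext()) {
            fChild = (Element) iterator.next();
        }
        return fChild;
    }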

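Node Test & Expressions
• Node test expression
  • $T(\mathrm{node}())$ = all nodes in the document
  • $T(\mathrm{attribute}(\mathit{href}))$ = all nodes labelled href
  • $\mathrm{attribute}(S) := \mathrm{child}(S) \cap T(\mathrm{attribute}())$
• Node numbering
  • $<_{doc,\chi}$: the node order relative to axis $\chi$ (document order for forward axes, reverse document order for reverse axes)
  • $idx_{\chi}(x, S)$: the index of node $x$ in $S$ with respect to $<_{doc,\chi}$
• Context
  • $c = \langle x, k, n \rangle$
  • $x$: context node
  • $k$: position of the node
  • $n$: context size
• XPath expressions are evaluated relative to a context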
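
As a small illustration of node numbering and contexts, the sketch below computes $idx_{\chi}(x, S)$ and the context triples $\langle x, k, n \rangle$ for a node set. It assumes that nodes are represented by their document-order numbers and that the axis is described only by a forward/reverse flag; the names `ContextSketch`, `idx`, and `contexts` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class ContextSketch {
    /**
     * idx_chi(x, S): 1-based position of x in S.
     * Assumption: nodes are their document-order numbers and sInDocOrder is
     * sorted ascending. Forward axes count from the front, reverse axes
     * (such as ancestor or preceding) count from the back.
     */
    static int idx(boolean forwardAxis, int x, List<Integer> sInDocOrder) {
        int i = sInDocOrder.indexOf(x);
        if (i < 0) throw new IllegalArgumentException("x is not in S");
        return forwardAxis ? i + 1 : sInDocOrder.size() - i;
    }

    /** Builds the context <x, k, n> for every node of S as an int[3]. */
    static List<int[]> contexts(boolean forwardAxis, List<Integer> sInDocOrder) {
        List<int[]> out = new ArrayList<>();
        for (int x : sInDocOrder) {
            out.add(new int[] { x, idx(forwardAxis, x, sInDocOrder), sInDocOrder.size() });
        }
        return out;
    }
}
```

For S = {3, 5, 8}, node 3 has position 1 under a forward axis and position 3 under a reverse axis; the context size is 3 in both cases.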

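XPath Evaluation
• $\chi{::}t[e]$
  • $\chi \in \{\mathrm{child}, \mathrm{parent}, \mathrm{descendant}, \ldots\}$
  • $t$: node test expression
  • $e$: expression
• Expressions
  • $e \in \{\text{node set}, \text{number}, \text{string}, \text{boolean}\}$
  • $ArithOp \in \{+, -, *, \mathrm{div}, \mathrm{mod}\}$
  • $EqOp \in \{=, \neq\}$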

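XPath Semantics
• $[\![\pi]\!](\langle x, k, n \rangle) := P[\![\pi]\!](x)$
• $[\![\mathrm{position}()]\!](\langle x, k, n \rangle) := k$
• $[\![\mathrm{last}()]\!](\langle x, k, n \rangle) := n$
• For all other kinds of expressions, $e = Op(e_1, \ldots, e_m)$:
  $[\![Op(e_1, \ldots, e_m)]\!](c) := F[\![Op]\!]([\![e_1]\!](c), \ldots, [\![e_m]\!](c))$
• $[\![e]\!]$ maps a context to a value of the expression's type.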

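Intuitive Algorithm
$P[\![\chi{::}t[e_1] \cdots [e_m]]\!](x)$ :=
begin
  $S := \{\, y \mid x \chi y,\ y \in T(t) \,\}$;
  for $1 \le i \le m$ (in ascending order) do
    $S := \{\, y \in S \mid [\![e_i]\!](\langle y, idx_{\chi}(y, S), |S| \rangle) = \mathrm{true} \,\}$;
  return $S$;
end;
$P[\![\pi_1 | \pi_2]\!](x) := P[\![\pi_1]\!](x) \cup P[\![\pi_2]\!](x)$
$P[\![/\pi]\!](x) := P[\![\pi]\!](\mathrm{root})$
$P[\![\pi_1/\pi_2]\!](x) := \bigcup_{y \in P[\![\pi_1]\!](x)} P[\![\pi_2]\!](y)$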
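
The predicate loop of the algorithm above has one subtle point: positions and the context size are recomputed against the shrunken set before each predicate is applied. A minimal sketch of just that loop follows. The interface `ContextPredicate` and the method `filter` are hypothetical names, and the candidate list is assumed to be already restricted to the axis and node test and sorted in axis order.

```java
import java.util.ArrayList;
import java.util.List;

class PredicateFilterSketch {
    /** A predicate e evaluated on a context <y, k, n> (hypothetical interface). */
    interface ContextPredicate<T> {
        boolean test(T y, int k, int n);
    }

    /**
     * Filters the candidate set S through e_1..e_m in ascending order.
     * Before each predicate, k is the 1-based position in the current S
     * and n is the current size of S.
     */
    static <T> List<T> filter(List<T> candidates, List<ContextPredicate<T>> predicates) {
        List<T> s = new ArrayList<>(candidates);
        for (ContextPredicate<T> e : predicates) {
            List<T> kept = new ArrayList<>();
            int n = s.size();
            for (int i = 0; i < n; i++) {
                if (e.test(s.get(i), i + 1, n)) kept.add(s.get(i));
            }
            s = kept;
        }
        return s;
    }
}
```

For example, with three children c1, c2, c3, the predicates of `child::*[position()>1][position()=1]` first keep c2 and c3 and then select c2, because c2 is at position 1 in the filtered set.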

Runtime
• Example:
  • Document: <a><b/><b/></a>
  • Query: //a/b/parent::a/b/parent::a/b
  • Construct longer queries by appending /parent::a/b
• Naive evaluation:
  procedure process-location-step(n0, Q)
  /* n0 is the context node; query Q is a list of location steps */
  begin
    node set S := apply Q.head to node n0;
    if (Q.tail is not empty) then
      for each node $n \in S$ do process-location-step(n, Q.tail);
  end
• Complexity: $Time(|Q|) = |D|^{|Q|}$, i.e. exponential in the size of the query
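
The blow-up can be observed by counting invocations of process-location-step on the example document. The sketch below is an illustration under several assumptions: the `Node` type and the "axis::label" step encoding are hypothetical, only the child, parent, and descendant axes are supported, and the leading `//a` is simplified to `descendant::a` evaluated from the document node.

```java
import java.util.ArrayList;
import java.util.List;

class NaiveStepDemo {
    /** Minimal tree node (hypothetical type for this sketch). */
    static final class Node {
        final String label;
        Node parent;
        final List<Node> children = new ArrayList<>();
        Node(String label) { this.label = label; }
        Node add(Node c) { c.parent = this; children.add(c); return this; }
    }

    static long calls; // number of process-location-step invocations

    /** Applies one step written as "axis::label" to a single context node. */
    static List<Node> apply(String step, Node n) {
        String[] p = step.split("::");
        List<Node> out = new ArrayList<>();
        switch (p[0]) {
            case "child":
                for (Node c : n.children) if (c.label.equals(p[1])) out.add(c);
                break;
            case "parent":
                if (n.parent != null && n.parent.label.equals(p[1])) out.add(n.parent);
                break;
            case "descendant":
                for (Node c : n.children) {
                    if (c.label.equals(p[1])) out.add(c);
                    out.addAll(apply(step, c));
                }
                break;
            default:
                throw new IllegalArgumentException("unsupported axis: " + p[0]);
        }
        return out;
    }

    /** Direct transcription of the slide's procedure, plus a call counter. */
    static void processLocationStep(Node n0, List<String> q) {
        calls++;
        List<Node> s = apply(q.get(0), n0);
        List<String> tail = q.subList(1, q.size());
        if (!tail.isEmpty()) {
            for (Node n : s) processLocationStep(n, tail);
        }
    }

    /** Counts calls for //a/b followed by `repeats` copies of /parent::a/b. */
    static long countCalls(int repeats) {
        Node root = new Node("#document")
            .add(new Node("a").add(new Node("b")).add(new Node("b")));
        List<String> q = new ArrayList<>();
        q.add("descendant::a");
        q.add("child::b");
        for (int i = 0; i < repeats; i++) {
            q.add("parent::a");
            q.add("child::b");
        }
        calls = 0;
        processLocationStep(root, q);
        return calls;
    }

    public static void main(String[] args) {
        for (int r = 0; r <= 4; r++) {
            System.out.println(r + " repeats: " + countCalls(r) + " calls");
        }
    }
}
```

In this model the slide's query (two repeats) takes 14 calls and a third repeat takes 30: every appended `/parent::a/b` roughly doubles the work, because both b nodes lead back to the same a node and the remaining query is re-evaluated for each of them.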

Algorithm • S::t[e1]…[em](X1, … ,Xk) := • begin • S := {x,y| x Xi , x  y, and y T(t)}; • for each 1≤ i ≤ m (in ascending order) do • begin • Fix some order S = x1,y1 , …, xl,yl for S; • r1,…rl := ei(t1,…,tl) where tj =  yj , idx (yj,, Sj ), |Sj|  and Sj := {z |  xj, z  S}; • S := {xi,yi |ri is true}; • end; • for each 1 ≤ i ≤ k do • Ri := {y |  x, y  S, x  Xi}; • return R1, … ,Rk ; • end;

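Algorithm (contd.)
$S_{\downarrow}[\![/\pi]\!](X_1, \ldots, X_k) := S_{\downarrow}[\![\pi]\!](\{\mathrm{root}\}, \ldots, \{\mathrm{root}\})$ ($k$ times)
$S_{\downarrow}[\![\pi_1/\pi_2]\!](X_1, \ldots, X_k) := S_{\downarrow}[\![\pi_2]\!](S_{\downarrow}[\![\pi_1]\!](X_1, \ldots, X_k))$
$S_{\downarrow}[\![\pi_1 | \pi_2]\!](X_1, \ldots, X_k) := S_{\downarrow}[\![\pi_1]\!](X_1, \ldots, X_k) \cup S_{\downarrow}[\![\pi_2]\!](X_1, \ldots, X_k)$ (component-wise union)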
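
The key effect of evaluating $\pi_1/\pi_2$ as a function on whole node sets is that duplicate context nodes collapse before the next step is applied. The sketch below shows only that effect, under strong simplifications: a single context set ($k = 1$), no predicates, and the same hypothetical `Node` type and "axis::label" step encoding as assumed elsewhere for illustration. It is not the full algorithm from the slides.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class TopDownSetDemo {
    /** Minimal tree node (hypothetical type for this sketch). */
    static final class Node {
        final String label;
        Node parent;
        final List<Node> children = new ArrayList<>();
        Node(String label) { this.label = label; }
        Node add(Node c) { c.parent = this; children.add(c); return this; }
    }

    static long stepApplications; // how often a step is applied to a single node

    /** Applies one step written as "axis::label" to a single context node. */
    static List<Node> apply(String step, Node n) {
        String[] p = step.split("::");
        List<Node> out = new ArrayList<>();
        switch (p[0]) {
            case "child":
                for (Node c : n.children) if (c.label.equals(p[1])) out.add(c);
                break;
            case "parent":
                if (n.parent != null && n.parent.label.equals(p[1])) out.add(n.parent);
                break;
            case "descendant":
                for (Node c : n.children) {
                    if (c.label.equals(p[1])) out.add(c);
                    out.addAll(apply(step, c));
                }
                break;
            default:
                throw new IllegalArgumentException("unsupported axis: " + p[0]);
        }
        return out;
    }

    /** One location step applied to a whole node set; duplicates collapse in the result. */
    static Set<Node> evalStep(String step, Set<Node> x) {
        Set<Node> out = new LinkedHashSet<>();
        for (Node n : x) {
            stepApplications++;
            out.addAll(apply(step, n));
        }
        return out;
    }

    /** pi1/pi2 evaluated as S[pi2](S[pi1](X)), one step after the other. */
    static Set<Node> evalPath(List<String> q, Set<Node> x) {
        Set<Node> current = x;
        for (String step : q) current = evalStep(step, current);
        return current;
    }

    /** Evaluates //a/b followed by `repeats` copies of /parent::a/b on <a><b/><b/></a>. */
    static Set<Node> run(int repeats) {
        Node root = new Node("#document")
            .add(new Node("a").add(new Node("b")).add(new Node("b")));
        List<String> q = new ArrayList<>();
        q.add("descendant::a");
        q.add("child::b");
        for (int i = 0; i < repeats; i++) {
            q.add("parent::a");
            q.add("child::b");
        }
        stepApplications = 0;
        Set<Node> start = new LinkedHashSet<>();
        start.add(root);
        return evalPath(q, start);
    }
}
```

With ten appended `/parent::a/b` steps this version performs 32 single-node step applications and still returns the two b nodes, because the intermediate set never grows beyond two nodes. The count grows linearly with the number of repeats (2 + 3 per repeat in this model).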

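Semantics Function
$E_{\downarrow}[\![\pi]\!](\langle x_1, k_1, n_1 \rangle, \ldots, \langle x_l, k_l, n_l \rangle) := S_{\downarrow}[\![\pi]\!](\{x_1\}, \ldots, \{x_l\})$
$E_{\downarrow}[\![\mathrm{position}()]\!](\langle x_1, k_1, n_1 \rangle, \ldots, \langle x_l, k_l, n_l \rangle) := \langle k_1, \ldots, k_l \rangle$
$E_{\downarrow}[\![\mathrm{last}()]\!](\langle x_1, k_1, n_1 \rangle, \ldots, \langle x_l, k_l, n_l \rangle) := \langle n_1, \ldots, n_l \rangle$
and, for the remaining kinds of expressions,
$E_{\downarrow}[\![Op(e_1, \ldots, e_m)]\!](c_1, \ldots, c_l) := F[\![Op]\!]^{\langle\rangle}(E_{\downarrow}[\![e_1]\!](c_1, \ldots, c_l), \ldots, E_{\downarrow}[\![e_m]\!](c_1, \ldots, c_l))$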

Benchmark
• Results in seconds for IE6 vs. the implementation

References
• G. Gottlob, Ch. Koch, R. Pichler: XPath Processing in a Nutshell. SIGMOD Record, March 2003.
• G. Gottlob, Ch. Koch, R. Pichler: Efficient Algorithms for Processing XPath Queries. ACM TODS, to appear.

Thank You!!
