1 / 37

Inbal Yahav

A Framework for Using Materialized XPath Views in XML Query Processing VLDB ‘04. By: Andrey Balmin Fatma Ozcan Kevin S. Beyer Roberta J. Cochrane Hamid Pirahesh. DB Seminar, Spring 2005. Inbal Yahav. A Framework for Using Materialized XPath Views in XML Query Processing.

Albert_Lan
Download Presentation

Inbal Yahav

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. A Framework for Using Materialized XPath Views in XML Query Processing VLDB ‘04 By: Andrey Balmin Fatma Ozcan Kevin S. Beyer Roberta J. Cochrane Hamid Pirahesh DB Seminar, Spring 2005 Inbal Yahav

  2. A Framework for Using Materialized XPath Views in XML Query Processing ‘Prolog’ • Materialized Views Vs. Views • Materialized XPath views Vs. materializes views (relational databases) ? DB Seminar, Spring 2005 Inbal Yahav

  3. A Framework for Using Materialized XPath Views in XML Query Processing Agenda • Introduction • Materialized XPath Views • XPath Matching Algorithm • Definitions • Description • Examples • Complexity • Compensation Expression • Additional References DB Seminar, Spring 2005 Inbal Yahav

  4. A Framework for Using Materialized XPath Views in XML Query Processing Introduction With increase amounts of data, presented and exchanged as XML document, there is a need to efficiently query those documents. To address this request: • W3C has proposed an XML query language – XQuery (will be discussed later) • ANSI and ISO has defined SQL / XML (Extends relational databases to handle XML) Both uses Xpath to navigate through the XML document. DB Seminar, Spring 2005 Inbal Yahav

  5. A Framework for Using Materialized XPath Views in XML Query Processing XQuery • XML query language • Input: XML document (tree) • Output: XML sub tree • Syntax: FLWOR (For-[Let]-Where-[Order]-Return) • Example: XML Document XQuery – Q1 Result DB Seminar, Spring 2005 Inbal Yahav

  6. A Framework for Using Materialized XPath Views in XML Query Processing Introduction XPath Complexity The containment problem, as discussed in “Containment and Equivalence for an XPath Fragment” by G. Miklau and D. Suciu (will by reviewed at the 8’th lecture), is shown to be CO-NPcomplete Meaning: The problem T’  T is NP complete, and therefore cannot be solved in a polynomial time. Just for intuition.. : Consider the query: //A//b And the following XML document: DB Seminar, Spring 2005 Inbal Yahav

  7. A Framework for Using Materialized XPath Views in XML Query Processing Materialized XPath Views • The Goal • In relational databases indexes and materialized views are two well-known techniques to accelerate processing of expensive SQL queries. • We would like to expand the materialized views idea to speed up processing of XQuery or SQL / XML queries. • The System • Suggest a materialized XPath view structure. • Define an algorithm that checks whether a given view can be used to answer a given query. (Match algorithm) • Define a compensation expression that computes the query result using the information available from the view. DB Seminar, Spring 2005 Inbal Yahav

  8. A Framework for Using Materialized XPath Views in XML Query Processing Materialized XPath Views • Storing XPath Views • To be able to answer all kinds of queries, the view structure should contain: • Node path (a list of ancestors) • Typed values • Reference to the node • Note that storing only typed values and node path (without references to the document nodes), will not allow us execute queries whose result is a node collection. For example: • Sometimes it also beneficial to store actual copies of XML fragments in an XPath view. DB Seminar, Spring 2005 Inbal Yahav

  9. A Framework for Using Materialized XPath Views in XML Query Processing Materialized XPath Views Example: V = //@* DB Seminar, Spring 2005 Inbal Yahav

  10. A Framework for Using Materialized XPath Views in XML Query Processing XPath Matching Algorithm • After defining the materialized XPath view structure, we have to make sure that a given view can be utilized in a user query. • Definitions • We represent XPath expressions as labeled binary trees, called XPS trees. (XPath Step) • An XPS node, represents a step in the XPath statement, is labeled with: • Axis  {“root”, “child”, “descendant”, “self”, “attribute”, “descendant-or-self”, “parent”} • Test  {name test, wildcard test (*), kind test (node(), text())} Node 1 2 3 4 //employees /* /employee [@salary] Axis: dos Test: employees Axis: descendent Test: * DB Seminar, Spring 2005 Inbal Yahav

  11. A Framework for Using Materialized XPath Views in XML Query Processing XPath Matching Algorithm • Definitions – cont’ • Each XPS node has two children: • The first child is called predicate (and / or, a comparison operator, a constant, XPS node) • The second child, called next, points to the next step (XPS node) • If one of the children does not exist, we represent it with null For example: DB Seminar, Spring 2005 Inbal Yahav

  12. A Framework for Using Materialized XPath Views in XML Query Processing XPath Matching Algorithm Transition Rules DB Seminar, Spring 2005 Inbal Yahav

  13. A Framework for Using Materialized XPath Views in XML Query Processing XPath Matching Algorithm • General • The algorithm computes all possible mappingsfrom XPS nodes of the view to XPS nodes of the query expression, in a single top-down pass of the view expression. • The basic algorithm deals only with and / or predicates, and child, attribute or descendant axis. • Each function of the table evaluates to Boolean. • Important! The view expression can not be more restrictive than the query. DB Seminar, Spring 2005 Inbal Yahav

  14. A Framework for Using Materialized XPath Views in XML Query Processing XPath Matching Algorithm matchStep(v,q) Rule 1.1: It is sufficient for one of the conjunction to be mapped by a node of v. (For the other conjunction we use the reference) Rule 1.2: If the view is more restricted than the query, it cannot be used. (For example: Q2) DB Seminar, Spring 2005 Inbal Yahav

  15. A Framework for Using Materialized XPath Views in XML Query Processing XPath Matching Algorithm matchStep(v,q) – cont’ Rule 1.3: When the view node contains a “descendant” axis, we keep looking for matches down the tree. Rule 1.4: If the axis matches, we try to match the predicate and the next children of the view node. (axis  {predicate, child, attribute}) Rule 1.5: If none matches, the algorithm returns false. DB Seminar, Spring 2005 Inbal Yahav

  16. A Framework for Using Materialized XPath Views in XML Query Processing XPath Matching Algorithm matchChildren(v,q) Rule 2.1: If the tests matches, we try to match the predicate and the next step of v. matchPred(vpred,q) Rule 3.1: If v does not have a predicate, then the step trivially matches. DB Seminar, Spring 2005 Inbal Yahav

  17. A Framework for Using Materialized XPath Views in XML Query Processing XPath Matching Algorithm matchPred(vpred,q) – cont’ Rule 3.2: If v has a predicate and q does not (meaning: v is more restricted than q), then the match fails. Rule 3.3 and 3.4: Match both conjuncts is case of conjunction, and one disjunct in case of disjunction. (For example if v contains all the orders with both price and amount attributes, it cannot be used for a query that requires either price or amount). Rule 3.5: Match both predicates, and the view’s predicate with query’s next child, in case of nested XPath expressions. DB Seminar, Spring 2005 Inbal Yahav

  18. A Framework for Using Materialized XPath Views in XML Query Processing XPath Matching Algorithm matchNext(vnext,q) Rule 4.1 and 4.2: Same as 3.1 and 3.2. Rule 4.3: Match next children, and the view’s next child with the query’s predicate, in case of nested XPath expressions. (For example: v = //a[b/c] q = //a/b[c] ) DB Seminar, Spring 2005 Inbal Yahav

  19. A Framework for Using Materialized XPath Views in XML Query Processing XPath Matching Algorithm Example Consider the following hierarchy of employees: • Each employee has: • Name, Salary and Bonus attributes • Zero or more sub-employees elements DB Seminar, Spring 2005 Inbal Yahav

  20. A Framework for Using Materialized XPath Views in XML Query Processing XPath Matching Algorithm Example – cont’ Consider the following view: //Employee//@*, witch contains all attributes in a sub-tree of any employee, And the query: //Employee[@Bonus]/Employee[@bonus]/@Salary, that asks for the salary of employees who, together with their direct managers, have bonuses. DB Seminar, Spring 2005 Inbal Yahav

  21. A Framework for Using Materialized XPath Views in XML Query Processing XPath Matching Algorithm matchStep(1,11) matchChildren(1,11) matchPred(null,11) ^matchNext(2,11) matchNext(2,11) T DB Seminar, Spring 2005 Inbal Yahav

  22. A Framework for Using Materialized XPath Views in XML Query Processing XPath Matching Algorithm matchStep(2, null)  matchStep(2,12) matchStep(2,12) matchChildren(2,12)  matchChildren(2,14) matchChildren(2,12) F Later.. DB Seminar, Spring 2005 Inbal Yahav

  23. A Framework for Using Materialized XPath Views in XML Query Processing XPath Matching Algorithm matchNext(3,12) matchPred(null,12) ^matchNext(3,12) matchStep(3,13)  matchStep(3,14) matchStep(3,14) T Later.. DB Seminar, Spring 2005 Inbal Yahav

  24. A Framework for Using Materialized XPath Views in XML Query Processing XPath Matching Algorithm matchNext(4,14) matchStep(4,15)  matchStep(4,16) matchChildren(4,15)  matchChildren(4,16) matchChildren(3,14) matchPred(null,14)^matchNext(4,14) T T T T T DB Seminar, Spring 2005 Inbal Yahav

  25. A Framework for Using Materialized XPath Views in XML Query Processing XPath Matching Algorithm T matchStep(XPS1, XPS11) matchChildren(XPS1, XPS11) T  matchPred(null, XPS11) matchNext(XPS2, XPS11) T T  matchStep(XPS2, null) matchStep(XPS2, XPS12) F T  matchChildren(XPS2, XPS12) matchChildren(XPS2, XPS14) T T  matchPred(null, XPS12) matchNext(XPS3, XPS12) T  matchStep(XPS3, XPS13) matchStep(XPS3, XPS14) F T Same way… Inbal Yahav

  26. A Framework for Using Materialized XPath Views in XML Query Processing XPath Matching Algorithm • Recording the Match • Consider the following example: • View v consists n nodes: //a//a…//a • Query q consists of m nodes: /a/a…/a • Where m > n. • Any view node nv can map to any query expression node nq such that nv’s parent maps to some ancestor of nq. • Hence, there are n out of m distinct tree mapping of the view to the query expression. And so on… DB Seminar, Spring 2005 Inbal Yahav

  27. A Framework for Using Materialized XPath Views in XML Query Processing XPath Matching Algorithm • Solution: • Keeping track of all mapping in a match matrix structure (matches between v and q). • Match matrix allows us to encode an exponential number of tree mappings in a polynomial size structure, by recording all possible contexts for each node mapping. • It also reduces running time of the algorithm to polynomial. DB Seminar, Spring 2005 Inbal Yahav

  28. A Framework for Using Materialized XPath Views in XML Query Processing XPath Matching Algorithm Match matrix structure: • Each row corresponds to an XPS node of the view tree. • Each column corresponds to an XPS node of the query tree. • Each cell may contain one of three possible values: Empty, True or False. • Edges represent the context in which the mapping was detected. DB Seminar, Spring 2005 Inbal Yahav

  29. A Framework for Using Materialized XPath Views in XML Query Processing XPath Matching Algorithm • For example, in the previous XPS trees: matchStep(XPSv2, XPSq2) will be called twice, but will be calculated only once. DB Seminar, Spring 2005 Inbal Yahav

  30. A Framework for Using Materialized XPath Views in XML Query Processing Extensions Handling Comparison Predicates • The algorithm as shown so far, lacks rules to match comparison operations. • To solve this problem, the XPS trees are preprocessed in a way that comparison operations become transformed to filters. • After a successful matching, for each filter in the view query tree, we have to check if the filter of a matched node in the query tree is at least as specific. DB Seminar, Spring 2005 Inbal Yahav

  31. A Framework for Using Materialized XPath Views in XML Query Processing XPath Matching Algorithm • Complexity • Space complexity: • Match matrix size: O(|V| * |Q|) Number of XPS nodes in the view Number of XPS nodes in the query expression • Each matrix cell can have at most |Q| incoming edges  The number of edges in the matrix: O(|V|*|Q|2) DB Seminar, Spring 2005 Inbal Yahav

  32. A Framework for Using Materialized XPath Views in XML Query Processing XPath Matching Algorithm • Complexity – cont’ • Constructing the matrix: • matchStep(v,q) function has only |V|*|Q| distinct sets of parameters • Each match can be calculated at most once • In the worst case a function call may expand into |Q| function calls: • Total running time: O(|V|*|Q|2) • Polynomial! DB Seminar, Spring 2005 Inbal Yahav

  33. A Framework for Using Materialized XPath Views in XML Query Processing Compensation Expression • After having a successful match between the query and the view, we will have to extract the query by using the materialized view. • For simplicity we assume that there is only one possible mapping between the view and the query. • This extraction is called compensation, and is achieved in two steps: • Step 1: Eliminate unnecessary conditions. For example: V = //a[@b] Q = //a[@b^@c]  The query expression does not have to include the [@b] condition, since it implies by the view. DB Seminar, Spring 2005 Inbal Yahav

  34. A Framework for Using Materialized XPath Views in XML Query Processing Compensation Expression • Step 2: The query statement is transformed, or relaxed, so that data can be extracted easily from the view table: • We define the last matched node of the view as extraction point, and the last matched node of the query expression as compensation root node. • Than, we reconstruct Q to an equivalent expression that starts at the compensation root node. • Now, as in this example, item elements can be directly extracted from the view table. DB Seminar, Spring 2005 Inbal Yahav

  35. A Framework for Using Materialized XPath Views in XML Query Processing Compensation Expression Extract from the table Use reference DB Seminar, Spring 2005 Inbal Yahav

  36. A Framework for Using Materialized XPath Views in XML Query Processing Additional References • XQuery and SQL / XML: http://www.datadirect.com • XML Seminar: http://www.jgreen.de • General: http://msdn.microsoft.com DB Seminar, Spring 2005 Inbal Yahav

  37. The End DB Seminar, Spring 2005 Inbal Yahav

More Related