VAMANA
This presentation is the property of its rightful owner.
Sponsored Links
1 / 61

Master Thesis Presentation 22 nd April 2004 Venkatesh Raghavan PowerPoint PPT Presentation


  • 52 Views
  • Uploaded on
  • Presentation posted in: General

VAMANA A High Performance, Scalable and Cost Driven XPath Engine. Master Thesis Presentation 22 nd April 2004 Venkatesh Raghavan Advisor: Prof. Elke Rundensteiner Reader : Prof. Micha Hofri. Outline. Motivation Related Work Background for VAMANA Approach Our Physical Algebra

Download Presentation

Master Thesis Presentation 22 nd April 2004 Venkatesh Raghavan

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


Master thesis presentation 22 nd april 2004 venkatesh raghavan

VAMANA

A High Performance, Scalable and Cost Driven XPath Engine

Master Thesis Presentation

22nd April 2004

Venkatesh Raghavan

Advisor: Prof. Elke Rundensteiner

Reader : Prof. Micha Hofri


Outline

Outline

  • Motivation

  • Related Work

  • Background for VAMANA Approach

  • Our Physical Algebra

  • Query Execution Engine

  • Query Optimization

  • Experimental Evaluation

  • Conclusions


Motivation

Motivation

Many applications are migrating to native XML database.

  • Need for an XML query engine

    • High Performance

      • Support queries to that emphasize the structural semantics of XML query languages.

      • Efficient querying engine and database management system tailored for XML data.

    • Scalable

      • To support large XML document.s

      • Support all 13 XPath axes.

    • Cost Based

      • Schema independent cost model provides dynamically

        calculated heuristics.

      • Intelligent cost-based transformations, further improving

        performance.


Outline1

Outline

  • Motivation

  • Related Work

    • Relational Solutions

    • DOM Solutions

    • Current Index Based Solutions

  • Background for VAMANA Approach

  • Our Physical Algebra

  • Query Execution Engine

  • Query Optimization

  • Experimental Evaluation

  • Conclusions


Relational solution

Relational Solution

  • Mature data management tools

    • Query processing

    • Crash recovery

    • Concurrency control

  • Shredding XML documents

  • XPeranto [6] and Rainbow [7]

  • Many mapping algorithms

    • From XML Schema to Relations: A Cost-based Approach

      to XML Storage [24]

    • Workload based query mapping algorithm

.xml


Flip side

Flip-side

  • Data Model mismatch

    • Tables Vs XML semi-structured data model

  • Semantic mismatch

    • SQL Vs Xquery

  • Data Fragmentation

    • Increase query execution cost (More Joins)

    • High update cost

  • Overhead

    • Relational Mapping adds overhead

      • Managing data

      • Handling order.


Dom solution

DOM Solution

  • W3C - Document Object Model is language independent API for accessing various parts of XML document.

  • Traditional top-down tree traversal.

  • Disadvantages

    • Very main memory intensive.

    • On an average 4-5 times the file size [11] .

    • Most of the DOM based engines do not support all XPath axes.

    • Even if they do

      Imagine Pixar rendering “Finding Nemo” in Windows machine.

    • Requires complex recursive traversal for even for a few XPath axes.


Master thesis presentation 22 nd april 2004 venkatesh raghavan

  • Galax[9](developed by Bell and AT\&T labs)

    • They do not support all XPath axes.

    • Performs very poorly against large XML documents

    • Non-cost driven logical level optimization

  • Jaxen [22]

    • Java API to support different XML API (JDom, DOM, ElectricXML,dom4j

    • Can handle document having file sizes  10Mb

      • Intel Celeron PC with 512MB of RAM

  • IPSI, Pathan, etc.


Current index solutions

Current Index Solutions

  • Apache Xindice[13]

    • User-defined pattern indexes

    • Capable to index small to medium size documents < 5Mb

  • Natix[23]

    • The XML data tree is partitioned into small sub-trees and

      each sub-tree is stored into a data page.

  • TOX [25] (University of Toronto)

    • ToX storage engine stores the XML documents in either a relational database or an object oriented database.


Contd

Contd.

  • TIMBER[14](University of Michigan, University of British Columbia and AT&T labs)

    • TAX Algebra

      • Pattern trees

    • Query execution

      • Structural joins

    • Query Optimization

      • Estimating costs of all promising sets of evaluation plans.

      • Problem is the exponential increase in possibilities for complex query.

      • They claim to only select from an elite set of possible evaluation plans.

    • Cost Estimation

      • Primary Histograms

        • Expensive to maintain for frequent updates

      • Counting Twigs – Frequently occuring co-related

        sub-path queries.


Outline2

Outline

  • Motivation

  • Related Work

  • Background for VAMANA Approach

    • Multi-Axis Storage Structure

    • Running Example

  • Our Physical Algebra

  • Query Execution Engine

  • Query Optimization

  • Experimental Evaluation

  • Conclusions


Master thesis presentation 22 nd april 2004 venkatesh raghavan

Resultant Tuples

XPath Expression

XPath Compiler

Default Query Plan

Optimized Query Plan

---

---

Optimizer

Query Execution

Engine

---

---

---

---

Transformation Library

Default Query Plan

Cost Estimator

Axis or Value Based Queries

MASS Storage Structure

Loader

XML Documents


Multi axis storage structure

Clustering

Axes

self, child, following-sibling, preceding-sibling, attribute, namespace

CL1 and CL3

self, parent, ancestor, ancestor-or-self, descendent, descendent-or-self, preceding, following

CL2 and CL4

Multi-Axis Storage Structure

  • Efficient storage and access structure for XML document.

    • XPath axes, nodetests, range position predicates.

  • Provides statistics for costing.

    • Number of tuples per page.

    • Count.

  • Fast Lexicographical keys

  • Four Clusters + Value-based index


  • Running examples

    Running Examples

    E.g. 1: descendant::name/parent::*/self::person/address

    <person id="person0">

    <name>Krishna Merle</name>

    <emailaddress>mailto:[email protected]</emailaddress>

    E.g. 2: //province[text() = “Vermont” ]/ancestor::person

    • <person id="person41">

    • <name>Muneo Yemenis</name>

    • ...

    • <phone>+0 (807) 6372999</phone>

    • <province>Vermont</province>


    Outline3

    Outline

    • Motivation

    • Related Work

    • Background for VAMANA Approach

    • Our Physical Algebra

      • VAMANA Approach

      • Operators

      • Context Node

    • Query Execution Engine

    • Query Optimization

    • Experimental Evaluation

    • Conclusions


    Vamana approach

    VAMANA Approach

    • Data-Flow style of querying.

      • Data flows

      • Control flows

    • Pipelined-iterative fashion

      • Avoid temporary copies of intermediate results whenever possible.

      • Facilitates the reduction of I/O operations

        • All tuples for a particular context node are clustered together.

        • Sequential traversal over node sets .

        • Hence minimal I/O and key comparisons.

      • Used in most of commercial relational database system.

    • Structural joins

      Best Case : When maximum number of tuples in the join - pairs

      Worst Case : Few of joins - pairs


    Operators

    Operators

    where, opsymbol of the operator type.

    cond represents a set of conditions applied by the operator

    id is an identifier that uniquely identifies in given plan P

    Root Operator

    R

    Step Operator

    Φ

    Literal Operator

    L

    Node Base

    Exist Operator

    ξ

    Binary Operator

    β

    Join Operator

    J


    Context node

    Context Node

    We extend the idea..

    • The context node of any given VAMANA operator opidcond defines uniquely the position of an XML node in the index structure. The position is obtained by the structural path information encoded in the context node.

    “..context node is defined as the current node being processed…”

    – XPath (1.0 & 2.0)


    Concepts

    Concepts

    //a/b[/c]

    Context Side

    /b

    Predicate Child

    Predicate Side

    Context Child

    EXIST

    //a

    /c


    Dynamic context set

    Dynamic Context Set

    /b

    root

    //a

    MASS index

    a,a,a,a,a,b


    Xpath compiler

    Φ2child::phone

    Φ2ancestor::person

    a. Q1

    b. Q2

    R1

    R1

    Φ5 child::text

    L4 ‘Vermont’

    Φ3 self::person

    Φ6 //::province

    β3EQ

    Φ5 descendant::name

    Φ4 parent::*

    XPath Compiler

    Q1: descendant::name/self::*/parent::person/address

    Q2: //province[text() = “Vermont” ]/ancestor::person


    Outline4

    Outline

    • Motivation

    • Related Work

    • Background for VAMANA Approach

    • Our Physical Algebra

    • Query Execution Engine

    • Query Optimization

    • Experimental Evaluation

    • Conclusions


    Execution states

    Execution States

    • An operator can be in

      • INITIAL

        • Has not yet started fetching tuples.

      • FETCHING

        • When the operator has not yet exhausted all nodes from MASS that meets its condition(s).

        • When the operator is waiting for its context-child to return tuples.

        • When the operator is waiting for the predicate condition to process the nodes.

      • OUT_OF_NODE

        • When the operator has exhausted all the nodes from MASS that satisfy the condition(s) specified by the node.

        • When the context-child has no further tuples to return operator.


    Step1 setting context for the leaf operators on context path

    INTIAL

    FETCHING

    OUT_OF_NODE

    Step1:Setting Context for the “leaf operators on context-path”

    R1

    //province[text()=“Vermont”]/ancestor::person

    Φ2 ancestor::person

    Φ6 //::province

    β3EQ

    L4 ‘Vermont’

    Φ5 child::text


    Step2 ask the root node for tuples

    Step2:Ask the root node for tuples.

    R1

    Φ2 ancestor::person

    Φ6 //::province

    a.d.y.a


    Master thesis presentation 22 nd april 2004 venkatesh raghavan

    a.d.y.a.a

    “Massachussets”

    Φ6 //::province

    a.d.y.a

    β3EQ

    L4 ‘Vermont’

    Φ5 child::text

    a.d.y.a

    a.d.y.a

    a.d.y.a.a


    Master thesis presentation 22 nd april 2004 venkatesh raghavan

    Φ6 //::province

    a.d.y.a

    β3EQ

    L4 ‘Vermont’

    Φ5 child::text

    a.d.y.a

    a.d.y.a


    Master thesis presentation 22 nd april 2004 venkatesh raghavan

    a.d.y.b

    a.d.y.b.a

    “Vermont”

    a.d.y.b

    Φ6 //::province

    a.d.y.b

    β3EQ

    L4 ‘Vermont’

    Φ5 child::text

    a.d.y.a

    a.d.y.b

    a.d.y.b.a


    Master thesis presentation 22 nd april 2004 venkatesh raghavan

    R1

    Φ2 ancestor::person

    a.d.y.b

    a.d.y

    Φ6 //::province

    a.d.y.b


    Outline5

    Outline

    • Motivation

    • Related Work

    • Background for VAMANA Approach

    • Our Physical Algebra

    • Query Execution Engine

    • Query Optimization

      • Query Clean-Up

      • Cost Model

      • Transformation

    • Experimental Evaluation

    • Conclusions


    Optimization

    Optimization

    Query Plan (P “)+ Heuristics (L( P “) )

    Query Plan (P I)

    Default

    Query Plan (P )

    Clean Up

    Cost Estimator

    Transformation

    Optimal Query Plan (P opt )

    Transformed Query Plan (P t )


    Clean up

    Φ2child::phone

    Φ2child::phone

    Φ3 self::person

    Φ3 self::person

    Φ5 descendant::name

    Φ5 descendant::name

    Φ4 parent::*

    Clean Up

    a. Default Query Plan

    b. Cleaned Query Plan


    Vamana cost model

    VAMANA Cost Model

    • The cost is usually calculated with respect to the root of the XML document or a node specified by the user.

    • Query costs are obtained from the actual data rather than a data dictionary and thus are always up to date.

    • Our cost model, does not suffer the overhead of parsing the entire document. This is the case for does histogram-based costing like StaTiX[14].

    • Starting from the leaf operators we propagate the cost upwards towards the root operator.


    Vamana cost model1

    VAMANA Cost Model

    COUNT(opidcond)

    • This heuristics is only calculated for step operators (Φidaxis::nodetest). It represents the count of the number of XML nodes in the underlying index structure that satisfy the node test of the step operator axis::nodetest.

    • MASS provides an API to efficiently gather count of a particular node test in its storage structure.

      TC(opidcond)

    • For a literal operator(Lid value), text count is the number of occurrences of a particular literal value in the index structure.


    Contd1

    Contd..

    IN (opi)

    The maximum number of tuples that the operator opi will receive in total from its context child.

    • Case 1:

      • For a leaf step operator on the context path of the query plan, the total number of tuples received is equal to the number of tuples available in the underlying index structure, i.e. IN(opi ) = COUNT(opi ).

    • Case 2:

      • For all non-leaf operator(s), IN(opi ) = OUT(opj), where opj is the context child of opi.

    • Case 3:

      • For all leaf step operator(s) on the predicate path of

        query plan, the total number of tuples received is equal

        to the number of tuples received by its predicate

        operator.


    Contd2

    Contd..

    OUT(opi)

    The maximum number of tuples that the current operator opi returns

    • Case 1:

      • A leaf step operator on the context path of the query plan returns all the tuples that occur in the underlying index structure with respect to the context of the leaf operator, i.e., OUT(opi )= COUNT(opi ).

    • Case 2:

      • A literal operator(s) returns the same values every time a request for tuples is received. To facilitate the optimization of literal operators by using a value-index, we define output as OUT(opi ) = TC(opi ).

    • Case 3:

      • For binary predicate operators that have value-based

        equivalence, the OUT(opi) is calculated as follows

    Minimum(# tuples from parent operator, TC of the literal value)


    Contd3

    Contd..

    • Non-leaf operators

      • Context path

      • Predicate path

    • Leaf operators

      • predicate path


    Contd4

    Contd..

    “Cost determined with respect to context node provider”


    Example 1

    Example 1:

    Φ2child::address

    Count : 1256

    IN : 4825

    OUT : 1256

    Φ3parent::person

    Count : 2550

    IN : 4825


    Example 2

    Example 2:

    Φ3parent::person

    Count : 2550

    IN : 4825

    OUT : 4825

    Φ6descendant::name

    Count : 4825

    OUT : 4825


    Heuristics

    Heuristics

    Ratio = IN/OUT

    • Higher the ratio, better the selectivity.

    δ i = scale0..1 (IN/OUT)

    • Inverted index <scaled(IN/OUT), opi>.


    Transformation library

    Transformation Library

    • XPath equivalence rules [5] extend for VAMANA physical algebra.

      Rule 1 : /descendant::n/parent::m/.. //m[child::n]/..

      Rule 2 : /descendant-or-self::n/child::m/..  //m[parent::n]/..

      Rule 3 : p/following-sibling::n/parent::m  p[following-sibling::n]parent::m

      Rule 4 : /child::m/preceding-sibling::n  descendant::n[following-sibling::n]

    • Binary predicate

      • Value based equivalence.

        • Value-index optimization


    Master thesis presentation 22 nd april 2004 venkatesh raghavan

    COUNT= 4825

    IN= 4825 OUT= 4825

    Φ5 descendant::name

    Q1

    R1

    COUNT= 1256

    IN = 4825 OUT= 1256

    Φ2child::address

    COUNT= 2550

    IN= 4825 OUTT = 4825

    Φ3 parent::person


    Master thesis presentation 22 nd april 2004 venkatesh raghavan

    COUNT= 1256

    IN = 2550 OUT= 2550

    COUNT= 2550

    IN= 2550 OUT= 2550

    COUNT= 4825

    IN= 4825 OUT= 4825

    Φ5 descendant::name

    Φ5 child::name

    IN= 4825 OUT= 2550

    COUNT= 4825

    IN= 2550 OUT= 4825

    /descendant::n/parent::m/.. //m[child::n]/..

    R1

    R1

    Φ2child::address

    COUNT= 1256

    IN = 4825 OUT= 2550

    Φ2child::address

    Φ3 //::person

    COUNT= 2550

    IN= 4825 OUT= 4825

    Φ3 parent::person

    ξ6

    a. Initial Query Plan

    b. Transformed Query Plan


    Master thesis presentation 22 nd april 2004 venkatesh raghavan

    COUNT= 4825

    IN= 1256 OUT= 4825

    Φ5 child::name

    R1

    /descendant-or-self::n/child::m/..  //m[parent::n]/..

    COUNT= 1256

    IN = 1256 OUT= 1256

    Φ2//::address

    IN= 1256 OUT= 1256

    ξ7

    COUNT= 2550

    IN= 1256 OUT= 1256

    Φ3 parent::person

    Optimal Query Plan

    IN= 4825 OUT= 1256

    ξ6


    Running example 2

    Φ5 child::text

    Φ6 //::province

    L4 ‘Vermont’

    β3EQ

    Running Example 2:

    R1

    Φ2ancestor::person

    COUNT= 2550

    IN =1256OUT= 1256

    COUNT= 1256

    IN = 1256 OUT= 1256

    IN = 1256 OUT= 13

    TC = 13

    COUNT= 304819

    IN = OUT =304819


    Master thesis presentation 22 nd april 2004 venkatesh raghavan

    Φ2ancestor::person

    Φ2ancestor::person

    Φ6 parent::province

    Φ5 child::text

    Φ5 value:: ‘Vermont’

    L4 ‘Vermont’

    Φ6 //::province

    β3EQ

    R1

    R1

    a. Default Query Plan

    b. Transformed Query Plan


    Can we produce bad queries

    Can we produce BAD queries?

    • No!

    • Our optimization aims to reduce the number of tuples the parent operator receives.

    • Hence we try to push the most selective operators downward.

    • An operator is considered for transformation only if:

      • It is selective.

      • There exist an equivalent transformation rule.

        i.e. its parent is not affected by the transformation.

      • The number of tuples filtered generated by the operator is reduced.

    • Now, whether the transformation process ends is another question.

      • To solve this infinite running we have to brute stop

        the optimization process after a specific number of iterations.


    Outline6

    Outline

    • Motivation

    • Related Work

    • Background for VAMANA Approach

    • Our Physical Algebra

    • Query Execution Engine

    • Query Optimization

    • Experimental Evaluation

      • Criterions

      • Queries

      • Results

    • Conclusions


    Experimental evaluation

    Experimental Evaluation

    • XMark[17] auction database.

    • Compared the CPU execution time of the test queries over different XPath engines

      • Galax [9]

      • Jaxen [22]

      • Others

        • IPSI [10]

        • Pathan [11]

        • Xindices [14]

    • Variable factors

      • Document size

        • Factor (100Kb, 1Mb, 5Mb, 10Mb, 20Mb, 30Mb, 40Mb, 50Mb)

      • Queries


    Xpath queries

    XPath Queries

    • XMark [17] Queries not very useful in evaluation XPath queries.

    • Criterions for selecting the queries

      • Cover major forward and reverse XPath axes.

      • Experimentally prove the correctness of the execution strategy.

      • Empirically prove that optimization does not make the query more expensive

    • Query List

      • Q1 : //person/address

      • Q2 : //watches/watch/ancestor::person

      • Q3 : /descendant::name/parent::*/parent::person/address

      • Q4 : //itemref/following-sibling::price/parent::*

      • Q5 : //province[text()="Vermont"]/ancestor::person

    • DNF-2hrs : Did Not Finish – In 2 Hours

    • DNF-MO : Did Not Finish – Memory Overflow

    • NS : Not Supported


    Improved performance

    Improved Performance

    Q1: //person/address

    Q2: //watches/watch/ancestor::person


    Correctness of query optimization

    Correctness of Query Optimization

    Q3: /descendant::name/parent::*/parent::person/address


    Master thesis presentation 22 nd april 2004 venkatesh raghavan

    Q4: //itemref/following-sibling::price/parent::*

    Q5: //province[text()="Vermont"]/ancestor::person


    Outline7

    Outline

    • Motivation

    • Related Work

    • Background for VAMANA Approach

    • Our Physical Algebra

    • Query Execution Engine

    • Query Optimization

    • Experimental Evaluation

    • Conclusions


    Conclusion

    Conclusion

    • Efficient, cost driven XML query execution (XPath)

    • Execution Engine

      • VAMANA's index-only pipelined execution is novel to XML query evaluation as most of the existing engines use structural joins for query evaluation.

      • Physical algebra that supports such execution for any given valid XPath expression.

    • Cost Model

      • The VAMANA cost model is simple but very effective for identifying highly selective nodes. And the costing can be specific to a particular XML document or even to a specific point in the XML document

      • The dynamic cost estimation support makes costing up-to-date in environments which have frequent XML updates.

    • The model is well supported by a set of transformation rules used for operator optimization.

    • Our experiments shows the correctness, effectiveness and feasibility of our model.


    Future work

    Future Work

    • VAMANA is the building block for the next phase of developing a XQuery Engine.

    • VAMANA currently implements only a subset of the XPath equivalence rules stated in [5].

    • The current cost model can be exploited for XQuery optimization.


    Acknowledgement

    Acknowledgement

    • Advisor : Prof. Elke Rundensteiner

    • Reader : Prof. Micha Hofri

    • MASS-VAMANA team:

      • Kurt W. Deschler

        • Developing MASS

        • Helping me in design, development of VAMAN Architecture

        • Unix and systems related issues

      • Tammy L Worthington

        • XPath Compiler

    • Faculty : Inputs from Professor Dan Dougherty

    • Research Group : DSRG

    • Systems : Jesse Banning & Micheal Voorhis


    References 1

    References [1]

    • T. Bray, J. Paoli, C. Sperberg-McQueen, E. Maler and F.Yergeau. Extensible Markup Language (XML) 1.0 W3C Recommendation. Available at http://www.w3.org/TR/2004/REC-xml-20040204/, Feb 2004.

    • J. Clark and S. DeRose. XML Path Language (XPath) 1.0, W3C Recommendation. Available at http://www.w3.org/TR/xpath, 2002.

    • A. Berglund, S. Boag, D. Chamberlin. M. F. Fernandez, M. Kay, J. Robie and J. Sim ' e on, XML Path Language (XPath) 2.0, W3C Working Draft. Available at http://www.w3.org/TR/xpath20/, Nov 2003.

    • K. Deschler and E. Rundensteiner. MASS- Multi Axis Storage Structure , CIKM, New Orleans, USA, Nov 2003

    • D. Olteanu, H. Meuss, T. Furche, and F. Bry. Symmetry in XPath. Proceedings of Seminar on Rule Markup Techniques, no. 02061, Schloss Dagstuhl, Germany, Feb 2002.

    • M. Carey, J. Kiernan, J. Shanmugasundaram, E. Shekita and S. N. Subramanian. XPERANTO: Middleware for Publishing Object-Relational Data as XML Documents , The VLDB Journal, pages 646-648, 2000.

    • X. Zhang, M. Mulchandani, S. Christ, B. Murphy and E. Rundensteiner. Rainbow: Mapping-Driven XQuery Processing System , SIGMOD, page 614, 2002. Kweelt. From http://kweelt.sourceforge.net.

    • Galax system. Available at http://db.bell-labs.com/galax/.

    • IPSI-XQ - XQuery demonstrator. Available at http://www.ipsi.fraunhofer.de

    • Pathan, Available at http://software.decisionsoft.com

    • A. Marian and J. Sim ' e on. Projecting XML Documents , VLDB'2003. Berlin, Germany, September 2003

    • W. Meier. eXist: An Open Source Native XML Database. Web and Database-Related Workshops, Erfurt, Germany, October 2002.

    • Apache Xindice. Available at: http://xml.apache.org/xindice/

    • H. Jagadish, S. Al-Khalifa, A. Chapman, L. Lakshmanan, A. Nierman, S. Paparizos, J. Patel, D. Srivastava, N. Wiwatwattanan, Y. Wu and C. Yu. TIMBER: A native XML database , The VLDB Journal, 11(4): 274-291, 2002.


    Reference 2

    Reference [2]

    • J. Freire, J. Haritsa, M. Ramanath, P. Roy, and J. Simeon. Statix: Making XML count. In Proc. of SIGMOD, pages 181-191, 2002.

    • XMark - The XML Benchmark project. www.xml-benchmark.org

    • R. Goldman and J. Widom. DataGuides: Enabling Query Formulation and Optimization in Semi-structured Databases , the 23rd Int. Conf. on Very Large Databases (VLDB), Athens, Greece, pages 436-445, 1997.

    • T. Milo and D. Suciu. Index structure for path expression , In Proceedings of 7th International Conference on Database Theory, pages 277-295, 1999.

    • Q. Li and B. Moon. Indexing and Querying XML Data for Regular Path Expressions , Proceedings of 27th International Conference on Very Large Database (VLDB'2001), Rome, Italy, pages 361-370, September 2001.

    • S. Boag, D. Chamberlin, M. Fern ' a ndez, D. Florescu, J. Robie, J. Sim ' e on, An XML Query Language (XQuery) 1.0, W3C Working Draft. Available at http://www.w3.org/TR/xquery/, Nov 2003.

    • B. McWhirter and J. Strachnan. Jaxen: Universal XPath Engine . Available at http://jaxen.org/.

    • C.C. Kanne and G. Moerkotte, Efficient storage of XML data . In Proc. ICDE Conf., poster abstract, page 198, San Diego, USA, 2000.

    • P. Bohannon and J. Freire and P. Roy and J. Simeon, From (XML) Schema to Relations: A Cost-based Approach to XML Storage , ICDE, 2002

    • T. Worthington and E. Rundenstiner, XPath Processor , WPI MQP Report, April 2002.

    • Flavio Rizzolo, Alberto Mendelzon. Indexing XML Data with ToXin , WebDB, pages 49-54, Santa Barbara, USA, 2001.

    • Z. Chen, H. Jagadish, F. Korn, N. Koudas, S. Muthukrishnan, R. Ng and D. Srivastava, Counting Twig Matches in a Tree , ICDE, pages 595-604, 2001.

    • Document Object Model (DOM), Available at http://www.w3.org/DOM/.


  • Login