Streaming XPath Engine

Oleg Slezberg, Amruta Joshi


Presentation Transcript


  1. Streaming XPath Engine. Oleg Slezberg, Amruta Joshi

  2. Overview
     • Motivation
       • Querying Streaming XML
       • XPath Challenges (predicates, //, nesting…)
     • Basic Objective
       • Comparative Analysis of Algorithms
     • Implementation
       • Implemented engine in Java using JDK 1.4.2
       • Apache Xerces 2.6.2 for parsing (both XML and XPath)
       • Used existing XSQ Java implementation
       • Benchmark for evaluation: XPathMark

  3. XStream
     • Builds a parse tree for the input query
     • Maintains an event stack
     • For each query node, keeps the matching elements of the streaming input document
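As a rough illustration of the first bullet, the sketch below turns a simple path query into a list of steps. It is not the XStream parser: the names `QuerySteps`, `Step`, and `parse` are hypothetical, only element names with the `/` and `//` separators are handled (no predicates), and `//` is treated as a plain descendant step, which is a simplification.

```java
import java.util.ArrayList;
import java.util.List;

class QuerySteps {
    /** One location step: an axis (child or descendant) plus an element name. */
    static final class Step {
        final boolean descendant;
        final String name;

        Step(boolean descendant, String name) {
            this.descendant = descendant;
            this.name = name;
        }

        @Override
        public String toString() {
            return (descendant ? "descendant::" : "child::") + name;
        }
    }

    /** Parses queries such as "//a/b" into steps; predicates are not supported (assumption). */
    static List<Step> parse(String query) {
        List<Step> steps = new ArrayList<>();
        int i = 0;
        while (i < query.length()) {
            if (query.charAt(i) != '/') {
                throw new IllegalArgumentException("expected '/' at position " + i);
            }
            boolean descendant = query.startsWith("//", i);
            i += descendant ? 2 : 1;
            int end = query.indexOf('/', i);
            if (end < 0) {
                end = query.length();
            }
            if (end == i) {
                throw new IllegalArgumentException("empty step at position " + i);
            }
            steps.add(new Step(descendant, query.substring(i, end)));
            i = end;
        }
        return steps;
    }

    public static void main(String[] args) {
        System.out.println(parse("//a/b")); // prints [descendant::a, child::b]
    }
}
```

A real query parse tree would also carry predicate subtrees under each step; the flat list here only covers the linear case.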

  4. Our Contributions
     • Correction
     • Verification
     • Performance Figures
     • Recursive Query Handling
     • Query Evaluation Support

  5. Performance
     • Benchmark: XPathMark, a set of 23 queries (mostly predicate queries)
     • Criterion: queries per second (QPS)
     • Test setup: run on elaine2, a 900 MHz dual-CPU machine
     • Results:
       • XSQ: QPS 4.39, coverage 17%
       • TurboXPath: QPS 5.75, coverage 21%+
       • Time = XML parsing + processing
     • QPS: XStream is 30% faster and has better coverage on the given benchmark
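As a quick check of the speed-up figure, and assuming (the slide does not say so explicitly) that the 30% compares the two QPS values listed under Results:

```latex
\[
\frac{5.75}{4.39} \approx 1.31
\]
```

That is roughly a 30% higher query rate than XSQ.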

  6. Recursive Query Handling
     • Recursion arises when, for a query node n and elements e1, e2 in document d:
       • Both e1 and e2 match n
       • e1 contains e2
     • Example:
       • Document: <a><a><b/></a><b></b></a>
       • Query: //a/b
     • FA-based algorithms
       • Exponential number of states
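The example on this slide can be traced with a small SAX-based sketch. It is hard-coded to the query //a/b and is not the XStream algorithm; the class name `RecursiveMatchDemo` and the `a#n` labels are hypothetical. It keeps a stack of open elements, so the two nested `a` elements are active as candidates at the same time without enumerating automaton states.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class RecursiveMatchDemo {
    /** Streams the document once; for each b matched by //a/b, records which a is its parent. */
    static List<String> match(String xml) {
        final List<String> matches = new ArrayList<>();
        final Deque<String> open = new ArrayDeque<>(); // labels of currently open elements
        final int[] aCount = {0};                      // numbers the a elements in document order

        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                String parent = open.peek();
                // '#' cannot occur in an XML name, so "a#" only ever marks a labelled a element.
                if (qName.equals("b") && parent != null && parent.startsWith("a#")) {
                    matches.add("b under " + parent);
                }
                open.push(qName.equals("a") ? "a#" + (++aCount[0]) : qName);
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                open.pop();
            }
        };

        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
        return matches;
    }

    public static void main(String[] args) {
        // prints [b under a#2, b under a#1]
        System.out.println(match("<a><a><b/></a><b></b></a>"));
    }
}
```

The inner `a` (a#2) is contained in the outer one (a#1), and both match the `a` node of the query, which is exactly the recursive situation described above.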

  7. Query Evaluation Support
     • Two questions:
       • Filtering
         • Does this document match the query?
         • F1: XML => boolean
       • Evaluation
         • What parts of the document match the query?
         • F2: XML => XML
     • Modifications:
       • Output buffers for predicate owner
       • Predicate node buffers
       • Predicate evaluation
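The difference between the two questions can be seen in the shape of the two functions. The sketch below uses the JDK's built-in in-memory XPath evaluator (`javax.xml.xpath`), not a streaming engine, purely to illustrate the signatures F1 and F2; the class and method names are hypothetical, and `evaluate` returns node names rather than serialized XML to keep the example short.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class FilterVsEvaluate {
    /** F1: XML => boolean. Does the document contain at least one match? */
    static boolean filter(String xml, String query) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            return (Boolean) xpath.evaluate("boolean(" + query + ")",
                    new InputSource(new StringReader(xml)), XPathConstants.BOOLEAN);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("bad query or document", e);
        }
    }

    /** F2: XML => XML (here reduced to the names of the matching nodes). */
    static List<String> evaluate(String xml, String query) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(query,
                    new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(nodes.item(i).getNodeName());
            }
            return names;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("bad query or document", e);
        }
    }

    public static void main(String[] args) {
        String doc = "<a><a><b/></a><b></b></a>";
        System.out.println(filter(doc, "//a/b"));   // prints true
        System.out.println(evaluate(doc, "//a/b")); // prints [b, b]
    }
}
```

A filtering engine can stop at the first match, whereas an evaluating engine has to hold candidate output until any pending predicates are decided, which is presumably why the modifications listed above revolve around buffers.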

  8. Multiple Simultaneous Queries
     • Combine the queries by OR-ing them together:
       • q = (q1) | (q2) | … | (qn);
     • Resulting query has multiple output nodes
     • Associate a query-id with each output node
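The combination step can be sketched as plain string manipulation. The names `QueryCombiner`, `combine`, and `withIds` are hypothetical, and using the 1-based position of each query as its query-id is an assumption for illustration; the slide does not say how ids are assigned.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class QueryCombiner {
    /** Builds q = (q1) | (q2) | ... | (qn) from the individual queries. */
    static String combine(List<String> queries) {
        return queries.stream()
                .map(q -> "(" + q + ")")
                .collect(Collectors.joining(" | "));
    }

    /** Assigns each query a 1-based id so results can later be attributed to it (assumption). */
    static Map<Integer, String> withIds(List<String> queries) {
        Map<Integer, String> byId = new LinkedHashMap<>();
        for (int i = 0; i < queries.size(); i++) {
            byId.put(i + 1, queries.get(i));
        }
        return byId;
    }

    public static void main(String[] args) {
        List<String> queries = Arrays.asList("//a/b", "//c");
        System.out.println(combine(queries)); // prints (//a/b) | (//c)
        System.out.println(withIds(queries)); // prints {1=//a/b, 2=//c}
    }
}
```

The combined string is a valid XPath union expression; the engine still needs the id attached to each output node to tell which original query produced a given result.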

  9. Conclusion
     • Streaming XPath Engine
       • All objectives met (XPath stream evaluator implemented, performance analysis)
       • Algorithm correction and enhancements
     • Future Directions
       • Backward axis support
       • Function support: reuse predicate evaluation model
       • Extended expression type support
       • Predicate pipelining
