The Connection Factory Jeroen van Rotterdam, CTO May 19th, WWW9


Presentation Transcript


  1. The Connection Factory
     Jeroen van Rotterdam, CTO
     May 19th, WWW9

  2. Contents
     - Xhive setup
     - XPath
     - XPath performance issues within XML collections

  3. Xhive
     - OO-XML database
     - Highly scalable
     - High granularity
     - W3C DOM Level 2 compliant
     - XPath 1.0 compliant

  4. Architecture

  5. Architecture

  6. Why XPath
     Competing solutions:
     - XML-QL: Where-In constructs
     - XQL: limited
     - SQL: no alternative
     XPath is a complete pattern-matching language.

  7. XPath
     Advantages:
     - fairly complete
     - multiple axes
     - supported by W3C
     - base for XPointer and XLink
     - base for the XML Query WG
     - user-based functions
     Disadvantages:
     - document oriented
     - slightly different tree model
     - no updates

  8. Extending DOM
     Collection setup: every document is a “Bastard Node”

  9. Library Node: Advantages
     - natural extension of DOM
     - extensible
     - closely related to directory structures
     - searchable with XPath

  10. Library Node: Disadvantages
      - potential bottleneck

  11. XPath
      XPath in a large PDOM collection environment:
      1. Address memory issues
      2. Solve differences in specs
      3. Address performance issues

  12. Memory issues
      - Avoid recursion
      - Make intermediate results persistence-capable
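
To illustrate the "avoid recursion" point, here is a minimal sketch of a document-order traversal that uses an explicit stack instead of the call stack. It uses Python's `xml.dom.minidom` as a stand-in DOM; the helper name `iter_elements` is made up for this example and is not part of Xhive.

```python
from xml.dom import minidom

def iter_elements(root):
    """Yield element nodes in document order using an explicit stack."""
    stack = [root]
    while stack:
        node = stack.pop()
        if node.nodeType == node.ELEMENT_NODE:
            yield node
        # Push children in reverse so the first child is popped (visited) first.
        stack.extend(reversed(node.childNodes))

doc = minidom.parseString("<book><chapter><title/></chapter><chapter/></book>")
names = [e.tagName for e in iter_elements(doc.documentElement)]
```

Because the pending work lives in an ordinary list, the traversal depth is not limited by the interpreter's recursion limit, and the list could in principle be spilled to storage.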

  13. Solve differences
      Differences between the specs include, e.g.:
      - getParent on attributes (XPath) vs. ownerElement (DOM)
      - namespace nodes
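
The attribute difference can be seen in any DOM implementation. The sketch below uses Python's `xml.dom.minidom`: in DOM an attribute has no parent node and is reached through `ownerElement`, whereas the XPath data model treats the element as the attribute's parent. The helper `xpath_parent` is a hypothetical adapter written for this example.

```python
from xml.dom import minidom

doc = minidom.parseString('<chapter id="c1"/>')
attr = doc.documentElement.getAttributeNode("id")

def xpath_parent(node):
    """Parent as the XPath data model sees it: an attribute's parent is its element."""
    if node.nodeType == node.ATTRIBUTE_NODE:
        return node.ownerElement
    return node.parentNode

dom_parent = attr.parentNode        # DOM view: attributes have no parent
xpath_view = xpath_parent(attr)     # XPath view: the owning element
```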

  14. Performance
      Increasing XPath performance:
      - Query analysis
      - Avoid reparsing
      - Lazy evaluation
      - Index structures
      - Cache strategy
      - DTD analysis
      - Statistical data

  15. Performance
      1. Query analysis:
         a. Can the query be simplified? E.g. /child::chapter[5+5] can be rewritten as /child::chapter[10].
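
A rough sketch of this kind of simplification: fold a constant arithmetic predicate into a single number before evaluation. As a shortcut it borrows Python's own expression parser for the arithmetic, which is an assumption of this example (real XPath arithmetic also has `div` and `mod`); `fold_constant` and `simplify_predicate` are hypothetical names.

```python
import ast
import operator

_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}

def fold_constant(expr):
    """Return the value of a constant arithmetic expression, or None if not constant."""
    def ev(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        raise ValueError("not a constant expression")
    try:
        return ev(ast.parse(expr, mode="eval").body)
    except (ValueError, SyntaxError):
        return None

def simplify_predicate(step, predicate):
    """Rewrite step[predicate] with the predicate folded when it is constant."""
    value = fold_constant(predicate)
    return f"{step}[{predicate if value is None else value}]"

folded = simplify_predicate("/child::chapter", "5+5")
untouched = simplify_predicate("/child::chapter", "position()")
```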

  16. Performance
      1. Query analysis:
         b. Does the query depend on the context node? Absolute queries are context independent.
            “Give me all chapters where the title is the same as the book title”:
            //chapter[title=string(/book/title)]
            Evaluate string(/book/title) only once.
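
The idea of evaluating the context-independent part once can be sketched by hoisting it out of the per-chapter loop. The example below uses `xml.etree.ElementTree` and hand-written filtering rather than a real XPath engine; `book_title` is a stand-in for `string(/book/title)` and the counter exists only to make the single evaluation visible. It compares only the first `title` child of each chapter, which is a simplification.

```python
import xml.etree.ElementTree as ET

book = ET.fromstring(
    "<book><title>XPath</title>"
    "<chapter><title>Intro</title></chapter>"
    "<chapter><title>XPath</title></chapter></book>"
)

evaluations = 0

def book_title(root):
    """Stand-in for string(/book/title); counts how often it is evaluated."""
    global evaluations
    evaluations += 1
    return root.findtext("title")

def chapters_matching_book_title(root):
    wanted = book_title(root)  # context independent: evaluated once, outside the loop
    return [c for c in root.iter("chapter") if c.findtext("title") == wanted]

matches = chapters_matching_book_title(book)
```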

  17. Performance
      2. Storing parsed queries: “compile” and optimize each query only once
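
One simple way to avoid reparsing is to cache the parsed form keyed by the query string. The sketch below assumes a toy "compiler" that merely splits a location path into steps; `compile_query` is a hypothetical name, and the counter only demonstrates that parsing happens once.

```python
from functools import lru_cache

parse_count = 0

@lru_cache(maxsize=256)
def compile_query(query):
    """Toy 'compiler': split a simple location path into its steps."""
    global parse_count
    parse_count += 1
    return tuple(step for step in query.split("/") if step)

# The same query string is requested many times but parsed only once.
for _ in range(1000):
    steps = compile_query("/child::book/child::chapter")
```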

  18. Performance
      3. Lazy evaluation, e.g. for operations on node-sets:
         - booleans (evaluate the first node)
         - strings (first node in document order)
         - numbers (string to number)
         Example: “give me all chapters which have paragraphs”: /chapter[paragraph]
         Finding one paragraph will do.
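
A minimal sketch of the boolean case, assuming the node-set is produced by a generator so that evaluation can stop at the first hit. The names `children_named` and `has_child` are made up for this example, and the `visited` counter is there only to show how few nodes are inspected.

```python
import xml.etree.ElementTree as ET

visited = 0

def children_named(element, name):
    """Lazily yield matching children, counting how many nodes are inspected."""
    global visited
    for child in element:
        visited += 1
        if child.tag == name:
            yield child

def has_child(element, name):
    # any() stops at the first match: a boolean test on a node-set needs one node only.
    return any(True for _ in children_named(element, name))

chapter = ET.fromstring("<chapter>" + "<paragraph/>" * 1000 + "</chapter>")
result = has_child(chapter, "paragraph")
```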

  19. Performance
      4. Indexing:
         - getFirstChildElementByName(String name)
         - getNextSiblingElementBySameName()
         - getFirstChildByType(short type)
         - getNextSiblingByType(short type)
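
The slide lists only method names, so how Xhive implements them is not known from the source. As one possible reading, the sketch below keeps a per-name index on each parent and a same-name sibling link on each child, so the first two lookups avoid scanning the child list. The class `IndexedElement` and its Python-style method names are assumptions for illustration.

```python
class IndexedElement:
    """Toy element whose children are indexed by name (not Xhive's implementation)."""

    def __init__(self, name):
        self.name = name
        self.children = []
        self.next_same_name = None   # next sibling carrying the same name
        self._first_by_name = {}     # name -> first child with that name
        self._last_by_name = {}      # name -> most recently appended child with that name

    def append(self, child):
        self.children.append(child)
        previous = self._last_by_name.get(child.name)
        if previous is None:
            self._first_by_name[child.name] = child
        else:
            previous.next_same_name = child
        self._last_by_name[child.name] = child
        return child

    def get_first_child_element_by_name(self, name):
        return self._first_by_name.get(name)

    def get_next_sibling_element_by_same_name(self):
        return self.next_same_name

book = IndexedElement("book")
book.append(IndexedElement("title"))
ch1 = book.append(IndexedElement("chapter"))
book.append(IndexedElement("appendix"))
ch2 = book.append(IndexedElement("chapter"))
```

Walking all chapters then touches only chapter nodes: start at `book.get_first_child_element_by_name("chapter")` and follow `get_next_sibling_element_by_same_name()` until it returns `None`.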

  20. Performance
      5. Caching strategy: top-level paging/cluster strategy

  21. Performance
      6. Use DTD information, e.g. /child::chapter/child::book[4] can return null immediately, without searching, if the DTDs in use show that a chapter never contains a book.
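
A sketch of such a check, assuming the DTD has already been distilled into a table of allowed child elements. The table `CONTENT_MODEL`, the set of possible root elements, and the function `can_match` are all hypothetical and only handle chains of child steps.

```python
# Hypothetical content model distilled from a DTD: element -> allowed child elements.
CONTENT_MODEL = {
    "book": {"title", "chapter"},
    "chapter": {"title", "paragraph"},
    "title": set(),
    "paragraph": set(),
}

def can_match(child_steps, content_model, roots):
    """Return False if a chain of child:: steps can never match under the model."""
    allowed = set(roots)
    for name in child_steps:
        if name not in allowed:
            return False
        allowed = content_model.get(name, set())
    return True

# /child::chapter/child::book[4]: a chapter never contains a book, so skip the search.
impossible = can_match(["chapter", "book"], CONTENT_MODEL, {"book", "chapter"})
possible = can_match(["book", "chapter"], CONTENT_MODEL, {"book", "chapter"})
```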

  22. Performance
      7. Gather statistical information: DTDs or Xschema specify which structures may occur, not what is actually in your collection.
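
What such statistics might look like is not specified on the slide. One simple possibility, sketched here with `xml.etree.ElementTree`, is to count the parent/child element pairs that actually occur in the collection; `child_statistics` is a made-up helper name.

```python
from collections import Counter
import xml.etree.ElementTree as ET

def child_statistics(documents):
    """Count (parent, child) element pairs that actually occur in the collection."""
    counts = Counter()
    for root in documents:
        for parent in root.iter():
            for child in parent:
                counts[(parent.tag, child.tag)] += 1
    return counts

collection = [
    ET.fromstring("<book><chapter><paragraph/></chapter><chapter/></book>"),
    ET.fromstring("<book><chapter/></book>"),
]
stats = child_statistics(collection)
```

A count of zero for a pair that the DTD allows, such as `("chapter", "title")` here, tells a query planner that a step can be skipped for this particular collection.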

  23. Conclusion
      - DOM within database environments
      - XPath on top of a PDOM
      - XPath is fairly complete
      - Focus on performance

  24. WWW9
      Beta testers and developers wanted.
      Email: info@xhive.com
      Have fun...
