
Evaluation of Partial Path Queries on XML Data

Stefanos Souldatos (NTUA, Greece), Xiaoying Wu (NJIT, USA), Dimitri Theodoratos (NJIT, USA), Theodore Dalamagas (NTUA, Greece), Timos Sellis (NTUA, Greece).


Presentation Transcript


  1. Evaluation of Partial Path Queries on XML Data. Stefanos Souldatos (NTUA, Greece), Xiaoying Wu (NJIT, USA), Dimitri Theodoratos (NJIT, USA), Theodore Dalamagas (NTUA, Greece), Timos Sellis (NTUA, Greece).

  2. Evaluation of Partial Path Queries on XML Data. Outline: • Partial path queries • Query processing • Query evaluation • Experiments • Conclusion

  3. Difficulties on Querying XML Data. [Figure: XML tree of theHotel.gr with nodes City, Island, Location, Center, Athens, Creta, Poros, Chania, Heraklio.]

  4. Difficulties on Querying XML Data. Search problem. Name: Xiaoying Wu. Place: Athens Center, Heraklio. Purpose: Sightseeing. Problem: structural difference. [Figure: the theHotel.gr tree; pictures of the Parthenon (438 BC) and Phaistos’ Disk (1700 BC).]

  5. Difficulties on Querying XML Data. Search problem. Name: Theodore Dalamagas. Place: Islands. Purpose: Sea sports. Problem: structural inconsistency. [Figure: the theHotel.gr tree; pictures captioned Windsurf and Jet ski.]

  6. Difficulties on Querying XML Data. Search problem. Name: Dimitri Theodoratos. Place: Heraklio. Purpose: HDMS Conference. Problem: unknown structure. [Figure: the theHotel.gr tree; HDMS 2008 logo.]

  7. Difficulties on Querying XML Data. Search problem. Name: Stefanos Souldatos. Place: Any island. Purpose: Escape from PhD! Problem: multiple sources (theHotel.gr, hotels.gr, holidays.gr; 1400 islands).

  8. Difficulties on Querying XML Data. Can we use existing query languages (XPath, XQuery) to express our queries? Can we use existing techniques to evaluate our queries? [Figure: the theHotel.gr tree.]

  9. Path Queries in XPath. [Figure: three query patterns over the nodes theHotel.gr, City and Island.]
     • no structure (keywords): //theHotel.gr[descendant-or-self::*[ancestor-or-self::City][ancestor-or-self::Island]]
     • partial path queries: //theHotel.gr//City[descendant-or-self::*[ancestor-or-self::Island]]
     • full structure (path patterns): /theHotel.gr/City//Island
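The difference between the three query styles can be illustrated on plain label paths. The following is a minimal sketch, not the authors' code: it assumes each root-to-leaf path of the XML tree is given as a list of labels, and the three function names and the sample paths are hypothetical. It only tests whether a path contains a match; the XPath expressions themselves select nodes.

```python
from typing import List

def keyword_match(path: List[str]) -> bool:
    """No structure: all three labels occur somewhere on the path, in any order."""
    return {"theHotel.gr", "City", "Island"} <= set(path)

def partial_match(path: List[str]) -> bool:
    """Partial structure: City below theHotel.gr, Island anywhere on the same path."""
    city_below_hotel = any(
        path[i] == "theHotel.gr" and "City" in path[i + 1:]
        for i in range(len(path))
    )
    return city_below_hotel and "Island" in path

def full_match(path: List[str]) -> bool:
    """Full structure: /theHotel.gr/City//Island."""
    return path[:2] == ["theHotel.gr", "City"] and "Island" in path[2:]

# Hypothetical root-to-leaf label paths.
p1 = ["theHotel.gr", "City", "Location", "Island"]
p2 = ["theHotel.gr", "Island", "City"]
p3 = ["City", "theHotel.gr", "Island"]
for p in (p1, p2, p3):
    print(p, keyword_match(p), partial_match(p), full_match(p))
```

On these sample paths the keyword query accepts all three, the partial path query accepts the first two, and the full path pattern accepts only the first.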

  10. Partial Path Queries. [Figure: a partial path query over nodes r, a, b, c, d.] Legend: root node r (optional); query node labelled by “a”; child relationship; descendant relationship.

  11. Partial Path Queries. QUERY PROCESSING turns a partial path query into a partial path query in canonical form; QUERY EVALUATION then evaluates it. [Figure: the query before and after processing.]

  12. Evaluation of Partial Path Queries on XML Data. Outline: • Partial path queries • Query processing (next) • Query evaluation • Experiments • Conclusion

  13. Query Processing. • Full form • Satisfiability • Redundant nodes • Canonical form [Figure: query over nodes r, a, b, c, d.]
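To make the legend concrete, here is a minimal sketch of one way to represent a partial path query and to test it, by brute force, against a single root-to-leaf path. The class and function names are hypothetical. Two things are assumptions rather than statements from the slides: the optional root r is matched to the first node of the path, and two query nodes with the same label may map to the same data node.

```python
from dataclasses import dataclass, field
from itertools import product
from typing import Dict, List, Set, Tuple

@dataclass
class PartialPathQuery:
    labels: Dict[str, str]                                    # query node id -> label
    child: Set[Tuple[str, str]] = field(default_factory=set)  # (x, y) means x/y
    desc: Set[Tuple[str, str]] = field(default_factory=set)   # (x, y) means x//y
    root: str = ""                                            # id of optional root r, "" if absent

def embeddings(q: PartialPathQuery, path: List[str]) -> List[Dict[str, int]]:
    """Brute force: every map from query nodes to positions on `path` that respects all edges."""
    nodes = [n for n in q.labels if n != q.root]
    candidates = [[i for i, lab in enumerate(path) if lab == q.labels[n]] for n in nodes]
    found = []
    for combo in product(*candidates):
        pos = dict(zip(nodes, combo))
        if q.root:
            pos[q.root] = 0                    # assumption: r is the first node of the path
        ok_child = all(pos[y] == pos[x] + 1 for x, y in q.child)
        ok_desc = all(pos[y] > pos[x] for x, y in q.desc)
        if ok_child and ok_desc:
            found.append({n: pos[n] for n in nodes})
    return found

# a/b must be adjacent; c only has to lie somewhere below the root.
q = PartialPathQuery(
    labels={"a": "a", "b": "b", "c": "c"},
    child={("a", "b")},
    desc={("r", "a"), ("r", "c")},
    root="r",
)
print(embeddings(q, ["r", "a", "b", "c", "a"]))
print(embeddings(q, ["r", "c", "a", "b"]))
```

The same query matches whether c sits below or above the a/b pair, which is the point of leaving the structure partial.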

  14. Query Processing: full form. [Figure: query over nodes r, a, b, c, d; IR1 highlighted.] INFERENCE RULES
      (IR1) |- r//ai
      (IR2) x/y |- x//y
      (IR3) x//y, y//z |- x//z
      (IR4) x/ai, x//bj |- ai//bj
      (IR5) ai/x, bj//x |- bj//ai
      (IR6) x/y, y/w, x//z, z//w |- x/z
      (IR7) x/y, x//z, w/z, w//y |- x/z
      (IR8) x/y, y/w, x/z |- z/w
      (IR9) x//y, y//w, x/z |- z//w
      (IR10) x/y, w/y, w/z |- x/z
      (IR11) x//y, w/y, w//z |- x//z
      (IR12) x/y, y/w, z/w |- x/z
      (IR13) x//y, y//w, z/w |- x//z
      x, y, z, w: query nodes; ai, bj: nodes labelled by a and b.

  15. Query Processing: full form. Inference rules IR1 to IR13 as listed on slide 14; IR4 highlighted. [Figure: query over nodes r, a, b, c, d.]

  16. Query Processing: full form. Inference rules IR1 to IR13 as listed on slide 14; IR4 highlighted. [Figure: query over nodes r, a, b, c, d.]

  17. Query Processing: full form. Inference rules IR1 to IR13 as listed on slide 14; no rule highlighted. [Figure: query over nodes r, a, b, c, d.]
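Computing a full form amounts to applying the inference rules until nothing new can be derived. The sketch below does this for rules IR1 to IR5 only, as an illustration of the fixpoint idea and not as the authors' implementation; the function name is hypothetical, and reading "ai, bj" as nodes with different labels is an assumption.

```python
from typing import Dict, Set, Tuple

Edge = Tuple[str, str]

def full_form_ir1_to_ir5(labels: Dict[str, str], child: Set[Edge],
                         desc: Set[Edge], root: str = "r") -> Set[Edge]:
    """Return the descendant edges derivable with IR1..IR5 (IR6..IR13 are not covered)."""
    desc = set(desc)
    desc |= {(root, n) for n in labels if n != root}                      # IR1
    changed = True
    while changed:
        new = set(child)                                                  # IR2
        new |= {(x, z) for (x, y) in desc for (y2, z) in desc if y == y2} # IR3
        new |= {(a, b) for (x, a) in child for (x2, b) in desc            # IR4
                if x == x2 and labels[a] != labels[b]}
        new |= {(b, a) for (a, x) in child for (b, x2) in desc            # IR5
                if x == x2 and labels[a] != labels[b]}
        changed = not new <= desc
        desc |= new
    return desc

# Hypothetical query: x/a and x//b, plus the root r.
labels = {"r": "r", "x": "x", "a": "a", "b": "b"}
closure = full_form_ir1_to_ir5(labels, child={("x", "a")}, desc={("x", "b")})
print(sorted(closure))
```

Here IR4 contributes a//b: since a is the child of x on the path and b lies below x with a different label, b must lie below a.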

  18. Query Processing: satisfiability. A query is unsatisfiable if its full form contains a trivial cycle. [Figure: query over nodes r, a, b, c, d; a cycle between two nodes x and y.]
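A cycle test of this kind can be sketched as follows. The code takes the descendant edges of a full form, closes them transitively, and looks for a node that ends up below itself. Reading "trivial cycle" as x//y together with y//x is an assumption based on the figure, and the function name is hypothetical.

```python
from typing import Set, Tuple

def has_cycle(desc: Set[Tuple[str, str]]) -> bool:
    """True if the descendant edges force some node to be its own descendant."""
    closure = set(desc)
    while True:
        new = {(x, z) for (x, y) in closure for (y2, z) in closure if y == y2} - closure
        if not new:
            break
        closure |= new
    return any(x == y for x, y in closure)

print(has_cycle({("a", "b"), ("b", "a")}))   # a below b and b below a
print(has_cycle({("r", "a"), ("a", "b")}))   # a plain chain
```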

  19. Query Processing: redundant nodes. A node y is redundant if one of the following patterns occurs: a), b), c), d). [Figure: four patterns over nodes x, y, z; query over nodes r, a, b, c, d.]

  20. Query Processing: canonical form. Canonical form of a satisfiable query = full form – IR2 – IR3 – redundant nodes. The canonical form of a query is a directed acyclic graph (dag). [Figure: query over nodes r, a, b, c, d.]

  21. Evaluation of Partial Path Queries on XML Data. Outline: • Partial path queries • Query processing • Query evaluation (next) • Experiments • Conclusion
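One possible reading of "full form – IR2 – IR3" is that descendant edges which IR2 or IR3 could re-derive are dropped. The sketch below implements only that reading, which is an assumption; it does not remove redundant nodes, and it expects the descendant edges of a satisfiable full form (transitively closed, no cycles). The function name is hypothetical.

```python
from typing import Set, Tuple

Edge = Tuple[str, str]

def strip_derived_edges(child: Set[Edge], desc: Set[Edge]) -> Set[Edge]:
    """Keep only descendant edges that neither IR2 nor IR3 can derive from the others."""
    by_ir2 = set(child)                                                    # x/y gives x//y
    by_ir3 = {(x, z) for (x, y) in desc for (y2, z) in desc if y == y2}    # x//y, y//z gives x//z
    return set(desc) - by_ir2 - by_ir3

# Hypothetical full form of the query x/a, x//b with root r.
child = {("x", "a")}
desc = {("x", "b"), ("r", "x"), ("r", "a"), ("r", "b"), ("x", "a"), ("a", "b")}
print(sorted(strip_derived_edges(child, desc)))
```

For this input the remaining graph is the chain r//x, x/a, a//b, which is acyclic as the slide states.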

  22. Evaluation Algorithms
      • Based on PathStack [Bruno et al. ’02]
        • Produce all possible path queries…
        • Decompose into root-to-leaf paths…
        • PartialMJ: Decompose a spanning tree into paths…
      • Extending PathStack [Bruno et al. ’02]
        • PartialPathStack: Produce a topological order of the query nodes and extend PathStack to handle it…
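Since all four approaches build on PathStack, a simplified sketch of a PathStack-style evaluation may help. It is written from the general idea of the algorithm (one sorted input stream and one stack per query node, stack entries pointing into the parent stack) and is not the code of Bruno et al. or of the authors. Assumptions: the query is a non-empty chain q1//q2//…//qn with descendant edges only and pairwise distinct labels, and data nodes carry a (start, end) region encoding in which an ancestor's interval encloses its descendants' intervals. The names `Elem` and `path_stack` are hypothetical.

```python
from typing import Dict, List, NamedTuple, Tuple

class Elem(NamedTuple):
    label: str
    start: int   # region encoding: an ancestor's [start, end] encloses its descendants'
    end: int

def path_stack(query: List[str], streams: Dict[str, List[Elem]]) -> List[Tuple[Elem, ...]]:
    n = len(query)
    inputs = [streams.get(label, []) for label in query]   # one start-sorted stream per query node
    cursor = [0] * n
    # stacks[i] holds (element, index of the top of stacks[i-1] at push time)
    stacks: List[List[Tuple[Elem, int]]] = [[] for _ in range(n)]
    results: List[Tuple[Elem, ...]] = []

    def expand(i: int, top: int) -> List[Tuple[Elem, ...]]:
        """All matches of query[0..i] ending at an entry of stacks[i] with index <= top."""
        if i < 0:
            return [()]
        out: List[Tuple[Elem, ...]] = []
        for k in range(top + 1):
            elem, ptr = stacks[i][k]
            out.extend(prefix + (elem,) for prefix in expand(i - 1, ptr))
        return out

    while cursor[n - 1] < len(inputs[n - 1]):               # stop once the leaf stream is exhausted
        live = [i for i in range(n) if cursor[i] < len(inputs[i])]
        i = min(live, key=lambda j: inputs[j][cursor[j]].start)
        elem = inputs[i][cursor[i]]
        cursor[i] += 1
        for stack in stacks:                                 # drop entries that end before elem starts
            while stack and stack[-1][0].end < elem.start:
                stack.pop()
        ptr = len(stacks[i - 1]) - 1 if i > 0 else -1
        if i == n - 1:                                       # leaf: emit solutions, nothing to keep
            results.extend(prefix + (elem,) for prefix in expand(i - 1, ptr))
        else:
            stacks[i].append((elem, ptr))
    return results

# Hypothetical data: a1 contains b1 and c3; b1 contains c1 and c2.
streams = {
    "a": [Elem("a", 1, 12)],
    "b": [Elem("b", 2, 7)],
    "c": [Elem("c", 3, 4), Elem("c", 5, 6), Elem("c", 8, 9)],
}
matches = path_stack(["a", "b", "c"], streams)
print([[e.start for e in m] for m in matches])
```

The third c element lies under a but outside b, so it produces no match; the XML tree is read once, in document order.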

  23. Based on PathStack. 1. Producing all possible path queries… [Figure: query over nodes r, a, b, c, d, e, f, g and four path queries produced from it.]

  24. Based on PathStack. 1. Producing all possible path queries… [Figure: five path queries over the same nodes.]

  25. Based on PathStack. 1. Producing all possible path queries… Problems: • too many queries to evaluate • multiple traversals of the XML tree [Figure: query over nodes r, a, b, c, d, e, f, g.]
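The "too many queries" problem can be seen by enumerating every total order of the query nodes that respects the edges of the dag. This is a minimal sketch with a hypothetical function name; it only orders the nodes and leaves out the choice of child or descendant edges in each produced path query.

```python
from typing import Iterator, List, Tuple

def all_linearizations(nodes: List[str], edges: List[Tuple[str, str]]) -> Iterator[List[str]]:
    """Yield every ordering of `nodes` in which x precedes y for each edge (x, y)."""
    preds = {n: {x for x, y in edges if y == n} for n in nodes}

    def extend(prefix: List[str], remaining: frozenset) -> Iterator[List[str]]:
        if not remaining:
            yield list(prefix)
            return
        for n in sorted(remaining):
            if preds[n] <= set(prefix):          # all predecessors already placed
                yield from extend(prefix + [n], remaining - {n})

    yield from extend([], frozenset(nodes))

# Hypothetical dag: r above a, and a above three mutually unordered nodes.
orders = list(all_linearizations(
    ["r", "a", "b", "c", "d"],
    [("r", "a"), ("a", "b"), ("a", "c"), ("a", "d")],
))
print(len(orders))
```

Three unordered nodes already give 3! = 6 path queries, and each one would require its own pass over the XML tree.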

  26. Based on PathStack. 2. Decomposing into root-to-leaf paths… [Figure: four root-to-leaf paths over nodes r, a, b, c, d, e, f, g.]

  27. Based on PathStack. 2. Decomposing into root-to-leaf paths… Each path is evaluated with PathStack. [Figure: the four root-to-leaf paths.]

  28. Based on PathStack. 2. Decomposing into root-to-leaf paths… Problems: • path overlaps • more than one component to evaluate • intermediate results [Figure: the four root-to-leaf paths.]
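The decomposition step can be sketched as a walk over the dag that emits every path from the root to a node without outgoing edges. The function name and the sample dag are hypothetical; the sketch ignores edge types.

```python
from typing import Dict, Iterator, List, Tuple

def root_to_sink_paths(edges: List[Tuple[str, str]], root: str) -> List[List[str]]:
    """List every path from `root` to a node with no outgoing edge."""
    succ: Dict[str, List[str]] = {}
    for x, y in edges:
        succ.setdefault(x, []).append(y)

    def walk(node: str, prefix: List[str]) -> Iterator[List[str]]:
        prefix = prefix + [node]
        if node not in succ:
            yield prefix
        else:
            for nxt in sorted(succ[node]):
                yield from walk(nxt, prefix)

    return list(walk(root, []))

# Hypothetical dag in which b and c are unordered between a and d.
paths = root_to_sink_paths(
    [("r", "a"), ("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")], "r"
)
print(paths)
```

Both resulting paths share r, a and d, which illustrates the overlap problem: the shared nodes are matched once per path and the partial results still have to be combined.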

  29. Based on PathStack. PartialMJ. Using a spanning tree… Remove edges to create a spanning tree. [Figure: query over nodes r, a, b, c, d, e, f, g and its spanning tree.]

  30. Based on PathStack. PartialMJ. Using a spanning tree… [Figure: the spanning tree decomposed into four paths.]

  31. Based on PathStack. PartialMJ. Using a spanning tree… Each path is evaluated with PathStack. [Figure: the four paths.]

  32. Based on PathStack. PartialMJ. Using a spanning tree… Join conditions (identity, structural, path). [Figure: the four paths.]

  33. Based on PathStack. PartialMJ. Using a spanning tree… Join conditions (identity, structural, path). [Figure: the four paths.]

  34. Based on PathStack. PartialMJ. Using a spanning tree… Join conditions (identity, structural, path). [Figure: the four paths.]

  35. Based on PathStack. PartialMJ. Using a spanning tree… [Figure: the four paths.]

  36. Based on PathStack. PartialMJ. Using a spanning tree… Problems: • path overlaps • more than one component to evaluate • intermediate results [Figure: query over nodes r, a, b, c, d, e, f, g.]
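The spanning-tree step of PartialMJ can be sketched as keeping one incoming edge per query node and setting the other edges aside. This is an illustrative sketch, not the authors' procedure: which edge is kept (here, the first one listed) is an arbitrary choice, the function name is hypothetical, and it assumes the root is the only node without incoming edges. The set-aside edges presumably correspond to the join conditions mentioned on slides 32 to 34.

```python
from typing import List, Tuple

Edge = Tuple[str, str]

def spanning_tree(edges: List[Edge], root: str) -> Tuple[List[Edge], List[Edge]]:
    """Split dag edges into tree edges (one parent per node) and removed edges."""
    tree: List[Edge] = []
    removed: List[Edge] = []
    has_parent = {root}                  # the root never receives a parent
    for x, y in edges:
        if y in has_parent:
            removed.append((x, y))       # y already has a parent: set this edge aside
        else:
            has_parent.add(y)
            tree.append((x, y))
    return tree, removed

tree, removed = spanning_tree(
    [("r", "a"), ("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")], "r"
)
print(tree, removed)
```

The removed edge c→d has to be re-checked when the per-path results are joined, which is where the intermediate results come from.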

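  37. Extending PathStack. PartialPathStack. Employ a topological order… [Figure: query over nodes r, a, b, c, d, e, f, g and a topological order of its nodes.]

  38. Extending PathStack. PartialPathStack. Employ a topological order… The ordered query is evaluated with PartialPathStack. [Figure: the query and its topological order.]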
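A topological order of the query dag can be obtained with Python's standard `graphlib` module (Python 3.9 or later). The sketch below shows only this ordering step, not PartialPathStack itself; the function name and the sample dag are hypothetical, and any valid order may be returned.

```python
from graphlib import TopologicalSorter
from typing import List, Tuple

def topological_order(edges: List[Tuple[str, str]]) -> List[str]:
    """Return the query nodes so that x comes before y for every edge (x, y)."""
    sorter = TopologicalSorter()
    for x, y in edges:
        sorter.add(y, x)             # y depends on x, so x is emitted first
    return list(sorter.static_order())

edges = [("r", "a"), ("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
order = topological_order(edges)
print(order)
```

Unlike enumerating all path queries, a single order is enough here, so the query is evaluated in one pass.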

  39. PartialPathStack Example. [Figure: query over nodes r, a, b, c, d, e with marked sink nodes; XML tree with nodes r, a1, b1, d1, c1, e1, d2, c2, e2; stacks Sr, Sa, Sb, Sd, Sc, Se, all empty; results list.]

  40. PartialPathStack Example. Elements on the stacks: r.

  41. PartialPathStack Example. Elements on the stacks: r, a1.

  42. PartialPathStack Example. Elements on the stacks: r, a1, b1.

  43. PartialPathStack Example. Elements on the stacks: r, a1, b1, d1.

  44. PartialPathStack Example. Elements on the stacks: r, a1, b1, d1, c1.

  45. PartialPathStack Example. Elements on the stacks: r, a1, b1, d1, c1, e1. OUTPUT!

  46. PartialPathStack Example. Elements on the stacks: r, a1, b1, d1, c1, e1. OUTPUT!

  47. PartialPathStack Example. Elements on the stacks: r, a1, b1, d1, c1, e1. OUTPUT!

  48. PartialPathStack Example. Elements on the stacks: r, a1, b1, d1, c1, e1. OUTPUT!

  49. PartialPathStack Example. Elements on the stacks: r, a1, b1, d1, c1, e1. OUTPUT!

  50. PartialPathStack Example. Elements on the stacks: r, a1, b1, d1, c1, e1. OUTPUT! Results: r a1 b1 d1 c1 e1.
