
gStore: Answering SPARQL Queries Via Subgraph Matching

Lei Zou 1, Jinghui Mo 1, Lei Chen 2, M. Tamer Özsu 3, Dongyan Zhao 1. 1 Peking University, 2 Hong Kong University of Science and Technology, 3 University of Waterloo.


Presentation Transcript


  1. gStore: Answering SPARQL Queries Via Subgraph Matching. Lei Zou 1, Jinghui Mo 1, Lei Chen 2, M. Tamer Özsu 3, Dongyan Zhao 1. 1 Peking University, 2 Hong Kong University of Science and Technology, 3 University of Waterloo.

  2. Outline: Background & Related Work; Overview of gStore; Encoding Technique; VS*-tree & Query Algorithm; Experiments; Conclusions

  3. Outline: Background & Related Work; Overview of gStore; Encoding Technique; VS*-tree & Query Algorithm; Experiments; Conclusions

  4. Semantic Web “Semantic Web Technologies” is a collection of standard technologies to realize a Web of Data.

  5. RDF Data Model (figure labels: URI, Literals, URI)

  6. RDF Graph (figure labels: Literal Vertex, Entity Vertex)

  7. SPARQL Queries. SPARQL Query: Select ?name Where { ?m <hasName> ?name. ?m <BornOnDate> "1809-02-12". ?m <DiedOnDate> "1865-04-15". } (figure: the corresponding query graph)

  8. Subgraph Match vs. SPARQL Queries
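
The correspondence between a SPARQL query and subgraph matching can be illustrated with a small sketch. It treats the three triple patterns of the example query as a query graph and searches a toy RDF graph for variable bindings that embed it. The entity identifiers (y:Abraham_Lincoln, y:Charles_Darwin) and the helper names are assumptions made for illustration; this is a brute-force matcher, not the gStore algorithm.

```python
# Toy RDF graph as (subject, predicate, object) triples.
# The entity identifiers are assumed for illustration.
TRIPLES = [
    ("y:Abraham_Lincoln", "hasName", "Abraham Lincoln"),
    ("y:Abraham_Lincoln", "BornOnDate", "1809-02-12"),
    ("y:Abraham_Lincoln", "DiedOnDate", "1865-04-15"),
    ("y:Charles_Darwin", "hasName", "Charles Darwin"),
    ("y:Charles_Darwin", "BornOnDate", "1809-02-12"),
    ("y:Charles_Darwin", "DiedOnDate", "1882-04-19"),
]

# The example query as a list of triple patterns (edges of the query graph).
QUERY = [
    ("?m", "hasName", "?name"),
    ("?m", "BornOnDate", "1809-02-12"),
    ("?m", "DiedOnDate", "1865-04-15"),
]


def is_var(term):
    """Variables start with '?', as in SPARQL."""
    return term.startswith("?")


def unify(pattern, triple, binding):
    """Extend `binding` so that `pattern` maps onto `triple`, or return None."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            # A variable must keep the value it was bound to earlier.
            if new.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return new


def match(query, triples, binding=None):
    """Yield every binding that maps all query edges onto data edges."""
    binding = {} if binding is None else binding
    if not query:
        yield binding
        return
    first, rest = query[0], query[1:]
    for triple in triples:
        extended = unify(first, triple, binding)
        if extended is not None:
            yield from match(rest, triples, extended)


names = [b["?name"] for b in match(QUERY, TRIPLES)]
```

Each result binds ?m to a data vertex whose neighborhood contains all three query edges, which is the sense in which answering the query amounts to finding matches of the query graph.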

  9. Naïve Triple Store SPARQL Query: Select ?name Where { ?m <hasName> ?name. ?m <BornOnDate> "1809-02-12". ?m <DiedOnDate> "1865-04-15". } Too many self-joins. SQL: Select T3.Object From T as T1, T as T2, T as T3 Where T1.Predicate = "BornOnDate" and T1.Object = "1809-02-12" and T2.Predicate = "DiedOnDate" and T2.Object = "1865-04-15" and T3.Predicate = "hasName" and T1.Subject = T2.Subject and T2.Subject = T3.Subject
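
The self-join cost of a single triple table can be seen in a minimal runnable sketch using Python's built-in sqlite3 module. The table name T follows the slide; the column names Subject, Predicate, and Object, the sample rows, and the function name are assumptions. The query selects the Object of the hasName row, since that is where the name is stored.

```python
import sqlite3


def names_via_self_join():
    """Answer the example query over one three-column triple table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE T (Subject TEXT, Predicate TEXT, Object TEXT)")
    conn.executemany(
        "INSERT INTO T VALUES (?, ?, ?)",
        [
            # Entity identifiers are assumed for illustration.
            ("y:Abraham_Lincoln", "hasName", "Abraham Lincoln"),
            ("y:Abraham_Lincoln", "BornOnDate", "1809-02-12"),
            ("y:Abraham_Lincoln", "DiedOnDate", "1865-04-15"),
            ("y:Charles_Darwin", "hasName", "Charles Darwin"),
            ("y:Charles_Darwin", "BornOnDate", "1809-02-12"),
            ("y:Charles_Darwin", "DiedOnDate", "1882-04-19"),
        ],
    )
    # One copy of T per triple pattern: three patterns need two self-joins.
    rows = conn.execute(
        """
        SELECT T3.Object
        FROM T AS T1, T AS T2, T AS T3
        WHERE T1.Predicate = 'BornOnDate' AND T1.Object = '1809-02-12'
          AND T2.Predicate = 'DiedOnDate' AND T2.Object = '1865-04-15'
          AND T3.Predicate = 'hasName'
          AND T1.Subject = T2.Subject
          AND T2.Subject = T3.Subject
        """
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]
```

Every additional triple pattern in the SPARQL query adds another alias of T to the FROM clause, which is why the number of self-joins grows with query size.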

  10. Existing Solutions Three categories of solutions are proposed to speed up query processing: 1. Property Table: Jena [K. Wilkinson et al. SWDB 03], … 2. Vertically Partitioned Solution: SW-store [D. J. Abadi et al. VLDB 07], … 3. Exhaustive Indexing: RDF-3X [T. Neumann et al. VLDB 08], Hexastore [C. Weiss et al. VLDB 08], …

  11. Existing Solutions: Property Table. SPARQL Query: Select ?name Where { ?m <hasName> ?name. ?m <BornOnDate> "1809-02-12". ?m <DiedOnDate> "1865-04-15". } Reducing the number of join steps. SQL: Select People.hasName From People Where People.BornOnDate = "1809-02-12" and People.DiedOnDate = "1865-04-15"
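
A property table can be sketched the same way with sqlite3. The table name People and its property columns come from the slide's SQL; the Subject column, the sample rows, and the function name are assumptions.

```python
import sqlite3


def names_via_property_table():
    """Answer the example query over a one-row-per-entity property table."""
    conn = sqlite3.connect(":memory:")
    # Each frequently co-occurring property becomes its own column.
    conn.execute(
        "CREATE TABLE People "
        "(Subject TEXT, hasName TEXT, BornOnDate TEXT, DiedOnDate TEXT)"
    )
    conn.executemany(
        "INSERT INTO People VALUES (?, ?, ?, ?)",
        [
            # Entity identifiers are assumed for illustration.
            ("y:Abraham_Lincoln", "Abraham Lincoln", "1809-02-12", "1865-04-15"),
            ("y:Charles_Darwin", "Charles Darwin", "1809-02-12", "1882-04-19"),
        ],
    )
    # All three triple patterns are answered from a single row: no join at all.
    rows = conn.execute(
        "SELECT hasName FROM People "
        "WHERE BornOnDate = '1809-02-12' AND DiedOnDate = '1865-04-15'"
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]
```

The join disappears only because all three properties share one subject and live in one table; patterns that span several entities would still need joins.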

  12. Existing Solutions: Vertically Partitioned Solution. Fast merge join.
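
A rough sketch of why vertical partitioning allows fast merge joins: each predicate gets its own two-column table sorted by subject, so subject lists from different tables can be intersected in one linear pass. The function names and sample data are assumptions, and the sketch assumes each subject appears at most once per selected list.

```python
# Entity identifiers are assumed for illustration.
TRIPLES = [
    ("y:Charles_Darwin", "hasName", "Charles Darwin"),
    ("y:Abraham_Lincoln", "hasName", "Abraham Lincoln"),
    ("y:Charles_Darwin", "BornOnDate", "1809-02-12"),
    ("y:Abraham_Lincoln", "BornOnDate", "1809-02-12"),
    ("y:Abraham_Lincoln", "DiedOnDate", "1865-04-15"),
    ("y:Charles_Darwin", "DiedOnDate", "1882-04-19"),
]


def partition(triples):
    """Build one (subject, object) table per predicate, sorted by subject."""
    tables = {}
    for s, p, o in triples:
        tables.setdefault(p, []).append((s, o))
    for rows in tables.values():
        rows.sort()
    return tables


def subjects_where(table, obj):
    """Subjects whose object equals `obj`; the result stays sorted."""
    return [s for s, o in table if o == obj]


def merge_join(left, right):
    """Intersect two ascending subject lists in a single linear pass."""
    i = j = 0
    out = []
    while i < len(left) and j < len(right):
        if left[i] == right[j]:
            out.append(left[i])
            i += 1
            j += 1
        elif left[i] < right[j]:
            i += 1
        else:
            j += 1
    return out


tables = partition(TRIPLES)
born = subjects_where(tables["BornOnDate"], "1809-02-12")
died = subjects_where(tables["DiedOnDate"], "1865-04-15")
matched = merge_join(born, died)
names = [o for s, o in tables["hasName"] if s in matched]
```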

  13. Existing Solutions: Exhaustive Indexing. Range query & merge join. Each SPARQL query statement can be translated into one "range query". SPARQL Query: Select ?name Where { ?m <hasName> ?name. ?m <BornOnDate> "1809-02-12". ?m <DiedOnDate> "1865-04-15". }
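
The idea that one triple pattern becomes one range query can be sketched with a sorted list and binary search. The sketch keeps a single permutation ordered by (predicate, object, subject); exhaustive-indexing systems maintain several such orderings. The function names, the sample data, and the sentinel string used as an upper bound are assumptions.

```python
from bisect import bisect_left, bisect_right

# Entity identifiers are assumed for illustration.
TRIPLES = [
    ("y:Abraham_Lincoln", "hasName", "Abraham Lincoln"),
    ("y:Abraham_Lincoln", "BornOnDate", "1809-02-12"),
    ("y:Abraham_Lincoln", "DiedOnDate", "1865-04-15"),
    ("y:Charles_Darwin", "hasName", "Charles Darwin"),
    ("y:Charles_Darwin", "BornOnDate", "1809-02-12"),
    ("y:Charles_Darwin", "DiedOnDate", "1882-04-19"),
]


def build_pos_index(triples):
    """Sort triples in (predicate, object, subject) order."""
    return sorted((p, o, s) for s, p, o in triples)


def subjects_with(index, predicate, obj):
    """Range query: all subjects for a fixed predicate and object.

    Matching entries are contiguous in the sorted index, so two binary
    searches locate the range. The upper sentinel assumes that no subject
    sorts after the string "\uffff".
    """
    lo = bisect_left(index, (predicate, obj, ""))
    hi = bisect_right(index, (predicate, obj, "\uffff"))
    return [s for _, _, s in index[lo:hi]]


index = build_pos_index(TRIPLES)
born = subjects_with(index, "BornOnDate", "1809-02-12")
died = subjects_with(index, "DiedOnDate", "1865-04-15")
```

Because each range comes back sorted by subject, the per-pattern results are ready for a merge join on ?m.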

  14. Some Limitations Difficult to handle "wildcard queries". Difficult to handle updates.

  15. Outline: Background & Related Work; Overview of gStore; Encoding Technique; VS*-tree & Query Algorithm; Experiments; Conclusions

  16. Intuition of gStore Finding Matches over a Large Graph is not a trivial task.

  17. Preliminaries (figure labels: Literal Vertex, Entity Vertex)

  18. Storage Schema in gStore Encoding all neighbors of a vertex into a "bit-string", called a signature.

  19. Encoding Technique (1) A literal is split into n-grams ("Abr", "bra", "rah", "aha", …), and each adjacent edge is encoded as a bit string: (hasName, "Abraham Lincoln"), (BornOnDate, "1809-02-12"), (DiedOnDate, "1865-04-15"), (DiedIn, "y:Washington_D.c"). The edge bit strings are combined with bitwise OR. (Figure: example bit strings for each edge and the OR-ed result.)
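
A minimal sketch of this kind of signature encoding follows. The bit widths, the use of CRC32 as the hash function, one hash bit per token, and all function names are assumptions chosen for illustration; the slides do not specify them.

```python
import zlib

PRED_BITS = 16  # assumed width of the predicate part
LIT_BITS = 32   # assumed width of the literal part


def ngrams(text, n=3):
    """All substrings of length n, e.g. 'Abr', 'bra', 'rah', 'aha', ..."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def hash_bit(token, width):
    """Map a token to a single set bit (CRC32 is an assumed stand-in hash)."""
    return 1 << (zlib.crc32(token.encode("utf-8")) % width)


def edge_signature(predicate, literal):
    """Bit string for one (predicate, literal) edge."""
    lit_part = 0
    for gram in ngrams(literal):
        lit_part |= hash_bit(gram, LIT_BITS)
    return (hash_bit(predicate, PRED_BITS) << LIT_BITS) | lit_part


def vertex_signature(edges):
    """OR together the signatures of all edges adjacent to a vertex."""
    sig = 0
    for predicate, literal in edges:
        sig |= edge_signature(predicate, literal)
    return sig


lincoln = vertex_signature([
    ("hasName", "Abraham Lincoln"),
    ("BornOnDate", "1809-02-12"),
    ("DiedOnDate", "1865-04-15"),
    ("DiedIn", "y:Washington_D.c"),
])

# A query edge is encoded the same way; a data vertex can only match if its
# signature covers every bit of the query signature.
query = edge_signature("BornOnDate", "1809-02-12")
covers = (lincoln & query) == query
```

Since the n-grams of a substring such as "Abraham" are a subset of the n-grams of "Abraham Lincoln", the same covering test also works for partial literals, which is relevant to wildcard queries. The test can produce false positives but no false negatives, so it serves as a filter before exact verification.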

  20. Encoding Technique (2)

  21. Encoding Technique (3)

  22. Outline: Background & Related Work; Overview of gStore; Encoding Technique; VS-tree & Query Algorithm; Experiments; Conclusions

  23. A Straightforward Solution (1) (figure labels: u1, u2, L1, L2)

  24. A Straightforward Solution (2) (figure labels: L1, L2) Large join space!
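
One reading of the straightforward solution is: scan all vertex signatures to build a candidate list per query vertex, then join the lists on the data graph's edges. The sketch below follows that reading with hand-picked 4-bit signatures; the data, the vertex names, and the function names are assumptions.

```python
def candidates(query_sig, vertex_sigs):
    """Vertices whose signature covers every bit of the query signature."""
    return [v for v, sig in vertex_sigs.items() if (sig & query_sig) == query_sig]


def join(list1, list2, edges):
    """Keep only candidate pairs that are connected in the data graph."""
    return [(a, b) for a in list1 for b in list2 if (a, b) in edges]


# Hand-picked signatures for four data vertices (assumed values).
VERTEX_SIGS = {"v1": 0b1011, "v2": 0b0110, "v3": 0b1111, "v4": 0b0100}
EDGES = {("v1", "v2"), ("v3", "v4")}

L1 = candidates(0b1010, VERTEX_SIGS)  # candidate list for query vertex u1
L2 = candidates(0b0100, VERTEX_SIGS)  # candidate list for query vertex u2
matches = join(L1, L2, EDGES)
```

The join inspects len(L1) * len(L2) pairs, so long candidate lists make the join space large; this is the cost that an index with pruning aims to reduce.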

  25. VS-tree

  26. Pruning Technique Reduced join space! (figure labels: u1, u2, 10010)

  27. An Example for Pruning Effect Query: ?x1 y:hasGivenName ?x5 . ?x1 y:hasFamilyName ?x6 . ?x1 rdf:type <wordnet_scientist_110560637> . ?x1 y:bornIn ?x2 . ?x1 y:hasAcademicAdvisor ?x4 . ?x2 y:locatedIn <Switzerland> . ?x3 y:locatedIn <Germany> . ?x4 y:bornIn ?x3

  28. Query Algorithm-Top-Down
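
A top-down search over a signature tree can be sketched as follows. This is a simplified illustration that covers only signature-based pruning of subtrees: each internal node stores the OR of its children's signatures, and a subtree is skipped when its signature does not cover the query signature. It does not model any other part of the VS-tree, and the class, the function, and the sample signatures are assumptions.

```python
class Node:
    """Tree node; an internal node's signature is the OR of its children's."""

    def __init__(self, children=None, vertex=None, sig=0):
        self.children = children or []
        self.vertex = vertex
        if self.children:
            sig = 0
            for child in self.children:
                sig |= child.sig
        self.sig = sig


def search(node, query_sig):
    """Return leaf vertices whose signature covers query_sig, top-down."""
    if (node.sig & query_sig) != query_sig:
        return []  # no vertex below this node can match: prune the subtree
    if not node.children:
        return [node.vertex]
    found = []
    for child in node.children:
        found.extend(search(child, query_sig))
    return found


# Four leaves with hand-picked signatures (assumed values).
a = Node(vertex="a", sig=0b1011)
b = Node(vertex="b", sig=0b0011)
c = Node(vertex="c", sig=0b0100)
d = Node(vertex="d", sig=0b0110)
left = Node(children=[a, b])    # signature 0b1011
right = Node(children=[c, d])   # signature 0b0110
root = Node(children=[left, right])

result = search(root, 0b1010)
```

For the query signature 0b1010 the right subtree is rejected at its root, so leaves c and d are never inspected.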

  29. Outline: Background & Related Work; Overview of gStore; Encoding Technique; VS*-tree & Query Algorithm; Experiments; Conclusions

  30. Datasets

  31. Exact Queries

  32. Wildcard Queries

  33. Outline: Background & Related Work; Overview of gStore; Encoding Technique; VS*-tree & Query Algorithm; Experiments; Conclusions

  34. Conclusions Vertex Encoding Technique; An Efficient Index Structure: VS-tree; A Novel Filtering Technique.

  35. Q/A Thank You! zoulei@pku.edu.cn

  36. Updates- Insertion in G*

  37. Updates- Insertion in VS*-tree

  38. Updates- Deletion in VS*-tree To be deleted

  39. Framework in gStore

  40. A Straightforward Solution (1) u u & 001 = u
