
P2P Database Systems


reegan

Presentation Transcript


  1. P2P Database Systems

  2. P2P Databases: (1) Advertisement of the local data model, (2) Semantic Mapping, (3) Indexing, (4) P2P routing. [Figure: local data model tree with the elements bookstore, magazine, book, author, price, name, first-name, last-name, award; semantic mapping at the schema, element and data levels]
     Advertisement:
     <bookstore specialty="novel">
       <book style="autobiography">
         <author> <first-name>Joe</first-name> <last-name>Bob</last-name> <award>Trenton Literary</award> </author>
         <price currency="CAD">12</price>
       </book>
       <book style="textbook">
         <author> <first-name>Mary</first-name> <last-name>Bob</last-name> </author>
         <price>55</price>
       </book>
       <magazine> <name>Times</name> <price currency="USD">4</price> </magazine>
     </bookstore>
     XPath queries: //author[award] and /bookstore/book[author/last-name='Bob']
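
The two XPath queries on this slide can be tried against the advertisement with a few lines of Python. This is a minimal sketch using the standard library's `xml.etree.ElementTree`, which supports only a subset of XPath; the second query's nested predicate is therefore checked in plain Python rather than passed to the library. The variable names are illustrative.

```python
import xml.etree.ElementTree as ET

# The bookstore advertisement from the slide (attribute values quoted).
ADVERTISEMENT = """
<bookstore specialty="novel">
  <book style="autobiography">
    <author>
      <first-name>Joe</first-name>
      <last-name>Bob</last-name>
      <award>Trenton Literary</award>
    </author>
    <price currency="CAD">12</price>
  </book>
  <book style="textbook">
    <author>
      <first-name>Mary</first-name>
      <last-name>Bob</last-name>
    </author>
    <price>55</price>
  </book>
  <magazine>
    <name>Times</name>
    <price currency="USD">4</price>
  </magazine>
</bookstore>
"""

root = ET.fromstring(ADVERTISEMENT)

# //author[award]: every author element that has an <award> child.
awarded_authors = root.findall(".//author[award]")

# /bookstore/book[author/last-name='Bob']: the nested predicate is
# evaluated by hand instead of relying on ElementTree's XPath subset.
books_by_bob = [
    book for book in root.findall("book")
    if book.findtext("author/last-name") == "Bob"
]

awarded_names = [a.findtext("first-name") for a in awarded_authors]
bob_styles = [b.get("style") for b in books_by_bob]
```

The first query selects only Joe's author element, while the second selects both books, since both authors share the last name Bob.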

  3. RDF Peers A Scalable Distributed RDF Repository based on A Structured Peer-to-Peer Network

  4. Architecture

  5. MAAN Protocol (MAAN: Multi-Attribute Addressable Network) • Contains three classes of messages: • (a) Topology Maintenance – keeps the neighbor connections and routing tables correct; includes JOIN/LEAVE, KEEPALIVE and other network-structure-stabilizing messages • (b) STORE – inserts triples into the network • (c) SEARCH – visits the nodes where the triples in question are known to be stored, and returns the matched triples to the requesting node.

  6. RDF Triple Loader • Reads an RDF document, parses it into RDF triples, and uses MAAN’s STORE message to store the triples into the RDFPeers network. • Assigns each resource or literal a key using the SHA-1 hash function. • When an RDFPeer receives a STORE message, it stores the triples in its Local RDF Triple Storage component.
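
The loader's hashing and storing steps can be sketched as follows. This is an illustration under stated assumptions, not the RDFPeers implementation: the 16-bit identifier space, the node identifiers and the helper names (`ring_key`, `successor`, `store`) are made up for the example, and the sketch assumes each triple is indexed once per component (subject, predicate, object) at the node that succeeds the component's SHA-1 key on the ring.

```python
import hashlib
from bisect import bisect_left

RING_BITS = 16  # assumption: a tiny identifier space, for readability


def ring_key(value: str) -> int:
    """Map a resource or literal to a ring position via SHA-1."""
    digest = hashlib.sha1(value.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << RING_BITS)


def successor(node_ids, key):
    """First node id >= key on the sorted ring, wrapping around at the end."""
    i = bisect_left(node_ids, key)
    return node_ids[i % len(node_ids)]


def store(triple, node_ids, storage):
    """Simulate a STORE message: index the triple under each component."""
    for component in triple:
        node = successor(node_ids, ring_key(component))
        storage.setdefault(node, set()).add(triple)


# Hypothetical ring of four peers and one example triple.
nodes = sorted([100, 20000, 45000, 60000])
storage = {}
triple = ("<info:mincai>", "<foaf:name>", '"Example Name"')
store(triple, nodes, storage)
```

After the call, `storage` maps each responsible node identifier to the set of triples held in that node's local triple storage.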

  7. Storing RDF Triples

  8. Storing RDF Triples (Continued…)

  9. Native Query • The native query resolver parses native RDFPeers queries and uses MAAN’s SEARCH message to resolve them (using successor routing algorithm). • EXAMPLE: (<info:mincai>, <foaf:name>, ?name)

  10. Continued… • Atomic Triple Patterns
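
An atomic triple pattern such as the example above can be matched against a node's local triples as sketched below. The sketch covers only the local matching step, not MAAN's SEARCH routing; the function names and the sample triples are hypothetical.

```python
def match(pattern, triple):
    """Return the variable bindings if the triple matches, else None.

    Components starting with '?' are variables; all others must be equal.
    """
    bindings = {}
    for p, t in zip(pattern, triple):
        if p.startswith("?"):
            # A variable used twice must bind to the same value both times.
            if bindings.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return bindings


def resolve(pattern, local_triples):
    """Collect the bindings of every locally stored triple that matches."""
    return [b for t in local_triples if (b := match(pattern, t)) is not None]


# Hypothetical local triple storage of one node.
local_triples = [
    ("<info:mincai>", "<foaf:name>", '"Example Name"'),
    ("<info:mincai>", "<foaf:mbox>", "<mailto:someone@example.org>"),
]
answers = resolve(("<info:mincai>", "<foaf:name>", "?name"), local_triples)
```

Here `answers` holds a single binding of `?name` to the literal stored for `<foaf:name>`.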

  11. XPath Lookup Queries in P2P Networks

  12. DATA MODEL - XML fragments • A fragment is a subdocument of the original document. • Given a document D, a fragment of D is defined as a subtree of D and is identified by its absolute linear path.

  13. DATA MODEL - Identifier • The identifier of a fragment is the path from the document root to the fragment root, for example: • /site • /site/regions/namerica • /site/regions/namerica/item[1]/quantity • /site/regions/europe/item[1]/quantity

  14. DATA MODEL - Super and Child Fragments • Super fragment: the fragment that is the ancestor of the current fragment; its path expression is denoted ps. • Child fragments: the fragments referenced from the current fragment by elements labeled ‘sub’; their path expressions are denoted pc.

  15. DATA MODEL-Example • /site/regions/namerica • ps: /site (not …regions!) • pc: /site/regions/namerica/item[1]/quantity • pc: /site/regions/namerica/item[1]/descr
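
The example can be written down as a small record type. This is a sketch only: the class and field names are illustrative and not taken from XP2P.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Fragment:
    identifier: str                    # path from document root to fragment root
    super_path: Optional[str] = None   # ps: path of the super fragment
    child_paths: List[str] = field(default_factory=list)  # pc: 'sub' fragments


def is_ancestor_path(ancestor: str, path: str) -> bool:
    """True if `ancestor` is a proper prefix of `path` on a step boundary."""
    return path.startswith(ancestor.rstrip("/") + "/")


# The fragment from the slide.
namerica = Fragment(
    identifier="/site/regions/namerica",
    super_path="/site",
    child_paths=[
        "/site/regions/namerica/item[1]/quantity",
        "/site/regions/namerica/item[1]/descr",
    ],
)
```

The super fragment's path is an ancestor of the identifier, and the identifier is in turn an ancestor of every child path, which the helper makes easy to check.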

  16. XPath in DHT • Paths serve as identifiers: each fragment is accessed through its own path. • Super-fragment and child-fragment paths are stored at the local peer. • Which mechanism to use? A lightweight DHT implementation based on Chord, which: • maintains a list of successors at logarithmic distance • guarantees efficient access

  17. XP2P extension of Chord ring • XML fragments and XPaths are placed along the ring: • a pc (or ps) path expression is transformed into a node identifier Nx
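
How a path expression ends up at a node Nx on a Chord-style ring can be sketched as below. Several things here are assumptions made for the example: the 8-bit identifier space, the node identifiers, and the use of truncated SHA-1 as a stand-in key function (XP2P itself uses fingerprints rather than hashes). The finger table follows the standard Chord rule that entry i of node n points to the successor of n + 2^i.

```python
import hashlib
from bisect import bisect_left

M = 8  # assumption: 2**8 ring positions


def key_of(path: str) -> int:
    """Stand-in key function: SHA-1 reduced to the ring size."""
    digest = hashlib.sha1(path.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << M)


def successor(nodes, key):
    """Node responsible for `key`: first node id >= key, with wrap-around."""
    i = bisect_left(nodes, key % (1 << M))
    return nodes[i % len(nodes)]


def finger_table(nodes, n):
    """Successors of n + 2**i for i in 0..M-1 (log-distance pointers)."""
    return [successor(nodes, (n + (1 << i)) % (1 << M)) for i in range(M)]


nodes = [1, 32, 90, 150, 200]           # hypothetical sorted node ids
nx = successor(nodes, key_of("/site/regions/namerica"))
fingers_of_32 = finger_table(nodes, 32)
```

The finger table is what lets a lookup halve the remaining ring distance at each hop instead of walking node by node.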

  18. Fingerprinting path expressions • Hash functions (e.g., SHA-1) would work, but XP2P uses a more suitable solution: instead of hashing path expressions, it reduces them to shorter fingerprints. • Two main advantages: • Concatenation property • Authenticity of data content

  19. FINGERPRINTING PATH EXPRESSION (due to Michael Rabin) • Let $A = (a_1, a_2, \ldots, a_m)$ be a binary string, read as the polynomial $A(t) = a_1 t^{m-1} + a_2 t^{m-2} + \cdots + a_m$ over $\mathbb{Z}_2$. • Let $P(t)$ be an irreducible polynomial. • The fingerprint of $A$ is $f(A) = A(t) \bmod P(t)$.

  20. FINGERPRINTING PATH EXPRESSION • Concatenation property: $f(\mathrm{concat}(A, B)) = f(\mathrm{concat}(f(A), B))$. • The fingerprinting polynomial is the key: • Degree of 64 • Acceptable collision probability of $2^{-10}$ • Path expressions of 50 steps • Maximum $2^{30}$ fragments
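
The fingerprint definition and the concatenation property can be checked with a small sketch. Python integers serve as bit vectors of polynomial coefficients over GF(2). For brevity the sketch uses the degree-8 irreducible polynomial t^8 + t^4 + t^3 + t + 1 rather than a degree-64 one as on the slide; the function names are illustrative.

```python
P = 0x11B  # t^8 + t^4 + t^3 + t + 1, irreducible over GF(2) (demo size only)


def poly_mod(a: int, p: int) -> int:
    """Remainder of polynomial a modulo p over GF(2) (XOR-based long division)."""
    deg_p = p.bit_length() - 1
    while a.bit_length() - 1 >= deg_p:
        # Cancel the leading term of a with a shifted copy of p.
        a ^= p << (a.bit_length() - 1 - deg_p)
    return a


def to_poly(data: bytes) -> int:
    """Read a byte string as the polynomial A(t), most significant bit first."""
    return int.from_bytes(data, "big")


def fingerprint(data: bytes) -> int:
    """f(A) = A(t) mod P(t)."""
    return poly_mod(to_poly(data), P)


def fingerprint_concat(fp_a: int, b: bytes) -> int:
    """f(concat(f(A), B)): shift f(A) left by |B| bits, append B, reduce."""
    return poly_mod((fp_a << (8 * len(b))) ^ to_poly(b), P)


prefix, step = b"/site/regions", b"/namerica"
direct = fingerprint(prefix + step)
incremental = fingerprint_concat(fingerprint(prefix), step)
```

Both values agree, so the fingerprint of a longer path can be derived from the fingerprint of its prefix plus the appended steps, without re-reading the prefix.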

  21. FINGERPRINTING PATH EXPRESSION • An extension of Chord is used, with fingerprinting instead of hashing. • Each peer stores minimal access information: • the fingerprint of its own identifier • the fingerprint of the super-fragment path ps • a list of fingerprints of the path expressions pc of the external sub-fragments

  22. Fragment Lookup - partial lookup • The fragment is returned as soon as the node holding the XPath identifier is found. • Sub tags are not followed.

  23. Fragment Lookup – full lookup • The fragment is returned once the full fragment has been retrieved by following its sub tags.
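
The difference between the two lookups can be sketched with a toy network. Everything here is hypothetical: the `NETWORK` dictionary stands in for the peers, and the `<sub path="..."/>` placeholder is an assumed representation of the sub tag, which the slides do not spell out.

```python
import xml.etree.ElementTree as ET

# Hypothetical network content: fragment identifier -> XML held by some peer.
NETWORK = {
    "/site": '<site><regions><sub path="/site/regions/namerica"/></regions></site>',
    "/site/regions/namerica":
        "<namerica><item><quantity>1</quantity></item></namerica>",
}


def partial_lookup(path):
    """Return the fragment as stored, leaving sub placeholders untouched."""
    return ET.fromstring(NETWORK[path])


def full_lookup(path):
    """Return the fragment with every sub placeholder replaced by its content."""
    root = partial_lookup(path)
    for parent in list(root.iter()):            # snapshot before modifying
        for i, child in enumerate(list(parent)):
            if child.tag == "sub":
                parent[i] = full_lookup(child.get("path"))
    return root


partial = partial_lookup("/site")
full = full_lookup("/site")
```

The partial result still contains the placeholder, while the full result has the `namerica` subtree spliced in, at the cost of one further lookup per sub tag.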

  24. XPath Expression Lookup (child axis only) • Full match attempt: • 1. Fingerprint the XPath expression into the Chord ring. • 2. Look up the exact match. • a) Exact match found: the algorithm stops. • b) Not found: go to the next step. • Partial match (bottom-up steps): • 3. Prune one step from the path expression. • 4. Check for a match. If none is found, go back to step 3. • Partial match (top-down steps): • 5. Analyze the local content for a match. If there is no match, proceed to the sub-fragments in top-down fashion.
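
The five steps can be sketched as one function over a toy model. This is a loose illustration, not XP2P's algorithm as published: a plain set stands in for the Chord ring keyed by fingerprints, the `Peer` record and its fields are invented for the example, and one fragment is deliberately left out of the index so that the top-down branch is exercised.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    identifier: str                                    # fragment path
    local_paths: set = field(default_factory=set)      # paths answerable locally
    sub_fragments: list = field(default_factory=list)  # identifiers of sub-fragments


def prune_last_step(path):
    """Drop the last step of a path; '/site' becomes the empty string."""
    head, _, _ = path.rpartition("/")
    return head


def lookup(path, index, peers):
    # Steps 1-2: exact match attempt.
    if path in index:
        return ("exact", path)
    # Steps 3-4: bottom-up, prune steps until some prefix is indexed.
    candidate = prune_last_step(path)
    while candidate and candidate not in index:
        candidate = prune_last_step(candidate)
    if not candidate:
        return ("not found", None)
    # Step 5: top-down, check local content, else descend into a sub-fragment.
    current = candidate
    while True:
        peer = peers[current]
        if path in peer.local_paths:
            return ("partial", current)
        nxt = next((s for s in peer.sub_fragments
                    if path == s or path.startswith(s + "/")), None)
        if nxt is None:
            return ("not found", None)
        current = nxt


NA = "/site/regions/namerica"
peers = {
    "/site": Peer("/site", {"/site", "/site/regions"}, [NA]),
    NA: Peer(NA, {NA, NA + "/item[1]"}, [NA + "/item[1]/quantity"]),
    NA + "/item[1]/quantity": Peer(NA + "/item[1]/quantity",
                                   {NA + "/item[1]/quantity"}),
}
index = {"/site", NA}  # the quantity fragment is intentionally not indexed
```

With this data, a query for the `namerica` fragment is an exact match, a query for `item[1]` is answered from the local content of the `namerica` peer after pruning one step, and a query for the `quantity` fragment is reached by pruning two steps and then following a sub-fragment reference.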

  25. Two way navigation [Figure: bottom-up steps (bu1, bu2) and top-down steps (td1, td2) for a path expression over the steps professor, personalData[1], private[2] and pictures, with the fragments /professor and /professor/personalData[1]]

  26. XPath Expression Lookup (containing descendant axis //) • Motivation: • Cannot be solved by fingerprinting alone. • Exhaustive search is not a feasible solution. • Idea: • Lookup is done in top-down fashion, which yields fewer intermediate results than bottom-up. • Use the peer's sub-fragment information for early path detection.

  27. XPath Expression Lookup (containing descendant axis //) • Query: /s1/s2/../si//sj/sk/../sn-1/sn, where /s1/s2/../si is the linear path expression and //sj/sk/../sn-1/sn is the part with the descendant axis. • A) The linear path expression is solved with a composition of exact match and partial match lookups. [Finding the context node] • B) The part with the descendant axis is solved with the optimistic step-wise algorithm. [Starting from the context node]
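
Splitting a query into parts A and B amounts to cutting it at the first descendant axis. A minimal sketch, with an illustrative function name:

```python
def split_query(query):
    """Split at the first '//' into (linear part, descendant part or None)."""
    linear, sep, rest = query.partition("//")
    if not sep:
        return query, None          # no descendant axis: purely linear query
    return linear, "//" + rest


linear_part, descendant_part = split_query("/s1/s2//sj/sk")
```

The linear part identifies the context node through exact and partial match lookups; the descendant part is then evaluated from that node.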

  28. Optimistic step-wise algorithm • At an arbitrary peer, look for sj in the local fragment and in the related path expressions. • sj can be found in the following locations: • Contained in the fragment. [It can be retrieved and the evaluation proceeds.] • An intermediate step of a related path expression. [sj is already evaluated.] • The last step of a related path expression. [A promising path expression, a new direction to explore.] • Not in any of the related path expressions. [There may be a new direction to explore through sub.]
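
The four cases can be sketched as a classification function. The names, the exact-string comparison of steps and the priority order (the order of the list above) are assumptions made for the example.

```python
from enum import Enum


class Location(Enum):
    IN_FRAGMENT = "contained in fragment"
    INTERMEDIATE_STEP = "intermediate step of a related path expression"
    LAST_STEP = "last step of a related path expression"
    NOT_FOUND = "not in any related path expression"


def steps(path):
    """Split a path expression into its steps, e.g. '/a/b' -> ['a', 'b']."""
    return [s for s in path.split("/") if s]


def classify(step, local_tags, related_paths):
    """Report where `step` occurs at this peer, in the order of the cases."""
    if step in local_tags:
        return Location.IN_FRAGMENT
    split = [steps(p) for p in related_paths]
    if any(step in s[:-1] for s in split):
        return Location.INTERMEDIATE_STEP
    if any(s and s[-1] == step for s in split):
        return Location.LAST_STEP
    return Location.NOT_FOUND


# Hypothetical peer: tags in its local fragment, plus its ps and pc paths.
local_tags = {"item", "quantity"}
related = ["/site", "/site/regions/namerica/item[1]/descr"]
```

For this peer, `quantity` is found locally, `regions` is an intermediate step of a related path, `descr` is the last step of one, and `people` appears nowhere.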

  29. Yet another taxonomy

  30. Distributed Search Mechanism: Components (schema) • Static: one implicit schema, globally known. P2P content sharing: Gnutella, KaZaA etc. • Quasi-static: several schemas, occasionally created, administratively scoped. Service discovery: Jini, SLP, Salutation • Dynamic: heterogeneous schemas, user scoped, semantic mapping needed. PDBS: PeerDB, XP2P, RDFPeers etc.

  31. Distributed Search Mechanism: Components (expressiveness) • Exact keyword: DHT-based P2P content sharing. E.g., CFS, eMule • Partial keyword: unstructured P2P. E.g., Gnutella, FastTrack • Property-value list: most service discovery protocols (SDPs). E.g., Jini, Salutation, UPnP • Complex queries: PDBSs and some SDPs; hierarchical and relational operators, and ranges. E.g., PeerDB, XP2P, SLP, Twine

  32. Distributed Search Mechanism: Components (translation and routing) • Flat → Content-routing: preserves semantic info in the query for use in routing decisions at each hop. E.g., semi-structured P2P and industrial SDPs • Hash → Address-routing: hashing loses semantic info; maps a key to the address of the target. E.g., DHT techniques, SkipNet • Hash-summary → Signature-routing: query semantics are preserved; Bloom-filter based, with lossy aggregation. E.g., SSDS, NSS, DPMS, PLR

  33. Distributed Search Mechanism: Components (summary) • Routing: Content-routing, Address-routing, Signature-routing • Translation: Flat, Hash, Hash-summary • Schema: Static, Quasi-static, Dynamic • Expressiveness: Exact keyword, Partial keyword, Property-value list, Complex queries • Topology and Index: Central, Partially Decentralized, Pure Decentralized; Unstructured, Semi-structured, Structured
