1 / 11

Semantic Overlay Networks for P2P Systems

Semantic Overlay Networks for P2P Systems. Arturo Crespo and Hector Garcia-Molina. Motivation and Related Work. Partition the P2P network into several thematic networks Queries for a content will not reach nodes without such content

Download Presentation

Semantic Overlay Networks for P2P Systems

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Semantic Overlay Networksfor P2P Systems Arturo Crespo and Hector Garcia-Molina

  2. Motivation and Related Work • Partition the P2P network into several thematic networks • Queries for a content will not reach nodes without such content • Flooding in smaller networks with smaller TTL (or more results with same) • Edutella, Hypercup: peers with similar content connect to the same SuperPeer

  3. Node partitioning • When does a node belong to SON A? • When it contains a piece of type A • When it contains more than x pieces of type A • Less nodes per SON=>more results sooner • Less SONs per node=>less connections • As in DBFS, coverage problem

  4. SON Classification • Classification must provide: • Load-balance • Each category has similar number of nodes • Each node belongs to a small number of categories • Easy and accurate way to classify a document

  5. Statistics • Classification from “All Music Guide” based on music style (26 styles without and 255 styles with substyles) • Traces from 1800 Napster nodes • Classifications on decade and tone provide worse results

  6. Statistics of styles • A node belongs to a style if it has 1 document • 24% of the nodes belong to one style • 90% nodes belong to up to eight • 14 styles with 200 to 400 nodes • 2 styles with 1000 to 1200 nodes • 1 style with 1600 to 1800 nodes

  7. Statistics of substyles • 18% of the nodes belong to one substyle • 90% nodes belong to up to 30 styles • 87% of substyles with less than 400 nodes

  8. Statistics of document classification • Classification based on the “All Music Guide” database • Reasons of failure: • Filename format • All documents not songs • Misspellings • Database not complete • 25% of documents classified incorrectly • 4% of nodes classified incorrectly

  9. Layered SONs • Styles and substyles • Minimum required percentage (or number) of documents of type A to join SON A • May not be able to join a substyle but may join a style • Nodes belonging to a substyle, do not join the parent style

  10. Searching • A document is classified to belong to (sub)style A • Search all substyles of style A SONs • Search (sub)style A • Search a higher level SON • Until we get enough results • (How do we locate each SON?)

  11. Results • Acyclic graphs, to measure effect of msgs to nodes without such content • To get half the documents that match a query: • Layered SONs: 461 msgs • Gnutella: 1731 msgs

More Related