
Improving Search in Peer-to-Peer Networks

Improving Search in Peer-to-Peer Networks. Beverly Yang, Hector Garcia-Molina. Presented by Shreeram Sahasrabudhe (sas4@lehigh.edu). Goal: reduce the number of nodes that handle a query. Three search techniques: Iterative Deepening, Directed BFS, and Local Indices.

Presentation Transcript


  1. Improving Search in Peer-to-Peer Networks • Beverly Yang, Hector Garcia-Molina • Presented by Shreeram Sahasrabudhe (sas4@lehigh.edu)

  2. Goals • Reduce the number of nodes that handle a query. • Three search techniques: • Iterative Deepening • Directed BFS • Local Indices • Evaluation and extensive measurements of these techniques on the Gnutella network. • Ready-to-use results and recommendations.

  3. Current Techniques • Gnutella: Breadth-First Search (BFS) with depth limit D (typically 7). • Disadvantages • Wastes resources • Inefficient • Freenet: Depth-First Search (DFS) • Disadvantages • Poor response time
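To make the depth limit concrete, here is a minimal sketch of a depth-limited BFS flood in Python. It is an illustration only, not Gnutella's actual message handling: the `overlay` topology and the `flood_bfs` name are hypothetical, and the depth limit plays the role of the TTL.

```python
from collections import deque


def flood_bfs(graph, source, depth_limit):
    """Return {node: hop distance} for every node a depth-limited BFS reaches."""
    reached = {source: 0}
    frontier = deque([source])
    while frontier:
        node = frontier.popleft()
        if reached[node] == depth_limit:
            continue  # depth limit hit: this node does not forward the query
        for neighbor in graph[node]:
            if neighbor not in reached:
                reached[neighbor] = reached[node] + 1
                frontier.append(neighbor)
    return reached


# Hypothetical five-node overlay, used only for illustration.
overlay = {
    "S": ["A", "B"],
    "A": ["S", "C"],
    "B": ["S", "C"],
    "C": ["A", "B", "D"],
    "D": ["C"],
}

for limit in (1, 2, 7):
    print(limit, sorted(flood_bfs(overlay, "S", limit)))
```

Every node in the returned map has to process the query, which is why a fixed, generous depth limit costs so much when the results were already available a hop or two away.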

  4. Iterative Deepening • Required • System-wide policy P = {a, b, c} • Time between successive iterations W. [Figure: source S and depths a and b for P = {a, b, c}; labels: Freeze, Wait = W, Resend [(TTL a) + query_id], (TTL b-a)]
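The slide's labels suggest a loop: search to the first depth in the policy, wait W, and go deeper only if the query is not yet satisfied. The sketch below simulates that loop in Python under several assumptions: the graph, `results_at`, and all function names are hypothetical; the wait W is only a comment; and the freeze/resend mechanism is modeled by counting each node once, in the first iteration that reaches it. The source node itself is not counted.

```python
from collections import deque


def hop_distances(graph, source):
    """Plain BFS: hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def iterative_deepening(graph, results_at, source, policy, wanted):
    """Return (final depth, results found, nodes that processed the query)."""
    dist = hop_distances(graph, source)
    found = processed = previous_depth = 0
    for depth in policy:
        # Only nodes beyond the previous depth are new work in this
        # iteration (assumed effect of freezing the query at that depth).
        new_nodes = [n for n, d in dist.items() if previous_depth < d <= depth]
        processed += len(new_nodes)
        found += sum(results_at.get(n, 0) for n in new_nodes)
        if found >= wanted:
            return depth, found, processed
        previous_depth = depth
        # A real client would wait W here before resending.
    return policy[-1], found, processed


# Hypothetical chain S - A - B - C - D with a few results per node.
chain = {"S": ["A"], "A": ["S", "B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
results_at = {"A": 1, "C": 2, "D": 5}
print(iterative_deepening(chain, results_at, "S", [1, 3, 4], wanted=3))
```

With `wanted=3` the search stops at depth 3, so node D never sees the query; a flat BFS to depth 4 would have involved it.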

  5. Directed BFS • Send queries to a subset of nodes • Subset nodes are selected by heuristics such as: select the node … • That has returned the highest number of results for previous queries • Whose response messages have taken the lowest average number of hops • That has forwarded the most messages to our client • That has the shortest message queue
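One way to picture the selection step is a small table of per-neighbor statistics and a ranking function per heuristic. The Python sketch below covers only the four heuristics listed on this slide; the `NeighborStats` fields, the heuristic names, and the sample numbers are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class NeighborStats:
    """Hypothetical per-neighbor counters a client might keep."""
    results_returned: int = 0
    hops_total: int = 0
    responses: int = 0
    messages_forwarded: int = 0
    queue_length: int = 0

    @property
    def average_hops(self):
        # Neighbors that never responded rank last on this heuristic.
        return self.hops_total / self.responses if self.responses else float("inf")


# Each heuristic maps stats to a sort key; smaller keys rank first.
HEURISTICS = {
    "most_results": lambda s: -s.results_returned,
    "fewest_hops": lambda s: s.average_hops,
    "most_forwarded": lambda s: -s.messages_forwarded,
    "shortest_queue": lambda s: s.queue_length,
}


def pick_neighbors(stats, heuristic, count=1):
    """Return the `count` best neighbor names under the given heuristic."""
    key = HEURISTICS[heuristic]
    # The name is a tie-breaker so the ordering is deterministic.
    return sorted(stats, key=lambda name: (key(stats[name]), name))[:count]


stats = {
    "A": NeighborStats(40, 90, 30, 500, 12),
    "B": NeighborStats(25, 20, 10, 900, 3),
    "C": NeighborStats(5, 8, 2, 100, 7),
}
for name in HEURISTICS:
    print(name, pick_neighbors(stats, name))
```

The query is then forwarded only to the selected neighbors instead of to all of them.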

  6. Local Indices • Each node n maintains an index of the data held by the nodes within r hops • So a node can process a query on behalf of every node within r hops • Small r = less storage (e.g. 70 KB for r = 1) [Figure: source S, depths 1 to 5, policy P = {1, 5}; depths 1 and 5 are labeled "process"]
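As a rough illustration of a radius-r index, the Python sketch below builds, for one node, a map from file title to the owners within r hops, and answers a query from that map alone. The topology, the `files` table, and the function names are hypothetical, and a real index would hold metadata rather than exact titles.

```python
from collections import deque


def nodes_within(graph, start, radius):
    """Set of nodes at most `radius` hops from `start` (including itself)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == radius:
            continue
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return set(dist)


def build_local_index(graph, files, node, radius):
    """Map each title to the owners within `radius` hops of `node`."""
    index = {}
    for owner in nodes_within(graph, node, radius):
        for title in files.get(owner, ()):
            index.setdefault(title, set()).add(owner)
    return index


def answer_query(index, title):
    """Answer on behalf of every indexed node, without contacting them."""
    return sorted(index.get(title, ()))


# Hypothetical chain S - A - B - C and the files each node shares.
chain = {"S": ["A"], "A": ["S", "B"], "B": ["A", "C"], "C": ["B"]}
files = {"S": ["x"], "A": ["y"], "B": ["x", "z"], "C": ["z"]}

index_a = build_local_index(chain, files, "A", radius=1)
print(answer_query(index_a, "x"))
```

Here node A alone answers for S, A, and B, so those nodes need not process the query themselves; a larger radius covers more nodes at the price of a larger index.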

  7. More work • Node Join • Sends a join message with a TTL of r, containing metadata about its collection • A node receiving a join message sends a return join message with its own metadata • Periodic refreshes • Cost? • QueryJoinRatio = average ratio of queries to join messages • QueryUpdateRatio = average ratio of queries to update messages

  8. Experiment • Data Collection • Observed Gnutella network traffic for 1 month • Determined general statistics such as the average number of files shared per user, query strings, etc. • Iterative Deepening • For each query Q sent: logged the response messages arriving within 2 minutes. • Ping messages to all neighbors: hops and IP address. • Same data used for Local Indices • Directed BFS • Same as above, but each query sent to a single node.

  9. Cost • Bandwidth cost in BFS • Processing cost [Equations not preserved; annotated terms: nodes at depth n, size of query message, redundant edges between depths n-1 and n, response messages from nodes at depth n, size of header, size of record, total records]

  10. Results • Iterative Deepening • Neighbors = 8 • Desired number of results Z = 50 • Policies P = {P_d = {d, d+1, …, D} for d = 1, 2, 3, …, D} • Cost: effect of d; effect of W ("overshooting") • Time: effect of W; effect of d [Chart: cost]

  11. Directed BFS • Studied 8 heuristics • "Random neighbor" is the baseline for comparison [Chart: cost]

  12. Local Indices

  13. Conclusions • Three new search techniques specified and tested. • Recommendation: Local Indices with r = 1. Savings: 61% bandwidth, 49% processing.
