Efficient Search in Peer to Peer Networks

Efficient Search in Peer to Peer Networks. By: Beverly Yang, Hector Garcia-Molina. Presented By: Anshumaan Rajshiva. Date: May 20, 2002.

Presentation Transcript

Efficient Search in Peer to Peer Networks

By: Beverly Yang, Hector Garcia-Molina

Presented By: Anshumaan Rajshiva

Date: May 20, 2002

P2P Networks
  • Distributed systems in which nodes of equal roles and capabilities exchange information and services directly with each other.
Key Challenges
  • Efficient techniques for search and retrieval of data.
  • The best search technique for a system depends on the needs of the application.
  • Current search techniques in “loose” P2P systems tend to be very inefficient, either generating too much load on the system or providing a very poor user experience.
Current Reason for Inefficiency
  • Queries are processed by more nodes than desired.
Suggested Improvement
  • Processing queries through fewer nodes.
Techniques
  • Iterative Deepening
  • Directed BFS
  • Local Indices
Problem Framework
  • P2P network: an undirected graph
  • Vertices: nodes in the network
  • Edges: open connections between neighbors
  • A message travels from node A to node B in hops.
  • Length of a path: the number of hops
  • Source of a query: the node submitting the query
Problem Framework
  • When a node receives a query, it processes the query locally and then responds to it, forwards it, or drops it.
  • The address of the source node is unknown to the responding node (the scheme used by Gnutella).
Metrics
  • Cost: Average Aggregate Bandwidth, Average Aggregate Processing Cost
  • Quality of Results: Number of Results, Satisfaction of the Query

Cost
  • As a message propagates across nodes, each node spends some processing resources on behalf of the query.
  • The main costs are described in terms of bandwidth and processing cost.
Cost
  • Average Aggregate Bandwidth: the average, over a set of representative queries, of the aggregate bandwidth consumed (in bytes) over every edge on behalf of each query.
  • Average Aggregate Processing Cost: the average, over a set of representative queries, of the aggregate processing power consumed at every node on behalf of each query.
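As a minimal sketch of how either cost metric could be computed, assume the per-edge (or per-node) cost of each query has already been logged. The function name and the log layout are illustrative assumptions, not something the paper prescribes.

```python
def average_aggregate(per_query_costs):
    """Average, over queries, of the total cost charged to all edges (or nodes).

    per_query_costs: one dict per representative query, mapping an edge
    (or node) to the cost consumed there on behalf of that query.
    """
    if not per_query_costs:
        return 0.0
    totals = [sum(costs.values()) for costs in per_query_costs]
    return sum(totals) / len(totals)


# Hypothetical bandwidth logs (bytes per edge) for two representative queries.
bandwidth_logs = [
    {("A", "B"): 100, ("B", "C"): 100},
    {("A", "B"): 100},
]
avg_bandwidth = average_aggregate(bandwidth_logs)  # (200 + 100) / 2
```

The same function covers processing cost if the dictionaries are keyed by node instead of by edge.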
Quality of Results
  • Results are measured from the perspective of the user.
  • Number of results: the size of the total result set.
  • Satisfaction of the query: a query is satisfied if Z or more results are returned, where Z is a value specified by the user.
  • Time to satisfaction: how long the user must wait for the Zth result to arrive.
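The two satisfaction metrics can be sketched as follows, assuming the source records an arrival time for every result and that Z is at least 1. The function names are hypothetical.

```python
def is_satisfied(num_results, z):
    """A query is satisfied once Z or more results have been returned."""
    return num_results >= z


def time_to_satisfaction(arrival_times, z):
    """Arrival time of the Zth result, or None if fewer than Z results arrived."""
    if len(arrival_times) < z:
        return None
    return sorted(arrival_times)[z - 1]


# Hypothetical arrival times (in seconds) of three results for one query.
arrivals = [0.9, 0.2, 0.5]
```

With these arrivals and Z = 2, the user waits until the second-earliest result arrives.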
Current Techniques
  • Gnutella: uses BFS with a depth limit of D, where D is the TTL of the message. At all levels below D, each node processes the query and sends its results to the source; at level D the query is dropped.
  • Freenet: uses DFS with a depth limit of D. Each node forwards the query to a single neighbor and waits for a definite response from that neighbor before either forwarding the query to another neighbor (if the query was not satisfied) or forwarding the results back to the query source (if the query was satisfied).
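The depth-limited BFS flood can be sketched on an adjacency-list graph. This is a centralized simulation rather than real message passing; the graph, the node names, and the function name are assumptions made for illustration.

```python
from collections import deque


def bfs_within_ttl(graph, source, ttl):
    """Return {node: depth} for every node reached within `ttl` hops of source."""
    depth = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if depth[node] == ttl:
            continue  # TTL exhausted: the query is dropped here, not forwarded
        for neighbor in graph[node]:
            if neighbor not in depth:  # ignore duplicate copies of the query
                depth[neighbor] = depth[node] + 1
                queue.append(neighbor)
    return depth


# Hypothetical overlay: S has neighbors A and B, and a chain A - C - D.
graph = {
    "S": ["A", "B"],
    "A": ["S", "C"],
    "B": ["S"],
    "C": ["A", "D"],
    "D": ["C"],
}
reached = bfs_within_ttl(graph, "S", 2)
```

With a TTL of 2, node D (three hops from S) never sees the query.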
Stop and Think…
  • If quality of results is measured only by the number of results, then BFS is ideal.
  • If satisfaction is the metric of choice, BFS wastes much bandwidth and processing power.
  • With DFS, nodes process the query sequentially, so a search can be terminated as soon as the query is satisfied, thereby minimizing cost. The price of this sequential processing is poor response time.
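The early-termination behavior of DFS can be illustrated with a small simulation. It assumes a hypothetical `results_at` map giving the results each node holds, and it models only the order of processing, not Freenet's actual protocol.

```python
def dfs_search(graph, results_at, source, z, depth_limit):
    """Process nodes one at a time, depth first, stopping once z results exist.

    Returns (results, nodes_processed_in_order).
    """
    results, processed, visited = [], [], {source}

    def forward(node, depth):
        # Forward to one neighbor at a time and wait for it to finish.
        for neighbor in graph[node]:
            if len(results) >= z:
                return  # satisfied: stop forwarding the query
            if neighbor in visited:
                continue
            visited.add(neighbor)
            processed.append(neighbor)
            results.extend(results_at.get(neighbor, []))
            if depth + 1 < depth_limit:
                forward(neighbor, depth + 1)

    forward(source, 0)
    return results, processed


# Hypothetical overlay and per-node results.
graph = {
    "S": ["A", "B"],
    "A": ["S", "C"],
    "B": ["S"],
    "C": ["A", "D"],
    "D": ["C"],
}
results_at = {"A": ["x"], "C": ["y"], "B": ["w"]}
found, visited_order = dfs_search(graph, results_at, "S", z=2, depth_limit=3)
```

With Z = 2 the search stops after nodes A and C, so B and D never spend resources on the query. Because each node is visited in sequence, the wait grows with every hop.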
Broadcast Policy
  • BFS and DFS fall at opposite extremes of bandwidth/processing cost and response time.
  • Need to find some middle ground between the two extremes, while maintaining quality of results.
Iterative Deepening
  • Used when satisfaction is the metric of choice.
  • Multiple BFS searches are initiated with successively larger depth limits, until the query is satisfied or the maximum depth limit D is reached.
Iterative Deepening
  • A system-wide policy specifies the depths at which the iterations occur.
  • The last depth in the policy must be set to D.
  • A waiting period W (the time between successive iterations in the policy) must also be specified.
Working of Iterative Deepening
  • Policy P = {a, b, c}
  • The source S initiates a BFS of depth a by sending a query message with TTL = a to all its neighbors.
  • Once a node at depth a receives and processes the message, instead of dropping it, the node stores the message temporarily.
  • The query thus becomes frozen at all nodes a hops away from S (the frontier nodes).
  • S receives responses from the nodes that have processed the query so far.
Working of Iterative Deepening
  • After waiting for time W, if S finds that the query has already been satisfied, it does nothing.
  • Otherwise, S starts the next iteration, initiating a BFS of depth b.
  • S sends out a Resend message with TTL = a.
  • A frontier node that receives a Resend message simply unfreezes the temporarily stored query and forwards it to its neighbors with TTL = b - a.
  • The process continues in the same fashion until depth D is reached. At depth D, the query is dropped.
Working of Iterative Deepening
  • To match Resend messages to queries, every query is assigned a system-wide “unique identifier”.
  • The Resend message contains the identifier of the query it represents, and nodes at the frontier of a search know which query to unfreeze by inspecting this identifier.
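The iteration logic can be sketched as a centralized simulation. It does not model real Query and Resend messages, query identifiers, or the waiting period W. It only shows that each iteration processes the nodes in the newly unfrozen band of depths, and that the search stops once Z results are in. The graph, `results_at`, and the function names are assumptions.

```python
from collections import deque


def hop_depths(graph, source, max_depth):
    """Hop distance from source for every node within max_depth hops."""
    depth = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if depth[node] == max_depth:
            continue
        for neighbor in graph[node]:
            if neighbor not in depth:
                depth[neighbor] = depth[node] + 1
                queue.append(neighbor)
    return depth


def iterative_deepening(graph, results_at, source, policy, z):
    """Run successive BFS iterations at the depths in `policy` (last entry = D)."""
    depth = hop_depths(graph, source, policy[-1])
    results, processed, previous = [], [], 0
    for limit in policy:
        # Nodes up to `previous` hops already processed the query; after the
        # frontier unfreezes it, only the next band of depths processes it.
        for node, d in depth.items():
            if previous < d <= limit:
                processed.append(node)
                results.extend(results_at.get(node, []))
        if len(results) >= z:
            break  # satisfied: the source starts no further iteration
        previous = limit
    return results, processed


graph = {
    "S": ["A", "B"],
    "A": ["S", "C"],
    "B": ["S"],
    "C": ["A", "D"],
    "D": ["C"],
}
results_at = {"A": ["x"], "D": ["y", "z"]}
shallow = iterative_deepening(graph, results_at, "S", policy=[1, 3], z=1)
deep = iterative_deepening(graph, results_at, "S", policy=[1, 3], z=2)
```

With Z = 1 the first iteration already satisfies the query, so only A and B do any work. With Z = 2 the second iteration extends the search to C and D without reprocessing A and B.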
Iterative Deepening
  • [Figure: a Source node and nodes Node1 to Node4, arranged at Level 1 and Level 2.]
Directed BFS
  • Directed BFS is used when minimizing response time is important.
  • The strategy is to send queries to only a subset of neighbors, intelligently selected on the basis of some parameters so that they will return many results quickly.
  • For this purpose, a node maintains statistics on its neighbors.
Directed BFS
  • These statistics are based on the number of results received through each neighbor for past queries.
  • Sending queries to a small subset of the nodes significantly reduces the cost incurred.
  • The quality of results does not decrease significantly, provided the neighbors are selected intelligently.
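Neighbor selection can be sketched as a ranking over the per-neighbor statistics. The heuristic (number of results received through each neighbor for past queries) comes from the slide; the subset size `k`, the function name, and the sample counts are assumptions.

```python
def select_neighbors(past_results, k):
    """Pick the k neighbors through which the most past results arrived."""
    ranked = sorted(past_results, key=lambda n: past_results[n], reverse=True)
    return ranked[:k]


# Hypothetical statistics: results received through each neighbor so far.
past_results = {"A": 12, "B": 3, "C": 40, "D": 0}
targets = select_neighbors(past_results, 2)
```

The query is then sent only to the neighbors in `targets` rather than to all four.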
Local Indices
  • A node maintains an index over the data of each node within r hops of itself, where r is a system-wide variable called the radius.
  • When a node receives a Query message, it can then process the query on behalf of every node within r hops of itself.
  • The collections of many nodes can thus be searched by processing the query at only a few nodes, keeping the cost low.
Local Indices
  • r should be small.
  • The index will be small, typically on the order of 50 KB, independent of the size of the network.
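A local index of radius r can be sketched as a map from each item to the nodes within r hops that hold it. The sketch assumes a hypothetical `collections` map of node to items and includes the indexing node's own data (zero hops away). In a real system the index is built from Join and Update messages rather than from a global view of the graph.

```python
from collections import deque


def build_local_index(graph, collections, node, r):
    """Map each item to the set of nodes within r hops of `node` that hold it."""
    depth = {node: 0}
    queue = deque([node])
    while queue:
        current = queue.popleft()
        if depth[current] == r:
            continue  # do not look beyond the radius
        for neighbor in graph[current]:
            if neighbor not in depth:
                depth[neighbor] = depth[current] + 1
                queue.append(neighbor)
    index = {}
    for owner in depth:
        for item in collections.get(owner, []):
            index.setdefault(item, set()).add(owner)
    return index


graph = {
    "S": ["A", "B"],
    "A": ["S", "C"],
    "B": ["S"],
    "C": ["A", "D"],
    "D": ["C"],
}
collections = {"S": ["a"], "A": ["b"], "C": ["a"], "D": ["c"]}
index_at_a = build_local_index(graph, collections, "A", r=1)
```

Node A can now answer a query for item "a" on behalf of S and C without either of them processing it. Item "c" at node D lies outside the radius and is not indexed.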
Working of Local Indices
  • A policy specifies the depths at which the query will be processed.
  • Indices must be created and maintained at each node.
  • All nodes at depths not listed in the policy simply forward the query to the next depth.
Maintaining Indices
  • Node joins: the new node sends a Join message with TTL = r, and all nodes within r hops update their indices.
  • The Join message contains metadata about the joining node.
  • When a node receives this Join message, it in turn sends a Join message containing its own metadata directly to the new node, and the new node updates its index.
Maintaining Indices
  • Node dies: other nodes update their indices based on timeouts.
  • Node updates: when a node updates its collection, it sends out a small Update message with TTL = r containing the metadata of the affected item. All nodes receiving this message then update their indices.
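The three maintenance events (join, update, death by timeout) can be sketched from the point of view of a single node's index. The class, its method names, and the rule that any message from a neighbor refreshes its last-seen time are assumptions; the slides say only that dead nodes are detected through timeouts.

```python
class LocalIndex:
    """One node's index over the metadata of nearby nodes (illustrative only)."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.entries = {}  # node id -> (set of items, time last heard from)

    def on_join(self, node_id, metadata, now):
        """A Join message arrived: record the joining node's metadata."""
        self.entries[node_id] = (set(metadata), now)

    def on_update(self, node_id, added, removed, now):
        """An Update message arrived: apply the change to the affected items."""
        items, _ = self.entries.get(node_id, (set(), now))
        self.entries[node_id] = ((items | set(added)) - set(removed), now)

    def expire(self, now):
        """Drop nodes not heard from within the timeout; return their ids."""
        dead = [n for n, (_, seen) in self.entries.items()
                if now - seen > self.timeout]
        for node_id in dead:
            del self.entries[node_id]
        return dead

    def lookup(self, item):
        """Ids of indexed nodes that hold `item`, in sorted order."""
        return sorted(n for n, (items, _) in self.entries.items() if item in items)


index = LocalIndex(timeout=30)
index.on_join("B", ["song.mp3"], now=0)
index.on_join("C", ["song.mp3", "doc.pdf"], now=10)
index.on_update("B", added=["doc.pdf"], removed=["song.mp3"], now=20)
```

After these messages, a query for "doc.pdf" resolves to both B and C. If C then stays silent past the timeout, `expire` removes it.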
Conclusions
  • Compared to the techniques used in existing systems, the techniques discussed greatly reduce the aggregate cost of processing queries over the entire system while maintaining the quality of results.
  • The schemes are simple and practical to implement on existing systems.
References

Beverly Yang and Hector Garcia-Molina, Efficient Search in Peer-to-Peer Networks. Available at http://dbpubs.stanford.edu/pub/2001-47