Peer-to-Peer Infrastructure and Applications

Andrew Herbert, Microsoft Research, Cambridge. +44 1223 479818. aherbert@microsoft.com.

Presentation Transcript


  1. Peer-to-Peer Infrastructure and Applications
     Andrew Herbert, Microsoft Research, Cambridge
     +44 1223 479818, aherbert@microsoft.com

  2. Microsoft and the Grid
     • Shared vision of the “virtual organization”
     • But focused on e-Business rather than e-Science
     • Grid investments
     • Windows clusters for large scale computations
     • TerraServer projects for large data sets
     • Globus port to Windows
     • Globus implementation of OGSA
     • Wrap Grid services as Web services
     • Leverage web services as Grid infrastructure
     • E.g., Hailstorm user authentication

  3. Microsoft Research
     • Web services enable wide area integration
     • How to extend this to enable efficient wide scale information sharing and collaboration?
     • Move to a model of peer-to-peer service implementation in contrast to today’s server-based model
     • Necessarily scalable and self-organizing
     • Necessarily simple developer framework
     • No conflict with WSDL, SOAP, etc.

  4. Peer-to-Peer today
     • Music / video download
       • Napster, Morpheus, Gnutella
     • Distributed computing
       • SETI@Home
     • Research community
       • looking for general purpose frameworks
       • discovering useful applications

  5. Peer-to-Peer applications
     • Publish/Subscribe Event Notification (SCRIBE)
       • Share load of supporting topics and disseminating messages from publishers to subscribers
     • Distributed document archive (PAST)
       • Share load of storing documents reliably
     • Web caching (SQUIRREL)
       • Share load of caching web pages
     • Dynamic directory (OVERLOOK)
       • Share load of storing directory entries for dynamic data

  6. A P2P framework requires:
     • Content-based addressing
       • Hash content to key
       • Route message to computer hosting that key
     • Dynamic caching and proxying
       • Local computers stand in for remote ones
       • Faster access, reduced load on key holder
     • Replication and automatic failover
       • Store at K computers adjacent to key holder
     • Multicast cascade for group communication
       • Each computer needs a spanning tree of routes for reaching every other computer
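The addressing and replication bullets on this slide can be sketched in a few lines. This is only an illustration under stated assumptions: the slide does not name a hash function, so SHA-256 truncated to 128 bits is an arbitrary choice; the Id space is assumed to wrap around; and `content_key`, `replica_set` and `key_holder` are hypothetical names, not part of any Pastry API.

```python
import hashlib

ID_BITS = 128                 # key width; assumption, matching the 128-bit Ids used later in the talk
ID_SPACE = 1 << ID_BITS


def content_key(content: bytes) -> int:
    """Hash content to a key (assumption: SHA-256 truncated to 128 bits)."""
    return int.from_bytes(hashlib.sha256(content).digest()[:16], "big")


def ring_distance(a: int, b: int) -> int:
    """Numeric distance between two Ids, assuming the Id space wraps around."""
    d = abs(a - b)
    return min(d, ID_SPACE - d)


def replica_set(key: int, node_ids: list[int], k: int) -> list[int]:
    """Key holder first, followed by the next k-1 nodes closest to the key."""
    return sorted(node_ids, key=lambda n: (ring_distance(n, key), n))[:k]


def key_holder(key: int, live_nodes: list[int]) -> int:
    """The live node whose Id is numerically closest to the key."""
    return replica_set(key, live_nodes, 1)[0]
```

With this layout, failover needs no extra bookkeeping: if the key holder disappears, the closest surviving node is by construction the first of the remaining replicas, so it already holds a copy.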

  7. Overlay Networks
     • Peer-to-peer requires richer routing semantics than IP
       • IP routes to destination computer, not content
       • URLs route to destination computer, not content
       • IP multicast isn’t widely deployed
     • Solution: Overlay networks
       • allow applications to participate in hop-by-hop routing decisions
     • Ideal overlay is efficient, self-organizing, scalable, and fault-tolerant

  8. Pastry outline
     • Computers (Nodes) have unique Id
       • Typically 128 bits long
     • Primitive: Route (msg, key)
       • Deliver msg to currently alive node with Id closest numerically to key
     • Scalable, efficient
       • Per node routing table O(log(N)) entries
       • Route in O(log(N)) steps
     • Fault tolerant
       • Self-fixes routing tables when nodes added, deleted or fail
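To give the O(log(N)) figures some scale, here is a small worked calculation. It assumes Ids are read as digits of `bits_per_digit` bits each and that each hop fixes at least one more digit, so the hop count is bounded by the number of digits needed to tell N nodes apart. The default of 4 bits per digit and the function names are assumptions for illustration; the slide states only the asymptotic bounds.

```python
def hops_bound(n_nodes: int, bits_per_digit: int = 4) -> int:
    """Smallest h with (2**bits_per_digit)**h >= n_nodes, i.e. ceil(log_base(N)).

    Computed with integers to avoid floating-point rounding at exact powers.
    """
    base = 1 << bits_per_digit
    hops, reach = 0, 1
    while reach < n_nodes:
        reach *= base
        hops += 1
    return hops


def routing_table_entries(n_nodes: int, bits_per_digit: int = 4) -> int:
    """One row per digit that gets used, with one entry per other digit value."""
    base = 1 << bits_per_digit
    return hops_bound(n_nodes, bits_per_digit) * (base - 1)


# Under these assumptions, a million nodes need about 5 hops and 75 table entries.
print(hops_bound(1_000_000), routing_table_entries(1_000_000))
```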

  9. Pastry routing
     [Figure: Id space divided into domains 0XXX, 1XXX, 2XXX, 3XXX; nodes shown: 0112, 2321, 2032, 2001]
     • START: 0112 routes a message to key 2000
     • First hop fixes first digit (2)
     • Second hop fixes second digit (20)
     • END: 2001 is the closest live node to 2000
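The digit-fixing idea on this slide can be simulated with the four node Ids it shows. The sketch below is not Pastry's algorithm as implemented: it assumes every node knows the whole membership, and at each hop it deliberately picks the candidate that fixes the fewest extra digits, to show the one-digit-per-hop worst case. The function names are hypothetical.

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading digits two Ids have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def route(start: str, key: str, nodes: list[str], base: int = 4) -> list[str]:
    """Return the list of nodes visited when routing from start towards key."""
    path, current = [start], start
    while True:
        here = shared_prefix_len(current, key)
        # Candidates share a strictly longer prefix with the key than we do.
        longer = [n for n in nodes if shared_prefix_len(n, key) > here]
        if longer:
            current = min(longer, key=lambda n: (shared_prefix_len(n, key), n))
            path.append(current)
            continue
        # No better prefix match: finish at the numerically closest live node.
        closest = min(nodes, key=lambda n: (abs(int(n, base) - int(key, base)), n))
        if closest != current:
            path.append(closest)
        return path


NODES = ["0112", "2321", "2032", "2001"]   # node Ids taken from the slide
print(route("0112", "2000", NODES))        # ['0112', '2321', '2032', '2001']
```

Each hop in the printed path shares one more leading digit with the key 2000 than the previous one, and the walk ends at 2001 because no live node matches the key more closely.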

  10. Pastry routing table
      • Routing table: for each level, nearest peer for other domains
      • Namespace leaf set: nearest Ids to “left” and “right” in name space
      • Each entry gives IP address for host associated with Id
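One possible in-memory shape for the state described on this slide is sketched below, using 4-digit base-4 Ids. Everything here is an assumption for illustration: the node Ids beyond those on the routing slide, the IP addresses, the `build_state` name, and above all the choice of "nearest" peer, which real Pastry makes by network proximity and which this sketch replaces with the first match in sorted order.

```python
DIGITS = "0123"  # base-4 digits (assumption: 4-digit base-4 Ids)


def build_state(node_id: str, peers: dict[str, str], leaf_half: int = 2) -> dict:
    """Build routing table, leaf set and address map for node_id.

    peers maps every live Id (including node_id) to an IP address.
    """
    others = sorted(p for p in peers if p != node_id)
    routing_table = []
    for level in range(len(node_id)):
        row = {}
        for digit in DIGITS:
            if digit == node_id[level]:
                continue  # our own domain at this level is covered by deeper rows
            prefix = node_id[:level] + digit
            matches = [p for p in others if p.startswith(prefix)]
            if matches:
                row[digit] = matches[0]  # placeholder for "nearest peer"
        routing_table.append(row)

    ring = sorted(peers)  # equal-length digit strings sort numerically
    i, n = ring.index(node_id), len(ring)
    half = min(leaf_half, (n - 1) // 2)
    left = [ring[(i - j) % n] for j in range(1, half + 1)]
    right = [ring[(i + j) % n] for j in range(1, half + 1)]

    known = {p for row in routing_table for p in row.values()} | set(left) | set(right)
    return {
        "routing_table": routing_table,
        "leaf_left": left,
        "leaf_right": right,
        "addresses": {p: peers[p] for p in sorted(known)},
    }


# Hypothetical membership; the addresses are made up.
PEERS = {
    "0112": "10.0.0.1", "1230": "10.0.0.2", "2001": "10.0.0.3",
    "2032": "10.0.0.4", "2321": "10.0.0.5", "3103": "10.0.0.6",
}
state = build_state("2032", PEERS)
```

For node 2032, row 0 of the table holds one peer for each of the domains 0XXX, 1XXX and 3XXX, row 1 holds a peer in 23XX, and row 2 holds 2001; deeper rows stay empty until matching nodes appear.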

  11. Pastry Routing Demo

  12. Pastry node addition
      • Want to add new node to the system
      • Invent a new random nodeId X
      • Go to a nearby or well-known node A
      • Route to “key” X via A (finds node Z with Id closest to X)
      • Obtain leaf set from Z and rebuild
      • Obtain routing table entries from each node along the route from A to Z and rebuild
      • Register with each member of A’s namespace leaf set so they adjust their leaf sets and rebuild
      • Find nearest leaf set node and use its routing table to improve locality
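The leaf-set part of this procedure can be sketched as a toy simulation. It is a simplification under several assumptions: a 16-bit Id space, a leaf set of two nodes on each side, a lookup over a global dictionary standing in for "route to key X via A", and registration with the members of the new node's own leaf set. The routing-table and locality steps are left out, and `Node`, `join` and `new_random_id` are hypothetical names.

```python
import random

ID_SPACE = 1 << 16   # small Id space for readability (assumption)
HALF = 2             # leaf-set entries kept on each side (assumption)


def dist(a: int, b: int) -> int:
    d = abs(a - b)
    return min(d, ID_SPACE - d)


class Node:
    def __init__(self, node_id: int):
        self.id = node_id
        self.leaf_set: set[int] = set()

    def learn(self, ids) -> None:
        """Merge candidate Ids, keeping the HALF nearest on each side of the ring."""
        known = (self.leaf_set | set(ids)) - {self.id}
        right = sorted(known, key=lambda n: (n - self.id) % ID_SPACE)[:HALF]
        left = sorted(known, key=lambda n: (self.id - n) % ID_SPACE)[:HALF]
        self.leaf_set = set(left) | set(right)


def new_random_id() -> int:
    return random.randrange(ID_SPACE)


def join(network: dict[int, Node], new_id: int) -> Node:
    """Add new_id to a non-empty network (dict of Id -> Node)."""
    # Stand-in for routing to key X: find Z, the live node closest to X.
    z = min(network, key=lambda n: (dist(n, new_id), n))
    x = Node(new_id)
    x.learn(network[z].leaf_set | {z})      # obtain leaf set from Z and rebuild
    for member in x.leaf_set:               # register so neighbours adjust theirs
        network[member].learn({new_id})
    network[new_id] = x
    return x


network = {100: Node(100)}                  # bootstrap node
for node_id in (200, 300, 400, 500, 250):   # fixed Ids keep the example reproducible
    join(network, node_id)
```

Z's leaf set is enough to seed X's because X lands next to Z on the ring, so X's nearest neighbours on either side are Z and nodes Z already tracks.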

  13. Scribe: A Pastry Application
      [Figure: two publishers, a topic of interest, two subscribers]
      • Publish / subscribe is a popular model for “event driven” systems with volatile membership
      • Decouple event publishers from event subscribers
        • Publishers don’t know in advance who subscribers are
        • Subscribers don’t know in advance who publishers are
      • Challenge is how to multicast notifications from topics efficiently

  14. Scribe: architecture
      • Topic hashed to a key
      • Construct a multicast tree based on the Pastry network
      • Have the (Pastry) node with the closest Id to the topic key be the root
      • This node replicates knowledge of the topic to its k nearest neighbours for resilience
      • Pass event notification down through the tree
      • Each parent forwards event to its children
      • Avoids overstressing network links close to the topic node
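One way to picture how a multicast tree can sit on top of Pastry routes is to record, for every hop on a subscriber's route towards the root, the previous node as a child of the next. The sketch below does only that. The routes are made up for illustration (they are not taken from the slides), and `build_tree` is a hypothetical helper rather than Scribe's API.

```python
from collections import defaultdict


def build_tree(routes: list[list[str]]) -> dict[str, set[str]]:
    """Turn subscriber-to-root routes into a parent -> children table."""
    children: dict[str, set[str]] = defaultdict(set)
    for path in routes:
        # Each hop makes the earlier node a child of the later one.
        for child, parent in zip(path, path[1:]):
            children[parent].add(child)
    return dict(children)


# Hypothetical routes, each ending at the root node 1100.
ROUTES = [
    ["0111", "1001", "1101", "1100"],
    ["0100", "1001", "1101", "1100"],
    ["1011", "1111", "1100"],
]
tree = build_tree(ROUTES)
```

Routes that merge on the way to the root share the upper part of the tree, so the root here ends up with two children even though there are three subscribers.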

  15. Scribe: Topic creation
      [Figure labels: Root, T, Create(T)]
      • Each topic is assigned a topicId
      • Root of the multicast tree = node with nodeId numerically closest to the topicId
      • Create(topic): route through Pastry to the topicId

  16. Scribe: subscribing
      [Figure: nodes 1111, 1000, 1100, 1101, 1011, 1001, 0100, 0111]

  17. Scribe: event dissemination
      [Figure: event E; nodes 1100, 1101, 1011, 0100, 0111]
      • Publish(topic, event)
      • Route through the Pastry network using the topicId as the destination
      • Dissemination along the multicast tree starting from the root
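Dissemination from the root can be sketched as a breadth-first walk over a parent-to-children table. The tree below is hypothetical, the root is passed in directly instead of being found by routing to the topicId, and `publish` is an illustrative name, not Scribe's interface.

```python
from collections import deque


def publish(children: dict[str, set[str]], root: str, event: str):
    """Deliver event to every node in the tree; return (inbox, sends).

    inbox maps node -> event received, in delivery order;
    sends maps node -> number of copies that node forwarded.
    """
    inbox: dict[str, str] = {}
    sends: dict[str, int] = {}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        inbox[node] = event
        kids = sorted(children.get(node, ()))
        sends[node] = len(kids)      # each parent forwards only to its own children
        queue.extend(kids)
    return inbox, sends


# Hypothetical multicast tree rooted at 1100.
TREE = {
    "1100": {"1101", "1111"},
    "1101": {"1001"},
    "1001": {"0100", "0111"},
    "1111": {"1011"},
}
inbox, sends = publish(TREE, "1100", "E")
```

All seven nodes receive the event, yet no node sends more than two copies; a root that contacted every other node directly would have to send six.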

  18. Scribe demo

  19. Summary
      • Peer-to-peer techniques are good for wide area information sharing and collaborative computation
      • Overlay networks enable peer-to-peer distributed computing
      • Pastry is an efficient, scalable, self-organizing peer-to-peer framework
      • Pastry makes it easy to build powerful peer-to-peer applications
      • For more see: http://research.microsoft.com/~antr/Pastry/
