1 / 9

Peer-to-peer (p2p) systems

Peer-to-peer (p2p) systems. Idea: create distributed systems out of individually owned, unreliable machines Also different administrative domains This has been tried with parallel computers in several projects, including “Grid Computing” In p2p systems, the primary problem is lookup

odessa
Download Presentation

Peer-to-peer (p2p) systems

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Peer-to-peer (p2p) systems • Idea: create distributed systems out of individually owned, unreliable machines • Also different administrative domains • This has been tried with parallel computers in several projects, including “Grid Computing” • In p2p systems, the primary problem is lookup • given a data item stored at one or more places, find it

  2. p2p lookup problem • Traditional approach for this kind of problem is domain name system (DNS) • DNS is tree based; with p2p, what happens if the root node (or node high in the tree) fails? • Also, the load is not shared, even with caching

  3. p2p lookup problem • Possible approaches • Centralized server (Napster) • Problem: scalability, single point of failure • Flood (Gnutella) • Problem: inefficiency • Unstructured (Freenet) • Problem: not finding the object

  4. DHT: Distributed Hash Tables • Provides a lookup service similar to a hash table • Hash table maps a key to a bucket • With DHT, we want to translate a key to a node • Nodes have some kind of connectivity through an overlay network (just a virtual network) • Goals of DHT: • Load Balance, Decentralization, Scalability, Fault Tolerance (leaving and joining)

  5. DHT: Distributed Hash Tables • Operation is as follows • Given a key k and data d, we call put(k, d) to store that data on a node that is determined by k • Practically, we may have had a filename and hashed it to produce key k • Hash function needs to spread keys out evenly • To retrieve d, call get(k), which automatically retrieves it from the proper node

  6. DHT Issues • Load balancing: need to distribute the data evenly, or else may as well use DNS • Forwarding algorithm: how does a node move a request closer to the node that has the data • Handling joins, departures, and failures

  7. Example DHT: Chord • Nodes form a logical circle in order of node id • Each node has a skiplist, which contains the node (think IP address) half-way around the circle, quarter-way around the circle, etc. • On a put or get, forward to node in skiplist that has the highest id not exceeding k • Gets us to O(log N) time if there are N nodes • But what if there is a failure in a skiplist node? • See next slide

  8. Example DHT: Chord • Load balance: achieved by hashing the node’s IP address to get the node id, and to hash the data to get the key • Decentralization: no node is more important than any other • Scalability: skiplist makes lookups O(log N) • Fault Tolerance: keep a few successors at each node; effective as long as at least one successor is alive

  9. Adding a node in Chord • To add a new node i(assumes iis unique and known) • Initialize predecessor, successors, and skiplist for node i • Made more efficient by having existing Chord node j find i’s place in the ring and using j’s finger table to create i’s quickly • Update skiplist, successors, and predecessor of existing nodes • Move appropriate keys/data to this node • Removing a new node is symmetric

More Related