Peer Pressure: Distributed Recovery in Gnutella • Pedram Keyani, Brian Larson, Muthukumar Senthil • Computer Science Department, Stanford University
Introduction • Gnutella is a P2P file sharing protocol • The issue we are addressing is distributed recovery from malicious attacks in Gnutella • Our solution is a mechanism for proactive failure detection and recovery • Our experimental process and models • The fruits of our labor: RESULTS!
Failure in Gnutella • Nodes in Gnutella can fail for any number of reasons • Failure of 4% of the most highly connected nodes in Gnutella fragments the network to the point where it is unusable by anyone • The exact details are outlined in work done by Stefan Saroiu
Scale Free Networks (Gnutella, Internet) • Abide by a power law: the number of nodes of degree N is proportional to N^(-lambda) • Lambda is observed to be roughly 2.3 • Scale-free networks are highly resilient to large-scale random failures but vulnerable to malicious attacks on the most highly connected, well-known nodes
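To make the power law concrete, the sketch below normalizes N^(-lambda) over a finite range of degrees and reports what fraction of nodes would have each degree. The exponent 2.3 is from the slide; the cutoff `max_degree` and the function name `degree_fractions` are assumptions made for illustration only.

```python
LAMBDA = 2.3  # exponent reported on the slide


def degree_fractions(max_degree, lam=LAMBDA):
    """Return P(degree = k) for k = 1..max_degree, with P proportional to k**(-lam)."""
    weights = [k ** (-lam) for k in range(1, max_degree + 1)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical cutoff of 100: most of the mass sits at the lowest degrees,
# which is why removing a few high-degree nodes is so damaging.
fractions = degree_fractions(100)
low_degree_share = sum(fractions[:3])  # share of nodes with degree 1, 2 or 3
```

Under this model a node of degree 10 is about 10^2.3 (roughly 200) times rarer than a node of degree 1.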
Exponential Networks • Connections between nodes are random • With no preferential connections, no single node holds the entire network together • They react the same way to malicious attacks and random failures
Our Hypothesis In order to allow Gnutella to recover from malicious attacks nodes must plan for failures by discovering and maintaining backup connections to form an exponential network. These backups will be used to replace dead neighbors in the case of a malicious attack.
Recovery Method • Build and maintain a virtual exponential network connecting all the nodes • Accomplish this through random node discovery • Detect malicious attacks on active network • Switch over to exponential network
Random Node Discovery • Problem: no centralized name authority to give a truly random node • Solution: use random walks through the network to arrive at a random node • A Random Discovery Ping (RDP) is forwarded to only one of a node’s neighbors, selected so as to give a random distribution • RDPs use a hop count of 20, roughly equal to the network diameter
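A minimal sketch of the random walk behind an RDP follows. The slide does not say how the next neighbor is chosen, so this version assumes a uniform choice among neighbors, which is a simplification: a plain uniform walk tends to favor high-degree nodes, and the actual selection rule may correct for that. The graph representation (a dict of neighbor lists) and the name `random_discovery_ping` are hypothetical.

```python
import random

RDP_HOPS = 20  # hop count from the slide


def random_discovery_ping(graph, start, hops=RDP_HOPS, rng=random):
    """Forward an RDP to one neighbor per hop and return the node where it stops.

    graph: dict mapping node -> list of neighbor nodes (assumed representation).
    The next hop is picked uniformly at random (an assumption, see lead-in).
    """
    current = start
    for _ in range(hops):
        neighbors = graph[current]
        if not neighbors:
            break  # dead end: the walk stops early
        current = rng.choice(neighbors)
    return current
```

Passing a seeded `random.Random` instance as `rng` makes a simulation run repeatable.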
Maintenance of Virtual Exponential Network • Each node discovers N random nodes, where N is the minimum number of connections the node wants to maintain • Then periodically ping these nodes to make sure they are alive • Discover new neighbors to replace them should they die
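The maintenance loop above can be sketched as one function that is called on every ping period. The callbacks `is_alive` and `discover` stand in for the ping and the random discovery mechanism; they, the attempt limit, and the function name are assumptions, not part of the original client.

```python
def maintain_backups(backups, wanted, is_alive, discover):
    """Drop dead backup nodes, then discover replacements until `wanted` are held.

    backups:  iterable of currently known backup nodes
    wanted:   N, the minimum number of connections the node wants to maintain
    is_alive: callable(node) -> bool, standing in for a ping
    discover: callable() -> node or None, standing in for random discovery
    """
    live = {node for node in backups if is_alive(node)}
    attempts = 0
    # Bound the number of discovery attempts so a sparse network cannot stall us.
    while len(live) < wanted and attempts < 10 * wanted:
        candidate = discover()
        attempts += 1
        if candidate is not None and is_alive(candidate):
            live.add(candidate)
    return live
```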
Failure Detection • Random failures result in loss of 1st degree neighbors • Malicious attacks result in greater loss of 2nd degree neighbors than 1st degree • Keep a history (30 seconds) of 1st and 2nd degree neighbor loss • If 2nd degree loss exceeds 1st degree loss and a threshold (50%), mark as malicious
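The detection rule can be sketched as follows. The 30-second history and the 50% threshold are from the slide; measuring loss as the fraction of neighbors lost within the window is an assumption, as are the class and function names. Losses are assumed to be recorded in time order.

```python
from collections import deque

WINDOW_SECONDS = 30  # history length from the slide
THRESHOLD = 0.5      # 50% threshold from the slide


class LossHistory:
    """Sliding window of neighbor-loss events, tagged by neighbor degree (1 or 2)."""

    def __init__(self, window=WINDOW_SECONDS):
        self.window = window
        self.events = deque()  # (timestamp, degree), oldest first

    def record_loss(self, timestamp, degree):
        self.events.append((timestamp, degree))

    def loss_fractions(self, now, first_total, second_total):
        """Return (1st degree loss, 2nd degree loss) as fractions within the window."""
        while self.events and self.events[0][0] < now - self.window:
            self.events.popleft()  # forget losses older than the window
        lost1 = sum(1 for _, d in self.events if d == 1)
        lost2 = sum(1 for _, d in self.events if d == 2)
        return (lost1 / first_total if first_total else 0.0,
                lost2 / second_total if second_total else 0.0)


def looks_malicious(loss1, loss2, threshold=THRESHOLD):
    """Flag an attack when 2nd degree loss exceeds both 1st degree loss and the threshold."""
    return loss2 > loss1 and loss2 > threshold
```

For example, losing 1 of 5 first-degree neighbors and 12 of 20 second-degree neighbors inside the window gives fractions of 0.2 and 0.6, which this rule marks as malicious.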
Reacting to Failures • For each neighbor lost, replace it with a node from the virtual exponential network • Only nodes local to an attack will switch, preserving the rest of the network structure • Do not attempt to discover additional random nodes during an attack • When attack is deemed to be over, return to normal operations
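A minimal sketch of the switch-over step: each lost neighbor is replaced by one node from the backup list. The list-based representation and the name `replace_lost_neighbors` are assumptions; the real client would also open the connection and handle a refused one.

```python
def replace_lost_neighbors(neighbors, lost, backups):
    """Swap lost neighbors for backup nodes.

    Returns (new_neighbors, remaining_backups). No new discovery happens here,
    matching the rule that nodes do not search for random nodes during an attack.
    """
    new_neighbors = [n for n in neighbors if n not in lost]
    # Skip backups that are themselves lost or already connected.
    usable = [b for b in backups if b not in lost and b not in new_neighbors]
    needed = len(neighbors) - len(new_neighbors)
    new_neighbors.extend(usable[:needed])
    return new_neighbors, usable[needed:]
```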
P2P Simulator • Generalized P2P network simulator • Handles message routing, time management • Support for bringing nodes up or down, injecting failures, logging • Also created a compatible Gnutella client, and our enhanced Gnutella client • About 5k lines of Java
Modeling Gnutella • No standard way to do this • Protocol only specifies message formats • Clients free to implement other aspects • Some degree of standardization • We used the most common client in our simulation model - Bearshare
Bootstrapping • How do nodes connect in our simulation? • Defunct www.gnutellahosts.com • Maintained a list of highly available, well-connected nodes • Clients connect by receiving one of these nodes • Bearshare clients do something similar • Connect to service “public.bearshare.net” • Keep a range of neighbors (3-10)
Uptime Distribution • How long do nodes stay up in our simulation? • Modeled by a power law function • Most nodes are up for a short period of time, few are up for a long period • Many users just sign off after getting their content • Most users are dialup users • Within a reasonable time slice, nodes have uptimes following the power law distribution
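One way to draw power-law uptimes in a simulator is a Pareto distribution, sketched below with the standard library. The slide gives neither an exponent nor a minimum uptime, so `alpha=1.5` and `min_uptime_s=60.0` are placeholder assumptions, as is the function name.

```python
import random


def sample_uptime(min_uptime_s=60.0, alpha=1.5, rng=random):
    """Draw one node uptime in seconds from a Pareto (power-law) distribution.

    random.paretovariate(alpha) returns values >= 1, so the result is
    always at least min_uptime_s. Both parameters are illustrative only.
    """
    return min_uptime_s * rng.paretovariate(alpha)


rng = random.Random(0)  # seeded for repeatable simulation runs
uptimes = [sample_uptime(rng=rng) for _ in range(1000)]
```

Most samples land close to the minimum while a few are very long, matching the "most nodes are up for a short period, few for a long period" shape.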
Our Experiments • Ran with recovery method and without • No failures – just ran our simulator without removing any nodes (control) • Malicious attack on most highly connected nodes
Malicious Attack • Ran the experiment for 10 minutes • We removed 5% of the most highly connected nodes over a 5 minute interval in the middle • Representative of a coordinated distributed attack on the network
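The node selection for such an attack can be sketched as below. It removes the top fraction of nodes by degree in one step, whereas the experiment spread the removals over a 5 minute interval; rounding the victim count up and the dict-of-lists graph are assumptions.

```python
import math


def remove_top_connected(graph, fraction=0.05):
    """Remove the most highly connected nodes from an undirected graph.

    graph: dict mapping node -> list of neighbors (assumed representation).
    Returns (surviving_graph, victims); the input graph is left unmodified.
    """
    count = math.ceil(fraction * len(graph))
    by_degree = sorted(graph, key=lambda n: len(graph[n]), reverse=True)
    victims = set(by_degree[:count])
    survivors = {
        node: [m for m in nbrs if m not in victims]
        for node, nbrs in graph.items()
        if node not in victims
    }
    return survivors, victims
```

In a simulator, the victims could be scheduled for removal at evenly spaced times across the attack interval instead of all at once.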
Metrics • Large number of metrics that we could have used • We picked metrics that measure • How partitioned the network is • How useful the network is in sending queries
Size of Largest Connected Component • Largest set of nodes V such that any two nodes v_m and v_n in V have a path between them • Measures the number of nodes that can potentially communicate with each other • Can get any data from any other node
# of Connected Components • Number of separate pieces of the network • If number of CC’s is large then the network is heavily partitioned • Not possible to retrieve content between CC’s • Want to monitor this number to make sure it is not increasing
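Both partition metrics, the size of the largest connected component and the number of connected components, can be computed with one breadth-first search pass. This is a generic sketch, not the simulator's code; it assumes an undirected graph stored as a dict of neighbor lists in which every node appears as a key.

```python
from collections import deque


def connected_components(graph):
    """Return the connected components of an undirected graph as lists of nodes."""
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        component = []
        queue = deque([start])
        while queue:
            node = queue.popleft()
            component.append(node)
            for nbr in graph[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    queue.append(nbr)
        components.append(component)
    return components


def partition_metrics(graph):
    """Return (number of components, size of the largest component)."""
    components = connected_components(graph)
    largest = max((len(c) for c in components), default=0)
    return len(components), largest
```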
Nodes Reachable Within 6 Hops • Sum of the number of 1st, 2nd, ..., 6th degree neighbors of a node • End-to-end measurement of how many nodes you can reach with a query • Typically queries are forwarded about 6 hops • Rough estimate of the number of nodes a user can search
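This metric amounts to a breadth-first search that stops expanding at depth 6. The sketch below counts each node once at its shortest distance and excludes the source; the graph representation and the name `reachable_within` are assumptions.

```python
from collections import deque


def reachable_within(graph, source, max_hops=6):
    """Count nodes reachable from `source` in at most `max_hops` hops."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not forward past the hop limit
        for nbr in graph[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return len(dist) - 1  # exclude the source itself
```

Averaging this count over all nodes, or over a sample of them, gives a single number per simulation snapshot.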
Conclusions • By planning for and detecting failures our recovery method can drastically increase the likelihood that the network will not become partitioned • It lessens the impact of malicious attacks on the querying capability of the network
Further Work • Investigating other techniques for random node discovery • Restoring network to a scale free topology immediately following failures • How the Gnutella network has changed over time
Thanks • Stefan Saroiu and Steven Gribble for letting us use their data and giving us advice • Armando Fox, George Candea, Dave Patterson, Aaron Brown • Bling-Bling Industries, 2001