Peer Pressure: Distributed Recovery in Gnutella

Pedram Keyani

Brian Larson

Muthukumar Senthil

Computer Science Department

Stanford University

Introduction


  • Gnutella is a P2P file sharing protocol

  • The issue we are addressing is distributed recovery from malicious attacks in Gnutella

  • Our solution is a mechanism for proactive failure detection and recovery

  • Our experimental process and models

  • The fruits of our labor: RESULTS!

Failure in Gnutella

  • Failure of nodes in Gnutella can be caused by any number of reasons

  • Failure of 4% of the most highly connected nodes in Gnutella fragments the network to the point where it is unusable by anyone

  • The exact details of this are outlined in work done by Stefan Saroiu

Scale Free Networks (Gnutella, Internet)

  • Abide by power law where

    • # of nodes of degree N is proportional to $N^{-\lambda}$

    • Lambda is observed to be roughly 2.3

  • Scale-free networks are highly resilient to large-scale random failures but vulnerable to malicious attacks on the most highly connected, well-known nodes
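
As a rough illustration of the power law above, the sketch below computes the share of nodes at each degree for lambda = 2.3. The function name `degree_shares` and the cutoff `max_degree` are assumptions made for this example; they do not come from the presentation.

```python
def degree_shares(lam=2.3, max_degree=100):
    """Return {degree: share of nodes}, with share proportional to degree**-lam."""
    # max_degree is an illustrative cutoff so the shares can be normalized.
    weights = {k: k ** -lam for k in range(1, max_degree + 1)}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}

shares = degree_shares()
# Low-degree nodes dominate; highly connected hubs are rare.
print(f"degree 1: {shares[1]:.3f}, degree 10: {shares[10]:.5f}")
```

Because hubs are so rare, removing a small fraction of nodes at random seldom hits one, while a targeted attack removes exactly those nodes.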

Exponential Networks

  • Connections between nodes are random

    • The absence of preferential connections ensures that no single node holds the entire network together

  • They react the same way to malicious attacks and random failures

Our Hypothesis

To allow Gnutella to recover from malicious attacks, nodes must plan for failures by discovering and maintaining backup connections that form an exponential network. These backups are used to replace dead neighbors in the case of a malicious attack.

Recovery Method

  • Build and maintain a virtual exponential network connecting all the nodes

  • Accomplish this through random node discovery

  • Detect malicious attacks on active network

  • Switch over to exponential network

Random Node Discovery

  • Problem: no centralized name authority to give a truly random node

  • Solution: use random walks through the network to arrive at random node

  • A Random Discovery Ping (RDP) is forwarded to only one of a node’s neighbors, selected in such a way as to give a random distribution

  • RDPs use a hop count of 20, roughly equal to the network diameter
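
The random walk behind an RDP might be sketched as follows. This is a minimal sketch, not the authors' implementation: the function name, the adjacency-list `graph` format, and the uniform choice of neighbor are all assumptions, since the slides do not say how the forwarding neighbor is selected.

```python
import random

def random_discovery_ping(graph, start, hops=20, rng=random):
    """Walk `hops` steps from `start`, forwarding to one neighbor per step.

    `graph` maps each node to a list of its neighbors. Uniform neighbor
    choice is an assumption; the slides do not specify the selection rule.
    """
    node = start
    for _ in range(hops):
        neighbors = graph[node]
        if not neighbors:      # dead end: the walk stops here
            break
        node = rng.choice(neighbors)
    return node

# A five-node ring, purely for illustration.
ring = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
discovered = random_discovery_ping(ring, start=0)
```

A plain uniform walk tends to end on well-connected nodes more often than on poorly connected ones, so a real client would likely need some correction to approach a uniform sample.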

Maintenance of Virtual Exponential Network

  • Each node discovers N random nodes, where N is the minimum number of connections the node wants to maintain

  • Then periodically ping these nodes to make sure they are alive

  • Discover new neighbors to replace them should they die
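
One maintenance round could look like the sketch below. The callbacks `is_alive` and `discover` are hypothetical stand-ins for the periodic ping and the random discovery ping; the retry bound is also an assumption added to keep the loop finite.

```python
def maintain_backups(backups, wanted, is_alive, discover):
    """Drop dead backups, then discover replacements until `wanted` are held.

    `is_alive(node)` and `discover()` are hypothetical callbacks standing in
    for the periodic ping and the random discovery ping.
    """
    live = {n for n in backups if is_alive(n)}
    attempts = 0
    while len(live) < wanted and attempts < 10 * wanted:  # bound the retries
        candidate = discover()
        attempts += 1
        if candidate is not None and is_alive(candidate):
            live.add(candidate)
    return live

# Node 2 has died; discovery offers 2 again, then 4, then 5.
offers = iter([2, 4, 5])
refreshed = maintain_backups({1, 2, 3}, wanted=3,
                             is_alive=lambda n: n != 2,
                             discover=lambda: next(offers))
```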

Failure Detection

  • Random failures result in loss of 1st degree neighbors

  • Malicious attacks result in greater loss of 2nd degree neighbors than 1st degree

  • Keep a history (30 seconds) of 1st and 2nd degree neighbor loss

  • If 2nd-degree loss exceeds both 1st-degree loss and a threshold (50%), mark the failure as malicious
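
The detection rule above can be sketched as a small class. The slides do not say what the 50% is measured against, so this sketch assumes that losses are compared as fractions of the known 1st- and 2nd-degree neighbors; the class and method names are likewise illustrative.

```python
from collections import deque

class FailureDetector:
    """Classify recent neighbor loss as malicious or random.

    Assumption: the 50% threshold is read as the fraction of known
    2nd-degree neighbors lost within the history window.
    """
    def __init__(self, window=30.0, threshold=0.5):
        self.window = window
        self.threshold = threshold
        self.losses = deque()   # (timestamp, degree), recorded in time order

    def record_loss(self, timestamp, degree):
        self.losses.append((timestamp, degree))

    def is_malicious(self, now, first_total, second_total):
        # Forget losses older than the history window.
        while self.losses and now - self.losses[0][0] > self.window:
            self.losses.popleft()
        first = sum(1 for _, d in self.losses if d == 1) / max(first_total, 1)
        second = sum(1 for _, d in self.losses if d == 2) / max(second_total, 1)
        return second > first and second > self.threshold

detector = FailureDetector()
detector.record_loss(0.0, degree=1)
for t in range(1, 7):
    detector.record_loss(float(t), degree=2)
under_attack = detector.is_malicious(now=10.0, first_total=4, second_total=10)
```

Here one of four 1st-degree neighbors and six of ten 2nd-degree neighbors were lost inside the window, so the rule fires.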

Reacting to Failures

  • For each neighbor lost, replace it with a node from the virtual exponential network

  • Only nodes local to an attack will switch, preserving the rest of the network structure

  • Do not attempt to discover additional random nodes during an attack

  • When attack is deemed to be over, return to normal operations
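
A minimal sketch of the switch-over step follows, assuming the backups are held as an ordered list and are consumed first-in, first-out. The function name and return shape are illustrative only.

```python
def react_to_losses(neighbors, lost, backups):
    """Replace each lost neighbor with a node drawn from the backup list.

    Returns (new_neighbors, remaining_backups); inputs are not modified.
    No new random discovery happens here, matching the slide.
    """
    new_neighbors = set(neighbors) - set(lost)
    remaining = [b for b in backups if b not in new_neighbors and b not in lost]
    for _ in lost:
        if not remaining:
            break                      # ran out of backups
        new_neighbors.add(remaining.pop(0))
    return new_neighbors, remaining

active, spare = react_to_losses({"a", "b", "c"}, lost={"b"}, backups=["x", "y"])
```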

P2P Simulator

  • Generalized P2P network simulator

  • Handles message routing, time management

  • Support for bringing nodes up or down, injecting failures, logging

  • Also created a compatible Gnutella client, and our enhanced Gnutella client

  • About 5k lines of Java

Modeling Gnutella

  • No standard way to do this

    • Protocol only specifies message formats

    • Clients free to implement other aspects

    • Some degree of standardization

  • We based our simulation model on the most common client, BearShare


  • How do nodes connect in our simulation?

  • Defunct

    • Maintain list of highly-available, well-connected nodes

    • Clients connect by receiving one of these nodes

  • BearShare clients do something similar

    • Connect to service “”

    • Keep a range of neighbors (3-10)

Uptime Distribution

  • How long do nodes stay up in our simulation?

  • Modeled by a power law function

  • Most nodes are up for a short period of time, few are up for a long period

    • Many users just sign off after getting their content

    • Most users are dialup users

  • Within a reasonable time slice, nodes have uptimes following the power law distribution
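
Power-law uptimes of this kind could be drawn from a Pareto distribution, as in the sketch below. The slides give no parameters, so `min_uptime` and `alpha` are placeholder values chosen only for illustration.

```python
import random

def sample_uptimes(count, min_uptime=60.0, alpha=1.5, seed=None):
    """Draw `count` node uptimes (seconds) from a Pareto (power-law) distribution.

    min_uptime and alpha are illustrative assumptions, not values from the slides.
    """
    rng = random.Random(seed)
    # paretovariate returns values >= 1, so min_uptime sets the shortest session.
    return [min_uptime * rng.paretovariate(alpha) for _ in range(count)]

uptimes = sample_uptimes(1000, seed=1)
```

Most sampled sessions sit close to `min_uptime`, with a few very long ones, which matches the shape described above.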

Our Experiments

  • Ran with recovery method and without

  • No failures – just ran our simulator without removing any nodes (control)

  • Malicious attack on most highly connected nodes

Malicious Attack

  • Ran the experiment for 10 minutes

  • We removed 5% of the most highly connected nodes over a 5 minute interval in the middle

  • Representative of a coordinated distributed attack on the network
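
The attack could be scheduled roughly as below. Ranking nodes once by their initial degree, removing the highest degree first, and spacing removals evenly are all assumptions; the slides only state the 5% figure and the 5-minute interval in the middle of a 10-minute run.

```python
def attack_schedule(graph, fraction=0.05, start=150.0, duration=300.0):
    """Return [(time, node)] removing the top `fraction` of nodes by degree.

    Removals are spread evenly over `duration` seconds beginning at `start`;
    even spacing and highest-degree-first ordering are assumptions.
    """
    ranked = sorted(graph, key=lambda n: len(graph[n]), reverse=True)
    if not ranked:
        return []
    targets = ranked[:max(1, int(len(ranked) * fraction))]
    step = duration / len(targets)
    return [(start + i * step, node) for i, node in enumerate(targets)]

# A 20-node star: node 0 is the hub, so it is the only target at 5%.
star = {0: list(range(1, 20)), **{i: [0] for i in range(1, 20)}}
schedule = attack_schedule(star)
```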


  • Large number of metrics that we could have used

  • We picked metrics that measure

    • How partitioned the network is

    • How useful the network is in sending queries

Size of Largest Connected Component

  • Largest set of nodes $V$ such that any $v_m, v_n \in V$ have a path between them

  • Measures the number of nodes that can potentially communicate with each other

  • Can get any data from any other node

# of Connected Components

  • Number of separate pieces of the network

  • If number of CC’s is large then the network is heavily partitioned

    • Not possible to retrieve content between CC’s

    • Want to monitor this number to make sure it is not increasing
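
Both partition metrics, the size of the largest connected component and the number of components, can be computed with a breadth-first search. The sketch below assumes an undirected graph stored as an adjacency-list dict; the function names are illustrative.

```python
from collections import deque

def connected_components(graph):
    """Return a list of components, each a set of nodes (undirected graph)."""
    seen, components = set(), []
    for root in graph:
        if root in seen:
            continue
        component, queue = {root}, deque([root])
        seen.add(root)
        while queue:
            node = queue.popleft()
            for nxt in graph.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    component.add(nxt)
                    queue.append(nxt)
        components.append(component)
    return components

def partition_metrics(graph):
    """Return (size of largest component, number of components)."""
    comps = connected_components(graph)
    return (max(map(len, comps), default=0), len(comps))

example = {1: [2], 2: [1], 3: [4], 4: [3, 5], 5: [4], 6: []}
largest, count = partition_metrics(example)
```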

Nodes Reachable Within 6 Hops

  • Sum of the numbers of 1st-, 2nd-, ..., 6th-degree neighbors of a node

  • End to end measurement of how many nodes you can reach with a query

    • Typically queries are forwarded about 6 hops

    • Rough estimate of the number of nodes a user can search.
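
This metric is a depth-limited breadth-first search. A minimal sketch, again assuming an adjacency-list dict and counting each reachable node once (the starting node itself is excluded):

```python
from collections import deque

def reachable_within(graph, start, max_hops=6):
    """Count distinct nodes within `max_hops` of `start`, excluding `start`."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:     # do not expand beyond the hop limit
            continue
        for nxt in graph.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return len(dist) - 1

# A 10-node path 0-1-...-9, purely for illustration.
path = {i: [j for j in (i - 1, i + 1) if 0 <= j <= 9] for i in range(10)}
from_end = reachable_within(path, 0)
```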


  • By planning for and detecting failures our recovery method can drastically increase the likelihood that the network will not become partitioned

  • It lessens the impact of malicious attacks on the querying capability of the network

Further Work

  • Investigating other techniques for random node discovery

  • Restoring network to a scale free topology immediately following failures

  • How the Gnutella network has changed over time


  • Stefan Saroiu and Steven Gribble for letting us use their data and giving us advice

  • Armando Fox, George Candea, Dave Patterson, Aaron Brown

Bling-Bling Industries, 2001