

Peer Pressure: Distributed Recovery in Gnutella

Pedram Keyani

Brian Larson

Muthukumar Senthil

Computer Science Department

Stanford University


Introduction

  • Gnutella is a P2P file sharing protocol

  • The issue we are addressing is distributed recovery from malicious attacks in Gnutella

  • Our solution is a mechanism for proactive failure detection and recovery

  • Our experimental process and models

  • The fruits of our labor: RESULTS!


Failure in Gnutella

  • Failure of nodes in Gnutella can be caused by any number of reasons

  • Failure of 4% of the most highly connected nodes in Gnutella fragments the network to the point where it is unusable by anyone

  • The exact details of this are outlined in work done by Stefan Saroiu


Scale Free Networks (Gnutella, Internet)

  • Abide by power law where

    • # of nodes of degree N is proportional to N^(-lambda)

    • Lambda is observed to be roughly 2.3

  • Scale-free networks are highly resilient to large-scale random failures but vulnerable to malicious attacks on the most highly connected, well-known nodes
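
The contrast between random failures and targeted attacks can be illustrated with a small simulation. The sketch below is not the authors' simulator: it grows a toy graph by preferential attachment (one common way to get a power-law-like degree distribution), then removes 5% of the nodes either at random or by highest degree. The graph size, the attachment parameter `m`, and the seed are all assumptions chosen for illustration.

```python
import random
from collections import deque


def preferential_attachment_graph(n, m, rng):
    """Grow a graph where each new node links to m existing nodes,
    chosen with probability proportional to their current degree."""
    adj = {i: set() for i in range(n)}
    endpoints = []  # each node appears once per incident edge
    # Start from a small fully connected core of m + 1 nodes.
    for a in range(m + 1):
        for b in range(a + 1, m + 1):
            adj[a].add(b)
            adj[b].add(a)
            endpoints += [a, b]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            endpoints += [new, t]
    return adj


def largest_component(adj, removed):
    """Size of the largest connected component after deleting `removed`."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        best = max(best, size)
    return best


rng = random.Random(1)
graph = preferential_attachment_graph(n=2000, m=2, rng=rng)
k = len(graph) // 20  # remove 5% of the nodes

random_failures = rng.sample(sorted(graph), k)
hubs = sorted(graph, key=lambda v: len(graph[v]), reverse=True)[:k]

lcc_random = largest_component(graph, random_failures)
lcc_attack = largest_component(graph, hubs)
print("largest component after random failures:", lcc_random)
print("largest component after hub attack:     ", lcc_attack)
```

In this toy model the hub attack should leave a noticeably smaller largest component than the same number of random failures, which is the qualitative behavior the slide describes.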


Exponential Networks

  • Connections between nodes are random

    • The absence of preferential connections ensures that no single node holds the entire network together

  • They react the same way to malicious attacks and random failures


Scale Free and Exponential


Our Hypothesis

To allow Gnutella to recover from malicious attacks, nodes must plan for failures by discovering and maintaining backup connections that form an exponential network. These backups are used to replace dead neighbors in the case of a malicious attack.


Recovery Method

  • Build and maintain a virtual exponential network connecting all the nodes

  • Accomplish this through random node discovery

  • Detect malicious attacks on active network

  • Switch over to exponential network


Random Node Discovery

  • Problem: no centralized name authority to give a truly random node

  • Solution: use random walks through the network to arrive at random node

  • Random Discovery Ping (RDP) is forwarded to only one of a node’s neighbors, selected in such a way as to give a random distribution

  • RDPs use a hop count of 20, roughly equal to the network diameter
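
A random walk of this kind can be sketched in a few lines. The slides do not say how the next neighbor is selected, so the uniform choice below is an assumption (a uniform walk tends to favor high-degree nodes, which is presumably why the slides call for a more careful selection rule). The function name `random_discovery_ping` and the adjacency-dict representation are hypothetical.

```python
import random


def random_discovery_ping(adj, origin, hops=20, rng=random):
    """Walk `hops` steps from `origin`, forwarding to one neighbor per step,
    and return the node where the walk ends."""
    current = origin
    for _ in range(hops):
        neighbors = adj[current]
        if not neighbors:
            break  # dead end: the walk stops here
        # Assumption: uniform choice among neighbors.
        current = rng.choice(neighbors)
    return current


# Toy ring of 8 nodes, each linked to its two neighbors.
ring = {i: [(i - 1) % 8, (i + 1) % 8] for i in range(8)}
found = random_discovery_ping(ring, origin=0, hops=20, rng=random.Random(7))
print("discovered node:", found)
```

The hop count plays the role of a TTL: once it is used up, the node holding the ping is the one reported back as the random node.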


Maintenance of Virtual Exponential Network

  • Each node discovers N random nodes, where N is the minimum number of connections the node wants to maintain

  • Then periodically ping these nodes to make sure they are alive

  • Discover new neighbors to replace them should they die
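
The maintenance loop above could look roughly like the following sketch. `BackupSet`, `discover`, and `is_alive` are hypothetical names: `discover` stands in for a random discovery ping and `is_alive` for a liveness ping, and both are stubbed out in the usage lines.

```python
class BackupSet:
    """Keeps `target` live backup nodes, replacing any that stop answering."""

    def __init__(self, target, discover, is_alive):
        self.target = target        # N: minimum connections the node wants
        self.discover = discover    # hypothetical: returns one random node
        self.is_alive = is_alive    # hypothetical: pings a node, returns bool
        self.backups = set()

    def refresh(self, max_attempts=100):
        """One maintenance round: drop dead backups, then refill to `target`."""
        self.backups = {n for n in self.backups if self.is_alive(n)}
        attempts = 0
        while len(self.backups) < self.target and attempts < max_attempts:
            candidate = self.discover()
            attempts += 1
            if self.is_alive(candidate):
                self.backups.add(candidate)
        return self.backups


# Stubbed network: nodes 1..6 exist, discovery returns them in order.
alive = {1, 2, 3, 4, 5, 6}
candidates = iter([1, 2, 3, 4, 5, 6])
node_backups = BackupSet(3, lambda: next(candidates), lambda n: n in alive)

first_round = set(node_backups.refresh())
alive.discard(2)  # node 2 dies between maintenance rounds
second_round = set(node_backups.refresh())
print(first_round, second_round)
```

The `max_attempts` cap is an added safeguard so a round terminates even when discovery keeps returning dead or duplicate nodes.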


Failure Detection

  • Random failures result in loss of 1st degree neighbors

  • Malicious attacks result in greater loss of 2nd degree neighbors than 1st degree

  • Keep a history (30 seconds) of 1st and 2nd degree neighbor loss

  • If 2nd degree loss exceeds both 1st degree loss and a threshold (50%), mark the failure as malicious
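
One possible reading of this rule is sketched below. The slides do not define how "loss" is measured, so this sketch assumes it is the fraction of known neighbors at each degree that disappeared within the 30-second window, and that the 50% threshold applies to the 2nd degree fraction. `AttackDetector` and its methods are hypothetical names.

```python
from collections import deque


class AttackDetector:
    """Sliding-window classifier for neighbor losses."""

    def __init__(self, window=30.0, threshold=0.5):
        self.window = window        # seconds of history to keep
        self.threshold = threshold  # assumed: fraction of 2nd degree lost
        self.losses = deque()       # (timestamp, degree), degree is 1 or 2

    def record_loss(self, now, degree):
        # Losses are expected to be recorded in time order.
        self.losses.append((now, degree))

    def is_malicious(self, now, first_degree_count, second_degree_count):
        """Counts are the neighbors known at the start of the window."""
        while self.losses and self.losses[0][0] < now - self.window:
            self.losses.popleft()
        lost1 = sum(1 for _, d in self.losses if d == 1)
        lost2 = sum(1 for _, d in self.losses if d == 2)
        frac1 = lost1 / first_degree_count if first_degree_count else 0.0
        frac2 = lost2 / second_degree_count if second_degree_count else 0.0
        return frac2 > frac1 and frac2 > self.threshold


detector = AttackDetector()
detector.record_loss(now=0.0, degree=1)      # 1 of 4 direct neighbors lost
for t in range(1, 7):
    detector.record_loss(now=float(t), degree=2)  # 6 of 10 second-degree lost

during = detector.is_malicious(10.0, first_degree_count=4, second_degree_count=10)
later = detector.is_malicious(100.0, first_degree_count=4, second_degree_count=10)
print(during, later)
```

With these numbers the 2nd degree loss fraction (0.6) exceeds both the 1st degree fraction (0.25) and the threshold, so the first check flags an attack; by the second check every loss has aged out of the window.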


Reacting to Failures

  • For each neighbor lost, replace it with a node from the virtual exponential network

  • Only nodes local to an attack will switch, preserving the rest of the network structure

  • Do not attempt to discover additional random nodes during an attack

  • When attack is deemed to be over, return to normal operations
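
The switch-over step might be expressed as below. This is a sketch under assumptions: the function name and arguments are hypothetical, backups are consumed in list order, and a node that is not under attack simply drops the dead neighbor and leaves its backups untouched (normal reconnection is outside the sketch). No discovery happens inside the function, in line with the rule about not discovering nodes during an attack.

```python
def replace_lost_neighbors(neighbors, backups, lost, under_attack):
    """Return updated (neighbors, backups) after the nodes in `lost` die."""
    neighbors = set(neighbors)
    backups = list(backups)
    dead = neighbors & set(lost)
    neighbors -= dead
    if under_attack:
        needed = len(dead)
        # Promote one backup per lost neighbor, skipping unusable candidates.
        while needed and backups:
            candidate = backups.pop(0)
            if candidate not in neighbors and candidate not in lost:
                neighbors.add(candidate)
                needed -= 1
    return neighbors, backups


active, spare = replace_lost_neighbors(
    {1, 2, 3}, [7, 8, 9], lost=[2, 3], under_attack=True)
calm, untouched = replace_lost_neighbors(
    {1, 2, 3}, [7, 8, 9], lost=[2], under_attack=False)
print(active, spare)
print(calm, untouched)
```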


P2P Simulator

  • Generalized P2P network simulator

  • Handles message routing, time management

  • Support for bringing nodes up or down, injecting failures, logging

  • Also created a compatible Gnutella client, and our enhanced Gnutella client

  • About 5k lines of Java


Modeling Gnutella

  • No standard way to do this

    • Protocol only specifies message formats

    • Clients free to implement other aspects

    • Some degree of standardization

  • We used the most common client in our simulation model - Bearshare


Bootstrapping

  • How do nodes connect in our simulation?

  • Defunct www.gnutellahosts.com

    • Maintain list of highly-available, well-connected nodes

    • Clients connect by receiving the address of one of these nodes

  • Bearshare clients do something similar

    • Connect to service “public.bearshare.net”

    • Keep a range of neighbors (3-10)


Uptime Distribution

  • How long do nodes stay up in our simulation?

  • Modeled by a power law function

  • Most nodes are up for a short period of time, few are up for a long period

    • Many users just sign off after getting their content

    • Most users are dialup users

  • Within a reasonable time slice, nodes have uptimes following the power law distribution
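
A power-law uptime model can be sampled with a Pareto distribution, as in this sketch. The slides give no parameters, so the minimum uptime of 60 seconds and the exponent `alpha=1.5` are placeholders, not values from the study.

```python
import random


def sample_uptimes(count, min_uptime=60.0, alpha=1.5, rng=random):
    """Draw `count` uptimes (seconds) from a Pareto (power-law) distribution."""
    return [min_uptime * rng.paretovariate(alpha) for _ in range(count)]


uptimes = sorted(sample_uptimes(10_000, rng=random.Random(42)))
median = uptimes[len(uptimes) // 2]
mean = sum(uptimes) / len(uptimes)
print(f"median uptime: {median:.0f} s, mean uptime: {mean:.0f} s")
```

The median sits well below the mean, which matches the description above: most nodes leave quickly while a few long-lived nodes pull the average up.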


Our Experiments

  • Ran with recovery method and without

  • No failures – just ran our simulator without removing any nodes (control)

  • Malicious attack on most highly connected nodes


Malicious Attack

  • Ran the experiment for 10 minutes

  • We removed 5% of the most highly connected nodes over a 5 minute interval in the middle

  • Representative of a coordinated distributed attack on the network
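
The attack schedule can be written down concretely. In the sketch below the window runs from 150 s to 450 s, which is what "a 5 minute interval in the middle" of a 10 minute run works out to; spacing the removals evenly inside that window is an assumption, since the slides do not say how removals were timed. `attack_schedule` is a hypothetical helper.

```python
def attack_schedule(degrees, fraction=0.05, start=150.0, duration=300.0):
    """Return (time, node) removals for the highest-degree nodes.

    `degrees` maps node -> degree. Removals are spread evenly over
    [start, start + duration); the even spacing is an assumption.
    """
    count = int(len(degrees) * fraction)
    victims = sorted(degrees, key=degrees.get, reverse=True)[:count]
    step = duration / count if count else 0.0
    return [(start + i * step, node) for i, node in enumerate(victims)]


# Toy network of 100 nodes where node "n<i>" has degree i.
degrees = {f"n{i}": i for i in range(100)}
schedule = attack_schedule(degrees)
for when, node in schedule:
    print(f"t={when:.0f}s remove {node}")
```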


Metrics

  • Large number of metrics that we could have used

  • We picked metrics that measure

    • How partitioned the network is

    • How useful the network is in sending queries


Size of Largest Connected Component

  • Largest set of nodes V such that any v_m and v_n in V have a path between each other

  • Measures the number of nodes that can potentially communicate with each other

  • Can get any data from any other node


# of Connected Components

  • Number of separate pieces of the network

  • If number of CC’s is large then the network is heavily partitioned

    • Not possible to retrieve content between CC’s

    • Want to monitor this number to make sure it is not increasing
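
Both partition metrics, the number of connected components and the size of the largest one, fall out of a single breadth-first traversal. The sketch below assumes the network snapshot is an undirected graph stored as an adjacency dict; the toy `network` is made up for illustration.

```python
from collections import deque


def connected_components(adj):
    """Return a list of components, each a set of nodes, for an undirected graph."""
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        component = set()
        queue = deque([start])
        while queue:
            node = queue.popleft()
            component.add(node)
            for nb in adj[node]:
                if nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        components.append(component)
    return components


network = {
    "a": ["b"], "b": ["a", "c"], "c": ["b"],
    "d": ["e"], "e": ["d"],
    "f": [],
}
parts = connected_components(network)
print("number of CCs:", len(parts))
print("largest CC size:", max(len(p) for p in parts))
```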


Nodes Reachable Within 6 Hops

  • Sum of the number of 1st, 2nd, ..., 6th degree neighbors of a node

  • End to end measurement of how many nodes you can reach with a query

    • Typically queries are forwarded about 6 hops

    • Rough estimate of the number of nodes a user can search.
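
This metric amounts to a breadth-first search that stops expanding at depth 6. A minimal sketch, assuming an adjacency-dict graph and a hypothetical `reachable_within` helper:

```python
from collections import deque


def reachable_within(adj, source, max_hops=6):
    """Count nodes at distance 1..max_hops from `source` (source excluded)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not forward past the hop limit
        for nb in adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return len(dist) - 1


# Toy chain of 10 nodes: 0 - 1 - 2 - ... - 9
chain = {i: [j for j in (i - 1, i + 1) if 0 <= j < 10] for i in range(10)}
print(reachable_within(chain, 0))  # nodes 1..6
print(reachable_within(chain, 5))  # every other node is within 5 hops
```

Dividing the count by the number of live nodes gives a percentage-style figure like the one in the results slide title.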


Results – Largest CC


Results – Number of CCs


Results - % of nodes within 6 hops


Failure Detection Results


Random Node Distribution


Messages Per Node Results


Conclusions

  • By planning for and detecting failures our recovery method can drastically increase the likelihood that the network will not become partitioned

  • It lessens the impact of malicious attacks on the querying capability of the network


Further Work

  • Investigating other techniques for random node discovery

  • Restoring network to a scale free topology immediately following failures

  • How the Gnutella network has changed over time


Thanks

  • Stefan Saroiu and Steven Gribble for letting us use their data and giving us advice

  • Armando Fox, George Candea, Dave Patterson, Aaron Brown

Bling-Bling Industries, 2001

