An Overview of Gnutella

PowerPoint Slideshow about 'An Overview of Gnutella' - felcia


History

The Gnutella network is a fully distributed alternative to the centralized Napster. Initial popularity of the network was spurred on by Napster's threatened legal demise in early 2001.


A generic view





No central authority.


What is Gnutella?

Gnutella is a protocol for distributed search

  • peer-to-peer communication
  • decentralized model
  • Two stages:
    • Join the network (covered later)
    • Use the network, i.e., discover / search other peers

Gnutella Jargon

Servent: A Gnutella node. Each servent is both a server and a client.

Hops: a hop is a pass through an intermediate node.

[Figure: a path between peers, labeled "1 Hop" and "2 Hops".]

TTL: how many hops a packet can travel before it dies (the default setting in Gnutella is 7).

Gnutella Scenario
  • Step 0: Join the network
  • Step 1: Determining who is on the network
    • A "Ping" packet is used to announce your presence on the network.
    • Other peers respond with a "Pong" packet.
    • Each peer also forwards your Ping to its other connected peers.
    • A Pong packet contains:
      • an IP address
      • a port number
      • the amount of data that peer is sharing
    • Pong packets come back via the same route.
  • Step 2: Searching
    • A Gnutella "Query" asks other peers whether they have the file you desire. A Query packet might ask, "Do you have any content that matches the string 'Double Helix'?"
    • Peers check whether they have matches, respond if they do, and forward the packet to their connected peers.
    • This continues until the TTL expires.
  • Step 3: Downloading
    • Peers respond with a "QueryHit" (contains contact info).
    • File transfers use a direct connection, via the HTTP protocol's GET method.

A simple idea, but it lacks scalability, since query flooding wastes bandwidth.

Sometimes an existing object cannot be located because the TTL is limited.

Subsequently, various improved search strategies have been proposed.
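
The flooding behaviour and its two weaknesses (bandwidth cost and TTL-limited reach) can be sketched as a breadth-first traversal. This is only an illustrative model, not the Gnutella wire protocol: the adjacency-dict graph, the function name `flood_query`, and the rule of counting one message per forwarded copy are all assumptions made for the example.

```python
from collections import deque


def flood_query(graph, source, holders, ttl=7):
    """Flood a query from `source` and return (found, messages).

    graph:   dict mapping node -> list of neighbours (undirected; assumed layout)
    holders: set of nodes that store the requested object
    ttl:     maximum number of hops a query copy may travel
    """
    seen = {source}                      # nodes that already handled this query
    frontier = deque([(source, None, ttl)])
    found = set()
    messages = 0
    while frontier:
        node, parent, remaining = frontier.popleft()
        if remaining == 0:
            continue                     # TTL expired: this copy is not forwarded
        for neighbour in graph[node]:
            if neighbour == parent:
                continue                 # do not send the query back to its sender
            messages += 1                # every forwarded copy costs bandwidth
            if neighbour in seen:
                continue                 # duplicate copy is dropped by the receiver
            seen.add(neighbour)
            if neighbour in holders:
                found.add(neighbour)     # this peer would answer with a QueryHit
            frontier.append((neighbour, node, remaining - 1))
    return found, messages


# A chain 0-1-2-3-4 where only node 4 holds the object.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(flood_query(chain, 0, {4}, ttl=2))   # object exists but is out of reach
print(flood_query(chain, 0, {4}, ttl=4))   # object is found
```

On the chain, a TTL of 2 misses an object that does exist four hops away, which is the failure mode noted above. On denser graphs the message count grows much faster than the number of distinct peers reached, because duplicate copies are still transmitted before they are dropped.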

Searching in Gnutella

The topology is dynamic, i.e., constantly changing. How do we model a constantly changing topology? Usually, we begin with a static topology, and later account for the effect of churn.

Modeling topology (measurements provide useful inputs):

  • Random graph
  • Power-law graph

Random graph: Erdős-Rényi model

A random graph G(n, p) is constructed by starting with a set of n vertices and adding edges between pairs of nodes at random. Every possible edge occurs independently with probability p.
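
The G(n, p) construction can be written down almost verbatim. The sketch below is a minimal illustration; the function name `erdos_renyi`, the adjacency-dict output, and the optional `seed` parameter are assumptions chosen for the example.

```python
import random
from itertools import combinations


def erdos_renyi(n, p, seed=None):
    """Return G(n, p) as a dict mapping each vertex to its set of neighbours."""
    rng = random.Random(seed)
    graph = {v: set() for v in range(n)}
    # Visit every possible edge once and include it independently with probability p.
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            graph[u].add(v)
            graph[v].add(u)
    return graph


g = erdos_renyi(1000, 0.01, seed=42)
average_degree = sum(len(nbrs) for nbrs in g.values()) / len(g)
print(average_degree)   # should be close to the expected degree (n - 1) * p
```

Because each of the n - 1 possible edges at a vertex is present with probability p, degrees concentrate around (n - 1)p, so very high-degree nodes are rare in this model.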

Q. Is Gnutella topology a random graph?

Gnutella topology

Gnutella topology is actually a power-law graph (also called a scale-free graph).

What is a power-law graph? The number of nodes with degree k is $c \cdot k^{-r}$.

(Contrast this with a Gaussian distribution, where the number of nodes with degree k is $c \cdot 2^{-k}$.)

Many graphs in nature exhibit power-law characteristics. Examples: in the World Wide Web, the number of pages that have k in-links is proportional to $k^{-2}$; the fraction of scientific papers that receive k citations is proportional to $k^{-3}$; etc.


AT&T Call Graph

How many telephone numbers receive calls from k different telephone numbers?

[Figure: AT&T call graph. Axis labels: # of telephone numbers from which calls were made; # of telephone numbers called.]

[Figure: Gnutella network, power-law link distribution (summer 2000, data provided by Clip2). Axes: number of neighbors vs. proportion of nodes. Power-law fit, exponent 2.07.]

A possible explanation

Nodes join at different times.

The more connections a node has, the more likely it is to acquire new connections ("the rich get richer"). Popular webpages attract new pointers.

It has been shown mathematically that such a growth process produces a power-law network.
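
The "rich get richer" growth process can be simulated in a few lines. The sketch below is one common way to implement degree-proportional attachment; it is an illustration under stated assumptions, not the procedure used for the measurements above. The function name `preferential_attachment`, the parameter `m` (links added per new node), and the seed graph of m + 1 fully connected nodes are all choices made for the example.

```python
import random
from collections import Counter


def preferential_attachment(n, m=2, seed=None):
    """Grow a graph to n nodes (n >= m + 1); each new node links to m existing
    nodes picked with probability proportional to their current degree."""
    rng = random.Random(seed)
    # Assumed starting point: a complete graph on m + 1 nodes.
    graph = {v: set(range(m + 1)) - {v} for v in range(m + 1)}
    # Each node appears in `stubs` once per incident edge, so a uniform draw
    # from this list selects a node with probability proportional to its degree.
    stubs = [v for v, nbrs in graph.items() for _ in nbrs]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        graph[new] = set()
        for t in sorted(targets):
            graph[new].add(t)
            graph[t].add(new)
            stubs.extend([new, t])
    return graph


g = preferential_attachment(2000, m=2, seed=7)
histogram = Counter(len(nbrs) for nbrs in g.values())
for degree in sorted(histogram)[:8]:
    print(degree, histogram[degree])   # counts fall off quickly as degree grows
```

Most nodes end up with a degree close to m, while the earliest nodes tend to accumulate many links, which gives the heavy tail characteristic of a power-law degree distribution.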


Search strategies
  • Flooding
  • Random walk
    • Biased random walk
    • Multiple-walker random walk
  • (Combined with)
    • One-hop replication
    • Two-hop replication
    • k-hop replication
On Random walk

Let p(d) be the probability that a random walk on a d-dimensional lattice returns to the origin. In 1921, Pólya proved that

(1) p(1) = p(2) = 1, but

(2) p(d) < 1 for d > 2.

There are similar results on two walkers meeting each other via random walk.

Search via random walk: the existence of a path does not necessarily mean that such a path can be discovered.
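
Pólya's result can be made tangible with a small Monte Carlo experiment. A finite simulation cannot prove recurrence, so the sketch below only estimates how often a walk returns to the origin within a fixed step budget; the function name `return_frequency` and the default budget of 1000 steps and 200 trials are arbitrary assumptions.

```python
import random


def return_frequency(dim, steps=1000, trials=200, seed=0):
    """Estimate the fraction of walks on a dim-dimensional integer lattice
    that revisit the origin within `steps` steps."""
    rng = random.Random(seed)
    returned = 0
    for _ in range(trials):
        position = [0] * dim
        for _ in range(steps):
            axis = rng.randrange(dim)            # pick one coordinate axis
            position[axis] += rng.choice((-1, 1))  # move one unit along it
            if not any(position):                # all coordinates are zero again
                returned += 1
                break
    return returned / trials


for d in (1, 2, 3):
    print(d, return_frequency(d))
```

The estimate for d = 1 should be close to 1, the estimate for d = 2 approaches 1 only slowly as the step budget grows, and the estimate for d = 3 stays well below 1, in line with the theorem.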

Search via Random Walk

Search metrics

Delay = discovery time in hops

Overhead = total distance covered by the walker

Both should be as small as possible.

For a single random walker, these are equal.

Using k random walkers is a compromise.

For search by flooding, if delay = h, then overhead = $d + d^2 + \dots + d^h$, where d = degree of a node.
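
To get a feel for the flooding overhead, the geometric sum above can be evaluated directly. This is a small worked example under the slide's simplifying assumption that every node has the same degree d; the function names are made up for illustration.

```python
def flooding_overhead(d, h):
    """Overhead d + d^2 + ... + d^h of a flood that takes h hops to succeed."""
    return sum(d ** i for i in range(1, h + 1))


def single_walker_overhead(h):
    """A single random walker covers exactly as many hops as its delay."""
    return h


print(flooding_overhead(4, 7))       # 21844
print(single_walker_overhead(7))     # 7
```

With degree 4 and a delay of 7 hops, flooding costs 21844 units of overhead, whereas a single walker that succeeds after 7 hops costs 7. The walker, however, typically needs far more hops to find the object, which is the tradeoff that k walkers try to balance.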

A simple analysis of random walk

Let p = population of the object, i.e., the fraction of nodes hosting the object.

Let T = TTL (time to live).

Expected hop count:

$$E(h) = 1 \cdot p + 2(1-p)p + 3(1-p)^2 p + \dots + T(1-p)^{T-1} p = \frac{1}{p}\left(1 - (1-p)^T\right) - T(1-p)^T$$

With a large TTL, $E(h) \approx 1/p$, which is intuitive.

With a small TTL, there is a risk that the search will time out before an existing object is located.
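
The closed form can be checked numerically against the term-by-term sum. The snippet below is a sanity check under the same model (each hop independently lands on a hosting node with probability p); the function names are invented for the example.

```python
def expected_hops_sum(p, T):
    """Term-by-term sum: h * (1-p)^(h-1) * p for h = 1..T."""
    return sum(h * (1 - p) ** (h - 1) * p for h in range(1, T + 1))


def expected_hops_closed(p, T):
    """Closed form: (1/p) * (1 - (1-p)^T) - T * (1-p)^T."""
    q = 1 - p
    return (1 - q ** T) / p - T * q ** T


def timeout_probability(p, T):
    """Probability that none of the T hops lands on a hosting node."""
    return (1 - p) ** T


p = 0.1
for T in (7, 50, 1000):
    print(T, expected_hops_sum(p, T), expected_hops_closed(p, T),
          timeout_probability(p, T))
```

For p = 0.1 the two expressions agree for every T, the value approaches 1/p = 10 as T grows, and with T = 7 the search times out in roughly 48% of cases even though the object exists.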

K random walkers

Assume that all k walkers start in unison. The probability that none of them finds the object after one hop is $(1-p)^k$. The probability that none has succeeded after T hops is $(1-p)^{kT}$. So the probability that at least one walker succeeds is $1-(1-p)^{kT}$. A typical assumption is that the search is abandoned as soon as at least one walker succeeds.

As k increases, the overhead increases, but the delay decreases. There is a tradeoff.
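
The tradeoff can be explored numerically with the success probability derived above. The helper names below are hypothetical, and `walkers_needed` assumes 0 < p < 1 and a target strictly below 1 so that the loop terminates.

```python
def success_probability(p, k, T):
    """Probability that at least one of k walkers finds the object within T hops."""
    return 1 - (1 - p) ** (k * T)


def walkers_needed(p, T, target):
    """Smallest k whose success probability reaches `target`."""
    k = 1
    while success_probability(p, k, T) < target:
        k += 1
    return k


p, T = 0.01, 7
for k in (1, 8, 33):
    # k * T is the worst-case overhead when no walker succeeds early.
    print(k, k * T, round(success_probability(p, k, T), 4))
print(walkers_needed(p, T, 0.9))
```

With p = 0.01 and T = 7, a single walker succeeds only about 7% of the time, and 33 walkers are needed to reach a 90% success probability, at a worst-case overhead of 231 hops.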

Increasing search efficiency

Major strategies

Biased walk utilizing node degree heterogeneity.

Utilizing structural properties like random graph, power-law graphs, or small-world properties

Topology adaptation for faster search

Introducing two layers in the graph structure using supernodes

One hop replication

Each node keeps track of the indices of the files belonging to its immediate neighbors. As a result, high-capacity / high-degree nodes can provide useful clues to a large number of search queries.
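
One-hop replication amounts to each node holding a small index of its neighbors' files. The sketch below shows one possible data layout; the names `build_one_hop_index` and `lookup`, and the dict-of-sets representation, are assumptions made for illustration.

```python
def build_one_hop_index(graph, files):
    """Return node -> {filename -> set of owners}, where the owners are the
    node itself and its immediate neighbours.

    graph: dict mapping node -> iterable of neighbours
    files: dict mapping node -> set of filenames stored on that node
    """
    index = {}
    for node, neighbours in graph.items():
        entry = {}
        for owner in [node, *neighbours]:
            for name in files.get(owner, ()):
                entry.setdefault(name, set()).add(owner)
        index[node] = entry
    return index


def lookup(index, node, name):
    """Owners of `name` that `node` can point to without forwarding the query."""
    return index[node].get(name, set())


# A hub (node 0) with three leaves: the hub can answer for all of them.
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
shared = {1: {"a.mp3"}, 2: {"b.mp3"}, 3: {"a.mp3", "c.mp3"}}
index = build_one_hop_index(star, shared)
print(lookup(index, 0, "a.mp3"))   # the hub knows both owners
print(lookup(index, 1, "b.mp3"))   # a leaf knows only itself and the hub
```

In the star example the hub can resolve any query for a file held by its leaves, while a leaf knows only its own files and the hub's. This is why reaching a high-degree node quickly is so valuable for search.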

Biased random walk




Each node records the degree of the neighboring nodes. Search easily gravitates towards high-degree nodes that hold more clues.

[Figure: number of nodes found on a power-law graph; deterministic biased walk.]
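
A deterministic degree-biased walk combined with one-hop replication can be sketched as follows. This is one plausible reading of the strategy, not a reference implementation: the rule "move to the unvisited neighbor with the highest degree, falling back to a random neighbor when all have been visited" and the name `degree_biased_walk` are assumptions.

```python
import random


def degree_biased_walk(graph, holders, start, ttl=7, seed=None):
    """Walk towards high-degree nodes and return the number of hops taken
    until the current node holds the object or neighbours a holder, or None
    if the TTL runs out first.

    graph:   dict mapping node -> list of neighbours (every node has at least one)
    holders: set of nodes that store the object
    """
    rng = random.Random(seed)

    def knows(node):
        # One-hop replication: a node can answer for itself and its neighbours.
        return node in holders or any(n in holders for n in graph[node])

    node, visited, hops = start, {start}, 0
    while True:
        if knows(node):
            return hops
        if hops == ttl:
            return None
        fresh = [n for n in graph[node] if n not in visited]
        if fresh:
            # Deterministic bias: prefer the unvisited neighbour of highest degree.
            node = max(fresh, key=lambda n: len(graph[n]))
        else:
            node = rng.choice(sorted(graph[node]))
        visited.add(node)
        hops += 1


network = {0: [1, 2], 1: [0], 2: [0, 3, 4], 3: [2, 5], 4: [2], 5: [3]}
print(degree_biased_walk(network, {5}, start=0))          # 2
print(degree_biased_walk(network, {5}, start=0, ttl=1))   # None
```

In the example the walk moves from node 0 to the higher-degree node 2, then to node 3, whose neighbor index reveals the holder (node 5) after two hops. With a TTL of 1 the same search times out.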


The next step

This growing surge in popularity revealed the limits of the initial protocol's scalability. In early 2001, variations on the protocol improved the scalability. Instead of treating every user as client and server, some users were treated as "ultrapeers" or "supernodes," routing search requests and responses for users connected to them.

The KaZaA approach







Powerful nodes (supernodes) act as local index servers, and

client queries are propagated to other supernodes. Two-layered