Christos Gkantsidis, Milena Mihail, Amin Saberi. Presented by Paul Bogdan, February 28th, 2007


### “Hybrid Search Schemes for Unstructured Peer-to-Peer Networks” and “Random Walks in Peer-to-Peer Networks”

### “Random Walks in Peer-to-Peer (P2P) Networks”

Christos Gkantsidis, Milena Mihail, Amin Saberi

Presented by Paul Bogdan

February 28th, 2007

“Hybrid Search Schemes for Unstructured Peer-to-Peer Networks”

Christos Gkantsidis, Milena Mihail, Amin Saberi

Outline

- Random Graph Models
- Flooding and Normalization
- Random Walks and Replication
- Generalized Search Schemes
- Experimental evaluation

Motivation

- Flooding + small time-to-live (TTL) performs well in regular graphs
- Performance metric: number of exchanged messages/distinct response
- Flooding performance degrades as the TTL increases and in irregular networks
- Random Walk performs better than flooding
- scalability, granularity
- Hybrid + Generalized search schemes:
- Random Walks with lookahead, Random Walks with 1-step replication

Contribution

- Random walks (RW) with shallow flooding offer good performance (analytic justification)

R1: In a random graph model with O(n) nodes of constant degree and O(n^(1/2)) nodes of degree O(n^(1/2)), the expected time to discover Ω(n) nodes is O(n^(1/2)).

R2: Random walks with look-ahead 1 or 1-step replication perform better when node degrees in the underlying topology are uneven.

- Normalized Flooding (NF) solution

R3: NF achieves performance comparable to flooding in regular graphs.

R4: NF with 1-step replication achieves performance comparable to RW with 1-step replication.

R5: Local information about the network (node degrees) offers a global benefit.

- Generalized Search Schemes

Random Graph Models

- Random Regular Graphs: G_{n,d}

G_{n,d} denotes a graph with n nodes in which every node has degree d.

G_{n,d} has degree sum D = nd.

- Random Graphs with super-nodes: G_{n,d,α,β}

Given constants α and β, G_{n,d,α,β} denotes a graph with αn^(1/2) nodes of degree βn^(1/2) (the large vertices) and the remaining nodes of degree d (the small vertices).

G_{n,d,α,β} has degree sum D = (αβ+d)n.
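The degree sum of the super-node model can be checked numerically. The sketch below is only an illustration: the function name `supernode_degrees` and the parameter values are made up for the example, n is assumed to be a perfect square, and α and β are assumed to be integers so that all counts are whole numbers.

```python
import math

def supernode_degrees(n, d, alpha, beta):
    """Degree sequence of G_{n,d,alpha,beta}; assumes n is a perfect square."""
    root = math.isqrt(n)
    num_large = alpha * root        # alpha * n^(1/2) large vertices
    large_degree = beta * root      # each of degree beta * n^(1/2)
    return [large_degree] * num_large + [d] * (n - num_large)

# Illustrative parameters (not taken from the slides).
n, d, alpha, beta = 10_000, 3, 2, 4
degrees = supernode_degrees(n, d, alpha, beta)
exact_sum = sum(degrees)                  # counts the large vertices among the n nodes
slide_formula = (alpha * beta + d) * n    # D = (alpha*beta + d) n
print(exact_sum, slide_formula)
```

With these values the two numbers differ by d·α·n^(1/2) = 600, a lower-order term relative to n, because this sketch counts the large vertices among the n nodes.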

Flooding and Normalization

- Theorem 3.1.: Consider a random regular graph G_{n,d} and flooding from node v with time-to-live τ. Let S be the set of distinct nodes queried by flooding, with |S| ≤ |V| / 2.

Claims:

(1)

(2)

(3)

Theorem 3.2.: Let G_{n,d,α,β} be a random graph with supernodes and consider flooding from a node v of degree d with time-to-live τ.

Claim: For some τ = O(log log n), the number of distinct responses is Ω(n).

Proof:

Consider flooding with τ = c log_{d-1}(log n) + 1 and the set of vertices visited with TTL τ-1.

Assumption: this set of visited nodes does not contain a large-degree vertex.

From the analysis of d-regular graphs we know that this set contains at least (d - 1)^(τ-1) edges.

The probability that no vertex in Γ(S_{τ-1}(v)) is a large vertex is bounded by (d/(d+αβ))^((d - 1)^(τ-1)) = (d/(d+αβ))^(c log n), so within the first O(log log n) steps flooding reaches a large vertex.

Flooding and Normalization

- Theorem 3.3.: Let G_{n,d,α,β} be a random graph with supernodes and consider normalized flooding from node v with TTL τ. Then the number of distinct responses is Ω((d - 1)^(τ-1)) and the number of messages per response is O(1).

Proof:

From Theorem 3.1, the number of minigroups seen is (d - 1)^(τ-1).

The expected number of small vertices is Q = (d (d - 1)^(τ-1))/(d+αβ).

Let X_i, i = 1,…,N, be random variables with P[X_i = 1] = p_i and P[X_i = 0] = 1 - p_i.

By the Chernoff bound, the probability that fewer than Q/2 are seen is vanishingly small.
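The difference between plain and normalized flooding can be illustrated with a small simulation. The slides do not define normalized flooding, so this sketch assumes it means that each node forwards the query to at most a fixed number of randomly chosen neighbors (the `fanout` parameter, a hypothetical name); the star-shaped test graph is also only an example.

```python
import random

def flood(adj, source, ttl, fanout=None, rng=None):
    """Flood from source for ttl rounds; return (visited nodes, messages sent).

    fanout=None forwards to every neighbor except the sender (plain flooding).
    An integer fanout caps the number of forwarded copies per node, which is
    the assumed meaning of normalized flooding in this sketch.
    """
    rng = rng or random.Random(0)
    visited = {source}
    frontier = [(source, None)]          # (node, neighbor it heard the query from)
    messages = 0
    for _ in range(ttl):
        next_frontier = []
        for node, parent in frontier:
            targets = [v for v in adj[node] if v != parent]
            if fanout is not None and len(targets) > fanout:
                targets = rng.sample(targets, fanout)
            for v in targets:
                messages += 1
                if v not in visited:     # only newly reached nodes forward further
                    visited.add(v)
                    next_frontier.append((v, node))
        frontier = next_frontier
    return visited, messages

# Example: one hub (node 0) with ten leaves, query started at leaf 1.
star = {0: list(range(1, 11)), **{i: [0] for i in range(1, 11)}}
plain_nodes, plain_msgs = flood(star, source=1, ttl=2)
nf_nodes, nf_msgs = flood(star, source=1, ttl=2, fanout=1)
print(len(plain_nodes), plain_msgs, len(nf_nodes), nf_msgs)
```

On this graph plain flooding spends most of its messages at the hub, while the capped variant sends one message per hop, which is the kind of message control the theorem refers to.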

Random Walks and Replication

- Random Walk with Look-Ahead:
- a random walk with shallow flooding on each step of the walk
- RW with lookahead 1 visits Ω(n) nodes with response time O(n^(1/2))
- Theorem 4.2.: Let G_{n,d,α,β} be a random graph with supernodes and consider a

random walk from a node v. Then, in 1-step replication scenario, the expected

number of messages and response time to obtain distinct

responses is

Theorem 4.3.: Let G_{n,d,α,β} be a random graph with supernodes and consider normalized flooding from v with TTL τ ≈ (log n)/(2 log(d-1)). Then, in the 1-step replication scenario, the number of distinct responses is at least

and the number of messages is at most

Proof:

The number of minigroups seen is (d - 1)^(τ-1) and, by the Chernoff bounds, there will be minigroups corresponding to large vertices.
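A random walk with lookahead 1 can be sketched as follows. This is an illustration under assumptions rather than the authors' implementation: every visited node is assumed to probe all of its neighbors, and the `replication` flag models 1-step replication by assuming each node already holds an index of its neighbors' content, so the probes cost no messages. The function name and the example graph are hypothetical.

```python
import random

def random_walk_lookahead(adj, source, steps, replication=False, rng=None):
    """Walk for `steps` hops; each visited node also covers its neighbors.

    Returns (discovered nodes, messages sent). With replication=True the
    neighbor probes are assumed to be free (1-step replication).
    """
    rng = rng or random.Random(0)
    probe_cost = 0 if replication else 1
    discovered = {source} | set(adj[source])
    messages = probe_cost * len(adj[source])
    current = source
    for _ in range(steps):
        current = rng.choice(adj[current])
        messages += 1 + probe_cost * len(adj[current])   # one hop plus the probes
        discovered.add(current)
        discovered.update(adj[current])
    return discovered, messages

# Example: a hub (node 0) with ten leaves, walk started at leaf 1.
star = {0: list(range(1, 11)), **{i: [0] for i in range(1, 11)}}
found, msgs = random_walk_lookahead(star, source=1, steps=1)
found_rep, msgs_rep = random_walk_lookahead(star, source=1, steps=1, replication=True)
print(len(found), msgs, len(found_rep), msgs_rep)
```

A single hop onto the high-degree node already covers the whole example graph, which matches the intuition that degree discrepancy favors lookahead and replication.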

Generalized Search Schemes

- Searching procedure:
- A node of degree d initiates a search with a budget k

budget = number of messages that are propagated in the network

- Among its d neighbors, the node picks quantities k1, k2, …, kd such that k1 + k2 + … + kd = k
- To every neighbor i, the master node forwards the message with budget ki (for ki = 0 the message is not transmitted)
- Each neighbor i reduces its budget by 1 unit and repeats the process as long as the budget is greater than 0
- Every node that receives the message a second time, from another neighbor, forwards it again with the corresponding budget
- Random Walks + Flooding
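The budget-based procedure above can be sketched in a few lines. The splitting rules `walk_split` and `flood_split` are hypothetical names introduced for this example: the first gives the whole budget to one random neighbor (a random walk), the second divides it as evenly as possible (a flooding-like search). The ring graph is only a test topology.

```python
import random

def budget_search(adj, source, budget, split, rng=None):
    """Propagate a query with a message budget; return (visited, messages)."""
    rng = rng or random.Random(0)
    visited = {source}
    messages = 0
    pending = [(source, budget)]
    while pending:
        node, k = pending.pop()
        if k <= 0:
            continue
        # split returns (neighbor, k_i) pairs whose k_i sum to k
        for neighbor, k_i in split(adj[node], k, rng):
            if k_i <= 0:
                continue                          # zero budget: nothing is sent
            messages += 1
            visited.add(neighbor)
            pending.append((neighbor, k_i - 1))   # the receiver uses up one unit
    return visited, messages

def walk_split(neighbors, k, rng):
    """Random walk: the whole budget goes to a single random neighbor."""
    return [(rng.choice(neighbors), k)]

def flood_split(neighbors, k, rng):
    """Flooding-like: divide the budget as evenly as possible."""
    base, extra = divmod(k, len(neighbors))
    return [(v, base + (1 if i < extra else 0)) for i, v in enumerate(neighbors)]

ring = {i: [(i - 1) % 8, (i + 1) % 8] for i in range(8)}
walk_nodes, walk_msgs = budget_search(ring, 0, 5, walk_split)
flood_nodes, flood_msgs = budget_search(ring, 0, 5, flood_split)
print(walk_msgs, flood_msgs)
```

Because every transmitted message consumes exactly one unit, both variants send exactly k messages; only the shape of the explored region differs.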

Experimental Evaluation

- Methodology
- Performance Metrics
- Median and Mean number of distinct peers discovered (hits)
- Minimum, Maximum, Standard Deviation of the number of hits
- Number of messages
- Granularity of number of messages
- Response time
- Topologies
- Random d-Regular Graphs
- Power Law Graphs
- Bimodal topologies
- Clustered topologies

Normalized Flooding (NF)

- Mean number of unique peers discovered as a function of the initial TTL
- NF and Standard Flooding behave similarly in Regular Graphs
- NF controls the number of messages and provides higher efficiency

Normalized Flooding (NF)

- The number of unique peers increases exponentially with TTL in NF case
- The number of peers increases faster than exponentially with TTL in topologies with high degrees

Random Walk with LookAhead (RWLA)

- RWLA performance is similar to long RW without lookahead (in terms of unique peers discovered)
- RWLA response time is much smaller compared to standard RW

Edge Criticality & Searching with weights

- Generalized Searching performs similarly to Standard Flooding in regular graphs
- Generalized Searching behaves similarly to Standard Flooding in other topologies if normalized edge criticality is used.

Conclusions

- Normalized Flooding (NF) could substitute the Standard Flooding in irregular graphs
- RW with 1-step replication performs better than RW and NF in irregular graphs
- Open for improvements:
- Generalized schemes (analytic investigation)
- Quantifying Directional flooding

Christos Gkantsidis, Milena Mihail, Amin Saberi

Outline

- Motivation
- Statistical Estimation and Random Walks (RW)
- Searching
- Methodology and Topologies importance
- Construction and Summary

Motivation

- Random Walks (RW) were proposed for constructing searching and topology maintenance protocols in P2P networks
- RW improve searching performance as compared to flooding (Cao et al., 2002)
- A RW approach to constructing and maintaining unstructured topologies provides good connectivity properties (i.e. constant degree, constant expansion)
- Claim: RW approach is a good candidate
- to simulate uniform sampling
- the number of simulation steps required can be as low as the number of samples in independent uniform sampling
- Searching and Overlay Topology Construction
- RW searching performs better than flooding for the same number of messages in clustered and slowly changing topologies
- Construction of P2P networks by random walks

Statistical Estimation & Random Walks

- Coupon collection and Chernoff bounds
- n types of coupons; each draw yields one type, chosen uniformly at random
- T_n: time by which coupons of all n types have been drawn
- T_{αn}: time by which αn distinct types have been encountered, 0 < α < 1
- X_1,…,X_k independent Bernoulli trials, P[X_i = 1] = p_i and P[X_i = 0] = 1 - p_i
- p: probability that a randomly drawn object has a particular property
- the probability that the property is found in substantially fewer draws than its frequency in the search space and the quality of the estimator X/k are bounded by
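The coupon-collection times can be made concrete with the standard expectation: once i distinct types have been seen, the next new type takes n/(n - i) draws on average. The helper name `expected_draws` is introduced for this sketch only.

```python
from fractions import Fraction

def expected_draws(n, m):
    """Expected number of uniform draws until m of the n coupon types are seen."""
    return sum(Fraction(n, n - i) for i in range(m))

all_of_four = expected_draws(4, 4)            # E[T_n] for n = 4
half_of_thousand = expected_draws(1000, 500)  # E[T_{αn}] for n = 1000, α = 1/2
print(all_of_four, float(half_of_thousand))
```

Collecting a constant fraction αn of the types needs a number of draws linear in n, whereas collecting all n types needs on the order of n log n draws.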

Statistical Estimation & Random Walks

- Random Walks (RW), Convergence and Cover Time
- G = (V,E) undirected graph, |V| = n, and d_i the degree of vertex i
- A_ij: adjacency matrix; P: transition matrix which satisfies
- f: V→{0,1} which satisfies
- Convergence rate metric - the rate at which the RW approaches the stationary distribution
- Cover time metric - the time by which all nodes were visited
- Trajectory sample average - the rate at which the value of f averaged over successive vertices of the RW trajectory approaches p
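The formulas on this slide did not survive extraction, so the sketch below assumes the standard simple random walk, in which P_ij = A_ij / d_i and the stationary distribution is proportional to the degrees. The four-vertex graph is an arbitrary example.

```python
from fractions import Fraction

def transition_matrix(adjacency):
    """Row-normalize an adjacency matrix: P[i][j] = A[i][j] / d_i."""
    rows = []
    for row in adjacency:
        degree = sum(row)
        rows.append([Fraction(a, degree) for a in row])
    return rows

def step(distribution, P):
    """One step of the walk applied to a probability vector (row vector times P)."""
    n = len(P)
    return [sum(distribution[i] * P[i][j] for i in range(n)) for j in range(n)]

# Small undirected example graph given by its symmetric adjacency matrix.
A = [[0, 1, 1, 1],
     [1, 0, 1, 0],
     [1, 1, 0, 0],
     [1, 0, 0, 0]]
P = transition_matrix(A)
degrees = [sum(row) for row in A]
pi = [Fraction(d, sum(degrees)) for d in degrees]   # candidate stationary distribution
print(step(pi, P) == pi)
```

The check confirms that the degree-proportional vector is unchanged by one step of the walk, which is the distribution the convergence-rate metric refers to under this assumption.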

Statistical Estimation & Random Walks

- Convergence rate is related to the second eigenvalue of P

(1)

- yt – the vertex that the RW visited at time t
- Cover time

(2)

- Trajectory sample average

(3)

(1) :[ 11], (2) :[ 12, 13] , (3) :[ 3, 4, 5, 6]

Statistical Estimation & Random Walks

- Second Eigenvalue, Expansion and Conductance
- S subset of V, C(S) cutset of S (i.e. edges with one endpoint in S and the other in V\S), vol(S) (i.e. the sum of degrees of vertices in S)
- Expansion
- Conductance
- Known bound

[ 11, 14, 15, 16, 17, 18, 19]
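The quantities C(S) and vol(S) are easy to compute for a given set. Since the slide's formulas are missing, this sketch assumes the usual per-set ratios |C(S)|/|S| for expansion and |C(S)|/vol(S) for conductance; the graph-level values would be the minimum over suitable sets S. The function names and the ring example are illustrative.

```python
def cut_size(adj, S):
    """Number of edges with one endpoint in S and the other outside S."""
    S = set(S)
    return sum(1 for u in S for v in adj[u] if v not in S)

def volume(adj, S):
    """Sum of the degrees of the vertices in S."""
    return sum(len(adj[u]) for u in S)

def expansion(adj, S):
    return cut_size(adj, S) / len(S)

def conductance(adj, S):
    return cut_size(adj, S) / volume(adj, S)

ring = {i: [(i - 1) % 8, (i + 1) % 8] for i in range(8)}
half = [0, 1, 2, 3]
print(cut_size(ring, half), volume(ring, half), expansion(ring, half), conductance(ring, half))
```

A ring is a poor expander: half of its vertices are separated from the rest by only two edges, so both ratios shrink as the ring grows.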

Searching

- Performance metrics for Flooding and RW
- average number of distinct copies of an item located in the search
- number of messages used by the searching algorithm
- RW performs better than flooding if
- multiple search requests for the same item with slow-changing topology
- peer clustering ( see [20, 21, 22, 23, 24, 25] for details)
- Searching analysis
- Methodology
- Flat topologies with Uniformly Distributed Content
- Topologies with Peer Clustering
- Re-issuing the Same Query
- Real topologies

Searching - Methodology

- Performance Metrics
- mean of the number of distinct copies (i.e. Mean)
- discrepancy around the mean (i.e. Std) and the failure probability
- Cost
- number of messages or queries performed during search
- Peer-to-peer topologies ( ≈ 1 million nodes)
- Flat regular expanders, Two tier topologies with clustering, Power law graphs, Samples from real topologies
- Dynamic topologies
- rewiring
- Content placement
- Content clustering affects the performance of searching

Searching – Flat Topologies

- Experiment:
- one request in a network of 500K peers
- Mean hits, minimum number of hits, and Std are similar for Flooding and RW
- the entire distribution of hits is similar for Flooding and RW

Searching -Topologies with Peer Clustering

- Cluster topology consists of
- 5 flat regular graphs of size 40K; from each one pick randomly 1000 nodes to construct another flat regular graph
- Number of hits for RW is more concentrated around the mean compared to Flooding

Searching - Reissuing the Same Query

- Experiment setup: repeat the procedure below 4 times
- each peer sends a request and waits for the response
- between requests, 2% of the links are rewired
- each peer initiates a new search
- RW performs better than Flooding
- Mean Hits and Failure Probability

Searching - Reissuing the Same Query

- Performance of successive searches depends on the number of topology changes between consecutive searches
- Performance of Flooding improves as the rate of topological changes increases
- RW Performance remains the same for small variations

Searching – Real Topologies

- The number of hits for RW is more concentrated around the mean than in Flooding
- P2P topologies have good expansion properties

Construction

- P2P network construction is concerned with:
- peers arrive and leave the network dynamically
- strong and weak decentralization
- low network overhead per addition or deletion

Baseline Construction of Expander Graphs

- A_BASE (undirected graph) construction:
- each of the n vertices chooses d vertices at random
- total number of edges = nd and expected vertex degree = 2d
- Theorem 4.1. Let G(V,E) be a graph constructed by A_BASE.

Then G is an expander with high probability, and for a positive

constant α < 1
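The baseline construction is simple enough to simulate. In this sketch each vertex picks d distinct vertices other than itself, and an edge chosen from both sides is kept twice; both choices are assumptions of the example, as is the function name `build_abase`.

```python
import random

def build_abase(n, d, rng=None):
    """Each of the n vertices picks d other vertices uniformly at random."""
    rng = rng or random.Random(0)
    edges = []
    for u in range(n):
        for p in rng.sample(range(n - 1), d):
            v = p if p < u else p + 1    # shift the index to skip u itself
            edges.append((u, v))
    return edges

def degrees_from_edges(n, edges):
    """Undirected degrees: every edge counts at both of its endpoints."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

edges = build_abase(100, 3)
deg = degrees_from_edges(100, edges)
print(len(edges), sum(deg) / len(deg))
```

Each vertex contributes d edges of its own and receives d more on average from the others, which gives the nd edges and mean degree 2d stated above.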

Baseline Construction of Expander Graphs with Constant Overhead in Random Bits

- A’_BASE construction algorithm:
- start a RW at a random vertex of H (a constant-degree expander graph)
- whenever A_BASE needs a random number, it is taken from the RW on H
- Theorem 4.2. Let G(V,E) be a graph constructed by A’_BASE.

There are positive constants α and 0 < β < 0.5 such that any subset S of at least β|V| and at most 0.5|V| vertices has cutset expansion α almost surely.

Distributed Construction of Expanders with Constant Overhead on Network Resources

- A’_H construction
- d daemons, one for each Hamilton cycle
- a newly arriving node contacts the daemon associated with the i-th Hamilton cycle
- after c steps, it attaches between the peer that currently hosts daemon i and one of that peer's neighbors in cycle i
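The attachment step can be pictured as splicing the new node into one Hamilton cycle. The sketch below covers only that splice, with each cycle stored as a successor map; the daemon's movement and the c steps are left out, and the names `insert_into_cycle` and `cycle_order` are hypothetical.

```python
def insert_into_cycle(successor, host, new_node):
    """Splice new_node between host and host's current successor in one cycle."""
    successor[new_node] = successor[host]
    successor[host] = new_node

def cycle_order(successor, start):
    """List the nodes of the cycle in order, beginning at start."""
    order = [start]
    node = successor[start]
    while node != start:
        order.append(node)
        node = successor[node]
    return order

cycle = {0: 1, 1: 2, 2: 0}              # Hamilton cycle 0 -> 1 -> 2 -> 0
insert_into_cycle(cycle, host=1, new_node=3)
print(cycle_order(cycle, 0))
```

The splice touches only the host and one of its neighbors, so a single cycle stays Hamiltonian after the insertion and the work per arrival is constant for each of the d cycles.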

Distributed Construction of Expanders with Constant Overhead on Network Resources

- A’_M construction
- d daemons, one for each Hamilton cycle
- an arrival consists of two nodes, X and Y; X and Y contact the central server to discover the location of the d daemons
- X becomes the neighbor of daemon i and Y the neighbor of the initial daemon’s neighbor

Summary

- For Searching
- Random Walks (RW) are superior to Flooding
- For Construction
- RW add new peers with constant overhead
- Open Problems
- Strongly decentralized construction algorithm
- Can we better handle deletions and expansions of small sets?
- How do the P2P network parameters (e.g. capacities) affect the performance of RW?
