Topology of P2P Networks By Alex Golynski
Low-diameter networks • “Building Low-diameter P2P networks” by G. Pandurangan, P. Raghavan, E. Upfal • Model • P2P network as a dynamic graph • Nodes are clients, edges are connections • Nodes can appear or disappear at any time • Need: a protocol for managing connections between clients
Assumptions • Nodes arrive according to a Poisson process with rate λ: the probability that a new node arrives during [t, t+dt] is λ * dt • Example: every second a new node arrives with prob. λ = 0.1 • Not true in practice: day/night time; weekdays/holidays • A node disappears during [t, t+dt] with probability η * dt (exponentially distributed lifetime) • Example: every hour I flip a coin to decide whether to disconnect • Memoryless => simpler to analyze
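The two assumptions above can be checked with a small event-driven simulation. This is a minimal sketch, not code from the paper: the function name `simulate_population` and the rates `lam = 0.1` and `eta = 0.001` are illustrative choices. With Poisson arrivals at rate λ and independent exponential lifetimes at rate η, the network size should settle around λ/η.

```python
import random

def simulate_population(arrival_rate, departure_rate, horizon, seed=0):
    """Simulate Poisson arrivals and exponential lifetimes.

    Returns the number of nodes alive at time `horizon`.
    """
    rng = random.Random(seed)
    t = 0.0
    alive = 0
    while True:
        # One arrival stream plus one independent departure clock per live node.
        total_rate = arrival_rate + alive * departure_rate
        # Memorylessness: the waiting time to the next event is exponential.
        t += rng.expovariate(total_rate)
        if t > horizon:
            return alive
        # Choose which event fired, in proportion to its rate.
        if rng.random() < arrival_rate / total_rate:
            alive += 1
        else:
            alive -= 1

lam, eta = 0.1, 0.001  # illustrative rates, not taken from the slides
sizes = [simulate_population(lam, eta, horizon=20_000, seed=s) for s in range(20)]
mean_size = sum(sizes) / len(sizes)
print(mean_size, "expected about", lam / eta)
```

Because both processes are memoryless, the simulation only needs the current node count, not the age of each node.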
Protocol • Host server maintains a cache: a list of nodes of constant size K • Not a P2P anymore? • Gnutella does the same thing • Each node maintains a constant number of connections • Minimum D and maximum C • Two types of nodes: C-nodes & D-nodes • D-nodes: newbies • Have never been in the cache • Always have D connections • C-nodes: mature nodes • Have been in the cache • May have more connections • Have a preferred connection (best friend)
Protocol II • I want to join the network • Host server gives me a random list of nodes from its cache • No best friend so far • One of my connections leaves the network • I ask the server for a replacement • With probability D / (# of my connections), which is 1 for newbies • My best friend leaves the network • Server gives me a new random best friend • I’m in the cache and saturated with connections • I retire from the cache as a C-node • Server looks for a replacement among the D-nodes • This replacement is my best friend
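The join and reconnect rules above can be sketched as follows. This is a hedged illustration only: the names `join` and `reconnect_probability` are hypothetical, `D = 3` is an arbitrary value, and "# of my connections" is read here as the degree before the neighbour was lost, which is what makes the probability equal to 1 for a newbie with exactly D connections.

```python
import random

D = 3  # minimum number of connections (illustrative value)

def join(cache, rng):
    """A new node receives D distinct random nodes from the server's cache."""
    return rng.sample(cache, D)

def reconnect_probability(degree_before_loss, d_min=D):
    """Probability of asking the server for a replacement neighbour.

    Assumption: the degree is counted before the neighbour left.
    """
    return min(1.0, d_min / degree_before_loss)

rng = random.Random(42)
cache = ["n1", "n2", "n3", "n4", "n5", "n6"]  # hypothetical cache contents
my_neighbours = join(cache, rng)

print(my_neighbours)
print(reconnect_probability(3))  # newbie with D connections
print(reconnect_probability(6))  # mature node with 2 * D connections
```

A mature node with many connections therefore replaces a lost neighbour only some of the time, which keeps its degree from staying pinned at the maximum.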
Protocol III • How to find a replacement D-node: • First I check my neighbours • Then I ask the node I replaced in the cache (say, v) to help me • I’m the best friend of v • If v doesn’t find one, it recursively passes my request to the node that v replaced in the past • And so on
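The search described above amounts to walking a chain of "node I replaced" pointers until some node on the chain has a D-node neighbour. A minimal sketch, written iteratively rather than recursively; the class `Node`, its fields, and `find_replacement` are hypothetical names, and message passing is replaced by direct object references.

```python
class Node:
    def __init__(self, name, is_d_node):
        self.name = name
        self.is_d_node = is_d_node
        self.neighbours = []
        self.replaced = None  # the node this one replaced in the cache

def find_replacement(start):
    """Return (d_node, hops): the first D-node found and how far the request travelled.

    Returns (None, hops) if the whole chain is exhausted, i.e. the search fails.
    """
    current = start
    hops = 0
    while current is not None:
        # Each node on the chain checks its own neighbours first.
        for nb in current.neighbours:
            if nb.is_d_node:
                return nb, hops
        # Otherwise pass the request to the node it replaced in the past.
        current = current.replaced
        hops += 1
    return None, hops

# Tiny example: "me" replaced v in the cache; only v has a D-node neighbour.
me = Node("me", is_d_node=False)
v = Node("v", is_d_node=False)
other_c = Node("c1", is_d_node=False)
newbie = Node("x", is_d_node=True)
me.neighbours = [other_c]
me.replaced = v
v.neighbours = [newbie]

found, hops = find_replacement(me)
print(found.name, hops)
```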
Implementation • Can be integrated into Gnutella • Checking whether a node is down: ping • Search for a replacement: message passing • Server is not too overloaded • Expected constant # of requests per unit time • And at most on the order of log(size of network) with high probability • Scalability • Small probability of failure (no D-node is found) • Allow the node to have C+1 connections • Reject the new connection
Statistical analysis • If C > 2D then there is a constant fraction of D-nodes in the network (with high prob.) • Probability that the network is disconnected is small • On the order of log^2(N)/N, where N is the size of the network • With high probability the diameter of the network is proportional to log(size) • Best one can hope for with constant-degree nodes • Can try to improve constants…
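To get a feel for what a logarithmic diameter means, one can measure the diameter of a small random overlay with breadth-first search. The sketch below does not simulate the protocol from the paper: as a stand-in, each new node simply attaches to `d` random earlier nodes, and the names `build_random_overlay`, `eccentricity`, and `diameter` are illustrative.

```python
import math
import random
from collections import deque

def build_random_overlay(n, d, rng):
    """Static stand-in graph: node v links to min(d, v) distinct random earlier nodes."""
    adj = {0: set()}
    for v in range(1, n):
        adj[v] = set()
        for u in rng.sample(range(v), min(d, v)):
            adj[v].add(u)
            adj[u].add(v)
    return adj

def eccentricity(adj, source):
    """Largest BFS distance from `source` to any reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return max(dist.values())

def diameter(adj):
    """Diameter of a connected graph: the maximum eccentricity over all nodes."""
    return max(eccentricity(adj, v) for v in adj)

rng = random.Random(1)
overlay = build_random_overlay(n=500, d=3, rng=rng)
overlay_diameter = diameter(overlay)
print("diameter:", overlay_diameter, "log2(N):", round(math.log2(500), 1))
```

The graph is connected by construction, since every node links to at least one earlier node, so the BFS always reaches all 500 nodes.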
Drawbacks • Underlying model is not realistic • Can try the more complicated distributions for user arrivals/departures seen in practice • Hard to analyse in theory => do experiments! • Physical network layer is not considered • Host server can account for proximity of nodes • Instead of giving random connections, it can give nearby connections (say, based on IP address match)
Mismatch problem • “Topologically-Aware Overlay Construction and Server Selection” by Sylvia Ratnasamy, Mark Handley, Richard Karp, Scott Shenker • Idea: binning • Designate several well-known landmark machines • All they need to do is respond to ping requests • Every node in the network measures its round-trip time (RTT) to these landmarks • Based on the results, the node gets an identifier (coordinate): landmark ordering + landmark vector
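The binning idea above can be sketched in a few lines. This is an illustration under stated assumptions: the function names are hypothetical, the landmark vector is taken to be a per-landmark quantised RTT level, and the level boundaries of 100 ms and 200 ms are example values rather than something the slides specify.

```python
def landmark_ordering(rtts_ms):
    """Landmark indices sorted from closest (smallest RTT) to farthest."""
    return tuple(sorted(range(len(rtts_ms)), key=lambda i: rtts_ms[i]))

def landmark_vector(rtts_ms, level_bounds_ms=(100, 200)):
    """Quantise each RTT into a level: 0 below 100 ms, 1 below 200 ms, 2 otherwise."""
    return tuple(sum(rtt >= bound for bound in level_bounds_ms) for rtt in rtts_ms)

def bin_id(rtts_ms):
    """A node's bin: nodes with the same ordering and levels are considered close."""
    return landmark_ordering(rtts_ms), landmark_vector(rtts_ms)

# RTTs (in ms) from one node to landmarks 0, 1 and 2.
node_a = bin_id([232, 51, 117])
node_b = bin_id([250, 40, 130])   # similar measurements, same bin
node_c = bin_id([30, 180, 220])   # different part of the network

print(node_a)  # ((1, 2, 0), (2, 0, 1))
print(node_a == node_b, node_a == node_c)
```

Nodes that land in the same bin are likely to be close in the physical network, so a joining node can prefer neighbours from its own bin.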
Content Addressable Networks • Every node owns a zone in a (virtual) d-dimensional coordinate space • Each node is connected to its neighbours (at most 2d of them) • Routing between nodes can be done in a greedy fashion • Forward through the neighbour that is closest to the destination • The path roughly follows a straight line in the coordinate space (efficient)
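Greedy routing as described above can be sketched on a simplified CAN. The sketch assumes equal unit-sized zones on a d-dimensional torus of side `side`, so each node has exactly 2d neighbours; real CAN zones are uneven, and the function names here are illustrative.

```python
def torus_distance(a, b, side):
    """Sum of per-axis distances, with wrap-around on each axis."""
    return sum(min(abs(x - y), side - abs(x - y)) for x, y in zip(a, b))

def neighbours(node, side):
    """The 2d zones adjacent to `node`: one step up or down along each axis."""
    result = []
    for axis in range(len(node)):
        for step in (-1, 1):
            nb = list(node)
            nb[axis] = (nb[axis] + step) % side
            result.append(tuple(nb))
    return result

def greedy_route(src, dst, side):
    """Forward to the neighbour closest to the destination until it is reached."""
    path = [src]
    while path[-1] != dst:
        next_hop = min(neighbours(path[-1], side),
                       key=lambda nb: torus_distance(nb, dst, side))
        path.append(next_hop)
    return path

path = greedy_route((0, 0), (6, 3), side=8)
print(path)
print("hops:", len(path) - 1)
```

Each hop reduces the distance to the destination by one, so the loop always terminates and the hop count equals the torus distance between source and destination.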
Design • Arrival of a new node in a CAN network affects at most 2d neighbouring nodes • Each node in the network sends periodic update messages to its neighbours • It sends its own coordinates and those of its neighbours • If a node dies, one of its neighbours takes over its zone • More complicated scenario: a node detects that too many of its neighbours have died • Problem with routing • Space fragmentation
Other ideas • Agent-based approach • An agent constantly monitors the network • It can optimize the overlay structure by asking nodes to add/drop connections • Search-based approach • First a node joins the network at random (as before) • Then it searches for nodes with nearby IP addresses and reconnects to them • Different topology • Torus: CAN • Hypercube: Chord (MIT), Tapestry (Berkeley)
References • Building Low-diameter P2P Networks. G. Pandurangan, P. Raghavan, E. Upfal • Building P2P Networks with Good Topological Properties. G. Pandurangan, P. Raghavan, E. Upfal • A Scalable Content-Addressable Network. Sylvia Ratnasamy, Paul Francis, Mark Handley, Richard Karp, Scott Shenker • Topologically-Aware Overlay Construction and Server Selection. Sylvia Ratnasamy, Mark Handley, Richard Karp, Scott Shenker • NetProber: a Component for Enhancing Efficiency of Overlay Networks in P2P Systems. Luc Onana Alima, Valentin Mesaros, et al.