Explore the evolution of Distributed Hash Tables (DHTs) in P2P systems from key placement strategies to Consistent Hashing and lookup algorithms like Chord. Discover the challenges, applications, and advancements in decentralized directory services.
Relevant Conferences • IPTPS (International Workshop on Peer-to-Peer Systems) • ICDCS (IEEE International Conference on Distributed Computing Systems) • NSDI (USENIX Symposium on Networked Systems Design and Implementation) • PODC (ACM Symposium on Principles of Distributed Computing) • SIGCOMM
Areas of Research Focus • Gnutella-Inspired • The “Directory Service” Problem • BitTorrent-Inspired • The “File Distribution” Problem • P2P Live Streaming • P2P and Net Neutrality
The Applications and The Problems • Napster, Gnutella, KaZaA/FastTrack, Skype • Look for a particular piece of content/object and find which peer has it: the “directory service” problem • Challenge: how to offer a scalable directory service in a fully decentralized fashion • Arrange a direct transfer from that peer: the “punch a hole in the firewall” problem
Decentralized Directory Services • Structured Networks • DHT (Distributed Hash Tables) • A very active research area from 2001 to 2004 • Limitation: lookup by keys only • Multi-Attribute DHT • Limited support for query-based lookup • Unstructured Networks • Various improvements to basic flooding-based schemes
What Is a DHT? • Single-node hash table: key = Hash(name) put(key, value) get(key) -> value • How do I do this across millions of hosts on the Internet? • Distributed Hash Table
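To make the single-node interface concrete, here is a minimal Python sketch; the file name, value, and helper names are illustrative and not from the slides.

```python
# A minimal sketch of the single-node case: key = Hash(name),
# put(key, value), get(key) -> value, using SHA-1 as the later slides do.
import hashlib

table = {}

def key_for(name: str) -> int:
    return int(hashlib.sha1(name.encode()).hexdigest(), 16)

def put(key: int, value: bytes) -> None:
    table[key] = value

def get(key: int) -> bytes:
    return table[key]

k = key_for("some-file.mp3")
put(k, b"file data...")
assert get(k) == b"file data..."
```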
Distributed Hash Tables • Chord • CAN • Pastry • Tapestry • Symphony • Koorde • etc.
The Problem • Publisher: Put(key=“title”, value=file data…) • Client: Get(key=“title”) • [Diagram: publisher and client reaching nodes N1–N6 across the Internet] • Key placement • Routing to find the key
Key Placement • Traditional hashing • Nodes numbered from 1 to N • Key is placed at node (hash(key) % N) • Why traditional hashing has problems: when a node joins or leaves, N changes and almost every key maps to a different node
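A small illustrative experiment (assumed key names and node counts, not from the slides) showing why hash(key) % N is brittle: changing N remaps nearly every key.

```python
# With placement hash(key) % N, going from 100 to 101 nodes remaps almost
# every key, so nearly all stored objects would have to move.
import hashlib

def node_for(key: str, n_nodes: int) -> int:
    h = int(hashlib.sha1(key.encode()).hexdigest(), 16)
    return h % n_nodes

keys = [f"object-{i}" for i in range(10_000)]
moved = sum(node_for(k, 100) != node_for(k, 101) for k in keys)
print(f"{moved / len(keys):.0%} of keys move")   # typically ~99%
```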
Consistent Hashing: IDs • Key identifier = SHA-1(key) • Node identifier = SHA-1(IP address) • SHA-1 distributes both uniformly • How to map key IDs to node IDs?
Consistent Hashing: Placement • A key is stored at its successor: the node with the next-higher ID • [Diagram: circular 7-bit ID space with nodes N32, N90, N105 and keys K5, K20, K80; K80 is stored at N90]
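A minimal sketch of successor placement on the slide's 7-bit ring; the helper names are mine, while the node and key IDs follow the diagram.

```python
# Keys are stored at their successor: the next node clockwise on the ring.
import bisect

ID_BITS = 7
RING = 2 ** ID_BITS

node_ids = sorted([32, 90, 105])              # N32, N90, N105

def successor(key_id: int) -> int:
    """Node responsible for key_id: the next-higher node ID (with wraparound)."""
    i = bisect.bisect_left(node_ids, key_id % RING)
    return node_ids[i % len(node_ids)]

for k in (5, 20, 80):
    print(f"K{k} -> N{successor(k)}")         # K5 -> N32, K20 -> N32, K80 -> N90
```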
Basic Lookup • [Diagram: a query “Where is key 80?” travels around a ring of nodes N10, N32, N60, N90, N105, N120 until it can be answered: “N90 has K80”]
“Finger Table” Allows log(N)-Time Lookups • [Diagram: node N80’s fingers span ½, ¼, 1/8, 1/16, 1/32, 1/64, and 1/128 of the ring]
Finger i Points to the Successor of n + 2^i • [Diagram: from node N80, the finger for 112 = 80 + 2^5 points to its successor, N120]
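A sketch of the finger-table rule, using a hypothetical node set that includes the slide's example (N80's finger for 112 pointing to N120).

```python
# finger[i] of node n = successor((n + 2^i) mod 2^M), for i = 0 .. M-1.
import bisect

M = 7                                         # 7-bit identifier space
RING = 2 ** M
node_ids = sorted([10, 32, 60, 80, 90, 105, 120])

def successor(ident: int) -> int:
    i = bisect.bisect_left(node_ids, ident % RING)
    return node_ids[i % len(node_ids)]

def finger_table(n: int) -> list[int]:
    return [successor((n + 2 ** i) % RING) for i in range(M)]

print(finger_table(80))                       # finger 5 targets 112 -> N120
```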
Lookups Take O(log N) Hops • [Diagram: Lookup(K19) crosses a ring of nodes N5, N10, N20, N32, N60, N80, N99, N110 in a few hops to reach the node responsible for K19]
Chord Lookup Algorithm Properties • Interface: lookup(key) → IP address • Efficient: O(log N) messages per lookup • N is the total number of servers • Scalable: O(log N) state per node • Robust: survives massive failures • Simple to analyze
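A simplified, iterative sketch of the lookup (not the full Chord protocol: no joins, failures, or successor lists), reusing the same hypothetical node set idea.

```python
# Each step forwards the query to the finger that most closely precedes the
# key, so each hop roughly halves the remaining distance on the ring.
import bisect

M = 7
RING = 2 ** M
node_ids = sorted([5, 10, 20, 32, 60, 80, 99, 110])

def successor(ident: int) -> int:
    i = bisect.bisect_left(node_ids, ident % RING)
    return node_ids[i % len(node_ids)]

def fingers(n: int) -> list[int]:
    return [successor((n + 2 ** i) % RING) for i in range(M)]

def in_arc(x: int, a: int, b: int, inclusive: bool) -> bool:
    """True if x lies on the clockwise arc from a to b (excluding a)."""
    if a < b:
        return a < x < b or (inclusive and x == b)
    return x > a or x < b or (inclusive and x == b)

def closest_preceding(n: int, key: int) -> int:
    """Farthest finger of n that still precedes key on the ring."""
    for f in reversed(fingers(n)):
        if in_arc(f, n, key, inclusive=False):
            return f
    return n

def lookup(start: int, key: int) -> tuple[int, int]:
    """Return (node responsible for key, number of hops taken)."""
    n, hops = start, 0
    while not in_arc(key, n, successor((n + 1) % RING), inclusive=True):
        nxt = closest_preceding(n, key)
        if nxt == n:                          # no closer finger known
            nxt = successor((n + 1) % RING)
        n, hops = nxt, hops + 1
    return successor((n + 1) % RING), hops

print(lookup(32, 19))                         # e.g. (20, 3): N20 holds K19
```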
Related Studies on DHTs • Many variations of DHTs • Different ways to choose the fingers • Ways to make it more robust • Ways to make it more network efficient • Studies of different DHTs • What happens when peers join and leave (churn)? • Applications built using DHTs • Trackerless BitTorrent • Beehive, a P2P-based DNS system
Directory Lookups: Unstructured Networks • Example: Gnutella • Support more flexible queries • Typically, precise “name” search is a small portion of all queries • Simplicity • High resilience against node failures • Problem: scalability • Flooding: # of messages ~ O(N·E)
Flooding-Based Searches • Duplication increases as TTL increases in flooding • Worst case: a node A is interrupted by N · q · degree(A) messages
Problems with Simple TTL-Based Flooding • Hard to choose the TTL: • For objects that are widely present in the network, small TTLs suffice • For objects that are rare in the network, large TTLs are necessary • The number of query messages grows exponentially as the TTL grows
Idea #1: Adaptively Adjust TTL • “Expanding Ring” • Multiple floods: start with TTL=1; increment TTL by 2 each time until the search succeeds • Success varies by network topology
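A toy sketch of expanding-ring search under an assumed graph and query model (not Gnutella's wire protocol); the graph maps each peer to its neighbor list, and all names are illustrative.

```python
# Flood with TTL = 1; if the object is not found, retry with TTL + 2.
from collections import deque

def flood(graph, start, ttl, has_object):
    """BFS-style flood limited to `ttl` hops; returns (found, messages sent)."""
    seen, messages = {start}, 0
    frontier = deque([(start, ttl)])
    while frontier:
        node, t = frontier.popleft()
        if has_object(node):
            return True, messages
        if t == 0:
            continue
        for nbr in graph[node]:
            messages += 1                     # every forwarded query costs a message
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, t - 1))
    return False, messages

def expanding_ring(graph, start, has_object, max_ttl=9):
    ttl, total = 1, 0
    while ttl <= max_ttl:
        found, msgs = flood(graph, start, ttl, has_object)
        total += msgs
        if found:
            return True, total
        ttl += 2                              # grow the ring and flood again
    return False, total

# Tiny example: a ring of 12 peers where only peer 7 stores the object.
ring = {i: [(i - 1) % 12, (i + 1) % 12] for i in range(12)}
print(expanding_ring(ring, start=0, has_object=lambda n: n == 7))
```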
Idea #2: Random Walk • Simple random walk • takes too long to find anything! • Multiple-walker random walk • N agents each walking T steps visit as many nodes as 1 agent walking N·T steps • When to terminate the search: check back with the query originator once every C steps
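A small sketch of the multiple-walker random walk under the same assumed graph model; the walker count and the check-back interval C are illustrative parameters.

```python
# Each walker moves to a random neighbor per step and checks back with the
# query originator every `check_every` steps to decide whether to stop.
import random

def random_walk_search(graph, start, has_object,
                       walkers=16, max_steps=10_000, check_every=4):
    found = False                              # stands in for the query originator
    positions = [start] * walkers
    for step in range(1, max_steps + 1):
        for i, node in enumerate(positions):
            if has_object(node):
                found = True
            positions[i] = random.choice(graph[node])
        if step % check_every == 0 and found:  # walkers check back every C steps
            return True, step * walkers        # rough message count
    return False, max_steps * walkers

# Tiny example: 200 peers, each with 4 random neighbors; peer 123 has the object.
peers = list(range(200))
graph = {p: random.sample([q for q in peers if q != p], 4) for p in peers}
print(random_walk_search(graph, start=0, has_object=lambda n: n == 123))
```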
Flexible Replication • In unstructured systems, search success is essentially about coverage: visiting enough nodes to probabilistically find the object => replication density matters • Limited node storage => what’s the optimal replication density distribution? • In Gnutella, only nodes that query an object store it => r_i ∝ p_i • What if we have different replication strategies?
Optimal r_i Distribution • Goal: minimize Σ_i (p_i / r_i), subject to Σ_i r_i = R • Calculation: • introduce a Lagrange multiplier λ; find the r_i and λ that minimize: Σ_i (p_i / r_i) + λ · (Σ_i r_i − R) => −p_i / r_i² + λ = 0 for all i => r_i ∝ √p_i
Square-Root Distribution • General principle: to minimize Σ_i (p_i / r_i) under the constraint Σ_i r_i = R, make r_i proportional to the square root of p_i • Other application examples: • Bandwidth allocation to minimize expected download times • Server load balancing to minimize expected request latency
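A small sketch, with a hypothetical popularity distribution, of allocating a total replica budget R in proportion to √p_i.

```python
# Setting r_i proportional to sqrt(p_i) minimizes sum_i p_i / r_i for a
# fixed total sum_i r_i = R.
import math

def sqrt_allocation(popularity, R):
    roots = [math.sqrt(p) for p in popularity]
    total = sum(roots)
    return [R * r / total for r in roots]

p = [0.5, 0.3, 0.15, 0.05]                    # hypothetical query distribution
print(sqrt_allocation(p, R=1000))             # replica counts summing to R
```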
Achieving Square-Root Distribution • Suggestions from some heuristics • Store an object at a number of nodes proportional to the number of nodes visited in order to find the object • Each node uses random replacement • Two implementations: • Path replication: store the object along the path of a successful “walk” • Random replication: store the object randomly among the nodes visited by the agents
KaZaA • Use Supernodes • Regular Nodes : Supernodes = 100 : 1 • A simple way to scale the system by a factor of 100
Modeling and Understanding BitTorrent • Analysis based on modeling • View it as a type of gossip algorithm • Usually does not model the tit-for-tat aspects • Assumes perfectly connected networks • Statistical modeling techniques • Mostly published in PODC or SIGMETRICS • Simulation studies • Different assumptions about bottlenecks • Varying levels of detail in modeling the data transfer • Published in ICDCS and SIGCOMM
Studies on the Effect of BitTorrent on ISPs • Observation: P2P contributes to cross-ISP traffic • SIGCOMM 2006 publication on studies of Japanese backbone traffic • Attempts to improve network locality of BitTorrent-like applications • ICDCS 2006 publication • Academic P2P file sharing systems • Bullet, Julia, etc.
Techniques to Alleviate the “Last Missing Piece” Problem • Apply network coding to pieces exchanged between peers • Pablo Rodriguez Rodriguez, Microsoft Research (recently moved to Telefonica Research) • Use a different piece-replication strategy • Dahlia Malkhi, Microsoft Research • “On Collaborative Content Distribution Using Multi-Message Gossip” • Associate an “age” with file segments
Network Coding • Main Feature • Allowing intermediate nodes to encode packets • Making optimal use of the available network resources • Similar Technique: Erasure Codes • Reconstructing the original content of size n from (roughly) any n symbols out of a large universe of encoded symbols
Network Coding in P2P: The Model • Server • Dividing the file into k blocks • Uploading blocks at random to different clients • Clients (Users) • Collaborating with each other to assemble the blocks and reconstruct the original file • Exchanging information and data with only a small subset of others (neighbors) • Symmetric neighborhood and links
Network Coding in P2P • Assume a node with blocks B1, B2, …, Bk • Pick random numbers C1, C2, …, Ck • Construct a new block E = C1·B1 + C2·B2 + … + Ck·Bk • Send E and (C1, C2, …, Ck) to a neighbor • Decoding: collect enough linearly independent E’s and solve the linear system • If all nodes pick the vector C randomly, chances are high that after receiving ~k blocks a node can recover B1 through Bk
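A hedged sketch of this scheme. For simplicity it works over the prime field GF(257) rather than the GF(2^8) arithmetic common in practice; block sizes, counts, and function names are illustrative.

```python
# Blocks are vectors of field symbols; a node sends a random linear
# combination E plus its coefficient vector C, and a receiver recovers the
# original blocks once it has k linearly independent combinations.
import random

P = 257                                        # prime field modulus (assumption)

def encode(blocks, rng=random):
    """Return (coefficient vector C, combined block E) for one coded packet."""
    k, n = len(blocks), len(blocks[0])
    coeffs = [rng.randrange(P) for _ in range(k)]
    combined = [sum(c * b[j] for c, b in zip(coeffs, blocks)) % P
                for j in range(n)]
    return coeffs, combined

def decode(packets, k):
    """Gaussian elimination mod P on rows [C | E]; returns blocks B1..Bk."""
    rows = [list(c) + list(e) for c, e in packets]
    for col in range(k):
        pivot = next(r for r in range(col, len(rows)) if rows[r][col])
        rows[col], rows[pivot] = rows[pivot], rows[col]
        inv = pow(rows[col][col], P - 2, P)    # modular inverse of the pivot
        rows[col] = [(x * inv) % P for x in rows[col]]
        for r in range(len(rows)):
            if r != col and rows[r][col]:
                factor = rows[r][col]
                rows[r] = [(a - factor * b) % P
                           for a, b in zip(rows[r], rows[col])]
    return [row[k:] for row in rows[:k]]

original = [[random.randrange(P) for _ in range(8)] for _ in range(4)]
packets = [encode(original) for _ in range(6)]   # a couple of extra packets
assert decode(packets, k=4) == original          # for (near-)certain independence
```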
Motivations • Internet Applications: • PPLive, PPStream, etc. • Challenge: QoS Issues • Raw bandwidth constraints • Example: PPLive utilizes the significant bandwidth disparity between “University nodes” and “Residential nodes” • Satisfying the demands of content publishers
P2P Live Streaming Can’t Stand on Its Own • P2P as a complement to IP-Multicast • Used where IP-Multicast isn’t enabled • P2P as a way to reduce server load • By sourcing parts of streams from peers, server load might be reduced by 10% • P2P as a way to reduce backbone bandwidth requirements • When core network bandwidth isn’t sufficient
It’s All TCP’s Fault • TCP: per-flow fairness • Browsers • 2-4 TCP flows per web server • Contact a few web servers at a time • Short flows • P2P applications: • Much higher number of TCP connections • Many more endpoints • Long flows
When and How to Apply Traffic Shaping • Current practice: application recognition • Needs: • An application-agnostic way to trigger the traffic shaping • A clear statement to users about what happens