Explore how the Beehive protocol improves lookup times in P2P networks through adaptive caching and structured DHTs, achieving O(1) performance. Learn about the analytical models, replication levels, and proactive caching strategies used to optimize storage and bandwidth overhead. Evaluate the benefits over legacy DNS with CoDoNS and the advantages of resilience and fast updates. Discover a model-driven approach for improving latency-sensitive applications.
Beehive: Achieving O(1) Lookup Performance in P2P Overlays for Zipf-like Query Distributions Venugopalan Ramasubramanian (Rama) and Emin Gün Sirer Cornell University
introduction • caching is widely used to improve latency and to decrease overhead • passive caching • caches distributed throughout the network • store objects that are encountered • not well suited for a large class of applications
problems with passive caching • no performance guarantees • heavy-tail effect • large percentage of queries to unpopular objects • ad-hoc heuristics for cache management • introduces coherency problems • difficult to locate all copies • weak consistency model
overview of beehive • general replication framework for structured DHTs • decentralization, self-organization, resilience • properties • high performance: O(1) average lookup time • scalable: minimize number of replicas and reduce storage, bandwidth, and network load • adaptive: promptly respond to changes in popularity – flash crowds
prefix-matching DHTs • $\log_b N$ hops • several RTTs on the Internet [figure: lookup for object 0121 across nodes 2012, 0021, 0112, 0122]
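The hop count above can be illustrated with a toy sketch. It assumes global knowledge of all node identifiers, which a real DHT does not have (each node keeps its own routing table); the function names `shared_prefix_len` and `route` are hypothetical, and the identifiers are the ones shown on the slide.

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading digits that two identifiers have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def route(start: str, key: str, nodes: list[str]) -> list[str]:
    """Toy prefix routing: every hop moves to a node that matches the key
    in more leading digits than the current node does.

    Assumption: the full node list is visible at every hop; this is a
    simplification for illustration only.
    """
    path = [start]
    current = start
    while True:
        have = shared_prefix_len(current, key)
        candidates = [n for n in nodes if shared_prefix_len(n, key) > have]
        if not candidates:
            return path  # no node matches the key any better: lookup ends here
        # Take the smallest improvement, i.e. resolve one digit per hop if possible.
        current = min(candidates, key=lambda n: shared_prefix_len(n, key))
        path.append(current)


if __name__ == "__main__":
    nodes = ["2012", "0021", "0112", "0122"]
    print(route("2012", "0121", nodes))
```

Each hop resolves at least one more digit of the key, so the path length is bounded by the number of digits, which is where the $\log_b N$ figure comes from.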
key intuition • tunable latency • adjust number of objects replicated at each level • fundamental space-time tradeoff [figure: replication levels over nodes 0021, 0112, 0122, 2012]
analytical model • optimization problem: minimize the total number of replicas such that average lookup performance $\le C$ hops • configurable target lookup performance • continuous range, including sub-one-hop targets • minimizing the number of replicas decreases storage and bandwidth overhead
analytical model • zipf-like query distributions with parameter $\alpha$ • number of queries to the r-th most popular object $\propto 1/r^{\alpha}$ • fraction of queries for the m most popular objects $\approx (m^{1-\alpha} - 1) / (M^{1-\alpha} - 1)$, where M is the total number of objects • level of replication • nodes share i prefix digits with the object • i-hop lookup latency • replicated on $N/b^i$ nodes
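A small sketch of the Zipf-like model above, comparing the exact share of queries captured by the m most popular objects with the closed-form approximation from the slide. The function names are hypothetical; `alpha` stands for the zipf parameter and `M` for the total number of objects.

```python
def zipf_fraction_exact(m: int, M: int, alpha: float) -> float:
    """Share of queries going to the m most popular of M objects,
    assuming the r-th most popular object receives queries in proportion to 1/r**alpha."""
    top = sum(r ** -alpha for r in range(1, m + 1))
    total = sum(r ** -alpha for r in range(1, M + 1))
    return top / total


def zipf_fraction_approx(m: int, M: int, alpha: float) -> float:
    """Closed-form approximation (m^(1-alpha) - 1) / (M^(1-alpha) - 1).
    Only defined for alpha != 1; it is rough for very small m."""
    return (m ** (1 - alpha) - 1) / (M ** (1 - alpha) - 1)


if __name__ == "__main__":
    # Parameters taken from the evaluation slide: 40960 objects, alpha = 0.91.
    for m in (100, 1000, 10000):
        print(m, zipf_fraction_exact(m, 40960, 0.91), zipf_fraction_approx(m, 40960, 0.91))
```

The heavy tail shows up as the slow growth of this fraction with m: a large share of queries still goes to unpopular objects.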
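optimization problem • minimize (storage/bandwidth): $x_0 + \frac{x_1}{b} + \frac{x_2}{b^2} + \dots + \frac{x_{K-1}}{b^{K-1}}$ • such that (average lookup time is at most C hops): $K - (x_0^{1-\alpha} + x_1^{1-\alpha} + x_2^{1-\alpha} + \dots + x_{K-1}^{1-\alpha}) \le C$ • and $x_0 \le x_1 \le x_2 \le \dots \le x_{K-1} \le 1$ • b: base • K: $\log_b N$ • $x_i$: fraction of objects replicated at level i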
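optimal closed-form solution • $x^*_i = \left[ \frac{d^i (K' - C)}{1 + d + \dots + d^{K'-1}} \right]^{\frac{1}{1-\alpha}}$ for $0 \le i \le K' - 1$ • $x^*_i = 1$ for $K' \le i \le K$ • where $d = b^{(1-\alpha)/\alpha}$ • K' (typically 2 or 3) is determined by setting $x^*_{K'-1} \le 1$, i.e. $\frac{d^{K'-1} (K' - C)}{1 + d + \dots + d^{K'-1}} \le 1$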
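The closed-form solution can be evaluated numerically along these lines. This is a sketch under stated assumptions, not the authors' implementation: the function name is hypothetical, K is passed as an integer, and K' is taken to be the largest value (at most K) for which the condition $x^*_{K'-1} \le 1$ holds, which is one reading of the slide.

```python
def optimal_replication(b: int, K: int, C: float, alpha: float) -> list[float]:
    """Return [x_0, ..., x_{K-1}], the fraction of objects replicated at each level.

    b: base of the DHT, K: number of levels (integer), C: target average
    lookup hops, alpha: zipf parameter with 0 < alpha < 1.
    """
    if not 0 < alpha < 1:
        raise ValueError("this sketch only covers 0 < alpha < 1")
    if not 0 <= C <= K:
        raise ValueError("target C must lie between 0 and K hops")

    d = b ** ((1 - alpha) / alpha)

    # Assumption: choose the largest K' <= K whose top level satisfies x_{K'-1} <= 1.
    for k_prime in range(K, 0, -1):
        geom = sum(d ** j for j in range(k_prime))          # 1 + d + ... + d^(K'-1)
        top = d ** (k_prime - 1) * (k_prime - C) / geom     # x_{K'-1} before the exponent
        if top <= 1:
            break

    xs = []
    for i in range(K):
        if i < k_prime:
            base = max(d ** i * (k_prime - C) / geom, 0.0)  # guard against negative bases
            xs.append(base ** (1 / (1 - alpha)))
        else:
            xs.append(1.0)                                  # levels K' and above: replicate everything
    return xs


if __name__ == "__main__":
    # Illustrative parameters: base 16, three levels, one-hop target, alpha = 0.91.
    print(optimal_replication(b=16, K=3, C=1.0, alpha=0.91))
```

A quick sanity check is that the returned fractions are non-decreasing and that the sum of $x_i^{1-\alpha}$ equals $K - C$, i.e. the lookup-time constraint is met with equality.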
beehive: system overview • estimation • popularity of objects, zipf parameter • local measurement, limited aggregation • replication • apply analytical model independently at each node • push new replicas to nodes at most one hop away
beehive replication protocol [figure: object 0121 with home node E (prefix 012*) at level 3; level-2 replicas on nodes B, E, I (prefix 01*); level-1 replicas on nodes A through I (prefix 0*)]
mutable objects • leverage the underlying structure of DHT • replication level indicates the locations of all the replicas • proactive propagation to all nodes from the home node • home node sends to one-hop neighbors with i matching prefix-digits • level i nodes send to level i+1 nodes
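The point that the replication level determines where every replica lives can be sketched as follows. The helper names and node identifiers are hypothetical, and the sketch again assumes a global view of the node list purely for illustration.

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading digits that two identifiers have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def replica_holders(object_id: str, level: int, nodes: list[str]) -> list[str]:
    """Nodes holding a replica of an object replicated at the given level:
    all nodes that share at least `level` prefix digits with the object."""
    return [n for n in nodes if shared_prefix_len(n, object_id) >= level]


if __name__ == "__main__":
    nodes = ["0122", "0112", "0021", "2012", "1201"]  # hypothetical node identifiers
    for level in (3, 2, 1, 0):
        print(level, replica_holders("0121", level, nodes))
```

Because the holder set is a function of the object identifier and its level alone, an update issued at the home node can be pushed level by level to every copy without a separate directory of replicas.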
implementation and evaluation • implemented using Pastry as the underlying DHT • evaluation using a real DNS workload • MIT DNS trace (zipf parameter 0.91) • 1024 nodes, 40960 objects • compared with passive caching on Pastry • main properties evaluated • lookup performance • storage and bandwidth overhead • adaptation to changes in query distribution
evaluation: lookup performance • passive caching is not very effective because of the heavy-tailed query distribution and mutable objects • beehive converges to the target of 1 hop
evaluation: overhead [figures: storage overhead; bandwidth overhead]
evaluation: flash crowds [figure: lookup performance]
Cooperative Domain Name System (CoDoNS) • replacement for legacy DNS • secure authentication through DNSSEC • incremental deployment path • completely transparent to clients • uses legacy DNS to populate resource records on demand • deployed on PlanetLab
advantages of CoDoNS • higher performance than legacy DNS • median latency of 7 ms for CoDoNS (PlanetLab), 39 ms for legacy DNS • resilience against denial-of-service attacks • self-configuration after host and network failures • fast update propagation
conclusions • model-driven proactive caching • O(1) lookup performance with optimal replicas • beehive: a general replication framework • structured overlays with uniform fan-out • high performance, resilience, improved availability • well-suited for latency sensitive applications www.cs.cornell.edu/people/egs/beehive
typical values of zipf parameter • MIT DNS trace: $\alpha$ = 0.91 • Web traces:
security issues in beehive • underlying DHT • corruption in routing tables • [Castro, Druschel, Ganesh, Rowstron, Wallach] • beehive • misrepresentation of popularity • remove outliers • application • corruption of data • certificates (e.g. DNSSEC)