Beehive : Achieving O(1) Lookup Performance in P2P Overlays for Zipf-like Query Distributions

Venugopalan Ramasubramanian (Rama) and Emin Gün Sirer. Cornell University.

Presentation Transcript


  1. Beehive: Achieving O(1) Lookup Performance in P2P Overlays for Zipf-like Query Distributions Venugopalan Ramasubramanian (Rama) and Emin Gün Sirer Cornell University

  2. introduction • caching is widely used to improve latency and to decrease overhead • passive caching • caches distributed throughout the network • store objects that are encountered • not well suited for a large class of applications

  3. problems with passive caching • no performance guarantees • heavy-tail effect • large percentage of queries to unpopular objects • ad-hoc heuristics for cache management • introduces coherency problems • difficult to locate all copies • weak consistency model

  4. overview of beehive • general replication framework for structured DHTs • decentralization, self-organization, resilience • properties • high performance: O(1) average lookup time • scalable: minimize number of replicas and reduce storage, bandwidth, and network load • adaptive: promptly respond to changes in popularity – flash crowds

  5. prefix-matching DHTs • [figure: lookup for object 0121 across nodes 2012, 0021, 0112, 0122] • $\log_b N$ hops • several RTTs on the Internet
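
The prefix-matching idea on this slide can be illustrated with a minimal Python sketch. The helper names `shared_prefix_len` and `max_hops` are hypothetical, and reading the four node identifiers from the figure as successive hops of one route is an assumption; the slide only states the identifiers and the $\log_b N$ hop bound.

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Count the leading digits that two identifiers have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def max_hops(num_nodes: int, base: int) -> int:
    """Smallest h with base**h >= num_nodes, i.e. ceil(log_b N), in exact integer math."""
    hops, reach = 0, 1
    while reach < num_nodes:
        reach *= base
        hops += 1
    return hops


# Identifiers taken from the slide; the ordering as a route is assumed.
object_id = "0121"
route = ["2012", "0021", "0112", "0122"]
matches = [shared_prefix_len(node, object_id) for node in route]
```

In this reading each hop matches one more digit of the object identifier, which is why the hop count grows with the number of digits, i.e. logarithmically in the number of nodes.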

  6. key intuition • tunable latency • adjust number of objects replicated at each level • fundamental space-time tradeoff • [figure: nodes 0021, 0112, 0122, 2012]

  7. analytical model • optimization problem: minimize total number of replicas, s.t. average lookup performance $\le C$ • configurable target lookup performance • continuous range, sub one-hop • minimizing number of replicas decreases storage and bandwidth overhead

  8. analytical model • zipf-like query distributions with parameter $\alpha$ • number of queries to $r$th popular object $\propto 1/r^{\alpha}$ • fraction of queries for $m$ most popular objects $\approx (m^{1-\alpha} - 1) / (M^{1-\alpha} - 1)$ • level of replication • nodes share $i$ prefix-digits with the object • $i$ hop lookup latency • replicated on $N/b^i$ nodes
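
The two quantities on this slide are easy to evaluate numerically. The following is a minimal Python sketch; the function names `zipf_fraction` and `replicas_at_level` are hypothetical, and the sample values (1024 nodes, 40960 objects, zipf parameter 0.91) are borrowed from the evaluation slide, with base 16 as an assumed digit base.

```python
def zipf_fraction(m: int, total: int, alpha: float) -> float:
    """Approximate fraction of queries that go to the m most popular of
    `total` objects, using (m^(1-a) - 1) / (M^(1-a) - 1). Requires a != 1."""
    if alpha == 1:
        raise ValueError("the approximation is undefined for alpha == 1")
    return (m ** (1 - alpha) - 1) / (total ** (1 - alpha) - 1)


def replicas_at_level(num_nodes: int, base: int, level: int) -> float:
    """Number of nodes sharing `level` prefix digits with an object: N / b^level."""
    return num_nodes / base ** level


# Share of queries covered by the top 10% of objects under the approximation.
top_tenth = zipf_fraction(4096, 40960, 0.91)
# Replica count for an object replicated at level 2 in a 1024-node, base-16 overlay.
level_two = replicas_at_level(1024, 16, 2)
```

Replicating at a lower level shortens lookups for that object by one hop per level but multiplies its replica count by $b$, which is the space-time tradeoff the model optimizes.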

  9. optimization problem • minimize (storage/bandwidth): $x_0 + x_1/b + x_2/b^2 + \dots + x_{K-1}/b^{K-1}$ • such that (average lookup time is $C$ hops): $K - (x_0^{1-\alpha} + x_1^{1-\alpha} + x_2^{1-\alpha} + \dots + x_{K-1}^{1-\alpha}) \le C$ • and $x_0 \le x_1 \le x_2 \le \dots \le x_{K-1} \le 1$ • $b$: base • $K$: $\log_b(N)$ • $x_i$: fraction of objects replicated at level $i$

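  10. optimal closed-form solution • $x^*_i = \left[ \frac{d^i (K' - C)}{1 + d + \dots + d^{K'-1}} \right]^{\frac{1}{1-\alpha}}$ for $0 \le i \le K' - 1$ • $x^*_i = 1$ for $K' \le i \le K$ • where $d = b^{(1-\alpha)/\alpha}$ • $K'$ (typically 2 or 3) is determined by setting $x^*_{K'-1} \le 1$, i.e. $d^{K'-1} (K' - C) / (1 + d + \dots + d^{K'-1}) \le 1$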
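
The closed-form solution can be evaluated directly. The Python sketch below is an illustration under stated assumptions, not the authors' implementation: the function name `optimal_replication` is hypothetical, $K$ is taken as an integer, $0 < \alpha < 1$ and $0 \le C < K$ are required, and $K'$ is chosen as the largest value not exceeding $K$ that satisfies the condition on the slide.

```python
def optimal_replication(b: int, K: int, C: float, alpha: float):
    """Return (K', [x_0, ..., x_{K-1}]) from the closed-form solution.

    Assumptions of this sketch: integer K, 0 < alpha < 1, 0 <= C < K,
    and K' is the largest value <= K with x*_{K'-1} <= 1.
    """
    if not 0 < alpha < 1:
        raise ValueError("this sketch assumes 0 < alpha < 1")
    if not 0 <= C < K:
        raise ValueError("this sketch assumes 0 <= C < K")

    d = b ** ((1 - alpha) / alpha)

    # Decrement K' until d^(K'-1) (K' - C) / (1 + d + ... + d^(K'-1)) <= 1.
    for k_prime in range(K, 0, -1):
        geometric_sum = sum(d ** j for j in range(k_prime))
        scale = (k_prime - C) / geometric_sum
        if d ** (k_prime - 1) * scale <= 1:
            break

    # Levels below K' get a fractional share; levels K'..K-1 hold every object.
    x = [(d ** i * scale) ** (1 / (1 - alpha)) for i in range(k_prime)]
    x += [1.0] * (K - k_prime)
    return k_prime, x


# Example values: base 16, three levels, one-hop target, zipf parameter 0.91.
k_prime, fractions = optimal_replication(b=16, K=3, C=1.0, alpha=0.91)
achieved = 3 - sum(v ** (1 - 0.91) for v in fractions)
```

With these inputs `achieved` recovers the target $C$ up to floating-point error, since the fractional terms sum to $K' - C$ and the remaining $K - K'$ levels each contribute 1.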

  11. latency - overhead trade off

  12. beehive: system overview • estimation • popularity of objects, zipf parameter • local measurement, limited aggregation • replication • apply analytical model independently at each node • push new replicas to nodes at most one hop away
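
One way a node might turn the model's output into a per-object decision is sketched below in Python. This is an assumption for illustration only: the function name `replication_level` is hypothetical, and the rule that the most popular objects go to the lowest levels follows from reading $x_i$ as the fraction of objects replicated at level $i$; the slide does not spell out the mapping.

```python
def replication_level(rank: int, num_objects: int, fractions: list) -> int:
    """Pick the lowest level whose object budget covers this popularity rank.

    rank: 1 for the most popular object (assumed to come from local estimation).
    fractions: non-decreasing x_i values, one per level 0..K-1.
    Returns K (home node only) when no level covers the rank.
    """
    for level, fraction in enumerate(fractions):
        if rank <= fraction * num_objects:
            return level
    return len(fractions)


# Hypothetical fractions for a three-level overlay with 1000 objects.
example_fractions = [0.002, 0.01, 0.2]
levels = [replication_level(r, 1000, example_fractions) for r in (1, 5, 100, 500)]
```

Because each node can evaluate this rule from its own popularity estimates, no central coordinator is needed to decide where replicas go.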

  13. beehive replication protocol • [figure: object 0121 • level 3: home node E (012*) • level 2: nodes B, E, I (01*) • level 1: nodes A, B, C, D, E, F, G, H, I (0*)]

  14. mutable objects • leverage the underlying structure of DHT • replication level indicates the locations of all the replicas • proactive propagation to all nodes from the home node • home node sends to one-hop neighbors with i matching prefix-digits • level i nodes send to level i+1 nodes
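
Because the replication level determines where every replica lives, the set of nodes that must receive an update can be computed from identifiers alone. The Python sketch below illustrates this; the function names are hypothetical, the node identifiers are reused from the prefix-matching slide, and a flat list of all nodes stands in for the hop-by-hop propagation through routing tables that the slide describes.

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Count the leading digits that two identifiers have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def replica_holders(nodes: list, object_id: str, level: int) -> list:
    """Nodes holding a replica of an object replicated at `level`:
    every node sharing at least `level` prefix digits with the object."""
    return [n for n in nodes if shared_prefix_len(n, object_id) >= level]


# Simplification: a global node list instead of per-node routing tables.
nodes = ["2012", "0021", "0112", "0122"]
holders_by_level = {lvl: replica_holders(nodes, "0121", lvl) for lvl in range(4)}
```

An update issued at the home node only needs to reach the nodes in `holders_by_level[level]`, so no separate directory of replica locations has to be maintained.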

  15. implementation and evaluation • implemented using Pastry as the underlying DHT • evaluation using a real DNS workload • MIT DNS trace (zipf parameter 0.91) • 1024 nodes, 40960 objects • compared with passive caching on Pastry • main properties evaluated • lookup performance • storage and bandwidth overhead • adaptation to changes in query distribution

  16. evaluation: lookup performance • passive caching is not very effective because of heavy-tail query distribution and mutable objects • beehive converges to the target of 1 hop

  17. evaluation: overhead • [figures: storage, bandwidth]

  18. evaluation: flash crowds lookup performance

  19. evaluation: zipf parameter change

  20. Cooperative Domain Name System (CoDoNS) • replacement for legacy DNS • secure authentication through DNSSEC • incremental deployment path • completely transparent to clients • uses legacy DNS to populate resource records on demand • deployed on planet-lab

  21. advantages of CoDoNS • higher performance than legacy DNS • median latency of 7 ms for CoDoNS (planet-lab), 39 ms for legacy DNS • resilience against denial of service attacks • self configuration after host and network failures • fast update propagation

  22. conclusions • model-driven proactive caching • O(1) lookup performance with optimal replicas • beehive: a general replication framework • structured overlays with uniform fan-out • high performance, resilience, improved availability • well-suited for latency sensitive applications www.cs.cornell.edu/people/egs/beehive

  23. evaluation: zipf parameter change

  24. evaluation: instantaneous bandwidth overhead

  25. lookup performance: target 0.5 hops

  26. lookup performance: planet-lab

  27. typical values of zipf parameter • MIT DNS trace: $\alpha$ = 0.91 • Web traces:

  28. comparative overview of structured DHTs

  29. O(1) structured DHTs

  30. security issues in beehive • underlying DHT • corruption in routing tables • [Castro, Druschel, Ganesh, Rowstron, Wallach] • beehive • misrepresentation of popularity • remove outliers • application • corruption of data • certificates (e.g. DNSSEC)

  31. Beehive DNS: Lookup Performance
