
Beehive: Achieving O(1) Lookup Performance in P2P Overlays for Zipf-like Query Distributions

Venugopalan Ramasubramanian (Rama)

and

Emin Gün Sirer

Cornell University


introduction

  • caching is widely used to improve latency and to decrease overhead

  • passive caching

    • caches distributed throughout the network

    • store objects that are encountered

  • not well-suited for a large class of applications


problems with passive caching

  • no performance guarantees

  • heavy-tail effect

    • large percentage of queries to unpopular objects

    • ad-hoc heuristics for cache management

  • introduces coherency problems

    • difficult to locate all copies

    • weak consistency model


overview of beehive

  • general replication framework for structured DHTs

    • decentralization, self-organization, resilience

  • properties

    • high performance: O(1) average lookup time

    • scalable: minimize number of replicas and reduce storage, bandwidth, and network load

    • adaptive: promptly respond to changes in popularity – flash crowds


prefix-matching DHTs

[Figure: lookup for object 0121 in a prefix-matching DHT; nodes shown: 2012, 0021, 0112, 0122]

  • $\log_b N$ hops

    • several RTTs on the Internet
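The hop count above can be illustrated with a minimal sketch of prefix-matching routing. It is a simplification that assumes every node knows every other node and that each hop fixes exactly one more digit; real DHTs such as Pastry keep per-node routing tables instead. The function names are hypothetical; the node and object identifiers are the ones shown in the figure.

```python
from typing import List, Optional


def shared_prefix_len(a: str, b: str) -> int:
    """Count the leading digits that two identifiers have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def next_hop(current: str, key: str, nodes: List[str]) -> Optional[str]:
    """Return a node matching the key in more digits than `current`, if any."""
    have = shared_prefix_len(current, key)
    better = [n for n in nodes if shared_prefix_len(n, key) > have]
    if not better:
        return None
    # Take the smallest improvement to mimic one-digit-per-hop routing.
    return min(better, key=lambda n: shared_prefix_len(n, key))


def route(start: str, key: str, nodes: List[str]) -> List[str]:
    """Follow next_hop until no node matches the key any better."""
    path = [start]
    while True:
        nxt = next_hop(path[-1], key, nodes)
        if nxt is None:
            return path
        path.append(nxt)


NODES = ["2012", "0021", "0112", "0122"]
print(route("2012", "0121", NODES))  # ['2012', '0021', '0112', '0122']
```

Each hop matches one more digit of the key, so the path length grows with the number of digits, that is, with $\log_b N$.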


key intuition

  • tunable latency

    • adjust number of objects replicated at each level

  • fundamental space-time tradeoff

[Figure: nodes 0021, 0112, 0122, 2012]


analytical model

  • optimization problem

    minimize: total number of replicas, s.t.,

    average lookup performance $\le C$

  • configurable target lookup performance

    • continuous range, sub one-hop

  • minimizing number of replicas decreases storage and bandwidth overhead


analytical model

  • zipf-like query distributions with parameter $\alpha$

    • number of queries to the $r$th most popular object $\propto 1/r^{\alpha}$

    • fraction of queries for the $m$ most popular objects $\approx$

      $(m^{1-\alpha} - 1) / (M^{1-\alpha} - 1)$

  • level of replication

    • nodes share i prefix-digits with the object

    • i hop lookup latency

    • replicated on $N/b^i$ nodes
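The quantities on this slide can be checked numerically with a small sketch. It assumes that M is the total number of objects, and it borrows M = 40960, N = 1024, and a zipf parameter of 0.91 from the evaluation slide; the base b = 16 and all function names are illustrative assumptions.

```python
def zipf_fraction_exact(m: int, M: int, alpha: float) -> float:
    """Exact share of queries that go to the m most popular of M objects."""
    top = sum(r ** -alpha for r in range(1, m + 1))
    total = sum(r ** -alpha for r in range(1, M + 1))
    return top / total


def zipf_fraction_approx(m: int, M: int, alpha: float) -> float:
    """Closed-form approximation of the same share, as given on the slide."""
    return (m ** (1 - alpha) - 1) / (M ** (1 - alpha) - 1)


def replicas_at_level(N: int, b: int, i: int) -> float:
    """Number of nodes that share i prefix digits with an object."""
    return N / b ** i


M, ALPHA = 40960, 0.91
for m in (10, 100, 1000, M):
    exact = zipf_fraction_exact(m, M, ALPHA)
    approx = zipf_fraction_approx(m, M, ALPHA)
    print(m, round(exact, 3), round(approx, 3))

# Replication at level 1 with N = 1024 nodes and an assumed base of 16.
print(replicas_at_level(1024, 16, 1))
```

The approximation is coarse for very small m (it gives 0 for m = 1) and becomes exact at m = M.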


optimization problem

minimize (storage/bandwidth)

$x_0 + x_1/b + x_2/b^2 + \dots + x_{K-1}/b^{K-1}$

such that (average lookup time is $C$ hops)

$K - (x_0^{1-\alpha} + x_1^{1-\alpha} + x_2^{1-\alpha} + \dots + x_{K-1}^{1-\alpha}) \le C$

and

$x_0 \le x_1 \le x_2 \le \dots \le x_{K-1} \le 1$

$b$: base, $K$: $\log_b(N)$

$x_i$: fraction of objects replicated at level $i$
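As a rough illustration, the objective and constraints can be written as three small functions that evaluate a candidate assignment of replication fractions. This is only a sketch: the function names are hypothetical, and the list `xs` stands for $x_0, \dots, x_{K-1}$.

```python
from typing import Sequence


def storage_cost(xs: Sequence[float], b: int) -> float:
    """Objective: x_0 + x_1/b + ... + x_{K-1}/b^(K-1)."""
    return sum(x / b ** i for i, x in enumerate(xs))


def average_hops(xs: Sequence[float], alpha: float) -> float:
    """Left-hand side of the lookup constraint: K minus the sum of x_i^(1-alpha)."""
    return len(xs) - sum(x ** (1 - alpha) for x in xs)


def is_feasible(xs: Sequence[float], alpha: float, C: float) -> bool:
    """Check 0 <= x_0 <= x_1 <= ... <= x_{K-1} <= 1 and the lookup constraint."""
    ordered = all(lo <= hi for lo, hi in zip(xs, xs[1:]))
    bounded = xs[0] >= 0 and xs[-1] <= 1
    return ordered and bounded and average_hops(xs, alpha) <= C


# Replicating every object at every level gives zero hops at maximum cost.
print(storage_cost([1, 1, 1], b=2), average_hops([1, 1, 1], alpha=0.91))
```

Replicating nothing (all fractions 0) costs nothing but leaves the average at K hops, which is the tradeoff the optimization resolves.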


optimal closed-form solution

$$
x^*_i =
\begin{cases}
\left[ \dfrac{d^i (K' - C)}{1 + d + \dots + d^{K'-1}} \right]^{\frac{1}{1-\alpha}}, & 0 \le i \le K' - 1 \\
1, & K' \le i \le K
\end{cases}
$$

where $d = b^{(1-\alpha)/\alpha}$

$K'$ (typically 2 or 3) is determined by setting

$x^*_{K'-1} \le 1 \;\Rightarrow\; d^{K'-1} (K' - C) / (1 + d + \dots + d^{K'-1}) \le 1$
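A minimal sketch of evaluating the closed-form solution follows. It assumes $0 < \alpha < 1$ and $0 \le C < K$, rounds $K$ to an integer, and picks the largest $K'$ that satisfies the condition above, which is one reading of how $K'$ is chosen. The function name and the example parameters (b = 16, K = 3) are illustrative assumptions; the zipf parameter 0.91 is the one quoted for the MIT DNS trace.

```python
from typing import List, Tuple


def closed_form_levels(b: int, alpha: float, C: float, K: int) -> Tuple[int, List[float]]:
    """Return (K', [x_0, ..., x_{K-1}]) for the closed-form solution."""
    if not 0 < alpha < 1:
        raise ValueError("this sketch assumes 0 < alpha < 1")
    if not 0 <= C < K:
        raise ValueError("this sketch assumes 0 <= C < K")
    d = b ** ((1 - alpha) / alpha)
    # Assumption: take the largest K' for which x_{K'-1} <= 1 holds.
    # The loop always stops by K' = 1, where the ratio is 1 - C <= 1.
    for k_prime in range(K, 0, -1):
        geom = sum(d ** j for j in range(k_prime))  # 1 + d + ... + d^(K'-1)
        if d ** (k_prime - 1) * (k_prime - C) / geom <= 1:
            break
    xs = [(d ** i * (k_prime - C) / geom) ** (1 / (1 - alpha))
          for i in range(k_prime)]
    xs += [1.0] * (K - k_prime)  # levels K' and above are fully replicated
    return k_prime, xs


k_prime, xs = closed_form_levels(b=16, alpha=0.91, C=1.0, K=3)
print(k_prime, xs)
# The lookup constraint is met with equality: K - sum(x_i^(1-alpha)) equals C.
print(3 - sum(x ** (1 - 0.91) for x in xs))
```

Substituting the result back into the lookup constraint recovers the target C up to floating-point error, which is a quick sanity check on the reconstruction.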



beehive: system overview

  • estimation

    • popularity of objects, zipf parameter

    • local measurement, limited aggregation

  • replication

    • apply analytical model independently at each node

    • push new replicas to nodes at most one hop away


beehive replication protocol

[Figure: replication of object 0121, home node E. Level 3: E (prefix 012*). Level 2: B, E, I (prefix 01*). Level 1: A through I (prefix 0*).]


mutable objects

  • leverage the underlying structure of DHT

    • replication level indicates the locations of all the replicas

  • proactive propagation to all nodes from the home node

    • home node sends to one-hop neighbors with i matching prefix-digits

    • level i nodes send to level i+1 nodes
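The point that the replication level identifies every replica can be sketched as follows. This is a simplification: it pushes an update directly to all replicas rather than hop by hop through routing tables, and the per-object version number, the store layout, and the function names are assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def shared_prefix_len(a: str, b: str) -> int:
    """Count the leading digits that two identifiers have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def replica_set(nodes: List[str], object_id: str, level: int) -> List[str]:
    """Nodes holding an object replicated at `level`: at least `level` matching digits."""
    return [n for n in nodes if shared_prefix_len(n, object_id) >= level]


def push_update(stores: Dict[str, Dict[str, Tuple[int, str]]],
                object_id: str, level: int, value: str, version: int) -> List[str]:
    """Write (version, value) to every replica that holds an older copy or none."""
    updated = []
    for node in replica_set(list(stores), object_id, level):
        current = stores[node].get(object_id)
        if current is None or current[0] < version:
            stores[node][object_id] = (version, value)
            updated.append(node)
    return updated


stores = {n: {} for n in ["2012", "0021", "0112", "0122"]}
print(push_update(stores, "0121", 2, "record-v1", 1))  # ['0112', '0122']
```

Because the replica set is a pure function of the object identifier and its level, no separate directory of copies is needed to locate them.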


implementation and evaluation

  • implemented using Pastry as the underlying DHT

  • evaluation using a real DNS workload

    • MIT DNS trace (zipf parameter 0.91)

    • 1024 nodes, 40960 objects

    • compared with passive caching on Pastry

  • main properties evaluated

    • lookup performance

    • storage and bandwidth overhead

    • adaptation to changes in query distribution


evaluation: lookup performance

passive caching is not very effective because of the heavy-tailed query distribution and mutable objects.

beehive converges to the target of 1 hop


evaluation: overhead

[Figures: storage overhead; bandwidth overhead]


evaluation: flash crowds

[Figure: lookup performance]



Cooperative Domain Name System (CoDoNS)

  • replacement for legacy DNS

    • secure authentication through DNSSEC

  • incremental deployment path

    • completely transparent to clients

    • uses legacy DNS to populate resource records on demand

  • deployed on PlanetLab


Advantages of codons
advantages of CoDoNS

  • higher performance than legacy DNS

    • median latency of 7 ms for CoDoNS (PlanetLab), 39 ms for legacy DNS

  • resilience against denial-of-service attacks

    • self-configuration after host and network failures

  • fast update propagation


conclusions

  • model-driven proactive caching

    • O(1) lookup performance with an optimal number of replicas

  • beehive: a general replication framework

    • structured overlays with uniform fan-out

    • high performance, resilience, improved availability

  • well-suited for latency-sensitive applications

    www.cs.cornell.edu/people/egs/beehive






typical values of zipf parameter

  • MIT DNS trace: $\alpha$ = 0.91

  • Web traces:




security issues in beehive

  • underlying DHT

    • corruption in routing tables

    • [Castro, Druschel, Ganesh, Rowstrom, Wallach]

  • beehive

    • misrepresentation of popularity

    • remove outliers

  • application

    • corruption of data

    • certificates (e.g., DNSSEC)


