Squirrel: A peer-to-peer web cache - PowerPoint PPT Presentation

Squirrel a peer to peer web cache l.jpg
Download
1 / 43

  • 284 Views
  • Updated On :
  • Presentation posted in: Pets / Animals

Squirrel: A peer-to-peer web cache. Sitaram Iyer (Rice University) Joint work with Ant Rowstron (MSR Cambridge) Peter Druschel (Rice University). PODC 2002 / Sitaram Iyer / Tuesday July 23 / Monterey, CA. Web Caching. Latency, External traffic, Load on web servers and routers.

Related searches for Squirrel: A peer-to-peer web cache

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.

Download Presentation

Squirrel: A peer-to-peer web cache

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


Squirrel a peer to peer web cache l.jpg

Squirrel: A peer-to-peer web cache

Sitaram Iyer (Rice University)

Joint work with

Ant Rowstron (MSR Cambridge)

Peter Druschel (Rice University)

PODC 2002 / Sitaram Iyer / Tuesday July 23 / Monterey, CA


Web caching l.jpg

Web Caching

  • Latency,

  • External traffic,

  • Load on web servers and routers.

    Deployed at: Corporate network boundaries, ISPs, Web Servers, etc.


Web cache l.jpg

Web Cache

Browser

Cache

Browser

Web

Server

Client

Centralized Web Cache

Browser

Cache

Browser

Client

Corporate LAN

Internet


Cooperative web cache l.jpg

Cooperative Web Cache

Browser

Cache

Web Cache

Browser

Web Cache

Web

Server

Client

Web Cache

Web Cache

Web Cache

Browser

Cache

Browser

Client

Corporate LAN

Internet


Decentralized web cache l.jpg

Decentralized Web Cache

Squirrel

Browser

Cache

Browser

Web

Server

Client

Browser

Cache

Browser

Client

Corporate LAN

Internet


Distributed hash table l.jpg

Distributed Hash Table

Peer-to-peer location service: Pastry

nodes

k1,v1

k2,v2

k3,v3

Operations:

Insert(k,v)

Lookup(k)

Peer-to-peer routing and location substrate

k4,v4

k5,v5

k6,v6

<key,value>

  • Completely decentralized and self-organizing

  • Fault-tolerant, scalable, efficient


Why peer to peer l.jpg

Why peer-to-peer?

  • Cost of dedicated web cache

    No additional hardware

  • Administrative effort

    Self-organizing network

  • Scaling implies upgrading

    Resources grow with clients


Setting l.jpg

Setting

  • Corporate LAN

  • 100 - 100,000 desktop machines

  • Located in a single building or campus

  • Each node runs an instance of Squirrel

  • Sets it as the browser’s proxy


Mapping squirrel onto pastry l.jpg

Mapping Squirrel onto Pastry

Two approaches:

  • Home-store

  • Directory


Home store model l.jpg

Internet

LAN

Home-store model

client

URL hash

home


Home store model11 l.jpg

Home-store model

client

home

…that’s how it works!


Directory model l.jpg

Directory model

Client nodes always cache objects locally.

Home-store: home node also stores objects.

Directory: the home node only stores pointers to recent clients, and forwards requests.


Directory model13 l.jpg

Internet

LAN

Directory model

client

home


Directory model14 l.jpg

Directory model

client

home

Randomly choose entry from table


Directory advantages l.jpg

Directory: Advantages

Avoids storing unnecessary copies of objects.

Rapidly changing directory for popular objects seems to improve load balancing.

Home-store scheme can incur hotspots.


Directory disadvantages l.jpg

Directory: Disadvantages

Cache insertion only happens at clients, so:

  • active clients store all the popular objects,

  • inactive clients waste most of their storage.

    Implications:

  • Reduced cache size.

  • Load imbalance.


Directory load spike example l.jpg

Directory: Load spike example

  • Web page with many embedded images, or

  • Periods of heavy browsing.

    Many home nodes point to such clients!

Evaluate …


Trace characteristics l.jpg

Trace characteristics


Total external traffic l.jpg

Total external traffic

Redmond

105

No web cache

100

95

Directory

Total external traffic (GB)

[lower is better]

Home-store

90

Centralized cache

85

0.001

0.01

0.1

1

10

100

Per-node cache size (in MB)


Total external traffic20 l.jpg

Total external traffic

Cambridge

6.1

No web cache

6

5.9

Directory

Total external traffic (GB)

5.8

[lower is better]

Home-store

5.7

5.6

Centralized cache

5.5

0.001

0.01

0.1

1

10

100

Per-node cache size (in MB)


Lan hops l.jpg

LAN Hops

Redmond

100%

80%

60%

% of cacheable requests

40%

20%

0%

0

1

2

3

4

5

6

Total hops within the LAN

Centralized

Home-store

Directory


Lan hops22 l.jpg

LAN Hops

Cambridge

100%

80%

60%

% of cacheable requests

40%

20%

0%

0

1

2

3

4

5

Total hops within the LAN

Centralized

Home-store

Directory


Load in requests per sec l.jpg

Load in requests per sec

Redmond

100000

Home-store

Directory

10000

1000

Number of times observed

100

10

1

0

10

20

30

40

50

Max objects served per-node / second


Load in requests per sec24 l.jpg

Load in requests per sec

Cambridge

1e+07

Home-store

Directory

1e+06

100000

10000

Number of times observed

1000

100

10

1

0

10

20

30

40

50

Max objects served per-node / second


Load in requests per min l.jpg

Load in requests per min

Redmond

100

Home-store

Directory

10

Number of times observed

1

0

50

100

150

200

250

300

350

Max objects served per-node / minute


Load in requests per min26 l.jpg

Load in requests per min

Cambridge

Home-store

Directory

10000

1000

Number of times observed

100

10

1

0

20

40

60

80

100

120

Max objects served per-node / minute


Fault tolerance l.jpg

Fault tolerance

Sudden node failures result in

partial loss of cached content.

Home-store:Proportional to failed nodes.

Directory:More vulnerable.


Fault tolerance28 l.jpg

Fault tolerance

If 1% of Squirrel nodes abruptly crash, the fraction of lost cached content is:


Conclusions l.jpg

Conclusions

  • Possible to decentralize web caching.

  • Performance comparable to a centralized web cache,

  • Is better in terms of cost, scalability, and administration effort, and

  • Under our assumptions, the home-store scheme is superior to the directory scheme.


Other aspects of squirrel l.jpg

Other aspects of Squirrel

  • Adaptive replication

    • Hotspot avoidance

    • Improved robustness

  • Route caching

    • Fewer LAN hops


Thanks l.jpg

Thanks.


Backup storage utilization l.jpg

(backup) Storage utilization


Backup fault tolerance l.jpg

(backup) Fault tolerance


Backup full home store protocol l.jpg

(backup) Full home-store protocol

other

req

other

req

req

(LAN)

(WAN)

a : object or notmod from home

b : req

home

client

1

b

b : object or notmod from origin

2

3

origin

server


Backup full directory protocol l.jpg

other

req

other

req

req

a : no dir, go to origin. Also d

a , d : req

1

1

home

2

2

client

b : not-modified

dir

a , d

origin

c ,e : object

3

3

2

4

c ,e : req

server

1

1

dele-

gate

object or

e

3

not-modified

origin

e : cGET req

2

server

(backup) Full directory protocol


Backup peer to peer computing l.jpg

(backup) Peer-to-peer Computing

Decentralize a distributed protocol:

  • Scalable

  • Self-organizing

  • Fault tolerant

  • Load balanced

    Not automatic!!


Decentralized web cache37 l.jpg

Decentralized Web Cache

Browser

Cache

Browser

Web

Server

Browser

Cache

Browser

Internet

LAN


Challenge l.jpg

Challenge

Decentralized web caching algorithm:

Need to achieve those benefits in practice!

Need to keep overhead unnoticeably low.

Node failures should not become significant.


Peer to peer routing e g pastry l.jpg

Peer-to-peer routing, e.g., Pastry

Peer-to-peer object location and routing substrate = Distributed Hash Table.

Reliably maps an object key to a live node.

Routes in log16(N)steps

(e.g. 3-4 steps for 100,000 nodes)


Home store is better l.jpg

Home-store is better!

Simpler home-store scheme achieves load balancing by hash function randomization.

Directory scheme implicitly relies on access patterns for load distribution.


Directory scheme seems better l.jpg

Directory scheme seems better…

Avoids storing unnecessary copies of objects.

Rapidly changing directory for popular objects results in load balancing.


Interesting difference l.jpg

Interesting difference

Consider:

  • Web page with many images, or

  • Heavily browsing node

    Directory:many pointers to some node.

    Home-store:natural load balancing.

Evaluate …


Fault tolerance43 l.jpg

Fault tolerance

When a single Squirrel node crashes, the fraction of lost cached content is:


  • Login