Squirrel: A peer-to-peer web cache

Sitaram Iyer (Rice University). Joint work with Ant Rowstron (MSR Cambridge) and Peter Druschel (Rice University). PODC 2002 / Sitaram Iyer / Tuesday July 23 / Monterey, CA.


Squirrel: A peer-to-peer web cache

Sitaram Iyer (Rice University)

Joint work with

Ant Rowstron (MSR Cambridge)

Peter Druschel (Rice University)

PODC 2002 / Sitaram Iyer / Tuesday July 23 / Monterey, CA

Web Caching

Reduces:

  • Latency,
  • External traffic,
  • Load on web servers and routers.

Deployed at: Corporate network boundaries, ISPs, Web Servers, etc.

Web Cache

[Figure: Centralized Web Cache. Clients (browser and browser cache) on the corporate LAN, a centralized web cache, and the web server on the Internet.]

Cooperative Web Cache

[Figure: clients (browser and browser cache) on the corporate LAN, several web caches, and the web server on the Internet.]

Decentralized Web Cache

[Figure, labelled "Squirrel": clients (browser and browser cache) on the corporate LAN, and the web server on the Internet.]

Distributed Hash Table

Peer-to-peer location service: Pastry

Peer-to-peer routing and location substrate

[Figure: <key,value> pairs (k1,v1) to (k6,v6) spread across the nodes.]

Operations: Insert(k,v), Lookup(k)

  • Completely decentralized and self-organizing
  • Fault-tolerant, scalable, efficient
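The Insert(k,v) and Lookup(k) operations can be illustrated with a toy, single-process stand-in. This is only a sketch under stated assumptions, not Pastry: the names `ToyDHT`, `to_id` and `responsible_node`, the SHA-1 based 128-bit ids, and the linear scan for the numerically closest node are all hypothetical, and routing, node joins and failures are left out.

```python
import hashlib

ID_SPACE = 1 << 128  # assumed 128-bit circular id space


def to_id(text):
    """Hash a string to a 128-bit id (SHA-1 truncated to 16 bytes; an assumption)."""
    return int.from_bytes(hashlib.sha1(text.encode("utf-8")).digest()[:16], "big")


def circular_distance(a, b):
    """Distance between two ids on the circular id space."""
    d = abs(a - b)
    return min(d, ID_SPACE - d)


class ToyDHT:
    """Each key is stored on the node whose id is numerically closest to the key's id."""

    def __init__(self, node_names):
        # One private dict per node, keyed by the node's id.
        self.nodes = {to_id(name): {} for name in node_names}

    def responsible_node(self, key):
        k = to_id(key)
        # Linear scan; a real substrate would route to this node in a few hops.
        return min(self.nodes, key=lambda node_id: circular_distance(node_id, k))

    def insert(self, key, value):
        self.nodes[self.responsible_node(key)][key] = value

    def lookup(self, key):
        return self.nodes[self.responsible_node(key)].get(key)


dht = ToyDHT([f"node-{i}" for i in range(8)])
dht.insert("k1", "v1")
print(dht.lookup("k1"))
```

Because the responsible node is a pure function of the key and the node set, any participant can insert or look up a pair without a central index.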
Why peer-to-peer?
  • Cost of dedicated web cache → No additional hardware
  • Administrative effort → Self-organizing network
  • Scaling implies upgrading → Resources grow with clients

Setting
  • Corporate LAN
  • 100 - 100,000 desktop machines
  • Located in a single building or campus
  • Each node runs an instance of Squirrel
  • Sets it as the browser’s proxy
Mapping Squirrel onto Pastry

Two approaches:

  • Home-store
  • Directory
Home-store model

[Figure: a client in the LAN hashes the requested URL to locate the object's home node. Labels: client, URL hash, home, LAN, Internet.]

…that’s how it works!
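A minimal sketch of the home-store request path, assuming the flow is: check the client's own cache, then ask the home node found by hashing the URL, and let the home node fetch from the origin server on a miss. The names `Node`, `home_for`, `request` and the `fetch_from_origin` callback are hypothetical, ids are derived from node names purely for illustration, and freshness checks (not-modified replies) are omitted.

```python
import hashlib

ID_SPACE = 1 << 128  # assumed 128-bit circular id space


def to_id(text):
    """Hash a string to a 128-bit id (SHA-1 truncated to 16 bytes; an assumption)."""
    return int.from_bytes(hashlib.sha1(text.encode("utf-8")).digest()[:16], "big")


class Node:
    def __init__(self, name):
        self.name = name
        self.node_id = to_id(name)  # hypothetical: id derived from the name
        self.local_cache = {}       # objects fetched for this node's own browser
        self.home_store = {}        # objects this node is the home for


def home_for(url, nodes):
    """Return the node whose id is numerically closest to the URL hash."""
    key = to_id(url)

    def distance(node):
        d = abs(node.node_id - key)
        return min(d, ID_SPACE - d)

    return min(nodes, key=distance)


def request(client, url, nodes, fetch_from_origin):
    """Return (body, source), where source is 'local', 'home' or 'origin'."""
    if url in client.local_cache:
        return client.local_cache[url], "local"
    home = home_for(url, nodes)
    if url in home.home_store:
        body, source = home.home_store[url], "home"
    else:
        # Miss at the home node: fetch over the WAN and keep a copy at the home.
        body, source = fetch_from_origin(url), "origin"
        home.home_store[url] = body
    client.local_cache[url] = body
    return body, source
```

With this layout a second client asking for the same URL is served from the home node inside the LAN, so only the first request crosses to the Internet.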

Directory model

Client nodes always cache objects locally.

Home-store: home node also stores objects.

Directory: the home node only stores pointers to recent clients, and forwards requests.

Directory model

[Figure: the client's request reaches the home node, which randomly chooses an entry from its directory table. Labels: client, home.]
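The home node's role in the directory model can be sketched as follows. This is an illustration under assumptions, not the full protocol: the class name `DirectoryHome`, the `max_entries` bound, and the choice to skip the requester's own entry are hypothetical, and freshness handling is left out.

```python
import random


class DirectoryHome:
    """Home-node state for one URL: pointers to clients that recently fetched it."""

    def __init__(self, max_entries=4, rng=None):
        self.max_entries = max_entries  # hypothetical bound on directory size (>= 1)
        self.delegates = []             # most recent requester is last
        self.rng = rng or random.Random()

    def handle(self, client):
        """Return a delegate to forward the request to, or None to use the origin."""
        # Assumption: never forward a request back to the requester itself.
        candidates = [d for d in self.delegates if d != client]
        delegate = self.rng.choice(candidates) if candidates else None
        # Record the requester, which will hold a copy once the request completes.
        if client in self.delegates:
            self.delegates.remove(client)
        self.delegates.append(client)
        # Keep only the most recent max_entries pointers.
        del self.delegates[:-self.max_entries]
        return delegate
```

The home node stores only pointers here, never the object itself, which is the difference from the home-store model.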

Directory: Advantages

Avoids storing unnecessary copies of objects.

Rapidly changing directory for popular objects seems to improve load balancing.

Home-store scheme can incur hotspots.

Directory: Disadvantages

Cache insertion only happens at clients, so:

  • active clients store all the popular objects,
  • inactive clients waste most of their storage.

Implications:

  • Reduced cache size.
  • Load imbalance.
Directory: Load spike example
  • Web page with many embedded images, or
  • Periods of heavy browsing.

Many home nodes point to such clients!

Evaluate …

Total external traffic

[Plot, Redmond: total external traffic in GB (lower is better; axis 85 to 105) against per-node cache size in MB (0.001 to 100, log scale). Series: No web cache, Directory, Home-store, Centralized cache.]

[Plot, Cambridge: total external traffic in GB (lower is better; axis 5.5 to 6.1) against per-node cache size in MB (0.001 to 100, log scale). Series: No web cache, Directory, Home-store, Centralized cache.]

LAN Hops

[Plot, Redmond: % of cacheable requests (0% to 100%) against total hops within the LAN (0 to 6). Series: Centralized, Home-store, Directory.]

[Plot, Cambridge: % of cacheable requests (0% to 100%) against total hops within the LAN (0 to 5). Series: Centralized, Home-store, Directory.]

Load in requests per sec

[Plot, Redmond: number of times observed (1 to 100000, log scale) against max objects served per-node / second (0 to 50). Series: Home-store, Directory.]

[Plot, Cambridge: number of times observed (1 to 1e+07, log scale) against max objects served per-node / second (0 to 50). Series: Home-store, Directory.]

Load in requests per min

[Plot, Redmond: number of times observed (1 to 100, log scale) against max objects served per-node / minute (0 to 350). Series: Home-store, Directory.]

[Plot, Cambridge: number of times observed (1 to 10000, log scale) against max objects served per-node / minute (0 to 120). Series: Home-store, Directory.]

Fault tolerance

Sudden node failures result in partial loss of cached content.

Home-store: Proportional to failed nodes.

Directory: More vulnerable.

Fault tolerance

If 1% of Squirrel nodes abruptly crash, the fraction of lost cached content is:

Conclusions
  • It is possible to decentralize web caching.
  • Performance is comparable to a centralized web cache.
  • Squirrel is better in terms of cost, scalability, and administration effort.
  • Under our assumptions, the home-store scheme is superior to the directory scheme.
Other aspects of Squirrel
  • Adaptive replication
    • Hotspot avoidance
    • Improved robustness
  • Route caching
    • Fewer LAN hops
(backup) Full home-store protocol

[Figure: message flow between client and home (LAN) and the origin server (WAN), with other nodes also sending requests. Labels: req from client to home; a: object or notmod from home; b: req from home to origin server; b: object or notmod from origin.]

(backup) Full directory protocol

[Figure: message flow between client, home (holding the directory), delegate and origin server, with other nodes also sending requests. Labels: a, d: req; a: no dir, go to origin (also d); b: not-modified; c, e: req; c, e: object; e: cGET req; object or not-modified from the origin server.]
(backup) Peer-to-peer Computing

Decentralize a distributed protocol:

  • Scalable
  • Self-organizing
  • Fault tolerant
  • Load balanced

Not automatic!!

Decentralized Web Cache

[Figure: browsers with their browser caches on the LAN, and the web server on the Internet.]

Challenge

Decentralized web caching algorithm:

Need to achieve those benefits in practice!

Need to keep overhead unnoticeably low.

Node failures should not become significant.

Peer-to-peer routing, e.g., Pastry

Peer-to-peer object location and routing substrate = Distributed Hash Table.

Reliably maps an object key to a live node.

Routes in log16(N) steps

(e.g. 3-4 steps for 100,000 nodes)
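The log16(N) figure can be checked with a few lines of arithmetic. The helper name `pastry_hop_estimate` and its `digit_bits` parameter are hypothetical; base 16 corresponds to 4-bit digits, and the result is only an estimate of the hop count, not a measurement.

```python
import math


def pastry_hop_estimate(num_nodes, digit_bits=4):
    """Logarithm of N to base 2**digit_bits; digit_bits=4 gives log16(N)."""
    return math.log(num_nodes, 2 ** digit_bits)


for n in (100, 10_000, 100_000):
    print(f"N={n:>7}: log16(N) = {pastry_hop_estimate(n):.2f}")
```

For 100,000 nodes this gives roughly 4.15, close to the upper end of the "3-4 steps" quoted above.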

Home-store is better!

Simpler home-store scheme achieves load balancing by hash function randomization.

Directory scheme implicitly relies on access patterns for load distribution.
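A small experiment can illustrate how hashing spreads the objects of one busy page over many home nodes. This is a sketch under assumptions: `home_index` and `home_load` are hypothetical names, the URLs are made up, and taking the hash modulo the node count is a simplification of routing to the numerically closest node id.

```python
import hashlib
from collections import Counter


def home_index(url, num_nodes):
    """Map a URL to one of num_nodes homes via SHA-1 (modulo is a simplification)."""
    digest = hashlib.sha1(url.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % num_nodes


def home_load(urls, num_nodes):
    """Count how many of the given URLs each home node is responsible for."""
    return Counter(home_index(url, num_nodes) for url in urls)


# One hypothetical page with 200 embedded images, spread over 50 nodes.
urls = [f"http://example.com/page/img{i}.png" for i in range(200)]
load = home_load(urls, 50)
print("busiest home is responsible for", max(load.values()), "of", len(urls), "objects")
```

The assignment depends only on the URL, not on which client browsed the page, so a heavily browsing node does not attract extra requests the way a frequently listed delegate would.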

Directory scheme seems better…

Avoids storing unnecessary copies of objects.

Rapidly changing directory for popular objects results in load balancing.

Interesting difference

Consider:

  • Web page with many images, or
  • Heavily browsing node

Directory: many pointers to some node.

Home-store: natural load balancing.

Evaluate …

Fault tolerance

When a single Squirrel node crashes, the fraction of lost cached content is: