On the Placement of Web Server Replicas

Lili Qiu, Microsoft Research

Venkata N. Padmanabhan, Microsoft Research

Geoffrey M. Voelker, UCSD

IEEE INFOCOM’2001, Anchorage, AK, April 2001


Outline
  • Overview
  • Related work
  • Our approach
  • Simulation methodology & results
  • Summary
Motivation
  • Growing interest in Web server replicas
    • Exponential growth in Web usage
    • Content providers want to offer better service at lower cost
    • Solution: replication
  • Forms of Web server replicas
    • Mirror sites
    • Content Distribution Networks (CDNs)
      • CDN: a network of servers
      • Examples: Akamai, Digital Island

[Figure: content providers and clients connected through the Internet, with replicas placed inside the network]

Placement of Web Server Replicas
  • Problem specification
    • Among a set of N potential sites, pick K sites as replicas to minimize users’ latency or bandwidth usage

[Figure: content providers, the Internet, and clients]

Related Work
  • Placement of Web proxies [LGI+99]
  • Cache location [KRS00]
  • Placement of Internet instrumentation [JJJ+00]
Our Approach
  • Model Internet as a graph
  • Parameterize the graph using measured inputs
    • # requests generated from each region
    • Distance between different regions
  • Map the placement problem onto a graph optimization problem
    • Assumption:
      • Each client uses a single replica that is closest to it
  • Solve graph optimization problem
    • Using various approximation algorithms
Minimum K-median Problem
  • Given a complete graph G=(V,E), d(j), c(i,j)
    • d(j): # requests
    • c(i,j): distance between node i and j
      • Latency
      • or hop counts
      • or other metric to be optimized
  • Find a subset V’ ⊆ V with |V’| = K that minimizes

$$\sum_{v \in V} \min_{w \in V'} d(v)\, c(v, w)$$

  • NP-hard problem
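
The objective above can be evaluated directly for any candidate placement. The following is a minimal sketch, assuming `demand[v]` holds d(v) and `dist[v][w]` holds c(v,w); the function names and the 4-node example are hypothetical and not taken from the slides. The exhaustive search is included only to make the definition concrete, since it is exponential in K.

```python
from itertools import combinations

def placement_cost(replicas, demand, dist):
    """Sum over all nodes v of d(v) times the distance to v's closest replica."""
    return sum(demand[v] * min(dist[v][w] for w in replicas)
               for v in range(len(demand)))

def brute_force_k_median(k, demand, dist):
    """Try every K-subset of nodes; usable for tiny inputs only."""
    nodes = range(len(demand))
    return min(combinations(nodes, k),
               key=lambda subset: placement_cost(subset, demand, dist))

# Hypothetical example: 4 nodes on a line, distance |i - j|
demand = [10, 1, 1, 10]
dist = [[abs(i - j) for j in range(4)] for i in range(4)]
best = brute_force_k_median(2, demand, dist)
best_cost = placement_cost(best, demand, dist)
```

Here the two heavily loaded end nodes are chosen, and only the two light nodes in the middle pay a distance of 1 each.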


Placement Algorithms
  • Tree-based algorithm [LGI+99]
    • Assume the underlying topologies are trees, and model the problem as a dynamic programming problem
    • O(N^3 M^2) for choosing M replicas among N potential places
  • Random
    • Pick the best among several random assignments
  • Hot spot
    • Place replicas near the clients that generate the largest load
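
The two baseline heuristics can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the number of random trials and the seed are arbitrary, and the hot spot variant is simplified to placing replicas exactly at the K nodes with the largest demand, whereas the slide only says "near" the heaviest clients. All names are hypothetical.

```python
import random

def placement_cost(replicas, demand, dist):
    """Total demand-weighted distance from each node to its closest replica."""
    return sum(demand[v] * min(dist[v][w] for w in replicas)
               for v in range(len(demand)))

def random_placement(k, demand, dist, trials=20, seed=0):
    """Draw several random K-subsets and keep the cheapest one."""
    rng = random.Random(seed)
    nodes = range(len(demand))
    candidates = [rng.sample(nodes, k) for _ in range(trials)]
    return min(candidates, key=lambda s: placement_cost(s, demand, dist))

def hot_spot_placement(k, demand):
    """Simplified hot spot: pick the k nodes that generate the most requests."""
    by_load = sorted(range(len(demand)), key=lambda v: demand[v], reverse=True)
    return by_load[:k]

# Hypothetical example: 4 nodes on a line, distance |i - j|
demand = [10, 1, 1, 10]
dist = [[abs(i - j) for j in range(4)] for i in range(4)]
hot = hot_spot_placement(2, demand)
rand = random_placement(2, demand, dist)
```

Note that hot spot ignores distances entirely, so it can place several replicas next to each other when the heaviest clients are clustered.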
Placement Algorithms (Cont.)
  • Greedy algorithm
    • Calculate costs of assigning clients to replicas
    • Select replica with lowest cost
    • Adjust costs based upon assignment, repeat until done
  • Super-Optimal algorithm
    • Lagrangian relaxation + subgradient method
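
The greedy steps above can be sketched as follows: in each round, every remaining site is evaluated together with the replicas already chosen, and the site giving the lowest total cost is added. This is a minimal sketch assuming K does not exceed the number of nodes; the function name and the example data are hypothetical.

```python
def greedy_placement(k, demand, dist):
    """Pick k replica sites one at a time, each time minimizing total cost."""
    n = len(demand)
    chosen = []
    # nearest[v]: distance from v to its closest replica chosen so far
    nearest = [float("inf")] * n
    for _ in range(k):  # assumes k <= n
        best_site, best_cost = None, float("inf")
        for s in range(n):
            if s in chosen:
                continue
            # Cost if s is added to the replicas already chosen
            cost = sum(demand[v] * min(nearest[v], dist[v][s]) for v in range(n))
            if cost < best_cost:
                best_site, best_cost = s, cost
        chosen.append(best_site)
        # Adjust each node's assignment cost to reflect the new replica
        nearest = [min(nearest[v], dist[v][best_site]) for v in range(n)]
    return chosen

# Hypothetical example: 4 nodes on a line, distance |i - j|
demand = [5, 1, 1, 10]
dist = [[abs(i - j) for j in range(4)] for i in range(4)]
sites = greedy_placement(2, demand, dist)
```

On this input the first round picks node 3 (the heaviest client) and the second picks node 0. Greedy never revisits an earlier choice, so it is a heuristic and is not guaranteed to find the optimum in general.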
Simulation Methodology
  • Network topology
    • Randomly generated topologies
      • Using GT-ITM Internet topology generator
    • Real Internet network topology
      • AS level topology obtained using BGP routing data from a set of seven geographically dispersed BGP peers
  • Web Workload
    • Real server traces
      • MSNBC, ClarkNet, NASA Kennedy Space Center
  • Performance Metric
    • Relative performance: cost_practical / cost_super-optimal
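
To illustrate how a bound for the denominator of this ratio might be obtained, here is a sketch of a textbook Lagrangian relaxation of the K-median problem with a subgradient update. The slides only name the technique, so the formulation (relaxing the "each client is served exactly once" constraint), the diminishing step size, and all names are assumptions. For any multipliers the computed value is a lower bound on the optimal cost, so the ratio is at least 1.

```python
def placement_cost(replicas, demand, dist):
    """Total demand-weighted distance from each node to its closest replica."""
    return sum(demand[v] * min(dist[v][w] for w in replicas)
               for v in range(len(demand)))

def lagrangian_lower_bound(k, demand, dist, iters=200, step0=1.0):
    """Lower bound on the K-median optimum via Lagrangian relaxation."""
    n = len(demand)
    lam = [0.0] * n   # one multiplier per client
    best = 0.0        # bound 0 is always valid because costs are non-negative
    for t in range(iters):
        # rho[i]: reduced cost of opening site i under the current multipliers
        rho = [sum(min(0.0, demand[j] * dist[j][i] - lam[j]) for j in range(n))
               for i in range(n)]
        opened = sorted(range(n), key=lambda i: rho[i])[:k]
        bound = sum(lam) + sum(rho[i] for i in opened)
        best = max(best, bound)
        # Subgradient step: raise lam[j] if client j is unserved, lower if over-served
        for j in range(n):
            served = sum(1 for i in opened
                         if demand[j] * dist[j][i] - lam[j] < 0)
            lam[j] += step0 / (t + 1) * (1 - served)
    return best

def relative_performance(practical_cost, bound):
    """Ratio of a heuristic's cost to the lower bound (1.0 means provably optimal)."""
    return practical_cost / bound

# Hypothetical example: 4 nodes on a line, distance |i - j|
demand = [5, 1, 1, 10]
dist = [[abs(i - j) for j in range(4)] for i in range(4)]
lb = lagrangian_lower_bound(2, demand, dist)
ratio = relative_performance(placement_cost([0, 3], demand, dist), lb)
```

Because the bound may lie below the true optimum, a ratio above 1 does not by itself mean the heuristic is suboptimal.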
Simulation Methodology (Cont.)
  • Simulate a network of N nodes (100 ≤ N ≤ 3000)
    • Cluster clients using network aware clustering [KW00]
      • IP addresses with the same address prefix belong to a cluster
      • A small number of popular clusters account for most requests
        • Top 10, 100, 1000, 3000 clusters account for about 24%, 45%, 78%, and 94% of the requests respectively
    • Pick the top N clusters
    • Map them to different nodes
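
The clustering and top-N selection steps can be sketched as follows. This is only an approximation of the idea: a fixed /24 prefix length stands in for the prefixes used by network-aware clustering [KW00], and the client addresses are made up for illustration.

```python
from collections import Counter
import ipaddress

def cluster_requests(client_ips, prefix_len=24):
    """Count requests per address prefix (fixed prefix length is an assumption)."""
    counts = Counter()
    for ip in client_ips:
        network = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
        counts[network] += 1
    return counts

def top_clusters(counts, n):
    """Return the n clusters with the most requests, largest first."""
    return counts.most_common(n)

# Hypothetical request log: one client IP per request
log = ["10.0.0.1", "10.0.0.2", "10.0.1.7", "192.168.5.9", "10.0.0.200"]
counts = cluster_requests(log)
top = top_clusters(counts, 1)
# Fraction of all requests covered by the selected clusters
coverage = sum(c for _, c in top) / sum(counts.values())
```

Each selected cluster would then be mapped to one node of the simulated topology, with its request count used as that node's demand.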
Simulation Methodology (Cont.)
  • Random trees
  • Random graphs
  • AS-level topologies
  • Sensitivity to the error in the input
Random Tree Topologies

Tree-based algorithm performs well as expected.

The greedy algorithm performs equally well.

Random Graph Topologies

The greedy and hot-spot algorithms outperform the tree-based algorithm.

Large Random Graph Topologies

The greedy algorithm performs the best, and the hot-spot algorithm performs nearly as well.

AS-level Internet Topologies

The greedy algorithm performs the best, and the hot-spot algorithm performs nearly as well.

Effects of Imperfect Knowledge about Input Data
  • Predicted workload (using moving window average)
  • Perfect topology information

Within 5% degradation when using predicted workload
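
A moving window average predictor for the workload could look like the sketch below. The window length and the shape of the input (one demand vector per past period) are assumptions; the slides do not state them.

```python
def moving_window_prediction(history, window=3):
    """Predict next-period demand per node as the mean of the last `window` periods."""
    recent = history[-window:]
    return [sum(per_node) / len(recent) for per_node in zip(*recent)]

# Hypothetical history: one row per period, one column per node
history = [[10, 2], [12, 4], [14, 0], [16, 2]]
predicted = moving_window_prediction(history, window=3)
```

The predicted vector would then replace the true d(v) values when running a placement algorithm, and the resulting placement would be scored against the actual workload.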

Effects of Imperfect Knowledge about Input Data (Cont.)
  • Predicted workload (using moving window average)
  • Noisy topology information
    • Perturb the distance between two nodes i and j by up to a factor of 2

Within 15% degradation when using predicted workload and noisy topology information
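
One way to inject this kind of topology noise is sketched below. The slides only say that distances are perturbed by up to a factor of 2; drawing the factor uniformly, choosing at random whether to multiply or divide, and keeping the matrix symmetric are assumptions made for this illustration.

```python
import random

def perturb_distances(dist, max_factor=2.0, seed=0):
    """Scale each pairwise distance up or down by a random factor <= max_factor."""
    rng = random.Random(seed)
    n = len(dist)
    noisy = [row[:] for row in dist]
    for i in range(n):
        for j in range(i + 1, n):
            factor = rng.uniform(1.0, max_factor)
            if rng.random() < 0.5:
                factor = 1.0 / factor  # shrink instead of stretch
            noisy[i][j] = noisy[j][i] = dist[i][j] * factor
    return noisy

# Hypothetical example: 4 nodes on a line, distance |i - j|
dist = [[abs(i - j) for j in range(4)] for i in range(4)]
noisy = perturb_distances(dist)
```

A placement algorithm would be run on `noisy`, and the placement it returns would be evaluated on the true `dist` to measure the degradation.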

Summary
  • One of the first experimental studies on placement of Web server replicas
  • Knowledge about client workload and topology is needed for provisioning replicas
  • The greedy algorithm performs very well
    • Within a factor of 1.1 – 1.5 of the super-optimal
    • Insensitive to noise
      • Stays within a factor of 2 of the super-optimal when the injected error is a factor of 4
  • The hot spot algorithm performs nearly as well
    • Within a factor of 1.6 – 2 of the super-optimal
  • Obtaining input data
    • Moving window average for load prediction
    • Using BGP router data to obtain topology information
Conclusion
  • Recommend using the greedy algorithm for deciding the placement of Web server replicas
Acknowledgement
  • Craig Labovitz
  • Yin Zhang
  • Ravi Kumar