Effects and Implications of File Size/Service Time Correlation on Web Server Scheduling Policies

Dong Lu*+

Peter Dinda*

Yi Qiao*

Huanyuan Sheng*

*Northwestern University

+Ask Jeeves, Inc.


Outline

  • Quick review of size-based scheduling

  • Motivation and approach

  • Correlation between file size and service time: a measurement study

  • Performance of SRPT scheduling under real workload

  • Domain-based scheduling


Quick Review of Size-based Scheduling

  • SRPT

    • Shortest Remaining Processing Time

    • Assuming perfect knowledge of service times

  • FSP

    • Fair Sojourn Protocol

    • Assuming perfect knowledge of service times

  • Typical non-size-based scheduling

    • Processor Sharing (PS)

    • First Come First Serve (FCFS)


SRPT

  • Preemptive scheduling: always serve the job with the minimum remaining processing time first

    • Performance: minimum mean response time [Schrage, Operations Research, 1968]

    • Fairness: the performance gains of SRPT over PS usually do not come at the expense of large jobs; in other words, SRPT is fair under heavy-tailed job size distributions [Bansal and Harchol-Balter, Sigmetrics ’01]
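The SRPT rule can be illustrated with a minimal single-server sketch. This is not the authors' C++ simulator; the function names `srpt_mean_response` and `fcfs_mean_response` and the three-job workload are hypothetical, and service times are assumed to be known exactly (ideal SRPT).

```python
import heapq


def srpt_mean_response(jobs):
    """Mean response time under preemptive SRPT on one server.

    jobs: list of (arrival_time, service_time) pairs with service_time > 0.
    Service times are assumed to be known exactly (ideal SRPT).
    """
    jobs = sorted(jobs)
    n = len(jobs)
    heap = []      # entries are (remaining_time, arrival_time)
    t = 0.0        # current simulation time
    i = 0          # index of the next job that has not arrived yet
    total = 0.0    # sum of response times of completed jobs
    while i < n or heap:
        if not heap:
            # Server is idle: jump to the next arrival.
            t = max(t, jobs[i][0])
        # Admit every job that has arrived by time t.
        while i < n and jobs[i][0] <= t:
            heapq.heappush(heap, (jobs[i][1], jobs[i][0]))
            i += 1
        # Serve the job with the least remaining processing time.
        remaining, arrived = heapq.heappop(heap)
        next_arrival = jobs[i][0] if i < n else float("inf")
        if remaining <= next_arrival - t:
            # The job finishes before anything new arrives.
            t += remaining
            total += t - arrived
        else:
            # A new arrival interrupts service: put the job back with
            # its reduced remaining time and re-evaluate priorities.
            heapq.heappush(heap, (remaining - (next_arrival - t), arrived))
            t = next_arrival
    return total / n


def fcfs_mean_response(jobs):
    """Mean response time under non-preemptive FCFS, for comparison."""
    t = 0.0
    total = 0.0
    for arrived, service in sorted(jobs):
        t = max(t, arrived) + service
        total += t - arrived
    return total / len(jobs)


# Hypothetical workload: one long job followed by two short ones.
example_jobs = [(0.0, 10.0), (1.0, 1.0), (2.0, 1.0)]
srpt_result = srpt_mean_response(example_jobs)
fcfs_result = fcfs_mean_response(example_jobs)
```

In this toy workload the two short jobs preempt the long one under SRPT, so the mean response time drops well below the FCFS value, while the long job finishes at the same time in both cases.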


FSP

  • Combines SRPT with PS; preemptive scheduling [Friedman, et al, Sigmetrics ’03]

    • SRPT, plus: the longer a job stays in the queue, the higher its priority

    • Performance: Mean response time is close to that of SRPT

    • Fairness: Fairer than PS


Outline

  • Quick review of size-based scheduling

  • Motivation and approach

  • Correlation between file size and service time: a measurement study

  • Performance of SRPT scheduling under real workload

  • Domain-based scheduling


Motivation

  • Current implementations of SRPT and FSP

    • Use file size as service time (sorting jobs by file size)

  • Is file size a good estimator of service time?

  • How do SRPT and FSP perform when file size is used as service time, and how can they be improved?

Service time: the time needed to send the requested data in the absence of other requests in the system


Trace-driven Simulation

  • Simulator:

    • C++

    • Supports G/G/n/m queuing model

    • Driven by enhanced web server traces

    • Validation

      • Little’s law

      • Repeat the simulations in the FSP paper [Friedman, et al, Sigmetrics ‘03]

      • Compare with available theoretical results [Bansal and Harchol-Balter, Sigmetrics ‘01]
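A validation against Little's law (L = λW) might look like the sketch below. It is only an illustration of the idea, not the check used in the authors' simulator: the function names are hypothetical, and it assumes every job both arrives and departs within the observation window `[0, horizon]`.

```python
def time_average_in_system(intervals, horizon):
    """Time-average number of jobs in the system over [0, horizon].

    intervals: list of (arrival_time, departure_time) pairs, all of
    which are assumed to lie inside [0, horizon].
    """
    events = []
    for arrival, departure in intervals:
        events.append((arrival, +1))
        events.append((departure, -1))
    events.sort()
    area = 0.0        # integral of N(t) dt accumulated so far
    in_system = 0     # N(t) just before the current event
    last_time = 0.0
    for time, delta in events:
        area += in_system * (time - last_time)
        in_system += delta
        last_time = time
    return area / horizon


def littles_law_gap(intervals, horizon):
    """Absolute difference between L and lambda * W."""
    mean_in_system = time_average_in_system(intervals, horizon)    # L
    arrival_rate = len(intervals) / horizon                        # lambda
    mean_response = sum(d - a for a, d in intervals) / len(intervals)  # W
    return abs(mean_in_system - arrival_rate * mean_response)


# Hypothetical (arrival, departure) pairs produced by some simulation run.
sample_intervals = [(0.0, 12.0), (1.0, 2.0), (2.0, 3.0)]
gap = littles_law_gap(sample_intervals, horizon=12.0)
```

A gap that is not close to zero would suggest that the simulator's queue-length accounting and its response-time accounting disagree.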


Scheduling Policies Studied

  • SRPT: Ideal SRPT

  • SRPT-FS: File size as service time

  • SRPT-D: Domain-estimated service time

  • FSP: Ideal FSP

  • FSP-FS: File size as service time

  • FSP-D: Domain-estimated service time

  • PS: Processor sharing


Outline

  • Quick review of size-based scheduling

  • Our approach and questions answered

  • Correlation between file size and service time: a measurement study

  • Performance of SRPT-FS and FSP-FS scheduling under real workload

  • Domain-based scheduling


Correlation is Weak on a Typical Web Server

  • Measurement on departmental web server: Scatter plot of file size versus service time (log-log scale)

[Figure: scatter plot of file size versus service time for requests from the whole Internet; R ≈ 0.14]


Correlation is Weak on Web Cache Servers

  • Measurement on 10 Squid web cache servers:

    • www.ircache.net

[Figure: P[R>x] (0 to 1.0) versus the correlation coefficient R between file size and service time (0 to 0.7)]


Main reason for the weak correlation

  • End-to-end path diversity

[Figure: a Web Server serving Client 1, Client 2, Client 3, and Client 4]
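A toy sketch of why path diversity can weaken the correlation: if service time is roughly file size divided by the throughput of the client's path, then clients on very different paths break the link between size and time. The slides do not say which correlation coefficient was measured; Pearson's is assumed here, and all sizes and throughputs below are made-up illustrative values, not measurements.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical file sizes in bytes.
sizes = [1000.0, 2000.0, 4000.0, 8000.0]

# Case 1: every client sees the same throughput (bytes per second),
# so service time is proportional to file size.
same_path = [size / 1000.0 for size in sizes]

# Case 2: each request comes over a path with a different throughput.
throughputs = [100.0, 4000.0, 500.0, 16000.0]
diverse_paths = [size / rate for size, rate in zip(sizes, throughputs)]

r_same = pearson_r(sizes, same_path)
r_diverse = pearson_r(sizes, diverse_paths)
```

With identical paths the correlation is perfect; with the made-up mix of slow and fast paths the large files happen to travel over fast paths, and the correlation falls far below that.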


Outline

  • Quick review of size-based scheduling

  • Our approach and questions answered

  • Correlation between file size and service time: a measurement study

  • Performance of SRPT-FS and FSP-FS scheduling under real workload

  • Domain-based scheduling


Mean Response Time Much Worse Than Expected

Simulation driven by web server trace. G/G/1/m. Pareto arrivals (rate controlled to tune the load).

[Figure: mean response time (millisec, 100 to 900) versus load on the queue (0 to 2.0); curves: SRPT-FS, FSP-FS, PS, Ideal SRPT and FSP]


Mean Queue Length Much Worse Than Expected

Simulation driven by web server trace. G/G/1/m. Pareto arrivals (rate controlled to tune the load).

[Figure: mean queue length (1000 to 5000) versus load on the queue (0 to 2.0); curves: SRPT-FS, PS, FSP-FS, Ideal SRPT and FSP]


Outline

  • Quick review of size-based scheduling

  • Our approach and questions answered

  • Correlation between file size and service time: a measurement study

  • Performance of SRPT-FS and FSP-FS scheduling under real workload

  • Domain-based scheduling


Requirements For A Better Service Time Estimator

  • Low overhead

    • Passive measurement

    • Low computational complexity

    • Low / adjustable memory usage

  • Effective

    • Approximates the correct ordering of service times, i.e. estimates correlate highly with actual service times


Domain-based estimator

  • Divide the Internet into smaller “domains” by leveraging CIDR (Classless Inter-Domain Routing)

  • Hosts in the same domain are likely to share the same or similar routes to the web server, and thus see similar throughput


Supporting Facts

  • Statistical Internet stability and locality

    • Routing stability [Paxson, Sigcomm 1996]

    • TCP throughput locality and stability [Balakrishnan, et al, Sigmetrics 1997]; [Seshan, et al, USITS 1997]; [Myers, et al, Infocom 1999]

  • Classless Inter-domain Routing

    • implies that routes from machines in the domain to a server outside the domain will share many hops.


Algorithm

  • Use the high-order k bits of the client IP address to classify clients into 2^k domains

  • For each domain, calculate R = F/S

    • R: representative service rate

    • F: sum of file sizes delivered to domain

    • S: sum of corresponding service times

  • For each request, first extract its domain; the service time can then be estimated as B/R

    • B: requested file size

    • R: representative service rate obtained before
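The steps above could be sketched as follows. This is an illustrative reading of the slide, not the authors' implementation: the class name `DomainEstimator`, its method names, the restriction to IPv4, and the choice to return `None` for a domain with no history are all assumptions.

```python
import ipaddress
from collections import defaultdict


class DomainEstimator:
    """Per-domain service-time estimator (hypothetical sketch, IPv4 only)."""

    def __init__(self, k):
        if not 0 <= k <= 32:
            raise ValueError("k must be between 0 and 32")
        self.k = k
        self.bytes_sent = defaultdict(float)   # F: sum of file sizes per domain
        self.time_spent = defaultdict(float)   # S: sum of service times per domain

    def domain(self, client_ip):
        """Domain id = the high-order k bits of the IPv4 address."""
        address = int(ipaddress.IPv4Address(client_ip))
        return address >> (32 - self.k)

    def observe(self, client_ip, file_size, service_time):
        """Record one completed request (passive measurement)."""
        d = self.domain(client_ip)
        self.bytes_sent[d] += file_size
        self.time_spent[d] += service_time

    def estimate(self, client_ip, file_size):
        """Estimate service time as B / R, with R = F / S for the domain.

        Returns None when the domain has no usable history; what to do
        in that case is not covered by the slide, so it is left to the
        caller.
        """
        d = self.domain(client_ip)
        total_bytes = self.bytes_sent.get(d, 0.0)
        total_time = self.time_spent.get(d, 0.0)
        if total_bytes <= 0.0 or total_time <= 0.0:
            return None
        rate = total_bytes / total_time        # R: representative service rate
        return file_size / rate                # B / R


# Usage with made-up observations; k = 16 groups clients by "/16" prefix.
estimator = DomainEstimator(k=16)
estimator.observe("192.0.2.10", file_size=1000, service_time=2.0)
estimator.observe("192.0.2.77", file_size=3000, service_time=2.0)
known = estimator.estimate("192.0.99.1", file_size=500)     # same /16 domain
unknown = estimator.estimate("198.51.100.1", file_size=500)  # unseen domain
```

Only two running sums are kept per domain, so memory grows with the number of domains actually seen and is bounded by 2^k, which is how k trades accuracy against memory.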


Higher Correlation Can Be Achieved

[Figure: correlation coefficient R (0.1 to 0.7) versus bits used to define a domain (0 to 32)]


Much Lower Service Times Can Be Achieved

[Figure: mean response time (millisec, 100 to 900) versus bits used to define a domain (0 to 32); curves: FSP-D, FSP-FS, SRPT-FS, SRPT-D, PS, SRPT and FSP]


Much Lower Queue Lengths Can Be Achieved

[Figure: mean queue length (0 to 3000) versus bits used to define a domain (0 to 35); curves: FSP-D, FSP-FS, SRPT-FS, SRPT-D, PS, SRPT and FSP]


Conclusions

  • File size may not be a good estimator of service time for many regimes

  • File size-based SRPT and FSP can perform worse than PS in these regimes

  • Domain-based scheduling brings the benefits of size-based scheduling to these regimes


For more information

  • Prescience Lab at Northwestern University

    • www.presciencelab.org


Jeeves’ Invitation …

  • Have you ever seen the whole Web at once?

  • Did you ever wonder how to harness the power of thousands of machines?

  • We are hiring talent for Internet Search

    • Software Engineer

    • Development Manager

Send us your Resume:

[email protected]


Correlation is Weak on a Typical Web Server

  • Measurement on departmental web server: Scatter plot of file size versus service time (log-log scale)

[Figure: two scatter plots of file size versus service time: requests from the whole Internet, R ≈ 0.14; requests from a “/16” IP network, R ≈ 0.25]


Future Work

  • The “back-filling” queuing model

[Figure: web requests 1 to 7 drawn on bandwidth versus time axes; labels: Web Server, Bottleneck]

