Integrated Resource Management for Cluster-based Internet Services

Kai Shen

Dept. of Computer Science

Univ. of Rochester

Hong Tang, Tao Yang*, Lingkun Chu

Dept. of Computer Science

Univ. of California, Santa Barbara

*: Ask Jeeves, Inc.

Background
  • Large-scale resource-intensive Internet services hosted on server clusters.
    • Yahoo, MSN, Google, Teoma/Ask Jeeves …
  • Challenges/requirements for resource management:
    • Scalability and robustness;
    • Online users require interactive responses;
    • Resource (CPU, IO)–hungry service processing and large user traffic require efficient resource utilization;
    • Fluctuating user traffic requires adaptive management;
    • Supporting differentiated services to different types of user requests.

OSDI 2002

Architecture of Targeted Services: Document Search Engine

[Figure: architecture diagram showing a firewall/Web switch, Web servers/query handlers, query caches, a local-area network, index servers (partitions 1, 2, and 3), and doc servers.]

“Neptune” Project Overview
  • Programming and runtime support to aggregate and replicate stand-alone service components.
  • Building blocks for scalable and robust service construction:
    • Functionally-symmetric clustering architecture;
    • Integrated resource management – quality, efficiency, and differentiation;
    • Replication management.

Architecture of Targeted Services: Document Search Engine

[Figure: architecture diagram showing a firewall/Web switch, Web servers/query handlers, a query cache, a local-area network, index servers (partitions 1, 2, and 3), and doc servers, annotated with "Neptune runtime" and "SAP" labels.]

Neptune Deployments
  • Service deployments:
    • Web document searching;
    • BLAST – protein sequence similarity matching;
    • Prototype database services – online discussion group, auction.
  • Production system at the search engines Teoma/Ask Jeeves since 2000:
    • search indexes of more than 450M Web documents;
    • over 800 multiprocessor servers;
    • tens of millions of search queries per day.

Outline
  • Project Overview
  • Integrated Resource Management
    • Multiple Resource Management Objectives
    • Two-level Mechanism
    • Trace-driven Performance Evaluation on a Linux Cluster
    • Related Work and the Conclusion

Quality-aware Resource Utilization Efficiency
  • Throughput: measure resource utilization efficiency.
  • Service response time: measure client-perceived service quality.
  • Aggregate service yield: measure quality-aware resource utilization efficiency.
    • Fulfillment of each service request generates quality-aware service yield – a function of service response time.
    • Service yield function – specified by service providers (flexibility).
    • System goal – maximizing aggregate service yield, i.e. the total yield summed over all service requests.

Sample Service Yield Functions

[Figure: three plots of service yield versus response time.
<A> Maximizing throughput (with a deadline): constant yield up to the deadline.
<B> Minimizing mean response time (with a deadline): labeled "Full yield" and "Deadline".
<C> A hybrid metric: labeled "Full yield", "Drop penalty", "Full-yield deadline", and "Deadline".]
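
The three sample shapes can be sketched as plain functions of response time. This is only an illustration under stated assumptions: the slides name the shapes but not their formulas, so the linear decay in <B> and <C>, the reading of "drop penalty" as the yield still available right at the deadline, and every function and parameter name below are hypothetical.

```python
from typing import Callable, Iterable

YieldFn = Callable[[float], float]


def throughput_yield(deadline: float, constant: float = 1.0) -> YieldFn:
    """Shape <A>: constant yield if the request finishes by the deadline."""
    def fn(response_time: float) -> float:
        return constant if response_time <= deadline else 0.0
    return fn


def response_time_yield(deadline: float, full: float = 1.0) -> YieldFn:
    """Shape <B>: assumed linear decay from full yield at 0 to zero at the deadline."""
    def fn(response_time: float) -> float:
        if response_time > deadline:
            return 0.0
        return full * (1.0 - response_time / deadline)
    return fn


def hybrid_yield(full_deadline: float, deadline: float,
                 full: float = 1.0, yield_at_deadline: float = 0.5) -> YieldFn:
    """Shape <C>: full yield until full_deadline, then an assumed linear decay
    down to yield_at_deadline at the deadline, and zero afterwards."""
    def fn(response_time: float) -> float:
        if response_time <= full_deadline:
            return full
        if response_time > deadline:
            return 0.0
        fraction = (response_time - full_deadline) / (deadline - full_deadline)
        return full - (full - yield_at_deadline) * fraction
    return fn


def aggregate_yield(response_times: Iterable[float], fn: YieldFn) -> float:
    """Aggregate service yield: the sum of the yields of all requests."""
    return sum(fn(rt) for rt in response_times)
```

A service provider would choose one such function per service class; the system goal is then to make `aggregate_yield` as large as possible.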

Service Differentiation
  • Service class – a category of service accesses that enjoy the same level of QoS support.
    • Client identities: paid vs unpaid, consumers vs corporate partners.
    • Service types or data partitions: order placement vs catalog browsing.
  • Service differentiation in Neptune
    • Differentiated service yield function.
    • Proportional resource allocation guarantee.

Cluster-level: Partitioning or Not?
  • Periodic Server Partitioning [Zhu2001]:
    • Determine resource allocation at each epoch.
    • Partition the server pool among service classes.
  • Neptune – does not partition servers at cluster-level:
    • Random polling-based load balancing to evenly distribute requests for each service class to all nodes → service differentiation inside each node.
    • Advantages:
      • Functional-symmetry and decentralization → robustness and scalability.
      • Better handling of system state changes: demand spikes and node failures.
    • Disadvantage:
      • Less isolation for misbehaved service classes.
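
Random polling-based load balancing can be illustrated with a small sketch: poll a few randomly chosen nodes and send the request to the least loaded one. The poll size, the load measure (a pending-request count), and the names `pick_node` and `dispatch` are assumptions made for this example; the slides do not specify them.

```python
import random
from typing import Dict, List, Optional


def pick_node(load: Dict[str, int], poll_size: int = 3,
              rng: Optional[random.Random] = None) -> str:
    """Poll up to poll_size random nodes and return the least loaded one."""
    rng = rng or random.Random()
    polled = rng.sample(sorted(load), min(poll_size, len(load)))
    return min(polled, key=lambda node: load[node])


def dispatch(num_requests: int, nodes: List[str],
             poll_size: int = 3, seed: int = 0) -> Dict[str, int]:
    """Assign requests one by one; load only grows here (a simplification,
    since real requests would also complete and release load)."""
    rng = random.Random(seed)
    load = {node: 0 for node in nodes}
    for _ in range(num_requests):
        load[pick_node(load, poll_size, rng)] += 1
    return load
```

Every node can run the same logic without a central dispatcher, which fits the functional symmetry and decentralization listed above.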

Node-level Request Scheduling
  • Drop requests likely generating zero yield.
  • Search for an under-allocated service class.
    • Found: schedule the under-allocated service class.
    • Not found: schedule for high aggregate yield.
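
The decision flow on this slide (drop requests likely to generate zero yield, serve an under-allocated service class if one is found, otherwise schedule for high aggregate yield) can be sketched as one function. The `Request` fields, the share bookkeeping, and the drop test are assumptions for illustration; the yield-oriented policy is passed in as `pick_for_yield` because the slides do not define it here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Request:
    service_class: str
    arrival: float   # arrival time in seconds
    deadline: float  # relative deadline; yield is assumed zero past it
    cost: float      # estimated processing time in seconds


def schedule_next(
    queue: List[Request],
    now: float,
    guaranteed_share: Dict[str, float],
    used_share: Dict[str, float],
    pick_for_yield: Callable[[List[Request]], Request],
) -> Tuple[Optional[Request], List[Request]]:
    """Return (request to run next, requests dropped in this round)."""
    # Step 1: drop requests that cannot finish before their deadline.
    live: List[Request] = []
    dropped: List[Request] = []
    for req in queue:
        finishes_in_time = now + req.cost <= req.arrival + req.deadline
        (live if finishes_in_time else dropped).append(req)
    if not live:
        return None, dropped

    # Step 2: look for a class running below its guaranteed share.
    for cls, share in guaranteed_share.items():
        if used_share.get(cls, 0.0) < share:
            candidates = [r for r in live if r.service_class == cls]
            if candidates:
                # Oldest request of the under-allocated class goes first.
                return min(candidates, key=lambda r: r.arrival), dropped

    # Step 3: no under-allocated class found, so schedule for aggregate yield.
    return pick_for_yield(live), dropped
```

Keeping the guarantee check ahead of the yield-driven choice is what lets a low-priority class retain its proportional allocation under load.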

Scheduling for High Aggregate Yield
  • Offline optimal scheduling is NP-complete.

Evaluation Settings
  • Evaluation platform
    • A cluster of Linux servers connected by switched Ethernet.
  • Workload I: trace-driven
    • Document search on a 2.5GB memory-mapped search index.
    • Based on 1.5M search queries selected from a one-week access trace at Ask Jeeves search in January 2002.
    • “Service yield”-based priority order: Gold > Silver > Bronze.
  • Workload II:
    • CPU-spinning micro-benchmark.
    • Poisson process arrival; exponentially-distributed service processing time.
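
A workload in the style of Workload II can be generated with the standard library. In a Poisson arrival process the gaps between arrivals are exponentially distributed, so both arrivals and service times come from `random.expovariate`. The function names, the seed, and the offered-load helper (one possible reading of "arrival demand" as a fraction of capacity) are assumptions for this sketch, not the authors' benchmark code.

```python
import random
from typing import List, Tuple


def generate_workload(arrival_rate: float, mean_service_time: float,
                      num_requests: int, seed: int = 0) -> List[Tuple[float, float]]:
    """Return (arrival_time, service_time) pairs.

    arrival_rate is in requests per second; inter-arrival gaps are
    exponential with that rate, which yields a Poisson arrival process.
    """
    rng = random.Random(seed)
    now = 0.0
    workload = []
    for _ in range(num_requests):
        now += rng.expovariate(arrival_rate)
        service_time = rng.expovariate(1.0 / mean_service_time)
        workload.append((now, service_time))
    return workload


def offered_load(arrival_rate: float, mean_service_time: float,
                 num_servers: int) -> float:
    """Demand as a fraction of total capacity (1.0 means 100%)."""
    return arrival_rate * mean_service_time / num_servers
```

Under this reading, values of `offered_load` above 1.0 correspond to overload, where some requests must miss their deadlines or be dropped.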

Evaluation on Scheduling Policies (16 nodes aggregate)

[Figure: loss percent versus arrival demand for the EDF, YID, Greedy, and Adaptive scheduling policies, measured on normalized aggregated yield. Panel (A) Underload covers arrival demand from 0% to 100%; panel (B) Overload covers arrival demand from 100% to 200%.]

  • EDF and YID perform better than Greedy when the system is underloaded; Greedy performs better when the system is overloaded.
  • Adaptive dynamically switches between YID and Greedy to achieve good performance in both situations.
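
The switching idea behind Adaptive can be sketched as follows. The slides do not define YID or Greedy, so this example swaps in stand-ins: earliest deadline first (EDF) for the underload policy, and a greedy pick by expected yield per unit of processing time for the overload policy. The overload test and all names are likewise assumptions, not the authors' definitions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Pending:
    deadline: float        # absolute deadline in seconds
    cost: float            # estimated processing time, must be > 0
    expected_yield: float  # yield if served in time


def edf_pick(queue: List[Pending]) -> Pending:
    """Deadline-driven stand-in: earliest deadline first."""
    return min(queue, key=lambda r: r.deadline)


def greedy_pick(queue: List[Pending]) -> Pending:
    """Assumed greedy rule: highest expected yield per unit of work."""
    return max(queue, key=lambda r: r.expected_yield / r.cost)


def is_overloaded(queue: List[Pending], now: float) -> bool:
    """Assumed heuristic: pending work cannot finish by the latest deadline."""
    return now + sum(r.cost for r in queue) > max(r.deadline for r in queue)


def adaptive_pick(queue: List[Pending], now: float) -> Pending:
    """Use the greedy rule under overload and the deadline rule otherwise."""
    return greedy_pick(queue) if is_overloaded(queue, now) else edf_pick(queue)
```

The design point is only the switch itself: a deadline-driven order when all requests can still be served, and a yield-per-cost order when some must be sacrificed.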


Service Differentiation during a Demand Spike and a Node Failure (8 nodes)

[Figure: CPU demand and acquisition for the Gold, Silver, and Bronze classes, as a percentage of total system resource (0% to 100%), over a timeline of 0 to 300 seconds.]

  • “Service yield”-based priority order: Gold > Silver > Bronze.
  • 20% proportional resource guarantee for low-priority Bronze class.
  • Demand spike for the Silver class between time 50 and 150.
  • One node fails at time 200 and recovers at 250.

Performance Scalability

[Figure: aggregate yield (normalized, 0 to 20) versus number of service nodes (0 to 20) at demand levels of 75%, 125%, and 200%. Panel <A>: Differentiated Search; panel <B>: Micro-benchmark.]

Related Work
  • Software infrastructure for cluster-based Internet services – TACC [Fox1997], MultiSpace [Gribble1999], Porcupine [Saito1999], Ninja [von Behren2002].
  • QoS and service differentiation in computer networks – Weighted Fair Queuing [Demers1990; Parekh1993], Leaky Bucket, LIRA [Stoica1998], [Dovrolis1999].
  • QoS or real-time scheduling at the single host level – [Huang1989], [Haritsa1993], [Waldspurger1994], [Mogul1996], LRP [Druschel1996], [Jones1997], Eclipse [Bruno1998], Resource Container [Banga1999], [Steere1999].
  • Resource management and QoS for Web servers – [Almeida1998], [Pandey1998], [Abdelzaher1999], [Bhatti1999], [Chandra2000], [Li2000], [Voigt2001].
  • Resource management for clustered servers – LARD [Pai1998], Cluster Reserves [Aron2000], [Sullivan2000], DDSD [Zhu2001], [Chase2001].

Conclusion
  • Multiple resource management objectives:
    • quality-aware resource utilization efficiency
    • service differentiation
  • Two-level resource management mechanism:
    • non-partitioning at the cluster level
    • adaptive scheduling at the node level
  • Trace-driven evaluations.
  • Future work – other types of service qualities.
