Measuring Quality of Service on Worker Node in Cluster





Rohitashva Sharma, R S Mundada, Sonika Sachdeva, P S Dhekne, Computer Division, BARC, Mumbai, India

Helge Mainhard, Tony Cass, Olof Barring, CERN Geneva, Switzerland

CHEP 06

INTRODUCTION
  • Quality of Service
    • Defines goodness of a node for a type of task
    • Needed for better/optimum utilization of resources
  • Computer Division, BARC and IT Division CERN collaborated to explore ways to predict QoS


QoS – Definition
  • QoS indicates how well suited a node is to a given task
  • QoS relates execution times as QoS = Tnoload / Texecution, where
    • Texecution = wall-clock execution time of the task
    • Tnoload = wall-clock execution time of the task on the given node without load
  • QoS varies between 0 and 1
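
As a minimal sketch, the definition can be written in a few lines of Python. It assumes the ratio form QoS = Tnoload / Texecution, which is what the two time definitions and the 0-to-1 range imply; the function name `qos` and the sample timings are illustrative only.

```python
def qos(t_noload: float, t_execution: float) -> float:
    """Return QoS as the ratio of no-load to loaded wall-clock time.

    Assumes QoS = Tnoload / Texecution, so an unloaded node scores 1.0
    and the score approaches 0 as load stretches the execution time.
    """
    if t_noload <= 0 or t_execution <= 0:
        raise ValueError("execution times must be positive")
    # Clamp to 1.0 in case measurement noise makes the loaded run look faster.
    return min(1.0, t_noload / t_execution)


print(qos(10.0, 10.0))  # 1.0
print(qos(10.0, 40.0))  # 0.25
```

A task that takes four times longer under load than on an idle node therefore has a QoS of 0.25 on that node.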

Methodology
  • Three task categories
    • CPU intensive
    • Disk IO intensive
    • Network IO intensive
  • Representative probe programs for each category
  • Load generating program for each category

Methodology
  • Monitor system metrics
    • Load avg, CPU utilization, Memory utilization, disk utilization, swap utilization etc.
  • Execute probe programs in different load conditions (generated using load generating programs)
  • Correlate probe execution time, system metrics and no load execution time of probe
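
The measurement loop described above might be sketched as follows. This is an illustration, not the authors' tooling: it samples the 1-minute load average with Python's `os.getloadavg()` (Unix only) instead of the monitoring system used in the study, and the probe command is a stand-in for Linpack or Bonnie. The helper name `run_probe` is hypothetical.

```python
import os
import subprocess
import sys
import time


def run_probe(command: list[str]) -> dict:
    """Run a probe command once and record its wall-clock time and load."""
    load_1min, _, _ = os.getloadavg()  # load average sampled just before the run
    start = time.perf_counter()
    subprocess.run(command, check=True, stdout=subprocess.DEVNULL)
    elapsed = time.perf_counter() - start
    return {"load_avg_1min": load_1min, "t_execution": elapsed}


# Stand-in probe: a tiny CPU-bound Python loop, not one of the real probes.
sample = run_probe([sys.executable, "-c", "sum(i * i for i in range(10**5))"])
print(sample)
```

Repeating this under different generated loads yields the (metric, execution time) pairs that the correlation step needs.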

Probe Selection
  • A probe should
    • Represent real-world applications
    • Have a short execution time
    • Be non-interactive
  • Selected probes are
    • Linpack for CPU intensive
    • Bonnie for Disk IO intensive
    • Network IO intensive (not considered)

Load Generating programs
  • Generate load in given category
  • Should have large execution time
  • Feature for varying the load
  • Two types of disk IO load
    • Block IO (IO in large data blocks)
    • Character IO (IO in small data blocks)

SETUP
  • 32 node cluster
  • Each node consists of
  • EDG Fabric Monitoring System for gathering system metrics


CPU Probe
  • CPU probe in different loading conditions
  • Correlation using load average
  • Execution time varies linearly with load average
  • Problem in block IO load

(Equation 1)
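
To illustrate the linear correlation between probe execution time and load average, here is a small least-squares sketch. The data points are hypothetical, not measurements from the study, and the fitted line is only an example of the kind of relation described; it is not a reproduction of Equation 1. The QoS ratio at the end assumes QoS = Tnoload / Texecution.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical measurements, NOT from the slides: load average vs. probe time (s).
load_avg = [0.0, 1.0, 2.0, 3.0]
exec_time = [10.0, 20.0, 30.0, 40.0]

slope, intercept = fit_line(load_avg, exec_time)
t_noload = intercept                  # fitted execution time at zero load
predicted = slope * 2.5 + intercept   # predicted execution time at load average 2.5
print(slope, intercept, t_noload / predicted)
```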

CPU Probe
  • Load average represents combined CPU and IO load
  • CPU probe depends only on CPU load
  • Two ways to achieve it
    • Average CPU load (VmStatR)
    • Calculate available CPU to probe


CPU Probe
  • Average CPU Load
    • 1 minute running average of run queue
    • Called VmStatR
    • Predicted QoS will be

(Equation 2)
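
A running average of the run-queue length, as described for VmStatR, could be maintained along these lines. This is a sketch under assumptions: the class name, the sampling interval, and the sample values are hypothetical, and in practice the samples would come from the "r" column of `vmstat` or an equivalent source.

```python
from collections import deque


class RunQueueAverage:
    """Running average of run-queue length over a fixed time window."""

    def __init__(self, window_s: float = 60.0, interval_s: float = 5.0):
        # Keep only as many samples as fit in the window.
        self.samples = deque(maxlen=max(1, int(window_s / interval_s)))

    def add(self, run_queue_length: int) -> float:
        """Record one sample and return the current running average."""
        self.samples.append(run_queue_length)
        return sum(self.samples) / len(self.samples)


avg = RunQueueAverage(window_s=60.0, interval_s=20.0)  # 3 samples per window
for r in [0, 2, 4, 6]:  # hypothetical run-queue samples
    current = avg.add(r)
print(current)  # mean of the last three samples: (2 + 4 + 6) / 3 = 4.0
```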


CPU Probe
  • Available CPU to probe
    • Calculate using CPU utilization metric
    • Probe is eligible for
      • Available Idle time
      • A share of System and User time

(Equation 3)
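
One way to read "available idle time plus a share of system and user time" is sketched below. The sharing rule is an assumption made for illustration (the busy time is split equally between the probe and the competing processes); it is not the slides' Equation 3, which is not reproduced in this transcript. The function and parameter names are hypothetical.

```python
def available_cpu_fraction(user: float, system: float, idle: float,
                           competing: int) -> float:
    """Estimate the CPU fraction a new probe could obtain.

    ASSUMPTION (not the slides' Equation 3): the probe takes all idle time
    and an equal share of the busy time, split with `competing` processes.
    """
    total = user + system + idle
    if total <= 0 or competing < 0:
        raise ValueError("invalid CPU utilization sample")
    busy_share = (user + system) / (competing + 1)
    return (idle + busy_share) / total


# Hypothetical sample: 60% user, 20% system, 20% idle, one competing process.
print(available_cpu_fraction(60.0, 20.0, 20.0, competing=1))  # 0.6
```

On a fully idle node the estimate is 1.0, and it shrinks as more processes compete for the busy time.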

CPU Probe
  • Table compares the QoS predicted using Equations 1 and 3 under block IO load
  • QoS predicted using Equation 3 shows the correct characteristic

Comparison of results
  • Compare the QoS results obtained using the three equations for CPU probe in different loads
    • Equation 1 does not give correct prediction in block IO load conditions
    • Equations 2 and 3 give acceptable results in any load condition

CPU Probe – Comparison of results

LC – CPU Load

LC+LB – CPU + Block IO Load

LC + LCh – CPU + Character IO Load

LCh + LB – Character + Block IO Load

Disk IO Probe
  • Modified ‘Bonnie’ to perform both as block IO and character IO probe
  • Considered block IO probe as most of the applications were block IO intensive
  • Correlated the probe's execution time under different loading conditions
  • Predicted QoS using the three equations and compared results

Disk IO Probe – Comparison of results

LC – CPU Load

LC+LB – CPU + Block IO Load

LC + LCh – CPU + Character IO Load

LCh + LB – Character + Block IO Load

CMSIM Results
  • Predicted execution time using QoS from Equation 2
  • Percentage error against the measured execution time was acceptable
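
The prediction step can be sketched as inverting the QoS ratio and comparing against a measured run. This assumes QoS = Tnoload / Texecution; the numbers are hypothetical and are not CMSIM measurements from the study.

```python
def predict_execution_time(t_noload: float, qos: float) -> float:
    """Invert QoS = Tnoload / Texecution (assumed form) to predict run time."""
    if not 0 < qos <= 1:
        raise ValueError("QoS must lie in (0, 1]")
    return t_noload / qos


def percent_error(predicted: float, measured: float) -> float:
    """Absolute prediction error as a percentage of the measured time."""
    return abs(predicted - measured) / measured * 100.0


# Hypothetical numbers, not CMSIM measurements from the slides.
predicted = predict_execution_time(t_noload=100.0, qos=0.5)  # 200.0 s
print(predicted, percent_error(predicted, measured=210.0))
```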

Problem Areas
  • Effect of swapping
    • If available memory is less than the size of task
    • Linux kernel dynamically changes the priorities of tasks and swaps tasks accordingly
    • Difficult to predict QoS

Problem Areas
  • Metric sampling frequency of monitoring system
    • Immediate metric value ensures better QoS prediction
    • At higher sampling frequencies, the monitoring itself loads the node
  • Change in state after submission of task
    • QoS can’t consider load changes after submission of task
    • Submission/removal of other task may change QoS

Conclusion
  • Equations 2 and 3 provide better QoS prediction for CPU-bound applications
  • Equation 1 can be used for IO-bound applications
  • Execution time was successfully predicted for CMSIM, which is mostly a CPU-bound job
  • Load balancing programs can use derived equations for job submissions
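
As an illustration of how a load balancer might use predicted QoS at submission time, the sketch below picks the node with the highest prediction. The node names and QoS values are hypothetical, and the selection rule (simple maximum) is an assumption rather than anything prescribed by the slides.

```python
def pick_node(predicted_qos: dict[str, float]) -> str:
    """Return the node whose predicted QoS is highest."""
    if not predicted_qos:
        raise ValueError("no candidate nodes")
    return max(predicted_qos, key=predicted_qos.get)


# Hypothetical node names and QoS predictions for one task category.
candidates = {"node01": 0.35, "node02": 0.80, "node03": 0.55}
print(pick_node(candidates))  # node02
```

Because QoS cannot account for load changes after submission, such a choice reflects only the node state at the moment the metrics were sampled.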

Thanks
