Network Weather Service

Sathish Vadhiyar

  • Sources / Credits:
  • NWS web site:
  • NWS papers
Introduction
  • “NWS provides accurate forecasts of dynamically changing performance characteristics from a distributed set of metacomputing resources”
  • What will be the future load (not current load) when a program is executed?
  • Producing short-term performance forecasts based on historical performance measurements
  • The forecasts can be used by dynamic scheduling agents
  • Resource allocation and scheduling decisions must be based on predictions of resource performance during a timeframe
  • NWS takes periodic performance measurements and, using numerical models, forecasts resource performance
NWS Goals
  • Components
    • Persistent state
    • Name server
    • Sensors
      • Passive (CPU availability)
      • Active (Network measurements)
    • Forecaster
Performance measurements
  • Using sensors
  • CPU sensors
    • Measures CPU availability
    • Uses
      • uptime
      • vmstat
      • Active probes
  • Network sensors
    • Measures latency and bandwidth
  • Each host maintains
    • Current data
    • One-step ahead predictions
    • Time series of data
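The uptime-based CPU sensor above can be illustrated with a small sketch. The estimate used here, 1 / (load average + 1), is an assumption for illustration (it presumes equal-priority time sharing) and is not taken from the slides; the function names are hypothetical.

```python
import os


def cpu_availability_from_load(load_average: float) -> float:
    """Estimate the CPU fraction a new process would receive.

    Assumption: the new process shares the CPU equally with
    `load_average` already-runnable processes of the same priority.
    """
    if load_average < 0:
        raise ValueError("load average cannot be negative")
    return 1.0 / (load_average + 1.0)


def probe_bias(passive_estimate: float, probe_measurement: float) -> float:
    """Difference between an active probe and the passive estimate.

    A sensor could add this bias to later passive estimates until
    the next active probe is run.
    """
    return probe_measurement - passive_estimate


def current_availability() -> float:
    """Read the 1-minute load average (Unix only) and convert it."""
    one_minute_load, _, _ = os.getloadavg()
    return cpu_availability_from_load(one_minute_load)
```

An idle host (load 0) yields an availability of 1.0, and a host with one runnable process yields 0.5.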
Issues with Network Sensors
  • Appropriate transfer size for measuring throughput
  • Collision of network probes
  • Solutions
    • Tokens and hierarchical trees with cliques
Available CPU measurement
  • The formulae shown do not take job priorities into account
  • Hence periodically an active probe is run to adjust the estimates
  • To generate a forecast, forecaster requests persistent state data
  • When a forecast is requested, forecaster makes predictions for existing measurements using different forecast models
  • The forecast model is chosen dynamically: the model with the best (lowest) Mean Absolute Error, Mean Square Prediction Error, or Mean Percentage Prediction Error so far is used
  • Forecasts requested by:
    • InitForecaster()
    • RequestForecasts()
  • Forecasting methods
    • Mean-based
    • Median-based
    • Autoregressive
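The dynamic choice of forecast model described above can be sketched as follows. This is a minimal illustration, not the NWS implementation: the three models, the window size, and all function names are assumptions, and only Mean Absolute Error is used as the selection criterion.

```python
from statistics import mean, median
from typing import Callable, Dict, Sequence, Tuple

Forecaster = Callable[[Sequence[float]], float]


def last_value(history: Sequence[float]) -> float:
    """Predict that the next measurement equals the latest one."""
    return history[-1]


def running_mean(history: Sequence[float]) -> float:
    """Mean-based model: average of all measurements so far."""
    return mean(history)


def sliding_median(history: Sequence[float], window: int = 5) -> float:
    """Median-based model over the most recent `window` measurements."""
    return median(history[-window:])


FORECASTERS: Dict[str, Forecaster] = {
    "last_value": last_value,
    "running_mean": running_mean,
    "sliding_median": sliding_median,
}


def mean_absolute_error(model: Forecaster, series: Sequence[float]) -> float:
    """Replay the series and average |measurement - one-step-ahead prediction|."""
    errors = [abs(series[t] - model(series[:t])) for t in range(1, len(series))]
    return mean(errors)


def pick_and_forecast(series: Sequence[float]) -> Tuple[str, float]:
    """Choose the model with the lowest MAE so far and forecast the next value."""
    best = min(FORECASTERS, key=lambda name: mean_absolute_error(FORECASTERS[name], series))
    return best, FORECASTERS[best](series)
```

For a series with a single spike, such as `[1, 1, 1, 1, 9, 1, 1, 1]`, the median-based model has the lowest error because it ignores the outlier.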
Forecasting Methods

Prediction Accuracy:

Mean Absolute Error (MAE) is the average of the absolute prediction errors.

Prediction Method:

The coefficients a_i are found such that they minimize the overall error.

r_i,j is the autocorrelation function for the series of N measurements.
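
The formulas for this slide are not reproduced in the transcript. Assuming the standard definitions of absolute error and of an order-N autoregressive model, they could be written as below. The symbols v_t (measurement at time t), p_t (prediction for time t), and T (number of predictions made) are introduced here for illustration only.

```latex
\begin{align*}
\mathrm{err}_t &= \lvert v_t - p_t \rvert \\
\mathrm{MAE}   &= \frac{1}{T} \sum_{t=1}^{T} \mathrm{err}_t \\
p_t            &= \sum_{i=1}^{N} a_i \, v_{t-i}
\end{align*}
```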

Forecasting Complexity vs Accuracy
  • Semi Non-parametric Time Series Analysis (SNP) – an accurate but complicated model
    • Model fit using iterative search
    • Calculation of conditional expected value using conditional probability density
Sensor Control
  • Each sensor connects to every other sensor and performs measurements, which requires O(N²) probes
  • To reduce this cost, sensors are organized in a hierarchy of groups called cliques
  • To avoid collisions, tokens are used
  • Adaptive control using adaptive token timeouts
  • Adaptive time-out discovery and distributed leader election protocol
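To see why cliques help, the number of host pairs to probe can be compared for a full mesh and for a clique hierarchy. The sketch assumes a two-level hierarchy in which one representative per clique joins a single top-level clique; that layout and the function names are illustrative assumptions.

```python
from math import comb
from typing import Sequence


def full_mesh_pairs(n_hosts: int) -> int:
    """Pairs probed when every sensor measures to every other sensor."""
    return comb(n_hosts, 2)


def clique_hierarchy_pairs(clique_sizes: Sequence[int]) -> int:
    """Pairs probed in an assumed two-level hierarchy.

    Each clique probes all pairs internally, and one representative
    per clique takes part in a single top-level clique.
    """
    internal = sum(comb(size, 2) for size in clique_sizes)
    top_level = comb(len(clique_sizes), 2)
    return internal + top_level
```

For 40 hosts, a full mesh needs 780 pairs, while four cliques of ten hosts need 4 × 45 + 6 = 186.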
Synchronizing network probes
  • Consistent periodicity and mutual exclusion
  • Token
    • List of hosts to probe
    • Periodicity of probe
    • Parameters to the probe
    • Sequence number
  • Leader initiates the token
  • A host, after receiving a token:
    • Conducts probes with the other hosts in the token
    • Passes the token to the next host
  • Token passed back to the leader
  • Leader notes the token circuit time and calculates the next token initiation time as (desired periodicity – token circuit time)
  • To avoid long delays in token circulation and to have fault tolerance:
    • Each host maintains a timer
    • When the timer times out, the host declares itself as the leader and initiates a new token
    • When a host encounters two tokens, the old token is destroyed
  • Calculation of time-outs
    • Each host records token circuit time, variance of the time
    • Uses NWS forecasting models to predict the next token arrival time
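The token bookkeeping described above can be sketched as follows. This is an illustration under stated assumptions: the field names are hypothetical, the "old" token is taken to be the one with the lower sequence number, and the time-out uses mean plus a multiple of the standard deviation as a simple stand-in for the NWS forecasting models.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import Dict, List, Sequence


@dataclass
class Token:
    hosts: List[str]              # list of hosts to probe
    periodicity: float            # desired seconds between probe rounds
    probe_params: Dict[str, int]  # parameters to the probe
    sequence_number: int = 0


def next_initiation_delay(token: Token, circuit_time: float) -> float:
    """Leader's wait before the next token: desired periodicity - circuit time.

    Clamped at zero in case the circuit took longer than the periodicity.
    """
    return max(0.0, token.periodicity - circuit_time)


def keep_newer(first: Token, second: Token) -> Token:
    """When a host holds two tokens, keep one and destroy the old one.

    Assumption: the token with the lower sequence number is the old one.
    """
    return first if first.sequence_number >= second.sequence_number else second


def token_timeout(arrival_intervals: Sequence[float], safety_factor: float = 2.0) -> float:
    """Hypothetical time-out from the recorded intervals and their spread."""
    return mean(arrival_intervals) + safety_factor * pstdev(arrival_intervals)
```

With a 240 second periodicity and a token circuit time of 12.5 seconds, the leader would wait 227.5 seconds before initiating the next token.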
New Protocol
  • Compromise between periodicity and mutual exclusion
  • NWS administrator specifies periodicity, and an upper range of desired periodicity
    • If network conditions are stable and if tokens are received within the upper range, then mutual exclusion is guaranteed
    • If not, hosts time out and start conducting probes, with possible collisions
  • Thus the protocol switches between good and bad phases
Comparison of 2 protocols – Experimental setup
  • 4 machines – 2 in Lyon, France and 2 in Tennessee, USA
  • 240 second periodicity
  • 5 second range
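Using the setup values above, the switch between good and bad phases can be sketched as a simple check. The interpretation of the range as slack added on top of the periodicity is an assumption, as is the function name.

```python
def phase(token_interval: float, periodicity: float = 240.0, allowed_range: float = 5.0) -> str:
    """Classify one token arrival.

    "good": the token arrived within periodicity + allowed_range, so the
    host probes while holding the token (mutual exclusion holds).
    "bad": the host timed out and probes on its own, risking collisions.
    """
    return "good" if token_interval <= periodicity + allowed_range else "bad"
```

For example, a token arriving 243 seconds after the previous one keeps the host in the good phase, while 250 seconds puts it in the bad phase.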
Use of NWS: Scheduling a Jacobi application

The problem: choosing an appropriate partitioning strategy that balances processor efficiencies against communication overheads, i.e. deriving partitions to obtain resource performance

Deriving Partitions for Jacobi
  • Notations
  • Per-processor execution time
  • The goal
Deriving Partitions for Jacobi
  • Communication time
  • Solution: a system of linear equations, solved by Gaussian elimination
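
The slide's notation is not reproduced in the transcript, so the following sketch assumes a simple time-balancing model: processor i receives A_i grid points, needs P_i seconds per point and C_i seconds of communication, and all processors should finish at the same time T. The symbols and function names are assumptions for illustration. In practice P_i and C_i would come from NWS forecasts.

```python
from typing import List, Sequence, Tuple


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def balance_partitions(
    time_per_point: Sequence[float],
    comm_time: Sequence[float],
    total_points: float,
) -> Tuple[List[float], float]:
    """Return (A_1..A_p, T) such that A_i * P_i + C_i = T and sum(A_i) = total."""
    p = len(time_per_point)
    a: List[List[float]] = []
    b: List[float] = []
    for i in range(p):
        row = [0.0] * (p + 1)
        row[i] = time_per_point[i]  # coefficient of A_i
        row[p] = -1.0               # coefficient of T
        a.append(row)
        b.append(-comm_time[i])
    a.append([1.0] * p + [0.0])     # partitions must cover the whole grid
    b.append(float(total_points))
    solution = solve_linear_system(a, b)
    return solution[:p], solution[p]
```

With two processors where the second is twice as slow per point and communication is free, 300 points split into 200 and 100. The sketch does not guard against negative partition sizes, which can appear when one processor's communication cost alone exceeds the balanced finish time.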
  • Implementing a Performance Forecasting System for Metacomputing: The Network Weather Service. Rich Wolski, Neil Spring, Chris Peterson, in Proceedings of SC97, November, 1997.
  • Dynamically Forecasting Network Performance Using the Network Weather Service. Rich Wolski, in Journal of Cluster Computing, Volume 1, pp. 119-132, January, 1998.
  • The Network Weather Service: A Distributed Resource Performance Forecasting Service for Metacomputing. Rich Wolski, Neil Spring, and Jim Hayes, Journal of Future Generation Computing Systems, Volume 15, Numbers 5-6, pp. 757-768, October, 1999.
  • Synchronizing Network Probes to avoid Measurement Intrusiveness with the Network Weather Service, B. Gaidioz, R. Wolski, and B. Tourancheau, Proceedings of 9th IEEE High-performance Distributed Computing Conference, August, 2000, pp. 147-154.
  • Experiences with Predicting Resource Performance On-line in Computational Grid Settings, Rich Wolski, ACM SIGMETRICS Performance Evaluation Review, Volume 30, Number 4, pp. 41-49, March, 2003.