Feedback performance control in software services

T.F. Abdelzaher, J.A. Stankovic, C. Lu, R. Zhang, and Y. Lu, Feedback Performance Control in Software Services, IEEE Control Systems, 23(3): 74-90, June 2003.

  • SW systems are becoming larger and more complex
  • Performance guarantee required, e.g., in web-based e-commerce
  • Control theory
    • Promising theoretical foundation for perf control in complex SW applications, e.g., real-time scheduling, web servers, multimedia control, storage managers, power management, routing in computer networks, …
  • Software performance assurance problems -> feedback control problems, with a focus on web server performance guarantees
SW performance control
  • Less rigorous guarantees on perf and quality
  • Most SW eng. research deals with the development of functionally correct SW
  • Functional correctness is not enough!
    • Timeliness in embedded systems
      • Correct but delayed action can be disastrous
    • Non-functional QoS attributes, e.g., timeliness, security, availability, …
Traditional approaches for perf guarantees
  • Worst case estimates of load & resource availability
    • Recall EDF, RM, DM, Priority Ceiling Protocol, …
New demand for performance assurance
  • QoS guarantees required in a broader scope of applications run in open, unpredictable environments
    • Global communication networks enabling online banking, trading, distance learning, …
    • Points of massive aggregation suffering unpredictable loads, potential bottlenecks, DoS attacks, …

-> Precise workload/system model unknown a priori

    • Failure to meet QoS requirements -> loss of customers or financial damages
    • Worst case analysis/overdesign could be overly pessimistic or wasteful
    • Solid analytic framework for cost-effective perf assurance required
  • How to model SW architecture?
  • How to map a specific QoS problem into a feedback control system?
  • How to choose proper SW sensors and actuators to monitor and adjust perf and workloads/resource allocation?
  • How to design controllers for servers?

-> This paper focuses on web servers

QoS metrics
  • Delay metrics
    • Proportional to time: queuing delays, execution latencies, service response time
  • Rate metrics
    • Inversely proportional to time
    • Connection bandwidth, throughput, packet rate
Time-related perf attributes
  • Can be controlled by adjusting resource allocation
    • Queuing theory can predict perf given a particular resource allocation or vice versa
    • Classical queuing-theory results assume Poisson arrival patterns
      • Queuing theory can only predict average perf even if this assumption holds
    • Arrival patterns in web applications follow heavy-tailed distribution -> Bursty arrival patterns
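As a worked instance of predicting perf from a resource allocation "or vice versa", here is a minimal sketch that assumes the textbook M/M/1 queue (single server, Poisson arrivals, exponential service times). The slides do not name a specific queuing model, and the function names are hypothetical.

```python
def mm1_mean_response_time(arrival_rate: float, service_rate: float) -> float:
    """Mean response time W = 1 / (mu - lambda) of a stable M/M/1 queue."""
    if arrival_rate >= service_rate:
        raise ValueError("unstable queue: arrival rate must be below service rate")
    return 1.0 / (service_rate - arrival_rate)


def mm1_required_service_rate(arrival_rate: float, target_response_time: float) -> float:
    """Inverse problem: service rate needed to reach a mean response time target."""
    if target_response_time <= 0:
        raise ValueError("target response time must be positive")
    return arrival_rate + 1.0 / target_response_time
```

Both directions only give averages, and only under the Poisson assumption, which is the limitation the bullets above point out for bursty web traffic.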
Service architecture

Liquid task model

Fig. 1 Server architecture: (a) computing model (b) control-oriented


Liquid task model
  • Ci << Di
    • Takes Ci units of time to serve request i
    • Di is the max tolerable response time
    • Tolerable response time is finite
    • Service times are infinitesimal
  • Progress of requests through the server queues ≈ Fluid flow
  • Service rate at stage k = dNk(t)/dt where Nk is #requests processed by stage k
Liquid task model
  • Volume at time T ≈ #requests queued at stage k = ∫_T (Fin − Fk) dt
    • Fk: service rate at stage k
    • Fin: request arrival rate to this stage
  • Valves: points of control, i.e., manipulated variables such as the queue length
  • Liquid model does not describe how individual requests are prioritized
  • Control theory can be combined with queuing theory or real-time scheduling
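The fluid analogy can be sketched numerically. The snippet below is an illustration only: it integrates Fin − Fk with a forward-Euler step and clamps the volume at zero, since a queue cannot hold a negative number of requests. The function name, the sampled-rate inputs, and the step size `dt` are assumptions, not part of the slides.

```python
def queue_volume(arrival_rates, service_rates, dt, initial_volume=0.0):
    """Approximate the queued volume at stage k by integrating (Fin - Fk) over time.

    arrival_rates and service_rates are equally long sequences of rates
    (requests per time unit) sampled every dt time units.
    """
    volume = initial_volume
    history = []
    for f_in, f_k in zip(arrival_rates, service_rates):
        # Forward-Euler step; an empty queue cannot drain further.
        volume = max(0.0, volume + (f_in - f_k) * dt)
        history.append(volume)
    return history
```

For example, `queue_volume([10, 10, 10], [5, 5, 20], 1.0)` fills the queue for two periods and then drains it once the service rate exceeds the arrival rate.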
Server modeling
  • Difference equation to model web servers
    • y(k): perf, e.g., delay or throughput, measured at the kth sampling period
    • u(k): control input at the kth sampling period
    • ARMA (AutoRegressive Moving Average) model
      • y(k) = a_1 y(k-1) + a_2 y(k-2) + … + a_n y(k-n) + b_1 u(k-1) + b_2 u(k-2) + … + b_n u(k-n)
      • Transfer function can be derived
        • Web proxy cache model [4]
        • TCP dynamics [5]
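To make the difference equation concrete, the following sketch steps an nth-order model forward in time. It assumes that all samples before k = 0 are zero; the function name and the list-based coefficient layout are illustrative choices.

```python
def simulate_arma(a, b, u):
    """Simulate y(k) = sum_i a[i]*y(k-1-i) + sum_i b[i]*u(k-1-i).

    a and b hold the n model coefficients (a[0] is a_1, b[0] is b_1).
    Samples before k = 0 are taken to be zero.
    """
    if len(a) != len(b):
        raise ValueError("a and b must have the same length n")
    y = []
    for k in range(len(u)):
        y_k = 0.0
        for i in range(len(a)):
            j = k - 1 - i  # index of the past sample used by term i
            if j >= 0:
                y_k += a[i] * y[j] + b[i] * u[j]
        y.append(y_k)
    return y
```

With a first-order model, `simulate_arma([0.5], [1.0], [1, 1, 1, 1])` gives a step response that climbs toward the steady-state gain b_1 / (1 − a_1) = 2.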
Resource allocation for QoS guarantees
  • Allocate more/less resource = open/close a valve
  • Need actuators to control resource allocation or QoS provided by the system
SW system actuators
  • Input flow actuators
    • Admission control
    • Control queue length, server utilization, …
    • Reject some requests under overload
SW system actuators
  • Quality adaptation actuators
    • Change processing requirements to increase server rate under overload
    • E.g., Return abbreviated web page under overload
    • Tradeoff btwn delay & quality
    • Service level m in a range [0, M] where 0 is rejection
Resource reallocation actuator
  • Alter the amount of allocated resources
  • Usually applicable to multiple classes of clients, e.g., dynamically reallocate disk space to support the service delay ratio 1:2 between two service classes [4,7]
QoS Mapping
  • Convert common resource management & SW perf assurance problems to FC problems
  • Absolute convergence guarantee
  • Relative guarantee
  • Resource reservation guarantee
  • Prioritization guarantee
  • Statistical multiplexing guarantee
  • Utility optimization guarantee
Absolute convergence guarantee
  • Convergence to the specified perf value (set point)
  • Overshoot: Maximum deviation
  • Settling time: Time taken to recover the desired perf
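Both quantities can be read off a measured response trace. The sketch below follows the definitions on this slide (overshoot as the maximum deviation from the set point). The tolerance band used to decide when perf has "recovered" is an assumption, as are the function and parameter names.

```python
def overshoot_and_settling_time(samples, set_point, tolerance, period=1.0):
    """Return (overshoot, settling_time) for a sampled response.

    overshoot: largest absolute deviation from the set point.
    settling_time: time from which every later sample stays within
    +/- tolerance of the set point. If the last sample is still outside
    the band, the full trace length is returned.
    """
    deviations = [abs(y - set_point) for y in samples]
    overshoot = max(deviations)
    last_outside = -1
    for k, deviation in enumerate(deviations):
        if deviation > tolerance:
            last_outside = k
    settling_time = (last_outside + 1) * period
    return overshoot, settling_time
```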
Absolute convergence guarantee
  • Rate & queue length control
    • Result in linear FC
    • (Flow) rate can be directly controlled by actuators
    • Queue length can be linearly controlled by controlling the flow
    • E.g., server utilization control loop
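A server utilization control loop can be sketched as follows. This is a toy model rather than the controller from the paper: it assumes an integral controller, an admission-control actuator expressed as the fraction of requests admitted, and a plant in which utilization equals admitted fraction times offered load, saturating at 1. All names and the gain value are hypothetical.

```python
def run_utilization_loop(offered_load, set_point=0.8, gain=0.5, steps=50):
    """Integral control of utilization through an admission-control actuator.

    offered_load is the demand in units of server capacity (2.0 = twice
    what the server can handle). Returns the utilization measured at
    each sampling period.
    """
    admit_fraction = 1.0
    trace = []
    for _ in range(steps):
        # Sensor: measured utilization, saturating at full capacity.
        utilization = min(1.0, admit_fraction * offered_load)
        trace.append(utilization)
        # Controller and actuator: move the admitted fraction against the error.
        error = set_point - utilization
        admit_fraction = min(1.0, max(0.0, admit_fraction + gain * error))
    return trace
```

Away from saturation the loop is linear in the admitted fraction, which is the point of the bullets above. In this toy model it is stable as long as gain × offered_load stays below 2.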
Absolute convergence guarantee
  • Delay control
    • More difficult
    • Delay is inversely proportional to flow
      • Queuing delay d = Q/r where Q is queue length & r is service rate
      • Nonlinear
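One common way to handle this nonlinearity, shown here as a sketch rather than as the approach taken in the paper, is a first-order Taylor expansion around an assumed operating point (Q_0, r_0). The symbols Q_0, r_0, ΔQ, Δr, and Δd are introduced only for this illustration.

```latex
d = \frac{Q}{r}, \qquad
\Delta d \;\approx\; \frac{1}{r_0}\,\Delta Q \;-\; \frac{Q_0}{r_0^{2}}\,\Delta r
```

The coefficient of Δr depends on both Q_0 and r_0, so a linear controller tuned at one operating point can behave differently when the load moves the system elsewhere.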
Relative guarantee
  • For example, fix the delays of two traffic classes at a ratio 1:3
  • Hi: measured perf of class i
  • Ci: weight of class i
  • Relative guarantee specifies H1:H2 = C1:C2, here 1:3
  • Set point = 1/3
  • Error e = 1/3 – H1/H2
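The error computation, and one way a controller might act on it, can be sketched as below. The reallocation rule (an integral-style step on the number of processes for class 1, with a hypothetical gain) is an assumption for illustration; only the error formula comes from the slide.

```python
def relative_delay_error(h1, h2, c1=1.0, c2=3.0):
    """Error e = C1/C2 - H1/H2; zero when the measured ratio matches the target."""
    return c1 / c2 - h1 / h2


def reallocate_processes(p1, total, error, gain=10.0):
    """Shift server processes between two classes based on the error.

    A negative error means class 1 is relatively too slow, so class 1
    receives more processes. Each class always keeps at least one.
    Returns the new (class 1, class 2) allocation.
    """
    new_p1 = round(p1 - gain * error)
    new_p1 = max(1, min(total - 1, new_p1))
    return new_p1, total - new_p1
```

For example, measured delays H1 = 2 and H2 = 3 give e = 1/3 − 2/3 = −1/3, so with 10 of 20 processes assigned to class 1 the rule moves three more processes to class 1.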
Relative guarantee in Apache web server
  • Controlled variable: relative delay ratio
  • Manipulated variable: #allocated processes per class to control connection delay
      • HTTP protocol summary
        • A client, e.g., web browser establishes a TCP connection with a server process
        • The client submits an HTTP request to the server over the TCP connection
        • The server sends the response back to the client
        • The server keeps the TCP connection open for the Keep-Alive interval, e.g., 15 s

-> Claim: connection delay dominates service response time

-> Scheduling can also significantly affect the relative delay ratio, but it is not considered

Relative guarantee in Apache web server
  • System identification based on the ARMA model
      • Randomly change per class process allocations
      • Measure response time
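The identification step can be illustrated with a least-squares fit of a first-order model y(k) = a·y(k−1) + b·u(k−1). This is a sketch under stated assumptions: the model order, the synthetic data generator standing in for the real experiment, and all function names are hypothetical.

```python
import random


def generate_trace(a, b, steps, seed=0):
    """Synthetic stand-in for the experiment: random inputs, first-order response."""
    rng = random.Random(seed)
    u = [rng.uniform(1.0, 10.0) for _ in range(steps)]
    y = [0.0]
    for k in range(1, steps):
        y.append(a * y[k - 1] + b * u[k - 1])
    return u, y


def fit_first_order(u, y):
    """Least-squares estimate of (a, b) in y(k) = a*y(k-1) + b*u(k-1)."""
    s_yy = s_yu = s_uu = r_y = r_u = 0.0
    for k in range(1, len(y)):
        y_prev, u_prev, y_now = y[k - 1], u[k - 1], y[k]
        s_yy += y_prev * y_prev
        s_yu += y_prev * u_prev
        s_uu += u_prev * u_prev
        r_y += y_prev * y_now
        r_u += u_prev * y_now
    # Solve the 2x2 normal equations by Cramer's rule.
    det = s_yy * s_uu - s_yu * s_yu
    if abs(det) < 1e-12:
        raise ValueError("input does not excite the system enough to identify it")
    a = (r_y * s_uu - r_u * s_yu) / det
    b = (r_u * s_yy - r_y * s_yu) / det
    return a, b
```

Randomly varying the input matters here: a constant allocation would make the normal equations singular, and the two parameters could not be separated.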
Relative guarantee in Apache web server
  • Perf settings
    • 4 Linux machines run the Surge web workload generator
    • 1 Linux machine runs the Apache web server
    • Suddenly increase #premium clients by 100 at time 870s
Relative guarantee in Apache web server
  • Perf results

[Figure panels: Open Loop, Closed Loop]

Related work
  • ControlWare
  • CPU scheduling
  • Storage management
  • Network routers
  • Power/heat management
  • RTDB
  • Feedback control is applicable to managing performance in SW systems
  • Future work
    • Adaptive/robust control
    • Predictive control
    • Apply to other computational systems such as embedded systems
Adaptive Control: Self-Tuning Regulator
  • Dynamically estimate a model of the system via the Recursive Least Squares method
  • Controller will accordingly set the actuators to support the desired perf.
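A self-tuning regulator can be sketched in a few lines. The code below is an illustration, not the design from the cited work: it assumes a first-order model y(k) = a·y(k−1) + b·u(k−1), a standard recursive least squares update, and a certainty-equivalence control law that picks the input the current model predicts will reach the set point. The simulated plant and every name are hypothetical.

```python
class RecursiveLeastSquares:
    """RLS estimate of theta = [a, b] in y(k) = a*y(k-1) + b*u(k-1)."""

    def __init__(self, forgetting=1.0, initial_covariance=1e6):
        self.theta = [0.0, 0.0]
        self.p = [[initial_covariance, 0.0], [0.0, initial_covariance]]
        self.forgetting = forgetting

    def update(self, y_prev, u_prev, y_now):
        phi = [y_prev, u_prev]
        # p_phi = P * phi
        p_phi = [self.p[i][0] * phi[0] + self.p[i][1] * phi[1] for i in range(2)]
        denom = self.forgetting + phi[0] * p_phi[0] + phi[1] * p_phi[1]
        gain = [p_phi[0] / denom, p_phi[1] / denom]
        prediction_error = y_now - (self.theta[0] * phi[0] + self.theta[1] * phi[1])
        self.theta = [self.theta[i] + gain[i] * prediction_error for i in range(2)]
        # P <- (P - gain * phi^T P) / forgetting; phi^T P equals p_phi since P is symmetric.
        self.p = [[(self.p[i][j] - gain[i] * p_phi[j]) / self.forgetting
                   for j in range(2)] for i in range(2)]
        return self.theta


def run_self_tuning_regulator(true_a=0.6, true_b=0.3, set_point=2.0, steps=30):
    """Closed loop: estimate the model online and set the actuator from it."""
    rls = RecursiveLeastSquares()
    y = 0.0
    history = []
    for _ in range(steps):
        a_hat, b_hat = rls.theta
        # Choose u so the estimated model predicts y(k+1) = set_point;
        # fall back to a probing input while b is still unknown.
        u = (set_point - a_hat * y) / b_hat if abs(b_hat) > 1e-6 else 1.0
        y_next = true_a * y + true_b * u  # simulated plant (hypothetical)
        rls.update(y, u, y_next)
        y = y_next
        history.append(y)
    return history, rls.theta
```

In this noise-free simulation the estimates settle after a few samples. A real system would add measurement noise and usually a forgetting factor below 1 to track changing workloads.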
References (HP Storage Systems Lab)
  • Designing controllable computer systems, Christos Karamanolis, Magnus Karlsson and Xiaoyun Zhu. USENIX Workshop on Hot Topics in Operating Systems (HotOS), June 2005, pp. 49-54, Santa Fe, NM.
  • Dynamic black-box performance model estimation for self-tuning regulators, Magnus Karlsson and Michele Covell. International Conference on Autonomic Computing (ICAC), pp. 172-182, June 2005, Seattle, WA.
IBM Autonomic Computing Lab
  • General, broader research issues regarding self-tuning, self-managing systems
  • Also, visit Joe Hellerstein’s Adaptive Systems Department
Some University Labs
  • Tarek Abdelzaher:
  • Chenyang Lu:
  • Programming Assignment 1 is posted on the course web page