observe analyze act paradigm for storage system resource arbitration
Skip this Video
Download Presentation
Observe-Analyze-Act Paradigm for Storage System Resource Arbitration

Loading in 2 Seconds...

play fullscreen
1 / 23

Observe-Analyze-Act Paradigm for Storage System Resource Arbitration - PowerPoint PPT Presentation

  • Uploaded on

Observe-Analyze-Act Paradigm for Storage System Resource Arbitration. Li Yin 1 Email: [email protected] Joint work with: Sandeep Uttamchandani 2 Guillermo Alvarez 2 John Palmer 2 Randy Katz 1 1:University of California, Berkeley 2: IBM Almaden Research Center. Outline.

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
Download Presentation

PowerPoint Slideshow about 'Observe-Analyze-Act Paradigm for Storage System Resource Arbitration' - Melvin

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
observe analyze act paradigm for storage system resource arbitration

Observe-Analyze-Act Paradigm for Storage System Resource Arbitration

Li Yin1

Email: [email protected]

Joint work with: Sandeep Uttamchandani2

Guillermo Alvarez2

John Palmer2

Randy Katz1

1:University of California, Berkeley

2: IBM Almaden Research Center

  • Observe-analyze-act in storage system: CHAMELEON
    • Motivation
    • System model and architecture
    • Design details
    • Experimental results
  • Observe-analyze-act in other scenarios
    • Example: network applications
  • Future challenges
need for run time system management




Data Warehousing


Storage Virtualization

(Mapping Application-data to Storage Resources)

Need for Run-time System Management
  • Static resource allocation is not enough
    • Incomplete information of the access characteristics: workload variations; change of goals
    • Exception scenarios: hardware failures; load surges.
approaches for run time storage system management
Approaches for Run-time Storage System Management
  • Today: Administrator observe-analyze-act
  • Automate the observe-analyze-act:
    • Rule-based system
      • Complexity
      • Brittleness
    • Pure feedback-based system
      • Infeasible for real-world multi-parameter tuning
    • Model-based approaches
      • Challenges:
        • How to represent system details as models?
        • How to create/evolve models?
        • How to use models for decision making?
system model for resource arbitration
Host 1

Host 2

Host n

Throttling Points

QoS DecisionModule

InterconnectionFabric Switch m

InterconnectionFabric Switch 1



System Model for Resource Arbitration
  • Input:
    • SLAs for workloads
    • Current system status (performance)
  • Output:
    • Resource reallocation action (Throttling decisions)


our solution chameleon
Managed System

Current States

Designer-defined Policies


Piece-wiseLinear Programming

Component Models






Our Solution: CHAMELEON




Incremental ThrottlingStep Size

Current States


Throttling Value


knowledge base component model
Knowledge Base: Component Model
  • Objective:Predict service time for a given load at a component (For example: storage controller).

Service_timecontroller = L( request size, read write ratio, random sequential ratio, request rate)

  • An example of component model
    • FAStT900, 30 disks, RAID0
    • Request Size 10KB, Read/Write Ratio 0.8, Random Access
component model cont
Component Model (cont.)
  • Quadratic Fit
    • S = 3.284, r = 0.838
  • Linear Fit
    • S = 3.8268, r = 0.739
  • Non-saturated case: Linear Fit
    • S = 0.0509, r = 0.989
knowledge base workload model
Knowledge Base: Workload Model
  • Objective:Predict the load on component i as a function of the request rate j
  • Example:
    • Workload with 20KB request size, 0.642 read/write ratio and 0.026 sequential access ratio

Component_loadi,j= Wi,j( workload j request rate)

knowledge base action model
Knowledge Base: Action Model
  • Objective: Predict the effect of corrective actions on workload requirements
  • Example:

Workload J request Rate = Aj(Token Issue Rate for Workload J)

analyze module reasoning engine
Analyze Module: Reasoning Engine
  • Formulated as a constraint solving problem
    • Part 1: Predict Action Behavior: For each candidate throttling decision, predict its performance result based on knowledge base
    • Part 2: Constraint Solving: Use linear programming technique to scan all feasible solutions and choose the optimal one
reasoning engine predict result

WorkloadRequest Rate

Token Issue Rate


Workload Model


Workload Request Rate

Reasoning Engine: Predict Result
  • Chain all models together to predict action result
  • Input: Token issue rate for each workloads
  • Output: Expected latency

Action Model

Workload 1

Component Model

Workload n

reasoning engine constraint solving




Reasoning Engine: Constraint Solving


  • Formulated using Linear Programming
  • Formulation:
    • Variable: Token issue rate for each workload
    • Objective Function:
      • Minimize number of workloads violating their SLA goals
      • Workloads are as close to their SLA IO rate as possible
      • Example:
    • Constraints:
      • Workloads should meet their SLA latency goals





Minimize ∑paipbi[ SLAi – T(current_throughputi, ti)]


where pai= Workload priority pbi = Quadrant priority

act module throttling executor


Execute Reasoning Engine Output with Step-wise Throttling Proportional toConfidence Value

Execute DesignerPolicies

Act Module: Throttling Executor

ReasoningEngine Invoked

  • Hybrid of feedback and prediction
  • Ability to switch to rule-based (policies) when confidence value is low
  • Ability to re-trigger reasoning engine

Confidence Value < Threshold




Analyze System


experimental results
Experimental Results
  • Test-bed configuration:
    • IBM x-series 440 server (2.4GHz 4-way with 4GB memory, redhat server 2.1 kernel)
    • FAStT 900 controller
    • 24 drives (RAID0)
    • 2Gbps FibreChannel Link
  • Tests consist of:
    • Synthetic workloads
    • Real-world trace replay (HP traces and SPC traces)
experimental results synthetic workloads
Experimental Results: Synthetic Workloads
  • Effect of priority values on the output of constraint solver
  • Effect of model errors on output of the constraint solver

(a) Equal priority (b) Workload priorities (c ) Quadrant priorities

(a) Without feedback (b) with feedback

experiment result real world trace replay
Experiment Result: Real-world Trace Replay
  • Real-world block-level traces from HP (cello96 trace) and SPC (web server)
  • A phased synthetic workload acts as the third flow
  • Test goals:
    • Do they converge to SLAs?
    • How reactive the system is?
    • How does CHAMELEON handle unpredictable variations?
real world trace replay
Real-world Trace Replay
  • With throttling
  • Without CHAMELEON:
real world trace replay19
Real-world Trace Replay
  • Handling system changes
  • With periodic un-throttling
other system management scenarios
1: monitor network/server behaviors

4: Execute or

Distribute actiondecision to execution points

2: establish/maintain modelsfor system


3: Analyze and make actiondecision

Other system management scenarios?
  • Automate the observe-analyze-act loop for other self-management scenarios
  • Example: CHAMELEON for network applications
    • Example: A proxy in front of server farm
future work
Future Work
  • Better methods to improve model accuracy
  • More general constraint solver
  • Combining with other actions
  • CHAMELEON in other scenarios
  • CHAMELEON for reliability and failure
  • L. Yin, S. Uttamchandani, J. Palmer, R. Katz, G. Agha, “AUTOLOOP: Automated Action Selection in the ``Observe-Analyze-Act’’ Loop for Storage Systems”, submitted for publication, 2005
  • S. Uttamchandani, L. Yin, G. Alvarez, J. Palmer, G. Agha, “CHAMELEON: a self-evovling, fully-adaptive resource arbitrator for storage systems”, to appear in USENIX Annual Technical Conference (USENIX’05), 2005
  • S. Uttamchandani, K. Voruganti, S. Srinivasan, J. Palmer , D. Pease, “Polus: Growing Storage QoS Management beyond a 4-year old kid”, 3rd USENIX Conference on File and Storage Technologies (FAST’04), 2004