observe analyze act paradigm for storage system resource arbitration
Download
Skip this Video
Download Presentation
Observe-Analyze-Act Paradigm for Storage System Resource Arbitration

Loading in 2 Seconds...

play fullscreen
1 / 23

Observe-Analyze-Act Paradigm for Storage System Resource Arbitration - PowerPoint PPT Presentation


  • 355 Views
  • Uploaded on

Observe-Analyze-Act Paradigm for Storage System Resource Arbitration. Li Yin 1 Email: [email protected] Joint work with: Sandeep Uttamchandani 2 Guillermo Alvarez 2 John Palmer 2 Randy Katz 1 1:University of California, Berkeley 2: IBM Almaden Research Center. Outline.

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'Observe-Analyze-Act Paradigm for Storage System Resource Arbitration' - Melvin


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
observe analyze act paradigm for storage system resource arbitration

Observe-Analyze-Act Paradigm for Storage System Resource Arbitration

Li Yin1

Email: [email protected]

Joint work with: Sandeep Uttamchandani2

Guillermo Alvarez2

John Palmer2

Randy Katz1

1:University of California, Berkeley

2: IBM Almaden Research Center

outline
Outline
  • Observe-analyze-act in storage system: CHAMELEON
    • Motivation
    • System model and architecture
    • Design details
    • Experimental results
  • Observe-analyze-act in other scenarios
    • Example: network applications
  • Future challenges
need for run time system management
E-mail

Application

Web-server

Application

Data Warehousing

SLAGoals

Storage Virtualization

(Mapping Application-data to Storage Resources)

Need for Run-time System Management
  • Static resource allocation is not enough
    • Incomplete information of the access characteristics: workload variations; change of goals
    • Exception scenarios: hardware failures; load surges.
approaches for run time storage system management
Approaches for Run-time Storage System Management
  • Today: Administrator observe-analyze-act
  • Automate the observe-analyze-act:
    • Rule-based system
      • Complexity
      • Brittleness
    • Pure feedback-based system
      • Infeasible for real-world multi-parameter tuning
    • Model-based approaches
      • Challenges:
        • How to represent system details as models?
        • How to create/evolve models?
        • How to use models for decision making?
system model for resource arbitration
Host 1

Host 2

Host n

Throttling Points

QoS DecisionModule

InterconnectionFabric Switch m

InterconnectionFabric Switch 1

controller

controller

System Model for Resource Arbitration
  • Input:
    • SLAs for workloads
    • Current system status (performance)
  • Output:
    • Resource reallocation action (Throttling decisions)

controller

our solution chameleon
Managed System

Current States

Designer-defined Policies

Retrigger

Piece-wiseLinear Programming

Component Models

Models

WorkloadModels

ReasoningEngine

ActionModels

KnowledgeBase

Our Solution: CHAMELEON

Observe

Analyze

Act

Incremental ThrottlingStep Size

Current States

Feedback

Throttling Value

ThrottlingExecutor

knowledge base component model
Knowledge Base: Component Model
  • Objective:Predict service time for a given load at a component (For example: storage controller).

Service_timecontroller = L( request size, read write ratio, random sequential ratio, request rate)

  • An example of component model
    • FAStT900, 30 disks, RAID0
    • Request Size 10KB, Read/Write Ratio 0.8, Random Access
component model cont
Component Model (cont.)
  • Quadratic Fit
    • S = 3.284, r = 0.838
  • Linear Fit
    • S = 3.8268, r = 0.739
  • Non-saturated case: Linear Fit
    • S = 0.0509, r = 0.989
knowledge base workload model
Knowledge Base: Workload Model
  • Objective:Predict the load on component i as a function of the request rate j
  • Example:
    • Workload with 20KB request size, 0.642 read/write ratio and 0.026 sequential access ratio

Component_loadi,j= Wi,j( workload j request rate)

knowledge base action model
Knowledge Base: Action Model
  • Objective: Predict the effect of corrective actions on workload requirements
  • Example:

Workload J request Rate = Aj(Token Issue Rate for Workload J)

analyze module reasoning engine
Analyze Module: Reasoning Engine
  • Formulated as a constraint solving problem
    • Part 1: Predict Action Behavior: For each candidate throttling decision, predict its performance result based on knowledge base
    • Part 2: Constraint Solving: Use linear programming technique to scan all feasible solutions and choose the optimal one
reasoning engine predict result
latency

WorkloadRequest Rate

Token Issue Rate

load

Workload Model

ComponentLoad

Workload Request Rate

Reasoning Engine: Predict Result
  • Chain all models together to predict action result
  • Input: Token issue rate for each workloads
  • Output: Expected latency

Action Model

Workload 1

Component Model

Workload n

reasoning engine constraint solving
FAILED

EXCEED

MEET

LUCKY

Reasoning Engine: Constraint Solving

Latency(%)

  • Formulated using Linear Programming
  • Formulation:
    • Variable: Token issue rate for each workload
    • Objective Function:
      • Minimize number of workloads violating their SLA goals
      • Workloads are as close to their SLA IO rate as possible
      • Example:
    • Constraints:
      • Workloads should meet their SLA latency goals

1

1

IOps(%)

0

Minimize ∑paipbi[ SLAi – T(current_throughputi, ti)]

SLAi

where pai= Workload priority pbi = Quadrant priority

act module throttling executor
Yes

No

Execute Reasoning Engine Output with Step-wise Throttling Proportional toConfidence Value

Execute DesignerPolicies

Act Module: Throttling Executor

ReasoningEngine Invoked

  • Hybrid of feedback and prediction
  • Ability to switch to rule-based (policies) when confidence value is low
  • Ability to re-trigger reasoning engine

Confidence Value < Threshold

Re-triggerReasoningEngine

Continue

Throttling

Analyze System

States

experimental results
Experimental Results
  • Test-bed configuration:
    • IBM x-series 440 server (2.4GHz 4-way with 4GB memory, redhat server 2.1 kernel)
    • FAStT 900 controller
    • 24 drives (RAID0)
    • 2Gbps FibreChannel Link
  • Tests consist of:
    • Synthetic workloads
    • Real-world trace replay (HP traces and SPC traces)
experimental results synthetic workloads
Experimental Results: Synthetic Workloads
  • Effect of priority values on the output of constraint solver
  • Effect of model errors on output of the constraint solver

(a) Equal priority (b) Workload priorities (c ) Quadrant priorities

(a) Without feedback (b) with feedback

experiment result real world trace replay
Experiment Result: Real-world Trace Replay
  • Real-world block-level traces from HP (cello96 trace) and SPC (web server)
  • A phased synthetic workload acts as the third flow
  • Test goals:
    • Do they converge to SLAs?
    • How reactive the system is?
    • How does CHAMELEON handle unpredictable variations?
real world trace replay
Real-world Trace Replay
  • With throttling
  • Without CHAMELEON:
real world trace replay19
Real-world Trace Replay
  • Handling system changes
  • With periodic un-throttling
other system management scenarios
1: monitor network/server behaviors

4: Execute or

Distribute actiondecision to execution points

2: establish/maintain modelsfor system

behaviors

3: Analyze and make actiondecision

Other system management scenarios?
  • Automate the observe-analyze-act loop for other self-management scenarios
  • Example: CHAMELEON for network applications
    • Example: A proxy in front of server farm
future work
Future Work
  • Better methods to improve model accuracy
  • More general constraint solver
  • Combining with other actions
  • CHAMELEON in other scenarios
  • CHAMELEON for reliability and failure
references
References
  • L. Yin, S. Uttamchandani, J. Palmer, R. Katz, G. Agha, “AUTOLOOP: Automated Action Selection in the ``Observe-Analyze-Act’’ Loop for Storage Systems”, submitted for publication, 2005
  • S. Uttamchandani, L. Yin, G. Alvarez, J. Palmer, G. Agha, “CHAMELEON: a self-evovling, fully-adaptive resource arbitrator for storage systems”, to appear in USENIX Annual Technical Conference (USENIX’05), 2005
  • S. Uttamchandani, K. Voruganti, S. Srinivasan, J. Palmer , D. Pease, “Polus: Growing Storage QoS Management beyond a 4-year old kid”, 3rd USENIX Conference on File and Storage Technologies (FAST’04), 2004
ad