
Observe-Analyze-Act Paradigm for Storage System Resource Arbitration

Li Yin (1)

Email: [email protected]

Joint work with: Sandeep Uttamchandani (2), Guillermo Alvarez (2), John Palmer (2), Randy Katz (1)

(1) University of California, Berkeley

(2) IBM Almaden Research Center


Outline

  • Observe-analyze-act in storage system: CHAMELEON

    • Motivation

    • System model and architecture

    • Design details

    • Experimental results

  • Observe-analyze-act in other scenarios

    • Example: network applications

  • Future challenges


Need for Run-time System Management

[Figure: applications (e-mail, web server, data warehousing) with SLA goals, on top of storage virtualization (mapping application data to storage resources)]

  • Static resource allocation is not enough

    • Incomplete information about access characteristics: workload variations; changing goals

    • Exception scenarios: hardware failures; load surges.


Approaches for Run-time Storage System Management

  • Today: Administrator observe-analyze-act

  • Automate the observe-analyze-act:

    • Rule-based system

      • Complexity

      • Brittleness

    • Pure feedback-based system

      • Infeasible for real-world multi-parameter tuning

    • Model-based approaches

      • Challenges:

        • How to represent system details as models?

        • How to create/evolve models?

        • How to use models for decision making?


System Model for Resource Arbitration

[Figure: Host 1 … Host n, throttling points, QoS decision module, interconnection fabric switches 1 … m, storage controllers]

  • Input:

    • SLAs for workloads

    • Current system status (performance)

  • Output:

    • Resource reallocation action (throttling decisions)


Our Solution: CHAMELEON

[Figure: CHAMELEON architecture as an observe-analyze-act loop around the managed system. Labeled elements: current states; knowledge base (component models, workload models, action models); reasoning engine (piece-wise linear programming); designer-defined policies; throttling executor (incremental throttling step size, feedback, throttling value, retrigger)]


Knowledge Base: Component Model

  • Objective: Predict service time for a given load at a component (for example, a storage controller).

    $\text{Service\_time}_{\text{controller}} = L(\text{request size}, \text{read/write ratio}, \text{random/sequential ratio}, \text{request rate})$

  • An example of component model

    • FAStT900, 30 disks, RAID0

    • Request Size 10KB, Read/Write Ratio 0.8, Random Access


Component Model (cont.)

  • Quadratic Fit

    • S = 3.284, r = 0.838

  • Linear Fit

    • S = 3.8268, r = 0.739

  • Non-saturated case: Linear Fit

    • S = 0.0509, r = 0.989
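The gap between the full-range fits and the non-saturated linear fit can be illustrated with a small sketch. It assumes that S denotes the standard error of the fit and r the correlation coefficient, which the slide does not define. The measurements, the `SATURATION_RATE` cut-off, and all function names are made up for illustration and are not taken from the FAStT900 experiments.

```python
import math

def linear_fit(xs, ys):
    """Ordinary least-squares line y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def fit_quality(xs, ys, a, b):
    """Return (S, r): residual standard error and correlation coefficient.

    r is computed as sqrt(1 - SS_res / SS_tot), which for a least-squares
    line equals the absolute value of the Pearson correlation.
    """
    n = len(xs)
    mean_y = sum(ys) / n
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    s = math.sqrt(ss_res / (n - 2))  # two fitted parameters: a and b
    r = math.sqrt(max(0.0, 1.0 - ss_res / ss_tot))
    return s, r

# Made-up measurements: request rate (IOps) versus service time (ms).
# Service time grows slowly at first and blows up once the component saturates.
rates = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]
times = [1.21, 1.39, 1.62, 1.78, 2.01, 2.19, 4.0, 8.0, 15.0, 25.0]

SATURATION_RATE = 600  # hypothetical cut-off for the non-saturated region

# Fit a line over the whole range.
a_all, b_all = linear_fit(rates, times)
s_all, r_all = fit_quality(rates, times, a_all, b_all)

# Fit a line over the non-saturated region only.
xs_ns = [x for x in rates if x <= SATURATION_RATE]
ys_ns = [y for x, y in zip(rates, times) if x <= SATURATION_RATE]
a_ns, b_ns = linear_fit(xs_ns, ys_ns)
s_nonsat, r_nonsat = fit_quality(xs_ns, ys_ns, a_ns, b_ns)

print(f"all data:      S = {s_all:.4f}, r = {r_all:.3f}")
print(f"non-saturated: S = {s_nonsat:.4f}, r = {r_nonsat:.3f}")
```

On data shaped like this, the line fitted to the non-saturated region has a much smaller standard error and a correlation coefficient close to 1, which matches the qualitative pattern reported on the slide.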


Knowledge Base: Workload Model

  • Objective: Predict the load on component i as a function of the request rate of workload j

    $\text{Component\_load}_{i,j} = W_{i,j}(\text{request rate of workload } j)$

  • Example:

    • Workload with 20KB request size, 0.642 read/write ratio and 0.026 sequential access ratio


Knowledge Base: Action Model

  • Objective: Predict the effect of corrective actions on workload requirements

  • Example:

    $\text{Request\_rate}_j = A_j(\text{token issue rate for workload } j)$


Analyze Module: Reasoning Engine

  • Formulated as a constraint solving problem

    • Part 1: Predict Action Behavior: For each candidate throttling decision, predict its performance result based on knowledge base

    • Part 2: Constraint Solving: Use linear programming technique to scan all feasible solutions and choose the optimal one


Reasoning Engine: Predict Result

  • Chain all models together to predict action result

  • Input: Token issue rate for each workload

  • Output: Expected latency

[Figure: for each workload 1 … n, token issue rate → Action Model → workload request rate → Workload Model → component load → Component Model → latency]
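The model chain can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the CHAMELEON implementation: the three model functions are hypothetical stand-ins (a capped action model, a linear workload model, a linear non-saturated component model), per-workload loads are assumed to add up at a single component, and all names and constants are invented for the example.

```python
from dataclasses import dataclass

@dataclass
class Workload:
    name: str
    demand_iops: float       # hypothetical: request rate issued when unthrottled
    load_per_request: float  # hypothetical slope of the workload model W

def action_model(w: Workload, token_issue_rate: float) -> float:
    """A_j: token issue rate -> workload request rate.

    Assumption: a workload can issue no more requests than it has tokens,
    and no more than it actually demands.
    """
    return min(w.demand_iops, token_issue_rate)

def workload_model(w: Workload, request_rate: float) -> float:
    """W_{i,j}: workload request rate -> load on the component (assumed linear)."""
    return w.load_per_request * request_rate

def component_model(total_load: float) -> float:
    """L: component load -> latency in ms (assumed linear, non-saturated)."""
    base_ms = 1.0
    load_per_ms = 500.0  # hypothetical: 500 load units add 1 ms
    return base_ms + total_load / load_per_ms

def predict_latency(workloads, token_rates) -> float:
    """Chain action -> workload -> component models for one candidate decision."""
    total_load = 0.0
    for w in workloads:
        request_rate = action_model(w, token_rates[w.name])
        total_load += workload_model(w, request_rate)
    return component_model(total_load)

workloads = [Workload("w1", 500.0, 1.0), Workload("w2", 800.0, 0.5)]

# Candidate decision 1: w2 is effectively unthrottled.
latency_before = predict_latency(workloads, {"w1": 400.0, "w2": 1000.0})
# Candidate decision 2: throttle w2 down to 400 tokens per second.
latency_after = predict_latency(workloads, {"w1": 400.0, "w2": 400.0})

print(latency_before, latency_after)
```

Evaluating `predict_latency` once per candidate throttling decision is the "predict action behavior" step; the constraint solver then chooses among the candidates.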


Reasoning Engine: Constraint Solving

  • Formulated using Linear Programming

  • Formulation:

    • Variable: Token issue rate for each workload

    • Objective Function:

      • Minimize number of workloads violating their SLA goals

      • Workloads are as close to their SLA IO rate as possible

      • Example:

        $\text{Minimize} \sum_i p_{a_i}\, p_{b_i}\, \frac{SLA_i - T(\text{current\_throughput}_i, t_i)}{SLA_i}$

        where $p_{a_i}$ = workload priority and $p_{b_i}$ = quadrant priority

    • Constraints:

      • Workloads should meet their SLA latency goals

[Figure: latency (%) versus IOps (%) plane with axis ticks at 0 and 1, divided into four quadrants labeled FAILED, EXCEED, MEET, and LUCKY]
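The formulation can be made concrete with a toy solver. The sketch below deliberately swaps linear programming for an exhaustive grid search so that it stays dependency-free; it is not the solver used in CHAMELEON. The form of T (throughput capped by the token rate), the latency predictor, the search range (token rates from 0 up to the SLA rate), the quadrant priority fixed at 1.0, and all numbers are assumptions made for illustration.

```python
from itertools import product

# Hypothetical workloads: SLA IO rate, SLA latency, current throughput,
# workload priority (pa) and quadrant priority (pb, fixed at 1.0 here).
workloads = [
    {"name": "w1", "sla_iops": 400.0, "sla_latency_ms": 2.5,
     "current_iops": 500.0, "pa": 2.0, "pb": 1.0},
    {"name": "w2", "sla_iops": 600.0, "sla_latency_ms": 2.5,
     "current_iops": 800.0, "pa": 1.0, "pb": 1.0},
]

def throughput(current_iops, token_rate):
    """T(current_throughput_i, t_i). Assumption: throttling caps the rate."""
    return min(current_iops, token_rate)

def predicted_latency_ms(rates):
    """Stand-in for the chained models: linear in the total request rate."""
    return 1.0 + sum(rates) / 500.0

def objective(token_rates):
    """Sum of pa_i * pb_i * (SLA_i - T_i) / SLA_i over all workloads."""
    total = 0.0
    for w, t in zip(workloads, token_rates):
        t_i = throughput(w["current_iops"], t)
        total += w["pa"] * w["pb"] * (w["sla_iops"] - t_i) / w["sla_iops"]
    return total

def feasible(token_rates):
    """Constraint: every workload meets its SLA latency goal."""
    rates = [throughput(w["current_iops"], t)
             for w, t in zip(workloads, token_rates)]
    latency = predicted_latency_ms(rates)
    return all(latency <= w["sla_latency_ms"] for w in workloads)

def solve(step=50.0):
    """Scan a grid of token issue rates and return (best score, best rates)."""
    grids = [[i * step for i in range(int(w["sla_iops"] / step) + 1)]
             for w in workloads]
    best = None
    for candidate in product(*grids):
        if not feasible(candidate):
            continue
        score = objective(candidate)
        if best is None or score < best[0]:
            best = (score, candidate)
    return best

best_score, best_rates = solve()
print(best_score, best_rates)
```

With these made-up numbers the latency constraint limits the combined rate to 750 IOps, so the higher-priority workload w1 receives its full SLA rate of 400 and w2 is throttled to 350. A real deployment would hand the same objective and constraints to an LP solver instead of enumerating a grid.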


Act Module: Throttling Executor

  • Hybrid of feedback and prediction

  • Ability to switch to rule-based (policies) when confidence value is low

  • Ability to re-trigger reasoning engine

[Figure: flowchart. When the reasoning engine is invoked, test whether the confidence value is below a threshold. Yes: execute designer policies. No: execute the reasoning engine output with step-wise throttling proportional to the confidence value. Then analyze system states and either continue throttling or re-trigger the reasoning engine.]
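One step of the executor's decision can be sketched as follows. This is a hedged illustration of the hybrid behavior described above, not the actual executor: the threshold value, the fallback rule (cut the token rate by 10%), and the function names are all hypothetical.

```python
CONFIDENCE_THRESHOLD = 0.5  # hypothetical threshold, not from the slides

def designer_policy(current_rate):
    """Hypothetical rule-based fallback: cut the token issue rate by 10%."""
    return current_rate * 0.9

def throttle_step(current_rate, target_rate, confidence):
    """Return (new token issue rate, source of the decision).

    Low confidence: fall back to the designer-defined policy.
    Otherwise: move toward the reasoning engine's target, with a step size
    proportional to the confidence value.
    """
    if confidence < CONFIDENCE_THRESHOLD:
        return designer_policy(current_rate), "policy"
    step = confidence * (target_rate - current_rate)
    return current_rate + step, "reasoning-engine"

# High confidence: take 75% of the way from 1000 toward the target of 600.
confident_rate, confident_source = throttle_step(1000.0, 600.0, 0.75)
# Low confidence: ignore the target and apply the fallback rule.
fallback_rate, fallback_source = throttle_step(1000.0, 600.0, 0.2)

print(confident_rate, confident_source)
print(fallback_rate, fallback_source)
```

Calling `throttle_step` repeatedly, and re-checking the observed system state between calls, gives the incremental, feedback-driven behavior; a large gap between predicted and observed state would be the cue to re-trigger the reasoning engine.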


Experimental Results

  • Test-bed configuration:

    • IBM xSeries 440 server (2.4 GHz 4-way with 4 GB memory, Red Hat server 2.1 kernel)

    • FAStT 900 controller

    • 24 drives (RAID0)

    • 2 Gbps Fibre Channel link

  • Tests consist of:

    • Synthetic workloads

    • Real-world trace replay (HP traces and SPC traces)


Experimental Results: Synthetic Workloads

  • Effect of priority values on the output of the constraint solver

    [Figure panels: (a) Equal priority (b) Workload priorities (c) Quadrant priorities]

  • Effect of model errors on the output of the constraint solver

    [Figure panels: (a) Without feedback (b) With feedback]


Experiment Result: Real-world Trace Replay

  • Real-world block-level traces from HP (cello96 trace) and SPC (web server)

  • A phased synthetic workload acts as the third flow

  • Test goals:

    • Do the workloads converge to their SLAs?

    • How reactive is the system?

    • How does CHAMELEON handle unpredictable variations?


Real-world Trace Replay

  • With throttling

  • Without CHAMELEON


Real-world Trace Replay (cont.)

  • Handling system changes

  • With periodic un-throttling


Other system management scenarios?

  • Automate the observe-analyze-act loop for other self-management scenarios

  • Example: CHAMELEON for network applications

    • Example: A proxy in front of a server farm

[Figure: four-step loop. 1: Monitor network/server behaviors. 2: Establish/maintain models for system behaviors. 3: Analyze and make action decision. 4: Execute or distribute action decision to execution points.]


Future Work

  • Better methods to improve model accuracy

  • More general constraint solver

  • Combining with other actions

  • CHAMELEON in other scenarios

  • CHAMELEON for reliability and failure


References

  • L. Yin, S. Uttamchandani, J. Palmer, R. Katz, G. Agha, “AUTOLOOP: Automated Action Selection in the ‘Observe-Analyze-Act’ Loop for Storage Systems”, submitted for publication, 2005

  • S. Uttamchandani, L. Yin, G. Alvarez, J. Palmer, G. Agha, “CHAMELEON: a self-evolving, fully-adaptive resource arbitrator for storage systems”, to appear in USENIX Annual Technical Conference (USENIX’05), 2005

  • S. Uttamchandani, K. Voruganti, S. Srinivasan, J. Palmer, D. Pease, “Polus: Growing Storage QoS Management beyond a 4-year old kid”, 3rd USENIX Conference on File and Storage Technologies (FAST’04), 2004


Questions?

Email: [email protected]

