Observe-Analyze-Act Paradigm for Storage System Resource Arbitration - PowerPoint PPT Presentation

Observe analyze act paradigm for storage system resource arbitration l.jpg
Download
1 / 23

  • 337 Views
  • Uploaded on
  • Presentation posted in: Pets / Animals

Observe-Analyze-Act Paradigm for Storage System Resource Arbitration. Li Yin 1 Email: yinli@eecs.berkeley.edu Joint work with: Sandeep Uttamchandani 2 Guillermo Alvarez 2 John Palmer 2 Randy Katz 1 1:University of California, Berkeley 2: IBM Almaden Research Center. Outline.

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.

Download Presentation

Observe-Analyze-Act Paradigm for Storage System Resource Arbitration

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -

Presentation Transcript


Observe analyze act paradigm for storage system resource arbitration l.jpg

Observe-Analyze-Act Paradigm for Storage System Resource Arbitration

Li Yin1

Email: yinli@eecs.berkeley.edu

Joint work with: Sandeep Uttamchandani2

Guillermo Alvarez2

John Palmer2

Randy Katz1

1:University of California, Berkeley

2: IBM Almaden Research Center


Outline l.jpg

Outline

  • Observe-analyze-act in storage system: CHAMELEON

    • Motivation

    • System model and architecture

    • Design details

    • Experimental results

  • Observe-analyze-act in other scenarios

    • Example: network applications

  • Future challenges


Need for run time system management l.jpg

E-mail

Application

Web-server

Application

Data Warehousing

SLAGoals

Storage Virtualization

(Mapping Application-data to Storage Resources)

Need for Run-time System Management

  • Static resource allocation is not enough

    • Incomplete information of the access characteristics: workload variations; change of goals

    • Exception scenarios: hardware failures; load surges.


Approaches for run time storage system management l.jpg

Approaches for Run-time Storage System Management

  • Today: Administrator observe-analyze-act

  • Automate the observe-analyze-act:

    • Rule-based system

      • Complexity

      • Brittleness

    • Pure feedback-based system

      • Infeasible for real-world multi-parameter tuning

    • Model-based approaches

      • Challenges:

        • How to represent system details as models?

        • How to create/evolve models?

        • How to use models for decision making?


System model for resource arbitration l.jpg

Host 1

Host 2

Host n

Throttling Points

QoS DecisionModule

InterconnectionFabric Switch m

InterconnectionFabric Switch 1

controller

controller

System Model for Resource Arbitration

  • Input:

    • SLAs for workloads

    • Current system status (performance)

  • Output:

    • Resource reallocation action (Throttling decisions)

controller


Our solution chameleon l.jpg

Managed System

Current States

Designer-defined Policies

Retrigger

Piece-wiseLinear Programming

Component Models

Models

WorkloadModels

ReasoningEngine

ActionModels

KnowledgeBase

Our Solution: CHAMELEON

Observe

Analyze

Act

Incremental ThrottlingStep Size

Current States

Feedback

Throttling Value

ThrottlingExecutor


Knowledge base component model l.jpg

Knowledge Base: Component Model

  • Objective:Predict service time for a given load at a component (For example: storage controller).

    Service_timecontroller = L( request size, read write ratio, random sequential ratio, request rate)

  • An example of component model

    • FAStT900, 30 disks, RAID0

    • Request Size 10KB, Read/Write Ratio 0.8, Random Access


Component model cont l.jpg

Component Model (cont.)

  • Quadratic Fit

    • S = 3.284, r = 0.838

  • Linear Fit

    • S = 3.8268, r = 0.739

  • Non-saturated case: Linear Fit

    • S = 0.0509, r = 0.989


Knowledge base workload model l.jpg

Knowledge Base: Workload Model

  • Objective:Predict the load on component i as a function of the request rate j

  • Example:

    • Workload with 20KB request size, 0.642 read/write ratio and 0.026 sequential access ratio

Component_loadi,j= Wi,j( workload j request rate)


Knowledge base action model l.jpg

Knowledge Base: Action Model

  • Objective: Predict the effect of corrective actions on workload requirements

  • Example:

Workload J request Rate = Aj(Token Issue Rate for Workload J)


Analyze module reasoning engine l.jpg

Analyze Module: Reasoning Engine

  • Formulated as a constraint solving problem

    • Part 1: Predict Action Behavior: For each candidate throttling decision, predict its performance result based on knowledge base

    • Part 2: Constraint Solving: Use linear programming technique to scan all feasible solutions and choose the optimal one


Reasoning engine predict result l.jpg

latency

WorkloadRequest Rate

Token Issue Rate

load

Workload Model

ComponentLoad

Workload Request Rate

Reasoning Engine: Predict Result

  • Chain all models together to predict action result

  • Input: Token issue rate for each workloads

  • Output: Expected latency

Action Model

Workload 1

Component Model

Workload n


Reasoning engine constraint solving l.jpg

FAILED

EXCEED

MEET

LUCKY

Reasoning Engine: Constraint Solving

Latency(%)

  • Formulated using Linear Programming

  • Formulation:

    • Variable: Token issue rate for each workload

    • Objective Function:

      • Minimize number of workloads violating their SLA goals

      • Workloads are as close to their SLA IO rate as possible

      • Example:

    • Constraints:

      • Workloads should meet their SLA latency goals

1

1

IOps(%)

0

Minimize ∑paipbi[ SLAi – T(current_throughputi, ti)]

SLAi

where pai= Workload priority pbi = Quadrant priority


Act module throttling executor l.jpg

Yes

No

Execute Reasoning Engine Output with Step-wise Throttling Proportional toConfidence Value

Execute DesignerPolicies

Act Module: Throttling Executor

ReasoningEngine Invoked

  • Hybrid of feedback and prediction

  • Ability to switch to rule-based (policies) when confidence value is low

  • Ability to re-trigger reasoning engine

Confidence Value < Threshold

Re-triggerReasoningEngine

Continue

Throttling

Analyze System

States


Experimental results l.jpg

Experimental Results

  • Test-bed configuration:

    • IBM x-series 440 server (2.4GHz 4-way with 4GB memory, redhat server 2.1 kernel)

    • FAStT 900 controller

    • 24 drives (RAID0)

    • 2Gbps FibreChannel Link

  • Tests consist of:

    • Synthetic workloads

    • Real-world trace replay (HP traces and SPC traces)


Experimental results synthetic workloads l.jpg

Experimental Results: Synthetic Workloads

  • Effect of priority values on the output of constraint solver

  • Effect of model errors on output of the constraint solver

(a) Equal priority(b) Workload priorities(c ) Quadrant priorities

(a) Without feedback(b) with feedback


Experiment result real world trace replay l.jpg

Experiment Result: Real-world Trace Replay

  • Real-world block-level traces from HP (cello96 trace) and SPC (web server)

  • A phased synthetic workload acts as the third flow

  • Test goals:

    • Do they converge to SLAs?

    • How reactive the system is?

    • How does CHAMELEON handle unpredictable variations?


Real world trace replay l.jpg

Real-world Trace Replay

  • With throttling

  • Without CHAMELEON:


Real world trace replay19 l.jpg

Real-world Trace Replay

  • Handling system changes

  • With periodic un-throttling


Other system management scenarios l.jpg

1: monitor network/server behaviors

4: Execute or

Distribute actiondecision to execution points

2: establish/maintain modelsfor system

behaviors

3: Analyze and make actiondecision

Other system management scenarios?

  • Automate the observe-analyze-act loop for other self-management scenarios

  • Example: CHAMELEON for network applications

    • Example: A proxy in front of server farm


Future work l.jpg

Future Work

  • Better methods to improve model accuracy

  • More general constraint solver

  • Combining with other actions

  • CHAMELEON in other scenarios

  • CHAMELEON for reliability and failure


References l.jpg

References

  • L. Yin, S. Uttamchandani, J. Palmer, R. Katz, G. Agha, “AUTOLOOP: Automated Action Selection in the ``Observe-Analyze-Act’’ Loop for Storage Systems”, submitted for publication, 2005

  • S. Uttamchandani, L. Yin, G. Alvarez, J. Palmer, G. Agha, “CHAMELEON: a self-evovling, fully-adaptive resource arbitrator for storage systems”, to appear in USENIX Annual Technical Conference (USENIX’05), 2005

  • S. Uttamchandani, K. Voruganti, S. Srinivasan, J. Palmer , D. Pease, “Polus: Growing Storage QoS Management beyond a 4-year old kid”, 3rd USENIX Conference on File and Storage Technologies (FAST’04), 2004


Slide23 l.jpg

Questions?

Email: yinli@eecs.berkeley.edu


  • Login