ANALYZING STORAGE SYSTEM WORKLOADS

Paul G. Sikalinda, Pieter S. Kritzinger
{psikalin, [email protected]
DNA Research Group, Computer Science Department, University of Cape Town

and

Lourens O. Walters
[email protected]
Mosaic Software, Rondebosch, Cape Town, Republic of South Africa




Presentation Outline

Introduction

Motivation and Objectives

Storage Systems

Storage System Workloads

The Storage System Workload Analyzed

Statistical Methodology

Workload Analysis Results

Conclusions

Future Work



Introduction

The DNA Group specializes, among other things, in using theory, formal methods and software tools in the:

– specification of …

– design of …

– modelling of …

– building of …

– security of …

– *workload analysis of …

– correctness analysis of …

– performance analysis of …

concurrent computing systems (CCS).



Introduction (cont’d)

[Figure: diagram with the labels PROCESSOR, RQ and RP, listing Start Address, Operation Type, Request Size, Timestamps, etc.]



Motivation and Objectives

Considerable effort is being spent on improving the I/O subsystem because it is a bottleneck in current computer systems.

-Workload modelling is an important component in the design, performance evaluation and correctness evaluation of storage systems.

Common assumptions are not correct:

-Uniform distribution of start addresses,

-Exponential inter-arrival times.

Therefore storage system workloads should be analyzed in order to arrive at correct models.
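One quick way to probe the exponential assumption is the coefficient of variation (CV), which equals 1 for an exponential distribution. The sketch below is only an illustration on synthetic data, not the method or the data of the presentation; the function name and the two generated samples are assumptions made for the example.

```python
import random
import statistics


def coefficient_of_variation(samples):
    """Sample standard deviation divided by the sample mean."""
    return statistics.stdev(samples) / statistics.fmean(samples)


# Synthetic inter-arrival times, for illustration only (not the trace data).
rng = random.Random(0)
exponential_gaps = [rng.expovariate(1.0) for _ in range(20000)]
# A bursty mix: mostly short gaps, occasionally a very long one.
bursty_gaps = [
    rng.expovariate(10.0) if rng.random() < 0.9 else rng.expovariate(0.1)
    for _ in range(20000)
]

cv_exponential = coefficient_of_variation(exponential_gaps)
cv_bursty = coefficient_of_variation(bursty_gaps)
print(f"exponential CV: {cv_exponential:.2f}")
print(f"bursty CV:      {cv_bursty:.2f}")
```

A CV well above 1 on measured inter-arrival times would suggest that an exponential model understates the variability of the workload.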



Motivation and Objectives (cont’d)

Storage system workload analysis is useful for:

-Designing storage systems.

-Designing I/O optimization techniques (read caching, write caching, pre-fetching, I/O parallelism, I/O rescheduling) to improve performance.

-Understanding application behavior and requirements.

-Deciding to pool storage system resources (SSPs).

-Implementing intelligent storage systems.

etc.


Motivation and Objectives (cont’d)

Our aim was to analyze storage system workloads in terms of

inter-arrival times,

sizes and

“seek distances” of I/O requests

and provide statistics for these parameters, to be used to:

(a) derive models for storage system evaluation and

(b) design optimization techniques (read caching, I/O parallelism etc. )
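The three parameters named above can be derived from consecutive trace records. The sketch below is a minimal illustration; it assumes that a "seek distance" is the signed difference between the start addresses of consecutive requests, which the slides do not define, and the sample values are hypothetical.

```python
def inter_arrival_times(timestamps):
    """Differences between consecutive request timestamps."""
    return [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]


def seek_distances(start_addresses):
    """Signed differences between consecutive start addresses (assumed definition)."""
    return [nxt - cur for cur, nxt in zip(start_addresses, start_addresses[1:])]


# Hypothetical trace columns: timestamps in microseconds, addresses in blocks.
timestamps = [0, 500, 700, 1900]
start_addresses = [1000, 400, 400, 2500]
request_sizes = [8192, 8192, 16384, 32768]  # sizes are read directly from the trace

print(inter_arrival_times(timestamps))
print(seek_distances(start_addresses))
```

Other definitions of seek distance are possible, for example measuring from the end of the previous request rather than its start.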


Storage Systems

Enterprise Storage System (ESS)

[Figure: ESS components, in the order listed on the slide: path to host, host/bus adapter, path to cache, cache, path to controller, array controller, path to disks, disk drives.]


Storage Systems (cont’d)

ESS are powerful disk storage systems with the following capabilities:

-High performance*,

-Large capacity and high availability,

-Protection against physical drive failure, which can be provided using RAID methods.

*But they still cannot match processor speeds because of the mechanical processes in the disk drives.


Storage System Workloads

[Figure: labels shown on the slide, in transcript order: Application Software, I/O request, Operating System, File System, I/O request, Disk System.]

I/O Request Servicing and workload classification:

-Logical Workloads (File System Workloads)

-Storage System Workloads (Physical I/O Traffic)



Storage System Workloads (cont’d)

Workload Parameters:

-Logical Volume Number

*Start Address (seek distances)

*Request Size

-Operation Type (i.e., read or write)

*Time Stamp (inter-arrival times)
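The workload parameters listed above map naturally onto one record per I/O request. The sketch below is an assumption for illustration: the `IORequest` type, the comma-separated field order and the sample line are all hypothetical and are not taken from the trace files used in the presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IORequest:
    volume: int          # logical volume number
    start_address: int   # start address of the request
    size_bytes: int      # request size
    is_read: bool        # operation type: True for read, False for write
    timestamp: float     # time stamp of the request


def parse_request(line):
    """Parse one line in the assumed 'volume,address,size,op,timestamp' layout."""
    volume, address, size, op, stamp = (field.strip() for field in line.split(","))
    op = op.upper()
    if op not in ("R", "W"):
        raise ValueError(f"unknown operation type: {op!r}")
    return IORequest(int(volume), int(address), int(size), op == "R", float(stamp))


# Hypothetical trace line, for illustration only.
print(parse_request("0, 20941264, 8192, W, 0.551706"))
```

A line with too few or too many fields raises `ValueError` during unpacking, so malformed records are not silently accepted.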



The Storage System Workload Analyzed

We analyzed inter-arrival times, request sizes, and “seek distances” of I/O requests from a system running a web search engine.

The I/O trace files were obtained from the Storage Performance Council (SPC) (http://www.storageperformance.org).



Statistical Methodology

Visual Techniques:

-Histogram and

-Empirical cumulative distribution function (ECDF) graphs.

Key Data Statistics:

-Sample mean,

-Variance and standard deviation,

-Coefficients of skewness, kurtosis and variation,

-Five-number data summaries (minimum, lower quartile, median, upper quartile, maximum),

-Lower and upper outlier limits.
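The key statistics listed above can be computed with the Python standard library. This is a sketch under stated assumptions, not the authors' tooling: the outlier limits use the common 1.5 × IQR rule, skewness and kurtosis use moment-based definitions (kurtosis is 3 for a normal distribution), and the quartile method is the library's "inclusive" interpolation; the slides do not say which definitions were used.

```python
import statistics


def summarize(samples):
    """Return the key data statistics for a list of numbers."""
    data = sorted(samples)
    n = len(data)
    mean = statistics.fmean(data)
    variance = statistics.variance(data, mean)  # sample variance (n - 1 divisor)
    std = variance ** 0.5
    # Central moments with an n divisor, used for skewness and kurtosis.
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    return {
        "mean": mean,
        "variance": variance,
        "std": std,
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
        "variation": std / mean,
        "five_number": (data[0], q1, median, q3, data[-1]),
        "outlier_limits": (q1 - 1.5 * iqr, q3 + 1.5 * iqr),  # assumed rule
    }


# Hypothetical sample, for illustration only.
for name, value in summarize([1, 2, 3, 4, 5]).items():
    print(name, value)
```

The function expects at least two values that are not all equal, since the variance must be non-zero for skewness and kurtosis to be defined.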


Results 1: inter-arrival times (µs)



Results 1: inter-arrival times

-Highly variable data. Range: 126 to 100100 microseconds.

-The coefficient of kurtosis shows that the distribution is heavy-tailed.
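A heavy tail can also be inspected through the ECDF, one of the visual techniques named in the methodology. The sketch below builds an ECDF as a step function and reads off the fraction of samples beyond a threshold; the helper name and the sample values are hypothetical and are not the trace data.

```python
import bisect


def make_ecdf(samples):
    """Return a function giving the fraction of samples less than or equal to x."""
    data = sorted(samples)
    n = len(data)

    def ecdf(x):
        return bisect.bisect_right(data, x) / n

    return ecdf


# Hypothetical inter-arrival times in microseconds, for illustration only.
gaps = [130, 150, 150, 200, 400, 90000]
ecdf = make_ecdf(gaps)
threshold = 1000
print("fraction at or below threshold:", ecdf(threshold))
print("fraction in the tail:", 1.0 - ecdf(threshold))
```

Plotting `1 - ecdf(x)` on logarithmic axes is a common way to make a slowly decaying tail visible.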



Results 2: Request sizes (bytes)



Results 2: Request sizes

Distribution peaks: 8192 (60%), 16384 (10%), 24576 (9%) and 32768 (20%) bytes.

Reason: the OS filesystem block size is 8192 bytes, and every peak is a multiple of it (2, 3 and 4 blocks for the larger peaks).
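Peaks of this kind show up directly in an empirical probability mass function. The sketch below counts how often each request size occurs and what share of requests is a whole multiple of a block size; the function names and the short sample list are assumptions for illustration, not the measured workload.

```python
from collections import Counter


def empirical_pmf(values):
    """Map each distinct value to its relative frequency."""
    counts = Counter(values)
    total = len(values)
    return {value: count / total for value, count in sorted(counts.items())}


def share_of_block_multiples(values, block_size=8192):
    """Fraction of values that are whole multiples of block_size."""
    return sum(1 for v in values if v % block_size == 0) / len(values)


# Hypothetical request sizes in bytes, for illustration only.
sizes = [8192, 8192, 8192, 16384, 32768]
print(empirical_pmf(sizes))
print(share_of_block_multiples(sizes))
```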



Results 3: Seek distances (blocks)



Results 3: Seek distances

-The distribution of seek distances is symmetrical.
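Symmetry can be checked numerically as well as visually. The sketch below uses the quartile (Bowley) skewness, which is 0 when the quartiles sit symmetrically around the median; this measure is a choice made here for illustration, the slides do not say how symmetry was assessed, and the sample values assume signed seek distances.

```python
import statistics


def quartile_skewness(samples):
    """Bowley skewness: (Q3 + Q1 - 2 * median) / (Q3 - Q1)."""
    q1, median, q3 = statistics.quantiles(samples, n=4, method="inclusive")
    iqr = q3 - q1
    if iqr == 0:
        raise ValueError("quartile skewness is undefined when Q1 equals Q3")
    return (q3 + q1 - 2 * median) / iqr


# Hypothetical signed seek distances in blocks, for illustration only.
symmetric = [-4, -2, 0, 2, 4]
lopsided = [0, 0, 1, 5, 10]
print(quartile_skewness(symmetric))
print(quartile_skewness(lopsided))
```

Values near 0 are consistent with a symmetrical distribution; values near +1 or -1 indicate a strongly lopsided one.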



Conclusions

(1) Analyzing storage system workloads is necessary to properly model the workloads:

To model the inter-arrival times of the Web workload, the Weibull, lognormal, beta, gamma and exponential probability density functions should be considered.

To model the request sizes and seek distances of the Web workload, a probability mass function is more appropriate.

*We intend to use the models in simulations of ESS.
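In a simulation, a probability mass function can drive a request generator directly. The sketch below draws request sizes from a given PMF; the function name, the seed handling and the example probabilities are assumptions for illustration and are not the model built by the authors.

```python
import random


def sample_from_pmf(pmf, count, seed=None):
    """Draw `count` values according to a {value: probability} mapping."""
    rng = random.Random(seed)
    values = list(pmf)
    weights = [pmf[value] for value in values]
    return rng.choices(values, weights=weights, k=count)


# Hypothetical PMF over request sizes in bytes, for illustration only.
size_pmf = {8192: 0.75, 16384: 0.25}
simulated_sizes = sample_from_pmf(size_pmf, count=10, seed=1)
print(simulated_sizes)
```

Because `random.choices` normalizes the weights, the probabilities do not have to sum exactly to 1, which is convenient when the PMF comes from rounded percentages.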



Conclusions (cont’d)

(2) The analysis results are useful when designing optimization techniques for storage systems, e.g.:

-Cache management block size – 8192 bytes.

-I/O rescheduling and background tasking would be ideal for the workload.

-The storage system handling the workload we analyzed can be optimized to handle the symmetrical behavior*.

*The results are not broadly applicable.
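To illustrate why the cache management block size matters, the sketch below counts how many cache blocks a single request touches. It assumes byte addressing and a hypothetical helper name; the slides give only the 8192-byte figure, not a cache design.

```python
def cache_blocks_touched(start_byte, size_bytes, block_size=8192):
    """Number of block_size-aligned cache blocks covered by a request."""
    if size_bytes <= 0:
        raise ValueError("size_bytes must be positive")
    first_block = start_byte // block_size
    last_block = (start_byte + size_bytes - 1) // block_size
    return last_block - first_block + 1


# Hypothetical requests, for illustration only.
print(cache_blocks_touched(0, 8192))      # aligned, one block long
print(cache_blocks_touched(4096, 8192))   # same size, but misaligned
print(cache_blocks_touched(0, 32768))     # four blocks long
```

When request sizes are multiples of the block size and requests are aligned, each request maps onto whole cache blocks with no partially used block.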



Conclusions (cont’d)

(3) Other conclusions:

-Request sizes are influenced by the filesystem in use.

-Seek distances are not always uniformly distributed.

*In summary, we have provided statistics about the parameters for the storage system workload that we analyzed and have shown how we can use them to derive models and design I/O optimization techniques.



Future Work

-Rigorously find a probability density function matching a given data set of inter-arrival times.

-Analyze the storage system workloads in terms of other parameters (e.g., logical volume numbers and operation types).
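One possible starting point for the first item is to fit candidate distributions and compare their Kolmogorov-Smirnov distance to the data. The sketch below is only an illustration on synthetic data and is not the authors' planned method; the function names are hypothetical, the exponential and lognormal fits use maximum-likelihood estimates, and the remaining candidates (Weibull, beta, gamma) are left out for brevity.

```python
import math
import random


def ks_statistic(samples, cdf):
    """Largest gap between the empirical CDF of samples and a model CDF."""
    data = sorted(samples)
    n = len(data)
    distance = 0.0
    for i, x in enumerate(data):
        model = cdf(x)
        # Compare the model with the ECDF just after and just before x.
        distance = max(distance, (i + 1) / n - model, model - i / n)
    return distance


def fitted_exponential_cdf(samples):
    rate = len(samples) / sum(samples)  # maximum-likelihood rate

    def cdf(x):
        return 1.0 - math.exp(-rate * x) if x > 0 else 0.0

    return cdf


def fitted_lognormal_cdf(samples):
    logs = [math.log(x) for x in samples]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / len(logs))

    def cdf(x):
        if x <= 0:
            return 0.0
        return 0.5 * (1.0 + math.erf((math.log(x) - mu) / (sigma * math.sqrt(2.0))))

    return cdf


# Synthetic lognormal inter-arrival times, for illustration only.
rng = random.Random(0)
gaps = [rng.lognormvariate(0.0, 1.5) for _ in range(5000)]
d_exp = ks_statistic(gaps, fitted_exponential_cdf(gaps))
d_logn = ks_statistic(gaps, fitted_lognormal_cdf(gaps))
print(f"exponential fit: D = {d_exp:.3f}")
print(f"lognormal fit:   D = {d_logn:.3f}")
```

A smaller distance indicates a closer fit. Because the parameters are estimated from the same data, the standard Kolmogorov-Smirnov critical values do not apply directly, so the distances are best read as a ranking rather than as a formal test.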



THANK YOU FOR YOUR ATTENTION!

?

