
Automated administration for storage system

Presentation by Amitayu Das

CSE 598B: Fall 2005 Presentation

Introduction
  • Major challenges in storage management
    • System design and configuration (device management)
    • Capacity Planning (space management)
    • Performance tuning (performance management)
    • High Availability (availability management)
    • Automation (all of the above, in a self-managing manner)

Motivation
  • Large disk arrays and networked storage provide huge capacities and high-bandwidth access, which makes consolidated storage systems practical.
  • Enterprise-scale storage systems contain hundreds of host computers and storage devices and up to tens of thousands of disks.
  • Designing, deploying and managing such systems at run time is very costly (often costlier than procuring the hardware).
  • The following slides look at these problems in greater detail.


Storage System life-cycle

[Figure: life-cycle loop with the stages Design/redesign, Configure/reconfigure and Monitor, driven by (dynamic) business requirements and applied to the storage devices.]

Storage administration functions
  • Data protection
  • Performance tuning
  • Planning and deployment
  • Monitoring and record-keeping
  • Diagnosis and repair

A few notable attempts
  • System-managed storage (IBM)
  • Attribute-managed storage (HP)
  • Replication
    • RAID
    • Online snapshot support
    • Remote replication
    • Online archival
  • Interposed request routing
  • Smart file-system switches

The design problem
  • Given a pool of resources and a workload, choose appropriate devices, configure them and assign the workload to the configured storage.
  • The solution is not straightforward because:
    • The system is huge, there are thousands of design choices, and many choices have unforeseen consequences.
    • Personnel with detailed knowledge of applications’ storage behavior are in short supply and hence quite expensive.
    • The design process is tedious and complicated to do by hand, usually leading to solutions that are grossly over-provisioned, substantially under-performing or, in the worst case, both.
    • Once a design is in place, implementing it is time-consuming, tedious and error-prone.
    • A mistake in any of these steps is difficult to identify and can result in a failure to meet the performance requirements.


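Storage System life-cycle: design/configuration

[Figure: the life-cycle loop (Design/redesign, Configure/reconfigure, Monitor, driven by (dynamic) business requirements and applied to the storage devices), repeated for the design/configuration stage.]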

System design and assignment problem

[Figure: four applications supply workload requirements, and the storage devices supply their abilities, to an assignment engine. Its outputs are the system configuration and the assignment of the workload to the storage system.]

Initial system design
  • Problem: convert workloads, business needs and device characteristics into assignment of stores and streams to devices
  • One approach: constraint-based multi-dimensional bin-packing
  • Sample constraints (# of devices = 1):
    • - Sum of store sizes ≤ capacity
    • - Sum of stream utilizations ≤ 1.0
  • Sample objective functions:
    • - Minimize cost
    • - Balance load

[Figure: stores with a required size and I/O rate are packed onto drives of fixed capacity. How many drives? Holding which data?]
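The two sample constraints above can be illustrated with a greedy packing heuristic. The following is a minimal sketch, not the solver used by any of the cited tools: it assumes identical devices, treats stream utilization as a single number per store, and uses first-fit decreasing with "minimize the number of devices" standing in for the cost objective. The names `Store`, `Device` and `first_fit_decreasing` are made up for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    name: str
    size_gb: float       # capacity the store needs
    utilization: float   # fraction of one device's time its streams consume


@dataclass
class Device:
    capacity_gb: float
    stores: list = field(default_factory=list)

    def fits(self, store):
        """Check both sample constraints for this device."""
        used_gb = sum(s.size_gb for s in self.stores)
        used_util = sum(s.utilization for s in self.stores)
        return (used_gb + store.size_gb <= self.capacity_gb      # sum of sizes <= capacity
                and used_util + store.utilization <= 1.0)        # sum of utilizations <= 1.0


def first_fit_decreasing(stores, capacity_gb):
    """Greedy two-dimensional bin packing onto identical devices (hypothetical helper)."""
    devices = []
    # Place the most demanding stores first, judged by their larger normalized demand.
    ordered = sorted(stores,
                     key=lambda s: max(s.size_gb / capacity_gb, s.utilization),
                     reverse=True)
    for store in ordered:
        if store.size_gb > capacity_gb or store.utilization > 1.0:
            raise ValueError(f"{store.name} does not fit on any single device")
        # First fit: reuse the first device with room, otherwise add a new one.
        target = next((d for d in devices if d.fits(store)), None)
        if target is None:
            target = Device(capacity_gb)
            devices.append(target)
        target.stores.append(store)
    return devices


stores = [Store("A", 60, 0.5), Store("B", 50, 0.25),
          Store("C", 30, 0.5), Store("D", 10, 0.25)]
devices = first_fit_decreasing(stores, capacity_gb=100)
for i, d in enumerate(devices, start=1):
    print(i, [s.name for s in d.stores])
```

A greedy heuristic such as this one is fast but not guaranteed to find the cheapest design, which is why the other optimization approaches listed in the references exist.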

Initial system design –> disk arrays
  • Problem:
    • extending the single disk solution to disk arrays
    • The space of array designs is potentially huge:
      • LUN sizes and RAID levels, stripe unit sizes, disks in LUNs
      • More work needed before the solver can run
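To give a feel for why the space of array designs is "potentially huge", the sketch below simply counts combinations. All option lists are invented for illustration and do not describe any real array; the count treats LUNs as distinguishable, so it is an upper bound on the number of distinct designs.

```python
from itertools import product

# Hypothetical per-LUN choices; real arrays expose different options.
RAID_LEVELS = ["RAID 1/0", "RAID 5"]
STRIPE_UNIT_KB = [8, 16, 32, 64, 128]
DISKS_PER_LUN = [2, 4, 6, 8]
LUN_SIZE_GB = [50, 100, 200]


def lun_designs():
    """Yield every combination of the per-LUN parameters above."""
    for raid, stripe, disks, size in product(
            RAID_LEVELS, STRIPE_UNIT_KB, DISKS_PER_LUN, LUN_SIZE_GB):
        yield {"raid": raid, "stripe_kb": stripe, "disks": disks, "size_gb": size}


def count_array_designs(num_luns):
    """Number of ways to pick a design for each of num_luns LUNs independently."""
    per_lun = sum(1 for _ in lun_designs())
    return per_lun ** num_luns


for n in (1, 2, 4, 8):
    print(n, count_array_designs(n))
```

Even with these small option lists the count grows exponentially with the number of LUNs, which is why the search has to be pruned before a solver can run.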


[Figure: Minerva control flow. The array designer is called as a subroutine by the allocator.]

[Figure: Minerva’s role in the storage system life cycle. Input and output are shown.]

Minerva running a sample workload

Merits/demerits
  • Merits:
    • Reasonable automation
  • Demerits:
    • Requires accurate models of workloads, performance requirements, and devices
    • Addresses only the mechanisms, not the policy


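Storage System life-cycle: redesign/reconfigure

[Figure: the life-cycle loop (Design/redesign, Configure/reconfigure, Monitor, driven by (dynamic) business requirements and applied to the storage devices), repeated for the redesign/reconfigure stage.]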

System redesign/reconfiguration
  • Events triggering redesign/reconfiguration:
    • new application added
    • new users added
    • system load increases
    • hardware/software upgraded
    • device fails
    • new storage arrives
    • performance tuning

[Figure: the events above turn a Running System into a Reconfigured System.]

Iterative storage management loop

[Figure: a loop of three steps, Analyze workload, Design new system and Implement design, entered on events triggering reconfiguration.]

Hippodrome
  • Two objectives:
    • The automated loop must converge on a viable design that meets the workload’s requirements without over- or under-provisioning.
    • It must converge to a stable final system as quickly as possible, with as little input as possible from its users.

Components of Hippodrome
  • Analysis component (1)
  • Performance model component (2)
  • Solver components (3)
  • Migration component (4)

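[Figure: the four components in the loop. Analysis (1) turns the workload into a summary; the solver (3) proposes a candidate design and the performance model (2) returns its utilization; the finalized design goes to migration (4).]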
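The interplay of the four components can be sketched as a toy loop. This is an illustration under strong assumptions, not Hippodrome's actual algorithms: a design is just a number of identical devices, the workload is a single I/O rate, and `DEVICE_IOPS` and `TARGET_UTILIZATION` are invented constants. The one behavior it tries to capture is that the analysis step can only observe as much load as the current system is able to serve, so an under-provisioned starting point grows over several iterations.

```python
DEVICE_IOPS = 100.0        # hypothetical capability of one device
TARGET_UTILIZATION = 0.8   # hypothetical ceiling the solver aims for


def analyze(true_demand_iops, num_devices):
    """Analysis (1): observed load is capped by what the current system can deliver."""
    return min(true_demand_iops, num_devices * DEVICE_IOPS)


def utilization(summary_iops, num_devices):
    """Performance model (2): predicted utilization of a candidate design."""
    return summary_iops / (num_devices * DEVICE_IOPS)


def solve(summary_iops):
    """Solver (3): smallest candidate design that the model predicts is acceptable."""
    candidate = 1
    while utilization(summary_iops, candidate) > TARGET_UTILIZATION:
        candidate += 1
    return candidate


def migrate(current_devices, new_devices):
    """Migration (4): stand-in for moving data onto the new design."""
    return new_devices


def management_loop(true_demand_iops, initial_devices=1, max_iterations=50):
    """Iterate analyze -> solve -> migrate until the design stops changing."""
    devices = initial_devices
    history = [devices]
    for _ in range(max_iterations):
        summary = analyze(true_demand_iops, devices)
        candidate = solve(summary)
        if candidate == devices:      # converged: the design is stable
            break
        devices = migrate(devices, candidate)
        history.append(devices)
    return devices, history


print(management_loop(450.0))                      # starts under-provisioned
print(management_loop(450.0, initial_devices=20))  # starts over-provisioned
```

Starting from either too few or too many devices, the toy loop settles on the same design, which mirrors the two stated objectives: a viable design without over- or under-provisioning, reached without user input.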

Issues in system design and allocation
  • What optimization algorithms are most effective?
  • What optimization objectives and constraints produce reasonable designs?
    • ex: cost of reconfiguring system
  • What's the right part of the storage design space to explore?
    • ex: RAID level vs. stripe unit size vs. cache management parameters
  • What are reasonable general guidelines for tagging a store's RAID level?
  • What (other) decompositions of the design and allocation problem are reasonable?
  • How to generalize system design?
    • for SAN environment
    • for host and applications

Issues in reconfiguration
  • How to do system discovery?
    • e.g., existing state, presence of new devices
    • Dealing with inconsistent information
    • In a scalable fashion
  • How to abstractly describe storage devices?
    • For system discovery output
    • For input to tools that perform changes
  • How to automate the physical redesign process?
    • e.g., physical space allocation etc.
  • Events trigger the redesign decision
    • How do we decide when to reconfigure?
  • Reconfiguration inputs:
    • current system configuration/assignment
    • desired system configuration/assignment
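Given the two reconfiguration inputs, the simplest possible planner is a diff of the assignments. The sketch below assumes an assignment is a plain mapping from store name to device name; the function `plan_moves` and the example names are hypothetical, and a real migration planner would also have to order the moves and respect free space and performance during the move.

```python
def plan_moves(current, desired):
    """Return (store, source, destination) for every store whose placement changes.

    A source of None means the store is new; a destination of None means
    the store no longer appears in the desired assignment.
    """
    moves = []
    for store in sorted(desired):
        source = current.get(store)
        destination = desired[store]
        if source != destination:
            moves.append((store, source, destination))
    for store in sorted(set(current) - set(desired)):
        moves.append((store, current[store], None))
    return moves


current = {"db": "array1", "logs": "array1", "web": "array2"}
desired = {"db": "array1", "logs": "array2", "web": "array2", "mail": "array3"}
for move in plan_moves(current, desired):
    print(move)
```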

Self-* storage architecture

Administration and organization
  • Administrative interface
  • Supervisors
  • Administrative assistants
  • Data access and storage
  • Routers
  • Workers

Merits
  • Simpler storage administration
    • Data protection
    • Performance tuning
    • Planning and deployment
    • Monitoring and record-keeping
    • Diagnosis and repair

Demerits
  • The proposed solution is too simplistic to handle the issues raised.
  • The authors present the solution from a high-level viewpoint, but it is far from complete.
  • The implementation and evaluation are not convincing enough.
  • Not all aspects of “self-*” have been addressed as claimed.

Storage System life-cycle: virtualization

[Figure: the life-cycle loop (Design/redesign, Configure/reconfigure, Monitor, driven by (dynamic) business requirements), here labeled with Performance tuning.]

Runtime management problem
  • Often, enterprise customers outsource their storage needs to data centers.
  • At data centers, different workloads, applications and services share the underlying storage infrastructure.
  • Sharing (of disk drives, storage caches, network links, controllers etc.) can lead to interference between users and applications, and possibly to violations of performance-based QoS guarantees.
  • To prevent that, data centers need to insulate the users from each other through virtualization.

Need for virtualization
  • At data centers, the many enterprise servers that support different business processes, such as Web servers, file servers and database servers, may have very different performance requirements on their backend storage server.
  • Sophisticated resource allocation and scheduling technology is required to effectively isolate these logical storage servers as if they are separate physical storage servers.
  • Storage Virtualization refers to the technology that allows creation of a set of logical storage devices from a single physical storage structure.

Storage virtualization

[Figure: application clients see virtual disks through an abstract interface. The storage virtualization layer (storage management in the operating system) sits between them and the hardware resources: physical disks and controllers.]

  • Examples: LVM, xFS, StorageTank
  • Hides Physical details from high-level applications

Dimensions of virtualization
  • Commercial storage virtualization systems are rather limited because they can virtualize only storage capacity.
  • However, from the standpoint of storage clients or enterprise servers, virtual storage devices should be as tangible as physical disks.
  • There is a need to efficiently virtualize any standard attribute associated with a physical disk, such as capacity, bandwidth, latency and availability.
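One way to picture virtualization along several dimensions is an admission check that looks at more than capacity. The sketch below is only an illustration under simplifying assumptions: bandwidth is treated as additive, latency as a fixed floor of the hardware, and availability is left out. The classes `VirtualDisk` and `PhysicalStore` and the function `can_admit` are invented for this example and are not taken from the cited systems.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VirtualDisk:
    name: str
    capacity_gb: float
    bandwidth_mb_s: float   # bandwidth the client was promised
    max_latency_ms: float   # worst latency the client will accept


@dataclass(frozen=True)
class PhysicalStore:
    capacity_gb: float
    bandwidth_mb_s: float
    best_latency_ms: float  # lowest latency the hardware can promise


def can_admit(physical, admitted, request):
    """Admit a virtual disk only if every dimension still fits, not just capacity."""
    used_capacity = sum(v.capacity_gb for v in admitted)
    used_bandwidth = sum(v.bandwidth_mb_s for v in admitted)
    return (used_capacity + request.capacity_gb <= physical.capacity_gb
            and used_bandwidth + request.bandwidth_mb_s <= physical.bandwidth_mb_s
            and request.max_latency_ms >= physical.best_latency_ms)


physical = PhysicalStore(capacity_gb=1000, bandwidth_mb_s=200, best_latency_ms=5)
admitted = [VirtualDisk("a", 400, 120, 10)]

# Fits by capacity alone, but would oversubscribe bandwidth.
print(can_admit(physical, admitted, VirtualDisk("b", 300, 100, 10)))
# Fits in every dimension.
print(can_admit(physical, admitted, VirtualDisk("c", 300, 50, 10)))
```

A capacity-only system would accept the first request and later violate its bandwidth promise, which is the limitation the slide points at.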


Hardware Organization

[Figure: clients (application, kernel and storage clerk), a storage manager, and three storage servers, each attached to a disk array, connected by a gigabit network. Labels mark file and object interfaces and distinguish data/commands from control messages.]

A 2-level CVC Scheduler

[Figure: a client, the storage manager and three storage servers, with interactions numbered 1 to 7.]

References
  • Hippodrome: running circles around storage administration. Eric Anderson et al., FAST ’02, pp. 175-188, January 2002.
  • Minerva: an automated resource provisioning tool for large-scale storage systems. G. Alvarez et al., ACM Transactions on Computer Systems 19 (4): 483-518, November 2001.
  • Ergastulum: quickly finding near-optimal storage system designs. Eric Anderson et al., Technical Report, HP Laboratories.
  • Disk Array Models in Minerva. Arif Merchant et al., Technical Report, HP Laboratories.
  • Self-* Storage: Brick-based Storage with Automated Administration. G. Ganger et al., Technical Report, 2003.

References
  • SIGMETRICS ’00 Tutorial, HP Laboratories.
  • Optimization algorithms
    • Bin-packing Heuristics [Coffman84]
    • Toyoda Gradient [Toyoda75]
    • Simulated Annealing [Drexl88]
    • Relaxation Approaches [Pattipati90, Trick92]
    • Genetic Algorithms [Chu97]
  • Multidimensional Storage Virtualization. Lan Huang et al., SIGMETRICS ’04, New York, June 2004.
  • An Interposed 2-Level I/O Scheduling Framework for Performance Virtualization. J. Zhang et al., SIGMETRICS ’05.
  • Efficiency-aware disk schedulers:
    • Cello, Prism, YFQ


THANK YOU !!!
