
 Optimized Caching Policies for Storage Systems


Presentation Transcript


  1. Optimized Caching Policies for Storage Systems. Amir Rachum, Chai Ronen. Final presentation. Industrial Supervisor: Dr. Roee Engelberg, LSI

  2. Introduction – Storage Tiering. System data is stored across different types of storage devices. Generally speaking, in data storage, for a given price, the higher the speed, the lower the volume. The idea is to enable the use of large, low-cost disk space while keeping the benefits of high-speed hardware, i.e., to place data so that overall disk access is as fast as possible. This requires a dynamic algorithm for managing (migrating) the data across the tiers.
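As a rough illustration of the capacity/speed tradeoff behind tiering, consider the sketch below. The Tier struct, its fields, and the numbers are assumptions made for this example, not part of the project.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Illustrative only: a tier trades capacity for speed, so frequently
// accessed ("hot") chunks should migrate to the faster, smaller tier.
struct Tier {
    std::string   name;
    std::uint64_t capacityBytes;   // smaller on faster tiers
    double        accessCostUs;    // lower on faster tiers
};

// Example configuration mirroring the tradeoff above (numbers made up):
const std::vector<Tier> kTiers = {
    {"SSD", 8ULL * 1024,     5.0},   // small but fast
    {"HDD", 1ULL << 30,   5000.0},   // large but slow
};
```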

  3. Goals: Creating a platform that allows us to test different algorithms in system-specific scenarios. Testing several algorithms and finding the optimal algorithm among them for storage tiering in different scenarios.

  4. Methodology: We coded a simulator that represents the platform running the tiered storage system. We created several data structures that represent the data on the system, track its location at all times, record read/write operations, and capture several other unique features. We used a recording of real I/O calls from such a system to simulate an actual scenario.
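A minimal sketch of how such trace replay and per-chunk tracking could look. ChunkStats, ChunkMap, replayTrace, and the "offset R/W" trace format are illustrative assumptions, not the project's actual code.

```cpp
#include <cstdint>
#include <istream>
#include <unordered_map>

using TierId  = int;            // 0 = fastest tier (e.g. SSD)
using ChunkId = std::uint64_t;  // logical offset / chunk size

struct ChunkStats {
    TierId        tier   = 1;   // assume chunks start on the lower tier
    std::uint64_t reads  = 0;
    std::uint64_t writes = 0;
};

using ChunkMap = std::unordered_map<ChunkId, ChunkStats>;

// Replay a trace of "<offset> <R|W>" lines against the chunk map,
// accumulating read/write counts per chunk.
inline void replayTrace(std::istream& trace, ChunkMap& chunks,
                        std::uint64_t chunkSize) {
    std::uint64_t offset;
    char op;
    while (trace >> offset >> op) {
        ChunkStats& c = chunks[offset / chunkSize];
        (op == 'R' ? c.reads : c.writes) += 1;
    }
}
```

For example, under these assumptions a trace line "4096 R" with a 16-byte chunk size increments the read count of chunk 256.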

  5. Accomplishments
  • Created an Algorithm interface that supports any algorithm, multiple tiers, and multiple platform data structures.
  • Our design is generic enough to enable very easy addition of usage statistics and platform data.
  • A CLI enabled quick input of the trace file, chunk size, and tier information.
  • Varying the chunk size let us research its effect on run time and algorithm effectiveness.
  • We implemented 2 caching algorithms (see the sketch after this slide):
    • A “naïve” algorithm that transfers every chunk to the top tier upon I/O
    • A more efficient algorithm that minimizes migrations
  • Smart implementation resulted in low disk space usage for the various data structures (we used a default tier).
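One possible shape for such an Algorithm interface, with the naïve policy and a migration-minimizing variant. The class names, the onAccess signature, and the promotion-threshold rule are assumptions for illustration, not the project's actual API.

```cpp
#include <cstdint>
#include <unordered_map>

using ChunkId = std::uint64_t;
using TierId  = int;                          // 0 = top (fastest) tier

// Hypothetical interface: the platform calls onAccess for every I/O and
// migrates the chunk to the returned tier if it is not already there.
class Algorithm {
public:
    virtual ~Algorithm() = default;
    virtual TierId onAccess(ChunkId chunk) = 0;
};

// "Naive" policy: promote every accessed chunk to the top tier.
class NaiveAlgorithm : public Algorithm {
public:
    TierId onAccess(ChunkId) override { return 0; }
};

// Migration-minimizing sketch: promote a chunk only after it has been
// accessed `threshold` times (the threshold rule is an assumption).
class ThresholdAlgorithm : public Algorithm {
public:
    explicit ThresholdAlgorithm(unsigned threshold) : threshold_(threshold) {}
    TierId onAccess(ChunkId chunk) override {
        return (++hits_[chunk] >= threshold_) ? 0 : 1;
    }
private:
    unsigned threshold_;
    std::unordered_map<ChunkId, unsigned> hits_;
};
```

Keeping the policy behind a single interface like this is what lets the platform swap algorithms, tiers, and statistics without changing the simulator itself.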

  6. Algorithm conclusions
  We ran 3 different scenarios:
  • Small chunk size (16 B), small SSD size (64 B, 4× the chunk size)
  • Large chunk size (2048 B), (relatively) small SSD size (8192 B, 4× the chunk size)
  • Small chunk size (16 B), relatively large SSD size (8192 B, 512× the chunk size)

  7. Algorithm conclusions
  When using an extremely small SSD (4× the chunk size), both caching algorithms are ineffective:
  • The naïve one showed a high number of reads from the higher tier, yet had twice as many migrations between tiers.
  • The smart algorithm, despite having half the migrations of the naïve algorithm, showed very little reading from the higher tier.
  In this case, the dummy algorithm proved very efficient, as it saved all the time needed for relatively useless migrations.

  8. Algorithm Conclusions (16/64)

  9. Algorithm conclusions: When running with a large chunk size and an SSD 4× that size, the caching algorithms achieved much better results than the dummy algorithm. However, the 2 caching algorithms did not differ significantly from each other.

  10. Algorithm Conclusions (2048/8192)

  11. Algorithm conclusions: Running with a small chunk size and a large SSD size, the 2 caching algorithms also gave similar results. However, they were far inferior to the results from the previous run.

  12. Algorithm Conclusions (16/8192)

  13. General Conclusions
  • Chunk size greatly affects the runtime of the platform, but a “standard” size does not take long to run.
  • Smart usage of Boost greatly decreases work and is very effective.
  • Good implementation can result in huge disk space savings (see the default-tier sketch after this slide).
  • Despite the data structures available in the platform, most non-naïve algorithms also need their own data structure of some sort.
  • Working with Git source control proved to be very helpful:
    • Retrieving old code that was once thought to be obsolete.
    • Collaboration.
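As a rough illustration of the default-tier idea credited with the disk space savings above: if chunks sitting on the default tier are simply absent from the placement map, the map grows only with the number of migrated chunks. The TierMap class and its methods below are assumptions for illustration, not the project's actual implementation.

```cpp
#include <cstdint>
#include <unordered_map>

using ChunkId = std::uint64_t;
using TierId  = int;

// Sparse placement map: a chunk with no entry is implicitly on the
// default tier, so only migrated chunks consume memory/disk space.
class TierMap {
public:
    explicit TierMap(TierId defaultTier) : defaultTier_(defaultTier) {}

    TierId tierOf(ChunkId chunk) const {
        auto it = placements_.find(chunk);
        return it == placements_.end() ? defaultTier_ : it->second;
    }

    void migrate(ChunkId chunk, TierId to) {
        if (to == defaultTier_) placements_.erase(chunk);  // back to default: drop the entry
        else                    placements_[chunk] = to;
    }

private:
    TierId defaultTier_;
    std::unordered_map<ChunkId, TierId> placements_;
};
```

Under this layout, a workload that promotes only a small hot set keeps the map tiny even when the backing volume holds millions of chunks.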
