
 Optimized Caching Policies for Storage Systems


Presentation Transcript


  1. Optimized Caching Policies for Storage Systems. Amir Rachum, Chai Ronen. Final presentation. Industrial Supervisor: Dr. Roee Engelberg, LSI

  2. Introduction – Storage Tiering. System data is stored across different types of storage devices. Generally speaking, in data storage, for a given price, the higher the speed, the lower the volume. The idea is to enable the use of large, low-cost disk space while keeping the benefits of high-speed hardware, i.e., to place data so that overall disk access is as fast as possible. This requires a dynamic algorithm for managing (migrating) the data across the tiers.
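As a rough illustration of the capacity/speed tradeoff behind tiering, consider the sketch below. The Tier struct, its fields, and the numbers are assumptions made for this example, not part of the project.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Illustrative only: a tier trades capacity for speed, so frequently
// accessed ("hot") chunks should migrate to the faster, smaller tier.
struct Tier {
    std::string   name;
    std::uint64_t capacityBytes;   // smaller on faster tiers
    double        accessCostUs;    // lower on faster tiers
};

// Example configuration mirroring the tradeoff above (numbers made up):
const std::vector<Tier> kTiers = {
    {"SSD", 8ULL * 1024,     5.0},   // small but fast
    {"HDD", 1ULL << 30,   5000.0},   // large but slow
};
```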

  3. Goals: Creating a platform that allows us to test different algorithms in system-specific scenarios. Testing several algorithms and finding the optimal algorithm among them for storage tiering in different scenarios.

  4. Methodology: We coded a simulator that represents the platform running the tiered storage system. We created several data structures that represent the data on the system, track its location at all times, record read/write operations, and capture several other unique features. We used a recording of real I/O calls from such a system to simulate an actual scenario.
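A minimal sketch of how such trace replay and per-chunk tracking could look. ChunkStats, ChunkMap, replayTrace, and the "offset R/W" trace format are illustrative assumptions, not the project's actual code.

```cpp
#include <cstdint>
#include <istream>
#include <unordered_map>

using TierId  = int;            // 0 = fastest tier (e.g. SSD)
using ChunkId = std::uint64_t;  // logical offset / chunk size

struct ChunkStats {
    TierId        tier   = 1;   // assume chunks start on the lower tier
    std::uint64_t reads  = 0;
    std::uint64_t writes = 0;
};

using ChunkMap = std::unordered_map<ChunkId, ChunkStats>;

// Replay a trace of "<offset> <R|W>" lines against the chunk map,
// accumulating read/write counts per chunk.
inline void replayTrace(std::istream& trace, ChunkMap& chunks,
                        std::uint64_t chunkSize) {
    std::uint64_t offset;
    char op;
    while (trace >> offset >> op) {
        ChunkStats& c = chunks[offset / chunkSize];
        (op == 'R' ? c.reads : c.writes) += 1;
    }
}
```

For example, under these assumptions a trace line "4096 R" with a 16-byte chunk size increments the read count of chunk 256.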

  5. Accomplishments
  • Created an Algorithm interface that supports any algorithm, multiple tiers, and multiple platform data structures.
  • Our design is generic enough to enable very easy addition of usage statistics and platform data.
  • A CLI enabled quick input of the trace file, chunk size, and tier information.
  • Varying the chunk size let us research its effect on run time and algorithm effectiveness.
  • We implemented 2 caching algorithms (see the sketch after this slide):
    • A “naïve” algorithm that transfers every chunk to the top tier upon I/O
    • A more efficient algorithm that minimizes migrations
  • Smart implementation resulted in low disk space usage for the various data structures (we used a default tier).
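One possible shape for such an Algorithm interface, with the naïve policy and a migration-minimizing variant. The class names, the onAccess signature, and the promotion-threshold rule are assumptions for illustration, not the project's actual API.

```cpp
#include <cstdint>
#include <unordered_map>

using ChunkId = std::uint64_t;
using TierId  = int;                          // 0 = top (fastest) tier

// Hypothetical interface: the platform calls onAccess for every I/O and
// migrates the chunk to the returned tier if it is not already there.
class Algorithm {
public:
    virtual ~Algorithm() = default;
    virtual TierId onAccess(ChunkId chunk) = 0;
};

// "Naive" policy: promote every accessed chunk to the top tier.
class NaiveAlgorithm : public Algorithm {
public:
    TierId onAccess(ChunkId) override { return 0; }
};

// Migration-minimizing sketch: promote a chunk only after it has been
// accessed `threshold` times (the threshold rule is an assumption).
class ThresholdAlgorithm : public Algorithm {
public:
    explicit ThresholdAlgorithm(unsigned threshold) : threshold_(threshold) {}
    TierId onAccess(ChunkId chunk) override {
        return (++hits_[chunk] >= threshold_) ? 0 : 1;
    }
private:
    unsigned threshold_;
    std::unordered_map<ChunkId, unsigned> hits_;
};
```

Keeping the policy behind a single interface like this is what lets the platform swap algorithms, tiers, and statistics without changing the simulator itself.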

  6. Algorithm conclusions
  We ran 3 different scenarios:
  • Small chunk size (16 B), small SSD size (64 B, 4× the chunk size)
  • Large chunk size (2048 B), (relatively) small SSD size (8192 B, 4× the chunk size)
  • Small chunk size (16 B), relatively large SSD size (8192 B, 512× the chunk size)

  7. Algorithm conclusions
  When using an extremely small SSD (4× the chunk size), both caching algorithms are ineffective:
  • The naïve one showed a high number of reads from the higher tier, yet had twice as many migrations between tiers.
  • The smart algorithm, despite having half the migrations of the naïve algorithm, showed very little reading from the higher tier.
  In this case, the dummy algorithm proved very efficient, as it saved all the time needed for relatively useless migrations.

  8. Algorithm Conclusions (16/64)

  9. Algorithm conclusions: When running with a large chunk size and an SSD 4× that size, the caching algorithms achieved much better results than the dummy algorithm. However, the 2 caching algorithms did not differ significantly from each other.

  10. Algorithm Conclusions (2048/8192)

  11. Algorithm conclusions: Running with a small chunk size and a large SSD size, the 2 caching algorithms also gave similar results. However, they were far inferior to the results from the previous run.

  12. Algorithm Conclusions (16/8192)

  13. General Conclusions
  • Chunk size greatly affects the runtime of the platform, but a “standard” size does not take long to run.
  • Smart usage of Boost greatly decreases work and is very effective.
  • Good implementation can result in huge disk space savings (see the default-tier sketch after this slide).
  • Despite the data structures available in the platform, most non-naïve algorithms also need their own data structure of some sort.
  • Working with Git source control proved to be very helpful:
    • Retrieving old code that was once thought to be obsolete.
    • Collaboration.
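As a rough illustration of the default-tier idea credited with the disk space savings above: if chunks sitting on the default tier are simply absent from the placement map, the map grows only with the number of migrated chunks. The TierMap class and its methods below are assumptions for illustration, not the project's actual implementation.

```cpp
#include <cstdint>
#include <unordered_map>

using ChunkId = std::uint64_t;
using TierId  = int;

// Sparse placement map: a chunk with no entry is implicitly on the
// default tier, so only migrated chunks consume memory/disk space.
class TierMap {
public:
    explicit TierMap(TierId defaultTier) : defaultTier_(defaultTier) {}

    TierId tierOf(ChunkId chunk) const {
        auto it = placements_.find(chunk);
        return it == placements_.end() ? defaultTier_ : it->second;
    }

    void migrate(ChunkId chunk, TierId to) {
        if (to == defaultTier_) placements_.erase(chunk);  // back to default: drop the entry
        else                    placements_[chunk] = to;
    }

private:
    TierId defaultTier_;
    std::unordered_map<ChunkId, TierId> placements_;
};
```

Under this layout, a workload that promotes only a small hot set keeps the map tiny even when the backing volume holds millions of chunks.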
