Five minute rule ten years later and other computer storage rules of thumb

“Five minute rule ten years later and other computer storage rules of thumb”

Authors: Jim Gray, Goetz Graefe

Reviewed by: Nagapramod Mandagere Biplob Debnath


Outline

  • Problem Statement

  • Motivation

  • Importance and Relevance

  • Main Contributions and Validation

  • Key Ideas

  • Illustrations

  • New Metrics

  • Assumptions

  • Re-write Today

  • Questions


Problem Statement

  • Broader Problem: Viewing developments over a long period of time to try and extract important technology trends.

  • Specific Instance: Inferring rules of thumb for buffer replacement policies in a number of settings, including RAID environments.

    • Given: Trends over time for parameters such as memory cost, disk cost, tape cost

    • Find: Rules of thumb for deciding where to store the data and when to replace data from memory buffer

    • Objectives: Simple rules, extensible rules

    • Constraints: Hierarchical Storage Model


Typical Database Administrator's Dilemma

The performance isn’t good. Am I doing something wrong?

Should I cache on the client?

Should I cache this data in memory?

Should I store data back on disk? (local or network disk)

Should I move data to tape?


Importance & Relevance

  • Parameters change at different rates

  • Seeks/second and disk capacity: 10x to 100x

  • Disk MB/k$ and DRAM MB/k$: 1000x


Importance & Relevance

  • The location of data is very important

    • Main Memory: Very Fast, Expensive, limited size

    • Disk Storage: Much slower than main memory, inexpensive, close to unlimited size

    • Tape Storage: Slowest, dirt cheap, unlimited capacity

  • How can one decide what data resides where?

    • System Learns from data access patterns and adapts (Admins hate to give up control)

    • Administrator controls data locality by using some experience or historical performance info (rules of thumb)


Main Contributions & Validation

  • The Five minute rule

    • Randomly accessed buffer pages can be replaced if unused for more than 5 minutes.

    • Sequentially accessed buffer pages can be replaced if unused for more than 1 minute.

  • Metrics for storage performance characterization

    • Cost/Access

    • Maps: Megabyte accesses per second

    • Scan: Time it takes to sequentially read or write all the data in the device

  • Validation Methodology

    • Examples

      • Random access

      • One pass sort

      • Two pass sort

    • Trends observed over a period of time
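The two replacement thresholds stated above (5 minutes for randomly accessed pages, 1 minute for sequentially accessed pages) can be read as a simple idle-time test. The following is a minimal Python sketch of that reading, not the authors' implementation; the function names, the tuple layout, and the "random"/"sequential" labels are hypothetical.

```python
# Thresholds taken from the slide; everything else here is illustrative.
RANDOM_THRESHOLD_S = 5 * 60      # five minute rule for randomly accessed pages
SEQUENTIAL_THRESHOLD_S = 1 * 60  # one minute rule for sequentially accessed pages


def is_replaceable(idle_seconds, access_pattern):
    """Return True if a page has been idle longer than its threshold.

    access_pattern is "random" or "sequential" (hypothetical labels).
    """
    thresholds = {
        "random": RANDOM_THRESHOLD_S,
        "sequential": SEQUENTIAL_THRESHOLD_S,
    }
    return idle_seconds > thresholds[access_pattern]


def replaceable_pages(pages, now):
    """Return ids of pages that may be replaced at time `now` (seconds).

    pages is an iterable of (page_id, last_access_time_s, access_pattern).
    """
    return [
        page_id
        for page_id, last_access, pattern in pages
        if is_replaceable(now - last_access, pattern)
    ]
```

A real buffer manager would combine such a test with its replacement policy; the sketch only shows how the two thresholds differ by access pattern.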


Key Ideas

  • Tradeoff between the cost of RAM and the cost of disk accesses.

    • The tradeoff is that caching pages in the extra memory can save disk IOs.

    • The break-even point is met when the rent on the extra memory for cache ($/page/sec) exactly matches the savings in disk accesses per second ($/disk_access/sec).
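The break-even condition can be written out as a short derivation. This is a sketch under two assumptions: the symbol T (the interval in seconds between references to a page) is introduced here for illustration, and purchase prices stand in for rents because both are amortized over the same period, so the amortization factor cancels. The parameter names are the ones used in the 1997 illustration.

```latex
\begin{align*}
\text{cost of holding one page in RAM}
  &= \frac{\text{PricePerMBofDRAM}}{\text{PagesPerMBofRAM}} \\[4pt]
\text{cost of one disk access every } T \text{ seconds}
  &= \frac{1}{T} \cdot \frac{\text{PricePerDiskDrive}}{\text{AccessesPerSecondPerDisk}} \\[4pt]
\text{setting the two equal:}\quad
T_{\text{break-even}}
  &= \frac{\text{PagesPerMBofRAM}}{\text{AccessesPerSecondPerDisk}}
     \times
     \frac{\text{PricePerDiskDrive}}{\text{PricePerMBofDRAM}}
\end{align*}
```

Pages referenced more often than once every T_break-even seconds are cheaper to keep in memory; pages referenced less often are cheaper to re-read from disk.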


Illustration – Typical System in 1997

  • For a system with the following characteristics

    • PagesPerMBofRAM = 128 pages/MB (8 KB pages)

    • AccessesPerSecondPerDisk = 64 accesses/sec/disk

    • PricePerDiskDrive = 2000 $/disk (9 GB + controller)

    • PricePerMBofDRAM = 15 $/MB_DRAM

  • The break-even inter-reference interval is about 266 seconds, or roughly 5 minutes
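The 266-second figure is the product of a technology ratio (pages per MB of RAM divided by accesses per second per disk) and an economic ratio (price per disk drive divided by price per MB of DRAM). A minimal Python sketch of that arithmetic follows; the function and variable names are chosen here for illustration.

```python
def break_even_interval_seconds(pages_per_mb_ram, accesses_per_sec_per_disk,
                                price_per_disk, price_per_mb_dram):
    """Break-even reference interval in seconds.

    technology ratio = pages per MB of RAM / accesses per second per disk
    economic ratio   = price per disk drive / price per MB of DRAM
    """
    technology_ratio = pages_per_mb_ram / accesses_per_sec_per_disk
    economic_ratio = price_per_disk / price_per_mb_dram
    return technology_ratio * economic_ratio


# 1997 values from the slide above.
interval_1997 = break_even_interval_seconds(
    pages_per_mb_ram=128,
    accesses_per_sec_per_disk=64,
    price_per_disk=2000,
    price_per_mb_dram=15,
)
print(f"{interval_1997:.1f} s = {interval_1997 / 60:.1f} min")  # 266.7 s = 4.4 min
```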


Illustration

  • One pass algorithms

    • Read data once and never reference it again

    • No need to cache the data in RAM

    • The system needs only enough buffer memory to allow data to stream from disk to main memory

    • Typically, two or three one-track buffers (~100 KB) are adequate per disk to buffer disk operations and allow the device to stream data to the application.


Illustration

  • Two pass algorithms

    • Sequential operations that read a large dataset and then revisit parts of the data

    • Examples: database join, cube, rollup, and sort operators

    • Sorting uses two passes if memory is smaller than the data set

    • Inter-reference time is typically about a minute (sequential data access)


Illustration – Two Pass Sort

  • A one pass sort needs more memory than a two pass sort

  • Memory needed by a one pass sort grows faster with the size of the input file

  • For files bigger than memory, two pass is the only option
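To make the memory comparison concrete, here is a Python sketch based on a simplified textbook model of external merge sort, which is an assumption and not necessarily the exact formula used by the authors. In this model a one pass sort must hold the whole file in memory, while a two pass sort with M bytes of memory writes runs of about M bytes and then merges up to M / B runs using one B-byte buffer per run, so it handles files up to about M² / B. The function names and the 64 KB buffer size are illustrative.

```python
import math

KB, MB, GB = 1024, 1024 ** 2, 1024 ** 3


def one_pass_memory_bytes(file_bytes):
    """One pass sort: the whole file must fit in memory (linear growth)."""
    return file_bytes


def two_pass_memory_bytes(file_bytes, buffer_bytes):
    """Two pass sort in the simplified model: M * (M / B) >= file size,
    so M is about sqrt(file size * buffer size) (square-root growth)."""
    return math.sqrt(file_bytes * buffer_bytes)


BUFFER_BYTES = 64 * KB  # hypothetical merge buffer size

for file_bytes in (100 * MB, 1 * GB, 10 * GB):
    one_pass = one_pass_memory_bytes(file_bytes) / MB
    two_pass = two_pass_memory_bytes(file_bytes, BUFFER_BYTES) / MB
    print(f"file {file_bytes / MB:8.0f} MB: "
          f"one pass {one_pass:8.0f} MB, two pass {two_pass:6.1f} MB")
```

Under this model a 1 GB file needs 1 GB of memory for one pass but only about 8 MB for two passes, which is why the one pass requirement grows so much faster with file size.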


Disk vs Tape Tradeoff

  • Tape vs. disk tradeoff

    • Tape: larger access penalty (slower access, lowest cost)

    • Solution: larger break-even point, bigger page size


New Metrics

  • Motivated by data flow applications that stream huge amounts of data, such as data mining and multimedia applications

  • New Metrics

    • Kaps

      • Kilobyte accesses per second

    • Maps

      • Megabyte accesses per second

    • Scan

      • Time taken to sequentially read or write all data on a device

  • These metrics combined with rent costs provide a price/performance metric
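As a rough illustration of how the three metrics relate to basic device parameters, the Python sketch below assumes that each access costs a fixed positioning latency plus the transfer time for the object, and that a scan runs at the full sequential transfer rate. This simple model, the function names, and the device numbers (10 ms latency, 10 MB/s, 9 GB) are assumptions made here, not figures from the slides.

```python
def accesses_per_second(object_bytes, access_latency_s, transfer_bytes_per_s):
    """Accesses/second for objects of a given size under the simple model:
    time per access = positioning latency + transfer time."""
    return 1.0 / (access_latency_s + object_bytes / transfer_bytes_per_s)


def scan_seconds(capacity_bytes, transfer_bytes_per_s):
    """Time to sequentially read or write the whole device."""
    return capacity_bytes / transfer_bytes_per_s


# Hypothetical disk, using decimal units for readability.
LATENCY_S = 0.010        # 10 ms average positioning time
TRANSFER_BPS = 10e6      # 10 MB/s sequential transfer rate
CAPACITY_BYTES = 9e9     # 9 GB

kaps = accesses_per_second(1e3, LATENCY_S, TRANSFER_BPS)   # 1 KB objects
maps = accesses_per_second(1e6, LATENCY_S, TRANSFER_BPS)   # 1 MB objects
scan = scan_seconds(CAPACITY_BYTES, TRANSFER_BPS)

print(f"Kaps = {kaps:.1f}, Maps = {maps:.1f}, Scan = {scan / 60:.0f} min")
```

In this model Kaps is dominated by positioning latency and Maps by transfer time, which is why the two metrics are reported separately.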


Assumptions

  • All disk storage has the same cost/performance characteristics. The analysis assumes that the disk storage system is homogeneous and does not consider the more recent shift towards hierarchical/heterogeneous storage systems.

  • The tradeoff considers only performance; security and fault tolerance are assumed to be uniform throughout.


Re-write

  • Re-evaluate the rules of thumb considering more recent costs and the more recent trends in storage systems like heterogeneous/hierarchical storage

    • Take into account SAN, NAS characteristics


Questions

  • Does the five minute rule still hold today?

  • No (with reservations)

    • If the page size is changed to the megabyte range, the five minute rule still applies.

      • Pages/MB of RAM = 16 (64 KB pages)

      • Accesses/sec/disk = 64

      • Price/disk drive = $400

      • Price/MB of RAM = $0.1

      • Break-even point ~ 1000 s

  • Further Evidence - Jim Gray (Keynote in FAST 2004) http://www.usenix.org/events/fast05/
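The sensitivity of the break-even point to page size can be checked with the same formula as in the 1997 illustration. The Python sketch below uses the prices and access rate listed above and assumes, as the slide does, that the disk still delivers 64 accesses per second regardless of page size; the function name and the set of page sizes swept are chosen here for illustration. Note that 16 pages per MB corresponds to 64 KB pages.

```python
def break_even_interval_seconds(page_kb, accesses_per_sec_per_disk,
                                price_per_disk, price_per_mb_ram):
    """Break-even reference interval in seconds for a given page size in KB."""
    pages_per_mb = 1024 / page_kb
    technology_ratio = pages_per_mb / accesses_per_sec_per_disk
    economic_ratio = price_per_disk / price_per_mb_ram
    return technology_ratio * economic_ratio


# Values from the slide: $400 per disk, $0.1 per MB of RAM, 64 accesses/sec/disk.
for page_kb in (8, 16, 32, 64):
    seconds = break_even_interval_seconds(page_kb, 64, 400, 0.1)
    print(f"{page_kb:3d} KB pages: {seconds:6.0f} s ({seconds / 60:5.1f} min)")
```

With these inputs, 8 KB pages give a break-even interval of about 8000 seconds, while 64 KB pages give the roughly 1000 seconds quoted above, so the interval shrinks in proportion to the page size.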

