Jim Gray Talk at University of Tokyo
Personal views on PITAC report: invest in long-term research
Preview of Turing lecture: 10 long-term research problems
Bush: Summarize info in cyberspace
Turing: Intelligent Computers
7 9s: build systems that are always up and prove it
5-Minute rule
Presidential Advisory Committee on High Performance Computing and Communications, Information Technologies, and the Next Generation Internet
Information Technology industry spends LITTLE on long-term research.
It is not in their best interest.
Computer science research is different from the application of computers to some discipline.
Cache 1, 2
Main (1, 2, 3 if NUMA).
Disk (1 (cached), 2)
Tape (1 (mounted), 2)
Today's Storage Hierarchy: Speed & Capacity vs Cost Tradeoffs
[Figure, two panels: "Size vs Speed" (Typical System (bytes) against Access Time (seconds)) and "Price vs Speed" (against Access Time (seconds))]
The 5 Minute Rule Derived
M$: cost of a RAM page = RAMprice / (PageSize x Lifetime)
A$: cost of a disk access = DiskPrice / (AccessesPerSec x Lifetime)
RI: Reference Interval, the time between accesses to a page
Disk access cost: A$ / RI
Break-even: M$ = A$ / Reference Interval
Reference Interval = A$ / M$ = (DiskPrice x PageSize) / (RAMprice x AccPerSec)
Reference Interval = Time
(1) Technology term: PageSize / DiskAccPerSec ~ 8KB : 80 = 100:1
(2) Economic term: DiskPrice / RAM_MB_Price ~ 400:4 = 100:1
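The break-even rule above can be sketched as a product of the technology term and the economic term. This is a minimal sketch, assuming the page factor is counted as pages per MB of RAM so that the result comes out in seconds. The function name and the round inputs are hypothetical and are not the slide's figures.

```python
def break_even_interval_seconds(pages_per_mb_ram, disk_accesses_per_sec,
                                disk_price, ram_price_per_mb):
    """Reference interval at which caching a page in RAM costs the same
    as re-reading it from disk on every access."""
    # Technology term: (pages/MB) / (accesses/s) gives seconds per MB.
    technology_term = pages_per_mb_ram / disk_accesses_per_sec
    # Economic term: ($/disk) / ($/MB of RAM) gives MB of RAM per disk.
    economic_term = disk_price / ram_price_per_mb
    return technology_term * economic_term


# Hypothetical round inputs, chosen only to illustrate the arithmetic.
interval = break_even_interval_seconds(
    pages_per_mb_ram=100,
    disk_accesses_per_sec=100,
    disk_price=300,
    ram_price_per_mb=1,
)
print(f"break-even reference interval: {interval:.0f} s")
```

Pages referenced more often than this interval are cheaper to keep in RAM; pages referenced less often are cheaper to fetch from disk each time.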
At 10 MB/s: 1.2 days to scan
1,000 x parallel: 100 seconds SCAN.
Parallelism: divide a big problem into many smaller ones to be solved in parallel.
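The scan figures above can be checked with a few lines of arithmetic. This is a sketch only: it assumes the data set is 1 TB (10^12 bytes), which is what the quoted times imply at 10 MB/s, and the function name is hypothetical.

```python
TB = 10**12  # bytes; assumed data set size
MB = 10**6   # bytes


def scan_seconds(data_bytes, bytes_per_sec, parallelism=1):
    """Time to read all the data when it is split evenly across
    `parallelism` devices that each stream at `bytes_per_sec`."""
    return data_bytes / (bytes_per_sec * parallelism)


serial = scan_seconds(TB, 10 * MB)            # one device
parallel = scan_seconds(TB, 10 * MB, 1000)    # 1,000 devices side by side

print(f"serial scan: {serial / 86400:.2f} days")
print(f"1,000 x parallel scan: {parallel:.0f} seconds")
```

The speedup is linear only because a scan splits cleanly into independent pieces, one per device.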
The 1 TB disc card
An array of discs
Can be used as
1 striped disc
10 Fault Tolerant discs
LOTS of accesses/second
27 hr Scan
Scan in 27 hours.
many independent tape robots
(like a disc farm)
Optical is cheap: 200 $/platter
=> 100 $/GB (2x cheaper than disc)
Tape is cheap: 30 $/tape
=> 1.5 $/GB (100x cheaper than disc).
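The per-GB prices above are just media price divided by capacity. In this sketch the capacities (2 GB per platter, 20 GB per tape) are not stated on the slide; they are the values implied by its own price pairs, and the function name is hypothetical.

```python
def dollars_per_gb(media_price, capacity_gb):
    """Media cost per gigabyte stored."""
    return media_price / capacity_gb


# Capacities are assumptions inferred from the quoted $/GB figures.
optical = dollars_per_gb(media_price=200, capacity_gb=2)
tape = dollars_per_gb(media_price=30, capacity_gb=20)

print(f"optical: {optical:.1f} $/GB")
print(f"tape: {tape:.1f} $/GB")
```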
The Myth: seek or pick time dominates
The reality: (1) Queuing dominates
(2) Transfer dominates BLOBs
(3) Disk seeks often short
Implication: many cheap servers better than one fast expensive server
This is now obvious for disk arrays
This will be obvious for tape arrays