
Rules of Thumb in Data Engineering

In this talk, Jim Gray discusses the rules of thumb in data engineering, including storage, networking, and caching. He also discusses the impact of Moore's Law and the importance of technology ratios.


Presentation Transcript


  1. Rules of Thumb in Data Engineering Jim Gray Microsoft Storage Lunch 10 July 2001 Gray@Microsoft.com, http://research.Microsoft.com/~Gray/Talks/

  2. Outline • Moore’s Law and consequences • Storage rules of thumb • Balanced systems rules revisited • Networking rules of thumb • Caching rules of thumb

  3. Meta-Message: Technology Ratios Matter • Price and Performance change. • If everything changes in the same way, then nothing really changes. • If some things get much cheaper/faster than others, then that is real change. • Some things are not changing much: • Cost of people • Speed of light • … • And some things are changing a LOT

  4. Trends: Moore’s Law • Performance/Price doubles every 18 months • 100x per decade • Progress in next 18 months = ALL previous progress • New storage = sum of all old storage (ever) • New processing = sum of all old processing • E. coli doubles every 20 minutes!

  5. Trends: ops/s/$ Had Three Growth Phases • 1890-1945: Mechanical, relay; 7-year doubling • 1945-1985: Tube, transistor; 2.3-year doubling • 1985-2000: Microprocessor; 1.0-year doubling

  6. So: a problem • Suppose you have a ten-year compute job on the world’s fastest supercomputer. What should you do? • Commit 250M$ now? • Or program for 9 years? Software speedup: 2^6 = 64x. Moore’s-law speedup: 2^6 = 64x. So: ~4,000x speedup: spend 1M$ (not 250M$) on hardware; the job runs in 2 weeks, not 10 years. • Homework problem: What is the optimum strategy?
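The homework problem can be sketched numerically. This is my own back-of-envelope model, not Gray's: if you wait t years before buying, hardware is 2^(t/1.5) times faster (18-month doubling), so a job needing `job_years` of today's compute finishes at t + job_years / 2^(t/1.5).

```python
# A hedged sketch of the optimal-wait problem under Moore's law.
# finish(t) = t + job_years / 2**(t / 1.5): wait t years, then run the
# remaining work on hardware that is 2**(t/1.5) times faster.

def finish_time(wait_years, job_years, doubling_years=1.5):
    speedup = 2 ** (wait_years / doubling_years)
    return wait_years + job_years / speedup

# Scan candidate wait times (0 to 20 years, 0.1-year steps) for a 10-year job.
total, wait = min((finish_time(t / 10, 10), t / 10) for t in range(0, 201))
print(f"wait {wait:.1f} years, finish after {total:.2f} years total")
```

Under this model, waiting a few years and then buying beats both extremes: the ten-year job finishes in well under ten years total.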

  7. Storage capacity beating Moore’s law 3 k$/TB today (raw disk) 1k$/TB by end of 2002

  8. Consequence of Moore’s law: need an address bit every 18 months. • Moore’s law gives you 2x more in 18 months. • RAM • Today we have 10 MB to 100 GB machines (24-36 bits of addressing). • In 9 years we will need 6 more bits: 30-42 bit addressing (4 TB RAM). • Disks • Today we have 10 GB to 100 TB file systems/DBs (33-47 bit file addresses). • In 9 years, we will need 6 more bits: 40-53 bit file addresses (100 PB files).
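The arithmetic behind the slide can be checked directly. The 4 GB example below is my own illustration, not from the deck: capacity doubling every 18 months means one more address bit per 18 months, about 6 bits every 9 years.

```python
import math

# Hedged arithmetic for the address-bit rule: one doubling = one more bit.
def extra_bits(years, doubling_months=18):
    return years * 12 / doubling_months

# Illustrative example: a 4 GB (32-bit) RAM today needs ~38 bits in 9 years.
bits_today = math.log2(4 * 2**30)   # 32.0
bits_later = bits_today + extra_bits(9)
print(bits_later)
```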

  9. Architecture could change this • 1-level store: • System/38, AS/400 has 1-level store. • Never re-uses an address. • Needs 96-bit addressing today. • NUMAs and Clusters • Willing to buy a 100 M$ computer? • Then add 6 more address bits. • Only 1-level store pushes us beyond 64 bits. • Still, these are “logical” addresses; 64-bit physical will last many years.

  10. Trends: Gilder’s Law: 3x bandwidth/year for 25 more years • Today: • 40 Gbps per channel (λ) • 12 channels per fiber (WDM): 500 Gbps • 32 fibers/bundle = 16 Tbps/bundle • In the lab: 3 Tbps/fiber (400x WDM) • In theory: 25 Tbps per fiber (1 fiber = 25 Tbps) • 1 Tbps = USA 1996 WAN bisection bandwidth • Aggregate bandwidth doubles every 8 months!

  11. Outline • Moore’s Law and consequences • Storage rules of thumb • Balanced systems rules revisited • Networking rules of thumb • Caching rules of thumb

  12. How much storage do we need? Everything recorded! • Soon everything can be recorded and indexed. • Most bytes will never be seen by humans. • Data summarization, trend detection, and anomaly detection are key technologies. • See Mike Lesk: How much information is there? http://www.lesk.com/mlesk/ksg97/ksg.html • See Lyman & Varian: How much information? http://www.sims.berkeley.edu/research/projects/how-much-info/ • [Figure: a scale ladder from a photo, a book, and a movie, up through all LoC books (words), all books + multimedia, to everything recorded, spanning kilo through yotta bytes (10^3 to 10^24).]

  13. Storage Latency: How Far Away is the Data? • Registers: 1 (my head, ~1 min) • On-chip cache: 2 (this room) • On-board cache: 10 (this campus, ~10 min) • Memory: 100 (Springfield, ~1.5 hr) • Disk: 10^6 (Pluto, ~2 years) • Tape/optical robot: 10^9 (Andromeda, ~2,000 years)

  14. Storage Hierarchy: Speed & Capacity vs Cost Tradeoffs • [Figure: two log-log charts against access time (10^-9 to 10^3 seconds): Price vs Speed ($/MB) and Size vs Speed (typical system bytes), spanning cache, main memory, secondary disc, online tape, nearline tape, and offline tape.]

  15. Disks: Today • Disk is 8 GB to 160 GB; 10-50 MBps; 5k-15k rpm (6 ms-2 ms rotational latency); 12 ms-7 ms seek; 2k$/IDE-TB, 15k$/SCSI-TB. • For shared disks, most time is spent waiting in queue for access to the arm/controller: wait, then seek, rotate, transfer.

  16. Standard Storage Metrics • Capacity: • RAM: MB and $/MB: today at 512 MB and 200$/GB • Disk: GB and $/GB: today at 80 GB and 70k$/TB • Tape: TB and $/TB: today at 40 GB and 10k$/TB (nearline) • Access time (latency): • RAM: 100 ns • Disk: 15 ms • Tape: 30 second pick, 30 second position • Transfer rate: • RAM: 1-10 GB/s • Disk: 10-50 MB/s (arrays can go to 10 GB/s) • Tape: 5-15 MB/s (arrays can go to 1 GB/s)

  17. New Storage Metrics: Kaps, Maps, SCAN • Kaps: how many kilobyte objects served per second • The file server, transaction-processing metric • This is the OLD metric. • Maps: how many megabyte objects served per second • The multimedia metric • SCAN: how long to scan all the data • The data mining and utility metric • And: Kaps/$, Maps/$, TBscan/$

  18. Storage Ratios Changed • 10x better access time • 10x more bandwidth • 100x more capacity • Data 25x cooler (1 Kaps/20 MB vs 1 Kaps/500 MB) • 4,000x lower media price • 20x to 100x lower disk price • Scan takes 10x longer (3 min vs 45 min) • RAM/disk media price ratio changed: • 1970-1990: 100:1 • 1990-1995: 10:1 • 1995-1997: 50:1 • today: ~6$/GB disk vs 600$/GB RAM = 100:1

  19. Data on Disk Can Move to RAM in 10 Years • The RAM:disk price ratio is 100:1, and prices decline 100x per decade, so today’s disk data fits in RAM at the same cost in 10 years.

  20. More Kaps and Kaps/$ but… • Disk accesses got much less expensive: better disks, cheaper disks! • But disk arms are expensive: the scarce resource. • A 100 GB, 30 MB/s disk takes 1 hour to scan, vs 5 minutes in 1990.

  21. The “Absurd” 10x (=4 year) Disk • 1 TB, 100 MB/s, 200 Kaps • 2.5 hr scan time (poor sequential access) • 1 aps per 5 GB (VERY cold data) • It’s a tape!

  22. Disk vs Tape Guesstimates • Disk: 80 GB; 20 MBps; 5 ms seek; 3 ms rotate latency; 3$/GB for drive, 3$/GB for ctlrs/cabinet; 15 TB/rack; 1 hour scan. • Tape: 40 GB; 10 MBps; 10 sec pick time; 30-120 second seek time; 2$/GB for media, 8$/GB for drive+library; 10 TB/rack; 1 week scan. • (CERN: 200 TB; 3480 tapes; 2 col = 50 GB; rack = 1 TB = 8 drives.) • The price advantage of tape is narrowing, and the performance advantage of disk is growing. • At 10k$/TB, disk is competitive with nearline tape.

  23. How to cool disk data: • Cache data in main memory • See the 5-minute rule later in this presentation • Fewer, larger transfers • Larger pages (512 B -> 8 KB -> 256 KB) • Sequential rather than random access • Random 8 KB IO is 1.5 MBps • Sequential IO is 30 MBps (the 20:1 ratio is growing) • RAID1 (mirroring) rather than RAID5 (parity).
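The random-vs-sequential gap falls out of simple arithmetic. As a sketch with assumed numbers (roughly ~8 ms seek+rotate overhead per random access and a 30 MBps media rate, in line with the disk figures above): small random IOs spend nearly all their time positioning the arm.

```python
# Hedged sketch: effective bandwidth of an IO of a given size, given a
# fixed positioning overhead per access and the raw media transfer rate.
def effective_mbps(transfer_kb, overhead_ms=8.0, media_mbps=30.0):
    transfer_s = transfer_kb / 1024 / media_mbps
    return (transfer_kb / 1024) / (overhead_ms / 1000 + transfer_s)

print(round(effective_mbps(8), 2))     # 8 KB random IOs: about 1 MBps
print(round(effective_mbps(1024), 2))  # 1 MB transfers: near the media rate
```

This is why larger pages and sequential access "cool" disk data: they amortize the fixed arm-positioning cost over many more bytes.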

  24. Auto Manage Storage • 1980 rule of thumb: • A DataAdmin per 10 GB, a SysAdmin per MIPS • 2000 rule of thumb: • A DataAdmin per 5 TB • A SysAdmin per 100 clones (varies with app) • Problem: • 5 TB is 50k$ today, 5k$ in a few years. • Admin cost >> storage cost! • Challenge: • Automate ALL storage admin tasks.

  25. Summarizing storage rules of thumb (1) • Moore’s law: 4x every 3 years, 100x more per decade • Implies 2 bits of addressing every 3 years. • Storage capacities increase 100x/decade • Storage costs drop 100x per decade • Storage throughput increases 10x/decade • Data cools 10x/decade • Disk page sizes increase 5x per decade.

  26. Summarizing storage rules of thumb (2) • RAM:disk and disk:tape cost ratios are 100:1 and 3:1. • So, in 10 years, disk data can move to RAM, since prices decline 100x per decade. • A person can administer a million dollars of disk storage: that is 1 TB-100 TB today. • Disks are replacing tapes as backup devices. You can’t backup/restore a petabyte quickly, so geoplex it. • Mirroring rather than parity, to save disk arms.

  27. Outline • Moore’s Law and consequences • Storage rules of thumb • Balanced systems rules revisited • Networking rules of thumb • Caching rules of thumb

  28. Standard Architecture (today) • [Figure: a system bus fanning out to PCI Bus 1 and PCI Bus 2.]

  29. Amdahl’s Balance Laws • Parallelism law: if a computation has a serial part S and a parallel component P, then the maximum speedup is (S+P)/S. • Balanced system law: a system needs a bit of IO per second per instruction per second: about 8 MIPS per MBps. • Memory law: the MB/MIPS ratio (called alpha, α) in a balanced system is 1. • IO law: programs do one IO per 50,000 instructions.
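The parallelism law translates directly into code, straight from the slide's formula:

```python
# Amdahl's parallelism law: with a serial part S and a parallel part P,
# the maximum speedup (with unlimited processors) is (S + P) / S.
def amdahl_max_speedup(serial, parallel):
    return (serial + parallel) / serial

print(amdahl_max_speedup(1, 9))    # 10% serial work caps speedup at 10x
print(amdahl_max_speedup(5, 95))   # 5% serial work caps speedup at 20x
```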

  30. Amdahl’s Laws Valid 35 Years Later? • Parallelism law is algebra: so SURE! • Balanced system laws? • Look at tpc results (tpcC, tpcH) at http://www.tpc.org/ • Some imagination needed: • What’s an instruction (CPI varies from 1-3)? • RISC, CISC, VLIW, … clocks per instruction,… • What’s an I/O?

  31. TPC systems • Normalize for CPI (clocks per instruction): • Amdahl: 1 MHz/cpu; CPI 1; 1 mips; 6 KB/IO; 8 ins/byte. • TPC-C (random): 550 MHz/cpu; CPI 2.1; 262 mips; 8 KB/IO; 100 IO/s/disk; 397 disks; 50 disks/cpu; 40 MB/s/cpu; 7 ins/byte. • TPC-H (sequential): 550 MHz/cpu; CPI 1.2; 458 mips; 64 KB/IO; 100 IO/s/disk; 176 disks; 22 disks/cpu; 141 MB/s/cpu; 3 ins/byte. • TPC-C has about 7 ins/byte of IO; TPC-H has 3 ins/byte of IO. • TPC-H needs ½ as many disks: sequential vs random. • Both use 9 GB 10 krpm disks (need arms, not bytes).

  32. TPC systems: What’s alpha (= MB/MIPS)? Hard to say: • Intel: 32-bit addressing (= 4 GB limit), known CPI. • IBM, HP, Sun: 64 GB limit, unknown CPI. • Look at both, guess CPI for IBM, HP, Sun. • Alpha is between 1 and 6.

  33. Instructions per IO? • We know: 8 mips per MBps of IO. • So an 8 KB page is 64 K instructions, • and a 64 KB page is 512 K instructions. • But sequential has fewer instructions/byte (3 vs 7 in TPC-H vs TPC-C). • So a 64 KB sequential page is ~200 K instructions.
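The slide's numbers check out as simple arithmetic: 8 MIPS per MBps is 8 instructions per byte of IO, so instructions-per-IO scales with the page size.

```python
# Hedged arithmetic behind the slide: ins/IO = page size * ins/byte.
def ins_per_io(page_bytes, ins_per_byte=8):
    return page_bytes * ins_per_byte

print(ins_per_io(8 * 1024))                   # 8 KB page: 65536 (~64 K)
print(ins_per_io(64 * 1024))                  # 64 KB page: 524288 (~512 K)
print(ins_per_io(64 * 1024, ins_per_byte=3))  # sequential: 196608 (~200 K)
```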

  34. Amdahl’s Balance Laws Revised • Laws right, just need “interpretation” (imagination?). • Balanced system law: a system needs 8 MIPS per MBps of IO, but the instruction rate must be measured on the workload. • Sequential workloads have low CPI (clocks per instruction); • random workloads tend to have higher CPI. • Alpha (the MB/MIPS ratio) is rising from 1 to 6. This trend will likely continue. • One random IO per 50 K instructions. • Sequential IOs are larger: one sequential IO per 200 K instructions.

  35. PAP vs RAP • Peak Advertised Performance vs Real Application Performance. • [Figure: the data path from application and file system down to disk, advertised vs real: CPU 4 x 550 MHz = 2 Bips advertised vs 170-550 mips at 1-3 CPI; system bus 1600 MBps vs 500 MBps; PCI buses 133 MBps vs 90 MBps; SCSI 160 MBps vs 90 MBps; disks 66 MBps vs 25 MBps.]

  36. Outline • Moore’s Law and consequences • Storage rules of thumb • Balanced systems rules revisited • Networking rules of thumb • Caching rules of thumb

  37. Ubiquitous 10 GBps SANs in 5 years • 1 Gbps Ethernet is a reality now (~120 MBps). • Also FiberChannel, MyriNet, GigaNet, ServerNet, ATM, … • 10 Gbps x4 WDM deployed now (OC192). • 3 Tbps WDM working in the lab. • In 5 years, expect 10x, wow! • [Figure: measured link speeds of the various interconnects, ranging from 5 MBps to 1 GBps.]

  38. Networking • WANs are getting faster than LANs: G8 = OC192 = 8 Gbps is “standard”. • Link bandwidth improves 4x per 3 years. • Speed of light (60 ms round trip in the US). • Software stacks have always been the problem: Time = SenderCPU + ReceiverCPU + bytes/bandwidth. The CPU terms have been the problem.

  39. The Promise of SAN/VIA: 10x in 2 years http://www.ViArch.org/ • Yesterday: • 10 MBps (100 Mbps Ethernet) • ~20 MBps tcp/ip saturates 2 cpus • round-trip latency ~250 µs • Now: • Wires are 10x faster: Myrinet, Gbps Ethernet, ServerNet, … • Fast user-level communication: • tcp/ip ~100 MBps at 10% cpu • round-trip latency is 15 µs • 1.6 Gbps demoed on a WAN

  40. How much does wire time cost? ($/MB and time to send a megabyte) • Gbps Ethernet: 0.2 µ$, 10 ms • 100 Mbps Ethernet: 0.3 µ$, 100 ms • OC12 (650 Mbps): 0.003$, 20 ms • DSL: 0.0006$, 25 sec • POTS: 0.002$, 200 sec • Wireless: 0.80$, 500 sec
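The time column can be approximated from first principles (my sketch, ignoring protocol overhead, with assumed link rates for DSL and POTS): one megabyte is 8 megabits, so raw wire time is 8 / link_Mbps seconds.

```python
# Hedged sketch of wire time per megabyte: 8 megabits / link rate in Mbps.
def seconds_per_mb(link_mbps):
    return 8 / link_mbps

print(seconds_per_mb(1000))   # Gbps Ethernet: 0.008 s, i.e. ~10 ms
print(seconds_per_mb(650))    # OC12: ~0.012 s
print(seconds_per_mb(0.04))   # POTS modem (assumed ~40 kbps): 200 s
```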

  41. Data delivery costs 1$/GB today • Rent for “big” customers: 300$ per megabit per second per month. • Improved 3x in the last 6 years (!). • That translates to 1$/GB you send. • You can mail a 160 GB disk for 20$: that’s 16x cheaper. • Shipped overnight, that’s ~3 MBps of bandwidth (3 x 160 GB ~ ½ TB).
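Both figures on this slide can be reproduced with back-of-envelope arithmetic. The month length and transit time below are my assumptions, not from the deck:

```python
# Hedged sketch: convert 300$/Mbps/month rent into $/GB sent, and compare
# with the effective bandwidth of mailing a 160 GB disk overnight.
SECONDS_PER_MONTH = 30 * 24 * 3600

def dollars_per_gb(rent_per_mbps_month=300.0):
    gb_per_month = (1 / 8) * SECONDS_PER_MONTH / 1024  # 1 Mbps in GB/month
    return rent_per_mbps_month / gb_per_month

def mailed_disk_mbps(capacity_gb=160, transit_hours=24):
    return capacity_gb * 1024 / (transit_hours * 3600)

print(round(dollars_per_gb(), 2))    # close to the quoted 1$/GB
print(round(mailed_disk_mbps(), 1))  # ~2 MBps per overnight 160 GB disk
```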

  42. Outline • Moore’s Law and consequences • Storage rules of thumb • Balanced systems rules revisited • Networking rules of thumb • Caching rules of thumb

  43. The Five-Minute Rule • Trade DRAM for disk accesses. • Cost of an access: Drive_Cost / Accesses_per_second. • Cost of a DRAM page: RAM_$_per_MB / Pages_per_MB. • Break-even has two terms: a technology term and an economic term. • Page size grew to compensate for changing ratios. • Now at 5 minutes for random, 10 seconds sequential.

  44. The 5-Minute Rule Derived • T = time between references to the page. • Cost of a disk access, amortized over T: DiskPrice / (AccessesPerSecond x T). • Cost of a RAM page: RAM_$_Per_MB / PagesPerMB. • Breakeven: RAM_$_Per_MB / PagesPerMB = DiskPrice / (T x AccessesPerSecond). • So: T = (DiskPrice x PagesPerMB) / (RAM_$_Per_MB x AccessesPerSecond).
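The breakeven formula can be plugged in directly. The specific numbers below are my own illustrative, roughly 2001-era assumptions (a 250$ drive, 120 accesses/s, 8 KB pages so 128 pages/MB, RAM at 0.60$/MB), not values from the slides:

```python
# Hedged sketch of the 5-minute-rule breakeven:
# T = (DiskPrice * PagesPerMB) / (RAM_$_per_MB * AccessesPerSecond)
def breakeven_seconds(disk_price, pages_per_mb, ram_price_per_mb, aps):
    return (disk_price * pages_per_mb) / (ram_price_per_mb * aps)

t = breakeven_seconds(250, 128, 0.60, 120)
print(round(t / 60, 1), "minutes")   # on the order of the five-minute rule
```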

  45. Plugging in the Numbers • Trend is toward longer times, because disk $ are not changing much while RAM $ decline 100x/decade. • Result: the 5-minute (random) and 10-second (sequential) rules.

  46. The 10 Instruction Rule • Spend up to 10 instructions/second to save 1 byte. • Cost of an instruction/second: I = ProcessorCost / (MIPS x LifeTime). • Cost of a byte: B = RAM_$_Per_Byte / LifeTime. • Breakeven: N x I = B, so N = B/I = (RAM_$_Per_Byte x MIPS) / ProcessorCost • ~ (3E-6 x 5E8)/500 = 3 ins/B for Intel • ~ (3E-6 x 3E8)/10 = 10 ins/B for ARM
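The Intel case on the slide checks out numerically (the ARM line uses inputs I cannot reproduce exactly, so only the Intel case is verified here):

```python
# Hedged check of the breakeven: N = RAM_$_per_byte * IPS / ProcessorCost,
# with RAM at 3E-6 $/byte, 5E8 instructions/s, and a 500$ processor.
def breakeven_ins_per_byte(ram_dollars_per_byte, ips, processor_cost):
    return ram_dollars_per_byte * ips / processor_cost

print(breakeven_ins_per_byte(3e-6, 5e8, 500))   # 3.0 ins/B for Intel
```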

  47. Trading Storage for Computation • You can spend 10 bytes of RAM to save 1 instruction/second. • Rent for disk: 1$/GB (forever). • Processor costs 10$ to 1,000$: 10$-1,000$ for 100 TeraOps. • So ~1$ per TeraOp: 1 GB ~ 1 Top, 1 MB ~ 1 Gop, 1 KB ~ 1 Mop. • Save a 1 KB object on disk if it costs more than 10 ms to compute.
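The equivalences follow from the two prices on the slide. The 100 MIPS figure below is my own assumption for the recompute-time example:

```python
# Hedged arithmetic: at 1$/GB disk rent and ~1$ per TeraOp of compute,
# one byte of storage trades against roughly 1000 ops.
DISK_DOLLARS_PER_BYTE = 1 / 2**30   # 1$/GB
DOLLARS_PER_OP = 1e-12              # 1$ per TeraOp

ops_per_byte = DISK_DOLLARS_PER_BYTE / DOLLARS_PER_OP
print(round(ops_per_byte))          # ~1000 ops/byte, i.e. 1 KB ~ 1 Mop

# At an assumed 100 MIPS, recomputing a 1 KB object's ~1 Mop takes:
recompute_ms = 1024 * ops_per_byte / 100e6 * 1000
print(round(recompute_ms, 1), "ms") # the slide's ~10 ms breakeven
```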

  48. When to Cache Web Pages. • Caching saves user time • Caching saves wire time • Caching costs storage • Caching only works sometimes: • New pages are a miss • Stale pages are a miss

  49. Web Page Caching Saves People Time • Assume people cost 20$/hour (or 0.2$/hr ???). • Assume 20% hit in browser, 40% in proxy. • Assume 3-second server time. • Caching saves 28$/year to 150$/year of people time (or 0.28$ to 1.5$/year at the cheap rate).

  50. Web Page Caching Saves Resources • Wire cost is a penny (wireless) to 100 µ$ (LAN). • Storage is 8 µ$/month. • Breakeven: wire cost = storage rent: 4 to 7 months. • Adding people cost, breakeven is ~4 years; with “cheap people” (0.2$/hr), 6 to 8 months.
