
The Rebirth of Database Machines



Presentation Transcript


  1. The Rebirth of Database Machines • Dina Bitton, Jim Gray • Bitton & Gray: The Rebirth of Database Machines, http://research.microsoft.com/~Gray/talks/DB_Machine_Rebirth.ppt

  2. Outline • Active Disks are coming • Disk Tutorial (not presented, but slides in deck) • Disk Arms are important (optimize them) • The Rebirth of Database Machines

  3. Disks of 30 Years Ago • 10 MB • Failed every few weeks • Cost more than 400$

  4. Disk Arrays • 24 CPUs • 384 disks • More MIPS in the disks than in the CPUs

  5. Year 2003 Disks • Big disk (10 $/GB): 3”, 200 GB, 150 kaps (k accesses per second), 30 MBps sequential • Small disk (20 $/GB): 2”, 40 GB, 100 kaps, 20 MBps sequential • Both running DBMS, Mail, Web, and OS

  6. From the CMU Active Disk web site: http://www.pdl.cs.cmu.edu/Active/

  7. Research Problem: When every disk is a super-computer… and there are thousands of them... • Who manages data placement? • Query plans among 1,000 servers? • How does mirroring work? • How does backup work? • Where does my program run?

  8. Relevant University Research on Active Disks • Kim Keeton & Dave Patterson @ UC Berkeley: http://www.cs.berkeley.edu/~pattrsn/talks/sigmod98-keynote.ppt • Erik Riedel & Garth Gibson @ CMU: http://www.pdl.cs.cmu.edu/Active/ • Mike Franklin @ U Maryland: http://www.cs.umd.edu/projects/bdisk • Anurag Acharya, Mustafa Uysal @ UCSB: http://www.cs.ucsb.edu/TRs/TRCS98-06.html

  9. Outline • Active Disks are coming • Disk Tutorial (not presented, but slides in deck) • Disk Arms are important • The Rebirth of Database Machines

  10. Disk Access Time • Access time = SeekTime (6 ms) + RotateTime (3 ms) + ReadTime (1 ms) • Rotate time: 5,000 to 10,000 rpm; ~ 12 to 6 milliseconds per rotation; ~ 6 to 3 ms rotational latency
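
The rotation figures on this slide follow directly from the spindle speed. A minimal sketch of that arithmetic, assuming the average rotational latency is half a revolution (a uniformly random target sector); the function names are illustrative and not from the deck:

```python
def rotation_ms(rpm: float) -> float:
    """Time for one full revolution, in milliseconds."""
    return 60_000.0 / rpm


def avg_rotational_latency_ms(rpm: float) -> float:
    """Average wait for the target sector: half a revolution (assumption)."""
    return rotation_ms(rpm) / 2.0


def access_time_ms(seek_ms: float, rpm: float, read_ms: float) -> float:
    """Access time = seek + rotational latency + read, as on the slide."""
    return seek_ms + avg_rotational_latency_ms(rpm) + read_ms


for rpm in (5_000, 10_000):
    print(rpm, "rpm:", rotation_ms(rpm), "ms/rotation,",
          avg_rotational_latency_ms(rpm), "ms average latency")

# Slide's example: 6 ms seek, 10,000 rpm, 1 ms read
print("access time:", access_time_ms(6.0, 10_000, 1.0), "ms")
```

With the slide's 6 ms seek and 1 ms read at 10,000 rpm, the three terms add up to 10 ms per access.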

  11. Disk Access Time Improves Slowly • Access time = SeekTime (6 ms, 8%/y) + RotateTime (3 ms, 8%/y) + ReadTime (1 ms, 40%/y) • Other useful facts: • Power rises more than size^3 (small is indeed beautiful) • Small devices are more rugged • Small devices can use plastics (forces are much smaller), e.g. bugs fall without breaking anything

  12. Disk Seek Time • Seek time is ~ Sqrt(distance) (distance = 1/2 x acceleration x time^2) • Specs assume seek is 1/3 of disk • Short seeks are common (over 50% are zero length) • Typical 1/3 seek time: 6 ms • 4x improvement in 20 years • [Figure: arm speed vs. time, full acceleration followed by full stop]
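
The square-root relationship on this slide comes from constant-acceleration motion. A small sketch under simplifying assumptions that are not in the deck: the arm accelerates at a constant rate over the first half of the seek and brakes at the same rate over the second half, with no top-speed limit and no settle time. The `accel` value is an arbitrary placeholder, so only ratios between seeks are meaningful.

```python
import math


def seek_time(distance: float, accel: float = 1.0) -> float:
    """Idealized seek: accelerate over distance/2, then brake over distance/2.

    Each half satisfies distance/2 = 1/2 * accel * t_half**2,
    so the total time is 2 * t_half = 2 * sqrt(distance / accel).
    """
    if distance <= 0:
        return 0.0  # zero-length seeks cost nothing in this model
    return 2.0 * math.sqrt(distance / accel)


third = seek_time(1 / 3)   # the "1/3 of disk" seek that specs quote
full = seek_time(1.0)      # full-stroke seek
print("full stroke / one-third seek:", full / third)          # sqrt(3)
print("4x the distance costs:", seek_time(4 / 3) / third, "x the time")
```

In this model a seek four times as long takes only twice the time, which is why short seeks are disproportionately expensive per track crossed.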

  13. Disk Access Ratios Have Changed • Key metrics: $/GB; Kaps/GB (KB accesses per second per GB); SCAN (time to scan the disk) • Scan going from minutes to days • Disk arms are the precious resource (disk capacity is no longer the precious resource) • Kaps/GB went from 500 to 7 and is going to 1
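
These metrics can be computed for the two "Year 2003" disks from slide 5. A rough sketch, assuming 1 GB = 10^6 KB, that each random access moves 1 KB (following this slide's reading of Kaps), and that a sequential scan runs at the quoted MBps; the `Disk` class and its field names are made up for illustration:

```python
from dataclasses import dataclass


@dataclass
class Disk:
    name: str
    capacity_gb: float
    kaps: float             # 1 KB accesses per second (assumed meaning)
    seq_mb_per_s: float     # sequential bandwidth

    def kaps_per_gb(self) -> float:
        return self.kaps / self.capacity_gb

    def sequential_scan_hours(self) -> float:
        return self.capacity_gb * 1_000 / self.seq_mb_per_s / 3_600

    def random_scan_days(self) -> float:
        # Reading the whole disk 1 KB at a time, one access per arm movement
        kb_total = self.capacity_gb * 1_000_000
        return kb_total / self.kaps / 86_400


disks = [Disk("big", 200, 150, 30), Disk("small", 40, 100, 20)]
for d in disks:
    print(f"{d.name}: {d.kaps_per_gb():.2f} Kaps/GB, "
          f"{d.sequential_scan_hours():.1f} h sequential scan, "
          f"{d.random_scan_days():.1f} days random 1 KB scan")
```

Under these assumptions the big disk is already below 1 Kaps/GB, and touching all of it with random 1 KB accesses takes on the order of days, while a sequential scan takes hours.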

  14. Stripe For More Bandwidth • N-stores have N-times the bandwidth • Works great! • Supported by most file systems

  15. Mirrors: Replicate Stores for Availability • Read one, write all • If one fails, rebuild from survivor • Run scrubber in background to fix faults • N-replicas can give N-times the bandwidth • Unavailability ~ A Million Years!!!

  16. RAID5: Parity Saves Storage Space • Mirrors: 50% storage overhead (read one, write both) • RAID5: 12% storage overhead (read one, write one plus parity)
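
The parity idea behind RAID5 can be sketched with byte-wise XOR. This is a toy illustration, not a RAID implementation: it ignores striping layout, rotating parity placement, and I/O. The group size of 8 used for the overhead figure is an assumption chosen because 1/8 = 12.5% is close to the slide's 12%.

```python
from functools import reduce


def parity(blocks: list[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(surviving: list[bytes], parity_block: bytes) -> bytes:
    """Recover the one missing block: XOR of the survivors and the parity."""
    return parity(surviving + [parity_block])


def parity_overhead(disks_in_group: int) -> float:
    """Fraction of raw capacity spent on parity (one disk's worth per group)."""
    return 1.0 / disks_in_group


data = [b"abcd", b"efgh", b"ijkl"]
p = parity(data)
lost = data[1]
print(rebuild([data[0], data[2]], p) == lost)   # block recovered from survivors
print(f"parity overhead, group of 8: {parity_overhead(8):.1%}")
print("mirror overhead: 50.0%")
```

A small write has to update both the data block and the parity block, which is the "write one plus parity" cost on the slide.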

  17. Interesting Fact: Mirrored Disks Optimize Disk Arms • Doubles read bandwidth (Sequential: stagger reads from each drive, i.e. stripe; Random: read from the closest arm, so seek is min seek) • Doubles write cost (write both) • Write time increases because seek is max seek

  18. If Mix Reads & Writes, Mirror is Better Than Partition • 2 servers are better than one • Benefit is better than 2x write cost if reads > writes

  19. What if you have LOTS of Disks • When you have BIG disks (200 GB), arms are precious, space is cheap. • If you replicate 1000x • write seek time asymptotically approaches 1.7x • read seek time asymptotically approaches zero.
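
The min-seek and max-seek effect of replication can be explored with a small Monte Carlo experiment. This is a sketch under assumptions that are not stated in the deck: the target and every arm position are independent and uniform on [0, 1], and seek time is proportional to the square root of distance (slide 12). Real mirrors correlate arm positions after each write, so the numbers are only indicative.

```python
import random


def seek_ratios(n_replicas: int, trials: int = 2_000, seed: int = 1):
    """Return (read, write) seek time relative to a single unreplicated disk.

    A read goes to the closest arm (min seek); a write must wait for
    the farthest arm (max seek).
    """
    rng = random.Random(seed)
    single = read = write = 0.0
    for _ in range(trials):
        target = rng.random()
        dists = [abs(rng.random() - target) for _ in range(n_replicas)]
        single += dists[0] ** 0.5      # what one disk alone would pay
        read += min(dists) ** 0.5
        write += max(dists) ** 0.5
    return read / single, write / single


for n in (1, 2, 10, 1000):
    r, w = seek_ratios(n)
    print(f"{n:>4} replicas: read x{r:.2f}, write x{w:.2f}")
```

In this simplified model the read ratio falls toward zero as replicas are added, while the write ratio rises and then levels off well below 2x, the same qualitative behavior as the slide's asymptotes.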

  20. Outline • Active Disks are coming • Disk Tutorial (not presented, but slides in deck) • Disk Arms are important • The Rebirth of Database Machines

  21. The Rebirth of Database Machines • Dina Bitton (IDS), Jim Gray (Microsoft)

  22. Outline • Performance hungry databases • History: life and death of database machines • What has changed that can make database machines work today • Shared-Nothing Database Machine • Where is the required bandwidth • DMP: Shared-Nothing & Shared-Everything

  23. Demand for Database Performance • Larger Databases: marketing data warehouses (TB of historical data); daily news broadcasts (1 TB of searchable video/audio data) • Large Scans: searches require access to a large fraction of the database • Repeated Scans: DSS queries, data mining algorithms

  24. Life, Death & Reincarnation • “Database Machines are coming, Database Machines are coming ...” (Hsiao 1979) • Then there was Britton-Lee, Direct, ICL … • Teradata builds highly-parallel shared-nothing SQL server • Many university “paper” designs • “Database Machines, An Idea whose time has Passed?” (Boral-DeWitt 1983) • Then there was MMDBs, Grace, Gamma and more Teradata • Then there was Software (Parallel Database Query) • Next: PDQ + lots of disks with power controllers

  25. And All Along, Stonebraker’s Opinion: “The history of DBMS research is littered with innumerable proposals to construct hardware database machines to provide high performance operations. In general these have been proposed by hardware types with a clever solution in search of a problem on which it might work.” (Readings in Database Systems, Morgan-Kaufmann)

  26. Why Not then, but Yes now • Then: too early, small databases on 1 disk. Now: TB databases span thousands of disks, need partitioning • Then: disk filter designs addressed only a small part of DBMS requirements. Now: disk controllers are fast computers • Then: exotic technologies (bubbles, CCD…). Now: they went away • Then: special-purpose hardware increased design time and cost. Now: higher level of integration, better VLSI design tools • Then: parallel query processing was not well understood. Now: large body of research, successful commercial implementations

  27. Parallel Query Processing [DeWitt-Gray CACM91] • Pipelining: data streams flow from one operator to the next • Partitioning: tables are partitioned to allow concurrent processing on partitions
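
The two forms of parallelism named on this slide can be illustrated in a few lines. This is a toy sketch, not the design from the cited paper: generators stand in for pipelined operators, and a hash on the grouping key stands in for table partitioning. All function names are hypothetical, and the partitions are processed one after another here, where a real system would run them concurrently.

```python
from collections import defaultdict


def scan(rows):
    """Leaf operator: stream rows one at a time."""
    yield from rows


def select(rows, predicate):
    """Pipelined filter: consumes its input stream row by row."""
    return (row for row in rows if predicate(row))


def hash_partition(rows, key, n_partitions):
    """Split a stream so that equal keys land in the same partition."""
    parts = [[] for _ in range(n_partitions)]
    for row in rows:
        parts[hash(row[key]) % n_partitions].append(row)
    return parts


def group_sum(rows, group_key, value_key):
    """Per-partition aggregation."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[group_key]] += row[value_key]
    return dict(totals)


table = [
    {"region": "east", "amount": 10},
    {"region": "west", "amount": 20},
    {"region": "east", "amount": 30},
    {"region": "north", "amount": 5},
]

# Pipelining: scan -> select, no intermediate table is materialized
stream = select(scan(table), lambda row: row["amount"] >= 10)

# Partitioning: each partition could be aggregated by a different processor
partials = [group_sum(p, "region", "amount")
            for p in hash_partition(stream, "region", 3)]

# Keys never span partitions, so merging is a plain union
result = {k: v for part in partials for k, v in part.items()}
print(sorted(result.items()))
```

Because the partitioning key is also the grouping key, no group is split across partitions and the partial results need no further combining.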

  28. Data Pathway Contention [Patterson Sigmod 1998] • Disk: external I/O bus is bottleneck to transfer rate, cost • Network: internal I/O bus interface is bottleneck to delivered bandwidth • Memory-Processor: processor-memory interface (cache + memory bus) is bottleneck to delivered bandwidth

  29. A Shared-Nothing Database Machine • [Figure: Processor & Memory nodes connected by a Scalable Interconnect] • No contention in memory access or parallel disk access => “Embarrassingly Parallel” Scan [Patterson] • But: how fast need Interconnect be? • Each processor has own OS, communication protocols, DB instance • Exchange data streams for pipelining ops, for sort, merge • Can’t support M:N mapping between disks & threads

  30. Share-Everything? • Need more bandwidth for shipping data streams than network can provide • Need M:N mapping from disks to processors for sort/merge • Control & synchronization: Data-flow best to synchronize processors

  31. Where to Get the Bandwidth?

  32. The Data Manipulation Platform • Massive Parallel Operation, data-flow control • M:N thread-to-disk • Direct processor to disk access • Direct disk to memory connect • [Figure: DMP board. Labels: To Host Computer; To other DMP Boards via high-speed switch; I/O interface adapter; Bus adapter; NP 1 … NP 16; P 1 … P 4; RFM, BAM, RAM; Direct connection; disks 1 … 80]

  33. A DSS Query Execution Plan • Query: select sum(tabX.amount*.08), tabY.region from tabX, tabY where tabX.key=tabY.region group by tabY.region order by tabY.region; • [Figure: plan tree, bottom to top: Database Disks (1 … 32 for tabX, 1 … 3 for tabY); Scan tabX 1 … 32 and Scan tabY 1 … 3 (1/3 selected); Exchange 1 and Exchange 2; three HJoin operators (1/10 joined); Exchange 3 with Temp Disks; Group 1 and Group 2 (1/10 grouped); Exchange 4; Sort 1 and Sort 2; Exchange 5]

  34. Bandwidth Requirements • [Figure: the same plan tree annotated with data rates, bottom to top: Database Disks, 32*20 MB/s = 640 MB/s; 210 MB/s out of the scans into Exchange 1 and Exchange 2; HJoin with Temp Disk contention at Exchange 3; 21 MB/s into Group 1 and Group 2; 2.1 MB/s through Exchange 4 into Sort 1, Sort 2 and Exchange 5]
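
The rates in this figure follow from the disk bandwidth and the reduction factors shown on the query plan slide (1/3 selected, 1/10 joined, 1/10 grouped). A short sketch of that arithmetic; the function and stage names are illustrative, and the slide rounds the results to 210, 21 and 2.1 MB/s:

```python
def plan_bandwidth(n_disks=32, mb_per_s_per_disk=20.0,
                   reductions=(("select", 1 / 3),
                               ("join", 1 / 10),
                               ("group", 1 / 10))):
    """Return (stage, MB/s) pairs for the data rate leaving each stage."""
    rate = n_disks * mb_per_s_per_disk
    rates = [("scan", rate)]
    for stage, fraction in reductions:
        rate *= fraction          # each stage passes on only a fraction
        rates.append((stage, rate))
    return rates


for stage, rate in plan_bandwidth():
    print(f"{stage:>6}: {rate:7.1f} MB/s")
```

The aggregate scan rate is by far the largest number in the plan, which is why the deck asks where that bandwidth can be found.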

  35. Conclusion • DMP: shared-nothing and shared-everything • IT ISN’T THAT YOU CAN’T SHARE, IT IS WHERE YOU SHARE: ON A CHIP, ON A BOARD, ON A NETWORK
