
Object Based Disk: the key to intelligent I/O



  1. Object Based Disk: the key to intelligent I/O George Gorbatenko Data Machine International St Paul, MN 55115 gorby@ece.umn.edu

  2. Why are we interested? • faster • transportable • more accessible • cheaper • facilitates holistic design • improves reliability DMI

  3. I/O is considered the weak link in systems architecture • I/O problem • memory wall • bottleneck DMI

  4. Issues • randomness is painful • mechanical time vs electronic time • ratio of times is about 200:1 • operating system obscures the disk DMI

  5. Operating System • seamless view of space • legacy of data storage goes back to the punched card • accommodates all applications DMI

  6. Data evolution • tape reflected an 80-column card image • disk reflected tape DMI

  7. In short… • nothing much has changed data format-wise since the 1930s • we are pretty much dealing with records in a linear format, one record after the next DMI

  8. The advantage of object based design is • encapsulate the data • define the application subset • don’t have the operating system getting in the way DMI

  9. SQL object is a good choice • broad user base • de facto standard for databases • high enough to exploit the power in the I/O • yesterday's CPU in today's disk (controller) • aggregate compute power exceeds the host DMI

  10. Researchers in Intelligent Disks are motivated by… • exploiting the latent processing potential • filtering data in place DMI

  11. Consider a disk farm… DMI

  12. But where do we place the intelligence? • host • I/O controller • disk DMI

  13. Disk basics • many platters (each with a fixed head): 10 • many concentric tracks per platter: 10k • sectors per track: 100 • total number of 512-byte sectors: 10M • disk capacity: 5 GB DMI
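A minimal sketch, assuming the illustrative geometry above (these are the slide's example figures, not a specific drive), that reproduces the capacity arithmetic:

#include <stdio.h>

/* Back-of-envelope capacity check for the example geometry:
 * 10 platters (each with a fixed head), 10k tracks per platter,
 * 100 sectors per track, 512-byte sectors. */
int main(void) {
    long platters = 10;
    long tracks_per_platter = 10000;
    long sectors_per_track = 100;
    long sector_bytes = 512;

    long long sectors = (long long)platters * tracks_per_platter * sectors_per_track;
    long long bytes = sectors * sector_bytes;

    printf("total sectors: %lld (10M)\n", sectors);   /* 10,000,000 */
    printf("capacity: %.1f GB\n", bytes / 1e9);       /* ~5 GB      */
    return 0;
}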

  14. To access a random block • seek to track 10-15 ms • wait for block to roll around 4-5 ms • read block 80 µs hence… 200:1 DMI
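A small sketch of the same budget, assuming millisecond-scale seek and rotational delays and a microsecond-scale transfer, which is what the stated 200:1 mechanical-to-electronic ratio implies:

#include <stdio.h>

/* Rough model of a random-block access: mechanical components (seek,
 * rotational wait) in milliseconds, electronic transfer in microseconds. */
int main(void) {
    double seek_us   = 12500.0;  /* mid-point of 10-15 ms       */
    double rotate_us =  4500.0;  /* mid-point of 4-5 ms         */
    double read_us   =    80.0;  /* transfer one 512-byte block */

    double mechanical_us = seek_us + rotate_us;
    printf("total access: %.1f ms\n", (mechanical_us + read_us) / 1000.0);
    printf("mechanical : electronic = %.0f : 1\n", mechanical_us / read_us);  /* ~212:1 */
    return 0;
}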

  15. Design Goals • synchronous operation • next data you want is beneath head • process data in place (filter) • touch the min amount of data • for what you touch you pay in time and space • exploit locality • amortize random access read over large data block DMI

  16. Access strategies… • Amortize the (inefficient) access over a large block of data • Make sure the data has utility DMI

  17. Optimum Block Size DMI

  18. Select name, address, salary where salary > 22K DMI

  19. Data Utility… DMI

  20. Consider the travels of an inchworm… A1 B1 C1 D1 E1 A2 B2 C2 D2 E2 A3 B3 C3 D3 E3 A4 B4 C4 D4 E4 A5 B5 C5 D5 E5 DMI

  21. Travels of an inchworm… A1 B1 C1 D1 E1 A2 B2 C2 D2 E2 A3 B3 C3 D3 E3 A4 B4 C4 D4 E4 A5 B5 C5 D5 E5 DMI

  22. Travels of an inchworm… A1 B1 C1 D1 E1 A2 B2 C2 D2 E2 A3 B3 C3 D3 E3 A4 B4 C4 D4 E4 A5 B5 C5 D5 E5 DMI

  23. Locality of Reference A1 B1 C1 D1 E1 A2 B2 C2 D2 E2 A3 B3 C3 D3 E3 A4 B4 C4 D4 E4 A5 B5 C5 D5 E5 (a) Logical view of two-dimensional table. A1 B1 C1 D1 E1 A2 B2 C2 D2 E2 A3 B3 C3 D3…… (b) Row-ordered mapping (physical). A1 A2 A3 A4 A5 B1 B2 B3 B4 B5 C1 C2 C3 C4…… (c) Column-ordered mapping (physical) DMI
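A sketch (not from the slides) of the two physical mappings for the 5 x 5 table: the offset formulas for placing a cell under row order versus column order, showing why a single-column scan touches adjacent cells only under column order:

#include <stdio.h>

#define ROWS 5
#define COLS 5

/* Cell (row r, column c) of the logical table, mapped onto a
 * one-dimensional physical medium two different ways. */
static int row_ordered(int r, int c)    { return r * COLS + c; }  /* A1 B1 C1 D1 E1 A2 ... */
static int column_ordered(int r, int c) { return c * ROWS + r; }  /* A1 A2 A3 A4 A5 B1 ... */

int main(void) {
    /* Scanning one column (say, column C) is contiguous under column
     * order but strided (every COLS-th cell) under row order. */
    for (int r = 0; r < ROWS; r++)
        printf("column C, row %d: row-ordered offset %2d, column-ordered offset %2d\n",
               r + 1, row_ordered(r, 2), column_ordered(r, 2));
    return 0;
}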

  24. Preservation of Logical Topology To preserve the logical topology of an n-dimensional logical data space, the physical space must be of at least like dimension. - for a 2D table (rows and columns) we need to view the disk as two-dimensional DMI

  25. Observations: • SQL can be decomposed into two operations • select - favored by column order • extract - favored by row order • granular access permits touching the minimum of data • map data so as to preserve topology when going from logical to physical medium • reading a track's worth of data appears reasonable DMI

  26. Treating disk as 2D space • data objects are 2D spaces • solves “design boundaries” • disk is basically a 3D medium • cylinder-track-sector DMI

  27. The disk is 3 dimensional DMI

  28. Consider the first cylinder of the set… DMI

  29. Examining a single cylinder… DMI

  30. which has tracks and sectors… DMI

  31. track for each head… DMI

  32. track read… DMI

  33. diagonal (sector block) read… DMI

  34. sector block shadow DMI

  35. Unwrap a cylinder… DMI

  36. 2 dimensional space: hd x sector DMI

  37. track read or sector block read… DMI

  38. Physical Sector Block Organization… Physical sector (512) Logical sector size (lss) DMI

  39. Logical Sector Block Organization… Physical sector (512) Logical sector size (lss) DMI

  40. record structure…

typedef struct _record {
    char employee_no [8];  // employee number; field A
    char name [12];        // name; field B
    char address [24];     // address; field C
    char zip [5];          // zip code; field D
    char salary [6];       // salary; field E
    char doh [6];          // date of hire; field F
    char dept [3];         // department; field G
    char tbd [16];         // reserved for future use; field H
} Record;
DMI

  41. modified best fit algorithm LSS (8 bytes) LSS = ceil(rec_len / num_hds) = ceil(64 / 10) = 7, rounded up to a multiple of 4 (4n) = 8 rec_space = LSS * num_hds = 8 * 10 = 80 bytes DMI
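A sketch of this calculation as read from the slide; rounding the raw ceiling (7) up to a multiple of 4 is an inference from the "4n" notation, and rec_len = 64 is taken to be the record minus its 16-byte reserved field:

#include <stdio.h>

/* Modified best fit sizing, as inferred from the slide: spread the
 * record's active bytes across the heads and round the per-head share
 * (the logical sector size, LSS) up to a multiple of 4. */
static int ceil_div(int a, int b)        { return (a + b - 1) / b; }
static int round_up(int x, int multiple) { return ceil_div(x, multiple) * multiple; }

int main(void) {
    int rec_len = 64;   /* active record bytes (assumption: 80-byte struct minus 16 reserved) */
    int num_hds = 10;

    int lss = round_up(ceil_div(rec_len, num_hds), 4);   /* ceil(64/10) = 7 -> 8 */
    int rec_space = lss * num_hds;                       /* 8 * 10 = 80 bytes    */

    printf("LSS = %d bytes, rec_space = %d bytes\n", lss, rec_space);
    return 0;
}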

  42. modified best fit algorithm LSS (8 bytes)

typedef struct _record {
    char employee_no [8];  // field A
    char name [12];        // field B
    char address [24];     // field C
    char zip [5];          // field D
    char salary [6];       // field E
    char doh [6];          // field F
    char dept [3];         // field G
    char tbd [16];         // field H
} Record;

(figure: fields A, B, C, D, G, E, F laid out across the per-head logical sectors)
DMI

  43. SQL Decomposition… • Select records • scan the salary field • store ordinal positions in a bit vector • Extract records • optimizer decides strategy (track or sector-block read) DMI
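A hypothetical sketch of the two phases for the earlier query (select name, address, salary where salary > 22K); the in-memory layout and names are illustrative, not the prototype's code:

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NUM_RECS 8

/* Stand-in for the salary column as it would be scanned in place. */
static const char *salary[NUM_RECS] = {"018000", "025500", "031000", "021000",
                                       "047000", "022000", "090000", "015500"};

int main(void) {
    uint8_t bitvec[(NUM_RECS + 7) / 8];
    memset(bitvec, 0, sizeof bitvec);

    /* Select phase: scan only the salary field (column-order friendly)
     * and record qualifying ordinal positions in the bit vector. */
    for (int i = 0; i < NUM_RECS; i++)
        if (atol(salary[i]) > 22000)
            bitvec[i / 8] |= (uint8_t)(1u << (i % 8));

    /* Extract phase: the optimizer would pick a track or sector-block
     * read; here we just report which records would be fetched. */
    for (int i = 0; i < NUM_RECS; i++)
        if (bitvec[i / 8] & (1u << (i % 8)))
            printf("extract record %d (salary %s)\n", i, salary[i]);
    return 0;
}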

  44. Comparison Results… DMI

  45. Prototype • two 4 GB Seagate Barracudas • 21 heads (29 zones) • 40 KLOC • skew = 5 sectors • Solaris 2.5.1 OS • emulated intelligence in IOP • context switch every 60 ms DMI

  46. Data particulars… • 168 byte records • LSS = 8 bytes • 63 records per Sector Block • 7,749 records per cylinder • 3 fields (2 heads) involved in query • 2 records extracted from disjoint blocks DMI
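A quick consistency check of these figures; the sector-blocks-per-cylinder value is derived from the numbers above rather than stated on the slide:

#include <stdio.h>

/* Derived relationships among the stated data particulars. */
int main(void) {
    int rec_len = 168, lss = 8, heads = 21;
    int recs_per_sector_block = 63, recs_per_cyl = 7749;

    printf("logical sectors per record: %d (one per head, %d heads)\n",
           rec_len / lss, heads);                        /* 168 / 8 = 21    */
    printf("sector blocks per cylinder: %d\n",
           recs_per_cyl / recs_per_sector_block);        /* 7749 / 63 = 123 */
    return 0;
}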

  47. Test Runs • (a) write a cylinder's worth of data w/o the optimizer • (b) write the same with the optimizer enabled • (c) scan the cylinder involving 3 columns; extract 2 blocks • (d) repeat operation (c) DMI

  48. Results…
             Observed    Calculated
  case (a)   2.5 s       2.427 s
  case (b)   196 ms      216 ± 4 ms
  case (c)   51 ms       54.5 ± 4 ms
  case (d)   42 ms       37.6 ± 4 ms
  DMI

  49. Benchmark Analysis • 3 benchmarks selected - Wisconsin - Set Query - TPC D/H • selected non-join cases • reverse engineered the I/O detail DMI

  50. Wisconsin results… (chart: Wisconsin benchmark, time in seconds on a log scale, queries Q1-Q5, WIS vs 2D) DMI
