Bridging the Information Gap in Storage Protocol Stacks

Presentation Transcript
  1. Bridging the Information Gap in Storage Protocol Stacks • Timothy E. Denehy, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau • University of Wisconsin, Madison

  2. State of Affairs • File System: Namespace, Files, Metadata, Layout, Free Space • Interface: Block Based, Read/Write • Storage System: Parallelism, Redundancy

  3. Problem • Information gap may cause problems • Poor performance • Partial stripe write operations • Duplicated functionality • Logging in file system and storage system • Reduced functionality • Storage system lacks knowledge of files • Time to re-examine the division of labor

  4. Our Approach: Informed LFS (I·LFS) on Exposed RAID (ERAID) • Enhance the storage interface • Expose performance and failure information • Use information to provide new functionality • On-line expansion • Dynamic parallelism • Flexible redundancy

  5. Outline • ERAID Overview • I·LFS Overview • Functionality and Evaluation • On-line expansion • Dynamic parallelism • Flexible redundancy • Lazy redundancy • Conclusion

  6. ERAID Goals • Backwards compatibility • Block-based interface • Linear, concatenated address space • Expose information to the file system above • Regions • Performance • Failure • Allow file system to utilize semantic knowledge

  7. ERAID Regions • Region • Contiguous portion of the address space • Regions can be added to expand the address space • Region composition • RAID: One region for all disks • Exposed: Separate regions for each disk • Hybrid ERAID

  8. ERAID Performance Information • Exposed on a per-region basis • Queue length and throughput • Reveals • Static disk heterogeneity • Dynamic performance and load fluctuations ERAID

  9. ERAID Failure Information • Exposed on a per-region basis • Number of tolerable failures • Reveals • Static differences in failure characteristics • Dynamic failures to file system above ERAID X RAID1

  10. Outline • ERAID Overview • I·LFS Overview • Functionality and Evaluation • On-line expansion • Dynamic parallelism • Flexible redundancy • Lazy redundancy • Conclusion

  11. I·LFS Overview • Log-structured file system • Transforms all writes into large sequential writes • All data and metadata are written to a log • Log is a collection of segments • Segment table describes each segment • Cleaner process produces empty segments • Why use LFS for an informed file system? • Write-anywhere design provides flexibility • Ideas applicable to other file systems
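To make the log/segment vocabulary concrete, here is a minimal C sketch of segment bookkeeping as described on this slide: a segment table with one entry per segment, which the cleaner scans to find segments with little live data. The types and field names are illustrative, not I·LFS source.

```c
#include <stdint.h>

enum seg_state { SEG_FREE, SEG_ACTIVE, SEG_DIRTY };

struct segment_entry {
    enum seg_state state;      /* free, currently being written, or full */
    uint32_t       live_bytes; /* live data left; the cleaner prefers low values */
    int            region;     /* which ERAID region holds this segment */
};

struct segment_table {
    struct segment_entry *entries;   /* one entry per segment in the log */
    uint32_t              nsegments; /* grows when the volume is expanded */
};
```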

  12. I·LFS Overview • Goals • Improve performance, functionality, and manageability • Minimize system complexity • Exploits ERAID information to provide • On-line expansion • Dynamic parallelism • Flexible redundancy • Lazy redundancy

  13. I·LFS Experimental Platform • NetBSD 1.5 • 1 GHz Intel Pentium III Xeon • 128 MB RAM • Four fast disks • Seagate Cheetah 36XL, 21.6 MB/s • Four slow disks • Seagate Barracuda 4XL, 7.5 MB/s

  14. I·LFS Baseline Performance • Four slow disks: 30 MB/s • Four fast disks: 80 MB/s

  15. Outline • ERAID Overview • I·LFS Overview • Functionality and Evaluation • On-line expansion • Dynamic parallelism • Flexible redundancy • Lazy redundancy • Conclusion

  16. I·LFS On-line Expansion • Goal: Expand storage incrementally • Capacity • Performance • Ideal: Instant disk addition • Minimize downtime • Simplify administration • I·LFS supports on-line addition of new disks

  17. I·LFS On-line Expansion Details • ERAID: Expandable address space • Expansion is equivalent to adding empty segments • Start with an oversized segment table • Activate new portion of segment table
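A sketch of the expansion mechanism described on slide 17: the segment table is allocated oversized when the file system is created, and adding a disk simply activates a new run of entries as free segments belonging to a new ERAID region. The code below uses the same segment-table shape as the earlier sketch; MAX_SEGMENTS and ilfs_expand are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_SEGMENTS 65536   /* assumed upper bound fixed at file-system creation */

enum seg_state { SEG_FREE, SEG_ACTIVE, SEG_DIRTY };

struct segment_entry {
    enum seg_state state;
    uint32_t       live_bytes;
    int            region;
};

struct segment_table {
    struct segment_entry *entries;   /* allocated with MAX_SEGMENTS slots up front */
    uint32_t              nsegments; /* currently active portion of the table */
};

/* Bring a newly added region on-line by activating its entries as free segments. */
void ilfs_expand(struct segment_table *tbl, int new_region, uint32_t new_segments)
{
    assert(tbl->nsegments + new_segments <= MAX_SEGMENTS);
    for (uint32_t i = 0; i < new_segments; i++) {
        struct segment_entry *e = &tbl->entries[tbl->nsegments + i];
        e->state      = SEG_FREE;    /* expansion == adding empty segments */
        e->live_bytes = 0;
        e->region     = new_region;
    }
    tbl->nsegments += new_segments;  /* the new portion of the table is now live */
}
```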

  18. I·LFS On-line Expansion Experiment • I·LFS immediately takes advantage of each extra disk

  19. I·LFS Dynamic Parallelism • Goal: Perform well on heterogeneous storage • Static performance differences • Dynamic performance fluctuations • Ideal: Maximize throughput of the storage system • I·LFS writes data proportionate to performance

  20. I·LFS Dynamic Parallelism Details • ERAID: Dynamic performance information • Most file system routines are not changed • Aware of only the ERAID linear address space • Reduces file system complexity • Segment selection routine • Aware of ERAID regions and performance • Chooses next segment based on current performance
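One way to realize the performance-aware segment selection described above is to score each region by its recently observed throughput divided by its current queue length (both exposed by ERAID) and place the next segment in the best-scoring region, so write rates track disk speeds. This is an illustrative policy sketch, not the authors' exact heuristic.

```c
#include <stddef.h>
#include <stdint.h>

struct region_perf {
    uint32_t throughput_kbps;  /* observed bandwidth, exposed by ERAID */
    uint32_t queue_length;     /* outstanding requests, exposed by ERAID */
};

/* Return the index of the region that should receive the next segment. */
size_t ilfs_pick_region(const struct region_perf *perf, size_t nregions)
{
    size_t best = 0;
    double best_score = -1.0;
    for (size_t i = 0; i < nregions; i++) {
        /* Favor fast regions; penalize regions that are already backed up. */
        double score = (double)perf[i].throughput_kbps /
                       (double)(perf[i].queue_length + 1);
        if (score > best_score) {
            best_score = score;
            best = i;
        }
    }
    return best;
}
```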

  21. I·LFS Static Parallelism Experiment • Simple striping limited by the rate of the slowest disk • I·LFS provides the full throughput of the system

  22. I·LFS Dynamic Parallelism Experiment • I·LFS adjusts to the performance fluctuation

  23. I·LFS Flexible Redundancy • Goal: Offer new redundancy options to users • Ideal: Range of mechanisms and granularities • I·LFS provides mirrored per-file redundancy

  24. I·LFS Flexible Redundancy Details • ERAID: Region failure characteristics • Use separate files for redundancy • Even inode N for original files • Odd inode N+1 for redundant files • Original and redundant data in different sets of regions • Flexible data placement within the regions • Use recursive vnode operations for redundant files • Leverage existing routines to reduce complexity
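The even/odd inode pairing on slide 24 can be captured in a few helper functions: a mirrored file with even inode number N keeps its redundant copy under inode N+1, and the two copies are directed to disjoint sets of regions. The helpers below are hypothetical sketches; the real system reuses the recursive vnode operations mentioned on the slide.

```c
#include <stdint.h>

/* A mirrored file with even inode number N keeps its copy under inode N+1. */
static int      is_redundant_copy(uint64_t ino) { return (ino & 1) != 0; }
static uint64_t redundant_of(uint64_t ino)      { return ino + 1; }  /* even -> odd */
static uint64_t original_of(uint64_t ino)       { return ino - 1; }  /* odd  -> even */

/* A write to a mirrored file would then be issued twice, e.g.:
 *   ilfs_write(ino, buf, len);                 // original, region set A
 *   ilfs_write(redundant_of(ino), buf, len);   // copy, region set B
 * where ilfs_write stands in for the normal (recursive vnode) write path. */
```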

  25. I·LFS Flexible Redundancy Experiment • I·LFS provides a throughput and reliability tradeoff

  26. I·LFS Lazy Redundancy • Goal: Avoid replication performance penalty • Ideal: Replicate data immediately before failure • I·LFS offers redundancy with delayed replication • Avoids replication penalty for short-lived files

  27. I·LFS Lazy Redundancy • ERAID: Region failure characteristics • Segments needing replication are flagged • Cleaner acts as replicator • Locates flagged segments • Checks data liveness and lifetime • Generates redundant copies of files
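Lazy redundancy piggybacks on the cleaner: segments written without an immediate copy are flagged, and a later cleaner pass replicates only the data that is still live, so short-lived files never pay the replication cost. The sketch below is illustrative; replicate_segment and the field names are assumed, not taken from I·LFS.

```c
#include <stddef.h>

struct lazy_segment {
    int needs_replication;  /* flagged when the segment was written lazily */
    int live;               /* nonzero if the segment still holds live data */
};

/* Assumed helper that copies one segment's live data into a redundant region. */
void replicate_segment(size_t segno);

/* Cleaner pass: replicate only flagged segments whose data is still alive,
 * so data from short-lived (already deleted) files is never copied. */
void cleaner_replication_pass(struct lazy_segment *segs, size_t nsegs)
{
    for (size_t i = 0; i < nsegs; i++) {
        if (!segs[i].needs_replication)
            continue;
        if (segs[i].live)
            replicate_segment(i);
        segs[i].needs_replication = 0;
    }
}
```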

  28. I·LFS Lazy Redundancy Experiment • I·LFS avoids performance penalty for short-lived files

  29. Outline • ERAID Overview • I·LFS Overview • Functionality and Evaluation • On-line expansion • Dynamic parallelism • Flexible redundancy • Lazy redundancy • Conclusion

  30. Comparison with Traditional Systems • Can a traditional file system / storage system stack provide the same functionality? • On-line expansion: yes • Dynamic parallelism (heterogeneous storage): yes, but with duplicated functionality • Flexible redundancy: no, the storage system is not aware of file composition • Lazy redundancy: no, the storage system is not aware of file deletions

  31. Conclusion • Introduced ERAID and I·LFS • Extra information enables new functionality • Difficult or impossible in traditional systems • Minimal complexity • 19% increase in code size • Time to re-examine the division of labor

  32. Questions? http://www.cs.wisc.edu/wind/