
Design and Implementation of a Log-Structured File System

This paper presents the design and implementation of a log-structured file system that addresses two problems with existing file systems: information spread too widely across the disk and synchronous writing. It covers file location and reading, free space management, segment cleaning mechanisms and policies, crash recovery, checkpoints, and roll-forward.


Presentation Transcript


  1. The Design and Implementation of a Log-Structured File System - Mendel Rosenblum and John K. Ousterhout 서동화 dhdh0113@gmail.com

  2. Contents • Background • Problem • Design of log-structured file system • File location and reading • Free space management • Segment cleaning mechanism & policies • Crash recovery • Checkpoints • Roll-forward • Result

  3. Background • Disk & file system? • Seek time, rotational latency, transmission time • The structures and rules used to manage groups of information and their names are called a "file system". • A file system controls how data is stored and retrieved. • (Figure: physical vs. logical view of the disk.)

  4. Problem • Over the last decade CPU speeds have increased dramatically while disk access times have improved only slowly. • This trend is likely to continue, and it will cause more and more applications to become disk-bound. • Main memory is increasing in size at an exponential rate, and modern file systems cache recently-used file data in main memory. • Larger main memories make larger file caches possible. • To lessen the impact of this problem, the authors devised a new disk storage management technique called a "log-structured file system".

  5. Problem with existing file systems • Current file systems suffer from two general problems that make it hard for them to cope with these technologies and workloads. • First, they spread information around the disk in a way that causes too many small accesses. • Second, they tend to write synchronously. • As a result, these problems make it hard for applications to benefit from faster CPUs and larger main memories.

  6. Design of log-structured file system • File location and reading • Summary of the major data structures stored on disk by Sprite LFS • A comparison between Sprite LFS and Unix FFS • Although the two layouts have the same logical structure, the log-structured file system produces a much more compact arrangement. As a result, the write performance of Sprite LFS is much better than that of Unix FFS.
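
To make the lookup path on slide 6 concrete, here is a minimal Python sketch of an inode map, the structure Sprite LFS uses to locate inodes that no longer have fixed disk addresses. The class names, the dict-based "disk", and read_file_block are illustrative assumptions, not Sprite LFS code.

from dataclasses import dataclass

@dataclass
class Inode:
    block_addresses: list  # log addresses of the file's data blocks

class InodeMap:
    """Maps inode number -> current log address of that inode's latest copy."""
    def __init__(self):
        self.table = {}

    def update(self, inode_number, log_address):
        # Called whenever a fresh copy of the inode is appended to the log.
        self.table[inode_number] = log_address

    def lookup(self, inode_number):
        return self.table[inode_number]

def read_file_block(imap, disk, inode_number, block_index):
    # Same logical structure as Unix FFS: inode map -> inode -> data block.
    inode = disk[imap.lookup(inode_number)]
    return disk[inode.block_addresses[block_index]]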

  7. Design of log-structured file system • Free space management • The goal of free space management is to maintain large free extents for writing new data. • Segment cleaning mechanism • The process of copying live data out of a segment is called "segment cleaning": • 1) Read a number of segments into memory. • 2) Identify the live data. • 3) Write the live data back to a smaller number of clean segments. • Sprite LFS writes a "segment summary block" as part of each segment. • The summary block identifies each piece of information that is written in the segment. • Sprite LFS also uses the segment summary information to distinguish live blocks from those that have been overwritten or deleted.
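
A hedged Python sketch of the three cleaning steps above. Checking liveness against a table of current block addresses follows the slide's description of segment summary blocks, but the Block fields and function signatures are assumptions for illustration, not the Sprite LFS on-disk layout.

from dataclasses import dataclass

@dataclass
class Block:
    inode_number: int
    block_number: int
    address: int   # log address at which this copy of the block was written
    data: bytes

def is_live(block, current_addresses):
    # A block is live only if the file system still points at this copy;
    # overwritten or deleted blocks no longer match and are dropped.
    return current_addresses.get((block.inode_number, block.block_number)) == block.address

def clean_segments(segments, current_addresses, write_clean_segment, blocks_per_segment=512):
    live = []
    for segment in segments:                 # 1) read a number of segments into memory
        for block in segment:                # 2) identify the live data via the summary info
            if is_live(block, current_addresses):
                live.append(block)
    # 3) write the live data back to a smaller number of clean segments
    for i in range(0, len(live), blocks_per_segment):
        write_clean_segment(live[i:i + blocks_per_segment])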

  8. Design of log-structured file system • Segment cleaning policies • Which segments should be cleaned? • How should the live blocks be grouped when they are written out? • Write cost • The write cost is the average amount of time the disk is busy per byte of new data written, including all the cleaning overheads: write cost = 2 / (1 - u), where u is the utilization of the segments being cleaned and 0 <= u < 1. • The write cost is very sensitive to u. • The performance of a log-structured file system can be improved by reducing the overall utilization of the disk space. • Trade-off between disk space cost and performance. • Need a bimodal segment distribution: most segments nearly full, while the cleaner works with a few nearly empty ones.
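
The write-cost formula referenced above is 2 / (1 - u); a small Python helper (the function name is mine, the formula is the paper's) shows how sharply the cost grows with segment utilization:

def write_cost(u):
    # Write cost for cleaning segments of utilization u (0 <= u < 1):
    # read the whole segment (1), rewrite the live fraction (u), write new data (1 - u),
    # all to gain (1 - u) of new space, so cost = (1 + u + (1 - u)) / (1 - u) = 2 / (1 - u).
    # (For u == 0 the segment need not be read at all, and the cost is 1.0.)
    assert 0 <= u < 1
    return 1.0 if u == 0 else 2 / (1 - u)

print(write_cost(0.2))   # 2.5  -- cleaning mostly-empty segments is cheap
print(write_cost(0.8))   # 10.0 -- cleaning mostly-full segments is very expensive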

  9. Design of log-structured file system • Simulation results • Uniform: each file has an equal likelihood of being selected in each step. • Uses the greedy policy. • Hot-and-cold: this access pattern models a simple form of locality. • Uses the greedy policy, and the cleaner also sorts the live data by age before writing it out again. • We realized that hot and cold segments must be treated differently by the cleaner. • It is less beneficial to clean a hot segment because the data will likely die quickly and the free space will rapidly re-accumulate. • The stability of data can be estimated by its age. • Free space in a cold segment is more valuable than free space in a hot segment.
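
A tiny sketch of the two simulated cleaners discussed above: the greedy choice of least-utilized segments, and the age sort applied to live data before it is rewritten. The attribute names (.utilization, .age) are assumptions for illustration.

def choose_segments_greedy(segments, how_many):
    # Greedy policy: always clean the least-utilized segments.
    return sorted(segments, key=lambda s: s.utilization)[:how_many]

def sort_live_blocks_by_age(live_blocks):
    # Hot-and-cold refinement: writing live data out in age order tends to
    # segregate cold (old, stable) data from hot data, so cold segments stay clean longer.
    return sorted(live_blocks, key=lambda b: b.age)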

  10. Design of log-structured file system • Cost-benefit policy • Benefit of cleaning a segment = free space generated × age of the data = (1 - u) × age. • Cost of cleaning the segment = 1 + u (one unit to read the segment, u to write back the live data). • Choose the segments with the highest benefit-to-cost ratio: ((1 - u) × age) / (1 + u), where u is the utilization of the segment and 0 <= u < 1. • Segment usage table • In order to sort live blocks by age, the segment summary information records the age of the youngest block written to the segment.
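
A minimal sketch of the cost-benefit selection described above, assuming the segment usage table is available as (segment_id, utilization, age) entries; the tuple layout is an assumption, the ratio is the one on the slide.

def cost_benefit(u, age):
    # benefit = free space generated * age of data = (1 - u) * age
    # cost    = 1 + u  (read the whole segment, write back the live fraction u)
    # where u is the segment's utilization, 0 <= u < 1
    return (1 - u) * age / (1 + u)

def choose_segments_to_clean(segment_usage_table, how_many):
    # segment_usage_table: list of (segment_id, utilization, age_of_youngest_block) tuples.
    ranked = sorted(segment_usage_table,
                    key=lambda e: cost_benefit(e[1], e[2]),
                    reverse=True)
    return [segment_id for segment_id, _, _ in ranked[:how_many]]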

  11. Design of log-structured file system • Crash recovery • When a system crash occurs, the last few operations performed on the disk may have left the file system in an inconsistent state. • During reboot the operating system must review these operations in order to correct any inconsistencies. • In LFS, the locations of the last disk operations are easy to determine: they are at the end of the log. • Checkpoints • A checkpoint is a position in the log at which all of the file system structures are consistent and complete. • To create a checkpoint, LFS writes out all modified information to the log and then writes a checkpoint region to a special fixed position on disk. • Roll-forward • In order to recover as much information as possible, LFS scans through the log segments that were written after the last checkpoint. • During roll-forward, LFS uses the information in segment summary blocks to recover recently-written file data. • Restore consistency between directory entries and inodes.
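
To make the recovery sequence above concrete, here is a heavily simplified Python sketch: reload the last checkpoint, then roll forward through segments written after it, using the segment summary information to re-apply recent updates. All of the object interfaces (checkpoint_region, log, summary fields) are assumptions for illustration, not Sprite LFS structures.

def recover(checkpoint_region, log):
    # Checkpoint: a consistent snapshot of the inode map, segment usage table,
    # and the log position up to which everything is known to be complete.
    state = checkpoint_region.load()
    # Roll-forward: scan the segments written after the checkpoint.
    for segment in log.segments_after(state.checkpoint_position):
        for inode_number, new_inode_address in segment.summary.inode_updates:
            # Recently written files become visible again by pointing the
            # inode map at the inode copies found in the log.
            state.inode_map[inode_number] = new_inode_address
        state.segment_usage_table.record(segment)
    # Directory-operation records (not shown) are then used to restore
    # consistency between directory entries and inodes.
    return state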

  12. Result • Experience with Sprite LFS • Small files (1 KB) • Cleaning overheads • Large files (100 MB)
