670 likes | 682 Views
File System Implementations. CS-502, Operating Systems Fall 2009 (EMC) (Slides include materials from Modern Operating Systems , 3 rd ed., by Andrew Tanenbaum and from Operating System Concepts , 7 th ed., by Silbershatz, Galvin, & Gagne). Review – the File abstraction.
E N D
File System Implementations CS-502, Operating SystemsFall 2009 (EMC) (Slides include materials from Modern Operating Systems, 3rd ed., by Andrew Tanenbaum and from Operating System Concepts, 7th ed., by Silbershatz, Galvin, & Gagne) File System Implementations
Review – the File abstraction • A (potentially) large amount of information or data that lives a (potentially) very long time • Often much larger than the memory of the computer • Often much longer than any computation • Sometimes longer than life of machine itself • (Usually) organized as a linear array of bytes or blocks • Internal structure is imposed by application • (Occasionally) blocks may be variable length • (Often) requiring concurrent access by multiple processes • Even by processes on different machines! File System Implementations
File Systems and Disks • User view • File is a named,persistent collection of data • OS & file system view • File is collection of disk blocks — i.e., a container • File System maps file names and offsets to disk blocks File System Implementations
This topic – Implementation of Files • Create file abstraction using physical disk devices and disk blocks • Efficient in time, space, use of disk resources • Fast enough for application requirements • Must be scalable to a wide variety of file sizes • Many small files (< 1 page) • Huge files (100’s of gigabytes, terabytes, spanning disks) • Everything in between File System Implementations
Reading Assignment • Tanenbaum §4.3 File System Implementations
Overview • File Allocation methods • Sequential • Linked and FAT • Indexed • Free blocks & bad blocks • Scalability • Mounting & Virtual File Systems • Directory implementation notes File System Implementations
Definition — Volume • The fundamental unit of a file system • Physical volume may be a • physical disk storage device • physical partition of a single disk (aka minidisk) • Logical Volume is • A physical volume • A combination of other volumes • Usually similar in size and characteristics • Volume is also used (loosely) to denote • the part of the file system implemented on a physical or logical volume File System Implementations
File Allocation Schemes • Contiguous • Blocks of file stored in consecutive disk sectors • Directory points to first entry & specifies length • Linked • Blocks of file scattered across disk, as linked list • Directory points to first entry • Indexed • Blocks of file scattered across disk • Separate index block contains pointers to file blocks • Directory points to index block File System Implementations
File Allocation Schemes (continued) • The allocation scheme is an attribute of a file system, not of individual files within a system. • All files within a file system follow same allocation model File System Implementations
File Allocation Schemes • Contiguous • Blocks of file stored in consecutive disk sectors • Directory points to first entry • Linked • Blocks of file scattered across disk, as linked list • Directory points to first entry • Indexed • Separate index block contains pointers to file blocks • Directory points to index block File System Implementations
Contiguous Allocation • Ideal for large, static files • Databases, fixed system structures, OS code • Multi-media video and audio • CD-ROM, DVD • Simple address calculation • Directory entry points to first sector • File block i disk sector address • Fast multi-block reads and writes • Minimize seeks between blocks File System Implementations
Contiguously Allocated Files File System Implementations
Block-to-sector Calculation • To find disk sector containing block i of file f • Starting_block(f) + i • Starting block of each file is named in • Directory, or • File metadata File System Implementations
File Creation(Contiguous File System) • Search for an empty sequence of blocks • First-fit • Best-fit • Prone to fragmentation when … • Files come and go • Files change size • Similar to physical memory allocation in base-limit type of virtual memory File System Implementations
Contiguous Allocation – Extents • Extent: a contiguously allocated subset of a file • Directory entry points to • (For file with one extent) the extent itself • (For file with multiple extents) pointer to an extent block describing multiple extents • Advantages • Speed, ease of address calculation of contiguous file • Avoids (some of) the fragmentation issues • Can be adapted to support files across multiple disks • … File System Implementations
Contiguous Allocation – Extents • … • Disadvantages • Too many extents degenerates to indexed allocation • As in Unix-like systems, but not so well • Popular in 1960s & 70s • OS/360, other systems for commercial data processing • Currently used for large files in NTFS • Rarely mentioned in textbooks • Silbershatz, 7th ed., §11.4.1 & 22.5.1 File System Implementations
Digression: Bad Block Management • Bad blocks on disks are inevitable • Part of manufacturing process (less than 1%) • Most are detected during formatting • Occasionally, blocks become bad during operation • Manufacturers typically add extra tracks to disks • Physical capacity = (1 + x) * rated_capacity • Who handles bad blocks? • Disk controller: Bad block list maintained internally • Automatically substitutes good blocks • Formatter: Re-organize track to avoid bad blocks • OS: Bad block list maintained by OS, bad blocks never used File System Implementations
Bad Block Management inContiguous Allocation File Systems • Bad blocks must be concealed • Foul up the block-to-sector calculation • Methods • Look-aside list of bad sectors • Check each sector request against hash table • If present, substitute a replacement sector behind the scenes • Spare sectors in each track, remapped by formatting • Handling • Disk controller, invisible to OS • Lower levels of OS; concealed from higher layers of file system and from application File System Implementations
Questions? File System Implementations
File Allocation Schemes • Contiguous • Blocks of file stored in consecutive disk sectors • Directory points to first entry • Linked • Blocks of file scattered across disk, as linked list • Directory points to first entry • Indexed • Separate index block contains pointers to file blocks • Directory points to index block File System Implementations
Blocks scattered across disk Each block contains pointer to next block Directory points to first and last blocks Sector header: Pointer to next block ID and block number of file 10 16 25 01 Linked Allocation File System Implementations
Linked Allocation • Advantages • No external fragmentation of file space! • Easy to create, extend files • Ideal for lots of small files • Disadvantages • Lots of disk arm movement • Space taken up by links • Sequential access only! • Random access simulated by caching links • Used in Xerox Alto file system File System Implementations
Bad Block Management –Linked File Systems • In OS:– format all sectors of disk • Don’t reserve any spare sectors • Allocate bad blocks to a hidden file for the purpose • If a block becomes bad, “append” to the hidden file • Advantages • Very simple • No look-aside or sector remapping needed • Totally transparent without any hidden mechanism File System Implementations
Linked File System – Limitation • To access the ith block, it is necessary to physically read the first (i1) blocks from disk! • Serious problem for large files! File System Implementations
Solution – File Allocation Table (FAT) • Instead of link on each block, put all links in one table — the FAT • Each entry corresponds to physical block in disk • ith entry contains link for block i • Each entry points to next block (or EOF) • Directory points to first & last blocks of file File System Implementations
FAT File Systems • Advantages • Advantages of Linked File System • FAT can be cached in memory • Searchable at CPU speeds pseudo-random access • Disadvantages • Limited size, not suitable for very large disks • FAT cache describes entire disk, not just open files! • Not fast enough for large databases • Used in MS-DOS, early Windows systems • Also USB Flash drives, floppy disks, etc. File System Implementations
Bad Block Management –FAT File Systems • Same as Linked File Systems • I.e., format all sectors of disk • Don’t reserve any spare sectors • Allocate bad blocks to a hidden file for the purpose • If a block becomes bad, append to the hidden file • Same advantages and disadvantages File System Implementations
Disk Defragmentation • Re-organize blocks in disk so that file is (mostly) contiguous • Link or FAT organization preserved • Purpose: • To reduce disk arm movement during sequential accesses • To permit contiguous reads or writes in one disk operation • Does not change the linked structure of the file system! File System Implementations
Exam Question (last spring) • You have a humongous database stored in a file on a 4 GB flash drive with a FAT file system. What must the file system do to locate block n of the database? • Assume that database has not been defragmented, so that its blocks are likely to be scattered randomly across the flash drive. • Given that the file system has found the location of block n, what must it do to find the location of block n+1? block n-1? File System Implementations
Questions? Linked and FAT File Systems File System Implementations
File Allocation Schemes • Contiguous • Blocks of file stored in consecutive disk sectors • Directory points to first entry • Linked • Blocks of file scattered across disk, as linked list • Directory points to first entry • Indexed • Separate index block contains pointers to file blocks • Directory points to index block File System Implementations
Problems and Issues • Contiguous file systems not suitable for files that come and go rapidly & frequently • Linked file systems not suitable for very large disks or a lot of random access • Can we do better? File System Implementations
Indexed Allocation • i-node: • Contains file metadata • Lists sector address of each block of file • Advantages • True random access • Only i-nodes of open files need to be cached • Supports small and large files File System Implementations
Unix/Linux i-nodes • Direct blocks: • Pointers to first n sectors • Single indirect table: • Extra block containing pointers to blocks n+1 .. n+m • Double indirect table: • Extra block containing single indirect blocks • … File System Implementations
Indexed Allocation • Access to every block of file is via i-node • Bad block management • Similar to Linked/FAT systems • Disadvantage • Not as fast as contiguous allocation for large databases • Requires reference to i-node for every access vs. • Simple calculation of block to sector address File System Implementations
Indexed Allocation (continued) • Widely used in Unix, Linux, Windows NTFS • Robust • Has withstood the test of time • Many variations File System Implementations
Questions? File System Implementations
Free Block Management in File Systems • Bitmap • Very compact on disk • Expensive to search • Supports contiguous allocation • Free list • Linked list of free blocks • Each block contains pointer to next free block • Only head of list needs to be cached in memory • Very fast to search and allocate • Contiguous allocation very difficult File System Implementations
Free Block ManagementBit Vector 0 1 2 n-1 … 0 block[i] free 1 block[i] occupied bit[i] = Free block number calculation (number of bits per word) * (number of 0-value words) + offset of first 1 bit File System Implementations
Free Block ManagementBit Vector (continued) • Bit map • Must be kept both in memory and on disk • Copy in memory and disk may differ • Cannot allow for block[i] to have a situation where bit[i] = 1 in memory and bit[i] = 0 on disk File System Implementations
Free Block ManagementBit Vector (continued) • Solution: • Set bit[i] = 1 in disk • Allocate block[i] • Set bit[i] = 1 in memory • Similarly for set of contiguous blocks • Potential for lost blocks in event of crash! • Discussion:– How do we solve this problem? File System Implementations
Free Block ManagementLinked List • Linked list of free blocks • Not necessarily in order! • Cache first few free blocks in memory • Head of list must be stored both • On disk • In memory • Each block must be written to disk when freed • Potential for losing blocks? File System Implementations
Free Block Management – Linked List(continued) • Can also be implemented in FAT • Can also be implemented in Indexed File system File System Implementations
Reading Assignment • Tanenbaum §4.3 File System Implementations
Scalability of File Systems • Question: How large can a file be? • Answer: limited by • Number of bits in length field in metadata • Size & number of block entries in FAT or i-node • Question: How large can file system be? • Answer: limited by • Size & number of block entries in FAT or i-node File System Implementations
MS-DOS & Windows • FAT-12 (primarily on floppy disks): • 4096 512-byte blocks • Only 4086 blocks usable! • FAT-16(early hard drives): • 64 K blocks; block sizes up to 32 K bytes • 2 GBytes max per partition, 4 partitions per disk • FAT-32(Windows 95) • 228 blocks; up to 2 TBytes per disk • Max size FAT requires 232 bytes in RAM! File System Implementations
MS-DOS File System (continued) • Maximum partition for different block sizes • The empty boxes represent forbidden combinations File System Implementations
Classical Unix • Maximum number of i-nodes = 64K! • How many files in a modern PC? • I-node structure allows very large files, but … • Limited by size of internal fields • Limited by # of entries in triple-indirect block File System Implementations
Modern Operating Systems • Need much larger, more flexible file systems • Many terabytes per system • Multi-terabyte files • Suitable for both large and small • Cache only open files in RAM File System Implementations
Examples of Modern File Systems • Windows NTFS • Tanenbaum §11.8 • Linux ext2 and ext3 • Tanenbaum §10.6.3 • Other file systems … • Consult your favorite Linux system documentation File System Implementations