Chapter 11: File System Implementation • Chapter 11.1 • File-System Structure • File-System Implementation • Directory Implementation • Allocation Methods • Chapter 11.2 • Free-Space Management • Recovery • Log-Structured File Systems • Chapter 12 – An Introduction • Overview of Mass Storage • Disk • Magnetic Tapes
Free-Space Management • We’ve talked about file-system structure and file-system implementation. • Now: how is free space managed? • What space is available, how much, and where are the free blocks located? • In other words, we need some kind of storage map. • Keeping track of free space requires a free-space list. • The free-space list is a list of unallocated disk blocks. • Free-space list: a list? Some other kind of structure? • Let’s consider a couple of alternative structures used to manage free disk space.
Bit Vector Approach – used for space allocation • A bit vector (used to represent n blocks, numbered 0, 1, 2, …, n-1) is one approach. Each block is represented by a single bit: • bit[i] = 0: block[i] is free; bit[i] = 1: block[i] is occupied (allocated). • A sample bit vector might appear as 0011110011111100110011…. • This is a very simple approach, and it is efficient at finding the first free block. • Also, a number of instruction sets contain instructions for bit manipulation. • Here we can see how hardware features drive software functionality. • Downside: to be efficient in searching, the bit map must be kept in primary memory. • For small disks this is not a problem, but for larger disks (say a 40GB disk with 1KB blocks) a bit map of 5MB is needed!
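A minimal Python sketch of the bit-vector idea, assuming the convention bit[i] = 0 for a free block and bit[i] = 1 for an occupied block. The class and function names are illustrative only and do not come from the slides; a real file system would scan whole machine words with hardware bit instructions rather than bytes.

```python
class BitVector:
    """Free-space bit vector: bit i is 0 when block i is free, 1 when occupied."""

    def __init__(self, n_blocks):
        self.n_blocks = n_blocks
        self.bits = bytearray((n_blocks + 7) // 8)  # all blocks start free

    def is_free(self, i):
        return (self.bits[i // 8] >> (i % 8)) & 1 == 0

    def mark_used(self, i):
        self.bits[i // 8] |= 1 << (i % 8)

    def mark_free(self, i):
        self.bits[i // 8] &= ~(1 << (i % 8)) & 0xFF

    def first_free(self):
        """Return the lowest-numbered free block, or None if every block is occupied."""
        for byte_index, byte in enumerate(self.bits):
            if byte != 0xFF:  # skip bytes whose eight blocks are all occupied
                for bit in range(8):
                    i = byte_index * 8 + bit
                    if i < self.n_blocks and not (byte >> bit) & 1:
                        return i
        return None


def bitmap_bytes(disk_bytes, block_bytes):
    """Size of the bit map in bytes: one bit per block."""
    return disk_bytes // block_bytes // 8
```

With binary units, `bitmap_bytes(40 * 2**30, 2**10)` gives 5 * 2**20 bytes, which is the 5MB figure quoted for a 40GB disk with 1KB blocks.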
Linked List Approach used for Allocation • This approach links all free space together via a linked list. • All we really need in memory (cached) is a pointer to the first free block. • Each free block then contains a pointer to the next free block. • It is easy, however, to quickly see inefficiencies… • To allocate a lot of disk, we must read each block. This is an incredible amount of I/O and is very inefficient from a performance perspective! • If only a small amount of storage is necessary, as is usually the case, the linked list approach to disk allocation may be reasonably efficient. • Very often only a single block is requested. • The FAT approach uses this idea – each entry in the FAT points to the next block. • With the linked list approach, we cannot get contiguous space easily.
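The cost argument above can be made concrete with a small simulation. This is only a sketch: the "disk" is a Python dictionary, the read counter is an assumption added for illustration, and the names are hypothetical.

```python
class LinkedFreeList:
    """Free blocks chained on a simulated disk; only the head pointer lives in memory."""

    def __init__(self, free_blocks):
        self.next_ptr = {}   # simulated on-disk contents: block -> next free block
        self.head = None     # the one pointer kept in memory
        self.disk_reads = 0
        for block in reversed(free_blocks):
            self.next_ptr[block] = self.head
            self.head = block

    def allocate(self):
        """Take the first free block; one disk read is needed to fetch its next pointer."""
        if self.head is None:
            raise RuntimeError("no free blocks")
        block = self.head
        self.disk_reads += 1
        self.head = self.next_ptr.pop(block)
        return block

    def free(self, block):
        """Return a block by pushing it onto the front of the chain."""
        self.next_ptr[block] = self.head
        self.head = block
```

Allocating k blocks costs k reads in this model, which is acceptable for a single block but expensive for a large request.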
Implementing the Bit-Mapped Approach • Need to protect these structures very closely. • We can implement the bit-mapped approach with a pointer to this free list. • The bit map itself should be kept on disk. • Must ensure that the copy in memory and the copy on disk do not differ. • Cannot allow a situation where, for block[i], bit[i] = 1 in memory and bit[i] = 0 on disk. • Solution: • Set bit[i] = 1 on disk • Allocate block[i] • Set bit[i] = 1 in memory
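A short sketch of why the three-step order matters. The `crash_after` parameter is a hypothetical device for stopping the allocation partway through, and the lists and set stand in for the on-disk map, the in-memory map, and the blocks actually handed to files.

```python
def allocate_block(i, disk_bits, mem_bits, in_use, crash_after=None):
    """Allocate block i in the order: disk bit, allocation, memory bit.

    crash_after (1 or 2) simulates a crash right after that step.
    Returns True if the allocation ran to completion.
    """
    disk_bits[i] = 1          # step 1: mark the block occupied on disk first
    if crash_after == 1:
        return False
    in_use.add(i)             # step 2: hand the block to the file
    if crash_after == 2:
        return False
    mem_bits[i] = 1           # step 3: update the in-memory copy last
    return True


def disk_map_is_safe(disk_bits, in_use):
    """The on-disk map must never show an in-use block as free."""
    return all(disk_bits[i] == 1 for i in in_use)
```

Whichever step the crash follows, the on-disk map may at worst show a block as occupied that no file owns (a leak that a consistency check can reclaim); it never shows an in-use block as free.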
Implementation of Linked List and Similar Schemes • First Block: Maintain a pointer to the first free block in memory (preferably in cache) • simple to program • time-consuming to execute • Each link points to the next link • Grouping – Store the addresses of n free blocks in the first free block. The first n-1 of these blocks are actually free; the last block contains the addresses of another n free blocks. • This greatly improves performance over the standard linked list. • Counting – Usually, several contiguous blocks are allocated / freed at the same time – particularly when space is allocated using a contiguous allocation scheme or via clustering. • If this is the case, we can keep the address of the first free block and the number of free contiguous blocks that follow it. • As a result, each entry in the free-space list contains a disk address and a count of blocks ‘starting’ at that spot – thus shortening the list considerably.
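The counting scheme can be sketched as a list of (start address, count) pairs. The function names and the first-fit policy below are assumptions chosen for illustration; the slides do not prescribe an allocation policy.

```python
def to_extents(free_blocks):
    """Collapse free block numbers into (start, count) runs of contiguous blocks."""
    extents = []
    for block in sorted(free_blocks):
        if extents and extents[-1][0] + extents[-1][1] == block:
            start, count = extents[-1]
            extents[-1] = (start, count + 1)   # block extends the current run
        else:
            extents.append((block, 1))         # block starts a new run
    return extents


def allocate_contiguous(extents, n):
    """First fit: carve n contiguous blocks out of the list; return the start or None."""
    for idx, (start, count) in enumerate(extents):
        if count >= n:
            if count == n:
                del extents[idx]
            else:
                extents[idx] = (start + n, count - n)
            return start
    return None
```

For example, the fifteen free blocks 2-5, 8-13, 17-18 and 25-27 collapse to four entries, and a request for contiguous space becomes a scan of the short list rather than a walk of a per-block chain.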
Linked Free-Space List on Disk • The figure shows that the address of the first free block points into the disk area, where the free blocks are linked from one to the next.
Recovery • I cannot say enough about Recovery. • It is absolutely essential in any non-trivial computing environment. • Disks can and do fail for any number of reasons – power losses, surges, bad sectors on disk, dust, weather, and even malicious intent. • There is nothing more sacred in the world of computers than our data. • Programs can usually be reproduced, people replaced; but safeguarding our data and keeping it consistent is critical. • Backup and Recovery – two topics – are often not covered in detail in academic environments. But rest assured, in a production environment, these activities constitute a daily routine and involve planning and established procedures. • I cannot emphasize these topics enough!
Consistency Checking – Recovery • During operations, directory information is kept both in memory and on disk – and it is usually more ‘current’ in memory. • If the directory is kept in cache, it is usually not written back to disk every time it is updated. This would negate the performance gains of the cache. • Systems can and do crash at the absolute worst times. If / when they do, files, directories, caches, and buffers can easily be left in a very inconsistent state. • Operating systems such as Unix and MS-DOS provide system programs to run consistency checks when needed. • These compare directory data with the data blocks on disk and attempt to repair any inconsistencies they find.
Consistency Checking – Recovery • The degree of success in running these system-supplied programs is largely dependent upon the type of allocation (contiguous, linked, indexed) as well as the free-space management routines used to allocate disk. • Sometimes broken links can be repaired; sometimes not. • Loss of a directory entry in an indexed organization can be disastrous, because the data blocks have no knowledge of one another and the file cannot be rebuilt from them. • In a linked organization, if a link is broken, we have big trouble. • Interestingly, Unix caches directory entries for reads, but any writes that cause any kind of space allocation are done synchronously. • This simply means that the allocation is successfully completed on disk before the corresponding data write takes place. • We can still have a problem if a crash occurs during this process. • But we always try to minimize and localize any problems.
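The comparison of directory data with on-disk allocation data can be sketched as a set comparison. This is a simplified illustration, not how any real checker is implemented: the directory is assumed to be a mapping from file name to block list, and the allocation map is assumed to be a set of blocks marked in use.

```python
from collections import Counter


def check_consistency(directory, occupied):
    """Compare the blocks files claim against the blocks the allocation map marks in use.

    directory: dict mapping file name -> list of block numbers
    occupied:  set of block numbers marked occupied in the free-space map
    """
    refs = Counter(b for blocks in directory.values() for b in blocks)
    referenced = set(refs)
    return {
        # a file uses the block, but the map says it is free
        "marked_free_but_in_use": sorted(referenced - occupied),
        # the map says occupied, but no file claims the block
        "marked_used_but_orphaned": sorted(occupied - referenced),
        # two or more files claim the same block
        "claimed_by_multiple_files": sorted(b for b, n in refs.items() if n > 1),
    }
```

Orphaned blocks can usually be returned to the free list safely; the other two categories are the ones that may require human judgment.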
Backup and Restore • As mentioned in previous materials, all viable computing environments back up data from disk periodically to other media – other disks, mag tapes, etc. • We can then restore from these backups if needed. • Oftentimes directory information is used in developing the backup. • Files / directories not changed since the last backup need not be backed up again. • Schedule: depends on lots of things… • May do periodic full backups. • May do incremental backups – all files / directories changed since the last backup, usually run overnight. • May simply copy to another medium all files changed since day n. • Since the volume of data is often large, these copies may be written in very large blocks! • Often, too, data are backed up to very low-cost, high-volume tapes.
Backup and Restore • We also have combinations of these: • Every so often, a full backup. • More frequently, an incremental backup. • A restore can start from the last full backup and then apply the incremental backups in order, oldest first.
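A small sketch of a restore from one full backup plus incrementals. Each backup is modeled as a dictionary from path to contents, and the use of `None` to record a deleted file is an assumption made for this example, not something the slides specify.

```python
def restore(full_backup, incrementals):
    """Rebuild file state from the last full backup plus later incrementals.

    full_backup:  dict mapping path -> contents
    incrementals: list of dicts, oldest first; a value of None marks a deletion
    """
    state = dict(full_backup)          # start from the full backup
    for inc in incrementals:           # then replay each incremental in order
        for path, contents in inc.items():
            if contents is None:
                state.pop(path, None)
            else:
                state[path] = contents
    return state
```

The order matters: applying a later incremental before an earlier one could resurrect a stale version of a file.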
Log-Structured File Systems – Motivation • Typical database systems use log-based recovery algorithms as part of their environment and operation. • We use the same techniques as part of our consistency-checking approach for backup and recovery when needed. • Common operations such as creating a file involve many steps and changes to several key data structures associated with the file system. • Certainly the directory structure will be modified, file control blocks are allocated and descriptions are developed, data itself are modified, and the free-block counts kept in the corresponding data structures must be decremented. • All of these things can be corrupted if a system crash occurs somewhere in this process.
Log-Structured File Systems • We have spoken about backup and recovery and special system programs that are generally available to assist in re-establishing files and directories that are consistent. • It is important to note that in some cases, problems (inconsistencies) may not be recoverable. • Sometimes human intervention is required and the system may simply be unavailable until a recovery of some sort is established. • One solution is to incorporate a log-based recovery approach, which captures changes written sequentially to a log. • Each set of operations for performing a specific task is called a transaction. • Once changes are written to the log, they are considered ‘committed’, and the system call that writes these changes to the log may then return to the user process. • But the problem is that the file system itself may not be updated yet, as the updates to the file system itself take place asynchronously. • Once a committed transaction is completed, it can be removed from the log file. • (It is recommended that the log be separated from the file system itself and perhaps kept on a different drive – in case the drive goes down.)
Restoring Consistency • If the system crashes, all transactions remaining in the log must still be performed, if any are present. • Even though they were committed by the operating system, they may well not have been applied to the file system itself. • Difficulties arise if a transaction was aborted, that is, not committed, before the system crashed. • Any partial ‘completion’ of such a transaction in the file system must be undone in order to arrive at a consistent state.
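Replaying the log after a crash can be sketched as follows. The record layout (`("write", txid, key, value)` and `("commit", txid)`) is hypothetical, and the sketch assumes a redo-only design in which changes reach the file system only after commit, so an uncommitted transaction is simply skipped rather than undone. A design that applies changes before commit would also need the undo step described above.

```python
def recover(log, fs):
    """Replay every committed transaction in the log onto the file-system state.

    log: list of ("write", txid, key, value) and ("commit", txid) records, in order
    fs:  dict standing in for the on-disk file-system structures
    """
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    for rec in log:
        if rec[0] == "write" and rec[1] in committed:
            _, _, key, value = rec
            fs[key] = value        # redo; harmless if it was already applied
    return fs
```

Because each write simply sets a value, replaying the same log twice leaves the same state, so a second crash during recovery does no harm in this model.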
Chapter 12: Mass-Storage Systems • Overview of Mass Storage Structure • Disk Structure • Disk Attachment • Disk Scheduling • Disk Management • Swap-Space Management • RAID Structure • Stable-Storage Implementation • Tertiary Storage Devices • Operating System Issues • Performance Issues
Limited Objectives • We can view a file system as possessing three components: • A user / programmer interface to the file system • The internal data structures and algorithms used by the operating system to implement this interface, and • The secondary and tertiary storage structures themselves • Here we will describe the physical structure of secondary and tertiary storage devices and the resulting effects on the uses of these devices
Overview of Mass Storage Structure • Magnetic disks provide the bulk of secondary storage of modern computers • Drives rotate at 60 to 200 times per second • Transfer rate is the rate at which data flow between drive and computer • Positioning time (random-access time) is the time to move the disk arm to the desired cylinder (seek time) plus the time for the desired sector to rotate under the disk head (rotational latency) • A disk consists of a central spindle with platters attached. • Data is recorded on the top and bottom surface of each platter except the top surface of the top platter and the bottom surface of the bottom platter (these outermost surfaces are left unused because of dust). • The read/write heads ‘float’ over the surface of the platters, and all arms move together in unison with the arm assembly. • The set of tracks that are ‘carved out’ at each arm position forms a cylinder. • Each track may contain hundreds of sectors, depending on the size of the sectors. • See next slide.
Moving-Head Disk Mechanism • Discuss
Disk Access • The disk spins at a high speed – somewhere between 60 and 200 revolutions per second. • A disk read consists of four components: • 1. Seek time – the movement of the arm to the correct cylinder • 2. Head select – negligible • 3. Rotational delay (latency) – on average, half the time of one rotation – until the desired sector / block moves under the read/write head • 4. Data transfer time (copying the data from the disk into the I/O controller unit) • Oftentimes, head select is not counted, because it is done electronically. But a head must still be selected, to determine which head reads which surface!
Disk Head Crashes • The read/write heads float over a surface. • A head crash results from the disk head making contact with the disk surface. • This can happen if power is abruptly pulled, although more modern devices store some power so that they can degrade gracefully… • Some disks are removable; this allows other disks (disk packs) to be mounted on the same disk drive. • Some are ‘permanent’ disks in an organization. • These are generally faster, have more capacity, and are not application-dependent. • Floppy Disks – inexpensive, removable magnetic disks where the head sits directly on the disk surface. • Floppies rotate much more slowly and have much less capacity than hard disks.
More on Disk • The drive is attached to the computer via an I/O bus • Buses are the vehicle that supports data transfer; transfers are carried out by special processors called I/O (disk) controllers. • A host controller is located at the computer end of the bus. • The host controller uses the bus to talk to the disk controller built into the drive or storage array • The computer places a command into the host controller, typically using memory-mapped I/O ports; the host controller then sends the command via messages to the disk controller. • The disk controller operates the disk drive hardware to carry out the command. • Disk controllers usually have a built-in cache to support • data transfer between the cache and the disk surface and • data transfer between the cache and the host – • the direction depending on whether we are reading or writing.
Last Look • One more bit of very interesting data: the relative times the disk spends in performing a read / write: • Seek time – nominally 20 msec • Rotational delay (latency) – 8 msec • Data transfer – 0.2 msec • Overall access: 28.2 msec. • Easy to see the emphasis on reducing seek time when allocating disk space! • This is why many allocation schemes allocate in what is called “cylinder mode,” so that successive surfaces lie in the same cylinder, thus avoiding the need to do a seek – substituting only a head select in its place!
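The figures above can be checked with a few lines of arithmetic. This sketch takes the transfer time as 0.2 msec, which is what the 28.2 msec total implies, and uses the 60 revolutions per second quoted earlier for the rotation speed; the function names are illustrative.

```python
def avg_rotational_latency_ms(revs_per_second):
    """Average rotational delay: half the time of one full rotation, in msec."""
    return 0.5 * 1000.0 / revs_per_second


def access_time_ms(seek_ms, latency_ms, transfer_ms):
    """Total access time; head select is treated as negligible."""
    return seek_ms + latency_ms + transfer_ms
```

At 60 revolutions per second one rotation takes about 16.7 msec, so the average latency is about 8.3 msec, in line with the nominal 8 msec. Seek time is roughly 70 percent of the 28.2 msec total, which is why avoiding seeks pays off most.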
Overview of Mass Storage Structure (Cont.) • Magnetic tape • Was early secondary-storage medium • Relatively permanent and holds large quantities of data • Access time slow, but again, can store huge quantities of data. • Random access ~1000 times slower than disk • Mainly used for backup, storage of infrequently-used data, transfer medium between systems • Please note that in years past, these constituted a primary storage medium for files – as long as they were sequential!! • Kept in spool and wound or rewound past read-write head • Once data under head, transfer rates comparable to disk • 20-200GB typical storage
Well… • This is about all the time we have for this course! • Take care, and I hope you have learned a lot. • It was my pleasure to work with you!