File Systems - PowerPoint PPT Presentation

file systems n.
Download
Skip this Video
Loading SlideShow in 5 Seconds..
File Systems PowerPoint Presentation
Download Presentation
File Systems

play fullscreen
1 / 71
File Systems
140 Views
Download Presentation
cira
Download Presentation

File Systems

- - - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript

  1. File Systems 6.1 Files 6.2 Directories 6.3 File system implementation 6.4 Example file systems Chapter 6

  2. Files • Requirements for long term information storage: • Must store large amounts of data • Information stored must survive the termination of the process using it • Multiple processes must be able to access the information concurrently • Solution: Store information on disk or other external media in units called files. • The file system is a part of operating system that manages files.

  3. Files • File naming - filename.file extention • File structure • unstructured sequence of bytes. MS-DOS/UNIX • record sequence. CP/M • B-tree.

  4. File Naming Typical file extensions.

  5. File Structure • Three kinds of files • byte sequence • record sequence • tree

  6. Files • File types • Regular files contain user information either ASCII or binary. • Directories are system files for maintaining the structure of the file system. • Character special files are used to model serial I/O devices such as terminals, printers, and networks. • Block special files are used to model disks. • A UNIX executable file stats with a magicnumber, identifying the file as an executable file.

  7. File Types (a) An executable file (b) An archive

  8. File Access • Sequential access • read all bytes/records from the beginning • cannot jump around, could rewind or back up • convenient when medium was magnetic tape • Random access • bytes/records read in any order • essential for database systems • Two methods are used for specifying where to start reading. • read and then move file marker • move file marker (seek), then read

  9. File Attributes • Operating systems associate extra information with each file, called file attributes. Possible file attributes

  10. Create Delete Open Close Read Write Append Seek Get attributes Set Attributes Rename File Operations

  11. An Example Program Using File System Calls

  12. An Example Program Using File System Calls

  13. Memory-Mapped Files (Ex11.c) • To facilitate access to files, systems provides system calls to map files into the address space of a running process and remove (unmap) the files from the address space. • File services as a backing store for the process and when the process finishes all mapped, modified pages are written back to their files. • Advantage: eliminate need for I/O. • Disadvantage: • difficult to know size of output file. In case of all zeroes,  10 0's ?? or 100 0's ?? • Mapped file modified by one process is read differently by another. Two processes need to see consistent views of the file. • file may be too large to fit.

  14. Memory-Mapped Files (a) Segmented process before mapping files into its address space (b) Process after mapping existing file abc into one segment creating new segment for xyz

  15. Directories • File systems have directories or folders to keep track of files. • A single-level directory has one directory (root) containing all the files. • A two-level directory has a root directory and user directories. • A hierarchical directory has a root directory and arbitrary number of subdirectories. • Two different methods are used to specify file names in a directory tree: • Absolute path name consists of the path from the root directory to the file. • Relative path name consists of the path from the current directory (working directory).

  16. Directories - A single level directory system • A single level directory system • contains 4 files • owned by 3 different people, A, B, and C

  17. Two-level Directory Systems Letters indicate owners of the directories and files

  18. Hierarchical Directory Systems A hierarchical directory system

  19. Directories • The path name would be written: • Winodws \usr\ast\mailbox • UNIX /usr/ast/mailbox • MULTICS >usr>ast>mailbox • Dot and dot dot are two special entries in the file system. • Dot (.) refers to the current directory. • Dot dot (..) refers to its parent.

  20. Path Names A UNIX directory tree

  21. Create Delete Opendir Closedir Readdir Rename Link Unlink Directory Operations

  22. File System Implementation • File system layout: • MRB (Master Boot Record) is used to boot the computer. • The partition table gives the starting and ending addresses of each partition. • Partitions: • The first block, boot block, of the active partition is read in by the MRB program when the system is booted. • The superblock contains all the key parameters about the file system. • Free blocks information • i-nodes tells all about the file. • Root directory • Directories and files

  23. File System Implementation A possible file system layout

  24. File System Implementation • Implementing file storage is keeping track of which disk blocks go with which files. • Contiguous Allocation - store each file as contiguous block of data. • Advantage: • Simple to implement • Read performance is excellent • Disadvantage: • Disk fragmentation • The maximum file size must be known when file is created. • Example: CD-ROMs, DVDs, and write-once optical media • Linked List Allocation - keep linked list of disk blocks • Disadvantage: • random access slow • amount of data in a block not a power of 2

  25. Implementing Files (a) Contiguous allocation of disk space for 7 files (b) State of the disk after files D and E have been removed

  26. Implementing Files Storing a file as a linked list of disk blocks

  27. File System Implementation • Linked List Allocation using an index - take table pointer word from each block and put them in an index table, FAT (File Allocation Table) in memory. • Disadvantage - entire table must be in memory all the time • I-node (index-node) lists the attributes and disk addresses of the file's blocks.

  28. Implementing Files Linked list allocation using a file allocation table in RAM

  29. Implementing Files An example i-node

  30. Implementation directories • When a file is opened, the file system uses the path name to locate the directory entry. • The directory provides information needed to find the disk blocks. • disk address of the entire file (contiguous blocks) • the number of first block (linked list) • the number of i-node (i-node) • Where to store attributes? In directory or i-node?

  31. Implementing Directories (a) A simple directory – MS-DOS/Windows fixed size entries disk addresses and attributes in directory entry (b) Directory in which each entry just refers to an i-node - UNIX

  32. Implementation directories • Handling long file names in a directory: • Fixed-length names (Waste space) • In-line (When a file is removed, a variable-sized gap is introduced.) • Heap (The heap management needs extra effort.) • How to search files in each directory? • Linearly • Hash table • Cache the results of searches

  33. Implementing Directories • Two ways of handling long file names in directory • (a) In-line • (b) In a heap

  34. Shared files • A shared file is used to allow a file to appear in several directories. • The connection between a directory and the shared file is called a link. The file system is a Directed Acyclic Graph (DAG). • Problem: If directories contain disk address, a copy of the disk address will have to be made in directory B. What if A or B append the file, the new blocks will only appear in one directory.

  35. Shared files • Solution: • Do not list disk block addresses in directories but in a little data structure. (use i-nodes)  (hard link) • Create a new file of type link which contains the path name of the file to which it is linked symbolic linking

  36. Shared Files File system containing a shared file

  37. Shared files • ln file1 file2 • Problem of hard link - when should i-node be removed? Suppose A: rm file2  could set count = 1 and leave i-node intact. when count = 0, delete file and i-node. • Problem above does not occur in symbolic link because only the owner directory has a pointer to i-node. The problem is extra overhead in the traversing path. • Other problem is having multiple copies of a file may set copied when dumping an files onto a disk (tar). • do not descend path involving symbolic links.

  38. Shared Files (a) Situation prior to linking (b) After the link is created (c)After the original owner removes the file

  39. Disk space management • Strategies for storing an n byte file: • Allocate n consecutive bytes of disk space - segment • Allocate a number [n/k] blocks of size k bytes each - paging • problem – if the file grows it will have to be moved on the disk, it is an expensive operation and causes external fragmentation. => • All file systems chop files up into fixed-size blocks that need not to be adjacent.

  40. Block size • When block size increase, disk space utilization decrease (space efficiency decrease and internal fragmentation). • When block size decrease, data transfer rate decrease (time efficiency decrease) • usual size k = 512bytes, 1k (UNIX), or 2k

  41. Disk Space Management • Dark line (left hand scale) gives data rate of a disk • Dotted line (right hand scale) gives disk space efficiency • All files 2KB Block size

  42. Block size • Example: disk with 131072 bytes per track. rotation time = 8.33 msec average seek time = 10 msec.  time to read a block of k bytes = 10 + 8.33/2 + (k/131072) * 8.33 = 10 + 4.165 + k/131072 * 8.33 If k = 1 KB = 1024 bytes = 14.165 + 1024/131072 * 8.33 = 14.165 + 0.065 = 14.23 msec • Disk space efficiency = % of block used by data. • Observation: Assume that all files are 1 kbytes, on average 1/2 of last block is empty.

  43. Keeping Track OF Free Blocks • Use linked list of disk blocks: each block holds as many free disk block numbers as will fit. • With 1 KB block and 32-bit disk block number  1024 * 8/32 = 256 disk block numbers  255 free blocks (and) 1 next block pointer. • Use bit-map: A disk with (n) blocks requires a bit map with (n) bits • Free blocks are represented by 1's • Allocated blocks represented by 0's • 16GB disk has 224 1-KB and requires 224 bits  2048 blocks • Using a linked list = 224/255 = 65793 blocks. However, these blocks can be freed up as the disk is filled up. • Bit map generally better if it can be kept completely in memory.

  44. Disk Space Management (a) Storing the free list on a linked list (b) A bit map

  45. Disk Space Management (a) Almost-full block of pointers to free disk blocks in RAM - three blocks of pointers on disk (b) Result of freeing a 3-block file (c) Alternative strategy for handling 3 free blocks - shaded entries are pointers to free disk blocks

  46. Disk Space Management Quotas for keeping track of each user’s disk use

  47. File System Reliability • The loss of a file system can be catastrophic. • Methods to safeguard a file system: • Bad Block Management • Backups • Bad Block Management • Hardware solution - dedicate a sector to a "bad block list“ when disk controller is initiated, the bad block list is read and a spare block is picked to replace each bad block. The mapping is recorded in the bad block list. • Software solution - user or file system carefully construct a file containing all the bad blocks

  48. File System Reliability • Backups are made to handle: recover from disaster or stupidity. • Considerations of backups • Entire or part of the file system • Incremental dumps: dump only files that have changed • Compression • Backup an active file system • Security

  49. File System Reliability • Two strategies can be used for dumping a disk to tape: • Physical dump: starts at block 0 to the last one. • Advantages: simple and fast • Disadvantages: backup everything • Logical dump: starts at one or more specified directories and recursively dumps all files and directories found that have changed since some given base date.

  50. File System Reliability • A file system to be dumped • squares are directories, circles are files • shaded items, modified since last dump • each directory & file labeled by i-node number File that has not changed