
Local Filesystems



Presentation Transcript


  1. Local Filesystems CIS657

  2. Filesystem Layers • [Figure: three layers labeled Stackable Filesystems, Local Filesystems, File Stores] • Stackable filesystems provide composable operations and flexible name spaces • Local filesystems are the “foundations” of the name space • File stores deal with the layout of the blocks on the disk

  3. Common Features in Local Filesystems • Hierarchical naming • Locking • Quotas • Attribute management • Protection • Provided by UFS in BSD Unix • (Unix File System = Berkeley FFS)

  4. Filesystem Operations • Pathname searching: lookup • Name creation: create, mknod, link, symlink, mkdir • Name change/deletion: rename, remove, rmdir • Attribute manipulation: access, getattr, setattr • Object interpretation: open, readdir, readlink, mmap, close • Process control: advlock, ioctl, select • Object management: lock, unlock, inactive, reclaim, abortop

  5. Lookup • We’ve seen filesystem-independent lookup in our discussion of vnodes • Defer filesystem-dependent discussion until we see inodes

  6. Name Creation • create: creates regular files and AF_LOCAL domain sockets • link, symlink: add names to existing objects • mknod: creates character special devices • mkdir: creates directories

  7. Name Change & Deletion • rename: delete a name (not the object!) in one location and create a new name in another • remove: removes a name; if this is the last reference to an object, remove the object too • rmdir: removes directories

  8. Attribute Manipulation • getattr: get attributes (number of links, timestamps, flags, uid, gid, etc.) • setattr: set attributes (can’t set all) • access: check whether process can read/write file

  9. Object Interpretation • open/close: for special files, tell the device driver about device activation or shutdown • readdir: reads fs-specific directory structure into standard form • readlink: returns contents of symlink • mmap: prepares object for mapping into process address space

  10. Process Control of Objects • select: find out if an object is ready for I/O • ioctl: pass request to specialized device (the catchall call) • advlock: get or release an advisory lock on an object (what’s an advisory lock?)

  11. Object Management • inactive/reclaim: covered under vnode discussion • lock/unlock: lock and unlock objects (such as directories); ignored in stateless filesystems such as NFS

  12. Inodes • Index nodes (inodes) are the core of the local filesystem management of files • [Figure: kernel data structures (process descriptor, open file entry, vnode, inode) and, on disk, the inode data]

  13. Inode fields • Type, access mode • File’s owner • Group-access identifier • Timestamp last read/written • Timestamp inode most recently updated • File size • Number of blocks used by file (including indirect blocks) • Number of references to the file • Flags (e.g. immutable) • Generation number • Block size of data blocks for the inode • Size of extended attributes • What’s missing? Why?

  14. [Figure: inode layout. Fields: mode, owners, tstamps, size, direct blocks, single ind., double ind., triple ind., block cnt, blocksize, ref cnt, xattr size, flags, xattr blcks, generation. The direct and indirect block pointers lead to data blocks.]
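The direct, single, double, and triple indirect pointers named on slide 14 imply a simple calculation: given a logical block number, which level of indirection reaches it? The following is a minimal sketch. The constants `NDADDR` (number of direct pointers) and `NINDIR` (pointers per indirect block) are assumptions chosen for illustration; the slides do not give either value.

```python
NDADDR = 12      # assumed number of direct block pointers (not stated on the slide)
NINDIR = 2048    # assumed pointers per indirect block (hypothetical value)


def indirection_level(lbn, ndaddr=NDADDR, nindir=NINDIR):
    """Return (level, offset) for logical block number lbn.

    level 0 = direct, 1 = single, 2 = double, 3 = triple indirect;
    offset is the block's index within that level's range.
    """
    if lbn < 0:
        raise ValueError("negative block number")
    if lbn < ndaddr:
        return 0, lbn
    lbn -= ndaddr
    span = nindir                  # blocks reachable through a single indirect block
    for level in (1, 2, 3):
        if lbn < span:
            return level, lbn
        lbn -= span
        span *= nindir             # each extra level multiplies the reach
    raise ValueError("block number beyond triple indirect range")
```

With these assumed constants, block 11 is still direct, block 12 is the first one that needs the single indirect block, and so on.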

  15. Inode Management • Like most other things, inodes are cached • Kept on hash table in the kernel (hashed on inode and device numbers) • When vnode’s inactive() or reclaim() is called, this passes through to inactivate or reclaim the inode.
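As a rough illustration of the caching described on slide 15, here is a toy cache keyed on the (device number, inode number) pair. The class name `InodeCache`, the `read_from_disk` callback, and the `misses` counter are all hypothetical; a Python dict stands in for the kernel hash table.

```python
class InodeCache:
    """Toy in-memory inode cache keyed by (device number, inode number)."""

    def __init__(self, read_from_disk):
        self._table = {}                      # stands in for the kernel hash table
        self._read_from_disk = read_from_disk # hypothetical callback: (dev, ino) -> inode
        self.misses = 0

    def get(self, dev, ino):
        key = (dev, ino)
        inode = self._table.get(key)
        if inode is None:                     # miss: fetch the inode and remember it
            self.misses += 1
            inode = self._read_from_disk(dev, ino)
            self._table[key] = inode
        return inode

    def reclaim(self, dev, ino):
        """Drop the cached inode, as when the vnode's reclaim() passes through."""
        self._table.pop((dev, ino), None)
```

Both numbers are needed in the key because inode numbers are only unique within one filesystem.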

  16. Naming • Consider this simple file system tree • [Figure: directory tree rooted at inode 2; entries shown: ., .., vmunix, usr, foo, bin, ex, groff, vi; inode numbers shown: 2, 4, 5, 6, 7, 9, 10]

  17. Directories • Allocated in chunks • Chunk holds at least one directory entry • Entries may not span chunks • Linked list of entries • Each entry holds: index into inode structure, type of entry, size of entry, size of file name in bytes, name of file

  18. Directory Chunks and Entries • [Figure: a directory chunk with three entries: (#, file, 5, foo.c), (#, dir, 3, bar), (#, file, 3, biz); an empty directory chunk, shown as “0 ?”]

  19. Name Lookups • A (the most?) common request is for name lookup in directories • Kernel iterates through the directory entries • Compare lengths • If match, compare names • When found, put into name cache (remember, with positive and negative entries?)
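The scan on slide 19 can be sketched as follows. `DirEntry` and its field names are hypothetical stand-ins for the on-disk entry, and treating inode number 0 as an unused slot is an assumption of this sketch. The length comparison comes first because it is cheaper than comparing the names.

```python
from collections import namedtuple

# Hypothetical in-memory form of a directory entry; field names are illustrative.
DirEntry = namedtuple("DirEntry", "ino type name")


def dir_lookup(entries, name):
    """Linear scan: compare lengths first, compare names only on a length match."""
    wanted = name.encode()
    for entry in entries:
        if entry.ino == 0:                    # assumed marker for an unused slot
            continue
        if len(entry.name) != len(wanted):    # cheap rejection
            continue
        if entry.name == wanted:
            return entry.ino
    return None                               # caller could record a negative cache entry
```

A caller would typically store the result, found or not, in the name cache so that repeated lookups skip the scan.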

  20. Looking Up All Entries In a Dir • The kernel optimizes requests for all entries in a directory by maintaining a “last lookup” offset • Start next search at the last lookup • Makes sequential access O(n) instead of O(n²)
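To see where the O(n) versus O(n²) claim comes from, the sketch below counts how many entries are examined when every name in a directory is looked up in directory order, with and without a remembered offset. The function `scan_cost` is hypothetical, and starting the next search just past the previous hit is an assumption about how the offset is used.

```python
def scan_cost(names, remember_offset):
    """Count entries examined when every name is looked up in directory order."""
    examined = 0
    start = 0
    n = len(names)
    for wanted in names:
        for step in range(n):
            idx = (start + step) % n          # wrap around to the start of the directory
            examined += 1
            if names[idx] == wanted:
                if remember_offset:
                    start = (idx + 1) % n     # next search begins after this hit
                break
    return examined
```

For a directory of n entries, the remembered offset finds each name on the first probe, for n probes in total. Restarting from the beginning every time costs 1 + 2 + … + n = n(n+1)/2 probes.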

  21. Pathname Translation: /usr/bin/vi • See pg. 312 in book & overhead

  22. Links • One inode per file • Multiple names possible—links • A directory entry is a hard link • When last link to a file is removed, the inode is deallocated • Recap: name == link == dir entry

  23. Links in Action: Initial Situation • [Figure: /home/sjc/foo, /home/pam/biz, and /home/sdo/bar are three directory entries pointing to one file inode; ref count = 3]

  24. Links in Action: “rm /home/pam/biz” • [Figure: biz is gone from /home/pam; /home/sjc/foo and /home/sdo/bar still point to the file inode; ref count = 2]

  25. Links in Action: “touch /home/pam/biz” • [Figure: /home/pam/biz now points to a new file inode with ref count = 1; /home/sjc/foo and /home/sdo/bar still point to the original inode with ref count = 2]

  26. To Reestablish the Link • Use ln (link) command: ln /home/sjc/foo /home/pam/biz • [Figure: /home/sjc/foo, /home/pam/biz, and /home/sdo/bar all point to the original file inode; ref count = 3]
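The rm / touch / ln sequence of slides 23 to 26 can be replayed from user space with the standard `os.link` and `os.stat` calls. This is only a sketch: it assumes a POSIX filesystem that supports hard links, uses one temporary directory instead of the three home directories, and removes the freshly created biz before re-linking, because a link cannot be created over an existing name.

```python
import os
import tempfile


def link_counts():
    """Replay the rm / touch / ln sequence and record link counts after each step."""
    counts = {}
    with tempfile.TemporaryDirectory() as d:
        foo, biz, bar = (os.path.join(d, n) for n in ("foo", "biz", "bar"))
        with open(foo, "w") as f:
            f.write("data")
        os.link(foo, biz)                     # second name for the same inode
        os.link(foo, bar)                     # third name
        counts["initial"] = os.stat(foo).st_nlink
        os.remove(biz)                        # removes a name, not the object
        counts["after rm"] = os.stat(foo).st_nlink
        open(biz, "w").close()                # like touch: creates a brand-new file
        counts["after touch"] = (os.stat(foo).st_nlink, os.stat(biz).st_nlink)
        counts["touch shares inode"] = os.stat(foo).st_ino == os.stat(biz).st_ino
        os.remove(biz)                        # the new file must go before re-linking
        os.link(foo, biz)
        counts["after ln"] = os.stat(foo).st_nlink
        counts["ln shares inode"] = os.stat(foo).st_ino == os.stat(biz).st_ino
    return counts
```

The `st_nlink` field reported by `os.stat` is the reference count shown in the figures.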

  27. Symbolic (Soft) Links • “Just files” • “type” field in directory entry indicates this is a symlink • File contains a pathname • Prepend the contents of the file to the remainder of the pathname • If an absolute path, use that path • If relative, interpret relative to the directory where the link was found
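The prepend rule on slide 27 amounts to a small string operation. The helper below, `expand_symlink`, is a hypothetical name, and the sketch only builds the pathname that translation would continue with; it deliberately does not collapse ".." components, since those have to be resolved by looking them up.

```python
import posixpath


def expand_symlink(link_dir, target, remainder):
    """Return the pathname to keep translating after hitting a symlink.

    link_dir  -- directory in which the symlink was found
    target    -- contents of the symlink
    remainder -- relative pathname still to be translated ("" if none)
    """
    if target.startswith("/"):
        base = target                             # absolute: use that path as is
    else:
        base = posixpath.join(link_dir, target)   # relative: start from the link's directory
    return posixpath.join(base, remainder) if remainder else base
```

For example, a link found in /usr whose contents are `bin`, followed by the remainder `vi`, yields /usr/bin/vi.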

  28. Example Symbolic Links • [Figure: /home/sjc/foo points to a file inode with ref count = 1; /home/sdo/bleargh is a symlink with its own inode (ref count = 1) whose contents are the pathname /home/sjc/foo]

  29. Symlinks in Action: Initial Situation • [Figure: /home/sjc/foo points to a file inode with ref count = 1; /home/pam/biz is a symlink (ref count = 1) containing /home/sjc/foo]

  30. Symlinks in Action: “rm /home/sjc/foo” • [Figure: foo and its file inode are gone; /home/pam/biz (ref count = 1) still contains /home/sjc/foo, which no longer resolves (marked X)]

  31. Links in Action: “touch /home/sjc/foo” • [Figure: /home/sjc/foo points to a file inode with ref count = 1; /home/pam/biz (ref count = 1) contains /home/sjc/foo and resolves again] • Note: foo is now a new file.
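The dangling-link sequence of slides 29 to 31 can also be replayed with the standard `os.symlink`, `os.readlink`, and `os.path` calls. The sketch assumes a POSIX system where symlinks can be created and uses a single temporary directory; `symlink_states` is a hypothetical helper name.

```python
import os
import tempfile


def symlink_states():
    """Return (is_link, target_exists) for biz after each step, plus the link contents check."""
    states = []
    with tempfile.TemporaryDirectory() as d:
        foo = os.path.join(d, "foo")
        biz = os.path.join(d, "biz")
        open(foo, "w").close()
        os.symlink(foo, biz)                  # biz is a file whose contents are foo's path
        states.append((os.path.islink(biz), os.path.exists(biz)))   # initial situation
        os.remove(foo)
        states.append((os.path.islink(biz), os.path.exists(biz)))   # dangling link
        open(foo, "w").close()                # a new file under the old name
        states.append((os.path.islink(biz), os.path.exists(biz)))   # resolves again
        contents_match = os.readlink(biz) == foo
    return states, contents_match
```

`os.path.exists` follows the symlink, so it reports False while the target is missing, whereas `os.path.islink` examines the link itself and stays True throughout.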

  32. Treatment of Symbolic Links • In almost all cases, a system call on a symbolic link is passed through to the file referenced by the symlink • Symbolic links can form loops in the filesystem (hard links can’t) • Symbolic links can refer to other filesystems (hard links can’t) • A shell often tracks traversal of symbolic links through the “cd” command—why?

  33. Quotas • Limit the amount of file space used by users and by groups • Hard limit: the level of usage at which no further allocation can be done • Soft limit: the level of usage at which a warning is generated; if usage stays above the soft limit for a long time, the soft limit is enforced as the hard limit

  34. Quotas II • Separate quotas for both data blocks and inodes • Check quota at allocation time • Check user quota first • Then check group quota • If either fails, return error up as if filesystem were full • Kept in files in the root of the filesystem
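The allocation-time check from slides 33 and 34 might look roughly like this. Everything here is a sketch: the `Quota` class, the `check` and `allocate` functions, and the seven-day `GRACE` value are hypothetical, and the same code would apply to inode quotas as to block quotas.

```python
from dataclasses import dataclass
from typing import Optional

GRACE = 7 * 24 * 3600     # assumed grace period in seconds (hypothetical value)


@dataclass
class Quota:
    soft: int                         # soft limit
    hard: int                         # hard limit
    used: int = 0                     # current usage
    deadline: Optional[float] = None  # when the soft limit starts being enforced


def check(q, want, now):
    """Return 'ok', 'warn', or 'deny' for charging `want` more units to q."""
    total = q.used + want
    if total > q.hard:
        return "deny"
    if total > q.soft:
        if q.deadline is not None and now > q.deadline:
            return "deny"             # soft limit is now enforced as the hard limit
        return "warn"
    return "ok"


def allocate(user_q, group_q, want, now):
    """Check the user quota first, then the group quota; charge both or neither."""
    verdicts = []
    for q in (user_q, group_q):
        verdict = check(q, want, now)
        if verdict == "deny":
            return "deny"             # reported to the caller as if the filesystem were full
        verdicts.append(verdict)
    for q, verdict in zip((user_q, group_q), verdicts):
        q.used += want
        if verdict == "ok":
            q.deadline = None
        elif q.deadline is None:
            q.deadline = now + GRACE  # start the clock on the first soft-limit violation
    return "warn" if "warn" in verdicts else "ok"
```

Charging both quotas only after both checks pass keeps the user and group counts consistent when the group check fails.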

  35. Quota File Structures • [Figure: struct mount for / points to a struct ufs_mount, which references the vnode for /quota.user; struct mount for /usr points to a struct ufs_mount, which references the vnodes for /usr/quota.user and /usr/quota.group; struct mount for /arch points to a struct nfs_mount]

  36. A Quota Record • [Figure: the quota file holds one record per id (uid 0, uid 1, uid 2, …, uid i, …, uid n); each record contains: block quota (soft limit), block quota (hard limit), current number of blocks, time to begin enforcing block quota, inode quota (soft limit), inode quota (hard limit), current number of inodes, time to begin enforcing inode quota]
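Because the quota file on slide 36 is laid out as one fixed-size record per uid, a record can be found by multiplying the uid by the record size. The sketch below assumes eight unsigned 32-bit little-endian fields in the order listed on the slide; the field widths, byte order, and field names are assumptions made for illustration, not the real on-disk format.

```python
import struct

# Field order follows the slide; names and 32-bit widths are assumptions.
FIELDS = ("block_soft", "block_hard", "blocks_used", "block_time",
          "inode_soft", "inode_hard", "inodes_used", "inode_time")
RECORD = struct.Struct("<8I")


def record_offset(uid):
    """Byte offset of the record for `uid` in the quota file."""
    return uid * RECORD.size


def read_record(buf, uid):
    """Decode the record for `uid` from a bytes-like quota file image."""
    return dict(zip(FIELDS, RECORD.unpack_from(buf, record_offset(uid))))


def write_record(buf, uid, **changes):
    """Update selected fields of the record for `uid` in a mutable buffer."""
    record = read_record(buf, uid)
    record.update(changes)
    RECORD.pack_into(buf, record_offset(uid), *(record[name] for name in FIELDS))
```

A `bytearray` of `RECORD.size * n` bytes is enough to try it out for uids 0 through n-1.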

  37. dquot entries • dquot entries hold active quotas in kernel memory (cache, as usual) • Fast lookup via hash table • Loaded when file is first opened for writing • Checked on each write • Save pointer to dquot structs in the inode

  38. File Locking • Locks (advisory!) can be placed on an arbitrary byte range in a file • The range lock structure for a file gives a list of active locks • A list of pending locks hangs off of each active lock

  39. Example Lock Structures • [Figure: the inode’s i_lockf heads a list of active locks linked by lf_next: (type=EX, ID=1, range=1,3), (type=SH, ID=2, range=7,12), (type=SH, ID=3, range=7,14); pending locks hang off lf_block pointers: (type=SH, ID=4, range=3,10) and (type=EX, ID=1, range=9,12)]

  40. Another Lock…Deadlock! • [Figure: the lock structures of the previous slide, with a new request (type=SH, ID=2, range=3,5)] • Check for deadlock by looking for cycles.
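Looking for cycles can be sketched as a reachability test on a wait-for graph. The graph below is one reading of the figure and should be treated as an assumption: ID 4 waits on ID 1 (whose exclusive lock covers byte 3), and ID 1 waits on IDs 2 and 3 (whose shared locks cover bytes 9 to 12). The function name `would_deadlock` is hypothetical.

```python
def would_deadlock(waits_for, requester, holder):
    """True if making `requester` wait on `holder` would close a cycle.

    waits_for maps an owner ID to the set of owner IDs it is already blocked on.
    """
    seen = set()
    stack = [holder]
    while stack:
        owner = stack.pop()
        if owner == requester:        # the holder is, directly or not, waiting on us
            return True
        if owner in seen:
            continue
        seen.add(owner)
        stack.extend(waits_for.get(owner, ()))
    return False
```

Under that reading, the new shared request from ID 2 on range 3,5 conflicts with ID 1's exclusive lock on 1,3, and ID 1 is already waiting on ID 2, so the request closes a cycle.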

  41. Five Possibilities for Lock Overlap on Acquisition • Direct: exact match • Subset: new lock range is entirely contained within old lock range • Superset: new lock range entirely contains old lock range • Extend Past: new lock range starts part way into and extends past old lock range • Extend Into: new lock range starts before and extends into old lock range
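The five cases can be told apart with a handful of comparisons. This is a sketch that assumes ranges are inclusive (start, end) pairs, as the range=1,3 notation on the other slides suggests; the function name `classify` and the extra "none" result for non-overlapping ranges are additions for illustration.

```python
def classify(old, new):
    """Classify how `new` overlaps `old`; both are inclusive (start, end) byte ranges."""
    old_start, old_end = old
    new_start, new_end = new
    if new_end < old_start or new_start > old_end:
        return "none"
    if (new_start, new_end) == (old_start, old_end):
        return "direct"
    if new_start >= old_start and new_end <= old_end:
        return "subset"
    if new_start <= old_start and new_end >= old_end:
        return "superset"
    if new_start > old_start:
        return "extend past"      # starts inside old and runs past its end
    return "extend into"          # starts before old and ends inside it
```

The order of the tests matters: once direct, subset, and superset are ruled out, the only remaining question is which end of the old range the new one crosses.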

  42. Five Possibilities for Lock Overlap on Acquisition II • [Figure: for each case (Exact, Subset, Superset, Past, Into), the Existing range, the New range, and what the lock list Becomes]

  43. Lock Overlap Example: Initial State • [Figure: on a byte axis from 0 to 20, i_lockf heads the active locks (type=EX, ID=1, range=1,3), (type=SH, ID=1, range=5,10), (type=SH, ID=1, range=12,19); a pending lock (type=EX, ID=2, range=3,12) hangs off lf_block]

  44. New Request • [Figure: on the same 0 to 20 axis, a new request (type=EX, ID=1, range=3,13)]

  45. Result • [Figure: on the same 0 to 20 axis, the active locks are now (type=EX, ID=1, range=1,2), (type=EX, ID=1, range=3,13), (type=SH, ID=1, range=14,19); the pending lock (type=EX, ID=2, range=3,12) still hangs off lf_block]
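The result on slide 45 can be reproduced with a short routine that trims or drops the owner's existing locks wherever the new lock overlaps them. This is a simplified sketch, not the kernel algorithm: it handles only one owner's locks, ignores conflicts with other owners (such as the pending ID 2 request), and does not merge adjacent locks of the same type. The name `acquire` and the (start, end, type) tuples are hypothetical.

```python
def acquire(held, new):
    """Return one owner's lock list after granting `new`.

    held -- list of (start, end, type) tuples with inclusive ranges
    new  -- the (start, end, type) being acquired by the same owner
    """
    new_start, new_end, _ = new
    result = []
    for start, end, kind in held:
        if end < new_start or start > new_end:
            result.append((start, end, kind))          # no overlap: keep as is
            continue
        if start < new_start:
            result.append((start, new_start - 1, kind))  # piece before the new lock
        if end > new_end:
            result.append((new_end + 1, end, kind))      # piece after the new lock
    result.append(new)
    return sorted(result)
```

Applied to the initial state of slide 43 with the request of slide 44, the first lock is trimmed to 1,2, the second is swallowed entirely, and the third is trimmed to 14,19.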

  46. Five Possibilities for Request/Lock Overlap on Release • Direct: exact match • Subset: release request is entirely contained within lock range • Superset: release request entirely contains lock range • Extend Past: release request starts part way into and extends past lock range • Extend Into: release request starts before and extends into lock range
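On release, each of the five cases leaves behind zero, one, or two pieces of the original lock. The sketch below again assumes inclusive (start, end) ranges and a single owner; `release` is a hypothetical name.

```python
def release(held, start, end):
    """Return one owner's lock list after releasing the inclusive range [start, end]."""
    result = []
    for lock_start, lock_end, kind in held:
        if lock_end < start or lock_start > end:
            result.append((lock_start, lock_end, kind))   # untouched by the release
            continue
        if lock_start < start:
            result.append((lock_start, start - 1, kind))  # front piece survives
        if lock_end > end:
            result.append((end + 1, lock_end, kind))      # back piece survives
    return result
```

A direct or superset release removes the lock, a subset release splits it in two, and the two extend cases shorten it from one end.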

  47. Five Possibilities for Request/Lock Overlap on Release II • [Figure: for each case (Exact, Subset, Superset, Past, Into), the Existing range, the New range, and what the lock list Becomes]
