
File Layer and Virtual File System




Presentation Transcript

  1. File Layer and Virtual File System Chapter Three

  2. Topics • File System Abstractions • File System Layers • The File Layer • The Virtual File System • Selected File Related Calls

  3. UNIX File Abstraction • The File • Stream of bytes • any record structure is imposed by application • Sequential or Random Access • The Directory Structure • Tree-like directory hierarchy • File sharing • hard links - multiple names for same disk file • soft (symbolic) links - stored path shortcut • Access control associated with the file
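The difference between the two kinds of link can be observed from user space. The following is a minimal sketch, assuming a POSIX system with a writable /tmp; the function name `link_demo` and the scratch paths are hypothetical and not part of the slides. A hard link is a second name for the same inode, while a symbolic link is a separate file that stores a path.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Returns 0 when the hard link shares the original's inode and the
 * soft link is a separate file of type "symbolic link"; -1 otherwise. */
int link_demo(void)
{
    char dir[] = "/tmp/link_demo_XXXXXX";   /* hypothetical scratch directory */
    char orig[64], hard[64], soft[64];
    struct stat so, sh, ss;
    int ok = -1;

    if (mkdtemp(dir) == NULL)
        return -1;
    snprintf(orig, sizeof orig, "%s/orig", dir);
    snprintf(hard, sizeof hard, "%s/hard", dir);
    snprintf(soft, sizeof soft, "%s/soft", dir);

    int fd = open(orig, O_CREAT | O_WRONLY, 0600);
    if (fd >= 0) {
        close(fd);
        if (link(orig, hard) == 0 &&          /* second name, same disk file */
            symlink(orig, soft) == 0 &&       /* new file holding a path */
            stat(orig, &so) == 0 && stat(hard, &sh) == 0 &&
            lstat(soft, &ss) == 0 &&          /* lstat: do not follow the link */
            so.st_dev == sh.st_dev && so.st_ino == sh.st_ino &&
            S_ISLNK(ss.st_mode))
            ok = 0;
    }
    /* Clean up; errors are ignored because this is only a demonstration. */
    unlink(soft);
    unlink(hard);
    unlink(orig);
    rmdir(dir);
    return ok;
}
```

Calling `link_demo()` is expected to return 0 on a file system that supports both link types.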

  4. File Related System Calls • open(), close() • creat(), unlink() • read(), write() • lseek() • getattr(), setattr() • mmap() • ioctl() • fsync() • dup(), dup2()

  5. File Descriptor • Application's name for an open file • small integer returned by open() • The first three file descriptors are • 0 -- standard input • 1 -- standard output • 2 -- standard error • These are usually associated with a terminal • Each has an associated offset or file position pointer
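A short sketch of how the offset relates to descriptors, assuming a POSIX system with a writable /tmp. The function name and scratch path are hypothetical. It shows that a descriptor copied with dup() refers to the same open file, so a write through one descriptor moves the position seen through the other.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <unistd.h>

/* Writes 5 bytes through one descriptor and returns the file position
 * reported through its dup()'d copy; returns -1 on any error. */
long shared_offset_after_write(void)
{
    char path[] = "/tmp/fd_demo_XXXXXX";   /* hypothetical scratch file */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);              /* the file persists until the last close() */

    int copy = dup(fd);        /* second descriptor, same open file */
    if (copy < 0) {
        close(fd);
        return -1;
    }

    long pos = -1;
    if (write(fd, "hello", 5) == 5)
        pos = (long)lseek(copy, 0, SEEK_CUR);   /* query offset via the copy */

    close(copy);
    close(fd);
    return pos;
}
```

Two independent open() calls on the same path would instead give each descriptor its own offset.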

  6. Types of Files • Regular • Directory • Block Special (Device) File • Character Special (Device) File • FIFO (Named Pipe) • Symbolic Link • Socket (In AF_UNIX Domain)
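As an illustration, the file types listed above map onto the POSIX `S_IS*` test macros applied to `st_mode`. This is a minimal sketch; the helper name `file_type_name` and the returned strings are assumptions made for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/stat.h>

/* Returns a short description of the file type of `path`.
 * lstat() is used so that a symbolic link is reported as such
 * instead of being followed. */
const char *file_type_name(const char *path)
{
    struct stat st;

    if (lstat(path, &st) != 0)
        return "unknown";
    if (S_ISREG(st.st_mode))  return "regular";
    if (S_ISDIR(st.st_mode))  return "directory";
    if (S_ISBLK(st.st_mode))  return "block special";
    if (S_ISCHR(st.st_mode))  return "character special";
    if (S_ISFIFO(st.st_mode)) return "fifo";
    if (S_ISLNK(st.st_mode))  return "symbolic link";
    if (S_ISSOCK(st.st_mode)) return "socket";
    return "unknown";
}
```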

  7. UNIX Disk Abstraction • Partitions • Subsets of the disk that may be treated as logical disk drives. • Partitioning a Large Disk • Overcomes 32-bit UNIX limit problems • Isolates directories • Decreases fsck time • disklabel utility writes/edits disk label • Partition identified by a special file • block: /dev/disk/dsk[number][partition_letter] • character: /dev/rdisk/dsk[number][partition_letter]

  8. UNIX File System Abstraction • Two Senses • A mountable directory hierarchy administered in /etc/fstab. • A specific implementation of the UNIX file abstraction (UFS, NFS, AdvFS, CDFS, etc.). • One file system is the root file system • Other file systems are grafted onto the root by mounting.

  9. File System System Calls • mount(), unmount() • sync()

  10. The Virtual File System: Transparent Access [Figure: DOS exposes volumes under separate drive letters (A:, B:, C:); UNIX shows partitions rz0a, rz0g and rz3c.]

  11. The Virtual File System: Uniform Access • Application process issues common system calls: open(), close(), read(), write(), lseek() • VFS passes each call to the specific file system type's implementation of the call

  12. File System Management Layers • System Call: read(), write(), etc. • File Layer: manages file access state for a given process • Virtual File System: represents file systems and files generically • File System(s): specific file system implementation (UFS, AdvFS, NFS, MFS, etc.) • Cache: in-memory block storage for a file system; could be the traditional buffer cache, the unified buffer cache, or home grown • Device: local block device, network interface, or a logical volume

  13. Digital UNIX File Systems • “True” Data File Systems • UNIX File System (UFS) • Network File System (NFS) • Advanced File System (AdvFS) • Memory File System (MFS) • CD File System, ISO 9660:1988 (CDFS) • Universal Disk Format (UDF) -- DVDFS • Pseudo-File Systems or Layers • Proc File System (procfs) • File Descriptor File System (FDFS) • File-on-File File System

  14. CDFS • Compact Disk File System • Support for common extensions • ISO 9660 Standard with Rock Ridge Extensions • Joliet (Microsoft) extensions • Multi-session (Kodak) CD format • Can be exported by NFS

  15. CDFS: ISO 9660 Layout • ISO 9660 layout consists of • Primary and Secondary volume descriptors (a.k.a. super blocks) • Path Tables • Directory and File Data • Directory records contain • Location of file or directory • Size • Length of extended attribute record (XAR) • Interleave attributes • Flags • File Name

  16. CDFS: Interleaved and Noninterleaved Data Layout [Figure: non-interleaved and interleaved layouts, each beginning with an XAR; the interleaved layout alternates Data and Gap units.] • XAR contents: • UID and GID • Access Permissions • Creation/Modification dates

  17. Memory File System (MFS) • Memory Only - No Permanent Storage • ufs format (in-memory) • Created with newfs • not wired - backed by swap • use: fast temporary directories • system /tmp • build areas • etc.

  18. MFS and swap [Figure labels: Physical Memory, swap.]

  19. The /proc File System • The /proc file system is useful for process tracing or debugging utilities, such as truss or dbx • Structures used by the /proc file system include: • prstatus: status of a traced task or thread • prrun: actions to be taken before a stopped task or thread is run • prpsinfo: information reported by ps

  20. File on File Mounting FS • Layer allowing the following to be mounted on a regular file: • regular files • character device files • block special device files • Provided for SVID Conformance • FIFOs are given names as files • see fattach(3) and fdetach(3)

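  21. File Layer and VFS Structures [Figure: proc, utask and uthread structures lead through uf_entry[ ][ ].ufe_ofile to a file structure; the file's f_data points to a vnode or to a UNIX domain socket; each vnode is paired with an inode (UFS) and belongs to a mount structure (ufs_mount); cdir, rdir, and the proc and session ttyvp also reference vnodes.]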

  22. Per Process File Descriptor “table” • Relates a task to an open file • Referenced through the utask structure • two-level tree structures • Beginning with V5.0 • entries are allocated when • a file is opened or a pipe or socket is created • inherited in a fork() • a descriptor is copied via dup() • entries are deallocated when • a file, pipe or socket is closed • a process terminates

  23. File structure • Records state of access to a file • Access mode (R/W) • Offset into file • Two tasks may share a single file structure • File structures inherited by child processes • Two uses of the file structure • for regular files • includes an ops vector for manipulating regular files • includes a pointer to a vnode • for sockets • includes an ops vector for manipulating sockets • includes a pointer to a socket

  24. File descriptor table within utask
      • Substructure of utask:
          struct ufile_state uu_file_state;
      • Ultimately references file entry structures:
          struct ufile_entry {
              struct file *ufe_ofile;
              struct socket_sel_queue *ufe_so_sel;
              int ufe_unused;
              int ufe_oflags;
              udecl_simple_lock_data(,ufe_ofile_lock)
          };

  25. ufile_state structure (1)
      struct ufile_state {
          udecl_simple_lock_data(,uf_ofile_lock)
          int utask_need_to_lock;
          int uf_first_available;   /* first available file descriptor */
          int uf_of_count;          /* number of overflow entries */
          int uf_flags;             /* marks pending changes in file descriptor table */
          int uf_references;        /* used to block table shrink */

  26. ufile_state structure (2)
          /* open file bit arrays: indicate which files are open */
          u_long uf_open_bits_lvl1;
          u_long *uf_popen_bits_lvl0;
          u_long *uf_popen_bits_lvl1;
          u_long uf_open_bits_lvl0;
          /* pointers to the file entries */
          struct ufile_entry *uf_entry[U_FE_ARRAY_SIZE];
          struct ufile_entry **uf_of_entry;
      }

  27. file structure (1)
      struct file {
          udecl_simple_lock_data(,f_incore_lock)
          int f_flag;
          uint_t f_count;          /* reference count */
          int f_type;              /* descriptor type */
          int f_msgcount;          /* references from message queue */
          struct ucred *f_cred;    /* descriptor's credentials */
          struct fileops *f_ops;   /* operations on f_data */
          caddr_t f_data;          /* vnode or socket */
          ....

  28. file structure (2)
          …
          union {                  /* offset or next free file struct */
              off_t fu_offset;
              struct file *fu_freef;
          } f_u;
          uint_t f_io_lock;        /* I/O lock (lower half of thread ptr) */
          int f_io_waiters;        /* number of waiters on I/O lock */
      };

  29. struct fileops
      struct fileops {
          int (*fo_read)();
          int (*fo_write)();
          int (*fo_ioctl)();
          int (*fo_select)();
          int (*fo_close)();
      };

  30. struct fileops Implementations
      • Regular Files: vfs/vfs_vnops.c
          struct fileops vnops = { vn_read, vn_write, vn_ioctl, vn_select, vn_close };
      • Sockets: bsd/sys_socket.c
          struct fileops socketops = { soo_read, soo_write, soo_ioctl, soo_select, soo_close };
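The ops-vector idea can be modelled in user space. The sketch below is not the kernel code: every `demo_` name is hypothetical, the vector is trimmed to two operations, and each "read" simply returns a tag naming the implementation that ran. It shows how a generic layer calls through `f_ops` without knowing whether the object behind the file structure is a vnode or a socket.

```c
#include <stddef.h>
#include <string.h>

struct demo_file;                     /* hypothetical stand-in for struct file */

struct demo_fileops {                 /* trimmed stand-in for struct fileops */
    int (*fo_read)(struct demo_file *fp, char *buf, size_t len);
    int (*fo_close)(struct demo_file *fp);
};

struct demo_file {
    const struct demo_fileops *f_ops; /* operations on f_data */
    void *f_data;                     /* vnode or socket; unused in this sketch */
    int f_open;                       /* 1 while open (assumption of the sketch) */
};

/* Copies `tag` into buf, truncating if needed; returns bytes copied. */
static int copy_tag(const char *tag, char *buf, size_t len)
{
    size_t n = strlen(tag);
    if (len == 0)
        return -1;
    if (n >= len)
        n = len - 1;
    memcpy(buf, tag, n);
    buf[n] = '\0';
    return (int)n;
}

static int demo_vn_read(struct demo_file *fp, char *buf, size_t len)
{
    (void)fp;
    return copy_tag("vnode", buf, len);
}

static int demo_soo_read(struct demo_file *fp, char *buf, size_t len)
{
    (void)fp;
    return copy_tag("socket", buf, len);
}

static int demo_close(struct demo_file *fp)
{
    fp->f_open = 0;
    return 0;
}

const struct demo_fileops demo_vnops     = { demo_vn_read,  demo_close };
const struct demo_fileops demo_socketops = { demo_soo_read, demo_close };

/* Generic layer: dispatches through the vector, never on the object type. */
int demo_read(struct demo_file *fp, char *buf, size_t len)
{
    return fp->f_ops->fo_read(fp, buf, len);
}

int demo_fclose(struct demo_file *fp)
{
    return fp->f_ops->fo_close(fp);
}
```

Supporting another kind of object then means supplying one more vector, with no change to `demo_read()`.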

  31. Virtual File System • Originally designed for UNIX by Sun Microsystems, Inc., to support the Network File System (NFS) • Object-oriented support of multiple file system types: • struct vnode is a generic representation of a file for all types of file system implementations • struct mount is a generic representation of a whole mountable file system for all implementations • a file system implements its own set of: • member functions for vnodes and mount structures • data structures to combine with generic vnode and mount structures

  32. struct vnode (1)
      Field             Description
      <locks>           multiprocessor exclusion
      v_flag            vnode flags
      v_usecount        reference count of users
      v_holdcnt         page & buffer references
      lock counts       user-level lock counts
      v_lastr           last read (read-ahead)
      v_id              capability identifier
      v_type            vnode type
      v_tag             type of underlying data
      v_mount           ptr to vfs we are in (mount structure)
      v_mountedhere     ptr to mounted vfs (mount structure)
      v_op              vnode operations (vnodeops structure)
      v_freef           vnode freelist forward (vnode structure)
      v_freeb           vnode freelist back (vnode structure)
      v_mountf          vnode mountlist forward (vnode structure)
      v_mountb          vnode mountlist back (vnode structure)
      v_buflists_lock   protect clean/dirty heads
      v_cleanblkhd      clean blocklist head (buf structure)
      v_dirtyblkhd      dirty blocklist head (buf structure)
      .....

  33. struct vnode (2)
      Field                 Description
      ....
      v_ncache_time         last cache activity time
      v_free_time           time on vnode free_list
      v_output_lock         protect numoutput, outflag
      v_numoutput           num of writes in progress
      v_outflag             output flags
      v_cache_lookup_refs
      v_rdcnt               count of readers
      v_wrcnt               count of writers
      v_dirtyblkcnt         snapshot count of dirty blocks
      v_dirtyblkpush        snapshot count of pushed blocks
      v_un                  ptr to sock, dev specinfo, pipe
      v_object              VM object for vnode (vm_object structure)
      v_secops              vnode security ops (vnsecops structure)
      v_data[ ]             placeholder, private data

  34. Types of Vnodes
      Type    Description
      VNON    Allocated, but as-yet untyped vnode
      VREG    Vnode representing a regular file
      VDIR    Directory vnode
      VBLK    Block device vnode
      VCHR    Character device vnode
      VLNK    Symbolic link vnode
      VSOCK   UNIX domain socket vnode
      VFIFO   FIFO special file vnode

  35. struct vnodeops (1)
      Operation     Function
      vn_lookup     Looks up a file
      vn_create     Creates a regular file
      vn_mknod      Creates a fifo or device special file
      vn_open       Opens a file
      vn_close      Closes a file
      vn_access     Checks the access for a file
      vn_getattr    Gets file attributes
      vn_setattr    Sets file attributes
      vn_read       Reads a file
      vn_write      Writes to a file
      vn_ioctl      Controls a device
      vn_select     For synchronous I/O multiplexing

  36. struct vnodeops (2)
      Operation     Function
      vn_mmap       Maps memory of a character device
      vn_fsync      Synchronizes file data and statistics
      vn_seek       Sets position on a file
      vn_remove     Removes a file
      vn_link       Creates a hard link to a file
      vn_rename     Renames a file
      vn_mkdir      Creates a directory
      vn_rmdir      Removes a directory
      vn_symlink    Creates a symbolic link to a file
      vn_readdir    Reads a directory
      vn_readlink   Reads contents of a symbolic link
      vn_abortop    Aborts operation

  37. struct vnodeops (3)
      Operation     Function
      vn_inactive   Sets inactive
      vn_reclaim    Reclaims a vnode
      vn_bmap       Maps to file system block
      vn_strategy   Calls device strategy routine
      vn_print      Prints the contents of an inode
      vn_pgrd       Reads a page
      vn_pgwr       Writes a page
      vn_swap       Swaps handler
      vn_bread      Reads buffer
      vn_brelse     Releases buffer
      vn_lockctl    Provides file locking
      vn_syncdata   Synchronizes range in open file

  38. struct vnodeops (4)
      Operation        Function
      vn_lock          Locks an inode
      vn_unlock        Unlocks an inode
      vn_getproplist   Gets extended attributes
      vn_setproplist   Sets extended attributes
      vn_delproplist   Deletes extended attributes
      vn_pathconf      Checks path

  39. struct mount
      Field              Description
      m_lock             Lock for synchronization (SMP)
      m_flag             Flags
      m_funnel           Flag for SMP
      m_next             Next in mount list (mount structure)
      m_prev             Previous in mount list (mount structure)
      m_op               Operations on file system (vfsops structure)
      m_vnodecovered     Vnode we are mounted on (vnode structure)
      m_mounth           List of vnodes this mount (vnode structure)
      m_vlist_lock       Lock for vnode list
      m_exroot           Exported mapping for UID 0
      m_uid              UID of mounter
      m_stat             File system statistics
      m_data             Private data
      m_nfs_errmsginfo   NFS error information
      m_unmount_lock     Lock for synchronization

  40. struct vfsops
      Operation        Function
      vfs_mount        Mounts the file system
      vfs_start        Starts the file system
      vfs_unmount      Unmounts the file system
      vfs_root         Returns vnode for the root of the file system
      vfs_quotactl     Performs operations associated with quotas
      vfs_statfs       Updates file system statistics
      vfs_sync         Synchronizes the file system
      vfs_fhtovp       Returns the vnode pointer, given a file handle
      vfs_vptofh       Returns a file handle, given a vnode pointer
      vfs_init         Initializes the file system
      vfs_mountroot    Mounts the root file system
      vfs_smoothsync   Gently syncs the file system

  41. VFS Switch Table • Identifies file system types that have been implemented. • Contains an entry point for file system operations for each supported file system type. struct vfsops *vfssw[MOUNT_MAXTYPE];
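A switch table of this shape can be sketched as an array of operation-vector pointers indexed by file system type, with an empty slot meaning "not implemented". The code below is a user-space illustration under that assumption. All `demo_`/`DEMO_` names, the type numbering and the single-operation vector are hypothetical, not the kernel definitions.

```c
#include <stddef.h>

/* Hypothetical type numbers; slot 0 is deliberately left unused. */
enum { DEMO_MOUNT_NONE, DEMO_MOUNT_UFS, DEMO_MOUNT_NFS, DEMO_MOUNT_MAXTYPE = 8 };

struct demo_vfsops {                       /* trimmed stand-in for struct vfsops */
    int (*vfs_mount)(const char *path);
};

struct demo_mount {                        /* trimmed stand-in for struct mount */
    struct demo_vfsops *m_op;
};

static int demo_ufs_mount(const char *path) { (void)path; return 0; }
static int demo_nfs_mount(const char *path) { (void)path; return 0; }

static struct demo_vfsops demo_ufs_vfsops = { demo_ufs_mount };
static struct demo_vfsops demo_nfs_vfsops = { demo_nfs_mount };

/* The switch table: one slot per type, NULL where nothing is configured. */
struct demo_vfsops *demo_vfssw[DEMO_MOUNT_MAXTYPE] = {
    [DEMO_MOUNT_UFS] = &demo_ufs_vfsops,
    [DEMO_MOUNT_NFS] = &demo_nfs_vfsops,
};

/* Selects the operations for `type`, records them in the mount structure
 * and calls the type-specific mount routine. Returns -1 for an unknown
 * or unimplemented type. */
int demo_mount_fs(struct demo_mount *mp, int type, const char *path)
{
    if (type < 0 || type >= DEMO_MOUNT_MAXTYPE || demo_vfssw[type] == NULL)
        return -1;
    mp->m_op = demo_vfssw[type];
    return mp->m_op->vfs_mount(path);
}
```

After a successful call, later operations on the mounted file system can go through `mp->m_op` without consulting the table again.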

  42. Setting Up File System Operations [Figure: the mount structure's *m_op is set from a vfssw entry (entries shown: NULLPTR, &ufs_vfsops, &nfs_vfsops); the UFS vfsops contains ufs_mount, ufs_start, ufs_unmount, ufs_root, ufs_quotactl.]

  43. Mounted File System Structures [Figure: rootfs heads the mount table, a list of mount structures linked by m_next and m_prev; each vnode's v_mount points to its mount structure; m_data and v_data point to file system specific information.]

  44. Recording Mount Points: How are they mounted? (1) [Figure: file systems A, B and C.]

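  45. Recording Mount Points: How are they mounted? (2) [Figure: mount structs linked from rootfs by next pointers; vnode structs shown are one VROOT and two VDIR; links shown are m_mounth and m_vnodecovered from mount structs to vnodes and v_mounted_here from a vnode to a mount struct.]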

  46. File System Operations • namei() Interprets a pathname • mount() Mounts a file system • open() Opens a file • read()/write() Reads or writes a file

  47. Namei (1) • VFS routine that maps pathnames to vnodes • performs access checks on each component of that pathname. • Uses VOP_LOOKUP to move down the path • Special Cases • Symbolic Links • Mount Points • Process-Specific root (chroot()) • Special Care - unmounts

  48. Namei (2) • An LRU hash table mapping <parent-vnode, component-name> to <target-vnode, capabilities> • Capabilities are "tags" assigned to vnodes • prevent cache entries from referring to out-of-date associations • Related data structures include: • namecache - namei cache • nchash - hash list for cache • nchsize - size of cache • nchsz - size of hash list
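The role of the capability tags can be illustrated with a small user-space model. This is a sketch under stated assumptions, not the kernel's name cache. All names are hypothetical, and the table is direct-mapped (one entry per hash slot, overwritten on collision) rather than LRU, to keep it short. Each entry records the `v_id` values seen when it was entered; a lookup only succeeds if both vnodes still carry those ids.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_NCHSIZE 64                 /* hypothetical table size */
#define DEMO_NAMELEN 32                 /* longest component cached */

struct demo_vnode {
    uint32_t v_id;                      /* capability identifier */
};

struct demo_ncentry {
    int in_use;
    struct demo_vnode *dvp;             /* parent directory vnode */
    uint32_t dvp_id;                    /* parent's v_id when entered */
    struct demo_vnode *vp;              /* target vnode */
    uint32_t vp_id;                     /* target's v_id when entered */
    char name[DEMO_NAMELEN];
};

static struct demo_ncentry demo_nchash[DEMO_NCHSIZE];

static unsigned demo_nc_slot(const struct demo_vnode *dvp, const char *name)
{
    unsigned h = (unsigned)((uintptr_t)dvp >> 4);
    while (*name)
        h = h * 31u + (unsigned char)*name++;
    return h % DEMO_NCHSIZE;
}

/* Records <dvp, name> -> <vp, current capabilities>. */
void demo_nc_enter(struct demo_vnode *dvp, const char *name,
                   struct demo_vnode *vp)
{
    if (strlen(name) >= DEMO_NAMELEN)
        return;                         /* too long: simply not cached */
    struct demo_ncentry *e = &demo_nchash[demo_nc_slot(dvp, name)];
    e->in_use = 1;
    e->dvp = dvp;
    e->dvp_id = dvp->v_id;
    e->vp = vp;
    e->vp_id = vp->v_id;
    strcpy(e->name, name);
}

/* Returns the cached target, or NULL on a miss or a stale entry. */
struct demo_vnode *demo_nc_lookup(struct demo_vnode *dvp, const char *name)
{
    struct demo_ncentry *e = &demo_nchash[demo_nc_slot(dvp, name)];

    if (!e->in_use || e->dvp != dvp || strcmp(e->name, name) != 0)
        return NULL;
    if (e->dvp_id != dvp->v_id || e->vp_id != e->vp->v_id) {
        e->in_use = 0;                  /* vnode was reused: drop the entry */
        return NULL;
    }
    return e->vp;
}
```

In this model, invalidating every cache entry that refers to a vnode needs no table scan: assigning the vnode a new `v_id` is enough.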

  49. namei() flow • Start • Copy name into local buffer • Copy next component to buffer • ".." ? Yes: find parent vnode; No: call file system specific lookup routine, VOP_LOOKUP() • Symbolic link? Yes: copy name to buffer • Mounted on? Yes: find root vnode of mounted file system, VFS_ROOT() • More components? Yes: copy next component; No: done
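The component-by-component walk can be sketched over a tiny in-memory tree. This is an illustration only: every `demo_`/`DEMO_` name is hypothetical, symbolic links and access checks are omitted, and a linear search stands in for the file system specific lookup routine. It covers the "..", lookup and mount-point branches of the flow.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_CHILDREN 4
#define DEMO_MAX_NAME 32

struct demo_vnode {
    const char *name;
    struct demo_vnode *parent;                    /* NULL for a file system root */
    struct demo_vnode *child[DEMO_MAX_CHILDREN];
    struct demo_vnode *mounted_here;              /* root of fs mounted on this vnode */
    struct demo_vnode *covered;                   /* for a mounted root: vnode it covers */
};

/* Stand-in for the file system specific lookup routine. */
static struct demo_vnode *demo_lookup(struct demo_vnode *dir, const char *name)
{
    for (int i = 0; i < DEMO_MAX_CHILDREN; i++)
        if (dir->child[i] && strcmp(dir->child[i]->name, name) == 0)
            return dir->child[i];
    return NULL;
}

/* Maps an absolute path to a vnode, or NULL if a component is missing. */
struct demo_vnode *demo_namei(struct demo_vnode *root, const char *path)
{
    struct demo_vnode *vp = root;
    char comp[DEMO_MAX_NAME];

    for (;;) {
        while (*path == '/')                      /* skip separators */
            path++;
        size_t len = strcspn(path, "/");          /* length of next component */
        if (len == 0)
            break;                                /* no more components: done */
        if (len >= DEMO_MAX_NAME)
            return NULL;
        memcpy(comp, path, len);                  /* copy component to buffer */
        comp[len] = '\0';
        path += len;

        if (strcmp(comp, "..") == 0) {
            if (vp->covered)                      /* at a mounted root: step back */
                vp = vp->covered;                 /* onto the covered vnode first */
            if (vp->parent)                       /* ".." of the root stays put */
                vp = vp->parent;
            continue;
        }
        if (strcmp(comp, ".") == 0)
            continue;

        vp = demo_lookup(vp, comp);
        if (vp == NULL)
            return NULL;
        while (vp->mounted_here)                  /* mounted on? go to that root */
            vp = vp->mounted_here;
    }
    return vp;
}
```

With a file system mounted on `/usr`, resolving `/usr/bin` crosses from the covered `usr` vnode to the mounted root before looking up `bin`.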

  50. mount() flow [Figure: mount() with namei(), vmountset() and the Mount Table; VFS_MOUNT leads to the UFS routine ufs_mount().]
