CIS 620 Advanced Operating Systems

CIS 620 Advanced Operating Systems Lecture 8 – Naming and File Systems Prof. Timothy Arndt BU 331

Names, Identifiers, and Addresses • Naming was originally applied to file systems, and that is still its primary use today • Nowadays other entities, such as processes and IPC objects, are also organized into namespaces • Properties of a true identifier: • An identifier refers to at most one entity. • Each entity is referred to by at most one identifier. • An identifier always refers to the same entity (it is never reused).

Flat Naming • Non-hierarchical • Can locate an entity using flat naming by broadcasting in a LAN (e.g. ARP) • In a point-to-point network (WAN) can use multicasting • What about flat naming for mobile entities? • Can use the forwarding pointers approach

Forwarding Pointers • The principle of forwarding pointers using (client stub, server stub) pairs.

Forwarding Pointers • Redirecting a forwarding pointer by storing a shortcut in a client stub.
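The forwarding-pointer idea can be sketched in a few lines. This is a minimal illustration, not the actual stub mechanism: the `Stub` class, the `resolve` function, and the location names A, B, and C are all hypothetical. Each stub either holds the object or points to the next stub, and the client stores a shortcut after the first lookup.

```python
class Stub:
    """One hop in a forwarding chain: holds the object or points to the next stub."""

    def __init__(self, name, target=None, next_stub=None):
        self.name = name
        self.target = target        # the object itself, if it lives here
        self.next_stub = next_stub  # forwarding pointer left behind after a move


def resolve(client_ref):
    """Follow the chain from the client's stub, then store a shortcut in the client."""
    hops = 0
    stub = client_ref["stub"]
    while stub.target is None:
        stub = stub.next_stub
        hops += 1
    client_ref["stub"] = stub  # shortcut: later calls skip the intermediate stubs
    return stub.target, hops


# Hypothetical history: the object moved from A to B to C.
c = Stub("C", target="object")
b = Stub("B", next_stub=c)
a = Stub("A", next_stub=b)

ref = {"stub": a}
first = resolve(ref)   # walks A -> B -> C
second = resolve(ref)  # uses the stored shortcut
```

The first call needs two hops; the second needs none because the client reference was redirected to the current location.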

Home-Based Approaches • The principle of Mobile IP. A home location keeps track of the current location of an entity.

Distributed Hash Tables • Mapping a name to a location in a DHT can take a long time if we take the naïve approach (just try each host around the logical ring until we find the name) • Instead, each node can keep a finger table which records the nodes at exponentially increasing distances (powers of two) around the ring, so a lookup needs only a logarithmic number of hops

Distributed Hash Tables: General Mechanism • Resolving key 26 from node 1 and key 12 from node 28 in a Chord system.
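The two lookups named on this slide can be reproduced with a small simulation. The slide does not list the node IDs, so the set used here (a 5-bit ring with nodes 1, 4, 9, 11, 14, 18, 20, 21, 28) is an assumption, as are the helper names `succ`, `finger_table`, `between`, and `lookup`. The routing rule is the standard Chord one: forward to the highest finger that precedes the key.

```python
M = 5                 # identifier bits, so the ring has 2**M = 32 positions
RING = 2 ** M
NODES = sorted([1, 4, 9, 11, 14, 18, 20, 21, 28])  # assumed node IDs


def succ(k):
    """First node whose ID is >= k, wrapping around the ring."""
    k %= RING
    for n in NODES:
        if n >= k:
            return n
    return NODES[0]


def finger_table(p):
    """Entry i points to succ(p + 2**i): distances double with each entry."""
    return [succ(p + 2 ** i) for i in range(M)]


def between(x, a, b, inclusive_right=False):
    """True if x lies in the ring interval (a, b), or (a, b] when requested."""
    if inclusive_right and x == b:
        return True
    if a < b:
        return a < x < b
    return x > a or x < b  # the interval wraps past zero


def lookup(start, key):
    """Return the list of nodes visited while resolving key from start."""
    path = [start]
    node = start
    while not between(key, node, finger_table(node)[0], inclusive_right=True):
        for f in reversed(finger_table(node)):
            if between(f, node, key):  # highest finger that precedes the key
                node = f
                break
        else:
            break  # defensive: no finger precedes the key
        path.append(node)
    path.append(finger_table(node)[0])  # the successor is responsible for key
    return path


path_26 = lookup(1, 26)
path_12 = lookup(28, 12)
```

Under these assumptions each lookup takes four hops on a nine-node ring, whereas walking the ring from node 1 to node 28 one successor at a time would take eight.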

Name Spaces • An alternative to flat naming is structured naming leading to a name space • A general naming graph with a single root node.

Linking and Mounting • The concept of a symbolic link explained in a naming graph.

Linking and Mounting • Information required to mount a foreign name space in a distributed system • The name of an access protocol. • The name of the server. • The name of the mounting point in the foreign name space.

Linking and Mounting • Mounting remote name spaces through a specific access protocol.

Name Space Distribution • An example partitioning of the DNS name space, including Internet-accessible files, into three layers.

Name Space Distribution • A comparison between name servers for implementing nodes from a large-scale name space partitioned into a global layer, an administrational layer, and a managerial layer.

Name Resolution • Discovering the location of a resource, given its name, is called name resolution • Name resolution in hierarchical systems like DNS uses iteration, recursion (and caching)

Implementation of Name Resolution • The principle of iterative name resolution.

Implementation of Name Resolution • The principle of recursive name resolution.

Implementation of Name Resolution • Recursive name resolution of <nl, vu, cs, ftp>. Name servers cache intermediate results for subsequent lookups.
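The difference between the two resolution styles can be sketched with a toy model. The server names (`root`, `ns-nl`, and so on) and the address 192.0.2.10 are hypothetical placeholders, and real DNS messages are far richer than a dictionary lookup. In the iterative version the client contacts every server itself; in the recursive version each server resolves the rest of the path and caches what it learns.

```python
# Hypothetical name servers: each maps one label to the next server,
# or to an address at the leaf.
SERVERS = {
    "root": {"nl": "ns-nl"},
    "ns-nl": {"vu": "ns-vu"},
    "ns-vu": {"cs": "ns-cs"},
    "ns-cs": {"ftp": "192.0.2.10"},  # documentation-range address, not a real host
}


def iterative(path):
    """The client asks each server in turn; returns (address, client requests)."""
    server, requests = "root", 0
    for label in path:
        server = SERVERS[server][label]  # reply: the next server, or the address
        requests += 1
    return server, requests


def recursive(path, server="root", cache=None):
    """Each server resolves the remaining path itself and caches the result."""
    cache = {} if cache is None else cache
    key = (server, tuple(path))
    if key in cache:
        return cache[key]
    nxt = SERVERS[server][path[0]]
    result = nxt if len(path) == 1 else recursive(path[1:], nxt, cache)
    cache[key] = result  # intermediate results stay available for later lookups
    return result


name = ["nl", "vu", "cs", "ftp"]
iter_result = iterative(name)
shared_cache = {}
rec_result = recursive(name, cache=shared_cache)
```

After the recursive call, `shared_cache` holds one entry per server on the path, which is the caching effect the slide describes.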

Example: The Domain Name System • The comparison between recursive and iterative name resolution with respect to communication costs.

The DNS Name Space • The most important types of resource records forming the contents of nodes in the DNS name space.

DNS Implementation • An excerpt from the DNS database for the zone cs.vu.nl.

DNS Implementation • Part of the description for the vu.nl domain which contains the cs.vu.nl domain.

The File System in UNIX • The UNIX file system is organized in a tree structure. The files on this tree are one of the following types: • An ordinary file • A directory file • A special file (representing an I/O device) • A link that points to another file • A socket that is used for interprocess communication

Directories • A directory is a file whose content consists of directory entries for the files in the directory. • A directory entry contains the name of the file and a pointer to the file. The pointer is an index, called the i-number, into a table known as the i-list. • Each entry in the i-list is either an i-node containing status and address information about a file, or is free. • The entire file system may contain several self-contained filesystems, each with its own i-list.
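The name-to-i-number mapping held in a directory can be observed from a program. The sketch below uses Python's `os.scandir`; the helper name `list_inumbers` and the file names are made up for illustration.

```python
import os
import tempfile


def list_inumbers(directory):
    """Return {name: i-number} for every entry in the directory."""
    with os.scandir(directory) as entries:
        return {entry.name: entry.inode() for entry in entries}


with tempfile.TemporaryDirectory() as d:
    for name in ("a.txt", "b.txt"):
        open(os.path.join(d, name), "w").close()
    table = list_inumbers(d)
    # The i-number in the directory entry matches the one reported by stat().
    consistent = all(
        os.stat(os.path.join(d, name)).st_ino == ino for name, ino in table.items()
    )
```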

Directories
    grail:/users/faculty/arndt> df
    /home0 (arthur.cba:/home0 ): 14736728 blocks -1 i-nodes
    /home1 (arthur.cba:/home1 ): 7846106 blocks -1 i-nodes
    /userspace (shamu.cba:/export/home0/userspace): 1327190 blocks -1 i-nodes
    /usr (/dev/vg00/lvol3 ): 65700 blocks 13437 i-nodes
    /tmp (/dev/vg00/lvol4 ): 219084 blocks 12016 i-nodes
    /users (/dev/vg00/lvol5 ): 245610 blocks 65662 i-nodes
    /mnt1 (/dev/vg00/lvol6 ): 129708 blocks 17093 i-nodes
    /client_server (/dev/dsk/c4d0s2 ): 512710 blocks 357634 i-nodes
    / (/dev/vg00/lvol1 ): 23880 blocks 6769 i-nodes
    grail:/users/faculty/arndt>

Special Files • UNIX represents I/O devices such as terminals, printers, tape drives, and disk drives as special files in the file system. • In this way, an application program can treat file and device I/O in the same way. • Special files are conventionally stored in the directory /dev. • A special file can be either a character special file or a block special file. • A character special file represents a character-oriented I/O device. • A block special file represents a high-speed I/O device that transfers data in blocks rather than bytes.

Links • A directory entry may be a pointer to another file. This is called a link. There are two kinds of links: • hard links • symbolic links • A hard link is an entry in a directory with a name and some other file’s i-number. The hard link is not distinguishable from the original file. • A file may have several links to it. A hard link must be to an ordinary file in the same filesystem.

Links • A link to another file is created by using the ln command, in one of three forms:
    ln file
    ln file linkname
    ln file1 . . . Dir
• By default, a hard link is created by ln. • It is not necessary to be the owner of a file to link to the file. • The command rm removes the directory entry of the file. The file is physically deleted when the last link to it is removed. • The number of hard links to a file is kept as part of the file status.
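The link-count behaviour can be checked with a short experiment. This sketch uses `os.link` and `os.remove` as stand-ins for `ln file linkname` and `rm file`; the file names are arbitrary.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    original = os.path.join(d, "file")
    alias = os.path.join(d, "linkname")
    with open(original, "w") as f:
        f.write("data")

    os.link(original, alias)  # like: ln file linkname
    same_inode = os.stat(original).st_ino == os.stat(alias).st_ino
    links_after_ln = os.stat(original).st_nlink

    os.remove(original)  # like: rm file (removes one directory entry only)
    links_after_rm = os.stat(alias).st_nlink
    with open(alias) as f:
        surviving = f.read()  # the data is still reachable through the other link
```

Both names share one i-node, and the data survives until the last name is removed.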

Symbolic Links • A symbolic link is a directory entry that contains the pathname of another file. The symbolic link can be used as an argument in a command. • The file pointed to can be removed and the link will remain. • A symbolic link can span filesystems and may point to a directory. • A symbolic link can be created using ln -s
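A symbolic link stores a pathname rather than an i-number, which is why it can outlive its target. A minimal sketch, using `os.symlink` in place of `ln -s` and arbitrary file names:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    target = os.path.join(d, "target")
    link = os.path.join(d, "link")
    open(target, "w").close()

    os.symlink(target, link)  # like: ln -s target link
    stored_path = os.readlink(link)  # the link's content is just a pathname
    is_link = os.path.islink(link)

    os.remove(target)
    still_there = os.path.lexists(link)  # the link's own directory entry remains
    resolves = os.path.exists(link)      # but following it now fails (dangling link)
```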

File Access Control • The file mode for a file as displayed by ls -l is, for example: -rw-rw-rw- • The first position shows the file type: - for normal, d for directory, l for symbolic link, c for character special file, b for block special file. • Positions 2-4 show read, write, and execute permission for the owner of the file. A - means no permission.

File Access Control • Positions 5-7 show the permissions for users in the file's group. • Positions 8-10 show the permissions for other users. • To access a directory, you must have execute permission for that directory. To list its contents, you must have read permission. Write permission is necessary to create or delete a file in a directory. • New files are created with a default protection. To change this default, use the command umask. This command takes three octal digits, which specify the permission bits to be turned off in newly created files.
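The effect of umask can be worked through with one example. The mask value 027 and the requested mode 666 are arbitrary choices for illustration; `stat.filemode` renders a mode in the same ten-character form that ls -l prints.

```python
import os
import stat
import tempfile


def mode_string(path):
    """Render a file's mode the way ls -l does, e.g. '-rw-r-----'."""
    return stat.filemode(os.lstat(path).st_mode)


old_mask = os.umask(0o027)  # clear write for group, everything for others
try:
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "new")
        fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o666)  # request rw-rw-rw-
        os.close(fd)
        created = mode_string(path)  # 0o666 & ~0o027 == 0o640
finally:
    os.umask(old_mask)  # restore the previous mask

example = stat.filemode(stat.S_IFREG | 0o666)
```

The requested mode is ANDed with the complement of the mask, so a request for rw-rw-rw- under mask 027 yields rw-r-----.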

File Status • The status of each file in the UNIX file system is kept in its i-node. The file status includes: • mode • number of links • owner, group • size • last access, last content change, last status change • i-number • device • block size • block count
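Each status item on this slide corresponds to a field returned by the stat system call. The sketch below maps them onto Python's `os.stat` result; the function name `file_status` and the dictionary keys are chosen here for readability and are not part of any API.

```python
import os
import tempfile


def file_status(path):
    """Collect the status items listed on the slide from one stat() call."""
    st = os.stat(path)
    return {
        "mode": oct(st.st_mode),
        "number of links": st.st_nlink,
        "owner": st.st_uid,
        "group": st.st_gid,
        "size": st.st_size,
        "last access": st.st_atime,
        "last content change": st.st_mtime,
        "last status change": st.st_ctime,
        "i-number": st.st_ino,
        "device": st.st_dev,
        "block size": st.st_blksize,
        "block count": st.st_blocks,
    }


with tempfile.NamedTemporaryFile() as tmp:
    tmp.write(b"hello")
    tmp.flush()
    status = file_status(tmp.name)
```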

File Mode • The file mode consists of 16 bits. We have already seen the low-order 9 bits: the permissions. The four high bits specify the file type, while the next three bits define the way in which an executable file is run. • Set userid: the program runs with the effective user ID of the file's owner rather than that of the user who invoked it. • Set groupid: the program runs with the effective group ID of the file's group. • The sticky bit keeps a shared binary in memory. • The mode of a file may be changed using the chmod command.

File Mode
    grail:/usr/bin> ls -l e* | more
    -r-xr-xr-t 6 bin bin 245760 Sep 7 1993 e
    -r-xr-xr-t 6 bin bin 245760 Sep 7 1993 edit
    -rwxr-sr-x 1 root bin 479232 Jun 17 1996 elm
    -r-xr-sr-x 1 bin mail 270336 Sep 7 1993 elm.orig
    -r-xr-xr-x 1 bin bin 32768 Jun 17 1996 elmalias
    -r-sr-xr-x 1 lp bin 20480 Oct 22 1993 enable
    -r-xr-xr-t 6 bin bin 245760 Sep 7 1993 ex
    -r-xr-xr-x 1 bin bin 16384 Sep 7 1993 expand
    grail:/usr/bin>
Note the ‘s’ and ‘t’ above in place of ‘x’.
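The 's' and 't' characters in the listing come from the three bits above the permissions. The sketch below builds the octal modes that would produce three of the strings shown (for enable, elm, and e) and decodes them with Python's `stat` module. The octal values are reconstructed from the displayed strings, not read from the actual files.

```python
import stat

# Regular-file type bits combined with special bits and permissions.
SETUID_MODE = stat.S_IFREG | stat.S_ISUID | 0o555  # as shown for "enable"
SETGID_MODE = stat.S_IFREG | stat.S_ISGID | 0o755  # as shown for "elm"
STICKY_MODE = stat.S_IFREG | stat.S_ISVTX | 0o555  # as shown for "e"


def describe(mode):
    """Split a 16-bit mode into file type, special bits, and permissions."""
    return {
        "type": oct(stat.S_IFMT(mode)),          # the four high bits
        "special": oct((mode >> 9) & 0o7),       # setuid, setgid, sticky
        "permissions": oct(mode & 0o777),        # the low-order nine bits
        "ls": stat.filemode(mode),
    }


setuid_info = describe(SETUID_MODE)
setgid_info = describe(SETGID_MODE)
sticky_info = describe(STICKY_MODE)
```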

Groups • A user can belong to one or more groups specified in the file /etc/group. • Use the command ‘groups’ to see which groups you belong to. • Each file is associated with a group owner. To change the group owner of a file use ‘chgrp’ • This allows groups to work on the same files (but it doesn’t coordinate their changes!).

File System Implementation • A UNIX file is an array of bytes stored in a number of data blocks in a specific filesystem. • Originally, blocks were 512 bytes. Some implementations now use 4K bytes per block. Since the final block may waste space if it is not full, it may be broken up into fragments of (e.g.) 1K bytes. • As said previously, information on files is stored in an i-node, where the i-nodes for a filesystem are stored in an i-list.

File System Implementation • The i-node also contains pointers to the data blocks of the file. An i-node contains 10 pointers to data blocks, a pointer to a data block containing pointers to data blocks, and a pointer to a data block containing pointers to data blocks containing pointers to data blocks (!). This allows for extremely large files. • Why use this arrangement rather than just a chain of data blocks?
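The arithmetic behind this layout can be worked through directly. The sketch assumes 4-byte block pointers (the slide does not give a pointer size) and models only what the slide lists: 10 direct pointers, one single-indirect block, and one double-indirect block. The function names are made up for illustration.

```python
def max_file_blocks(block_size, pointer_size=4, direct=10):
    """Data blocks reachable via direct, single-indirect and double-indirect pointers."""
    n = block_size // pointer_size  # pointers that fit in one block
    return direct + n + n * n


def reads_to_reach(block_index, block_size, pointer_size=4, direct=10):
    """Disk reads needed to fetch one data block, given the i-node is in memory."""
    n = block_size // pointer_size
    if block_index < direct:
        return 1  # the data block itself
    if block_index < direct + n:
        return 2  # indirect block, then data block
    if block_index < direct + n + n * n:
        return 3  # two levels of indirect blocks, then data block
    raise ValueError("block index beyond the maximum file size")


small = max_file_blocks(512) * 512      # 512-byte blocks
large = max_file_blocks(4096) * 4096    # 4K blocks
last_block_cost = reads_to_reach(max_file_blocks(4096) - 1, 4096)
```

With 512-byte blocks the limit is about 8 MB, and with 4K blocks about 4 GB. Any block is reachable in at most three reads, whereas a chain of data blocks would require reading every preceding block first.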

Removable Filesystems • One of the UNIX filesystems is the root filesystem. All of the other filesystems are removable. • Removable filesystems are attached to the root filesystem at some leaf node using the mount command: mount devfile directory [-r] (the -r option mounts the filesystem read-only) • The command umount is used to unmount a removable filesystem.

Super Block and Cylinder Groups • The super block of a filesystem records information about the filesystem: • Size of the filesystem • Block/fragment size • Length of the i-node list • A list of free blocks and length of the list • A list of free i-nodes and length of the list • The super block is stored at the beginning of a filesystem.

Super Block and Cylinder Groups • Contemporary UNIX systems use an improved organization based on the idea of a cylinder group (a set of physically close cylinders on the disk). • I-nodes are distributed within each cylinder group rather than stored at the beginning of the filesystem. Therefore, i-nodes are close to their data blocks, and access time is improved. • Super blocks are also duplicated in order to increase robustness.

Filesystem Table • Each filesystem is represented by its own block-type special file. The name of this file along with other information is kept in /etc/checklist or /etc/fstab:
    /dev/vg00/lvol1 / hfs defaults 0 1 31484 564
    /dev/vg00/lvol2 swap ignore sw 0 0 16392 592
    /dev/vg00/lvol3 /usr hfs defaults 0 2 0
    /dev/vg00/lvol4 /tmp hfs defaults 0 2 13 16408
    /dev/vg00/lvol5 /users hfs quota,defaults 0 2 0
    /dev/vg00/lvol6 /mnt1 hfs rw,suid 0 3 0
    /dev/dsk/c4d0s2 /client_server hfs quota,defaults 0 2 0
    arthur.cba:/home0 /home0 nfs rw 0 2 0
    arthur.cba:/home1 /home1 nfs rw 0 2 0
    shamu.cba:/export/home0/userspace /userspace nfs rw 0 2 0
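Entries of this kind are simple whitespace-separated records, so they are easy to read programmatically. The parser below is a hypothetical sketch that looks only at the first four fields (device, mount point, type, options); the sample text reuses three entries from the table on this slide with the trailing columns shortened.

```python
def parse_fstab(text):
    """Parse fstab-style lines into dictionaries, skipping blanks and comments."""
    entries = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        device, mount_point, fs_type, options = line.split()[:4]
        entries.append(
            {
                "device": device,
                "mount_point": mount_point,
                "type": fs_type,
                "options": options.split(","),
            }
        )
    return entries


SAMPLE = """\
/dev/vg00/lvol5 /users hfs quota,defaults 0 2
/dev/vg00/lvol6 /mnt1 hfs rw,suid 0 3
arthur.cba:/home0 /home0 nfs rw 0 2
"""

entries = parse_fstab(SAMPLE)
remote = [e["mount_point"] for e in entries if e["type"] == "nfs"]
```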

Filesystem Table • The currently mounted filesystems are recorded in /etc/mnttab:
    /dev/vg00/lvol1 / hfs defaults 0 1 917804700 1
    /dev/dsk/c4d0s2 /client_server hfs quota 0 2 917804709 1
    /dev/vg00/lvol6 /mnt1 hfs defaults 0 3 917804709 1
    /dev/vg00/lvol5 /users hfs quota 0 2 917804709 1
    /dev/vg00/lvol4 /tmp hfs defaults 0 2 917804709 1
    /dev/vg00/lvol3 /usr hfs defaults 0 2 917804709 1
    shamu.cba:/export/home0/userspace /userspace nfs rw 0 2 917804764 0
    arthur.cba:/home1 /home1 nfs rw 0 2 917804764 0
    arthur.cba:/home0 /home0 nfs rw 0 2 917804764 0

Network File System • Many UNIX systems allow file operations on remote filesystems stored on other computers on a LAN. This is done using NFS. • A machine on a LAN can function as a file server (e.g. by holding all users' home directories). • NFS can work across different hardware and OSs (e.g. different varieties of UNIX). • A filesystem is made available remotely by exporting it. It can then be mounted on a remote machine.

Linux File Types • Besides regular files (-), directory files (d), block special files (b), and character special files (c), Linux also has the following file types: • Unix domain socket (s) • Soft link (l) • Named pipe (p)
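The type letter is the first character of the mode string, so a file's type can be looked up from its mode. The sketch below creates a named pipe and a soft link in a temporary directory and classifies them; the `TYPE_NAMES` table and the `file_type` helper are hypothetical names, and `os.mkfifo` is available only on Unix-like systems.

```python
import os
import stat
import tempfile

TYPE_NAMES = {
    "-": "regular file",
    "d": "directory",
    "b": "block special file",
    "c": "character special file",
    "s": "Unix domain socket",
    "l": "soft link",
    "p": "named pipe",
}


def file_type(path):
    """Classify a path by the first character of its ls -l style mode string."""
    letter = stat.filemode(os.lstat(path).st_mode)[0]  # lstat: do not follow links
    return TYPE_NAMES.get(letter, "unknown")


with tempfile.TemporaryDirectory() as d:
    fifo = os.path.join(d, "pipe")
    link = os.path.join(d, "link")
    os.mkfifo(fifo)          # named pipe
    os.symlink(fifo, link)   # soft link pointing at the pipe
    kinds = (file_type(d), file_type(fifo), file_type(link))
```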

File System Consistency • In order to improve filesystem performance, a pool of data blocks is kept in memory so that further references to them don’t require a disk access. • Changes made to these blocks are not immediately reflected on disk. • Periodically, they are written to disk. • This arrangement is known as lazy write. • DOS used careful write, in which changes are written immediately to disk. • Powering down the computer without going through a shutdown sequence can cause unwritten data to be lost.
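A program that cannot tolerate the lazy-write window can ask the kernel to push its data to disk explicitly. The sketch below shows the usual pattern with `flush` and `os.fsync`; the function name `durable_write` is made up for illustration, and how strong the guarantee is depends on the filesystem and the hardware.

```python
import os
import tempfile


def durable_write(path, data):
    """Write data and ask the kernel to commit it to disk before returning."""
    with open(path, "wb") as f:
        f.write(data)         # first lands in the program's own buffer
        f.flush()             # hands the data to the kernel's block cache
        os.fsync(f.fileno())  # asks the kernel to write the cached blocks to disk


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "journal")
    durable_write(path, b"committed")
    with open(path, "rb") as f:
        content = f.read()
```

Without the `os.fsync` call the write would still succeed, but the data could sit in memory until the next periodic write-back.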

File System Consistency • The program fsck can be run when the system is rebooted to check for and repair file system inconsistencies (compare the Windows chkdsk program). • This loss of data is unacceptable for mission-critical database management systems. • Therefore these systems completely bypass the UNIX filesystem, managing a raw device by themselves in order to ensure a consistent state of the database in case of system failure.

Problems • The user types in the command “cat /user/home/me/sub/file1” - what are the steps involved in retrieving the data in the file? • A removable hard drive formatted for Unix (e.g. connected via FireWire or USB) is attached to a Unix system. What steps are taken in order to make the data on the drive available to application programs?

procfs • procfs (the proc filesystem) is a special filesystem in Linux and other Unix-like OSs that presents information about processes and other kernel information in a hierarchical file-like structure • Makes it easier to access info about (e.g.) processes • Mounted at /proc
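Because procfs looks like ordinary files, process information can be read with plain file I/O. The sketch below reads the Name field from /proc/&lt;pid&gt;/status, which is the Linux layout; the helper name `process_name` is hypothetical, and the function returns None on systems where that file is absent.

```python
import os


def process_name(pid="self"):
    """Return the Name field from /proc/<pid>/status, or None if unavailable."""
    path = f"/proc/{pid}/status"
    if not os.path.exists(path):
        return None  # no procfs here, or no such process
    with open(path) as f:
        for line in f:
            if line.startswith("Name:"):
                return line.split(":", 1)[1].strip()
    return None


own_name = process_name()               # via the /proc/self shortcut
same_by_pid = process_name(os.getpid())  # via the numeric process ID
```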
