I/O and File Systems


  1. I/O and File Systems CS 519: Operating System Theory Computer Science, Rutgers University Instructor: Thu D. Nguyen TA: Xiaoyan Li Spring 2002

  2. I/O Devices • So far we have talked about how to abstract and manage CPU and memory • Computation “inside” the computer is useful only if some results are communicated “outside” of the computer • I/O devices are the computer’s interface to the outside world (I/O = Input/Output) • Example devices: display, keyboard, mouse, speakers, network interface, and disk

  3. Basic Computer Structure [Figure labels: CPU, Memory, Memory Bus (System Bus), Bridge, I/O Bus, Disk, NIC]

  4. Intel SR440BX Motherboard [Figure labels: CPU; System Bus & MMU/AGP/PCI Controller; I/O Bus; IDE Disk Controller; USB Controller; Another I/O Bus; Serial & Parallel Ports; Keyboard & Mouse]

  5. OS: Abstractions and Access Methods • The OS must virtualize a wide range of devices into a few simple abstractions: • Storage: hard drives, tapes, CD-ROM • Networking: Ethernet, radio, serial line • Multimedia: DVD, camera, microphones • The operating system should provide consistent methods to access the abstractions • Otherwise, programming is too hard

  6. User/OS method interface • The same interface is used to access devices (like disks and network lines) and more abstract resources like files • 4 main methods: • open() • close() • read() • write() • Semantics depend on the type of the device (block, char, net) • These methods are system calls because they are the methods the OS provides to all processes.

  7. Unix I/O Methods • fileHandle = open(pathName, flags, mode) • a file handle is a small integer, valid only within a single process, used to operate on the device or file • pathName: a name in the file system. In Unix, devices are put under /dev. E.g. /dev/ttya is the first serial port, /dev/sda the first SCSI drive • flags: read only, read/write, append, blocking or non-blocking … • mode: permission bits, used only when the call creates a new file • errorCode = close(fileHandle) • The kernel will free the data structures associated with the device

  8. Unix I/O Methods • byteCount = read(fileHandle, byte [] buf, count) • read at most count bytes from the device and put them in the byte buffer buf, starting at byte 0 of buf. • The kernel can give the process fewer bytes than requested; the user process must check byteCount to see how many were actually returned. • A byteCount of -1 signals an error (errno holds the error type) • byteCount = write(fileHandle, byte [] buf, count) • write at most count bytes from the buffer buf • the actual number written is returned in byteCount • a byteCount of -1 signals an error
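
Because read() may return fewer bytes than requested, callers usually wrap it in a loop. The following is a minimal sketch, assuming a POSIX system; read_full is a hypothetical helper name, not part of the Unix API.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Read until `count` bytes have arrived, end of input is reached, or an
 * error occurs. Returns the number of bytes stored in buf, or -1 on error
 * (errno is set by read()). read_full is a hypothetical helper name. */
ssize_t read_full(int fd, void *buf, size_t count)
{
    char *p = buf;
    size_t done = 0;

    assert(buf != NULL || count == 0);
    while (done < count) {
        ssize_t n = read(fd, p + done, count - done);
        if (n > 0) {
            done += (size_t)n;   /* short read: keep asking for the rest */
        } else if (n == 0) {
            break;               /* end of input: nothing more will come */
        } else if (errno != EINTR) {
            return -1;           /* real error */
        }                        /* EINTR: interrupted by a signal, retry */
    }
    return (ssize_t)done;
}
```

A return value smaller than count then means end of input was reached, not that the kernel merely had less data ready at that moment.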

  9. Unix I/O Example • What’s the correct way to write 1000 bytes? • calling: • ignoreMe = write(fileH, buffer, 1000); • works most of the time. What happens if: • the device can’t accept 1000 bytes right now? • the disk is full? • someone just deleted the file? • Let’s work this out on the board
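
One possible answer to the question on this slide, as a sketch rather than the solution worked out in lecture: loop until every byte is accepted, and report errors instead of ignoring the return value. It assumes a POSIX system and a blocking descriptor; write_all is a hypothetical helper name.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Write exactly `count` bytes, looping over short writes.
 * Returns 0 on success, -1 on error with errno set by write().
 * write_all is a hypothetical helper name. */
int write_all(int fd, const void *buf, size_t count)
{
    const char *p = buf;

    assert(buf != NULL || count == 0);
    while (count > 0) {
        ssize_t n = write(fd, p, count);
        if (n < 0) {
            if (errno == EINTR)
                continue;        /* interrupted before any byte went out */
            return -1;           /* e.g. ENOSPC when the disk is full */
        }
        p += n;                  /* device accepted only n bytes this time */
        count -= (size_t)n;
    }
    return 0;
}
```

The call `write_all(fileH, buffer, 1000)` covers the first two cases on the slide: a short write is continued, and a full disk surfaces as -1 with errno set to ENOSPC. For the third case, deleting (unlinking) a file that is still open does not make write() fail on Unix; the data goes to the now-nameless file until the last descriptor is closed.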

  10. I/O semantics • From this basic interface, there are three different dimensions to how I/O is processed: • blocking vs. non-blocking • buffered vs. unbuffered • synchronous vs. asynchronous • The O/S tries to support as many of these dimensions as possible for each device • The semantics are specified during the open() system call

  11. Blocking vs. Non-Blocking I/O • Blocking: the process is suspended until all bytes in the count field are read or written • E.g., for a network device, if the user wrote 1000 bytes, then the operating system would write the bytes to the device one at a time until the write() completed. • + Easy to use and understand • - If the device just can’t perform the operation (e.g. you unplug the cable), what to do? Give up and return the number of bytes successfully transferred. • Non-blocking: the OS only reads or writes as many bytes as is possible without suspending the process • + Returns quickly • - More work for the programmer (but isn’t that work really needed anyway for robust programs?)
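
A minimal sketch of the non-blocking case, assuming a POSIX system where the O_NONBLOCK flag is set with fcntl() on an already-open descriptor. The helper names and the -2 return convention are inventions of this example.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Switch an already-open descriptor to non-blocking mode.
 * Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags;

    assert(fd >= 0);
    flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Try to read without suspending the process.
 * Returns bytes read (0 = end of input), -1 on error, or -2 if no data
 * is available right now. The -2 convention exists only in this sketch. */
ssize_t try_read(int fd, void *buf, size_t count)
{
    ssize_t n = read(fd, buf, count);

    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;
    return n;
}
```

The extra case the caller must handle (no data yet, try again later) is the "more work for the programmer" that the slide mentions.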

  12. Buffered vs. Unbuffered I/O • Sometimes we want the ease of programming of blocking I/O without the long waits that occur if the buffers on the device are small. • Buffered I/O allows the kernel to make a copy of the data • write() side: allows the process to write() many bytes and continue processing • read() side: as the device signals data is ready, the kernel places the data in the buffer. When the process calls read(), the kernel just copies from the buffer. • Why not use buffered I/O? • - Extra copy overhead • - Delays sending data

  13. Synchronous vs. Asynchronous I/O • Synchronous I/O: the user process does not run at the same time the I/O does; it is suspended during I/O • So far, all the methods presented have been synchronous. • Asynchronous I/O: the read() or write() call returns a small object instead of a count. • separate set of methods in Unix: aio_read(), aio_write() • The user can call methods on the returned object to check “how much” of the I/O has completed • The user can also allow a signal handler to run when the I/O has completed.

  14. Handling Multiple I/O streams • If we use blocking I/O, how do we handle > 1 device at a time? • Example: a small program that reads from a serial line and outputs to a tape, and also reads from the tape and outputs to the serial line • Structure the code like this? • while (TRUE) { • read(tape); // block until tape is ready • write(serial line); // send data to serial line • read(serial line); • write(tape); • } • Could use non-blocking I/O, but that is a huge waste of cycles if data is not ready.

  15. Solution: select system call • totalFds = select(nfds, readfds, writefds, errorfds, timeout); • nfds: the kernel checks file descriptors 0 .. nfds-1 • readfds: bit map of fileHandles. The user sets bit X to ask the kernel to check whether fileHandle X is ready for reading. On return, bit X is still 1 only if data can be read on that fileHandle. • writefds: bit map of fileHandles to check for writing. Operates the same way as the read bit map. • errorfds: bit map to check for errors. • timeout: the longest time to wait before un-suspending the process • totalFds = number of set bits; a negative number is an error
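
The two-device problem from the previous slide can be sketched with select() as follows, assuming a POSIX system. wait_either is a hypothetical helper, and its -1/-2 return convention is specific to this example; only the read bit map is used here.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Sleep until fd_a or fd_b has data to read, or timeout_ms elapses.
 * Returns the ready descriptor, -1 on timeout, -2 on error.
 * If both are ready, fd_a is reported first. */
int wait_either(int fd_a, int fd_b, int timeout_ms)
{
    fd_set readfds;
    struct timeval tv;
    int maxfd = fd_a > fd_b ? fd_a : fd_b;
    int n;

    assert(fd_a >= 0 && fd_b >= 0 && maxfd < FD_SETSIZE);
    FD_ZERO(&readfds);
    FD_SET(fd_a, &readfds);            /* set bit fd_a in the read bit map */
    FD_SET(fd_b, &readfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* nfds is the highest descriptor plus one. */
    n = select(maxfd + 1, &readfds, NULL, NULL, &tv);
    if (n < 0)
        return -2;
    if (n == 0)
        return -1;                     /* timeout: no bits left set */
    return FD_ISSET(fd_a, &readfds) ? fd_a : fd_b;
}
```

With the tape and the serial line as fd_a and fd_b, the main loop would call wait_either() and then read() only from the descriptor it returns, so the process neither blocks on the wrong device nor spins while nothing is ready.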

  16. Three Device Types • Most operating systems have three device types: • Character devices • Used for serial-line types of devices (e.g. USB port) • Network devices • Used for network interfaces (e.g. Ethernet card) • Block devices • Used for mass storage (e.g. disks and CD-ROM) • What you can expect from the read/write methods changes with each device type

  17. Character Devices • Device is represented by the OS as an ordered stream of bytes • bytes are sent out to the device by the write() system call • bytes are read from the device by the read() system call • The byte stream has no “start”; just open and start reading/writing • The user has no control over the read/write ratio: the sender process might call write() once with 1000 bytes, and the receiver may have to make 1000 read() calls, each receiving 1 byte.

  18. Network Devices • Like I/O devices, but each write() call either sends the entire block (packet), up to some maximum fixed size, or none. • On the receiver, the read() call returns all the bytes in the block, or none.

  19. Block Devices • OS presents the device as a large array of blocks • Each block has a fixed size (1KB - 8KB is typical) • User can read/write only in fixed-size blocks • Unlike other devices, block devices support random access • We can read or write anywhere in the device without having to ‘read all the bytes first’ • But how to specify the nth block with the current interface?

  20. The file pointer • The O/S adds a concept called the file pointer • A file pointer is associated with each open file, if the device is a block device • The next read or write operates at the position in the device pointed to by the file pointer • The file pointer is measured in bytes, not blocks

  21. Seeking in a device • To set the file pointer: • absoluteOffset = lseek(fileHandle, offset, whence); • whence specifies whether the offset is absolute (from byte 0) or relative to the current file pointer position • the absolute offset is returned; negative numbers signal error codes • For devices, the offset should be an integral multiple of the block size, in bytes. • How could you tell what the current position of the file pointer is without changing it?

  22. Block Device Example • You want to read the 10th block of a disk • each disk block is 2048 bytes long • blockSize = 2048; • fh = open("/dev/sda", O_RDONLY); • pos = lseek(fh, 9 * blockSize, SEEK_SET); // blocks are numbered from 0, so the 10th block is index 9 • if (pos < 0) error; • bytesRead = read(fh, buf, blockSize); • if (bytesRead < 0) error; • …
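
The open/lseek/read sequence on this slide can be written as a small C helper. This is a sketch under the assumption of a POSIX system; read_block_from and current_offset are hypothetical names, and blocks are counted from 0, so the 10th block is index 9.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Read block number `index` (0-based) of `block_size` bytes from the
 * device or file at `path`. Returns bytes read, or -1 on error. */
ssize_t read_block_from(const char *path, off_t index, size_t block_size,
                        void *buf)
{
    int fd;
    ssize_t n;

    assert(path != NULL && index >= 0 && block_size > 0);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    /* The file pointer is in bytes, so convert the block number. */
    if (lseek(fd, index * (off_t)block_size, SEEK_SET) == (off_t)-1) {
        close(fd);
        return -1;
    }
    n = read(fd, buf, block_size);
    close(fd);
    return n;
}

/* Report the file pointer without moving it: seek 0 bytes relative to
 * the current position and return the resulting absolute offset. */
off_t current_offset(int fd)
{
    return lseek(fd, 0, SEEK_CUR);
}
```

For the slide's example the call would be `read_block_from("/dev/sda", 9, 2048, buf)`, which seeks to byte 18432. current_offset() is one answer to the question on the previous slide about reading the file pointer without changing it.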

  23. Getting and setting device specific information • Unix has an IO ConTroL system call: • ErrorCode = ioctl(fileHandle, int request, object); • request is a numeric command to the device • can also pass an optional, arbitrary object to a device • the meaning of the command and the type of the object are device specific

  24. Communication Between CPU and I/O Devices • How does the CPU communicate with I/O devices? • Memory-mapped communication • Each I/O device is assigned a portion of the physical address space • CPU → I/O device • CPU writes to locations in this area to "talk" to the I/O device • I/O device → CPU • Polling: CPU repeatedly checks location(s) in the portion of the address space assigned to the device • Interrupt: device sends an interrupt (on an interrupt line) to get the attention of the CPU • Programmed I/O, Interrupt-Driven, Direct Memory Access • PIO and interrupt-driven = word at a time • DMA = block at a time
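
To make memory-mapped communication with polling concrete, here is a sketch of programmed I/O in C. The register layout, the DEV_READY bit, and pio_write are all hypothetical; a real device defines its own registers, and on real hardware the struct would be mapped at a device-specific physical address rather than allocated in ordinary memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register block of a simple output device. */
struct dev_regs {
    volatile uint8_t status;   /* bit 0 set = device ready for a byte */
    volatile uint8_t data;     /* CPU writes the next byte here */
};

#define DEV_READY 0x01u

/* Programmed I/O with polling: the CPU spins on the status register and
 * moves one byte per iteration (word at a time, not block at a time). */
void pio_write(struct dev_regs *dev, const uint8_t *src, size_t n)
{
    assert(dev != NULL);
    for (size_t i = 0; i < n; i++) {
        while ((dev->status & DEV_READY) == 0)
            ;                  /* busy-wait: the CPU does no useful work */
        dev->data = src[i];    /* CPU -> device through a memory write */
    }
}
```

The volatile qualifier tells the compiler that the device can change the status field at any time, so every loop iteration really re-reads the register. The busy-wait loop is the cost that interrupts and DMA are meant to avoid.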

  25. Programmed I/O vs. DMA • Programmed I/O is OK for sending commands, receiving status, and communicating a small amount of data • Inefficient for large amounts of data • Keeps the CPU busy during the transfer • Programmed I/O ⇒ memory operations ⇒ slow • Direct Memory Access • Device reads/writes directly from/to memory • Memory → device is typically initiated from the CPU • Device → memory can be initiated by either the device or the CPU

  26. Programmed I/O vs. DMA [Figure: three panels, each showing CPU, Memory, and Disk joined by an Interconnect; panel labels: Programmed I/O; DMA; DMA, Device and Memory] • Problems?

  27. Direct Memory Access • Used to avoid programmed I/O for large data movement • Requires DMA controller • Bypasses CPU to transfer data directly between I/O device and memory

  28. Six step process to perform DMA transfer

  29. Performance • I/O a major factor in system performance • Demands CPU to execute device driver, kernel I/O code • Context switches due to interrupts • Data copying • Network traffic especially stressful

  30. Improving Performance • Reduce number of context switches • Reduce data copying • Reduce interrupts by using large transfers, smart controllers, polling • Use DMA • Balance CPU, memory, bus, and I/O performance for highest throughput

  31. Device Driver • OS module controlling an I/O device • Hides the device specifics from the layers above it in the kernel • Supports a common API • UNIX: block or character device • Block: device communicates with the CPU/memory in fixed-size blocks • Character/Stream: stream of bytes • Translates logical I/O into device I/O • E.g., logical disk blocks into {head, track, sector} • Performs data buffering and scheduling of I/O operations • Structure • Several synchronous entry points: device initialization, queue I/O requests, state control, read/write • An asynchronous entry point to handle interrupts

  32. Some Common Entry Points for UNIX Device Drivers • Attach: attach a new device to the system. • Close: note the device is not in use. • Halt: prepare for system shutdown. • Init: initialize driver globals at load or boot time. • Intr: handle device interrupt (not used). • Ioctl: implement control operations. • Mmap: implement memory-mapping (SVR4). • Open: connect a process to a device. • Read: character-mode input. • Size: return logical size of block device. • Start: initialize driver at load or boot time. • Write: character-mode output.

  33. User to Driver Control Flow [Figure labels: read, write, ioctl; user; kernel; ordinary file; special file; file system; character device; block device; buffer cache; character queue; driver_read/write; driver-strategy]

  34. Buffer Cache • When an I/O request is made for a block, the buffer cache is checked first • If the block is missing from the cache, it is read into the buffer cache from the device • Exploits locality of reference like any other cache • Replacement policies similar to those for VM • UNIX • Historically, UNIX has a buffer cache for the disk which does not share buffers with character/stream devices • Adds overhead in a path that has become increasingly common: disk → NIC

  35. Disks • Seek time: time to move the disk head to the desired track • Rotational delay: time to reach the desired sector once the head is over the desired track • Transfer rate: rate at which data is read from or written to the disk • Some typical parameters: • Seek: ~10-15ms • Rotational delay: ~4.17ms for 7200 rpm (half a revolution on average) • Transfer rate: 30 MB/s [Figure: disk platter showing sectors and tracks]

  36. Disk Scheduling • Disks are at least four orders of magnitude slower than main memory • The performance of disk I/O is vital for the performance of the computer system as a whole • Access time (seek time + rotational delay) >> transfer time for a sector • Therefore the order in which sectors are read matters a lot • Disk scheduling • Usually based on the position of the requested sector rather than according to the process priority • Possibly reorder the stream of read/write requests to improve performance
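
The claim that access time dominates transfer time can be checked with a little arithmetic. The sketch below uses the typical parameters from the Disks slide (10 ms seek, 7200 rpm, 30 MB/s) and assumes a 512-byte sector, which the slides do not state; the function names are hypothetical, and 1 MB is taken as 10^6 bytes.

```c
#include <assert.h>

/* Average rotational delay in ms: half of one full revolution. */
double avg_rotational_delay_ms(double rpm)
{
    assert(rpm > 0.0);
    return 0.5 * 60000.0 / rpm;
}

/* Time in ms to transfer `bytes` at `mb_per_s` (1 MB = 10^6 bytes). */
double transfer_ms(double bytes, double mb_per_s)
{
    assert(mb_per_s > 0.0);
    return bytes / (mb_per_s * 1e6) * 1000.0;
}

/* Estimated time in ms for one random access: seek + rotation + transfer. */
double access_ms(double seek_ms, double rpm, double bytes, double mb_per_s)
{
    return seek_ms + avg_rotational_delay_ms(rpm)
         + transfer_ms(bytes, mb_per_s);
}
```

Under these assumptions one revolution takes 60000 / 7200 ≈ 8.33 ms, so the average rotational delay is about 4.17 ms, while moving a 512-byte sector takes only about 0.017 ms. A random access therefore costs roughly 14 ms, almost all of it positioning, which is why reordering requests to shorten seeks pays off.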

  37. Disk Scheduling Policies • Shortest-service-time-first (SSTF): pick the request that requires the least movement of the head • SCAN (back and forth over disk): good service distribution • C-SCAN (one way with fast return): lower service variability • Problem with SSTF, SCAN, and C-SCAN: the arm may not move for a long time (due to rapid-fire accesses to the same track) • N-step SCAN: scan of N records at a time by breaking the request queue into segments of size at most N and cycling through them • FSCAN: uses two sub-queues; during a scan one queue is consumed while the other one is produced
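
As an illustration of the first policy, here is a minimal SSTF sketch in C. It works on a fixed batch of track numbers; the function name, the 64-request limit, and the tie-breaking rule (lowest array index wins) are assumptions of this example, not something the slides specify.

```c
#include <assert.h>
#include <stdlib.h>

/* Serve the requests in SSTF order starting from track `head`.
 * Stores the service order in `order` (length n) and returns the total
 * head movement in tracks. Sketch limits: n <= 64, ties go to the
 * request with the lowest array index. */
long sstf(int head, const int *req, int n, int *order)
{
    int done[64] = {0};
    long moved = 0;

    assert(n >= 0 && n <= 64);
    for (int k = 0; k < n; k++) {
        int best = -1;
        for (int i = 0; i < n; i++) {
            if (done[i])
                continue;
            if (best < 0 || abs(req[i] - head) < abs(req[best] - head))
                best = i;      /* closest pending request so far */
        }
        done[best] = 1;
        moved += abs(req[best] - head);
        head = req[best];      /* the arm is now at the served track */
        order[k] = head;
    }
    return moved;
}
```

For a head at track 53 and pending requests 98, 183, 37, 122, 14, 124, 65, 67, this serves 65, 67, 37, 14, 98, 122, 124, 183 for a total movement of 236 tracks. Because the function always picks the nearest track, a steady stream of nearby requests could postpone a distant one indefinitely, which is the problem the N-step SCAN and FSCAN variants address.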

  38. Disk Management • Low-level formatting, or physical formatting: dividing a disk into sectors that the disk controller can read and write. • To use a disk to hold files, the operating system still needs to record its own data structures on the disk. • Partition the disk into one or more groups of cylinders. • Logical formatting or “making a file system”. • Boot block initializes system. • The bootstrap is stored in ROM. • Bootstrap loader program. • Methods such as sector sparing used to handle bad blocks.

  39. Swap-Space Management • Swap-space: virtual memory uses disk space as an extension of main memory. • Swap-space can be carved out of the normal file system, or, more commonly, it can be in a separate disk partition. • Swap-space management • 4.3BSD allocates swap space when the process starts; holds text segment (the program) and data segment. • Kernel uses swap maps to track swap-space use. • Solaris 2 allocates swap space only when a page is forced out of physical memory, not when the virtual memory page is first created.

  40. Disk Reliability • Several improvements in disk-use techniques involve the use of multiple disks working cooperatively. • RAID is one important technique currently in common use.

  41. RAID • Redundant Array of Inexpensive Disks (RAID) • A set of physical disk drives viewed by the OS as a single logical drive • Replace large-capacity disks with multiple smaller-capacity drives to improve the I/O performance (at lower price) • Data are distributed across physical drives in a way that enables simultaneous access to data from multiple drives • Redundant disk capacity is used to compensate for the increase in the probability of failure due to multiple drives • Improves availability because there is no single point of failure • Six levels of RAID representing different design alternatives

  42. RAID Level 0 • Does not include redundancy • Data is striped across the available disks • Total storage space across all disks is divided into strips • Strips are mapped round-robin to consecutive disks • A set of consecutive strips that maps exactly one strip to each disk in the array is called a stripe • Can you see how this improves the disk I/O bandwidth? • What access pattern gives the best performance? [Figure: four disks; stripe 0 holds strips 0-3, the next stripe holds strips 4-7, ...]
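
The round-robin mapping of strips to disks comes down to one division and one remainder. A minimal sketch, with a hypothetical struct and function name:

```c
#include <assert.h>

struct strip_loc {
    int disk;      /* which physical disk holds the strip */
    long stripe;   /* stripe number, i.e. the strip's row on that disk */
};

/* Round-robin placement of logical strip `strip` in an array of
 * `ndisks` disks: consecutive strips land on consecutive disks. */
struct strip_loc raid0_locate(long strip, int ndisks)
{
    struct strip_loc loc;

    assert(strip >= 0 && ndisks > 0);
    loc.disk = (int)(strip % ndisks);
    loc.stripe = strip / ndisks;
    return loc;
}
```

With four disks, strips 0 to 3 land on disks 0 to 3 in stripe 0, and strip 5 lands on disk 1 in stripe 1. A request that covers a whole stripe therefore touches every disk exactly once, so all of them can transfer in parallel.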

  43. RAID Level 1 • Redundancy achieved by duplicating all the data • Every disk has a mirror disk that stores exactly the same data • A read can be serviced by either of the two disks that contain the requested data (improved performance over RAID 0 if reads dominate) • A write request must be done on both disks but can be done in parallel • Recovery is simple but cost is high [Figure: mirrored disks; strips 0, 1, 2, 3, ... are each stored twice]

  44. RAID Levels 2 and 3 • Parallel access: all disks participate in every I/O request • Small strips since size of each read/write = # of disks * strip size • RAID 2: error correcting code is calculated across corresponding bits on each data disk and stored on log(# data disks) parity disks • Hamming code: can correct single-bit errors and detect double-bit errors • Less expensive than RAID 1 but still pretty high overhead – not really needed in most reasonable environments • RAID 3: a single redundant disk that keeps parity bits • P(i) = X2(i) ⊕ X1(i) ⊕ X0(i) • In the event of a failure, data can be reconstructed, e.g. X2(i) = P(i) ⊕ X1(i) ⊕ X0(i) • Can only tolerate a single failure at a time [Figure: data bits b0, b1, b2 and parity P(b)]

  45. RAID Levels 4 and 5 • RAID 4 • Large strips with a parity strip like RAID 3 • Independent access: each disk operates independently, so multiple I/O requests can be satisfied in parallel • Independent access ⇒ small write = 2 reads + 2 writes • Example: if a write is performed only on strip 0: • P’(i) = X2(i) ⊕ X1(i) ⊕ X0’(i) • = X2(i) ⊕ X1(i) ⊕ X0’(i) ⊕ X0(i) ⊕ X0(i) • = P(i) ⊕ X0’(i) ⊕ X0(i) • Parity disk can become a bottleneck • RAID 5 • Like RAID 4 but parity strips are distributed across all disks [Figure: strips 0-2 with parity strip P(0-2); strips 3-5 with parity strip P(3-5)]
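
The parity relations used by RAID 3, 4, and 5 are plain XOR, so they are easy to try out. The sketch below is illustrative only; the function names are hypothetical, and strips are modeled as byte arrays in memory rather than disks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* XOR the strips of `ndisks` disks into `out`.
 * strips[d] points at the strip stored on disk d; each is len bytes. */
void parity_compute(const uint8_t *const *strips, int ndisks, size_t len,
                    uint8_t *out)
{
    assert(ndisks > 0);
    for (size_t i = 0; i < len; i++) {
        uint8_t p = 0;
        for (int d = 0; d < ndisks; d++)
            p ^= strips[d][i];
        out[i] = p;
    }
}

/* Small write: fold the change to one strip into the existing parity,
 * P' = P xor X_old xor X_new, without reading the other data disks. */
void parity_update(uint8_t *parity, const uint8_t *old_strip,
                   const uint8_t *new_strip, size_t len)
{
    for (size_t i = 0; i < len; i++)
        parity[i] ^= old_strip[i] ^ new_strip[i];
}
```

Reconstruction needs no separate routine: calling parity_compute() on the surviving data strips plus the parity strip yields the lost strip, because XOR-ing a value in twice cancels it. parity_update() shows where the "2 reads + 2 writes" of a small write come from: read the old strip and the old parity, then write the new strip and the new parity.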

  46. File System • File system is an abstraction of the disk • File → Track/sector • To a user process • A file looks like a contiguous block of bytes (Unix) • A file system provides a coherent view of a group of files • A file system provides protection • API: create, open, delete, read, write files • Performance: throughput vs. response time • Reliability: minimize the potential for lost or destroyed data • E.g., RAID could be implemented in the OS as part of the disk device driver

  47. File API • To read or write, need to open • open() returns a handle to the opened file • OS associates a data structure with the handle • Data structure maintains current “cursor” position in the stream of bytes in the file • Reads and writes take place from the current position • Can specify a different location explicitly • When done, should close the file

  48. Files vs. Disk [Figure: files on one side, the disk on the other, with the mapping between them marked ???]

  49. Files vs. Disk [Figure: files mapped onto the disk with a contiguous layout] • What’s the problem with this mapping function? • What’s the potential benefit of this mapping function?

  50. Files vs. Disk [Figure: files mapped onto the disk] • What’s the problem with this mapping function?
