CS 620 Advanced Operating Systems

CS 620 Advanced Operating Systems Lecture 5 – Processes Professor Timothy Arndt BU 331

Threads • Motivation • Keep programs interactive. • Programs may block waiting for I/O. • This is a bad thing if the program is interactive. • Background save. • Consider saving a large file to a slow disk system. • The user has to wait for the save operation to finish before proceeding. • Multiple clients. • Consider also a server program with multiple clients. • We must fork off a new process for each client.

Threads • Basic idea • Multiple lightweight processes. • Traditional processes are created very slowly and use many system resources. • Threads are similar to processes. • They have their own stack and PC. • Single address space. • By sharing a single address space, there is no need to have separate (costly) address spaces. • Since threads of a single program cooperate (unlike processes of separate programs) it is possible for them to share an address space.

Thread Usage in Nondistributed Systems • Context switching as the result of IPC

Threads • Threads also share: • open files • child processes • timers • etc. • Synchronization primitives available. • Necessary since threads share a common memory. • Semaphores, mutexes, etc.

Threads • So what’s a thread? • Stack • Program counter • what about I/O, signals, global variables (like errno?)?

Threads • Is it managed from user or kernel level? • User threads have a very cheap task context switch • Kernel threads handle blocking I/O cleanly • In order for user threads not block, we need extended models for I/O • E.g. select() indicates which files are ready to transfer data so we don’t block • Hybrid is also possible

Thread Implementation • Combining kernel-level lightweight processes and user-level threads.

Threads • Preemption can be implemented via signal • Should user-level threads be preempted? • Easier programming model if processes yield() the processor. • But it is a nuisance to program with extra yield() calls • Preemption can be controlled with special no preempt regions

Threads • So how do you use threads? • User interface/computation/I/O handled separately (think of a browser) • Pop-up server threads • On multiprocessor systems, we can have threads working in parallel on the multiple processors as an alternative to shared memory IPC

Multithreaded Servers (1) • A multithreaded server organized in a dispatcher/worker model.

Threads in Windows • Each process in Windows contains one or more threads. • Threads are the executable units in Windows, not processes. • Threads have the following attributes: • The PID of the process that owns it. • A numeric base priority specifying its importance relative to other threads. • A dynamic priority. • Its execution time so far. • An allowed processor set. • An exit status.

Threads in Windows • The Windows Kernel schedules threads and handles interrupts and exceptions. • The Kernel schedules or dispatches threads for the processor in order of priority. • It also preempts threads of lower priority in favor of threads of higher priority. • It can force context switches, directing the processor to drop one task ands pick up another. • Therefore code operating in this system must be reentrant. (Able to be interrupted and resumed unharmed and shared by different threads executing the code on different processors.)

Threads in Windows • The Kernel’s own code does not, technically, run in threads. • Hence it is the only part of the OS that is not preemptible or pageable. • The rest of the threads in Windows are preemptible and fully reentrant. • Code which is non-reentrant can cause serialization, damaging the performance of the OS on SMP machines. • The Kernel schedules ready threads for processor time based upon their dynamic priority, a number from 1 to 31.

Threads in Windows • The highest priority thread always runs on the processor, even if this requires that a lower-priority thread be interrupted. • The base priority class of a process establishes a range for the base priority of the process and its thread. The base priority classes are: • Idle • Normal • High • Real-Time • The base priority of a process varies within the range established by its base priority class.

Threads in Windows • When a user interacts with a process (the process window is at the top of the window stack), Windows boosts the priority of the process to maximize its response. • The base priority of a thread is a function of the base priority of the process in which it runs. It varies within +/- 2 from the base priority of the process. • The dynamic priority of a thread is a function of its base priority. Windows continually adjusts the dynamic priority of threads within the range established by its base priority. • The base priority class of a running process can be changed by using Task Manager.

Threads in Windows • The Windows Kernel takes maximum advantage of multiprocessor configurations by implementing symmetric multiprocessing (SMP) and soft affinity. • SMP allows the threads of any process, including the OS, to run on any processor. • The threads of a single process can run on different processors at the same time. • With soft affinity, the Kernel attempts to run the thread on the last processor it ran on. • Applications can restrict threads to run only on certain processors (hard affinity).

Threads in Windows • The Kernel manages two types of objects: • Dispatcher objects have a signal state (signaled or nonsignaled) and control dispatching and synchronization of system operations. • Semaphores, mutexes, events, etc. • Control objects are used to control the operation of the Kernel but do not affect dispatching. • Processes, interrupts, etc. • The I/O Manager (a component of the Windows Executive) supports Asynchronous I/O. • Asynchronous I/O allows an application to continue working while an I/O operation completes.

Threads in Windows • A thread may wait for the I/O to complete or we may use an application procedure call (APC) that the I/O manager calls when the I/O completes or we may use a synchronization object (e.g. an event) that the I/O system sets to the signaled state when I/O completes. • The Process Manager creates and deletes processes and tracks process objects and thread objects. • The subsystems define the rules for threads and processes.

System Model • We look at three models: • Workstations (zero cost solution) • Clusters (a.k.a. NOW a.k.a. COW, a.k.a. LAMP, a.k.a. Beowulf, a.k.a. pool) • Hybrid • Workstation model • Connect workstations in department via LAN • Includes personal workstations and public ones • We often have dedicated file servers

Workstation Model

Workstation Model - UMich

Workstations • The workstations can be diskless • Not so popular anymore (disks are cheap) • Maintenance is easy • Must have some startup code in ROM • If you have a disk on the workstation you can use it for • 1. Paging and temporary files • 2. 1. + (some) system executables • 3. 2. + file caching • 4. full file system

Workstations • Case 1 is often called dataless • Just as easy to maintain (software) as diskless • We still need startup code in ROM • Case 2 • Reduces load more and speeds up program start time • Adds maintenance since new releases of programs must be loaded onto the workstations • Case 3 • We can have a very few executables permanently on the disk • Must keep the caches consistent • Not trivial for data files with multiple writers • This issue comes up for NFS as well • Should you cache whole files or blocks?

Workstations • Case 4 • You can work if just your machine is up • But you lose location transparency • Also requires the most maintenance • Using idle workstations • Early systems did this manually via rsh • Still used today. • How do you find idle workstations?

Workstations • Idle = no mouse or keyboard activity and low load average • Workstation can announce it is idle and this is recorded by all • A job looking for a machine can inquire • Must worry about race conditions • Some jobs want a bunch of machines so they look for many idle machines • Can also have centralized solution, processor server • Usual tradeoffs apply here • What about the local environment?

Workstations • Files on servers are no problem • Requests for local files must be sent home • ... but not needed for temporary files • System calls for memory or process management probably need to be executed on the remote machine • Time is a bit of a mess unless have we time synchronized by a system like ntp • If a program is interactive, we must deal with devices • mouse, keyboard, display • What if the borrowed machine becomes non-idle (i.e. the owner returns)?

Workstations • Detect presence of user. • Kill off the guest processes. • Helpful if we made checkpoints (or ran short jobs) • Erase files, etc. • We could try to migrate the guest processes to other hosts but this must be very fast or the owner will object. • Our goal is to make owner not be aware of our presence. • May not be possible since you may have paged out his basic environment (shell, editor, X server, window manager) that s/he left running when s/he stopped using the machine.

Clusters • Bunch of workstations without displays in machine room connected by a network. • They are quite popular now. • Indeed some clusters are packaged by their manufacturer into a serious compute engine. • Ohio Supercomputing Center replaced MPP and Vector supercomputers with clusters • Used to solve large problems using many processors at one time • Pluses of large time sharing system vs. small individual machines.

A Cluster System

Clusters • Also the minuses of timesharing. • We can use easy queuing theory to show that a large fast server better in some cases than many slower personal machines. • Hybrid • Each user has a workstation and uses the pool for big jobs. • It is the dominant model for cluster based machines.

Virtualization • Virtualization has a long history. • It was important in the 1960s/70s • Faded during the 1980s/90s • Increasing importance nowadays • Security • Ease of management • Different types of virtualization • Process virtual machine • Virtual machine monitor (hardware virtual machine)

Virtualization • Process virtual machine • JVM • Macromedia Flash Player • Wine • VMM • VMWare • Parallels • VirtualBox • Microsoft Virtual PC

The Role of Virtualization in Distributed Systems • (a) General organization between a program, interface, and system. (b) General organization of virtualizing system A on top of system B.

Architectures of Virtual Machines • Interfaces at different levels • An interface between the hardware and software consisting of machine instructions • that can be invoked by any program. • An interface between the hardware and software, consisting of machine instructions • that can be invoked only by privileged programs, such as an operating system.

Architectures of Virtual Machines • Interfaces at different levels • An interface consisting of system calls as offered by an operating system. • An interface consisting of library calls • generally forming what is known as an application programming interface (API). • In many cases, the aforementioned system calls are hidden by an API.

Architectures of Virtual Machines • Figure 3-6. Various interfaces offered by computer systems.

Architectures of Virtual Machines • A process virtual machine, with multiple instances of (application, runtime) combinations.

Architectures of Virtual Machines • A virtual machine monitor, with multiple instances of (applications, operating system) combinations.

Processor Allocation • Processor Allocation • Decide which processes should run on which processors. • Could also be called process allocation. • We assume that any process can run on any processor.

Processor Allocation • Often the only difference between different processors is: • CPU speed • CPU speed and amount of memory • What if the processors are not homogeneous? • Assume that we have binaries for all the different architectures. • What if not all machines are directly connected • Send process via intermediate machines

Processor Allocation • If we have only PowerPC binaries, restrict the process to PowerPC machines. • If we need machines very close for fast communication, restrict the processes to a group of close machines. • Can you move a running process or are processor allocations done at process creation time? • Migratory allocation algorithms vs. non migratory.

Processor Allocation • What is the figure of merit, i.e. what do we want to optimize in order to find the best allocation of processes to processors? • Similar to CPU scheduling in centralized operating systems. • Minimize response time is one possibility.

Processor Allocation • We are not assuming all machines are equally fast. • Consider two processes. P1 executes 100 millions instructions, P2 executes 10 million instructions. • Both processes enter system at time t=0 • Consider two machines A executes 100 MIPS, B 10 MIPS • If we run P1 on A and P2 on B each takes 1 second so average response time is 1 sec. • If we run P1 on B and P2 on A, P1 takes 10 seconds P2 .1 sec. so average response time is 5.05 sec. • If we run P2 then P1 both on A finish at times .1 and 1.1 so average response time is .6 seconds!!

Processor Allocation • Minimize response ratio. • Response ratio is the time to run on some machine divided by time to run on a standardized (benchmark) machine, assuming the benchmark machine is unloaded. • This takes into account the fact that long jobs should take longer. • Maximize CPU utilization • Throughput • Jobs per hour • Weighted jobs per hour

Processor Allocation • If weighting is CPU time, we get CPU utilization • This is the way to justify CPU utilization (user centric) • Design issues • Deterministic vs. Heuristic • Use deterministic for embedded applications, when all requirements are known a priori. • Patient monitoring in hospital • Nuclear reactor monitoring • Centralized vs. distributed • We have a tradeoff of accuracy vs. fault tolerance and bottlenecks.

Processor Allocation • Optimal vs. best effort • Optimal normally requires off line processing. • Similar requirements as for deterministic. • Usual tradeoff of system effort vs. result quality. • Transfer policy • Does a process decide to shed jobs just based on its own load or does it have (and use) knowledge of other loads? • Also called local vs. global • Usual tradeoff of system effort (gather data) vs. result quality.

Processor Allocation • Location policy • Sender vs. receiver initiated. • Sender initiated - uploading programs to a compute server • Receiver initiated - downloading Java applets • Look for help vs. look for work. • Both are done.

Processor Allocation • Implementation issues • Determining local load • Normally use a weighted mean of recent loads with more recent weighted higher.

CS 620 Advanced Operating Systems