Chapter 9, Virtual Memory Overheads, Part 2, Sections 9.6-9.10


  1. Chapter 9, Virtual Memory Overheads, Part 2, Sections 9.6-9.10

  2. 9.6 Thrashing • As noted earlier, in order for virtual memory to be practical, extremely low fault rates are necessary • If the number of frames allocated to a process falls below a certain level, the process will tend to fault frequently • This leads to bad performance (i.e., the process will run slowly)

  3. Thrashing is the term used to describe the situation when the page fault rate is too high • Thrashing can be loosely defined as follows: A process is thrashing if it’s spending more time paging than it is executing • Note that due to the relative amount of time needed to access secondary storage compared to the size of a time slice, for example, it doesn’t take much paging before the time spent paging exceeds the time spent executing

  4. It was pointed out earlier that there is an absolute minimum number of frames that a process would need in order to execute • If the number of frames fell below that threshold, then the process would have to be swapped out • The number of frames needed to prevent thrashing would be (considerably) higher than the absolute minimum • It is desirable to find ways to avoid thrashing short of simply swapping jobs out

  5. The cause of thrashing • The book illustrates the idea with an example that occurred in an early system before this behavior was well understood • 1. Let the O/S scheduling subsystem monitor CPU utilization, and when it’s low, schedule additional jobs

  6. 2. Independently of scheduling, let global page replacement be done • Suppose any one job enters a heavy paging cycle • It will steal pages from other jobs • When the other jobs are scheduled, they end up paging in order to steal memory back

  7. As more processes spend time paging, CPU utilization goes down • The scheduler sees this and schedules additional jobs • Adding more jobs increases the problem • It’s a vicious cycle • (It’s a feedback loop where the feedback causes the wrong action to be taken.) • The book’s diagram (not reproduced here) illustrates the effect of thrashing on performance: CPU utilization climbs with the degree of multiprogramming up to a peak, then drops sharply once thrashing sets in

  8. Thrashing occurs when too many jobs are competing for too little memory • Modern scheduling algorithms take into account not just CPU utilization, but paging behavior • When the values for each of these parameters indicate that thrashing is imminent, no new jobs should be scheduled

  9. In order to keep users happy, multi-user systems tend to try to allow as many users as possible, accepting a certain degradation of performance • In single user, desktop type systems, there is no effective limit on how many different programs the user might try to run at the same time • The consequence is that even with appropriate scheduling algorithms, it’s still possible for a modern system to enter a thrashing state • If a system is finally overcome by thrashing, the ultimate solution, although not ideal, is to swap out or terminate jobs

  10. Local page replacement strategies are less prone to thrashing • Since processes can’t steal frames from each other, they are insulated from each other’s behavior • However, if the allocation each process gets is too small, each process individually can end up thrashing for the very reason that it can’t acquire any more frames

  11. The ultimate question is, how many frames does a process need? • More precisely, the question is, how many frames does a process need at any given time? • Locality of reference underlies these questions • At any given time, a program is executing in some cluster of pages • The goal is to allocate enough frames for whatever locality or cluster the program is currently in

  12. The working set model • This is a model of memory usage and allocation that is based on locality of reference • Define a parameter Δ • Let this be an integer which tells how many page references back into the past you want to keep track of

  13. Define the working set window to be the Δ most recent page references • Define the working set for this Δ to be the set of unique page references within the Δ most recent references • Remember memory reference strings—the starting point for considering working sets is the set of memory references

  14. Let this sequence of references be given • 1, 2, 3, 2, 1, 4, 6, 7, 3, 2, 1, 4, 5, 6, 7 • Let Δ = 5 • Let t1 mark the window of the first five memory references: 1, 2, 3, 2, 1 • Then the working set at t1 is WS(t1) = {1, 2, 3} • The working set size at t1 is WSS(t1) = 3

  15. Let t2 mark the window of the five memory references beginning with page 6: 6, 7, 3, 2, 1 • Then the working set at t2 is WS(t2) = {1, 2, 3, 6, 7} • The working set size at t2 is WSS(t2) = 5
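
To make the calculation concrete, here is a minimal Java sketch (not from the book) that computes the working set and working set size for the reference string and Δ = 5 used in slides 14 and 15:

    import java.util.HashSet;
    import java.util.Set;

    public class WorkingSetDemo {
        // The working set at time t: the unique pages among the
        // delta most recent references ending at index t (inclusive).
        static Set<Integer> workingSet(int[] refs, int t, int delta) {
            Set<Integer> ws = new HashSet<>();
            for (int i = Math.max(0, t - delta + 1); i <= t; i++)
                ws.add(refs[i]);
            return ws;
        }

        public static void main(String[] args) {
            int[] refs = {1, 2, 3, 2, 1, 4, 6, 7, 3, 2, 1, 4, 5, 6, 7};
            Set<Integer> ws1 = workingSet(refs, 4, 5);  // window t1: first five refs
            Set<Integer> ws2 = workingSet(refs, 10, 5); // window t2: refs 6, 7, 3, 2, 1
            System.out.println("WS(t1) = " + ws1 + ", WSS(t1) = " + ws1.size());
            System.out.println("WS(t2) = " + ws2 + ", WSS(t2) = " + ws2.size());
        }
    }

Running this prints working set sizes of 3 and 5, matching the two windows above.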

  16. The general idea is to use the number of unique page references that occurred in the last Δ references as the model for ongoing behavior • It’s similar to other spots where past behavior is used to model future behavior • The set of unique pages referenced within the window is the working set, and its size is the working set size • The goal is to allocate to a process a number of frames equal to the number of unique references

  17. The premise of the working set model is that there is some number of frames which a process needs in order to execute without an excessively high paging rate • The idea is that the number of unique references in the last Δ references, the working set size, will be used to determine how many frames that should be

  18. Note that this still doesn’t address all of the problems • How big should Δ be? • If the number of unique references always equals Δ, that suggests the window doesn’t actually capture a full working set and Δ should be larger • If the number of unique references is significantly smaller than Δ, that suggests that it is not necessary to look at that many previous references and Δ could be smaller

  19. The problem with this is imprecision • As noted earlier, a desired page fault rate might be in the range of 1 out of 100,000 or less • For any given Δ, what are the chances that the number of unique references in that window will closely match the number of frames the process actually needs as it continues to run? • The problem is not just that Δ is an imprecise measure. The problem also is that the working set size of a process changes over time

  20. Δ’s and working set sizes calculated from them may be more useful in the aggregate • System-wide demand for memory frames is given by this formula, where the subscript i ranges over the processes: • D = Σ WSSi

  21. If D is greater than the total number of frames, memory is over-allocated • This means that some degree of thrashing is occurring • This suggests that the scheduling algorithm should be suspending/swapping out jobs • This leads to the question of how to choose victims
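
As a minimal sketch of this bookkeeping in Java (the WSS values and frame count below are invented for illustration, not taken from the book):

    public class MemoryDemand {
        public static void main(String[] args) {
            int[] wss = {12, 40, 7, 25};  // hypothetical WSS_i, one per process
            int totalFrames = 64;         // hypothetical number of physical frames
            int d = 0;                    // D = sum over i of WSS_i
            for (int size : wss) d += size;
            if (d > totalFrames)
                System.out.println("D = " + d + " > " + totalFrames
                    + ": over-allocated; suspend or swap out a victim");
            else
                System.out.println("D = " + d + " <= " + totalFrames
                    + ": memory could accommodate more jobs");
        }
    }

With these made-up numbers D = 84 exceeds the 64 available frames, so the scheduler should reduce the multiprogramming level.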

  22. If D is less than the total number of frames, that means that from the point of view of memory, more jobs could be loaded and potentially scheduled • Note that the optimal number of jobs to be scheduled also depends on the mix of CPU and I/O bound jobs • It is conceivable that in order to optimize some measure of performance, like response time, the number and size of jobs scheduled would not always completely allocate physical memory

  23. The book gives a sample scenario for how to approximately determine a working set • Let a reference bit be maintained (in the page table, for example) which records whether or not a page has been accessed • In addition, let two bits be reserved for recording “historical” values of the reference bit • Let the Δ of interest be 10,000

  24. Let a counter be set to trigger an interrupt after every 5,000 page references • When the interrupt occurs, shift the right history bit one position left, and shift the current reference bit into the vacated position • Then clear the current reference bit

  25. What this really accomplishes is to make the clock algorithm historical • When a page fault occurs, the question boils down to, what page is a suitable victim? • Put another way, the question is, what page is no longer in the working set? • The point is that a page that is no longer in the working set is a suitable victim

  26. A page where the current access bit is 0 and both history bits are 0 is considered to be out of the working set • Saying that Δ was 10,000 was inexact • When a page fault occurs, you are somewhere within the current count of 5,000 accesses, say at count x • If all of the access bits are 0, then the page has not been accessed within the last x + 5,000 + 5,000 page references
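
A minimal Java sketch of this approximation follows; the bit layout (one int per page holding the reference bit plus two history bits) is an assumption made for illustration, not the book's representation:

    public class WorkingSetApprox {
        // Per page: bit 2 = current reference bit, bits 1 and 0 = history bits.
        private final int[] bits;

        WorkingSetApprox(int pages) { bits = new int[pages]; }

        // The MMU would set the reference bit on each access; simulated here.
        void touch(int page) { bits[page] |= 0b100; }

        // Interrupt handler, triggered every 5,000 references: the right
        // history bit shifts one position left, the reference bit shifts
        // into the vacated position, and the reference bit is cleared.
        void onTimerInterrupt() {
            for (int p = 0; p < bits.length; p++) {
                int ref = (bits[p] >> 2) & 1;  // current reference bit
                int h0  = bits[p] & 1;         // right (most recent) history bit
                bits[p] = (h0 << 1) | ref;     // new history; reference bit now 0
            }
        }

        // A page with all three bits 0 has not been touched in roughly the
        // last x + 5,000 + 5,000 references: it is out of the working set
        // and therefore a suitable victim on a page fault.
        boolean isVictimCandidate(int page) { return bits[page] == 0; }
    }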

  27. If a finer grained approximation of Δ is desired, more history bits can be maintained and the interrupt can be triggered more often, with attendant costs • On the other hand, although the working set model is useful and has been implemented to varying degrees, it is not necessarily the cleanest, easiest way of avoiding thrashing

  28. Page fault frequency • This topic is mercifully brief and short on details • The basic problem with thrashing is a high page fault rate • Rather than trying to track the working set sizes of processes, the problem can be approached more directly by tracking the page fault rate

  29. The question again arises, are you talking globally or locally? • In general, the discussion seems to be global • If the global fault rate goes too high, pick a victim and decrease the multi-programming level

  30. There are various ways to pick a victim • In particular, you might think there was one offending process that had caused thrashing by entering a phase where it was demanding more memory • That may be true, but if it was a high priority process, it might not be a suitable victim • Any victim will do, because any victim will release memory to the remaining processes

  31. If you were tracking the page fault rate on individual processes, the scheme still works • If a process starts generating too many page faults, it means that it hasn’t been granted a large enough working set • If that process is of sufficient priority, then some other process will have to be swapped out in order to give up its memory allocation
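
A minimal per-process sketch of page fault frequency control in Java; the thresholds are invented for illustration, since the book is short on details here:

    public class PffMonitor {
        static final double TOO_HIGH = 1.0 / 10_000;    // hypothetical upper bound
        static final double TOO_LOW  = 1.0 / 1_000_000; // hypothetical lower bound

        private long references, faults;

        // Called on every memory reference made by this process.
        void onReference(boolean pageFault) {
            references++;
            if (pageFault) faults++;
        }

        // Checked periodically by the O/S for each process.
        String decision() {
            double rate = references == 0 ? 0 : (double) faults / references;
            if (rate > TOO_HIGH)
                return "fault rate too high: grant more frames, or swap out a victim";
            if (rate < TOO_LOW)
                return "fault rate very low: some frames could be taken away";
            return "allocation is about right";
        }
    }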

  32. 9.7 Memory-Mapped Files • In a high level language program, file operations include open(), read(), write(), etc. • It is possible and usual for these operations to trigger an action that directly affects the file in secondary storage • Using virtual memory techniques, it’s also possible to memory map the file

  33. That means that just like a program file or dynamic data structure, memory pages can be allocated to keep all or part of the data file in memory • Then at run time the file operations would simply be translated into memory accesses

  34. Assuming the system supports this and you have memory to spare, a program with file operations can run orders of magnitude faster and accomplish the same result • At the end of the run, all of the changed pages can be written out to update the copy of the file in secondary storage

  35. Various details of memory mapped files • In theory, for a small file, you might read the whole thing in ahead of time • However, it should also be possible to rely on demand paging to bring it in as needed • Note that unlike program files, data files would probably be paged out of the file system, not out of an image in swap space

  36. There is also the question of how memory page sizes map to secondary storage blocks (sectors, etc.) • This question also exists for normal paging—it is resolved at the lowest level of the O/S in concert with the MMU and disk access subsystem • In both cases, program paging and memory mapped files, it is simply transparent • An open file is accessible using file operations, whether it’s memory mapped or not

  37. Some systems periodically check for modified pages and write them out • Again, this can be done both for regular paging and for memory mapped files • In either case, it’s transparent • In the case of a memory mapped file, it means that the cost of writing it out in its entirety will not be incurred at closing time.

  38. Implementation details • Some systems only memory map files if specific system calls are made • Other systems, like Solaris, in effect always memory map files • If an application makes memory mapping calls, this is done in user space • If an application doesn’t make memory mapping calls, the system still does memory mapping, but in system space

  39. The concepts of shared memory and shared libraries were raised earlier • Some systems also support shared access to memory mapped files • If processes only read the files, no complications result • If processes write to the files, then concurrency control techniques have to be implemented

  40. It is also possible to apply the copy on write principle • This was mentioned earlier in the context of memory pages shared by a parent and child process • With files, the idea is that if more than one process has access to the memory mapped file and one of them writes to it, then a new frame is allocated, the page is copied, and the write is made to the copy

  41. The result is potentially a different copy of the file for each process • Pages that had not been written to could still be shared • Those that had been written could not • When each process closed the file, its separate copy would have to be saved with a distinct name or identifier
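
In Java this behavior is available through FileChannel.MapMode.PRIVATE, the same PRIVATE mode mentioned on slide 46. A minimal sketch, assuming a non-empty file with the made-up name shared.dat already exists:

    import java.io.RandomAccessFile;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;

    public class PrivateMapDemo {
        public static void main(String[] args) throws Exception {
            try (RandomAccessFile raf = new RandomAccessFile("shared.dat", "rw");
                 FileChannel fc = raf.getChannel()) {
                // PRIVATE = copy-on-write mapping of the whole file
                MappedByteBuffer buf =
                    fc.map(FileChannel.MapMode.PRIVATE, 0, fc.size());
                // This write goes to a private copy of the affected page;
                // neither the file on disk nor other mappings see the change.
                buf.put(0, (byte) 42);
            }
        }
    }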

  42. In Windows NT, 2000, and XP, shared memory and memory mapped files are implemented using the same techniques and are called using the same interface • In Unix and Linux systems, the implementations of shared memory and memory mapped files are separate

  43. Memory mapped files in Java • The syntactical details themselves are not important—an example is given, but no practical example of its use, and there is no assignment on this • However, it’s useful to understand that a high level language API includes facilities for memory mapped files • This is not just some hidden feature of an O/S. It’s a functionality that an application programmer can specify in code

  44. Going through the keywords in Java is a handy way of reviewing the concrete aspects of memory mapping files • Not surprisingly, this functionality exists in the I/O packages in Java. The book’s example imports these packages: • java.io.* • java.nio.* • java.nio.channels.*

  45. On Java’s file stream objects (FileInputStream, FileOutputStream, and RandomAccessFile) you can call the method getChannel() • This returns a reference to a FileChannel object • On a FileChannel object you can call the method map() • This returns a reference to a MappedByteBuffer object • This is a buffer of bytes belonging to the file which is mapped into memory

  46. The map() method takes three parameters: • 1. mode = READ_ONLY, READ_WRITE, or PRIVATE • If the mode is private, this means copy-on-write is in effect • 2. position = the byte offset into the file where memory mapping is started • In a garden variety case, you’d just start at the beginning, offset 0

  47. 3. size = how many bytes of the file beyond the starting position are memory mapped • In a garden variety case you’d map the whole file • The FileChannel class has a method size() which returns the length of the file in bytes
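
Putting slides 44 through 47 together, here is a minimal sketch of mapping a whole file read-write; the file name data.bin is made up, and the file is assumed to already exist and be non-empty:

    import java.io.RandomAccessFile;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;

    public class MapWholeFile {
        public static void main(String[] args) throws Exception {
            try (RandomAccessFile raf = new RandomAccessFile("data.bin", "rw");
                 FileChannel fc = raf.getChannel()) {
                // mode = READ_WRITE, position = 0, size = the whole file
                MappedByteBuffer buf =
                    fc.map(FileChannel.MapMode.READ_WRITE, 0, fc.size());
                byte first = buf.get(0);        // a file read becomes a memory access
                buf.put(0, (byte) (first + 1)); // the dirty page is written back later
            }
        }
    }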

  48. The Java API doesn’t have a call that will return the size of a memory page on the system • If it’s desirable in user code to keep track of pages, the page size will have to be obtained separately • In other words, the page size will have to be hardcoded, and the code will then be system dependent
