Memory management Lecture 7 ~ Winter, 2007 ~
Contents • Context and definition • Basic memory management • Swapping • Virtual memory • Paging • Page replacement algorithms • Segmentation
The context • The need for a memory that is • very large • very fast • nonvolatile • Types of memory – hierarchy • small, very fast and expensive, volatile cache memory • hundreds of MBs of medium-speed, medium-price, volatile main memory (RAM) • tens or hundreds of GBs of slow, cheap, nonvolatile disk storage
Definition • Memory manager • an OS component that manages the main memory of a system (memory management) • Its role is to • coordinate how the different types of memory are used • keep track of which parts of memory are in use and which are not • allocate and release areas of main memory to processes • manage swapping between main memory and disk, when main memory is too small to hold all the processes
Basic memory management Mono-programming (1) • No swapping or paging • Run only one program at a time • In memory are loaded • the only program that is run • the OS • Ways of organizing memory • OS at the bottom of memory and the user program above • OS at the top of memory (ROM) and the user program below • OS at the bottom of memory, device drivers in ROM (mapped at the top of memory – BIOS) and the user program between them
Basic memory management Multi-programming • More programs loaded into memory at the same time • Increases CPU utilization • Multi-programming with fixed memory partitions • Divide memory up into n partitions • fixed sizes, not necessarily equal → lost space • Waiting queues for partitions • different waiting queues for different partitions • one global waiting queue • Different strategies to choose a process that fits in a free partition • the first that fits → waste of space • the largest that fits → discriminates against small processes • a process not to be skipped over more than k times
Basic memory management Relocation and protection (1) • Context of multiprogramming • More processes in memory at the same time • Different processes will be loaded and run at different addresses • Need for relocation • the addresses of variables and code routines cannot be absolute, only relative • the relative addresses used in a process must be translated into real addresses • Need for protection • Protect the code of the OS against the processes • Protect each process against the other processes
Basic memory management Relocation and protection (2) • Linker – Loader method • Relocate the addresses as the program is loaded into memory • The linker has to generate a list of the addresses that have to be relocated • Base and limit registers • Add the base register value to every address • Compare every address with the value of the limit register
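The base and limit scheme amounts to one comparison and one addition on every memory reference. A minimal Python sketch (the base and limit values are made-up examples):

```python
def translate(virtual_addr, base, limit):
    """Relocation and protection with base/limit registers:
    check the address against the limit, then add the base."""
    if virtual_addr >= limit:      # protection: outside the process's area
        raise MemoryError("address outside process bounds")
    return base + virtual_addr     # relocation

# a process loaded at physical address 300000, 120000 bytes long
print(translate(16384, base=300000, limit=120000))  # → 316384
```

In real hardware both steps happen in the MMU on every reference, which is why the limit check costs no extra instructions in the program itself.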
Swapping Description • The reason • no more space in memory to keep all the active processes • The technique • some processes are kept on the disk and brought in to run dynamically • swap out (memory → HDD) • swap in (HDD → memory) • at any moment a process is either entirely in memory, so it can run, or entirely on the HDD
Swapping Advantages and disadvantages • Similar to the technique of fixed size partitions, but • variable number of partitions • variable size of partitions • Improves memory utilization • Allocating and deallocating memory is more complicated • Memory compaction – eliminates holes • Pre-allocates more space than needed – for possibly growing segments • Keeping track of free memory is more complicated
Swapping Memory management with bitmaps (1) • The memory is divided up into allocation units of the same size • Each allocation unit has a corresponding bit in the bitmap • 0 → the unit is free • 1 → the unit is allocated • The size of the allocation unit is important • The smaller the unit, the larger the bitmap • The larger the unit, the smaller the bitmap, but more memory is wasted (internal fragmentation) • Simple to use and implement • Searching for k consecutive free units is slow
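The slow part is exactly the search for k consecutive free units; a sketch of that linear scan (the bitmap contents are an invented example):

```python
def find_free_run(bitmap, k):
    """Return the index of the first run of k consecutive free
    (0) allocation units, or -1; this linear scan is the slow part."""
    run_start = run_len = 0
    for i, bit in enumerate(bitmap):
        if bit == 0:
            if run_len == 0:
                run_start = i      # a new run of free units begins here
            run_len += 1
            if run_len == k:
                return run_start
        else:
            run_len = 0            # run broken by an allocated unit
    return -1

bitmap = [1, 1, 0, 0, 1, 0, 0, 0, 1]
print(find_free_run(bitmap, 3))  # → 5
print(find_free_run(bitmap, 4))  # → -1
```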
Swapping Memory management with linked lists (1) • A single list of allocated and free segments (process or hole) of memory • List sorted by memory address • updating the list is simple and fast
Swapping Memory management with linked lists (2) • Allocation of memory • First fit – fast • Next fit – slightly worse performance than first fit • Best fit – slower; results in more wasted memory • Worst fit • Separate lists for processes and holes • Speeds up searching for a hole at allocation • Complicates releasing of memory • Holes list can be sorted by size • Quick fit
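First fit over such a segment list can be sketched as follows, with each segment kept as (start, length, owner) and owner None marking a hole (the list contents are invented):

```python
def first_fit(segments, size, owner):
    """Allocate `size` units for `owner` in the first hole big enough;
    segments is a list of (start, length, owner), owner None = hole."""
    for i, (start, length, who) in enumerate(segments):
        if who is None and length >= size:
            segments[i] = (start, size, owner)
            if length > size:   # leftover stays on the list as a smaller hole
                segments.insert(i + 1, (start + size, length - size, None))
            return start
    return None                 # no hole large enough

segs = [(0, 4, "A"), (4, 3, None), (7, 5, "B"), (12, 6, None)]
print(first_fit(segs, 2, "C"))  # → 4 (first hole that fits, split in two)
```

Next fit, best fit and worst fit differ only in how this loop scans (resuming where it stopped, or picking the smallest/largest adequate hole instead of the first).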
Virtual Memory Definition and terms (1) • The context • the programs (code, data, stack) exceed the amount of physical memory available for them • The technique • the programs are not entirely loaded in memory • the OS keeps in main memory only those parts of a program that are currently in use, and the rest on the disk • swapping is used between main memory and disk • The result • the illusion that a computer has more memory than it actually has • each process has the illusion that it is the only process loaded in memory and can access the entire memory
Virtual Memory Definition and terms (2) • Virtual addresses • program memory addresses • Virtual address space • all the (virtual) addresses a program can generate • determined by the number of bits used to specify an address • Physical (real) addresses • addresses in main memory (on the memory bus) • limited by the physical memory available • Memory Management Unit (MMU) • a unit that maps virtual addresses onto physical addresses
Paging Definition • Virtual address space • divided up into units of the same size called pages • pages from 0 to AddressSpaceSize / PageSize - 1 • The physical memory • divided up into units of the same size called page frames • page frames from 0 to PhysicalMemSize / PageSize - 1 • Pages and page frames are the same size • PageSize is typically a value between 512 bytes and 64KB • Transfers between RAM and disk are in units of a page
Paging Mapping virtual onto physical move REG, 0 move REG, 8192 • virtual address 0 = (virtual page 0, offset 0) • virtual page 0 → page frame 2 • physical address 8192 = (page frame 2, offset 0) move REG, 20500 move REG, 12308 • virtual address 20500 = (virtual page 5, offset 20) • virtual page 5 → page frame 3 • physical address 12308 = (page frame 3, offset 20)
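The arithmetic behind these examples, assuming 4KB pages (so virtual address 20500 is page 5, offset 20) and the two page table entries given above:

```python
PAGE_SIZE = 4096                  # 4KB pages, as in the example
page_table = {0: 2, 5: 3}         # virtual page -> page frame

def translate(vaddr):
    vpage, offset = divmod(vaddr, PAGE_SIZE)
    frame = page_table[vpage]     # a missing entry would be a page fault
    return frame * PAGE_SIZE + offset

print(translate(0))       # → 8192  (page 0 -> frame 2)
print(translate(20500))   # → 12308 (page 5, offset 20 -> frame 3)
```

Because PAGE_SIZE is a power of two, the divmod is really just splitting the address bits: the high bits select the page, the low bits are the offset, carried over unchanged.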
Paging Providing virtual memory • Only some virtual pages are mapped onto physical memory • Some virtual pages are kept on disk • Page table – mapping unit • Present/Absent bit • Page fault • a trap into the OS caused by a reference to an address located in a page not in memory • generated by the MMU • results in a swap between physical memory and disk • the referenced page is loaded from disk into memory • the trapped instruction is re-executed
Paging Page tables (2) • Role • to map virtual pages onto physical page frames • Each process has its own page table • Page tables can be extremely large • 32-bit addresses, 4KB pages => 2^20 = 1,048,576 entries • Mapping must be fast • it is done on every memory reference
Paging Page tables (3) • An array of fast registers, with one entry for each virtual page • the page table is copied from memory into the registers • no more memory references needed • context switch is expensive • A single register • page table kept in memory • the register points to the start of the page table • context switch is fast • more references to memory for reading page table entries
Paging Multilevel Page Tables • 32-bit virtual addresses • 10 bits – PT1 • 10 bits – PT2 • 12 bits – offset • Page size = 4KB • No. of pages = 2^20 • Top-level page table – 1024 entries • an entry covers 4MB of virtual address space • Second-level page tables – 1024 entries each
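Splitting a 32-bit virtual address into the two 10-bit indices and the 12-bit offset is pure bit manipulation; a sketch (the example address is invented):

```python
def split_vaddr(vaddr):
    """Split a 32-bit virtual address into the 10-bit PT1 index,
    the 10-bit PT2 index and the 12-bit offset."""
    offset = vaddr & 0xFFF           # low 12 bits: byte within the page
    pt2 = (vaddr >> 12) & 0x3FF      # next 10 bits: second-level index
    pt1 = vaddr >> 22                # top 10 bits: top-level index
    return pt1, pt2, offset

print(split_vaddr(0x00403004))  # → (1, 3, 4)
```

The payoff is that second-level tables for unused 4MB regions need never exist, so a sparse address space costs far less than the full 2^20-entry flat table.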
Paging Structure of a page table entry • The size is computer dependent • 32 bits is a commonly used size • Page frame number • Present/Absent bit • Protection bits • read, write, read only etc. • Modified bit (dirty bit) • Referenced bit
Paging Translation Lookaside Buffers (TLB) (1) • Observations • Keeping the page tables in memory drastically reduces performance • Large number of references to a small number of pages • TLB or associative memory • a small, fast hardware device for mapping virtual addresses to physical addresses • a sort of table with a small number of entries (usually less than 64) → maps only a small number of virtual pages • a TLB entry contains information about one page • the search for a virtual page in the TLB is done simultaneously in all the entries of the TLB • can also be implemented in software
Paging Inverted page tables (1) • A solution for handling large address spaces • 64-bit computer with 4KB pages → 2^52 page table entries; with 8 bytes/entry → over 30 million GB of page table • One table per system with an entry for each page frame • An entry contains the pair (process, virtual page) mapped onto the corresponding page frame • The virtual-to-physical translation becomes much harder and slower • search the entire table at every memory reference • In practice: use of TLB and hash tables
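The hash-table shortcut can be sketched as a map keyed by (process, virtual page), one value per page frame, so a reference avoids scanning the whole table (the table contents here are invented):

```python
# one entry per occupied page frame, keyed by (pid, virtual page)
inverted = {(17, 0): 0, (42, 5): 1, (17, 5): 2}

def lookup(pid, vpage):
    """Find the page frame holding (pid, vpage), or fault."""
    frame = inverted.get((pid, vpage))
    if frame is None:
        raise LookupError("page fault")   # pair not resident in any frame
    return frame

print(lookup(17, 5))  # → 2
```

Real implementations hash into a fixed-size array with chaining rather than a dynamic dictionary, but the lookup structure is the same.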
Page replacement algorithms The context • At a page fault with full physical memory • Space has to be made • A currently loaded virtual page has to be evicted from memory • Choosing the page to be evicted • not a heavily used page → reduces the number of page faults • Page replacement • the old page has to be written to the disk if it was modified • the new virtual page overwrites the old virtual page in the page frame
Page replacement algorithms The optimal algorithm • Choose the page that will be referenced furthest in the future among all the pages currently in memory • Very simple and efficient (optimal) • Impossible to implement in practice • there is no way to know when each page will be referenced next • It can be simulated • at the first run, collect information about page references • at the second run, use the results of the first run (but with the same input) • It is used to evaluate the performance of other, practically used, algorithms
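Given the full future reference string, which is exactly what a real OS lacks, the optimal choice is a short function (the frames and reference string are invented examples):

```python
def optimal_victim(frames, future_refs):
    """Evict the resident page whose next reference lies furthest
    in the future; a page never referenced again is a perfect victim."""
    def next_use(page):
        try:
            return future_refs.index(page)   # distance to next reference
        except ValueError:
            return float("inf")              # never used again
    return max(frames, key=next_use)

# pages 0, 1, 2 are resident; the upcoming references are 1 2 0 1
print(optimal_victim([0, 1, 2], [1, 2, 0, 1]))  # → 0 (used furthest away)
```

This is exactly the simulation use mentioned above: run it over a recorded reference string to get a lower bound on page faults for comparing practical algorithms.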
Page replacement algorithms Not Recently Used (NRU) (1) • Each page has two status bits associated • Referenced bit (R) • Modified bit (M) • The two bits • updated by the hardware at each memory reference • once set to 1 they remain so until they are reset by the OS • can also be simulated in software when the mechanism is not supported by hardware
Page replacement algorithms Not Recently Used (NRU) (2) • At process start the bits are set to 0 • Periodically (on each clock interrupt) the R bit is cleared • For page replacement, pages are classified • Class 0: not referenced, not modified • Class 1: not referenced, modified • Class 2: referenced, not modified • Class 3: referenced, modified • The algorithm removes a page at random from the lowest numbered nonempty class • It is easy to understand, moderately efficient to implement, and gives adequate performance
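The classification is just class = 2·R + M; a sketch of the selection step (the page numbers and bit values are invented):

```python
import random

def nru_victim(pages):
    """pages maps page -> (R, M); class = 2*R + M.
    Evict a random page from the lowest numbered nonempty class."""
    lowest = min(2 * r + m for r, m in pages.values())
    candidates = [p for p, (r, m) in pages.items() if 2 * r + m == lowest]
    return random.choice(candidates)

# page 7 is the only class-0 page (not referenced, not modified)
print(nru_victim({7: (0, 0), 8: (0, 1), 9: (1, 0), 10: (1, 1)}))  # → 7
```

Class 1 (not referenced, yet modified) exists because the periodic clock interrupt clears R but leaves M intact, so an old dirty page can outlive its R bit.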
Page replacement algorithms The second chance (1) • A modification of FIFO to avoid throwing out a heavily used page • Inspect the R bit of the oldest page • 0 → the page is old and unused → replace it • 1 → the page is old but used → its R bit is set to 0 and the page is moved to the end of the queue, as if it had just arrived • Look for an old page that has not been referenced in the previous clock interval • If all the pages have been referenced → plain FIFO
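A sketch of second chance over a FIFO queue of pages (the queue contents and R bits are invented examples):

```python
from collections import deque

def second_chance(queue, r_bits):
    """queue holds pages in FIFO order, oldest first.
    A referenced old page has its R bit cleared and is requeued;
    the first old page found with R == 0 is evicted."""
    while True:
        page = queue.popleft()
        if r_bits[page]:
            r_bits[page] = 0      # second chance: treat as newly arrived
            queue.append(page)
        else:
            return page

queue = deque([3, 5, 7])          # page 3 is the oldest
r_bits = {3: 1, 5: 0, 7: 1}
victim = second_chance(queue, r_bits)
print(victim)  # → 5 (page 3 was referenced, so it got a second chance)
```

If every page has R = 1, the loop clears all the bits in one pass and evicts the original oldest page, degenerating to plain FIFO as the slide notes.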
Page replacement algorithms Least Recently Used (LRU) (1) • Based on the observation that • pages that have been heavily used in the last few instructions will probably be heavily used again in the next few • Throw out the page that has been unused for the longest time • The algorithm keeps a linked list • the referenced page is moved to the front of the list • the page at the end of the list is replaced • the list must be updated at each memory reference → costly
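The linked-list scheme can be sketched with an ordered dictionary standing in for the list: the victim sits at the front, the most recently used page at the back (frame count and reference string are invented):

```python
from collections import OrderedDict

class LRU:
    """Linked-list LRU with an ordered dict playing the list's role."""
    def __init__(self, nframes):
        self.nframes = nframes
        self.frames = OrderedDict()

    def reference(self, page):
        """Return the evicted page on a fault with full memory, else None."""
        if page in self.frames:
            self.frames.move_to_end(page)     # hit: becomes most recent
            return None
        victim = None
        if len(self.frames) == self.nframes:  # fault, memory full
            victim, _ = self.frames.popitem(last=False)
        self.frames[page] = True
        return victim

mem = LRU(3)
evicted = [mem.reference(p) for p in [0, 1, 2, 0, 3]]
print(evicted)  # → [None, None, None, None, 1]
```

Page 1 is evicted when 3 arrives because the hit on page 0 moved it to the back, leaving 1 the least recently used. The per-reference list update is cheap here, but doing it on every hardware memory reference is what makes exact LRU impractical without hardware help.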
Page replacement algorithms Least Recently Used (LRU) (3) • Implementing LRU with a hardware counter • keep a 64-bit counter which is incremented after each instruction • each page table entry has a field large enough to store the counter • the counter is stored in the page table entry of the page just referenced • the page with the lowest value in its counter field is replaced • Implementing LRU with a hardware bit matrix • N page frames → N × N bit matrix • when virtual page k is referenced • the bits of row k are set to 1 • the bits of column k are set to 0 • the page whose row has the lowest value is removed
Page replacement algorithms Least Recently Used (LRU) (4) Reference string: 0 1 2 3 2 1 0 3 2 3
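The bit-matrix scheme from the previous slide can be run on this reference string, assuming 4 page frames; each row is kept as an N-bit integer, with column j carrying weight 2^(N-1-j):

```python
N = 4                                   # page frames → N x N bit matrix
rows = [0] * N                          # row k stored as an N-bit integer

def reference(k):
    rows[k] = (1 << N) - 1              # set all bits of row k to 1
    mask = ~(1 << (N - 1 - k))          # column k (column 0 is leftmost)
    for i in range(N):
        rows[i] &= mask                 # clear column k in every row

for page in [0, 1, 2, 3, 2, 1, 0, 3, 2, 3]:
    reference(page)

# the page whose row has the lowest value is the least recently used
print(min(range(N), key=lambda k: rows[k]))  # → 1
```

After the last reference the recency order is 3, 2, 0, 1 (most to least recent), and indeed row 1 has collapsed to all zeros, marking page 1 as the replacement victim.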
Page replacement algorithms Not Frequently Used and Aging (1) • A software implementation of LRU • NFU consists of • a software counter associated with each page • at each clock tick the R bit is added to the counter of every page in memory; after that the R bits are reset to 0 • the page with the lowest counter is chosen • Problem: NFU never forgets anything; it does not evict pages which were heavily used in the past but are no longer used (their counter remains high) • Aging – a modification of NFU • shift the counter right one position • the R bit is added as the leftmost bit • Differences from LRU • it does not know which page was referenced first between two ticks; for example pages 3 and 5 at step (e) • the finite number of counter bits cannot differentiate between pages whose counters have reached 0
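A sketch of the aging tick, assuming 8-bit counters (the pages and their per-tick R bits are invented examples):

```python
COUNTER_BITS = 8

def tick(counters, r_bits):
    """One clock tick of aging: shift each counter right one position
    and insert the page's R bit as the new leftmost bit."""
    for page in counters:
        counters[page] = (counters[page] >> 1) | (r_bits[page] << (COUNTER_BITS - 1))

counters = {3: 0, 5: 0}
for r_bits in [{3: 1, 5: 0}, {3: 1, 5: 1}, {3: 0, 5: 1}]:  # R bits per tick
    tick(counters, r_bits)

# counters are now {3: 0b01100000 (96), 5: 0b11000000 (192)}
print(min(counters, key=counters.get))  # → 3 (its references are older)
```

The shift is what makes aging forget: each tick halves the weight of old references, so a page that was heavy in the past but idle now drops toward 0 within COUNTER_BITS ticks, fixing the NFU problem noted above.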
Modeling Page Repl. Algorithms Belady's Anomaly • FIFO algorithm with 3 page frames • FIFO algorithm with 4 page frames
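Belady's anomaly is the fact that adding page frames can increase the number of FIFO page faults. The slide's figure data is not reproduced here, so this sketch uses the classic reference string 1 2 3 4 1 2 5 1 2 3 4 5:

```python
from collections import deque

def fifo_faults(refs, nframes):
    """Count page faults under FIFO replacement with nframes frames."""
    frames, faults = deque(), 0
    for page in refs:
        if page not in frames:
            faults += 1
            if len(frames) == nframes:
                frames.popleft()   # evict the oldest resident page
            frames.append(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(fifo_faults(refs, 3))  # → 9
print(fifo_faults(refs, 4))  # → 10 (more frames, more faults)
```

Stack algorithms such as LRU cannot exhibit the anomaly, because the pages resident with k frames are always a subset of those resident with k+1 frames; FIFO lacks this inclusion property.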
Design Issues for Paging Local versus Global Allocation Policy (1) • The context • Page fault • Page replacement • Question: which pages are taken into account? • Local: pages of the current process • Global: pages of all processes • Answer: depends on the strategy used to allocate memory between the competing runnable processes • Local: every process has a fixed fraction of memory allocated • Global: page frames are dynamically allocated among runnable processes
Design Issues for Paging Local versus Global Allocation Policy (2) • Global algorithms work better, especially when the working set size can vary • greater than the allocated size → thrashing • smaller than the allocated size → waste of memory • Strategies for the global policy • Monitor the working set size of all processes • based on the age of pages • Page frame allocation algorithm • allocate pages proportionally to each process's size • give each process a minimum number of frames • the allocation is updated dynamically • for example using the PFF (Page Fault Frequency) algorithm • Page replacement algorithms • some can work with both policies (FIFO, LRU) • some work only with the local policy (WSClock)