
Memory Management



  1. Memory Management CS 550, Spring 2014, Kenneth Chiu

  2. Background • Program must be brought (from disk) into memory and placed within a process for it to be run • Main memory and registers are the only storage the CPU can access directly • Register access in one CPU clock (or less) • Main memory can take many cycles • Cache sits between main memory and CPU registers • Protection of memory is required to ensure correct operation.

  3. Many Ways to Do Memory Management! • We will cover a variety of them. • Many of them are not mutually exclusive, and can be hybridized in various ways. • Also, many of them are not actually used as is, but are studied because they help illustrate basic principles. • What you need to learn is to get some idea of the possibilities so that when you encounter a specific one, you will be able to grasp it quickly.

  4. The Problem • There is a finite amount of RAM on any machine. • There is a bus with finite speed, non-zero latency. • There is a CPU with a given clock speed. • There is a disk with a given speed. • How do we manage memory so that we make most effective use of the machine? • Mainly, that means preventing CPU idle time, maintaining good response time, etc. • While maintaining isolation and security.

  5. Base and Limit Registers • Put every process at a different location in memory. • A pair of base and limit registers define the logical address space.

  6. How do you handle memory references in the machine code in such a scheme?

  7. Logical vs. Physical Address Space • The concept of a logical address space that is bound to a separate physical address space is central to proper memory management. • Logical address: generated by the CPU; also referred to as virtual address. • Physical address: address seen by the memory unit. • Logical and physical addresses can be the same in primitive schemes; logical (virtual) and physical addresses differ in most modern schemes.

  8. Memory-Management Unit (MMU) • Hardware device that maps virtual to physical address. • In a simple MMU scheme, the value in the relocation register is added to every address generated by a user process at the time it is sent to memory. • The user program deals with logical addresses; it never sees the real physical addresses.

  9. Simple MMU.

  10. Swapping • A process can be swapped temporarily out of memory to a backing store, and then brought back into memory for continued execution. • Backing store: fast disk large enough to accommodate copies of all memory images for all users; must provide direct access to these memory images. • Major part of swap time is transfer time; total transfer time is directly proportional to the amount of memory swapped. • Modified versions of swapping are found on many systems (e.g., UNIX, Linux, and Windows). • System maintains a ready queue of ready-to-run processes which have memory images on disk. • The term swap is not that precise; it is often used slightly differently, depending on the OS.

  11. Schematic View of Swapping

  12. Contiguous Memory Allocation

  13. A common early way of allocating memory is to simply provide each process with a contiguous logical address space, and map that to a contiguous range of physical addresses, called a partition. • Relocation registers used to protect user processes from each other, and from changing operating system code and data. • Relocation register contains value of smallest physical address. • Limit register contains range of logical addresses. Each logical address must be less than the limit register. • MMU maps logical address dynamically. • OS also needs a partition: • Usually held in low memory with interrupt vector. • User processes then held in high memory.

  14. Hardware Support

  15. Memory Allocation • Can statically divide memory into several fixed-size partitions. • Each process goes into one partition, and each partition can only hold one process. • Disadvantage? • Can also use variable sized partitions. • OS must keep a table of occupied parts of memory. • An unoccupied chunk is called a hole. • Memory consists of occupied partitions and holes of various sizes.

  16. Holes of various sizes are scattered throughout memory. • When a process arrives, it is placed into an input queue. • OS makes periodic decisions about how to allocate a partition to a process from the input queue. • A hole that is too large will be split into a partition and a new hole. • When a process exits, a neighboring hole will be merged with the newly created hole. (Figure: a sequence of memory snapshots showing the OS partition plus processes 2, 5, 8, 9, and 10 entering and exiting, creating and merging holes.)

  17. Dynamic Storage-Allocation Problem • How to satisfy a request of size n from a list of free holes? • First-fit: Allocate the first hole that is big enough. • Fast. • Best-fit: Allocate the smallest hole that is big enough; must search entire list, unless ordered by size. • Produces the smallest leftover hole. • Worst-fit: Allocate the largest hole; must also search entire list. • Produces the largest leftover hole. • Which do you think works best? • First-fit and best-fit are better than worst-fit in terms of speed and storage utilization.

  18. Input queue interaction is also a factor. If there is a hole of size N, should we: • Use it for the smallest job in the queue that fits? • Use it for the largest job in the queue that fits? • Use it only for the next job in the queue; if that job doesn’t fit, don’t allocate (FCFS)? • Which will guarantee the highest number of jobs per second?

  19. Fragmentation • External Fragmentation: Wasted space due to discontiguous holes. • General rule of thumb, for a fully utilized memory, about 50% may be wasted. • Should we allocate from top or bottom of hole? • Internal Fragmentation: Allocated memory may be larger than actual requested memory, due to allocation granularity. • For example, if there is a hole of 2 bytes, no point in keeping track of it. • Usually very small for variable sized partitions. • External fragmentation can be addressed by compaction • Shuffle memory contents to place all free memory together in one large block. • Compaction is possible only if relocation is dynamic and is done at execution time. • I/O problem: What if I/O is going on? • Latch job in memory while it is involved in I/O, or • Do I/O only into OS buffers

  20. Paging

  21. Paging • Many of the issues seen so far are due to contiguity requirements of the RAM. Let’s relax that. • Divide physical memory into fixed-sized blocks called frames, size is a power of 2. • Divide logical memory into blocks of same size called pages • Keep track of all free frames. • To run a program of size n pages, need to find n free frames and load program. • Set up a page table to translate logical to physical addresses.

  22. Addresses • With paging, an address generated by the CPU is divided into: • Page number (p): Used as an index into a page table which contains the base address of each page in physical memory. • Page offset (d): Added to the base address to define the physical memory address that is sent to the memory unit. • Assume that there are m bits in the logical address, and that the page size is 2^n. The address layout is then: | page number p (m - n bits) | page offset d (n bits) |

  23. Paging Hardware

  24. Mapping Between Logical and Physical Memory

  25. Example • How big is a page? • Any external fragmentation? • Any internal fragmentation? • How big should a page be? (Figure: a 16-byte logical address space mapped onto a 32-byte physical memory.)

  26. Free Frames • When a new process arrives, it will request N pages. • N frames need to be found. • How would you do this?

  27. (Figure: the free-frame list before allocation and after allocation.)

  28. Implementation of Page Table • Page tables are typically done via a combination of OS support and hardware support. Many variations. • Register scheme: • Page table is kept in registers. Registers loaded during context switch. • PDP-11: Address space is 16 bits, page size is 8 KB → a 13-bit offset. Page table needs ?? entries. • Will this work if the address space is 32 bits? How big does the page table need to be?

  29. Another scheme, page table is kept in main memory: • Page-table base register (PTBR) points to the page table. • Page-table length register (PTLR) indicates size of the page table. • These registers are reloaded on context switch. • How many memory accesses does a load instruction take? • In this scheme every data/instruction access requires two memory accesses. One for the page table and one for the data/instruction. • The two memory access problem can be solved by the use of a special hardware cache called the translation look-aside buffer (TLBs), which is a form of associative memory.

  30. Associative Memory • Associative memory is a fancy name for a lookup table. • Use a key, produce a value. • In software, we would implement it using a hash table, binary search tree, etc. • In hardware, the idea is that it’s a parallel search on all keys. Very expensive. • Address translation of (p, d) is then: • If p is in an associative register, produce the frame #. • Otherwise get frame # from page table in memory. (Figure: associative memory table mapping page # to frame #.)

  31. Paging Hardware With TLB

  32. Effective Access Time • How long does a lookup take with a TLB? • A crucial performance measure is the hit ratio, h, which is the percentage of time that the desired page is in the TLB. • Assume TLB lookup time is l. • Assume memory cycle time is m. • The effective access time (EAT) is then: EAT = (m + l) h + (2m + l)(1 – h) = 2m + l – h*m • Example: Assume 80% hit ratio, 20 ns to search TLB, 100 ns to access memory. • If in the TLB, how long does it take to access memory? • If not in the TLB, how long does it take? • EAT = .8*120 + .2*220 = 140 ns

  33. Context Switches • What is the key in the TLB? • What happens when we do a context switch? • Normally must flush the TLB. • What does this do to performance? • Some TLBs store address-space identifiers (ASIDs) in each TLB entry which uniquely identifies each process to provide address-space protection for that process. • This allows the TLB to not be flushed.

  34. Memory Protection • Does memory need to be protected? Why? • Should we associate the protection with the frame (physical), or the page (logical)? • Memory protection is implemented by associating protection bits with each page. • Bits are added to the page table entry. • These bits indicate things like read/write/execute. • One important one is the valid-invalid bit: • “Valid” indicates that the associated page is in the process’ logical address space, and is thus a legal page. • “Invalid” indicates that the page is not in the process’ logical address space.

  35. Consequence of internal fragmentation? Process only uses up to address 10,468, but its last page ends at 12,287.

  36. Sharing Pages • Do we ever want to share memory between processes? Why? • Multiple processes executing the same program: • One copy of read-only code shared among processes (i.e., text editors, compilers, window systems). • Shared libraries: • Each process keeps a separate copy of the non-shared code and data. • The pages for the private code and data can appear anywhere in the logical address space. • IPC: • Implementing messages, etc. • How can this be done with paging?

  37. Structure of the Page Table

  38. Structure of the Page Table • How big is the page table? • 4K pages, 32-bit system • 2^12 == 4K, 2^20 == 1 M entries • Each entry is 4 bytes, so 4 MB • Why is this a problem? • Is the page table managed by hardware or software? • Could be software. On TLB miss, an exception is generated, and software is responsible for traversing data structures and loading the TLB. • The page table is just a lookup. What techniques are available for lookup? • Binary search tree • Linked list • Arrays • Radix-based • Hash-based

  39. Hierarchical Page Tables • Break up the logical address space into multiple page tables. • A simple technique is a two-level page table. • First-level page table is contiguous, but small. • Each first-level PTE points to a separate 2nd-level table. • Each 2nd-level table must still be contiguous, but there are many of them, so each one is (relatively) small.

  40. Two-Level Page-Table Scheme

  41. Two-Level Paging Example • A logical address (on a 32-bit machine with 4K page size) is divided into: • A page number consisting of 20 bits. • A page offset consisting of 12 bits. • Since the page table is paged, the page number is further divided into: • A 10-bit page number • A 10-bit page offset • Thus, a logical address is as follows, where p1 is an index into the outer page table, and p2 is the displacement within the page of the outer page table: | p1 (10 bits) | p2 (10 bits) | d (12 bits) |

  42. Address-Translation Scheme

  43. VAX Example • 32-bit machine, 512-byte page size. • Logical address space divided into 4 sections, each 2^30 bytes. • High-order 2 bits used to designate the section. • Next 21 bits for the page number within the section. • Last 9 bits for the offset within the page. How big is each page table? Address layout: | section (2 bits) | page (21 bits) | offset (9 bits) |

  44. 64-Bit Systems • How big is the outer page table? • Three-level?

  45. Hashed Page Tables • Use a hash table for lookup. • The virtual page number is hashed and used as an index into a hash table. • That entry in the table contains a chain of elements hashing to the same location. • I.e., virtual page numbers that have the same hash code. • Virtual page numbers are compared in this chain searching for a match. • If a match is found, the corresponding physical frame is extracted.

  46. (Figure: hashed page table mapping a page number to a frame number via a chain of entries.)

  47. Inverted Page Table • Normal page table has one entry for each page. • Optimizing it amounts to figuring out how to not use resources for large chunks of the page table that are empty. • If the ratio of logical address space to physical address space is large, this suggests using one entry per physical frame, rather than one entry per page. • Entry consists of the virtual address of the page stored in that real frame, with information about the process that owns that page. • Decreases memory needed to store each page table, but increases time needed to search the table when a page reference occurs. • Use hash table to limit the search to one — or at most a few — page-table entries.

  48. Why do we need the pid in the table?

  49. (Figure: inverted page table translation. The logical address (pid, p, d) is run through a hash function into a hash anchor table, which yields an entry i in the inverted page table; the physical address is then (i, d).)
