
MAMAS – Computer Architecture Address spaces and Memory management Dr. Avi Mendelson


Presentation Transcript


  1. MAMAS – Computer Architecture: Address Spaces and Memory Management. Dr. Avi Mendelson. Some of the slides were taken from (1) Jim Smith and (2) Patterson presentations.

  2. Memory System Problems
  • Different programs have different memory requirements – how to manage program placement?
  • Different machines have different amounts of memory – how to run the same program on many different machines?
  • At any given time each machine runs a different set of programs – how to fit the program mix into memory? Reclaiming unused memory? Moving code around?
  • The amount of memory consumed by each program is dynamic (it changes over time) – how to effect changes in memory allocation: add or subtract space?
  • Program bugs can cause a program to generate reads and writes outside its address space – how to protect one program from another?

  3. Address Spaces

  4. Address Space of a Single Process (Linux)
  • Each process has a fixed-size address space that can be divided into:
  • Symbol tables and other management areas
  • Code
  • Data
  • "Static" data allocated by the compiler
  • Dynamic data allocated by malloc
  • Stack – managed automatically
  [Figure: layout from high to low addresses – kernel virtual memory (invisible to user code), stack (growing down from %esp), runtime heap via malloc (growing up to the "brk" pointer), uninitialized data (.bss), initialized data (.data), program text (.text), forbidden region at address 0]
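To make the layout concrete, here is a minimal C sketch (variable names are arbitrary) that prints the address of one object from each region; on a typical Linux machine the printed addresses generally follow the order of the figure above, text lowest and stack highest:

```c
#include <stdio.h>
#include <stdlib.h>

int initialized_global = 1;   /* lives in .data */
int uninitialized_global;     /* lives in .bss  */

int main(void)
{
    int   local_on_stack = 0;              /* stack             */
    void *heap_block     = malloc(16);     /* heap (via malloc) */

    printf("text  (code)      : %p\n", (void *)main);
    printf(".data (init data) : %p\n", (void *)&initialized_global);
    printf(".bss  (uninit)    : %p\n", (void *)&uninitialized_global);
    printf("heap  (malloc)    : %p\n", heap_block);
    printf("stack (local var) : %p\n", (void *)&local_on_stack);

    free(heap_block);
    return 0;
}
```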

  5. Virtual Memory
  • Main memory can act as a cache for the secondary storage (disk)
  • Divide memory (virtual and physical) into fixed-size blocks (pages and frames)
  • Pages in virtual space, frames in physical space
  • Page size = frame size
  • Page size is a power of 2: page size = 2^k
  • All pages in the virtual address space are contiguous
  • Pages can be mapped into physical frames in any order
  • Some of the pages are in main memory (DRAM), some of the pages are on disk
  [Figure: address translation maps virtual addresses either to physical addresses or to disk addresses]

  6. Managing Multiple Virtual Address Spaces
  • Each process "sees" a private address space of 4GB:
  • 2GB of private address space the process can use
  • 2GB of system address space, shared by all processes, which can be accessed only when the system is in "supervisor" (kernel) mode
  • All processes share the same (small) physical memory
  • A small part of the address space is kept in main memory (DRAM); most of the address space is either not mapped or kept on disk
  • All programs are written using the virtual memory address space
  • The hardware does on-the-fly translation between virtual and physical address spaces
  [Figure: several 2GB+2GB virtual address spaces, each with its own U Area, all mapped onto the same physical memory]

  7. How a Process Is Created (exec)
  • The executable file contains the code, the initial values for the data area, and initial tables (symbol tables, etc.)
  • A page table is created to map the virtual addresses to physical addresses
  • At start time no page is in main memory, so:
  • Code pages point to the location on disk where the code section lies within the file
  • The initial data pages (.data) are mapped to the .data section within the executable file
  • The system may allocate pages for the initial stack space, uninitialized data (.bss) and the heap area (this is a performance optimization; the system can avoid it)
  • As execution continues, the system allocates pages in main memory, copies their content from disk, and changes the pointers to point to main memory
  • When a page is evicted from main memory, the system keeps it in a special place on the disk called the swap area

  8. Virtual Memory Can Help to Share Address Spaces
  • If several processes map different virtual pages to the same physical page, the data/code of that page is shared (we will discuss how the OS takes advantage of this feature later on)
  [Figure: virtual pages VP 1 and VP 2 of process 1 and of process 2 are translated to physical pages PP 2, PP 7 and PP 10; one physical page, e.g. read-only library code, is mapped by both processes]

  9. VM Can Help Process Protection
  • A page table entry contains access-rights information
  • Hardware enforces this protection (a trap into the OS occurs if a violation is detected)
  [Figure: per-process page tables with Read?/Write? bits and a physical address per virtual page, e.g. process i: VP 0 read-only in PP 9, VP 1 read/write in PP 4, VP 2 not mapped; process j: VP 0 read/write in PP 6, VP 1 read-only in PP 9 (shared with process i), VP 2 not mapped]

  10. Virtual Memory Implementation

  11. Page Tables
  [Figure: a page table indexed by virtual page number; each entry holds a valid bit and either a physical page address (valid = 1, page in physical memory) or a disk address (valid = 0, page on disk)]

  12. Virtual to Physical Address Translation
  [Figure: a 32-bit virtual address is split into a virtual page number (bits 31..12) and a page offset (bits 11..0); the page table maps the virtual page number to a physical page number (bits 29..12 of the physical address) while the page offset passes through unchanged]
  Page size: 2^12 bytes = 4KB

  13. The Page Table
  [Figure: the virtual page number (bits 31..12) is added to the page-table base register to locate a page-table entry holding a valid bit (V), dirty bit (D), access-control field (AC) and frame number; the frame number forms bits 29..12 of the physical address and the page offset (bits 11..0) is copied unchanged]

  14. Address Mapping Algorithm
  • If V = 1, the page is in main memory at the frame address stored in the table → access the data
  • Else (page fault) the page must be fetched from disk → causes a trap, usually accompanied by a context switch: the current process is suspended while the page is fetched from disk
  • Access control (R = read-only, R/W = read/write, X = execute only): if the kind of access is not compatible with the specified access rights → protection violation fault → trap to a hardware or software fault handler
  • A missing item is fetched from secondary memory only on the occurrence of a fault (demand-load policy)
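A minimal C sketch of this mapping algorithm, assuming a simplified PTE layout with a valid bit, permission bits and a frame number (the field names and encoding are illustrative, not a specific ISA):

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical PTE layout, for illustration only. */
typedef struct {
    unsigned valid  : 1;   /* V = 1: page resident in a main-memory frame        */
    unsigned dirty  : 1;
    unsigned access : 3;   /* permission bits: 1 = read, 2 = write, 4 = execute  */
    uint32_t frame;        /* physical frame number (or disk address when V = 0) */
} pte_t;

#define PAGE_SHIFT 12      /* 4KB pages: offset = low 12 bits */
#define PAGE_MASK  ((1u << PAGE_SHIFT) - 1)

/* Translate one virtual address; returns false on a fault, after which the OS
   would fetch the page from disk (demand loading) or kill the offender. */
bool translate(const pte_t *page_table, uint32_t vaddr,
               unsigned requested_access, uint32_t *paddr)
{
    pte_t pte = page_table[vaddr >> PAGE_SHIFT];    /* index by virtual page number */

    if (!pte.valid)
        return false;                               /* page fault: trap to the OS   */
    if ((requested_access & pte.access) != requested_access)
        return false;                               /* protection violation fault   */

    *paddr = (pte.frame << PAGE_SHIFT) | (vaddr & PAGE_MASK);
    return true;
}
```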

  15. Managing Virtual Addresses
  • Virtual memory management is mainly done by the OS:
  • Deciding which pages should be in memory and which on disk
  • Deciding when to write pages to disk
  • Deciding what access rights are assigned to pages
  • Et cetera
  • The exact algorithms are out of the scope of this course
  • Here we study two hardware mechanisms:
  • A pseudo-LRU mechanism – to decide which pages to keep
  • The TLB – to perform fast lookup

  16. Page Replacement Algorithm (Pseudo-LRU)
  • Not Recently Used (NRU)
  • Associated with each page is a reference flag: ref flag = 1 if the page has been referenced in the recent past
  • If replacement is needed, choose any page frame whose reference bit is 0
  • This is a page that has not been referenced in the recent past
  • Clock implementation of NRU: keep a last-replaced pointer (lrp); when replacement is to take place, advance lrp to the next entry (mod table size) until one with a 0 reference bit is found; this is the target for replacement. As a side effect, all examined PTEs have their reference bits set to zero
  • Possible optimization: search for a page that is both not recently referenced AND not dirty
  [Figure: a circular list of page-table entries with reference bits; the lrp sweeps around, clearing 1-bits until it finds a 0]
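A sketch of the clock variant in C; the frame-table layout and function names are assumptions made only for illustration:

```c
#include <stddef.h>

/* Hypothetical frame-table entry: only the reference bit matters here. */
typedef struct {
    unsigned ref : 1;   /* set by hardware whenever the page is referenced */
} frame_t;

/* Clock implementation of NRU: advance the last-replaced pointer (lrp) until a
   frame with ref == 0 is found; clear each reference bit examined on the way. */
size_t clock_select_victim(frame_t *frames, size_t nframes, size_t *lrp)
{
    for (;;) {
        *lrp = (*lrp + 1) % nframes;   /* advance pointer (mod table size)     */
        if (frames[*lrp].ref == 0)
            return *lrp;               /* not recently used: replace this page */
        frames[*lrp].ref = 0;          /* second chance: clear the bit         */
    }
}
```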

  17. Page Faults
  • Page fault: the data is not in memory → retrieve it from disk
  • The CPU must detect the situation
  • The CPU cannot remedy the situation (it has no knowledge of the disk)
  • The CPU must trap to the operating system so that it can remedy the situation:
  • Pick a page to discard (possibly writing it to disk)
  • Load the page in from disk
  • Update the page table
  • Resume the program so the hardware will retry and succeed!
  • A page fault incurs a huge miss penalty, therefore:
  • Pages should be fairly large (e.g., 4KB)
  • Faults can be handled in software instead of hardware
  • A page fault causes a context switch
  • Write-through is too expensive, so write-back is used

  18. Virtual Memory in Unix (VAX)
  • The VAX was one of the first systems to run UNIX with virtual memory, so many of the UNIX mechanisms are based on the VAX implementation
  • Unix distinguishes between 3 different address spaces, each controlled by a separate page table:
  • The system address space – one per system
  • The user code + data address space – one per process
  • The user stack address space – one per process
  • Only the system page table (one table) resides in main memory; all the other tables and address spaces are pageable (the swappable area)
  [Figure: user space (code, data, stack) and system space; the user page tables live in the pageable system space while the system page table is resident]

  19. VM in VAX: Address Format
  • Virtual address: bits 31..30 select the space, bits 29..9 are the virtual page number, bits 8..0 are the page offset
  • 00 – P0 process space (code and data)
  • 01 – P1 process space (stack)
  • 10 – S0 system space
  • 11 – S1 (not used)
  • Physical address: bits 29..9 are the physical frame number, bits 8..0 are the page offset
  • Page size: 2^9 = 512 bytes

  20. Page Table Entry (PTE)
  • This is the structure of all the entries in each of the page tables
  • Bit 31: valid bit – 1 if the page is mapped to main memory; otherwise the page is in the swap area and the address field indicates where to find it on disk
  • PROT: 4 protection bits
  • M: modified (dirty) bit
  • Z: indicates whether the page was zero-filled (cleaned)
  • OWN: 3 ownership bits
  • Bits 20..0: physical frame number

  21. Address Translation
  • The system address space is divided into a "resident" part that always exists in main memory; the rest of the system's 2GB address space is swappable
  • User space is always swappable
  • The system's page table is part of the resident area, and the SBR register points to its physical base address
  • For each process there are two dedicated registers:
  • P0BR: points to the virtual address (in the system's address space) of the page table of the user's code/data segment
  • P1BR: points to the virtual address (in the system's address space) of the page table of the user's stack segment

  22. System Space - Address Translation
  • From a virtual address in the system's address space we extract the virtual page number (VPN)
  • Since each page-table entry is 4 bytes, the PTE that describes the page is at physical address SBR + VPN*4
  • If V = 1, we can extract the physical frame number and form the physical address of the page
  • If V = 0, a page fault occurs and we need to bring the page from disk
  [Figure: the VPN (bits 29..9) indexes the system page table at SBR + VPN*4; the PFN from the PTE is concatenated with the 9-bit offset to form the physical address]
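A C sketch of this S0 lookup, assuming the layouts from the previous slides (SBR, 512-byte pages, V in bit 31 of the PTE and the PFN in its low bits); read_phys_word is a hypothetical physical-memory accessor:

```c
#include <stdint.h>
#include <stdbool.h>

#define VAX_PAGE_SHIFT 9                  /* 2^9 = 512-byte pages */
#define VAX_PAGE_MASK  ((1u << VAX_PAGE_SHIFT) - 1)

extern uint32_t SBR;                              /* physical base of the system page table */
extern uint32_t read_phys_word(uint32_t paddr);   /* hypothetical physical-memory read      */

/* S0 (system space) translation: the PTE lives at physical address SBR + 4*VPN. */
bool translate_s0(uint32_t vaddr, uint32_t *paddr)
{
    uint32_t vpn = (vaddr >> VAX_PAGE_SHIFT) & 0x1FFFFF;  /* VPN: bits 29..9            */
    uint32_t pte = read_phys_word(SBR + 4 * vpn);

    if (!(pte >> 31))                     /* V = 0: page fault, OS brings page from disk */
        return false;

    uint32_t pfn = pte & 0x1FFFFF;        /* PFN in the low bits of the PTE              */
    *paddr = (pfn << VAX_PAGE_SHIFT) | (vaddr & VAX_PAGE_MASK);
    return true;
}
```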

  23. P0/P1 Address Translation
  • Address translation is done in a few phases:
  • Find the address of the proper PTE in the user-level page table
  • Since that PTE address is within the system's address space, check whether the page that contains the PTE is in main memory
  • If that page exists, check whether the user page itself is in main memory
  • If the page does not exist, a page fault is raised to bring in the PTE, and only then can we check whether the user address is in memory or not
  • We may have up to 2 page faults in the translation of a single access to a user-level address
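Continuing the previous sketch, a hedged C version of the two-phase P0 translation; P0BR and translate_s0 follow the slides, while read_phys_word and the exact bit positions remain illustrative assumptions:

```c
#include <stdint.h>
#include <stdbool.h>

#define VAX_PAGE_SHIFT 9
#define VAX_PAGE_MASK  ((1u << VAX_PAGE_SHIFT) - 1)

extern uint32_t P0BR;                               /* S0 virtual base of the user's P0 page table */
extern uint32_t read_phys_word(uint32_t paddr);     /* hypothetical physical-memory read           */
bool translate_s0(uint32_t vaddr, uint32_t *paddr); /* the system-space translation sketched above */

/* P0 translation: the user page table itself lives in system virtual space, so
   resolving one user address can fault twice - once on the page holding the PTE,
   once on the user page itself. */
bool translate_p0(uint32_t vaddr, uint32_t *paddr)
{
    uint32_t vpn       = (vaddr >> VAX_PAGE_SHIFT) & 0x1FFFFF;  /* bits 29..9          */
    uint32_t pte_vaddr = P0BR + 4 * vpn;            /* S0 virtual address of the user PTE */
    uint32_t pte_paddr;

    if (!translate_s0(pte_vaddr, &pte_paddr))       /* possible page fault #1: page of PTEs */
        return false;

    uint32_t pte = read_phys_word(pte_paddr);
    if (!(pte >> 31))                               /* possible page fault #2: user page    */
        return false;

    *paddr = ((pte & 0x1FFFFF) << VAX_PAGE_SHIFT) | (vaddr & VAX_PAGE_MASK);
    return true;
}
```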

  24. P0/P1 Space Address Translation (cont.)
  [Figure: the user VPN indexes the user page table at virtual address P0BR + VPN*4; that virtual address is itself translated through the system page table (SBR + VPN'*4) to obtain the physical address of the PTE, whose PFN is finally combined with the original offset to form the user's physical address]

  25. Handling a Large Address Space
  • For a large virtual address space we may get a large page table, for example:
  • For a 2GB virtual address space and a page size of 4KB, we get 2^31 / 2^12 = 2^19 entries
  • Assuming each entry is 4 bytes wide, the page table alone occupies 2MB of main memory
  • Two proposed techniques to handle it:
  • Hierarchical page tables
  • "Segmentation"

  26. Large Address Spaces
  • Solution: use two-level page tables
  • The master page table resides in physical memory
  • The secondary page tables reside in the virtual address space
  • Virtual address format: 10-bit P1 index | 10-bit P2 index | 12-bit page offset
  • Each table holds 1K 4-byte PTEs and fits in a 4KB page
  • At the lower levels of the tree we keep only the segments of the tables we actually use
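A possible C sketch of the two-level lookup with the 10/10/12 split; master_table and frame_to_table are hypothetical names used only to show the indexing:

```c
#include <stdint.h>
#include <stdbool.h>

/* Entry format and helper names are illustrative assumptions. */
typedef struct {
    unsigned valid : 1;
    uint32_t frame;                /* frame of the secondary table, or of the data page */
} pte_t;

extern pte_t  master_table[1024];             /* 1K PTEs, resident in physical memory   */
extern pte_t *frame_to_table(uint32_t frame); /* hypothetical: locate a secondary table */

/* Two-level lookup for a virtual address split 10 / 10 / 12
   (P1 index, P2 index, page offset), as on the slide. */
bool lookup_two_level(uint32_t vaddr, uint32_t *paddr)
{
    uint32_t p1     = (vaddr >> 22) & 0x3FF;  /* top 10 bits: master-table index */
    uint32_t p2     = (vaddr >> 12) & 0x3FF;  /* next 10 bits: secondary index   */
    uint32_t offset =  vaddr        & 0xFFF;  /* low 12 bits: page offset        */

    if (!master_table[p1].valid)
        return false;          /* secondary table not allocated or not resident  */

    pte_t *secondary = frame_to_table(master_table[p1].frame);
    if (!secondary[p2].valid)
        return false;          /* data page not resident: page fault             */

    *paddr = (secondary[p2].frame << 12) | offset;
    return true;
}
```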

  27. Virtual Memory – Memory Management
  • We do not need to map the addresses between the TOS (top of stack) and the break point
  • The system needs to know how to map new regions when needed
  [Figure: address translation maps virtual addresses either to physical addresses or to disk addresses; the hole between the heap's break point and the top of the stack stays unmapped]

  28. The VAX Solution: Segmentation
  • Map only the sections (P0, P1) the program actually uses
  • The mapped regions need to be extended at run time:
  • Stack manipulation (P1 grows with the stack)
  • The sbrk call extends the data region (P0)
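On Unix-like systems the break pointer can still be moved explicitly; a small, hedged example using glibc's sbrk (the 4KB increment is chosen arbitrarily, and malloc normally does this behind the scenes):

```c
#define _DEFAULT_SOURCE           /* expose sbrk() in glibc */
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    void *old_brk = sbrk(0);                /* current break: top of the data region */

    if (sbrk(4096) == (void *) -1) {        /* ask the OS to map one more 4KB page   */
        perror("sbrk");
        return 1;
    }
    printf("break moved from %p to %p\n", old_brk, sbrk(0));
    return 0;
}
```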

  29. How a Process Is Forked
  • When a process is forked, the new process and the old process start at the same point with the same state. The only difference between the two is that fork returns the child's PID to the original process and 0 to the "new born" process
  • When a process is forked, we try to avoid copying physical pages:
  • At the initial point, only the translation tables are copied
  • All pages in memory that belong to the original process are marked as "read-only"
  • Only when one of the processes tries to change a page do we copy it, giving each process a private writable copy (this mechanism is called COW – Copy On Write)
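A short C example of the fork semantics described above; the copy-on-write duplication happens transparently when the child writes to its copy of the variable:

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    int   shared_value = 42;        /* logically copied, physically COW-shared  */
    pid_t pid = fork();             /* returns 0 in the child, child PID in the parent */

    if (pid == 0) {                 /* child */
        shared_value = 7;           /* first write triggers the private copy   */
        printf("child : value = %d\n", shared_value);
    } else {                        /* parent */
        wait(NULL);
        printf("parent: value = %d (unchanged)\n", shared_value);
    }
    return 0;
}
```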

  30. TLB – Hardware Support for Address Translation
  • The TLB is a cache that keeps the last translations the system made, using the virtual address as an index
  • If the translation is found in the TLB, no further translation is needed
  • If it misses, we need to continue the translation process as described before

  31. Making Address Translation Fast
  The TLB (Translation Lookaside Buffer) is a cache for recent address translations.
  [Figure: the virtual page number is first looked up in the TLB (valid bit, tag, physical page); on a miss the full page table is consulted, whose entries point either to physical memory or to disk]

  32. P0 Space Address Translation Using the TLB
  [Flowchart, reconstructed as steps:]
  • Access the process TLB with the VPN of the P0 virtual address
  • On a process-TLB hit: get the PTE (PFN) of the requested page from the process TLB, calculate the physical address, and access memory
  • On a process-TLB miss: calculate the virtual address of the PTE in S0 (P0BR + 4*VPN) and access the system TLB
  • On a system-TLB hit: get the PTE from the system TLB, read the requested page's PTE from the process page table, calculate the physical address, and access memory
  • On a system-TLB miss: access the system page table at SBR + 4*VPN' to get the PTE, then read the requested page's PTE from the process page table, calculate the physical address, and access memory

  33. Virtual Memory and Cache
  • TLB access is serial with cache access
  [Flowchart: the virtual address first goes through the TLB (on a TLB miss, access the page table); the resulting physical address then accesses the cache; on a cache miss, access memory to get the data]

  34. Overlapped TLB & Cache Access
  • Virtual memory view of a physical address: physical page number (bits 29..12) | page offset (bits 11..0)
  • Cache view of a physical address: tag | #Set | displacement
  • In this example the #Set field is NOT contained within the page offset
  • The #Set is not known until the physical page number is known
  • The cache can therefore be accessed only after address translation is done

  35. Overlapped TLB & Cache Access (cont.)
  • Virtual memory view of a physical address: physical page number (bits 29..12) | page offset (bits 11..0)
  • Cache view of a physical address: tag | #Set | displacement
  • In this example the #Set field IS contained within the page offset
  • The #Set is known immediately
  • The cache can be accessed in parallel with address translation:
  • First the tags from the appropriate set are brought
  • The tag comparison takes place only after the physical page number is known (after address translation is done)

  36. Overlapped TLB & Cache Access (cont.)
  • First stage:
  • The virtual page number goes to the TLB for translation
  • The page offset goes to the cache to get all tags from the appropriate set
  • Second stage:
  • The physical page number, obtained from the TLB, goes to the cache for tag comparison
  • Limitation: cache size ≤ page size * associativity
  • How can we overcome this limitation?
  • Fetch 2 sets and mux after translation
  • Increase associativity
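For example, with the 4KB page size assumed earlier, a direct-mapped cache can be at most 4KB if its set index must come entirely from the page offset, while a 4-way set-associative cache can grow to 4KB * 4 = 16KB and still allow the overlapped access.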

  37. Virtual Memory and Process Switch
  • Whenever we switch execution between processes we need to:
  • Save the state of the old process
  • Load the state of the new process
  • Make sure that the P0BR and P1BR registers point to the new process's translation tables
  • Clean (flush) the TLB
  • We do not need to clean the cache (Why????)
