Chap. 7.4: Virtual Memory
Review: Caches
• Cache design choices:
  • size of cache: speed vs. capacity
  • direct-mapped vs. associative
  • for N-way set associative: choice of N
  • block replacement policy
  • second-level cache?
  • write-through vs. write-back?
• Use a performance model to pick between choices, depending on programs, technology, budget, ...
Another View of the Memory Hierarchy
From the upper level (faster, smaller) to the lower level (slower, larger):
• Registers — hold instructions and operands
• Cache and L2 cache — transfer in blocks  { thus far }
• Memory — transfer in pages  { next: virtual memory }
• Disk — transfer in files
• Tape
Recall: the illusion of memory
• The programmer's view of memory: an unlimited amount of fast memory
• How do we create this illusion?
• Library analogy: the desk holds only the few books currently in use; the shelves hold the library's whole collection
Virtual memory
• Creates the illusion of unlimited memory
• Motivation:
  • A collection of programs runs at once on a machine, and the total memory required by all programs exceeds the amount of main memory available
  • Allow even a single user program to exceed the size of primary memory
• Earlier software solution: programmers divided programs into pieces, called overlays, and swapped them in and out by hand
Simple Example: Base and Bound Registers
• Physical memory from address 0 upward holds the OS, then User A, User B, User C; each process's region runs from $base to $base+$bound
• There is enough total free space for User D, but it is discontinuous (the "fragmentation problem")
• We want a discontinuous mapping, and process sizes much larger than memory
• Adding a base register is not enough => what to do?
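The base-and-bound scheme can be sketched in a few lines. This is a toy model, not any real hardware interface; the register values are hypothetical.

```python
# Toy base-and-bound translation (hypothetical register values).
BASE = 0x4000    # start of this process's region in physical memory
BOUND = 0x1000   # size of the region (4 KB)

def translate(virtual_addr):
    """Check against the bound, then add the base."""
    if virtual_addr >= BOUND:
        raise MemoryError("address out of bounds")  # hardware would trap here
    return BASE + virtual_addr

print(hex(translate(0x0FF0)))  # a legal access near the top of the region
```

The sketch also makes the scheme's weakness visible: every legal address lands in one contiguous physical region, so discontinuous free space cannot be used.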
Virtual memory system
• Treat main memory as a cache for the secondary storage (disk)
• Question: how do we map main memory to disk?
More about Virtual Memory
• This mechanism is called "virtual memory"
• Also allows the OS to share memory and protect programs from each other
• Today it is more important for protection than as just another level of the memory hierarchy
• Each process thinks it has all the memory to itself
• Historically, virtual memory predates caches
Mapping Virtual Memory to Physical Memory
• A virtual address space (code, static data, heap, stack; addresses 0 to ∞) maps onto physical memory (e.g., 64 MB)
• Divide both into equal-sized chunks, called "pages" (about 4 KB - 8 KB)
• Any chunk of virtual memory can be assigned to any chunk of physical memory
Paging Organization (assume 1 KB pages)
• The page is the unit of mapping; the address translation map takes each virtual page to a physical page
• Virtual memory: pages 0, 1, 2, ..., 31 starting at virtual addresses 0, 1024, 2048, ..., 31744
• Physical memory: pages 0, 1, ..., 7 starting at physical addresses 0, 1024, ..., 7168
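With 1 KB pages, translation is just "split the address into (page number, offset), look up the page number, reattach the offset." A minimal sketch, with a hypothetical page table chosen for illustration:

```python
PAGE_SIZE = 1024  # 1 KB pages, as in the slide

# Hypothetical page table: virtual page number -> physical page number
page_table = {0: 1, 1: 0, 2: 2, 7: 3}

def translate(va):
    vpn, offset = divmod(va, PAGE_SIZE)      # split off the page offset
    if vpn not in page_table:
        raise KeyError(f"page fault on virtual page {vpn}")
    return page_table[vpn] * PAGE_SIZE + offset

print(translate(1024))  # virtual page 1 maps to physical page 0
```

Note that the offset passes through unchanged: paging only remaps which page a byte lives in, never its position within the page.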
VM address translation (1)
• Virtual address space: 2^32 bytes = 4 GB
• Physical address space: 2^30 bytes = 1 GB
• Page size: 2^12 bytes = 4 KB
• How do we translate virtual addresses — direct mapped, set associative, or fully associative?
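For the parameters above, the split into virtual page number and page offset is a pure bit operation. A small sketch (the example address is arbitrary):

```python
PAGE_BITS = 12  # 4 KB pages => 12-bit page offset

def split(va):
    """Split a 32-bit virtual address into (VPN, offset)."""
    vpn = va >> PAGE_BITS               # upper 20 bits: virtual page number
    offset = va & ((1 << PAGE_BITS) - 1)  # lower 12 bits: offset within page
    return vpn, offset

print(split(0x12345678))  # VPN 0x12345, offset 0x678
```

The 20-bit VPN is why a flat page table for this machine needs 2^20 entries.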
VM address translation (2)
• Address translation is fully associative: any virtual page can map to any physical page
• Each program has its own page table, with 2^20 entries (one per virtual page)
Paging/Virtual Memory with Multiple Processes
• User A and User B each have their own virtual address space (code, static, heap, stack; 0 to ∞)
• Each process has its own page table (A's page table, B's page table) mapping its virtual pages into the shared 64 MB of physical memory
Page fault (~cache miss)
• If a virtual page number has no valid mapping in physical memory, the access is a page fault — the virtual-memory analogue of a cache miss — and the page must be brought in from disk
Key issues in VM
• The page fault penalty takes millions of cycles
• Pages should be large enough to amortize the high access time: 32-64 KB
• Reduce the page fault rate: fully associative placement
• Use a clever algorithm to replace pages when page faults occur — e.g., LRU (least recently used)
• Use write-back instead of write-through
Making address translation faster
• Problem: page tables are stored in main memory, so each load/store of a virtual address takes two memory accesses — one for the page table entry, one for the data at the physical address
• Solution: use a cache to store recently used page table entries
Translation-lookaside buffer (TLB)
• The TLB is this cache of recent translations; only on a TLB miss is the page table in memory consulted
VM, TLB, and cache
• Processor issues a VA → TLB lookup: on a hit, the PA goes to the cache; on a cache hit, data returns; on a cache miss, main memory is accessed
• On a TLB miss, the page table entry is fetched from main memory (translation), and the access is retried
• TLBs are usually small, typically 128 - 256 entries
• Like any other cache, the TLB can be direct mapped, set associative, or fully associative
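The hit/miss flow above can be sketched as a lookup in a small dictionary sitting in front of the page table. The page table contents here are hypothetical, and the TLB is unbounded for simplicity (a real TLB would also evict entries).

```python
PAGE_SIZE = 4096
page_table = {5: 9, 6: 2}  # virtual page -> physical page, lives "in memory"
tlb = {}                   # small cache of recent translations

def translate(va):
    vpn, offset = divmod(va, PAGE_SIZE)
    if vpn in tlb:               # TLB hit: no extra memory access
        ppn = tlb[vpn]
    else:                        # TLB miss: walk the page table in memory
        ppn = page_table[vpn]
        tlb[vpn] = ppn           # fill the TLB for next time
    return ppn * PAGE_SIZE + offset

translate(5 * PAGE_SIZE + 4)  # first access: TLB miss, fills the TLB
translate(5 * PAGE_SIZE + 8)  # same page again: TLB hit
```

Locality is what makes this work: successive accesses tend to fall in the same few pages, so most translations hit in the TLB and cost no memory access at all.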
VM, TLB, and caches
• A 32-bit virtual address splits into bits 31-12 (the virtual page number, looked up in the TLB) and bits 11-0 (the page offset); the resulting physical address is then used to index the cache
Comparing the two levels of hierarchy

  Cache version                          Virtual memory version
  Block or line                          Page
  Miss                                   Page fault
  Block size: 32-64 B                    Page size: 4 KB-8 KB
  Placement: direct mapped,              Placement: fully associative
    N-way set associative
  Replacement: LRU or random             Replacement: LRU
  Write through or write back            Write back
Review: 4 Qs for any Memory Hierarchy • Q1: Where can a block be placed? • One place (direct mapped) • A few places (set associative) • Any place (fully associative) • Q2: How is a block found? • Indexing (as in a direct-mapped cache) • Limited search (as in a set-associative cache) • Full search (as in a fully associative cache) • Separate lookup table (as in a page table) • Q3: Which block is replaced on a miss? • Least recently used (LRU) • Random • Q4: How are writes handled? • Write through (Level never inconsistent w/lower) • Write back (Could be “dirty”, must have dirty bit)
Q1: Where can a block be placed in the upper level?
• Block 12 placed in an 8-block cache:
  • Fully associative: block 12 can go anywhere (blocks 0-7)
  • Direct mapped: block 12 can go only into block 4 (12 mod 8)
  • 2-way set associative: block 12 can go anywhere in set 0 (12 mod 4); the 8 blocks form sets 0-3 of two blocks each
• Set-associative mapping: set = block # mod # of sets
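The three placements are really one formula with different set counts, which a short sketch makes concrete (cache layout here assumes set s occupies block slots s·ways through s·ways + ways − 1):

```python
def placement(block, nblocks, nways):
    """Candidate slots for `block` in a cache of nblocks blocks, nways ways."""
    nsets = nblocks // nways
    s = block % nsets                              # set = block # mod # of sets
    return list(range(s * nways, (s + 1) * nways))  # all slots in that set

# Block 12 in an 8-block cache:
print(placement(12, 8, 1))  # direct mapped: only slot 4 (12 mod 8)
print(placement(12, 8, 2))  # 2-way: set 0 (12 mod 4), slots 0-1
print(placement(12, 8, 8))  # fully associative: any of the 8 slots
```

Direct mapped is thus the nways = 1 extreme and fully associative the nways = nblocks extreme of the same scheme.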
Q2: How is a block found in the upper level?
• The block address splits into a tag and an index, plus a block offset within the block
• Direct indexing (using the index and block offset), tag compares, or a combination: the index selects the set, the tag compare selects the data within it
• Increasing associativity shrinks the index and expands the tag
Q3: Which block is replaced on a miss?
• Easy for direct mapped: there is only one candidate
• Set associative or fully associative:
  • Random
  • LRU (Least Recently Used)

Miss rates, LRU vs. random:

  Associativity:   2-way           4-way           8-way
  Size             LRU    Ran      LRU    Ran      LRU    Ran
  16 KB            5.2%   5.7%     4.7%   5.3%     4.4%   5.0%
  64 KB            1.9%   2.0%     1.5%   1.7%     1.4%   1.5%
  256 KB           1.15%  1.17%    1.13%  1.13%    1.12%  1.12%
Q4: What to do on a write hit?
• Write-through
  • update the word in the cache block and the corresponding word in memory
• Write-back
  • update only the word in the cache block, allowing the memory word to go "stale"
  • => add a "dirty" bit to each line, indicating that memory must be updated when the block is replaced
  • => the OS must flush the cache before I/O
• Performance trade-offs:
  • WT: read misses cannot result in writes
  • WB: repeated writes to a block cost only one write to memory
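The dirty-bit mechanics can be shown with a single toy cache line; names and addresses here are illustrative, not any real hardware interface.

```python
# Minimal write-back cache line with a dirty bit (sketch).
class Line:
    def __init__(self):
        self.tag, self.data, self.dirty = None, None, False

def write(line, tag, value, memory):
    """Write into the one-line cache, writing back the evicted block if dirty."""
    if line.tag is not None and line.tag != tag and line.dirty:
        memory[line.tag] = line.data  # eviction: flush the stale block
    line.tag, line.data, line.dirty = tag, value, True  # memory now stale

memory = {}
line = Line()
write(line, 0xA, 1, memory)  # repeated writes to block 0xA...
write(line, 0xA, 2, memory)  # ...touch only the cache, not memory
write(line, 0xB, 3, memory)  # eviction finally writes 0xA back once
print(memory)                # {10: 2} — one memory write for two cache writes
```

This is exactly the WB trade-off above: two writes to block 0xA cost a single write to memory, at the price of memory being stale until eviction (hence the flush-before-I/O requirement).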
Short conclusion of VM
• Virtual memory
  • acts as a cache between main memory and disk
• VM allows
  • a program to expand beyond its physical address space
  • sharing of main memory among multiple processes
• To reduce page faults:
  • large pages — exploit spatial locality
  • fully associative mapping
  • the OS uses a clever replacement method (LRU, ...)