
Memory Management Unit




Presentation Transcript


  1. Memory Management Unit

  2. The Memory System • Embedded systems and applications • Memory system requirements vary considerably: • Simple blocks • Multiple types of memory • Caches • Write buffers • Virtual memory

  3. Memory management units • Memory management unit (MMU) translates logical addresses into physical addresses • Protection checks on each access [Diagram: CPU issues a logical address → MMU → physical address → main memory]

  4. Memory management tasks • Allows programs to move in physical memory during execution • Allows virtual memory: • memory images kept in secondary storage; • images returned to main memory on demand during execution • Page fault: request for location not resident in memory

  5. Address translation • Requires some sort of register/table to allow arbitrary mappings of logical to physical addresses • Two basic schemes: • segmented • paged • Segmentation and paging can be combined (x86)

  6. Segments and pages [Diagram: a memory containing segment 1, pages 1 and 2, and segment 2]

  7. Segment address translation [Diagram: the logical address is added to the segment base address to form the physical address; a range check against the segment's lower and upper bounds raises a range error for out-of-segment accesses]
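The segment scheme above can be sketched in a few lines. This is an illustrative toy, not any particular machine's translation; the parameter names (base, lower, upper) stand in for the descriptor fields on the slide.

```python
# Minimal sketch of segment-based translation: add the segment base,
# then range-check against the segment bounds (names are illustrative).
def translate_segment(logical_addr, base, lower, upper):
    physical = base + logical_addr        # add the segment base address
    if not (lower <= physical <= upper):  # range check
        raise MemoryError("range error")
    return physical
```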

  8. Page address translation [Diagram: the logical address splits into a page number and an offset; the page number selects page i's base address, which is concatenated with the offset to form the physical address]
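The split-and-concatenate step can be sketched as follows, assuming 4 KB pages (a 12-bit offset). The dict stands in for the in-memory page table; the mapping shown is made up for illustration.

```python
# Sketch of paged translation for 4 KB pages (12-bit offset).
PAGE_BITS = 12
page_table = {0x3: 0x7A000}   # page number -> page base (illustrative)

def translate_page(logical_addr):
    page = logical_addr >> PAGE_BITS               # high bits pick the page
    offset = logical_addr & ((1 << PAGE_BITS) - 1)
    return page_table[page] | offset               # concatenate base + offset
```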

  9. Page table organizations [Diagram: page descriptors stored in a flat table vs. a tree]

  10. Caching address translations • Large translation tables require main memory access • TLB: cache for address translation • Typically small
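The TLB idea can be sketched as a small cache sitting in front of the table walk. This dict-based toy only illustrates hit/miss behavior; a real TLB is a small associative hardware structure with a proper replacement policy.

```python
# Toy TLB in front of a page-table lookup.
TLB_SIZE = 8
tlb = {}   # page number -> frame base

def lookup(page, page_table):
    if page in tlb:                  # TLB hit: no main-memory table walk
        return tlb[page]
    frame = page_table[page]         # TLB miss: walk the (in-memory) table
    if len(tlb) >= TLB_SIZE:
        tlb.pop(next(iter(tlb)))     # naive eviction of the oldest entry
    tlb[page] = frame
    return frame
```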

  11. ARM Memory Management Unit

  12. ARM Memory Management • System control coprocessor (CP15) • Memory • Write buffers • Caches • Registers • Up to 16 primary registers • CP15 contains more than 16 physical registers • Register access instructions • MCR (ARM to CP15) • MRC (CP15 to ARM)

  13. Cached MMU memory system

  14. ARM Memory Management • MMU can be enabled and disabled • Memory region types: • section: 1 MB block • large page: 64 KB • small page: 4 KB • tiny page: 1 KB • Two-level translation scheme (why? a flat table for 4 KB pages would need 2^20 entries × 4 bytes = 4 MB) • First-level table • Second-level table

  15. ARM address translation [Diagram: the translation table base register plus the 1st index selects a first-level table descriptor; that descriptor plus the 2nd index selects a second-level table descriptor; the page base it supplies is concatenated with the offset to form the physical address]
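A sketch of the two-level walk for a 4 KB small page, using the index split described above (VA[31:20] into the first-level table, VA[19:12] into the coarse second-level table, VA[11:0] as offset). Plain dicts stand in for the in-memory descriptor tables; the table contents are made up.

```python
# Two-level walk for a 4 KB small page (dicts stand in for descriptors).
def arm_translate(va, l1_table, l2_tables):
    i1 = va >> 20                 # first-level index (4096 entries)
    i2 = (va >> 12) & 0xFF        # second-level index (256 entries)
    offset = va & 0xFFF
    l2 = l2_tables[l1_table[i1]]  # level-1 descriptor names a coarse table
    return l2[i2] | offset        # page base concatenated with the offset
```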

  16. First-level descriptors • AP: access permissions • C, B: cacheability and bufferability

  17. Section descriptor and translating section references [Diagram: CP15 register 2 holds the first-level table base, aligned to a 16 KB boundary; the table has 4K entries (max 16 KB), and a section descriptor maps a 1 MB block (section)]

  18. Coarse page table descriptor [Diagram: the first-level table (4K entries, max 16 KB) points to coarse page tables of 256 entries (max 1 KB each)]

  19. Fine page table descriptor [Diagram: a fine page table has 1K entries (max 4 KB)]

  20. Second-level descriptor

  21. Translating large page references

  22. Access permissions • System (S) and ROM (R) in CP15 register 1

  23. TLB functions • Invalidate instruction TLB • Invalidate instruction single entry • Invalidate entire data TLB • Invalidate data single entry • TLB lockdown

  24. Caches

  25. Cache operation • Many main memory locations are mapped onto one cache entry • May have caches for • instructions • data • data + instructions (unified) • Memory access time is no longer deterministic

  26. Cache performance benefits • Keep frequently-accessed locations in fast cache • Temporal locality • Cache retrieves more than one word at a time • Sequential accesses are faster after first access • Spatial locality

  27. Terms • Cache hit: required location is in cache • Cache miss: required location is not in cache • Working set: set of locations used by program in a time interval

  28. Types of misses (3-C) • Compulsory (cold): location has never been accessed • Capacity: working set is too large • Conflict: multiple locations in working set map to same cache entry

  29. Memory system performance • h = cache hit rate • t_cache = cache access time, t_main = main memory access time • Average memory access time: • t_av = h·t_cache + (1 − h)·t_main
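The formula on this slide can be evaluated directly; the numbers below (90% hit rate, 1-cycle cache, 100-cycle memory) are an illustrative choice, not from the slides.

```python
# Average memory access time for a single cache level:
# t_av = h * t_cache + (1 - h) * t_main
def t_av(h, t_cache, t_main):
    return h * t_cache + (1 - h) * t_main

# e.g. t_av(0.9, 1, 100) -> 10.9 cycles: even a 10% miss rate
# dominates the average when main memory is 100x slower.
```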

  30. Multiple levels of cache [Diagram: CPU → L1 cache → L2 cache → main memory]

  31. Multi-level cache access time • h1 = L1 hit rate • h2 = rate of accesses that miss L1 but hit L2 • Average memory access time: • t_av = h1·t_L1 + h2·t_L2 + (1 − h1 − h2)·t_main
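The two-level formula, with h2 counting accesses that miss L1 but hit L2, as the slide defines it. The sample rates and latencies are illustrative.

```python
# Two-level average access time; h2 = fraction that miss L1 but hit L2.
def t_av2(h1, h2, t_L1, t_L2, t_main):
    return h1 * t_L1 + h2 * t_L2 + (1 - h1 - h2) * t_main

# e.g. t_av2(0.9, 0.08, 1, 10, 100): only 2% of accesses pay the
# full main-memory latency.
```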

  32. Cache organizations • Set-associativity • Fully-associative: any memory location can be stored anywhere in the cache (almost never implemented) • Direct-mapped: each memory location maps onto exactly one cache entry • N-way set-associative: each memory location can go into one of n sets

  33. Cache organizations, cont’d • Cache size • line (or block) size

  34. Types of caches • Unified or separate caches • Write operations • Write-through: immediately copy write to main memory • Write-back: write to main memory only when location is removed from cache

  35. Types of caches, cont’d • Cache block allocation • Read-allocation • Write-allocation • Access address • Physical address • Virtual address • Advantage: fast • Problems: context switching, synonyms or aliases

  36. Replacement policies • Replacement policy: strategy for choosing which cache entry to throw out to make room for a new memory location • Two popular strategies: • Random • Least-recently used (LRU) • Pseudo-LRU
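LRU can be sketched with an ordered map whose insertion order tracks recency. This toy models a fully-associative cache; hardware typically approximates LRU (pseudo-LRU) rather than tracking exact recency like this.

```python
from collections import OrderedDict

class LRUCache:
    """Toy fully-associative cache with exact LRU replacement."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()   # key order = recency order

    def access(self, addr):
        hit = addr in self.entries
        if hit:
            self.entries.move_to_end(addr)        # mark most recently used
        else:
            if len(self.entries) >= self.capacity:
                self.entries.popitem(last=False)  # evict least recently used
            self.entries[addr] = True
        return hit
```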

  37. Direct-mapped cache [Diagram: the address splits into tag, index, and offset; the index selects one cache line holding a valid bit (1), a tag (0xabcd), and a block of data bytes; the stored tag is compared with the address tag, and a match on a valid line signals a hit and selects the value]

  38. Direct-mapped cache locations • Many locations map onto the same cache block • Conflict misses are easy to generate • Array a[] uses locations 0, 1, 2, … • Array b[] uses locations 1024, 1025, 1026, … • Operation a[i] + b[i] generates conflict misses
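The conflict scenario above can be counted directly. The sizes here are an assumption for illustration: a 1024-block direct-mapped cache with one word per block, a[] starting at address 0 and b[] at 1024, so a[i] and b[i] always map to the same block.

```python
# Count misses for a[i] + b[i] when the arrays alias in the cache.
NUM_BLOCKS = 1024
tags = [None] * NUM_BLOCKS   # tag stored in each block (None = invalid)
misses = 0
N = 8
for i in range(N):
    for addr in (0 + i, 1024 + i):    # a[i], then b[i]
        index = addr % NUM_BLOCKS     # both map to block i
        tag = addr // NUM_BLOCKS      # ...but with different tags
        if tags[index] != tag:
            misses += 1
            tags[index] = tag         # each access evicts the other array
# every one of the 2*N accesses misses
```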

  39. Set-associative cache • A set of direct-mapped caches: [Diagram: Set 1, Set 2, … Set n probed in parallel; a hit in any one supplies the data]

  40. Example: direct-mapped vs. set-associative

  41. Direct-mapped cache behavior

  After 001 access:        After 010 access:
  block  tag  data         block  tag  data
  00     -    -            00     -    -
  01     0    1111         01     0    1111
  10     -    -            10     0    0000
  11     -    -            11     -    -

  42. Direct-mapped cache behavior, cont’d.

  After 011 access:        After 100 access:
  block  tag  data         block  tag  data
  00     -    -            00     1    1000
  01     0    1111         01     0    1111
  10     0    0000         10     0    0000
  11     0    0110         11     0    0110

  43. Direct-mapped cache behavior, cont’d.

  After 101 access:        After 111 access:
  block  tag  data         block  tag  data
  00     1    1000         00     1    1000
  01     1    0001         01     1    0001
  10     0    0000         10     0    0000
  11     0    0110         11     1    0100
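The example can be replayed in a few lines: 3-bit addresses on a 4-block direct-mapped cache, with the low 2 bits as the index and the high bit as the tag; the data values are the ones the slides associate with each address.

```python
# Replay the slides' direct-mapped example (address, data) sequence.
accesses = [(0b001, '1111'), (0b010, '0000'), (0b011, '0110'),
            (0b100, '1000'), (0b101, '0001'), (0b111, '0100')]

cache = {}                        # index -> (tag, data)
for addr, data in accesses:
    index, tag = addr & 0b11, addr >> 2
    cache[index] = (tag, data)    # direct-mapped: a conflict just overwrites

# final state: 00->(1,'1000'), 01->(1,'0001'), 10->(0,'0000'), 11->(1,'0100')
```

The final state matches the last table above: the 101 and 111 accesses evicted the earlier 001 and 011 entries from blocks 01 and 11.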

  44. 2-way set-associative cache behavior • Final state of cache (twice as big as direct-mapped):

  set  blk 0 tag  blk 0 data  blk 1 tag  blk 1 data
  00   1          1000        -          -
  01   0          1111        1          0001
  10   0          0000        -          -
  11   0          0110        1          0100

  45. 2-way set-associative cache behavior • Final state of cache (same size as direct-mapped):

  set  blk 0 tag  blk 0 data  blk 1 tag  blk 1 data
  0    01         0000        10         1000
  1    10         0001        11         0100
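The same six accesses can be replayed on the same-size 2-way cache: 2 sets, with the low address bit as index and the top 2 bits as tag, and LRU replacement within each set (the slides do not name the policy; LRU is assumed here and reproduces the final state shown).

```python
# Replay the sequence on a 2-set, 2-way cache with per-set LRU.
accesses = [(0b001, '1111'), (0b010, '0000'), (0b011, '0110'),
            (0b100, '1000'), (0b101, '0001'), (0b111, '0100')]

sets = {0: [], 1: []}             # each set: list of (tag, data), LRU first
for addr, data in accesses:
    index, tag = addr & 0b1, addr >> 1
    ways = sets[index]
    ways[:] = [w for w in ways if w[0] != tag]   # hit: drop the old copy
    if len(ways) == 2:
        ways.pop(0)               # set full: evict the least recently used
    ways.append((tag, data))      # newest entry is most recently used
```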

  46. ARM Caches and write buffers

  47. Caches and write buffers • Caches • Physical address • Virtual address • Memory coherence problem • Write buffers • To optimize stores to main memory • Physical address • Virtual address • Memory coherence problem

  48. CP15 register 0 • S=0: unified cache • S=1: separate caches

  49. CP15 register 0, cont’d

  50. CP15 register 1 • C (bit[2]): • 0=cache disabled, 1=cache enabled • W (bit[3]): • 0=write buffer disabled, 1=enabled • I (bit[12]): if separate caches are used, • 0=disabled, 1=enabled
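As a sketch, the bit positions listed above can be decoded from a register value like this. The helper name is hypothetical; on real hardware the value would be read with MRC, which Python obviously cannot do.

```python
# Hypothetical decoder for the cache-related bits of a CP15 register 1
# value (positions from the slide: C = bit 2, W = bit 3, I = bit 12).
def decode_control(reg):
    return {
        'C': bool(reg & (1 << 2)),    # cache enable
        'W': bool(reg & (1 << 3)),    # write buffer enable
        'I': bool(reg & (1 << 12)),   # instruction cache (separate caches)
    }
```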
