
Computer Architecture

Computer Architecture. Lecture 8: Memory hierarchy. Cache memory. Piotr Bilski. Characteristics of the memory systems: location, capacity, transfer unit, access mode, performance, physical structure, physical characteristics, organization. Memory location.


Presentation Transcript


  1. Computer Architecture • Lecture 8: Memory hierarchy. Cache memory • Piotr Bilski

  2. Characteristics of the memory systems • Location • Capacity • Transfer unit • Access mode • Performance • Physical structure • Physical characteristics • Organization

  3. Memory location • Processor (registers, L1 cache memory) • Internal (main) memory (RAM) • External memory (auxiliary, e.g. disk drives)

  4. Memory capacity • Word size • Number of words • Memory capacity is expressed in bytes and their multiples: 1 B = 8 b, 1 KB = 1024 B, 1 MB = 1024 KB, etc.

  5. Transfer unit • Number of the data lines connected to the memory module (normally equal to the word length), but: • Word is the basic unit of memory organization • Addressable unit (byte or word) is the unit used for direct memory addressing • Transfer unit can be equal to the word or to the addressable unit

  6. Memory access modes • Sequential access (e.g. tape memory) • Direct access (disk memory) • Random access (main memory) • Associative access (cache memory)

  7. Memory performance • Access time – time between placing the address on the address bus and obtaining the information on the data bus • Cycle time – access time plus the additional time required before the next access can start • Transfer speed – for RAM: 1 / cycle time
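
The relation between access time, cycle time and transfer speed can be checked with a few lines of Python. This is only a sketch: the function names and the 50 ns / 30 ns figures are illustrative assumptions, not values from the lecture.

```python
def cycle_time(access_time_s: float, recovery_time_s: float) -> float:
    """Cycle time = access time + the extra time needed before the next access."""
    return access_time_s + recovery_time_s


def transfer_rate(cycle_time_s: float) -> float:
    """Transfer speed of a random-access memory, in transfers per second."""
    if cycle_time_s <= 0:
        raise ValueError("cycle time must be positive")
    return 1.0 / cycle_time_s


# Hypothetical timings, chosen only for illustration.
ACCESS_TIME_S = 50e-9    # 50 ns
RECOVERY_TIME_S = 30e-9  # 30 ns

tc = cycle_time(ACCESS_TIME_S, RECOVERY_TIME_S)
rate = transfer_rate(tc)
print(f"cycle time: {tc * 1e9:.0f} ns, transfer rate: {rate / 1e6:.1f} M transfers/s")
```

With these assumed timings the cycle time is 80 ns, so the memory can complete about 12.5 million transfers per second.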

  8. Physical memory structure • Semiconductor (RAM, ROM) • Magnetic (hard disks, floppy disks, streamers) • Optical (CD-ROM, DVD-ROM) • Magnetooptical (WORM)

  9. Physical characteristics • Volatility: volatile memory (RAM), non-volatile memory (ROM) • Content modification: erasable (e.g. RAM, EPROM), non-erasable (ROM)

  10. Memory organization • One level ("flat") • Multilevel (e.g. cache) • [Figure: access time as a function of hit ratio (0 to 1), with the levels T1, T2 and T1 + T2 marked on the access time axis]
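
The quantities on this slide (T1, T2 and the hit ratio) are commonly combined into an average access time for a two-level memory. The sketch below assumes that every access first checks the fast level (cost T1) and that a miss additionally costs T2; the slide itself does not spell out the formula, so treat this as one standard model rather than the lecture's own definition.

```python
def average_access_time(hit_ratio: float, t1: float, t2: float) -> float:
    """Average access time of a two-level memory.

    Assumed model: a hit costs t1, a miss costs t1 + t2.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit ratio must lie between 0 and 1")
    return hit_ratio * t1 + (1.0 - hit_ratio) * (t1 + t2)


# Hypothetical timings in nanoseconds.
T1, T2 = 10.0, 100.0
for h in (0.0, 0.5, 0.95, 1.0):
    print(f"hit ratio {h:.2f}: {average_access_time(h, T1, T2):.1f} ns")
```

At a hit ratio of 0 the result is T1 + T2, and at a hit ratio of 1 it is T1, which matches the two extremes marked on the access time axis.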

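  11. Memory hierarchy • Shorter access time – higher cost per bit • Larger capacity – lower cost per bit • Larger capacity – longer access time • Levels, from fastest to slowest: processor registers, cache memory, main (operational) memory, external memory • [Figure: hierarchy pyramid with the axes speed, access time, capacity and cost]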

  12. Why do we need cache memory? • Locality of reference rule – an executing program consists of fragments that lie next to each other in memory and are executed one after another • Temporal locality • Spatial locality
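
The effect of locality on a cache can be illustrated with a toy simulation. Everything here is an assumption made for the sketch: a tiny cache of 8 rows with 4-word blocks, a simple "block number mod rows" placement, and three hand-made address streams.

```python
def hit_ratio(addresses, rows=8, block_words=4):
    """Fraction of accesses that find their block already in a tiny cache."""
    resident = [None] * rows          # block number currently held by each row
    hits = 0
    for address in addresses:
        block = address // block_words
        row = block % rows            # assumed placement rule
        if resident[row] == block:
            hits += 1
        else:
            resident[row] = block     # miss: the block replaces the old one
    return hits / len(addresses)


sequential = list(range(64))               # spatial locality: neighbouring words
loop = list(range(16)) * 4                 # temporal locality: same words reused
scattered = [i * 32 for i in range(64)]    # no locality: each access hits a new block

print(hit_ratio(sequential))  # 0.75
print(hit_ratio(loop))        # 0.9375
print(hit_ratio(scattered))   # 0.0
```

The sequential stream misses once per block and then hits on the remaining three words; the loop misses only on its first pass; the scattered stream never reuses a block, so the cache does not help at all.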

  13. Cache memory operating principle • Main memory is addressed using n bits (2^n words in total, addresses 0 to 2^n - 1) and is divided into blocks of K words • Cache memory has C rows (0 to C-1), each holding a flag and one block of K words

  14. Cache memory operating principle (cont.) • Processor and cache memory: transfer of words • Cache memory and main memory: transfer of blocks

  15. Reading from cache memory • START: the address is acquired from the CPU • Is this block's address in the cache memory? • YES: the word is transferred to the CPU • NO: main memory is accessed for the addressed block, the block is assigned to a cache memory row, the block is transferred into the cache memory and the word is transferred to the CPU • END
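
The read procedure on this slide can be sketched as a small class. The class name `SimpleCache`, the list-based main memory and the "block number mod rows" row assignment are assumptions for the sake of a runnable example; the slide leaves the assignment rule open at this point.

```python
class SimpleCache:
    """Toy read-only cache that follows the hit/miss flow of the slide."""

    def __init__(self, memory, rows, block_words):
        self.memory = memory              # main memory as a list of words
        self.rows = rows                  # C rows
        self.block_words = block_words    # K words per block
        self.flags = [None] * rows        # which block each row holds
        self.blocks = [None] * rows       # the cached words themselves
        self.block_transfers = 0          # blocks fetched from main memory

    def read(self, address):
        block_no, offset = divmod(address, self.block_words)
        row = block_no % self.rows        # assumed row assignment
        if self.flags[row] != block_no:   # miss: fetch the whole block
            start = block_no * self.block_words
            self.blocks[row] = self.memory[start:start + self.block_words]
            self.flags[row] = block_no
            self.block_transfers += 1
        return self.blocks[row][offset]   # hit path: hand the word to the CPU


memory = [word * 10 for word in range(32)]
cache = SimpleCache(memory, rows=4, block_words=4)
values = [cache.read(a) for a in (5, 6, 7, 5)]
print(values, cache.block_transfers)  # [50, 60, 70, 50] 1
```

Four reads from the same block cause a single block transfer; the remaining three are served from the cache.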

  16. Details of the cache memory • Size • Mapping • Replacement algorithm • Write algorithm • Row size • Number of cache memories

  17. Size of the cache memory • Minimization of the memory cost • Maximization of the processor’s speed

  18. Mapping function • The number of the rows in the cache is smaller than the number of the blocks in the main memory • Three methods exist: • Direct • Associative • Set-associative

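  19. Cache memory with direct mapping • [Figure: the memory address (s+w bits) is split into Flag (s-r bits), Row (r bits) and Word (w bits); the Row field selects a cache row, whose stored flag is compared with the address flag, giving a hit or a miss]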

  20. Direct mapping (cont.) • i – number of the row in the cache memory • j – number of the block in the main memory • m – number of rows in the cache memory • i = j mod m • Address length: s+w bits • Number of the addressed units: 2^(s+w) words • Block size = row size: 2^w words • Number of blocks in the main memory: 2^s • Number of rows in the cache memory: m = 2^r
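
The address split and the rule i = j mod m can be expressed with shifts and masks. The function names below are hypothetical, and the 10-bit sample address is made up for the illustration.

```python
def split_direct(address: int, r: int, w: int):
    """Split an address into (flag, row, word) for a direct-mapped cache.

    r: number of bits in the Row field (the cache has 2**r rows)
    w: number of bits in the Word field (a block holds 2**w words)
    """
    word = address & ((1 << w) - 1)
    row = (address >> w) & ((1 << r) - 1)
    flag = address >> (w + r)
    return flag, row, word


def row_for_block(j: int, m: int) -> int:
    """Row i that main-memory block j maps to in a cache with m rows."""
    return j % m


address = 0b1011_0110_01          # flag = 1011, row = 0110, word = 01
flag, row, word = split_direct(address, r=4, w=2)
block = address >> 2              # block number j = address without the Word field
print(flag, row, word)            # 11 6 1
print(row_for_block(block, 2**4)) # 6
```

Taking the middle r bits of the address gives the same row as computing j mod m with m = 2^r, which is why direct mapping needs no arithmetic beyond bit selection.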

  21. Result of the direct mapping

  22. Example of the direct mapping • For a cache memory with 2^14 rows (4 B each) and a main memory of 16 MB capacity: • Address length: 24 bits (16 MB = 2^24 B) • Row width: 8 b flag, 32 b data
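
The 8-bit flag in this example follows from the sizes given on the slide. The short calculation below assumes byte addressing (which the 4 B row size suggests); the variable names are illustrative.

```python
MAIN_MEMORY_BYTES = 16 * 1024 * 1024   # 16 MB
ROWS = 2 ** 14                         # rows in the cache
ROW_BYTES = 4                          # bytes of data per row (= block size)

address_bits = (MAIN_MEMORY_BYTES - 1).bit_length()  # bits needed to address 16 MB
word_bits = (ROW_BYTES - 1).bit_length()             # selects a byte within a block
row_bits = (ROWS - 1).bit_length()                   # selects a cache row
flag_bits = address_bits - row_bits - word_bits      # what is left over is the flag

blocks_in_main_memory = MAIN_MEMORY_BYTES // ROW_BYTES
cache_data_bytes = ROWS * ROW_BYTES

print(address_bits, flag_bits, row_bits, word_bits)  # 24 8 14 2
print(blocks_in_main_memory == 2 ** 22)              # True
print(cache_data_bytes // 1024, "KB of cached data") # 64 KB of cached data
```

Each row therefore stores 8 + 32 = 40 bits, and 2^22 / 2^14 = 256 different main-memory blocks compete for every cache row.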

  23. Cache memory with associative mapping • [Figure: the memory address (s+w bits) is split into Flag (s bits) and Word (w bits); the address flag is compared with the flags of all cache rows, giving a hit or a miss]

  24. Associative mapping (cont.) • Address length: s+w bits • Number of the addressed units: 2^(s+w) words • Block size = row size: 2^w words • Number of the main memory blocks: 2^s • Number of rows in the cache memory: any • Flag size: s bits

  25. Example of the associative mapping • Address = Flag (22 b) + Word (2 b) • Main memory addresses shown: 000000, 000004, 12357A, FFFFF4, FFFFF8, FFFFFC • Cache rows (22 b flag, 32 b data): flag 000000 with data 35281987, flag 3FFFFF with data 3982FB1A, flag 048D5E with data F235A72C
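
An associative lookup can be sketched as a search over all rows. The flag and data values are taken from the example slide, but the pairing of each flag with its data word is my reading of the flattened table, and the function names are hypothetical. Real hardware compares all flags in parallel; the loop here only models the result.

```python
WORD_BITS = 2  # Word field width from the example: 2 b


def split_associative(address: int, w: int = WORD_BITS):
    """Split an address into (flag, word); the flag is the whole block number."""
    return address >> w, address & ((1 << w) - 1)


def lookup(rows, address: int, w: int = WORD_BITS):
    """Return the cached data for the address, or None on a miss."""
    flag, _word = split_associative(address, w)
    for stored_flag, data in rows:      # hardware does these comparisons at once
        if stored_flag == flag:
            return data                 # hit
    return None                         # miss


# (flag, data) pairs as read from the example slide (pairing assumed).
rows = [
    (0x000000, 0x35281987),
    (0x3FFFFF, 0x3982FB1A),
    (0x048D5E, 0xF235A72C),
]

print(hex(split_associative(0xFFFFFC)[0]))  # 0x3fffff
print(hex(lookup(rows, 0xFFFFFC)))          # 0x3982fb1a
print(lookup(rows, 0x000004))               # None
```

Because any block may sit in any row, the flag has to carry the full 22-bit block number, which is the price paid for the flexible placement.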

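  26. Cache memory with set-associative mapping • [Figure: the memory address (s+w bits) is split into Flag (s-d bits), Section (d bits) and Word (w bits); the Section field selects a section, and the address flag is compared with the flags of the rows in that section, giving a hit or a miss]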

  27. Set-associative mapping (cont.) • i – number of the section in the cache memory • j – number of the block in the main memory • m – number of rows in the cache memory • v – number of sections • k – number of rows in a section • m = v × k • i = j mod v • Address length: s+w bits • Number of addressed units: 2^(s+w) words • Block size = row size: 2^w words • Number of blocks in the main memory: 2^s

  28. Set-associative mapping (cont.) • Number of rows in a section: k • Number of sections: v = 2^d • Number of rows in the cache memory: kv = k × 2^d • Flag size: (s-d) bits
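
The set-associative address split works like the direct one, except that the middle field selects a section instead of a single row. The sketch below uses d = 13 and w = 2, the field widths of the example on the next slide; k = 2 rows per section and the function name are assumptions.

```python
def split_set_associative(address: int, d: int, w: int):
    """Split an address into (flag, section, word).

    d: number of bits in the Section field (v = 2**d sections)
    w: number of bits in the Word field
    """
    word = address & ((1 << w) - 1)
    section = (address >> w) & ((1 << d) - 1)
    flag = address >> (w + d)
    return flag, section, word


D_BITS, W_BITS = 13, 2     # Section (13 b) and Word (2 b)
K = 2                      # rows per section (assumed for illustration)
V = 2 ** D_BITS            # number of sections
M = K * V                  # total rows in the cache: m = v x k

address = 0xFFFFFC
flag, section, word = split_set_associative(address, D_BITS, W_BITS)
block = address >> W_BITS  # block number j

print(hex(flag), hex(section), word)   # 0x1ff 0x1fff 0
print(section == block % V)            # True: i = j mod v
print(M == 2 ** 14)                    # True
```

With a 24-bit address the flag shrinks to 24 - 13 - 2 = 9 bits, and only the k flags of one section have to be compared on each access.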

  29. Example of the set-associative mapping • Address = Flag (9 b) + Section (13 b) + Word (2 b) • Cache row: 9 b flag and 32 b data (two such rows drawn side by side) • Values shown in the figure: section + word address parts 0000, 0004, 7FFC; flags 000, 01A, 1FF; data 35281987, F235A72C, 67321342, 3982FB1A

  30. Cache memory replacement algorithms • Least recently used (LRU) • First in, first out (FIFO) • Least frequently used (LFU) • Random choice
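
The difference between LRU and FIFO shows up as soon as a block is reused shortly before it would be evicted. The following sketch counts misses for one section (or a fully associative cache) with k rows; the function names and the reference sequence are made up for the illustration.

```python
from collections import OrderedDict, deque


def misses_lru(references, k):
    """Count misses with least-recently-used replacement and k rows."""
    rows = OrderedDict()                  # oldest use first, newest use last
    misses = 0
    for block in references:
        if block in rows:
            rows.move_to_end(block)       # a hit refreshes the block's age
        else:
            misses += 1
            if len(rows) == k:
                rows.popitem(last=False)  # evict the least recently used block
            rows[block] = True
    return misses


def misses_fifo(references, k):
    """Count misses with first-in-first-out replacement and k rows."""
    rows = deque()                        # oldest arrival first
    misses = 0
    for block in references:
        if block not in rows:
            misses += 1
            if len(rows) == k:
                rows.popleft()            # evict the block loaded earliest
            rows.append(block)
    return misses


references = [1, 2, 3, 1, 4, 1, 2]
print(misses_lru(references, 3))   # 5
print(misses_fifo(references, 3))  # 6
```

FIFO evicts block 1 when block 4 arrives even though block 1 was just used, so the next access to block 1 misses; LRU keeps it and evicts block 2 instead.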

  31. Algorithms of writing into the cache memory • Write through • Write back • Ensuring consistency (multiprocessor system with caches): • Bus watching with write through • Hardware transparency • Memory not mapped by the cache memory (non-cacheable)
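
The two write algorithms differ in how often main memory is updated. The sketch below counts main-memory writes for a sequence of stores, given as block numbers. The direct-mapped placement, the per-row "dirty" marker and the final flush are assumptions of this toy model; note also that write through counts individual stores while write back counts whole-block write-backs.

```python
def write_through_traffic(block_writes):
    """Write through: every store also goes to main memory."""
    return len(block_writes)


def write_back_traffic(block_writes, rows):
    """Write back: a modified block reaches main memory only when it is
    evicted (or when the cache is flushed at the end)."""
    resident = [None] * rows     # block held by each row
    dirty = [False] * rows       # True if the row was modified since loading
    memory_writes = 0
    for block in block_writes:
        row = block % rows       # assumed direct-mapped placement
        if resident[row] != block:
            if dirty[row]:
                memory_writes += 1   # write the old block back before replacing it
            resident[row] = block
        dirty[row] = True            # the store modifies the cached copy only
    memory_writes += sum(dirty)      # final flush of the modified rows
    return memory_writes


stores = [0, 0, 0, 1, 1, 4, 0]   # block numbers of successive stores
print(write_through_traffic(stores))        # 7
print(write_back_traffic(stores, rows=4))   # 4
```

Repeated stores to the same block are absorbed by the cache under write back, at the cost of main memory being temporarily out of date, which is the source of the consistency problem listed on the slide.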

  32. Other problems • Row size and block size • Number of cache memories: • The higher-level cache memory is integrated on one chip with the processor and works at the same frequency • The lower-level cache memory is on the mainboard and works at the bus frequency

  33. Pentium 4 cache memory

  34. Pentium 4 processor core • Instruction fetch/decode unit: fetches instructions from the L2 cache memory, decodes them into micro-operations, transfers the micro-operations to the L1 cache memory • Out-of-order execution unit: queues micro-operations • Execution units: execute micro-operations, fetch data from the L1 cache, write results into the registers • Memory subsystem: communicates with the system bus and the L2 cache memory

  35. PowerPC cache memory

  36. PowerPC cache memory (cont.)
