Cache Basics (Section 1.7, 5.1)

Presentation Transcript


  1. Cache Basics (Section 1.7, 5.1) • A cache is a small, fast memory located close to the CPU that holds the most recently accessed code or data • A block is a fixed-size collection of data, containing the requested word, that is retrieved from memory • Temporal locality tells us that we are likely to need this word again in the near future • Spatial locality tells us that the other data in the block is likely to be needed soon.
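To make the two kinds of locality concrete, here is a minimal C sketch (the array name and sizes are illustrative, not from the text):

    #include <stdio.h>

    #define N 1024

    int main(void) {
        static int a[N];
        long sum = 0;

        /* Spatial locality: consecutive elements of a[] sit in the same
           cache block, so after the miss that fetches a block, the next
           few accesses are hits. */
        for (int i = 0; i < N; i++)
            sum += a[i];

        /* Temporal locality: sum and the loop variables are reused on
           every iteration, so they tend to stay resident in the cache. */
        printf("%ld\n", sum);
        return 0;
    }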

  2. Cache Basics (Cont’d) • The time required for a cache miss depends on the latency and bandwidth of the memory • Latency determines the time to retrieve the first word of the block • Bandwidth determines the time to retrieve the rest of the block • Hit (or miss) rate is the fraction of cache accesses that result in a hit (or a miss) • Example on page 42
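As a rough model of these two bullets (the numbers below are illustrative, not the book's page 42 example), the miss penalty is the latency to the first word plus the time to stream the rest of the block at the memory bandwidth:

    #include <stdio.h>

    int main(void) {
        /* Illustrative numbers only. */
        double latency_ns      = 60.0;  /* time to the first word of the block */
        double ns_per_word     = 5.0;   /* 1 / bandwidth, per additional word  */
        int    words_per_block = 8;

        double miss_penalty = latency_ns + (words_per_block - 1) * ns_per_word;
        printf("miss penalty = %.1f ns\n", miss_penalty);

        /* Hit rate and miss rate sum to 1 over all cache accesses. */
        double miss_rate = 0.05;
        printf("hit rate = %.2f\n", 1.0 - miss_rate);
        return 0;
    }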

  3. Performance of Cache Memory • Memory stall cycles = Number of misses x Miss penalty = IC x (Misses/instruction) x Miss penalty = IC x Memory references per instruction x Miss rate x Miss penalty • CPU execution time = (CPU clock cycles + Memory stall cycles) x Clock cycle time • Example on page 43
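A small sketch that plugs illustrative numbers (not the page 43 example) into these formulas:

    #include <stdio.h>

    int main(void) {
        /* Illustrative values, not the textbook example. */
        double ic            = 1e9;   /* instruction count                 */
        double refs_per_inst = 1.3;   /* memory references per instruction */
        double miss_rate     = 0.02;
        double miss_penalty  = 100.0; /* clock cycles                      */
        double base_cpi      = 1.5;
        double cc_time       = 1e-9;  /* clock cycle time in seconds       */

        double stall_cycles = ic * refs_per_inst * miss_rate * miss_penalty;
        double cpu_cycles   = ic * base_cpi;
        double cpu_time     = (cpu_cycles + stall_cycles) * cc_time;

        printf("stall cycles = %.3g, CPU time = %.3f s\n", stall_cycles, cpu_time);
        return 0;
    }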

  4. Where Can a Block Be Placed in a Cache? (Figure 5.2) • Direct Mapped: each block has only one place in the cache: (Block address) mod (No. of blocks in cache) • Fully Associative: a block can be placed anywhere in the cache • Set Associative: a block can be placed in a restricted set of places in the cache. A set is a group of blocks. If there are n blocks in a set, the placement is n-way set-associative. The set is chosen as (Block address) mod (No. of sets in cache)
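A sketch of the three placement rules, with illustrative cache parameters:

    #include <stdio.h>

    int main(void) {
        unsigned block_addr = 12345;   /* illustrative block address */
        unsigned num_blocks = 256;     /* blocks in the cache        */
        unsigned assoc      = 4;       /* blocks per set (n-way)     */
        unsigned num_sets   = num_blocks / assoc;

        /* Direct mapped: exactly one legal frame. */
        printf("direct-mapped frame = %u\n", block_addr % num_blocks);

        /* n-way set associative: any of the 'assoc' frames in this set. */
        printf("%u-way set           = %u\n", assoc, block_addr % num_sets);

        /* Fully associative: any frame; no modulo restriction at all. */
        return 0;
    }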

  5. How Is a Block Found If It Is in the Cache? • Each block has an address tag and an index that together give the block address (Fig 5.3) • A block offset points to the desired data within the block • The index field selects the set • The tag field is compared to determine a hit • Increasing associativity means increasing the tag field and decreasing the index field • Fully associative caches have no index field
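A sketch of how the tag, index, and block offset are carved out of an address with shifts and masks (the field widths here are illustrative):

    #include <stdio.h>
    #include <stdint.h>

    int main(void) {
        /* Illustrative field widths: 5-bit offset, 8-bit index. */
        const unsigned offset_bits = 5, index_bits = 8;
        uint32_t addr = 0x00012345u;

        uint32_t offset = addr & ((1u << offset_bits) - 1);
        uint32_t index  = (addr >> offset_bits) & ((1u << index_bits) - 1);
        uint32_t tag    = addr >> (offset_bits + index_bits);

        /* The index selects the set; the tag is compared against every
           block in that set; the offset picks the word within the block. */
        printf("tag=%#x index=%#x offset=%#x\n", tag, index, offset);
        return 0;
    }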

  6. Which Block Should Be Replaced? • On a cache miss, an existing block may have to be evicted to make room for the new block • In a direct-mapped cache, there is a fixed place for each block, so the choice is simple • In fully associative or set-associative caches, three strategies exist for replacement • Random • Least-recently used (LRU) • FIFO
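A sketch of LRU selection within one set of an n-way cache, using a per-block "last used" counter (the structure and names are illustrative; real caches track this in hardware):

    #include <stdio.h>

    #define WAYS 4

    /* One set of a WAYS-way set-associative cache. */
    struct set {
        unsigned tag[WAYS];
        unsigned last_use[WAYS];  /* larger value = used more recently */
    };

    /* Return the way to evict: the least-recently-used block. */
    static int lru_victim(const struct set *s) {
        int victim = 0;
        for (int w = 1; w < WAYS; w++)
            if (s->last_use[w] < s->last_use[victim])
                victim = w;
        return victim;
    }

    int main(void) {
        struct set s = { {10, 11, 12, 13}, {5, 2, 9, 7} };
        printf("evict way %d\n", lru_victim(&s));  /* way 1, last used at time 2 */
        return 0;
    }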

  7. What Happens on a Write? Options when writing to the cache: • Write through: write to the cache and to the memory • Next lower level has the most current copy • Write back: write to the cache only • Write occurs at the speed of the cache • Dirty bit specifies whether the block has been modified Options on a write miss: • Write allocate: the block is loaded on a write miss • No-write allocate: the block is not loaded into the cache
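A sketch contrasting the two write-hit policies, including the dirty bit used by write back (the structures are illustrative, not a real cache implementation):

    #include <stdbool.h>
    #include <stdio.h>

    struct block { unsigned data; bool dirty; };

    unsigned memory_word;   /* stand-in for the next lower level */

    /* Write through: update the cache and the lower level together. */
    void write_through(struct block *b, unsigned value) {
        b->data = value;
        memory_word = value;          /* memory always has the current copy */
    }

    /* Write back: update only the cache and mark the block dirty;
       memory is updated later, when the dirty block is evicted. */
    void write_back(struct block *b, unsigned value) {
        b->data = value;
        b->dirty = true;
    }

    void evict(struct block *b) {
        if (b->dirty) {               /* write the modified block back */
            memory_word = b->data;
            b->dirty = false;
        }
    }

    int main(void) {
        struct block b = {0, false};
        write_back(&b, 42);
        evict(&b);
        printf("memory after eviction = %u\n", memory_word);
        return 0;
    }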

  8. Alpha AXP 21064 Data Cache (Figure 5.5) • 8192-byte data cache • 32-byte blocks (5-bit offset, 8-bit index) • Direct-mapped • Write through with a 4-block write buffer • No-write allocate: writes go around the cache on a miss • 34-bit address: 21-bit tag, 8-bit index, 5-bit offset • Write buffer uses merging • Separate instruction and data caches
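These field widths are consistent with the cache parameters on the slide: 8192 bytes / 32 bytes per block = 256 direct-mapped blocks, so the index needs log2(256) = 8 bits and the offset needs log2(32) = 5 bits, leaving 34 - 8 - 5 = 21 bits for the tag.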

  9. Cache Performance • Average memory access time = Hit time + Miss rate x Miss penalty • CPU time = (CPU execution clock cycles + Memory stall clock cycles) x Clock cycle time • Examples on pages 384-389
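A sketch plugging illustrative values (not those of pages 384-389) into the average memory access time formula:

    #include <stdio.h>

    int main(void) {
        /* Illustrative values only. */
        double hit_time     = 1.0;    /* clock cycles */
        double miss_rate    = 0.05;
        double miss_penalty = 40.0;   /* clock cycles */

        double amat = hit_time + miss_rate * miss_penalty;
        printf("average memory access time = %.1f cycles\n", amat);
        return 0;
    }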
