cache l.
Download
Skip this Video
Loading SlideShow in 5 Seconds..
Cache PowerPoint Presentation
Download Presentation
Cache

Loading in 2 Seconds...

play fullscreen
1 / 23

Cache - PowerPoint PPT Presentation


  • 152 Views
  • Uploaded on

Cache. Cache Design for Performance Comparison Intel Pentium 4 Motorola PowerPC G4 Trends Glossary. Memory Hierarchy / Locality. Design for Performance. Size High performance All memory the system can cache The design of chipset Type Unified/Separated Data+Instruction cache.

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'Cache' - nibaw


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
cache
Cache
  • Cache Design for Performance
  • Comparison
    • Intel Pentium 4
    • Motorola PowerPC G4
  • Trends
  • Glossary
design for performance
Design for Performance
  • Size
    • High performance
      • All memory the system can cache
      • The design of chipset
  • Type
    • Unified/Separated Data+Instruction cache
design for performance4
Design for Performance
  • Size
  • Mapping
  • Replacement algorithms
  • Write policy
  • Block size
  • Number of caches
design for performance5
Design for Performance
  • Write Policy
    • Write back
      • One extra data bit
      • Higher performance than write through
    • Write through
      • multiple CPUs can monitor main memory traffic to keep local to CPU cache up to date
      • Lots of traffic and slow-down write (buffer = big help)
  • * Multi processors : data consistancy in different caches
design for performance6
Design for Performance
  • Block size
    • In general miss rate block size
    • Miss penalty block size
  • Number of caches
    • Cost, performance
pentium 4 l2 cache options
Pentium 4 L2 cache options
  • 256 KB L2 integrated cache,

400 Mhz system bus

  • 512 KB L2 integrated cache

400 or 533 Mhz front side bus

p4 l1 execution trace cache
P4 L1 Execution Trace Cache
  • stores up to 12K decoded micro-ops in the order of program execution
  • increases performance by circumventing decoder in main execution loop
  • efficient storage: branched instructions are not stored
powerpc g4
PowerPC G4
  • Motorola’s MPC7457 Processor: 1.3 GHz
  • Dual processor G4s share cache between processors
  • General market availability: Q4 2003
powerpc g4 l1 cache
PowerPC G4: L1 Cache
  • Harvard architecture: split between data and instruction cache
  • 32 KB 8-way set associative
  • Pseudo-LRU replacement algorithm
  • 32 byte (8 word) L1 cache block
  • 4 instructions per clock cycle (instruction) & 4 words per clock cycle (date)
  • Cache way locking allows performance alogorithms and code to be locked into the cache.
  • Write back/through programmed per page or per block
powerpc g4 l2 cache
PowerPC G4: L2 Cache
  • On-chip, unified instruction and data cache
  • 512 KB, 8-way set associative. Runs at processor speed.
  • 32 bytes/clock cycle communication with L1 cache
  • Write back/through programmed per page or per block
  • Pseudo-LRU replacement algorithm
  • Two 32 byte sectors per line
  • Twice the amount as previous G4s
powerpc g4 l3 cache
PowerPC G4: L3 Cache
  • 2 MB off-chip L2 cache using DDR SDRAM
  • Runs at speeds up to 500 MHz
  • 1, 2, or 4 MB SRAM space
  • 1 or 2 MB cache space
  • 1 MB or 2 MB sectored line size
  • Dedicated bus to processor (4 GB/sec)
  • Dual processors: each has own L3 cache
g4 enhancement plan
G4 enhancement plan
  • Pipelining: Increase #, keep pipelines short
  • Small # of simultaneous instructions in various stages of execution (up to 16)
  • execute instructions in parallel, move them off chip in as few cycles as possible.
drawbacks g4 plan
Drawbacks: G4 plan
  • more execution units added to handle decoding
  • Fewer instructions pipelined at any given time
benefits g4 plan
Benefits: G4 Plan
  • Shorter miss penalties (min. 4 clock cycles)
  • Introduce Mitigating Factor:
    • Added L3 cache with 64 bit bus
  • Increased size of execution units speeds performance “wider and shallower”
p4 enhancement plan
P4 Enhancement Plan
  • Innovative cache implementation
  • “Narrow, but deep” pipelining:
    • higher clock speeds
  • Deep scheduling queues to prevent pipeline “bubbles”
drawbacks p4 plan
Drawbacks: P4 plan
  • Larger miss penalties (min. 19 clock cycles)
    • due to longer pipelining
  • Longer slower pipelines require complicated scheduling
  • No L3 cache
benefits p4 plan
Benefits: P4 Plan
  • Trace Cache significantly improves throughput to execution cycle
  • Faster clock speeds overall
sources
Sources
  • http://www.intel.com
  • http://www.apple.com/powermac/architecture.html
  • “MPC7457 RISC Microprocessor Hardware Specs” http://e-www.motorola.com/webapp/sps/library
  • “Motorola – PowerPC G4” http://www.geek.com
  • “Cache Latencies of the MPC7450” http://www.motorola.com
  • “Motorola Delivers….” http://www.us.design-reuse.com/news/news4852.html
  • “The Fundamentals of Cache” http://www.systemlogic.net/articles/00/10/cache/page2.php