
Increasing the Cache Efficiency by Eliminating Noise


Presentation Transcript


  1. Increasing the Cache Efficiency by Eliminating Noise Philip A. Marshall

  2. Outline • Background • Motivation for Noise Prediction • Concepts of Noise Prediction • Implementation of Noise Prediction • Related Work • Prefetching • Data Profiling • Conclusion

  3. Background • Cache Fetch • On Cache Miss • Prefetch • Exploiting Spatial Locality • Cache words are fetched in blocks • Fetch neighboring block(s) on a cache miss • Results in fewer cache misses • Fetches words that aren’t needed

  4. Background • Cache noise • Words that are fetched into the cache but never used • Cache utilization • The fraction of words in the cache that are used • Represents how efficiently the cache is used
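As a rough sketch (not from the slides), cache utilization follows directly from per-word usage flags collected when blocks are evicted:

```python
def cache_utilization(usage_bits_per_block):
    """Fraction of fetched words that were actually used.

    usage_bits_per_block: one list of 0/1 flags per evicted block,
    one flag per word in that block (illustrative data layout).
    """
    used = sum(sum(bits) for bits in usage_bits_per_block)
    total = sum(len(bits) for bits in usage_bits_per_block)
    return used / total if total else 0.0

# Two 4-word blocks; 5 of the 8 fetched words were used
print(cache_utilization([[1, 1, 0, 1], [1, 0, 1, 0]]))  # 0.625
```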

  5. Motivation for Noise Prediction • Level 1 data cache utilization is only ~57% for the SPEC2K benchmarks [2] • Fetching unused words: • Increases bandwidth requirements between cache levels • Increases hardware and power requirements • Wastes valuable cache space [2] D. Burger et al., "Memory bandwidth limitations of future microprocessors," Proc. ISCA-23, 1996

  6. Motivation for Noise Prediction • Cache block size • Larger blocks • Exploit spatial locality better • Reduce cache tag overhead • Increase bandwidth requirements • Smaller blocks • Reduced cache noise • Any block size results in suboptimal performance

  7. Motivation for Noise Prediction • Sub-blocking • Only portions of the cache blocks are fetched • Decreases tag overhead by associating one tag with many sub-blocks • Words fetched must be in contiguous blocks of fixed size • High miss-rate and cache noise for non-contiguous access patterns
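A toy sketch of why fixed-size sub-blocks over-fetch for non-contiguous access patterns (block and sub-block sizes here are illustrative, not the paper's):

```python
def subblock_mask(needed_mask, words_per_subblock=2, block_words=8):
    """Round an arbitrary word-usage bitmask up to whole sub-blocks."""
    mask = 0
    sub = (1 << words_per_subblock) - 1
    for i in range(0, block_words, words_per_subblock):
        if needed_mask & (sub << i):      # any needed word in this sub-block?
            mask |= sub << i              # then fetch the whole sub-block
    return mask

# Needing only words 0 and 5 still forces two full 2-word sub-blocks:
# two extra (noisy) words are fetched
assert subblock_mask(0b00100001) == 0b00110011
```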

  8. Motivation for Noise Prediction • By predicting which words will actually be used, cache noise can be reduced • But: • Fetching fewer words could increase the number of cache misses

  9. Concepts of Noise Prediction • Selective fetching • For each block, fetch only the words that are predicted to be accessed • If no prediction is available, fetch the entire block • Uses a valid bit and a usage bit per word to track which words have been fetched and which have been used
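A minimal sketch of the per-word valid/usage bookkeeping described above; the class and field names are hypothetical, not from the paper:

```python
class CacheBlock:
    """Block with per-word valid and usage bits for selective fetching."""

    def __init__(self, n_words, fetch_mask):
        self.n_words = n_words
        self.valid = fetch_mask   # bit i set -> word i was fetched
        self.used = 0             # bit i set -> word i was accessed

    def access(self, word):
        bit = 1 << word
        if not (self.valid & bit):
            return False          # miss: the predictor skipped this word
        self.used |= bit          # record usage to train the predictor
        return True

# Fetch only words 0 and 2 of a 4-word block, as a prediction suggested
blk = CacheBlock(4, 0b0101)
assert blk.access(0) and blk.access(2)
assert not blk.access(1)          # word 1 was never fetched -> miss
```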

  10. Concepts of Noise Prediction • Cache Noise Predictors • Phase Context Predictor (PCP) • Based on the usage pattern of the most recently evicted block • Memory Context Predictor (MCP) • Based on the MSBs of the memory address • Code Context Predictor (CCP) • Based on the MSBs of the PC
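A toy code-context predictor indexed by the MSBs of the PC; the table organization and bit widths here are illustrative only (MCP would be identical but keyed on the memory address instead):

```python
class CodeContextPredictor:
    """CCP sketch: word-usage history keyed by the upper bits of the PC."""

    def __init__(self, context_bits=8, pc_bits=32):
        self.shift = pc_bits - context_bits   # keep only the MSBs
        self.table = {}                       # context -> usage bitmask

    def predict(self, pc):
        # None means "no prediction": fall back to fetching the whole block
        return self.table.get(pc >> self.shift)

    def update(self, pc, usage_mask):
        self.table[pc >> self.shift] = usage_mask

ccp = CodeContextPredictor()
ccp.update(0x12345678, 0b0011)
print(ccp.predict(0x12340000))  # same MSBs, so 0b0011 (= 3)
print(ccp.predict(0x99990000))  # different context -> None
```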

  11. Concepts of Noise Prediction • Prediction table size • Larger tables decrease the probability of “no predictions” • Smaller tables use less power • A prediction is considered successful if all the needed words are fetched • If extra words are fetched, still considered a success
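The success criterion above reduces to a bitmask check (a sketch, not the paper's hardware): a prediction fails only if some needed word was left unfetched.

```python
def prediction_successful(predicted_mask, needed_mask):
    """Success iff every needed word was fetched; extra fetched
    words waste bandwidth but do not count as a failure."""
    return (needed_mask & ~predicted_mask) == 0

assert prediction_successful(0b1110, 0b0110)      # superset: success
assert not prediction_successful(0b0110, 0b0111)  # word 0 missing: failure
```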

  12. Concepts of Noise Prediction • Improving Prediction • Miss Initiator Based History (MIBH) • Keep separate histories according to which word in the block caused the miss • Improves predictability if relative position of words accessed is fixed • Example: looping through a struct and accessing only one field
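A sketch of MIBH as a history table keyed on the pair (context, miss-initiating word) rather than the context alone; names and table layout are illustrative:

```python
class MIBHPredictor:
    """Separate usage history per miss-initiating word offset."""

    def __init__(self):
        self.table = {}  # (context, miss_word) -> usage bitmask

    def predict(self, context, miss_word):
        return self.table.get((context, miss_word))

    def update(self, context, miss_word, usage_mask):
        self.table[(context, miss_word)] = usage_mask

p = MIBHPredictor()
# A loop over an array of structs that always touches the field at word 1
p.update(context=0x42, miss_word=1, usage_mask=0b0010)
print(p.predict(0x42, 1))  # 2: same miss initiator, history hits
print(p.predict(0x42, 3))  # None: different miss initiator, no history
```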

  13. Concepts of Noise Prediction • Improving Prediction • OR-ing Previous Two Histories (OPTH) • Increases predictability by looking at more than the most recent access • Reduces cache utilization • OR-ing more than two accesses reduces utilization substantially
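OPTH amounts to OR-ing the two most recent usage bitmasks (a sketch): more words are predicted, so fewer needed words are missed, but some of the extra words go unused, which is why utilization drops.

```python
def opth_prediction(prev_history, prev_prev_history):
    """Predict the union of the last two usage histories."""
    return prev_history | prev_prev_history

# Two recent accesses used different words; predict both sets
assert opth_prediction(0b0011, 0b0100) == 0b0111
```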

  14. Results • Empirically, CCP provides the best results • MIBH greatly increases predictability • OPTH improves predictability only marginally while increasing cache noise • Cache utilization increased from 57% to 92%

  15. Results

  16. Results

  17. Related Work • Existing work focuses on reducing cache misses, not on improving utilization • Sub-blocked caches are used mainly to decrease tag overhead • Some existing work predicts which sub-blocks to load in a sub-blocked cache • No existing techniques predict and fetch non-contiguous words

  18. Related Work

  19. Prefetching • Prefetching improves the cache miss rate • Commonly, prefetching is implemented by also fetching the next block on a cache miss • Prefetching increases cache noise and bandwidth requirements

  20. Prefetching • Noise prediction leads to more intelligent prefetching but requires extra hardware • On average, prefetching with noise prediction leads to less energy consumption • In the worst case, energy requirements increase

  21. Prefetching

  22. Data Profiling • For some benchmarks, few predictions are made • The predictor table is too small to hold all the word-usage histories • Instead of increasing the table size, profile the data • Profiling increases the prediction rate by ~7% • Gains aren't as high as expected

  23. Data Profiling

  24. Analysis of Noise Prediction • Pros • Small increase in miss rate (0.1%) • Decreased power requirements in most cases • Decreased bandwidth requirements between cache levels • Adapts effective block size to access patterns • Dynamic technique, but profiling can be used • Scalable to different predictor sizes

  25. Analysis of Noise Prediction • Cons • Increased hardware overhead • Increases power in the worst case • Not all programs benefit • Profiling provides limited improvement

  26. Other Thoughts • How were the benchmarks chosen? • 6 of 12 integer and 8 of 14 floating-point SPEC2K benchmarks were used • Not all predictors were examined equally • The 22-bit MCP performed slightly worse than the 28-bit CCP • What about a 28-bit MCP? • How can the efficiency of the prediction table be increased?
