
Analysis of a Dynamic Page Remapping Technique to Reduce L2 Misses in an SMT Processor

CSE 240B Class Project, Spring 2005, UCSD. Subhradyuti Sarkar and Siddhartha Saha.


Presentation Transcript


  1. Analysis of a Dynamic Page Remapping Technique to Reduce L2 Misses in an SMT Processor. CSE 240B Class Project, Spring 2005, UCSD. Subhradyuti Sarkar, Siddhartha Saha.

  2. Motivation • Cache misses incur a considerable penalty; an L2 miss penalty is usually orders of magnitude higher than an L1 miss penalty. • SMT processors may be more vulnerable to L2 misses because: • If more than one instance of the same thread runs on the processor, the instances will always collide on the same cache pages. • Even different threads, if compiled by the same compiler, will have similar virtual address ranges for their stack, heap and data segments.

  3. Introduction • In this work, we look at a hybrid hardware/software technique to reduce L2 cache misses in an SMT processor. • We use a set of hardware counters, one per cache page, to keep track of the relative hotness/coldness of the cache pages. • If the miss rate and/or access rate across cache pages becomes skewed beyond a certain threshold, we apply an adaptive algorithm that tries to smooth out the cache utilization.

  4. Contribution Summary • An adaptive algorithm which can detect variation in utilization among cache pages. • Another algorithm that can smooth out the cache utilization, potentially improving cache performance.

  5. Hot/Cold Detection Algorithm • Short Term History • In each epoch, we classify every cache page as HOT, COLD or NEUTRAL: • COLD: access_count[i] < (total_access_count / N) * t_cold • HOT: miss_rate[i] > t_miss && access_count[i] > (total_access_count / N) * t_hot • NEUTRAL: otherwise
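The per-epoch classification above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the default values of t_cold, t_hot and t_miss, and the way miss_rate is derived from counters are all assumptions.

```python
def classify_page(access_count, miss_count, total_access_count, N,
                  t_cold=0.5, t_hot=1.5, t_miss=0.2):
    """Classify one cache page as 'COLD', 'HOT', or 'NEUTRAL' for an epoch.

    N is the number of cache pages; thresholds are illustrative defaults.
    """
    avg_access = total_access_count / N            # mean accesses per page
    miss_rate = miss_count / access_count if access_count else 0.0
    if access_count < avg_access * t_cold:         # well below average use
        return 'COLD'
    if miss_rate > t_miss and access_count > avg_access * t_hot:
        return 'HOT'                               # heavily used and missing often
    return 'NEUTRAL'
```

With 10 pages and 1000 total accesses, a page with 10 accesses is COLD, while a page with 300 accesses and a high miss rate is HOT.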

  6. Hot/Cold Detection Algorithm • Long Term History • We keep an N-element circular history recording the state of each cache page over the last N epochs. • In our simulations, we took N = 4. • Based on the long-term history, we determine when to classify a page as HOT or COLD. • If the numbers of HOT pages and COLD pages are both non-zero, we call the re-coloring algorithm.
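A sketch of the long-term history, assuming a majority vote over the last N epochs; the slide does not say how the per-epoch states are combined, so the majority rule (and all names here) are assumptions.

```python
from collections import deque

N_EPOCHS = 4  # the slide's N = 4


class PageHistory:
    """Circular buffer of the last N per-epoch states for one cache page."""

    def __init__(self, n=N_EPOCHS):
        self.states = deque(maxlen=n)    # maxlen drops the oldest state

    def record(self, state):
        self.states.append(state)        # state is 'HOT', 'COLD' or 'NEUTRAL'

    def long_term_state(self):
        """'HOT' or 'COLD' if a strict majority of epochs agree, else 'NEUTRAL'."""
        for s in ('HOT', 'COLD'):
            if self.states.count(s) > len(self.states) // 2:
                return s
        return 'NEUTRAL'
```

For example, a page that was HOT in three of the last four epochs is classified HOT overall.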

  7. Re-coloring Algorithm • For each cache page, keep track of the virtual pages that access it most frequently. • From each HOT page, move all but one of its frequently accessed virtual pages to COLD pages in the cache. • We exit when the number of HOT or COLD pages reaches zero, or when a maximum number of pages have been re-colored.
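The loop above can be sketched like this. Everything here is a hypothetical rendering of the slide: hot_pages is assumed to map a HOT cache page to its virtual pages sorted most-frequent-first, and the MAX_RECOLOR cap stands in for the unspecified "maximum number of pages".

```python
MAX_RECOLOR = 8   # illustrative cap on pages re-colored per invocation


def recolor(hot_pages, cold_pages):
    """Return (virtual_page, new_cache_page) remappings.

    hot_pages: {hot cache page -> [virtual pages, most frequent first]}
    cold_pages: list of COLD cache pages available as targets
    """
    remaps = []
    cold = list(cold_pages)
    for cache_page, vpages in hot_pages.items():
        for vp in vpages[1:]:            # keep the most frequent vpage in place
            if not cold or len(remaps) >= MAX_RECOLOR:
                return remaps            # no COLD pages left, or cap reached
            remaps.append((vp, cold.pop()))
    return remaps
```

For one HOT page accessed by virtual pages a, b, c (in frequency order) and two COLD pages, b and c are remapped while a stays put.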

  8. Re-Coloring • Ideally, the page in memory should be moved. • Following the idea of Calder et al., we can achieve the same effect by modifying the TLB. • We simulated this in SMTSIM by implementing an address remap module.
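A minimal sketch of the TLB-based remapping idea: rather than copying the page, the translation is edited so the virtual page maps to a physical frame whose cache color is the target color. The page size, number of colors, and all class and method names are assumptions for illustration; they are not SMTSIM's actual structures.

```python
PAGE_SIZE = 4096
NUM_COLORS = 64   # assumed number of cache pages (colors) in the L2


class RemapTLB:
    """Toy translation table that supports re-coloring a virtual page."""

    def __init__(self):
        self.map = {}   # virtual page number -> physical frame number

    def translate(self, vaddr):
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        return self.map[vpn] * PAGE_SIZE + offset

    def recolor(self, vpn, target_color, free_frames):
        """Point vpn at a free frame whose color matches target_color."""
        for f in free_frames:
            if f % NUM_COLORS == target_color:   # frame's cache color
                self.map[vpn] = f
                free_frames.remove(f)
                return True
        return False   # no free frame of that color
```

After a successful recolor, subsequent translations of the virtual page resolve to the new frame, so its lines land in the COLD cache page without any data being copied.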

  9. Changes to SMTSIM • [Block diagram: the per-processor IC, DC and MAF units feed a Translation Unit containing the Remap and Hot/Cold Detection modules, which sits in front of the L2 Cache.]

  10. Result

  11. Future Work • These experiments did not produce very good results, but there is further scope for improvement. • Many more design choices exist for the re-coloring algorithm; ours was a basic one. • Based on the HOT/COLD information, the effect of skewed cache indexing could also be investigated.
