By : Ido Shayevitz and Yoav Shargil Supervisor: Zvika Guz

“NAHALAL: Cache Organization for Chip Multiprocessors” New LSU Policy


Presentation Transcript


  1. “NAHALAL: Cache Organization for Chip Multiprocessors” New LSU Policy. By: Ido Shayevitz and Yoav Shargil. Supervisor: Zvika Guz

  2. NAHALAL ARCHITECTURE The NAHALAL architecture defines the organization of the memory banks of the L2 cache. Each processor has a private backyard bank, and all processors share a small bank. The architecture is based on the hot shared line phenomenon.

  3. LSU Improvement
     • Placement Policy
     • Replacement Policy from Private Bank: LRU
     • Replacement Policy from Public Bank: LSU
     The NAHALAL LSU policy selects the Least Shared Used line to evict from the public bank.

  4. LSU Implementation
     • A shift-register with N cells for each line.
     • Each cell in the shift-register holds a CPU number.
     • On eviction by CPUi: for each shift-register, XOR each cell with the ID of CPUi. The shift-register for which the XOR produces 0 is the chosen one. If none produces 0, fall back to regular LRU.
     • To reduce memory overhead, define N = 4. Therefore 2^14 * 4 * 3 = 0.1875 MB → 18.75% memory overhead.
     • A simple, short-time algorithm in HW.
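
The selection rule above can be sketched in software as follows. This is a minimal model, not the authors' hardware design. It assumes that "the XOR produces 0" means every cell of a line's shift-register equals the evicting CPU's ID, that only fully populated registers are considered, and that ties are broken in LRU order. The names `PublicBankLine`, `choose_victim`, and `history_storage_bits` are hypothetical. The storage helper reads the three factors of the slide's product as 2^14 lines, N = 4 cells, and 3 bits per CPU ID, which is also an assumption since the slide does not label them.

```python
from collections import deque

N_CELLS = 4  # N = 4, as chosen on the slide


class PublicBankLine:
    """One line of the public bank with its access-history shift-register."""

    def __init__(self, tag, accesses=(), n_cells=N_CELLS):
        self.tag = tag
        # deque(maxlen=n) behaves like a shift-register: once full,
        # appending a new CPU ID shifts the oldest one out.
        self.history = deque(maxlen=n_cells)
        for cpu_id in accesses:
            self.record_access(cpu_id)

    def record_access(self, cpu_id):
        self.history.append(cpu_id)


def choose_victim(lines_in_lru_order, evicting_cpu):
    """Pick the line to evict; lines are ordered least to most recently used."""
    for line in lines_in_lru_order:
        full = len(line.history) == line.history.maxlen
        # Assumed reading of the slide: XOR of every cell with the CPU ID is 0,
        # i.e. the last N accesses all came from the evicting CPU.
        if full and all((cell ^ evicting_cpu) == 0 for cell in line.history):
            return line
    # No shift-register matched: regular LRU fallback.
    return lines_in_lru_order[0]


def history_storage_bits(num_lines=2**14, n_cells=N_CELLS, bits_per_cell=3):
    """Total shift-register storage under the assumed reading of the product."""
    return num_lines * n_cells * bits_per_cell


if __name__ == "__main__":
    bank = [
        PublicBankLine("A", [0, 1, 2, 3]),  # shared by four CPUs
        PublicBankLine("B", [2, 2, 2, 2]),  # recently used only by CPU 2
        PublicBankLine("C", [1, 1, 1, 1]),  # recently used only by CPU 1
    ]
    print(choose_victim(bank, evicting_cpu=2).tag)
    print(history_storage_bits(), history_storage_bits() / 2**20)
```

Under these assumptions, a line whose recent accesses all came from the evicting CPU is treated as the least shared one, so genuinely shared lines stay in the public bank.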

  5. Simulation Structure in Simics Using a Python script, we defined:

  6. Writing Benchmarks Benchmarks are written in the simulated target console:

  7. Writing Benchmarks
     • Threads are created with the pthread library.
     • Each thread is bound to a CPU using the sched library.
     • Parallel code is written in the benchmark.
     • OS code and pthread library code also contribute parallel code.
     • Each benchmark is run twice: first without LSU, then with LSU.
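
The benchmark sources are not shown in the slides, and they use pthread and sched directly. As a rough Python analogue only, the sketch below starts several threads, pins each one to a CPU with `os.sched_setaffinity` (available on Linux), and has them all update one shared counter so that a single piece of data is touched from several CPUs. The function names and the shared-counter workload are assumptions made for illustration.

```python
import os
import threading


def worker(cpu, shared, lock, iterations):
    # Pin the calling thread to one CPU where the OS supports it (Linux).
    # Pinning is best effort in this sketch; failures are ignored.
    if hasattr(os, "sched_setaffinity"):
        try:
            os.sched_setaffinity(0, {cpu})
        except OSError:
            pass
    for _ in range(iterations):
        with lock:
            shared["counter"] += 1  # data shared by all threads


def run_benchmark(num_threads=4, iterations=1000):
    # Use only CPUs this process is allowed to run on.
    if hasattr(os, "sched_getaffinity"):
        cpus = sorted(os.sched_getaffinity(0))
    else:
        cpus = [0]
    shared = {"counter": 0}
    lock = threading.Lock()
    threads = [
        threading.Thread(
            target=worker,
            args=(cpus[i % len(cpus)], shared, lock, iterations),
        )
        for i in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared["counter"]


if __name__ == "__main__":
    print(run_benchmark())
```

Because of Python's global interpreter lock this does not produce the parallel memory traffic a pthread benchmark would; it only illustrates the thread-per-CPU structure.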

  8. Collecting Statistics
     Cache statistics: l2c
     -----------------
     Total number of transactions: 610349
     Total memory stall time: 31402835
     Total memory hit stall time: 28251635
     Device data reads (DMA): 0
     Device data writes (DMA): 0
     Uncacheable data reads: 17
     Uncacheable data writes: 30738
     Uncacheable instruction fetches: 0
     Data read transactions: 403488
     Total read stall time: 17488735
     Total read hit stall time: 14383135
     Data read remote hits: 0
     Data read misses: 10352
     Data read hit ratio: 97.43%
     Instruction fetch transactions: 0
     Instruction fetch misses: 0
     Data write transactions: 176106
     Total write stall time: 4687600
     Total write hit stall time: 4687600
     Data write remote hits: 0
     Data write misses: 0
     Data write hit ratio: 100.00%
     Copy back transactions: 0
     Number of replacments in the middle (NAHALAL): 557
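
The metrics quoted in the results can be derived from counters like these. The sketch below assumes that "average stall time per transaction" is the total memory stall time divided by the total number of transactions, and that the replacement rate is the number of replacements in the middle divided by the total number of transactions; the slides do not state either definition. The dictionary keys and the function name are hypothetical, while the values are copied from the listing above.

```python
# Counters copied from the l2c statistics listing above.
stats = {
    "transactions": 610349,
    "memory_stall_time": 31402835,
    "read_transactions": 403488,
    "read_misses": 10352,
    "middle_replacements": 557,
}


def derived_metrics(s):
    """Compute per-transaction metrics from raw cache counters."""
    reads = s["read_transactions"]
    return {
        # Assumed definition: total stall time / total transactions.
        "avg_stall_per_transaction": s["memory_stall_time"] / s["transactions"],
        # Hits are read transactions that did not miss.
        "read_hit_ratio_pct": 100.0 * (reads - s["read_misses"]) / reads,
        # Assumed definition: middle replacements / total transactions.
        "middle_replacement_pct": 100.0 * s["middle_replacements"] / s["transactions"],
    }


if __name__ == "__main__":
    for name, value in derived_metrics(stats).items():
        print(f"{name}: {value:.4f}")
```

With these definitions the read hit ratio reproduces the 97.43% printed in the listing, and the replacement rate rounds to 0.09%, the figure the results quote for one of the runs with LSU.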

  9. Results
     1. Improvement of 54% in average stall time per transaction.
     2. Improvement of 61% in average stall time per transaction.
     3. Without LSU, 8.375% of the transactions cause a replacement in the middle; with LSU, only 0.09%. Improvement of ∆ = 8.28 percentage points.
     4. Without LSU, 8.75% of the transactions cause a replacement in the middle; with LSU, only 0.02%. Improvement of ∆ = 8.73 percentage points.
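
The ∆ values in results 3 and 4 are differences between two percentages. A short sketch of that arithmetic, using only the percentages quoted above, also gives the relative reduction in replacements, which the slides do not report; the function name is hypothetical.

```python
def replacement_improvement(without_lsu_pct, with_lsu_pct):
    """Return (difference in percentage points, relative reduction in %)."""
    delta_points = without_lsu_pct - with_lsu_pct
    relative_reduction_pct = 100.0 * delta_points / without_lsu_pct
    return delta_points, relative_reduction_pct


if __name__ == "__main__":
    # Percentages taken from results 3 and 4.
    for label, before, after in [("result 3", 8.375, 0.09), ("result 4", 8.75, 0.02)]:
        delta, relative = replacement_improvement(before, after)
        print(f"{label}: {delta:.3f} points, {relative:.1f}% relative reduction")
```

With the rounded inputs, result 3 comes out at about 8.285 points, close to the 8.28 on the slide, which was presumably computed from unrounded rates.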

  10. Conclusions
     • The LSU policy significantly improves the average stall time per transaction. Therefore, the LSU policy implemented in the NAHALAL architecture significantly reduces the number of cycles for a benchmark.
     • The LSU policy significantly reduces the number of replacements in the middle. Therefore, the LSU policy implemented in the NAHALAL architecture better keeps the hot shared lines in the public bank.
     • In our implementation, LRU is activated if LSU does not find a line. Therefore, the LSU policy as we implemented it is always preferable to LRU.
