
Effects of wrong path mem. ref. in CC MP Systems

Gökay Burak AKKUŞ

Cmpe 511 – Computer Architecture



About the papers

R. Sendag, A. Yilmazer, J. J. Yi, and A. K. Uht. Quantifying and reducing the effects of wrong-path memory references in cache-coherent multiprocessor systems. International Parallel and Distributed Processing Symposium (IPDPS), 2006.

O. Mutlu, H. Kim, D. Armstrong, and Y. Patt. Cache filtering techniques to reduce the negative impact of useless speculative memory references on processor performance. Symposium on Computer Architecture and High Performance Computing, 2004.

O. Mutlu, H. Kim, D. Armstrong, and Y. Patt. Understanding the effects of wrong-path memory references on processor performance. Workshop on Memory Performance Issues, 2004.



What is it all about?

How wrong-path memory accesses affect

the cache coherence traffic,

the state transitions,

the resource utilization.

Proposes

a filtering mechanism

and a replacement policy.



Subjects

SMPs: Shared-memory MultiProcessor systems

Cache Coherence

Branch prediction and prefetching

Wrong paths



Cache Coherence Solutions

Snooping Solution (Snoopy Bus):

Send all requests for data to all processors

Processors snoop to see if they have a copy and respond accordingly

Requires broadcast, since caching information is at processors

Works well with bus (natural broadcast medium)

Dominates for small scale machines (most of the market)

Directory-Based Schemes

Keep track of what is being shared in one (logically) centralized place

Distributed memory => distributed directory for scalability (avoids bottlenecks)

Send point-to-point requests to processors via network

Scales better than Snooping

Actually existed BEFORE Snooping-based schemes
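As a rough illustration of the difference, here is a minimal sketch (my own, not from the papers) contrasting broadcast-based snooping with a directory that keeps a per-block sharer list and sends point-to-point messages:

```python
# Hypothetical sketch: message counts for snooping vs. directory coherence.
# Not from the papers; it only illustrates why directories scale better than broadcast.

class SnoopyBus:
    def __init__(self, n_procs):
        self.n_procs = n_procs

    def read_request(self, block):
        # Every request is broadcast: all other caches must snoop it.
        return self.n_procs - 1          # messages on the bus

class Directory:
    def __init__(self):
        self.sharers = {}                # block -> set of processor ids

    def read_request(self, block, requestor):
        # Point-to-point: only caches on the sharer list are contacted.
        holders = self.sharers.setdefault(block, set())
        messages = len(holders)          # forward/invalidate only these
        holders.add(requestor)
        return messages

if __name__ == "__main__":
    bus, directory = SnoopyBus(n_procs=16), Directory()
    print("snoopy messages   :", bus.read_request(block=0x40))       # 15
    directory.read_request(block=0x40, requestor=0)
    print("directory messages:", directory.read_request(0x40, 1))    # 1
```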



Cache Coherence Protocols

MSI (Modified, Shared, Invalid)

MESI (Modified, Shared, Exclusive, Invalid)

MOESI (Modified, Owned, Shared, Exclusive, Invalid)
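The three protocols differ only in which stable states a cached block may hold. A small illustrative sketch of the state sets, assuming the usual meanings of the state names:

```python
# Hypothetical sketch: the stable block states of MSI, MESI and MOESI,
# with the property each state implies. Illustrative only.
from enum import Enum

class State(Enum):
    M = "Modified"    # only copy, dirty
    O = "Owned"       # shared, dirty; this cache answers requests for the block
    E = "Exclusive"   # only copy, clean
    S = "Shared"      # possibly many copies; clean, or owned (dirty) elsewhere
    I = "Invalid"     # no valid copy

PROTOCOL_STATES = {
    "MSI":   {State.M, State.S, State.I},
    "MESI":  {State.M, State.E, State.S, State.I},
    "MOESI": {State.M, State.O, State.E, State.S, State.I},
}

def may_be_dirty(state):
    # Dirty data lives in M (exclusive) or O (shared) copies.
    return state in (State.M, State.O)

if __name__ == "__main__":
    for name, states in PROTOCOL_STATES.items():
        print(name, sorted(s.name for s in states))
```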



Wrong-path effects

Replacements

Writebacks

Invalidations

Cache Block State Transitions

Data/Bus Traffic and Coherence Transactions

Power Consumption

Resource Contention



Replacements

Cause:

a speculatively executed load instruction

on a mispredicted path

brings a cache block into the data cache;

one of the existing cache blocks is replaced by the new one.
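A minimal sketch of the resulting cache pollution, using an assumed 2-way set with LRU replacement (not the simulator from the papers):

```python
# Hypothetical sketch of wrong-path cache pollution in one 2-way set.
# Illustrative only; not the simulation infrastructure used in the papers.

class CacheSet:
    def __init__(self, ways=2):
        self.ways = ways
        self.blocks = []                 # LRU order: index 0 = oldest

    def fill(self, tag):
        victim = None
        if len(self.blocks) == self.ways:
            victim = self.blocks.pop(0)  # evict the LRU block
        self.blocks.append(tag)
        return victim

if __name__ == "__main__":
    s = CacheSet()
    s.fill("useful_A")                   # correct-path data
    s.fill("useful_B")                   # correct-path data
    evicted = s.fill("wrong_path_X")     # speculative fill on a mispredicted path
    print("evicted by wrong-path fill:", evicted)   # useful_A is lost
```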



Writebacks

When a replacement is caused by a wrong-path reference,

the evicted cache block may be in state M (exclusive, dirty) or O (shared, dirty).

Before this block is removed from the cache, a writeback occurs.

For MSI and MESI:

if a requested cache block is in state M, it is written back to memory before being sent to the requestor;

then its state is set to S in the original owner’s cache.
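A hedged sketch of this MSI/MESI behaviour, with assumed names and structure rather than the papers' code: a remote read to a Modified block forces a writeback and a downgrade to S.

```python
# Hypothetical sketch: servicing a read request to a Modified block in MSI/MESI.
# The owner must write the dirty block back before sharing it.

def service_read(owner_state, memory_writebacks):
    """Return the owner's new state after a remote read request."""
    if owner_state == "M":
        memory_writebacks.append("writeback")   # dirty data flushed to memory
        return "S"                              # owner keeps a shared, clean copy
    return owner_state                          # E or S copies need no writeback

if __name__ == "__main__":
    writebacks = []
    new_state = service_read("M", writebacks)
    print(new_state, len(writebacks))           # S 1
    # Replacing an M (or, in MOESI, O) block triggers the same writeback,
    # even when the block was only brought in by a wrong-path reference.
```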



Invalidations

Assume the MOESI protocol.

A wrong-path load instruction accesses a cache block that was modified by another processor.

The owner sets its state to O;

the requestor gets the block in state S.

If the owner then needs to write to that block,

it changes the state from O back to M

and invalidates all other copies.
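A hypothetical sketch of this MOESI sequence, with the state names taken from the slide but a structure of my own:

```python
# Hypothetical sketch of the MOESI sequence caused by a wrong-path load.
# Illustrative only; not the papers' code.

caches = {"owner": "M", "requestor": "I"}        # block state in each cache
events = []

def wrong_path_load():
    # The wrong-path load hits a block modified in another cache:
    caches["owner"] = "O"                        # owner supplies data, keeps it dirty
    caches["requestor"] = "S"                    # requestor gets a shared copy
    events.append("cache-to-cache transfer")

def owner_writes():
    # The owner upgrades again and invalidates every other copy:
    caches["owner"] = "M"
    for c in caches:
        if c != "owner" and caches[c] == "S":
            caches[c] = "I"
            events.append(f"invalidate {c}")

if __name__ == "__main__":
    wrong_path_load()
    owner_writes()
    print(caches, events)
    # Both extra transitions (M->O, O->M) and the invalidation exist only
    # because of the wrong-path reference.
```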



Cache Block State Transitions

Two extra state transitions in the owner’s cache:

when the modified block is requested, its state changes from M to O;

when that block is modified again, its state returns to M.



Data/Bus Traffic and Coherence Transactions

Extra replacements, writebacks, invalidations, and state transitions

cause additional L1 and L2 cache accesses,

so data/bus traffic also increases.

Snoop or directory requests increase the traffic further.



Power Consumption

Because there are

unnecessary snoops,

traffic overhead, and

state transition overhead,

power consumption increases.

Example:

filtering unnecessary snoops may reduce L2 cache power by 30% (see Moshovos et al.).



Resource Contention

Wrong-path memory accesses compete with correct-path memory accesses for the multiprocessor’s resources.

Additional cache coherence transactions may cause the service buffers to fill up more often.

Result: an increased chance of deadlock.



Simulation

SPLASH-2 benchmark suite

em3d simulation benchmark

MOSI and MOESI protocols used

16-processor SPARC v9



Statement based on experiments

Mispredicted branches are resolved before 94% of wrong-path L2 misses complete.

Therefore, whether an L2 cache miss is speculative is usually known before the block is placed into the L2 cache. [REF2]



Reducing Cache Pollution

Filtering

Filtering applied to L2 cache

Observation:

if a speculatively-fetched cache block is not used while it resides in the L1 cache, it is likely that the block will not be used at all, or will not be used before being evicted from the L2 cache.

In this mechanism,

all memory references made by wrong-path instructions or the prefetcher are fetched only into the first-level cache;

the processor monitors whether they are referenced by non-speculative (correct-path) instructions.

Based on this observation, the processor may choose not to write the block into the L2 cache, or may adopt a policy that gives lower priority to the unused speculatively-fetched block.
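A simplified sketch of this filtering idea under my own assumptions (block layout, method names): speculative fills stay in L1, and only blocks later touched by the correct path are installed into L2 on eviction.

```python
# Simplified sketch of the L2 filtering idea. Assumed structure for
# illustration, not the papers' implementation.

class L1Block:
    def __init__(self, tag, speculative):
        self.tag = tag
        self.speculative = speculative   # filled by a wrong-path load or the prefetcher?
        self.used_by_correct_path = False

class FilteringL1:
    def __init__(self):
        self.blocks = {}                 # tag -> L1Block
        self.l2 = set()                  # tags installed in L2

    def fill(self, tag, speculative):
        self.blocks[tag] = L1Block(tag, speculative)

    def correct_path_access(self, tag):
        if tag in self.blocks:
            self.blocks[tag].used_by_correct_path = True

    def evict(self, tag):
        blk = self.blocks.pop(tag)
        # Filter: skip the L2 install for speculative blocks never used by the
        # correct path (or, as a softer policy, insert them with low priority).
        if not blk.speculative or blk.used_by_correct_path:
            self.l2.add(tag)

if __name__ == "__main__":
    c = FilteringL1()
    c.fill("A", speculative=True)        # wrong-path fill, never used
    c.fill("B", speculative=True)        # wrong-path fill, later used
    c.correct_path_access("B")
    c.evict("A")
    c.evict("B")
    print(sorted(c.l2))                  # only 'B' reaches L2
```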



Wrong Path Aware Replacement Policy

When a block is brought into the cache, it is marked as having been fetched either on the correct path or on the wrong path.

When a block needs to be evicted,

wrong-path blocks are evicted first, on an LRU basis if there are multiple wrong-path blocks.
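A minimal sketch of this replacement policy; the data structures are assumptions for illustration, not the papers' implementation.

```python
# Minimal sketch of the wrong-path-aware replacement policy.
# The set layout is assumed for illustration only.

def choose_victim(blocks):
    """blocks: list in LRU order (oldest first); each has a 'wrong_path' flag."""
    # Prefer evicting wrong-path blocks; among those, take the least recently used.
    for blk in blocks:
        if blk["wrong_path"]:
            return blk
    # No wrong-path block present: fall back to ordinary LRU.
    return blocks[0]

if __name__ == "__main__":
    cache_set = [
        {"tag": "A", "wrong_path": False},   # LRU correct-path block
        {"tag": "B", "wrong_path": True},    # wrong-path block
        {"tag": "C", "wrong_path": False},   # MRU correct-path block
    ]
    print(choose_victim(cache_set)["tag"])   # 'B' is evicted first
```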



Performance Evaluation



Conclusions & Critics

IPC (instructions per cycle) can be used as the metric.

In some cases, wrong-path execution positively affects overall performance:

mcf, parser, and perlbmk.

In other cases, it has a significantly negative effect:

vpr and gcc

To model or not to model

especially for future systems with longer memory interconnect latencies

and processors with larger instruction windows.

The real effect:

Cache pollution

In SMP case especially

For a workload with many cache-to-cache transfers, wrong-path memory references can significantly affect the coherence actions.

The proposed solutions have not yet been studied in depth.



References

R. Sendag, A. Yilmazer, J. J. Yi, and A. K. Uht. Quantifying and reducing the effects of wrong-path memory references in cache-coherent multiprocessor systems. International Parallel and Distributed Processing Symposium (IPDPS), 2006.

O. Mutlu, H. Kim, D. Armstrong, and Y. Patt. Cache filtering techniques to reduce the negative impact of useless speculative memory references on processor performance. Symposium on Computer Architecture and High Performance Computing, 2004.

O. Mutlu, H. Kim, D. Armstrong, and Y. Patt. Understanding the effects of wrong-path memory references on processor performance. Workshop on Memory Performance Issues, 2004.

