Multiprocessors—Cache Coherency, Snooping Protocol

jacoba

Presentation Transcript


  1. Multiprocessors—Cache Coherency, Snooping Protocol

  2. Small-Scale Shared Memory
     • Caches serve to:
       • Increase bandwidth versus bus/memory
       • Reduce latency of access
     • Valuable for both private data and shared data
     • What about cache consistency?

  3. The Problem of Cache Coherency
     • Value of X in memory is 1
     • CPU A reads X – its cache now contains 1
     • CPU B reads X – its cache now contains 1
     • CPU A stores 0 into X
       • CPU A’s cache contains a 0
       • CPU B’s cache contains a 1
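The sequence on this slide can be replayed with a toy model. This is a minimal sketch, assuming two private caches that never communicate and a store that only updates the local copy; the names `memory`, `cache_a`, `cache_b`, `read`, and `write` are illustrative and do not come from the slides.

```python
# Toy model: two private caches with NO coherence mechanism (assumption).
memory = {"X": 1}
cache_a = {}
cache_b = {}

def read(cache, addr):
    # On a miss, fill the cache from memory; on a hit, return the cached copy.
    if addr not in cache:
        cache[addr] = memory[addr]
    return cache[addr]

def write(cache, addr, value):
    # Update only the local copy; nothing is sent to the other cache.
    cache[addr] = value

read(cache_a, "X")      # CPU A reads X: its cache now contains 1
read(cache_b, "X")      # CPU B reads X: its cache now contains 1
write(cache_a, "X", 0)  # CPU A stores 0 into X

print("A sees", read(cache_a, "X"), "and B sees", read(cache_b, "X"))
```

After the store, the two caches disagree about X, which is the incoherent state the slide describes.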

  4. What Does Coherency Mean?
     • Informally:
       • Any read must return the most recent write
       • Too strict and very difficult to implement
     • Better:
       • Any write must eventually be seen by a read
       • All writes are seen in order (“serialization”)
     • Two rules to ensure this:
       • If P writes x and P1 reads it, P’s write will be seen if the read and write are sufficiently far apart
       • Writes to a single location are serialized: seen in one order
         • Latest write will be seen
         • Otherwise could see writes in illogical order (could see older value after a newer value)
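The serialization rule can be expressed as a small check on what one processor observes. This is a sketch only: it assumes every write to the location stores a distinct value, and `sees_writes_in_order` is a hypothetical helper, not something defined in the presentation.

```python
def sees_writes_in_order(write_order, observed):
    """Return True if `observed` never shows an older value after a newer one.

    write_order: values written to one location, in their serialized order
                 (assumed distinct so each value identifies one write).
    observed:    values returned by one processor's successive reads.
    """
    position = {value: i for i, value in enumerate(write_order)}
    last_seen = -1
    for value in observed:
        index = position[value]
        if index < last_seen:
            return False  # an older write showed up after a newer one
        last_seen = index
    return True

writes = [0, 1, 2]
print(sees_writes_in_order(writes, [0, 0, 2]))  # skipping a write is allowed
print(sees_writes_in_order(writes, [0, 2, 1]))  # older value after a newer one
```

A reader may miss an intermediate write, but once it has seen a newer value it must not go back to an older one.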

  5. Potential Solutions
     • Snooping Solution (Snoopy Bus):
       • Send all requests for data to all processors
       • Processors snoop to see if they have a copy and respond accordingly
       • Requires broadcast, since caching information is at processors
       • Works well with bus (natural broadcast medium)
     • Directory-Based Schemes
       • Keep track of what is being shared in one centralized place
       • Distributed memory => distributed directory
       • Send point-to-point requests to processors
       • Scales better than snooping
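The scaling difference comes from who has to be contacted on a request. The sketch below is a deliberately simplified illustration: the directory is modelled as a plain mapping from address to the set of sharers, and `snoop_targets` and `directory_targets` are hypothetical names chosen for this example.

```python
def snoop_targets(requester, processors):
    # Snooping: the request is broadcast, so every other processor must look at it.
    return [p for p in processors if p != requester]

def directory_targets(requester, directory, addr):
    # Directory: look up who shares the block and contact only those processors.
    return sorted(directory.get(addr, set()) - {requester})

processors = [f"P{i}" for i in range(8)]
directory = {"X": {"P0", "P3"}}  # assumed sharing state for block X

print(len(snoop_targets("P0", processors)))        # grows with processor count
print(directory_targets("P0", directory, "X"))     # depends only on the sharers
```

With snooping the number of recipients grows with the machine; with a directory it depends on how many caches actually hold the block.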

  6. Basic Snoopy Protocols
     • Write Invalidate Protocol:
       • Multiple readers, single writer
       • Write to shared data: an invalidate is sent to all caches, which snoop and invalidate any copies
       • Read miss:
         • Write-through: memory is always up-to-date
         • Write-back: snoop in caches to find most recent copy
     • Write Broadcast Protocol:
       • Write to shared data: broadcast on bus, processors snoop, and update copies
       • Read miss: memory is always up-to-date
     • Write serialization: bus serializes requests
       • Bus is single point of arbitration

  7. Write Invalidate
     • Contents of memory location X = 0.
     • CPU A reads X – its cache contains 0.
     • CPU B reads X – its cache contains 0.
     • CPU A writes a 1 to X – CPU B’s cache contents are invalidated, memory contains a stale value.
     • CPU B reads X – CPU A responds with the value 1, CPU A aborts the memory request, the contents of B’s cache and memory are updated to 1.
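The trace on this slide can be reproduced with a small simulation of write invalidate over write-back caches. It is a sketch under simplifying assumptions: each block holds a single value, the bus is modelled as direct method calls, and the class names `Cache` and `InvalidateSystem` are invented for illustration.

```python
class Cache:
    def __init__(self, name):
        self.name = name
        self.data = {}      # addr -> value for valid blocks
        self.dirty = set()  # addrs whose cached value is newer than memory

class InvalidateSystem:
    def __init__(self, memory, caches):
        self.memory = memory
        self.caches = caches

    def read(self, cache, addr):
        if addr in cache.data:
            return cache.data[addr]  # read hit
        # Read miss: other caches snoop; a dirty owner supplies the block,
        # and memory is updated at the same time.
        for other in self.caches:
            if other is not cache and addr in other.dirty:
                self.memory[addr] = other.data[addr]
                other.dirty.discard(addr)
        cache.data[addr] = self.memory[addr]
        return cache.data[addr]

    def write(self, cache, addr, value):
        # Invalidate every other copy, then write locally (write-back).
        for other in self.caches:
            if other is not cache:
                other.data.pop(addr, None)
                other.dirty.discard(addr)
        cache.data[addr] = value
        cache.dirty.add(addr)

memory = {"X": 0}
cpu_a, cpu_b = Cache("A"), Cache("B")
system = InvalidateSystem(memory, [cpu_a, cpu_b])

system.read(cpu_a, "X")            # A's cache contains 0
system.read(cpu_b, "X")            # B's cache contains 0
system.write(cpu_a, "X", 1)        # B's copy is invalidated
stale_memory = memory["X"]         # memory still holds the old value
value_b = system.read(cpu_b, "X")  # A supplies the block; memory is updated
print(stale_memory, value_b, memory["X"])
```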

  8. Write Broadcast
     • Contents of memory location X = 0.
     • CPU A reads X – its cache contains 0.
     • CPU B reads X – its cache contains 0.
     • CPU A writes a 1 to X – Write broadcast of X updates CPU B’s cache contents and memory contents to 1.
     • CPU B reads X – value is available in cache.
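For comparison, the same trace under write broadcast might look like the sketch below. It assumes one value per block and models the broadcast as a loop over the other caches; `UpdateSystem` is a hypothetical name used only for this example.

```python
class UpdateSystem:
    def __init__(self, memory):
        self.memory = memory
        self.caches = {}  # cpu name -> {addr: value}

    def read(self, cpu, addr):
        cache = self.caches.setdefault(cpu, {})
        if addr not in cache:
            cache[addr] = self.memory[addr]  # memory is always up to date
        return cache[addr]

    def write(self, cpu, addr, value):
        # Write locally, then broadcast: memory and every other cached
        # copy of the block are updated in place.
        self.caches.setdefault(cpu, {})[addr] = value
        self.memory[addr] = value
        for other, cache in self.caches.items():
            if other != cpu and addr in cache:
                cache[addr] = value

memory = {"X": 0}
system = UpdateSystem(memory)

system.read("A", "X")      # A's cache contains 0
system.read("B", "X")      # B's cache contains 0
system.write("A", "X", 1)  # broadcast updates B's cache and memory
b_copy_before_read = system.caches["B"]["X"]
print(b_copy_before_read, memory["X"], system.read("B", "X"))
```

B's copy already holds the new value before B reads it, so the final read is a cache hit.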

  9. Basic Snoopy Protocols
     • Write Invalidate versus Broadcast:
       • Invalidate requires one transaction per write-run (a sequence of writes by one processor with no intervening access by another)
       • Invalidate uses spatial locality: one transaction per block
       • Broadcast has lower latency between write and read
       • Broadcast: bandwidth (increased) vs. latency (decreased) tradeoff
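The tradeoff can be made concrete by counting bus transactions for a trace of accesses to one shared block. This is a rough sketch: it assumes one bus transaction per miss, per invalidate, and per broadcast write, and `count_bus_transactions` is a hypothetical helper rather than anything from the slides.

```python
def count_bus_transactions(trace, protocol):
    """Count bus transactions for accesses to a single shared block.

    trace:    list of (cpu, op) pairs, op is "r" or "w"
    protocol: "invalidate" or "update"
    """
    holders = set()  # CPUs that currently hold a valid copy
    owner = None     # CPU holding the block exclusively (invalidate only)
    transactions = 0
    for cpu, op in trace:
        if op == "r":
            if cpu not in holders:
                transactions += 1  # read miss goes on the bus
                holders.add(cpu)
                owner = None       # block is shared again
        elif protocol == "update":
            transactions += 1      # every write is broadcast
            holders.add(cpu)
        elif owner != cpu:
            transactions += 1      # first write of a write-run invalidates
            holders = {cpu}
            owner = cpu
        # Later writes by the owner stay in its cache: no bus traffic.
    return transactions

# Both CPUs read, A performs a run of four writes, then B reads.
trace = [("A", "r"), ("B", "r")] + [("A", "w")] * 4 + [("B", "r")]
print(count_bus_transactions(trace, "invalidate"))
print(count_bus_transactions(trace, "update"))
```

For this trace the model gives 4 transactions under invalidate and 6 under broadcast, but under broadcast B's final read is a hit, while under invalidate it is a miss that must fetch the block.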

  10. An Example Snoopy Protocol
      • Invalidation protocol, write-back cache.
      • Each cache block is in one state:
        • Shared: block can be read
        • OR Exclusive: cache has the only copy, it’s writeable, and dirty
        • OR Invalid: block contains no data
      • Read miss of a block in exclusive state will change state in the owning cache to shared.
      • Transition to exclusive state requires a write miss to be placed on the bus.
      • Write hit to a shared block is treated as a write miss, for simplicity.
      • Memory block in shared state is always up to date in memory.
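The three states and the rules above can be written as two transition tables, one for CPU requests and one for snooped bus requests. This is a minimal sketch that tracks a single block per cache and ignores replacement (conflict) misses; the table and function names are assumptions made for this example.

```python
# (state, CPU request) -> (next state, message placed on the bus or None)
CPU_TRANSITIONS = {
    ("invalid", "read"): ("shared", "read miss"),
    ("invalid", "write"): ("exclusive", "write miss"),
    ("shared", "read"): ("shared", None),
    ("shared", "write"): ("exclusive", "write miss"),  # write hit treated as miss
    ("exclusive", "read"): ("exclusive", None),
    ("exclusive", "write"): ("exclusive", None),
}

# (state, snooped bus message) -> (next state, must write the block back?)
BUS_TRANSITIONS = {
    ("shared", "read miss"): ("shared", False),
    ("shared", "write miss"): ("invalid", False),
    ("exclusive", "read miss"): ("shared", True),
    ("exclusive", "write miss"): ("invalid", True),
}

def step(states, cpu, op):
    """Apply one CPU request to `states` (cpu -> state); return the bus message."""
    states[cpu], bus_message = CPU_TRANSITIONS[(states[cpu], op)]
    if bus_message is not None:
        for other in states:
            if other != cpu and states[other] != "invalid":
                states[other], _write_back = BUS_TRANSITIONS[(states[other], bus_message)]
    return bus_message

states = {"P1": "invalid", "P2": "invalid"}
for cpu, op in [("P1", "write"), ("P2", "read"), ("P2", "write")]:
    message = step(states, cpu, op)
    print(cpu, op, message, states)
```

In this short trace P1 becomes exclusive, is downgraded to shared by P2's read miss, and is invalidated by P2's write.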

  11. Snoopy-Cache State Machine-I
      • State machine for CPU requests
      • Cache Block State

  12. Snoopy-Cache State Machine-II
      • State machine for bus requests

  13. Example
      • Assumes A1 and A2 map to same cache block

  14. Example
      • Assumes A1 and A2 map to same cache block

  15. Example
      • Assumes A1 and A2 map to same cache block

  16. Example
      • Assumes A1 and A2 map to same cache block

  17. Example
      • Assumes A1 and A2 map to same cache block

  18. Example
      • Assumes A1 and A2 map to same cache block
