Implementation and Verification of a Cache Coherence Protocol using Spin
Steven Farago
Goal
  • To use Spin to design a “plausible” cache coherence protocol
    • Introduce nothing in the Spin model that would not be realistic in hardware (e.g. instant global knowledge between unrelated state machines)
  • To verify the correctness of the protocol
Background
  • Definition: Cache = Small, high-speed memory that is used by a single processor. All processor memory accesses are via the cache.
  • Problem:
    • In a multiprocessor system, each processor could have a cache.
    • Each cache could contain (potentially different) data for the same addresses.
    • Given this, how to ensure that processors see a consistent picture of memory?
Coherence protocol
  • A Coherence protocol specifies how caches communicate with processors and each other so that processors will have a predictable view of memory.
  • Caches that always provide this “predictable view of memory” are said to be coherent.
A Definition of Coherence
  • A “view of memory” is coherent if the following property holds:
    • Given cacheline A, two processors may not see storage accesses to A in a conflicting order.
    • Example (Processor 0 performs the stores; the other processors observe):

          Processor 0    Processor 1    Processor 2    Processor 3
          Store A, 0     Load A, 0      Load A, 0      Load A, 1
          Store A, 1     Load A, 1      Load A, 0      Load A, 0
                         Coherent       Coherent       NOT Coherent
  • Informally, a processor may not see “old” data after seeing “new” data.
Standard Coherence Protocol
  • MESI (Modified, Exclusive, Shared, Invalid)
    • Standard protocol that is supposed to guarantee cache coherence
  • Each cacheline in the cache is marked with one of these states.
  • Cacheline accesses are allowed only if the cache states are “correct” with respect to the coherence protocol.
  • Examples:
    • A cache that is marked “invalid” may not provide data to a processor.
    • Cacheline data may not be updated unless the line is in the Exclusive or Modified state.
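In Promela, these states map naturally onto a symbolic mtype. A minimal sketch for a two-cache system (identifier names are assumptions, not the presenter's actual code):

```promela
/* The four MESI states as symbolic constants (names assumed) */
mtype = { MODIFIED, EXCLUSIVE, SHARED, INVALID };

/* One state per cache; both caches start out Invalid */
mtype cacheState[2] = INVALID;
```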
System Model
  • Initial version
  • Three state machines
    • ProcessorModel: Non-deterministically issues Loads and Stores to cache forever
    • CacheModel: Two parts - initially combined into a single process
      • MainCache - Services processor requests.
      • Snooper - Responds to messages from memory controller
    • MemoryController - Services requests from each cache and maintains coherency among all
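The three machines above map naturally onto Promela proctypes. A hypothetical skeleton of this initial (combined-cache) version, with names assumed:

```promela
/* Skeleton only; proctype names and parameters are assumptions */
proctype ProcessorModel(byte id) { skip /* issues Loads/Stores forever */ }
proctype CacheModel(byte id)     { skip /* MainCache + Snooper, combined */ }
proctype MemoryController()      { skip /* services requests from each cache */ }

init {
  atomic {
    run ProcessorModel(0); run CacheModel(0);
    run ProcessorModel(1); run CacheModel(1);
    run MemoryController()
  }
}
```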
System Model

      Processor            Processor
          |                    |
      MainCache            MainCache
     (+ Snooper)          (+ Snooper)
           \                  /
           MemoryController
ProcessorModel
  • Simple
  • Continually issues Load/Store requests to associated Cache.
    • Communication done via Bus Model.
    • Read requests are blocking
  • Coherence verification done when Load receives data (via Spin assert statement)
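The loop described above might look like this in Promela. Channel names and the NEW data encoding are assumptions (the OLD/NEW data model is explained later in the talk):

```promela
#define NEW 1

mtype = { LOAD, STORE };
chan procToCache[2] = [1] of { mtype, bit };  /* depth-1 bus models */
chan cacheToProc[2] = [1] of { bit };

proctype ProcessorModel(byte id) {
  bit data;
  do
  :: procToCache[id] ! LOAD, 0 ->
       cacheToProc[id] ? data;     /* blocking read: wait for the cache */
       assert(data == NEW)         /* coherence check on every Load */
  :: procToCache[id] ! STORE, NEW  /* stores always write NEW data */
  od
}
```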
CacheModel
  • Two parts: MainCache and Snooper
    • MainCache services ProcessorModel Load and Store requests and initiates contact with the MemoryController when an “invalid” cache state is encountered
    • Snooper services independent requests from the MemoryController; these are the requests the MemoryController needs in order to coordinate coherence responses.
MemoryControllerModel
  • Responsible for servicing Cache requests
  • 3 Types of requests
    • Data request: Cache requires up-to-date data to supply to processor
    • Permission-to-store: a cache may not transition to the Modified state without the MemoryController’s permission
    • A combination of these two
  • All types of requests may require MC to communicate with all system caches (via Snooper processes) to ensure coherence
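The three request types fit a single mtype, and the controller's main loop is one receive plus a case split. A hypothetical sketch (names assumed; the per-request coherence work is elided as comments):

```promela
mtype = { DATA_REQ, STORE_PERM, DATA_AND_PERM };  /* names assumed */
chan cacheToMC = [1] of { mtype, byte };

proctype MemoryController() {
  mtype req; byte requester;
  do
  :: cacheToMC ? req, requester ->
     if
     :: req == DATA_REQ      -> skip /* query snoopers, return newest data */
     :: req == STORE_PERM    -> skip /* invalidate other copies, then grant */
     :: req == DATA_AND_PERM -> skip /* both of the above */
     fi
  od
}
```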
Implementation of Busses
  • All processes represent independent state machines, so a communication mechanism is needed.
  • Use Spin depth-1 queues to simulate communication.
  • Because queue reads are destructive and blocking, a global bool is needed to indicate bus activity (required for polling).
    • Sharing this bool between processes is valid: it compensates for differences between Spin queues and real busses.
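In Promela terms, the bus setup above amounts to something like the following (channel and variable names are assumptions):

```promela
mtype = { LOAD, STORE };

/* Depth-1 queues standing in for hardware busses */
chan procToCache[2] = [1] of { mtype, bit };
chan cacheToMC      = [1] of { mtype, byte };

/* Per the slide: since receives are destructive and blocking, a global
   flag lets processes poll for bus activity without consuming a message */
bool busActive = false;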
Problems - Part 1
  • MainCache and Snooper initially implemented as a single process.
  • Process nondeterministically determines which to execute at each iteration
  • Communication between Processor/Cache and Cache/Memory done with blocking queues
  • Blocked receive in MainCache --> Snooper cannot execute
  • Leads to deadlock in certain situations
Solution 1
  • Split MainCache and Snooper into separate processes.
  • Both can access “global” cacheData and cacheState variables independently
Problems - Part 2
  • As separate processes, Snooper and MainCache could change cache state unpredictably.
  • Race conditions: Snooper changes cache state/data while MainCache is in mid-transaction --> returns invalidated data to processor.
Solution 2
  • Add locking mechanism to cache.
    • MainCache or Snooper may only access cache if they first lock it.
  • Locking mechanism: for simplicity, cheated by using Spin’s atomic keyword to implement test-and-set on a shared variable.
  • Assumption: Real hardware would have some similar mechanism available to lock caches.
  • Question: Revised model now equivalent to original??
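The test-and-set described above can be written with Spin's atomic keyword. A minimal sketch for two caches (names assumed):

```promela
bool cacheLock[2] = false;

/* Block until the lock is free, then take it, in one indivisible step */
inline lockCache(i)   { atomic { !cacheLock[i] -> cacheLock[i] = true } }
inline unlockCache(i) { cacheLock[i] = false }
```

MainCache and Snooper would call lockCache before touching the shared cacheState/cacheData variables and unlockCache afterward.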
Problem 3
  • Memory controller allows multiple outstanding requests from caches.
  • Snooper of cache which has a MainCache request outstanding cannot respond to MC queries for other outstanding requests (due to locked cacheline).
  • Deadlock.
Solution 3
  • Disallow multiple outstanding Cache/MC transactions.
  • Introduce global bool variable shared across all caches: outstandingBusOp.
  • A cache may only issue requests to the memory controller if no requests from other caches outstanding.
  • Global knowledge across all caches unrealistic.
  • Equivalent to “retries” from MC??
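A sketch of the guard (the variable name comes from the slide; the helper names are assumptions):

```promela
bool outstandingBusOp = false;

/* Used by a MainCache before starting a Cache/MC transaction:
   block until no other cache has a request in flight, then claim the bus */
inline acquireBus() { atomic { !outstandingBusOp -> outstandingBusOp = true } }
inline releaseBus() { outstandingBusOp = false }
```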
Problem 4
  • Previous problems failed in Spin simulation within 1000 steps.
  • With the last solution in place, random simulations no longer fail within the first 3000 steps.
  • Verification fails after ~20000 steps
  • Cause of problem as yet unresolved
Verification
  • How to verify coherence generally??
  • Verify something stronger: A processor will never see conflicting ordering of data if it always sees the newest data available in the system.
  • For all loads, assert that data is “new”
Modeling of Data
  • Concern that modeling data as random integer would cause Spin to run out of memory
  • Model data as a bit with values OLD and NEW.
  • All processor Stores store NEW data.
  • When transitioning to a Modified state, a cache will change all other values of data in memory and other caches to OLD
    • Global access to data here strictly a part of verification effort, not algorithm. Thus allowed.
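With only two caches, the global invalidation step is tiny. A hypothetical sketch, using d_step so the update appears instantaneous (acceptable here because, as noted above, this global reach serves verification only):

```promela
#define OLD 0
#define NEW 1

bit memData = OLD;
bit cacheData[2];

/* Called when cache `me` transitions to Modified.  The global reach is
   a verification device, not part of the modeled hardware. */
inline markOthersOld(me) {
  d_step {
    memData = OLD;
    cacheData[1 - me] = OLD;
    cacheData[me] = NEW
  }
}
```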
Debugging
  • Found debugging parallel processes difficult.
  • Made much easier by Spin’s message sequence diagrams
    • Graphically shows sends and receives of all messages.
    • Requires use of Spin queues rather than globals for interprocess communication
Future work
  • Make existing protocol completely bug free
  • Activate additional “features” disabled for debugging purposes (e.g. bus transaction types)
  • Verify protocol specific rules
    • No two caches may be simultaneously Modified
    • Cache Modified or Exclusive --> no other cache is Shared