
Serdar Tasiran Systems Research Center, HP Labs (formerly Compaq)

Using Formal Specifications to Monitor and Guide Simulation: Verifying the Cache Coherence Engine of the Alpha 21364 Microprocessor. Serdar Tasiran Systems Research Center, HP Labs (formerly Compaq)


Presentation Transcript


  1. Using Formal Specifications to Monitor and Guide Simulation: Verifying the Cache Coherence Engine of the Alpha 21364 Microprocessor
     Serdar Tasiran, Systems Research Center, HP Labs (formerly Compaq)
     Yuan Yu (Microsoft Research, formerly Compaq)
     Brannon Batson, Scott Kreider (Intel, formerly Compaq)

  2. The Problem
     • Given
       • A complex protocol specified formally
       • A hardware implementation
     • Verify that
       • All executions of implementation are consistent with protocol spec
     • An “implementation verification” problem
     • Properties of protocol verified separately

  3. Alpha 21364 (EV7) System Block Diagram
     [Figure: grid of nodes 0 to 11, each with attached memory (M) and I/O (IO)]
     • Distributed shared memory
     • Up to 256 processors, 32 GB per processor

  4. EV7 Cache Coherence
     [Figure: EV7 node with blocks labeled R, C, Z0, Z1; EV7 core & cache system; data buffers (SVDB, FB0, FB1); two memories (mem)]
     • Cache coherence protocol spec (~3K lines of TLA, written by architects and us)
       • Executable
       • Not list of properties
     • Implementation in hardware (~20K lines of HDL code)
     • Does the hardware implement the protocol spec correctly?

  5. Why is the problem difficult?
     • Thousands of state variables per processor
     • Parallelism, deep pipelining, speculative execution, redundancy, ...
     • Need 4+ processors for an interesting system
     • Out of reach of automatic formal methods
       • Limited to several hundred state variables
       • Decomposition methods difficult for non-specialists, large design teams
     • Complete verification of hardware against protocol not practical
     • Simulation only viable approach
     • Even simulation is expensive

  6. Validation Guided by Formal Spec Coverage
     [Figure: diagram with boxes: Automated input generation; Simulation; Correctness checking using formal spec; Coverage analysis on formal spec]

  7. Validation Guided by Formal Spec Coverage
     [Figure: diagram with boxes: “Automated” input generation; Simulation; Correctness checking using formal spec; Coverage analysis on formal spec; marker “1”]

  8. Contributions 1: Formal Spec + Model Checker as Monitor
     • Spec written in formal language
     • Properties of spec can be verified formally
     • Model checker checks properties satisfied during simulation:
       • More reliable than hand-written code
       • More flexible than automatically generated assertions
     • Must relate Implementation state → Spec state
     • Devised two-phase mapping approach
     • Applicable to complex designs by non-specialists

  9. Validation Guided by Formal Spec Coverage
     [Figure: diagram with boxes: “Automated” input generation; Simulation; Correctness checking using formal spec; Coverage analysis on formal spec; marker “2”]

  10. Contributions 2: Coverage analysis and input generation using formal protocol spec
      • Formal spec encapsulates design intent
      • Full coverage = All scenarios exercised
      • Spec at same level of abstraction as existing coverage data
      • Model checker used to
        • Measure coverage, detect gaps
        • Generate simulation input traces to reach coverage holes
        • Determine if unexercised scenario actually possible

  11. Outline
      • The cache-coherence protocol
      • The EV7 cache-coherence engine
      • Spec-guided simulation
      • Conclusion

  12. The EV7 Coherence Protocol
      [Figure: grid of nodes 0 to 11, each with attached memory (M) and I/O (IO)]
      • Distributed shared memory
      • Each address belongs to a “home node” but may be in other caches
      • Directory-based protocol
      • Cache states: Modified (Dirty), Exclusive (Clean), Shared, Invalid
      • Directory states: Local, Shared, Exclusive, Incoherent
      • Directory distributed with memory at each node
      • CPU requests that miss in local caches are sent to home node
      • Home node may forward request to other nodes
      • Directory In Flight Table (DIFT) keeps track of pending requests

  13. Example: write, remote sharers
      [Figure: requester (R), home (H) and three sharers (S) exchanging ReadMod, SharedInv, BlkExclusive and InvalAck messages]
      • Conditions:
        • home is remote, directory state is shared
      • Actions:
        • read-exclusive request to home
        • home sends invalidation requests to sharers, sends data back to requester with invalidation count (early exclusive reply)
        • sharing nodes reply to requester with invalidation acknowledgements
        • requester proceeds when data arrives, but must “stall” incoming requests and potential writeback of line until all InvalAcks are received
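
The message flow in this example can be sketched in a few lines of Python. This is an illustrative toy, not the EV7 spec: the message names (ReadMod, SharedInv, BlkExclusive, InvalAck) come from the slide, while the function `home_handle_read_mod`, the class `Requester`, the tuple layout of a message, and the choice to carry the requester id in each SharedInv are assumptions made for the sketch.

```python
from dataclasses import dataclass
from typing import List, Optional, Set, Tuple

# (kind, destination node, payload); this layout is an assumption of the sketch.
Message = Tuple[str, int, Optional[int]]


def home_handle_read_mod(requester: int, sharers: Set[int]) -> List[Message]:
    """Home node reacts to a ReadMod when the directory state is Shared."""
    targets = sorted(sharers - {requester})
    # Invalidate every sharer; the payload tells the sharer whom to acknowledge.
    msgs: List[Message] = [("SharedInv", node, requester) for node in targets]
    # Early exclusive reply: data plus the number of InvalAcks to expect.
    msgs.append(("BlkExclusive", requester, len(targets)))
    return msgs


@dataclass
class Requester:
    has_data: bool = False
    expected_acks: Optional[int] = None
    acks_received: int = 0

    def receive(self, kind: str, payload: Optional[int] = None) -> None:
        if kind == "BlkExclusive":
            self.has_data = True
            self.expected_acks = payload
        elif kind == "InvalAck":
            # Acks may arrive before the data, so they are only counted here.
            self.acks_received += 1

    def may_proceed(self) -> bool:
        """The requester may use the line as soon as the data has arrived."""
        return self.has_data

    def must_stall_incoming(self) -> bool:
        """Incoming requests and writebacks wait until all InvalAcks are in."""
        return not (self.has_data and self.acks_received == self.expected_acks)
```

With sharers {2, 5, 7} and requester 1, the home node emits three SharedInv messages and one BlkExclusive carrying a count of 3; the requester may proceed once BlkExclusive arrives but keeps stalling incoming requests until the third InvalAck.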

  14. TLA Description for Protocol
      • Temporal Logic of Actions [Leslie Lamport]
      • Formal language for writing high-level specs of concurrent, reactive systems
      • Very expressive. Incorporates
        • First-order logic, set theory, temporal operators
        • Sets, queues, records, tuples, …
      • EV7 protocol description is a TLA formula

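  15. One Protocol Action
      [Figure: protocol action annotated with “Preconditions” and “Messages sent and state variables updated”, beside a diagram of requester (R), home (H) and sharers (S) with ReadMod, SharedInv and BlkExclusiveCnt messages]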

  16. Outline
      • The cache-coherence protocol
      • The EV7 cache-coherence engine
      • Spec-guided simulation
      • Conclusion

  17. Alpha 21364 Chip Block Diagram
      [Figure: chip block diagram with labels: Core; L1 Cache; L2 Cache Controller; L2 Tag Array; L2 Data Array; Data Buffers; Router; IPx4; I/O; Memory Controller 0 and Memory Controller 1, each with RDRAM; Data; Address & Control]

  18. Directory in Flight Table (DIFT)
      • Front end of memory controller
      • Tracks up to 32 in-flight transactions
      [Figure: DIFT Coherence Engine with labels: Cache State Lookup (from cache controller); Request Directory State from Memory; New directory to Memory; Forwards and responses to other CPUs]

  19. DIFT block diagram
      [Figure: block diagram with the following labels.
       Inputs: From Back End; From CBox (twice).
       Blocks: New Request Decode; zc_dft_acc; Address File (zx_dft_af*); Event File; Proto File; zc_dft_ros; New Directory Logic (zx_dft_plt); Protocol Logic (zx_dft_plc); Address Output Logic (ao*); Issue Logic (zx_dft_isp*); New Directory Output (aod); Next DIFT State; DIFT Free List; Grant logic to everywhere; DIFT Conflict Array.
       Fields: addr, maf, pid, vdba, vdbv, fb, directory, cmd, src, ack, vic, lpr, rd, fwd, rsp, inv, wr, vcp, akp.
       Outputs: To RBox; Messages to the Ring; To Back End; Rd/Wr Requests to the Zbox middle]

  20. Outline
      • The cache-coherence protocol
      • The EV7 cache-coherence engine
      • Spec-guided simulation
      • Conclusion

  21. Validation Guided by Formal Spec Coverage
      [Figure: diagram with boxes: “Automated” input generation; Simulation; Correctness checking using formal spec; Coverage analysis on formal spec; marker “1”]

  22. Formal Spec as Simulation Monitor
      • Model checker (TLC) checks if transition is legal
      • f_abs: abstraction mapping
      [Figure: implementation state-space mapped by f_abs into spec state-space]
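
The monitoring idea on this slide can be illustrated with a small Python sketch. It is not TLC and not the EV7 spec: the next-state table `TOY_SPEC_NEXT`, the field names `cache_state` and `pipeline_stage`, and the function names `f_abs` and `monitor` are all hypothetical, chosen only to show an abstraction mapping applied to each simulated state followed by a legality check on the resulting spec-level step.

```python
# Toy spec: for each abstract state, the set of legal successor states.
# This table is made up for illustration; it is NOT the EV7 protocol.
TOY_SPEC_NEXT = {
    "Invalid": {"Shared", "Exclusive"},
    "Shared": {"Invalid", "Exclusive"},
    "Exclusive": {"Modified", "Invalid"},
    "Modified": {"Invalid"},
}


def f_abs(impl_state):
    """Abstraction mapping: drop implementation detail, keep protocol state."""
    return impl_state["cache_state"]


def monitor(trace):
    """Return the index of the first illegal step in the trace, or None.

    Steps that do not change the abstract state (stuttering) are accepted,
    since many implementation cycles map to the same spec state.
    """
    prev = f_abs(trace[0])
    for i, impl_state in enumerate(trace[1:], start=1):
        cur = f_abs(impl_state)
        if cur != prev and cur not in TOY_SPEC_NEXT[prev]:
            return i
        prev = cur
    return None


good_trace = [
    {"cache_state": "Invalid", "pipeline_stage": 0},
    {"cache_state": "Invalid", "pipeline_stage": 1},  # stuttering step
    {"cache_state": "Exclusive", "pipeline_stage": 2},
    {"cache_state": "Modified", "pipeline_stage": 0},
]
bad_trace = [
    {"cache_state": "Invalid", "pipeline_stage": 0},
    {"cache_state": "Modified", "pipeline_stage": 1},  # not allowed by the toy spec
]
```

Here `monitor(good_trace)` finds no violation, while `monitor(bad_trace)` flags step 1, because the toy spec has no direct Invalid to Modified transition.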

  23. Refinement Mapping Issues
      [Figure: timeline (time axis) showing Action_i1(p_j1, a_k1), Action_i2(p_j2, a_k2), Action_i3(p_j3, a_k3)]
      • Protocol transactions look instantaneous at spec level but in the implementation they happen
        • over many clock cycles,
        • interleaved with other actions
      • Want designers, developers to write the mapping
      • Implementation has parallelism, pipelining, speculative execution, redundancy AND many processors
      • Burch, Dill style “flushing” not practical

  24. The Refinement Mapping
      [Figure: timeline (time axis) showing Action_i1(p_j1, a_k1), Action_i2(p_j2, a_k2), Action_i3(p_j3, a_k3), annotated with “Preconditions” and “Messages sent and state variables updated”]

  25. A two-phase recipe for refinement mappings
      [Figure: EV7 node with blocks labeled IO, R, C, Z0, Z1; EV7 core & cache system; data buffers; two memories (mem)]
      • Collect “tokens”, e.g., cache state looked up, invalidate message sent, directory state written
      1. Implementation state → Intermediate state
         • Determine
           • interfaces: write, read ports
           • state machines that relate to protocol state
         • Watch
           • Messages crossing interfaces
           • Updates to state machines
         • Record in intermediate state
      [Figure: timeline (time axis) leading to Action_i3(p_j3, a_k3), annotated “Record in intermediate state for processor p_j3, address a_k3”]

  26. A two-phase recipe for refinement mappings
      • All tokens collected. “Fire” action.
      2. Intermediate state → Protocol spec state
         • For each protocol transaction, check when
           • All preconditions hold
           • All state updates happen, all required messages sent
         • Update abstract state
      [Figure: timeline (time axis) ending at Action_i3(p_j3, a_k3), annotated “Abstract state gets updated here” and “Preconditions and implementation state updates related to Action_i3(p_j3, a_k3)”]
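
The two phases of the recipe can be sketched as a token collector plus a firing rule. The following Python is a minimal illustration under stated assumptions: the token names paraphrase the examples given on the slides, while the action name `ReadModSharedRemote`, the table `REQUIRED_TOKENS`, the class `IntermediateState` and the function `fire_ready_actions` are hypothetical and do not come from the EV7 flow.

```python
from collections import defaultdict

# Phase 2 table: which tokens must be seen before a spec action "fires".
# Action and token names are illustrative only.
REQUIRED_TOKENS = {
    "ReadModSharedRemote": {
        "cache_state_looked_up",
        "invalidate_sent",
        "directory_written",
    },
}


class IntermediateState:
    """Phase 1: tokens recorded per (processor, address) pair."""

    def __init__(self):
        self.tokens = defaultdict(set)

    def record(self, processor, address, token):
        # Called whenever a watched interface or state machine changes.
        self.tokens[(processor, address)].add(token)


def fire_ready_actions(intermediate, abstract_log):
    """Phase 2: fire every action whose tokens have all been collected."""
    for key in sorted(intermediate.tokens):
        for action, required in REQUIRED_TOKENS.items():
            if required <= intermediate.tokens[key]:
                processor, address = key
                # The abstract (spec-level) state would be updated here.
                abstract_log.append((action, processor, address))
                # Consume the tokens so the action fires only once.
                intermediate.tokens[key] -= required
```

In use, a simulation monitor would call `record` on every relevant signal event and `fire_ready_actions` once per cycle; an action is appended to `abstract_log` only in the cycle where its last token arrives, which is the point at which the spec-level transition is checked.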

  27. A two-phase recipe for refinement mappings
      1. Implementation state → Intermediate state
         • Hardware signal transitions → Protocol events (tokens)
      2. Intermediate state → Protocol spec state
         • Protocol event sequences → Protocol transactions (actions)
      • Component implementers can write step 1
      • System architects can put together step 2
      • Distinguishes protocol errors from component implementation errors
      • Well-defined, clean interface
      • Easier to keep implementation and spec consistent throughout design process
      • Modular description makes reasoning easier

  28. Validation Guided by Formal Spec Coverage
      [Figure: diagram with boxes: “Automated” input generation; Simulation; Correctness checking using formal spec; Coverage analysis on formal spec; marker “2”]

  29. Formal Spec as Coverage Model
      • Model checker (TLC) records visited states
      • f_abs: abstraction mapping
      [Figure: implementation state-space mapped by f_abs into spec state-space]

  30. Model Checker Tracks and Improves Coverage
      [Figure: spec state-space with a non-covered state and a path generated by model checker]
      • Identify parts of spec not exercised by simulation
      • Path in spec state space = unexamined scenario
      • Problems:
        • Spec has too many states
        • Not feasible to track coverage, generate paths for each
      • Want to explore “qualitatively distinct” scenarios
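
Generating a path to a non-covered state amounts to a reachability search over the spec state graph. The sketch below uses a plain breadth-first search in Python as a stand-in for what a model checker does; the function `path_to_uncovered`, the explicit adjacency-dict graph, and the state names `s0` to `s4` are assumptions for illustration (a real spec state space would be generated on the fly rather than stored as a dictionary).

```python
from collections import deque


def path_to_uncovered(graph, start, covered):
    """Shortest path from start to the nearest state not in covered.

    graph maps each state to an iterable of successor states.
    Returns a list of states, or None if every reachable state is covered.
    """
    parent = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state not in covered:
            # Walk the parent links back to the start to rebuild the path.
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in graph.get(state, ()):
            if nxt not in parent:
                parent[nxt] = state
                queue.append(nxt)
    return None


toy_graph = {
    "s0": ["s1", "s2"],
    "s1": ["s3"],
    "s2": ["s3"],
    "s3": ["s4"],
}
```

For `toy_graph` with covered states {s0, s1, s3}, the search returns the path s0, s2: a spec-level scenario that simulation has not exercised and that can be turned into a directed input trace. A result of None means no uncovered state is reachable from the start state in the graph that was searched.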

  31. Coverage Metric Defined on Spec States
      [Figure: spec state-space shown together with a coverage state-space containing coverage states c0 to c8]

  32. Coverage Metric Examples
      • All possible directory state transitions:
        • Invalid → Exclusive → Shared → SharedMask
      • All legal combinations of
        • Request type
        • Source of request (Local or remote)
        • Cache state
        • Directory state
      • All possible transitions of some protocol state field in the DIFT
        • WaitingForAck → WaitingForVictim, …
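
A coverage metric of the "all legal combinations" kind can be sketched as a projection from spec states onto a small tuple of fields. In the Python below, the cache and directory state names are the ones listed earlier in the deck; the request types (only ReadMod appears in the slides, Read is a placeholder), the dictionary keys of a spec state, and the names `coverage_point` and `coverage_report` are assumptions for illustration.

```python
from itertools import product

REQUEST_TYPES = ("Read", "ReadMod")  # "Read" is a hypothetical placeholder
SOURCES = ("Local", "Remote")
CACHE_STATES = ("Modified", "Exclusive", "Shared", "Invalid")
DIR_STATES = ("Local", "Shared", "Exclusive", "Incoherent")


def coverage_point(spec_state):
    """Project a spec state onto the fields the metric cares about."""
    return (
        spec_state["request"],
        spec_state["source"],
        spec_state["cache"],
        spec_state["directory"],
    )


def coverage_report(visited_spec_states, legal=lambda point: True):
    """Return (hit, holes) for the combination metric.

    legal is a predicate that filters out impossible combinations;
    by default every combination is treated as legal.
    """
    universe = {
        p
        for p in product(REQUEST_TYPES, SOURCES, CACHE_STATES, DIR_STATES)
        if legal(p)
    }
    hit = {coverage_point(s) for s in visited_spec_states} & universe
    return hit, universe - hit
```

Many distinct spec states collapse onto the same coverage point, which is what keeps the metric tractable even when the spec has too many states to track individually; the holes returned by `coverage_report` are the candidates for directed input generation.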

  33. Conclusions
      • Novel approach uses formal spec and model checker
        • to monitor simulation
        • to identify coverage gaps
        • to guide input generation
      • Found valuable by architects and verification engineers
      • EV7 verification engineers want to use model checker to analyze their coverage data
      • EV8 design started with formal specification first!
      • First attempt at verifying industrial implementation of this scale and complexity against formal spec

  34. [Figure: timeline (time axis) showing Action_i1(p_j1, a_k1), Action_i2(p_j2, a_k2), Action_i3(p_j3, a_k3), annotated “Abstract state gets updated here” and “Preconditions and implementation state updates related to Action_i3(p_j3, a_k3)”]

  35. [Figure: requester (R), home (H) and three sharers (S) with ReadMod, SharedInv and BlkExclusiveCnt messages]

  36. [Figure: grid of nodes 0 to 11, each with attached memory (M) and I/O (IO)]
