Verification of Hierarchical Cache Coherence Protocols for Future Processors

Student: Xiaofang Chen. Advisor: Ganesh Gopalakrishnan.


Presentation Transcript


  1. Verification of Hierarchical Cache Coherence Protocols for Future Processors • Student: Xiaofang Chen • Advisor: Ganesh Gopalakrishnan

  2. Outline • Background • Proposed solutions • High level hierarchical coherence protocol verification • Refinement check: specifications vs. RTL implementations • Conclusion

  3. Hierarchical Cache Coherence Protocols • Chip-level protocols • Intra-cluster protocols • Inter-cluster protocols [Figure: clusters with memories (mem) and directories (dir)]

  4. Modeling and Verification of Coherence Protocols • High-level modeling approaches • Model checking • Low-level modeling: RTL or VHDL • Simulation

  5. Problems with Hierarchical Coherence Protocols • For high level modeling • Handle the complexity of hierarchical protocols • For RTL implementations • Verify that an RTL implementation correctly implements the specification

  6. Example: Verification Complexity (I) [Figure: Remote Cluster 1, Home Cluster, and Remote Cluster 2, each with L1 caches, an L2 Cache+Local Dir, and a RAC; plus a Global Dir and Main Mem]

  7. Example: Verification Complexity (II) • Tool: Murphi • Verification • IA-64 machine • 18GB memory • 40-bit hash compaction • Non-conclusive after >30 hours of state enumeration

  8. Differences in Modeling: Specs vs. Impls • One step in high-level (step 1) • Multiple steps in low-level (steps 1.1 to 1.5) [Figure: client, home, local cache, buf]

  9. Differences in Execution: Specs vs. Impls • Interleaving in HL: steps 1, 2, 3 • Concurrency in LL: steps 1.1 to 1.3, 2.1 to 2.2, 3.1 to 3.3

  10. Proposed Mechanisms • For high level modeling, develop • A few M-CMP coherence protocols • A compositional approach • For specifications vs. implementations, develop • A formal theory • A compositional approach • A practical tool

  11. Outline • Background • Proposed solutions • High level hierarchical coherence protocol verification • Refinement check: specifications vs. RTL implementations • Conclusion

  12. An M-CMP Benchmark Protocol [Figure: Remote Cluster 1, Home Cluster, and Remote Cluster 2, each with L1 caches, an L2 Cache+Local Dir, and a RAC (intra-cluster level); plus a Global Dir and Main Mem (inter-cluster level)]

  13. Protocol Features • Both levels use MESI protocols • Intra-cluster: FLASH • Inter-cluster: DASH • Silent drop on non-Modified cache lines • Network channels are non-FIFO • Inclusive caches

  14. Another Benchmark: Non-inclusive Caches [Figure: Remote Cluster 1, Home Cluster, and Remote Cluster 2, each with L1 caches, an L2 Cache+Local Dir, and a RAC; plus a Global Dir and Main Mem]

  15. Our Compositional Approach Original protocol

  16. Our Compositional Approach

  17. One Way to Decompose Protocols • Create three abstract protocols • Each with 1 detailed cluster + 2 abstracted clusters

  18. Abstract Protocol #1 [Figure: Home Cluster kept in detail (L1 caches, L2 Cache+Local Dir, RAC); Remote Cluster 1 and Remote Cluster 2 abstracted (L2 Cache+Local Dir’, RAC); plus a Global Dir and Main Mem]

  19. Abstract Protocol #2 [Figure: Remote Cluster 1 kept in detail (L1 caches, L2 Cache+Local Dir, RAC); Remote Cluster 2 and Home Cluster abstracted (L2 Cache+Local Dir’, RAC); plus a Global Dir and Main Mem]

  20. Problems with This Approach • Every abstract protocol contains 2 protocols • Duplicated behaviors in abstract protocols • State space still large

      Model   # of states   Time (hours)   Mem (GB)
      M1      284,088,425   12             18
      M2      636,613,051   18             18

  21. Second Way to Decompose Protocols [Figure: ABS #1 and ABS #2 show single clusters in detail (Remote Cluster 1, Home Cluster), each with L1 caches and an L2 Cache+Local Dir; ABS #3 shows Remote Cluster 1, Home Cluster, and Remote Cluster 2 abstracted (L2 Cache+Local Dir’, RAC), plus a Global Dir and Main Mem]

  22. Model Checking Results

  23. Details of Our Approach • Abstraction • States • Transitions, properties • Constraining • Assume guarantee reasoning

  24. Abstraction on States Intra-cluster Inter-cluster

  25. State Representation [Figure: an original cluster (L1 caches, L2 Cache+Local Dir, RAC) with state components L1s, L2, Local Dir, Network, and RAC; abstract clusters (L2 Cache+Local Dir’, RAC) with state components L2, Local Dir’, and RAC]

  26. Abstracting Transitions and Properties • Rule: guard → action • guard • Becomes more permissive • action • Allows more behaviors
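The guard/action abstraction above can be illustrated with a minimal Python sketch. Everything here is hypothetical and not taken from the slides: a 1-bit data domain, a toy writeback rule, and a projection that hides the L1-side message. The sketch only shows why weakening the guard to True and making the action nondeterministic is sound: every concrete step has a matching abstract step.

```python
from itertools import product

DATA = (0, 1)  # hypothetical 1-bit data domain


def concrete_successors(state):
    """Concrete rule: fires only when a writeback message is pending."""
    wb_pending, wb_data, _l2_data = state
    if wb_pending:                          # guard: a WB message exists
        return {(False, wb_data, wb_data)}  # action: L2 data := message data
    return set()


def project(state):
    """Abstraction map: hide the writeback message, keep only the L2 data."""
    _wb_pending, _wb_data, l2_data = state
    return l2_data


def abstract_successors(_l2_data):
    """Abstract rule: guard True, action L2 data := nondet."""
    return set(DATA)


def abstraction_is_sound():
    """Check that every concrete step is matched by some abstract step."""
    for state in product((False, True), DATA, DATA):
        for nxt in concrete_successors(state):
            if project(nxt) not in abstract_successors(project(state)):
                return False
    return True
```

The price of this over-approximation is that the abstract rule can also take steps the concrete protocol never takes, which is what the constraining step later removes.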

  27. An Example of Abstraction • Abstract intra-cluster protocol (L1 caches send WB to L2 Cache+Local Dir) • guard: Clusters[c].WbMsg.Cmd = WB • action: Clusters[c].L2.Data := Clusters[c].WbMsg.Data; Clusters[c].L2.HeadPtr := L2; … • Abstract inter-cluster protocol (L2 Cache+Local Dir’, RAC) • guard: True • action: Clusters[c].L2.Data := nondet; …

  28. Abstraction, Now Constraining

  29. An Example of Constraining • Abstract inter-cluster protocol (L2 Cache+Local Dir’, RAC) • guard: True & Clusters[c].L2.State = Excl • action: Clusters[c].L2.Data := nondet; … • Abstract intra-cluster protocol (L1 caches send WB to L2 Cache+Local Dir) • Clusters[c].WbMsg.Cmd = WB • Clusters[c].L2.State = Excl
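One way to read the constraining step is as assume-guarantee reasoning: the abstract rule assumes L2 is in state Excl, and the detailed model must guarantee that this holds whenever a writeback is pending. The Python sketch below uses a hypothetical three-rule toy cluster (not the benchmark protocol) to show the shape of that check.

```python
from collections import deque

INIT = ("I", "Shrd", False)  # (L1 state, L2 state, writeback pending)


def detailed_successors(state):
    """Toy detailed cluster: acquire, send writeback, consume writeback."""
    l1, l2, wb = state
    succ = []
    if l1 == "I" and not wb:
        succ.append(("M", "Excl", False))  # L1 acquires the line exclusively
    if l1 == "M":
        succ.append(("I", l2, True))       # L1 sends a writeback message
    if wb:
        succ.append((l1, "Shrd", False))   # L2 consumes the writeback
    return succ


def reachable(init, successors):
    """Breadth-first enumeration of reachable states."""
    seen, queue = {init}, deque([init])
    while queue:
        for nxt in successors(queue.popleft()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def guarantee_holds():
    """Guarantee side: a pending writeback implies L2 is in state Excl."""
    states = reachable(INIT, detailed_successors)
    return all(l2 == "Excl" for (_l1, l2, wb) in states if wb)


def abstract_guard(l2_state, constrained):
    """Assume side: the abstract guard True, optionally constrained."""
    return l2_state == "Excl" if constrained else True
```

If `guarantee_holds()` returned False, the added constraint would not be justified and the constrained abstract protocol could miss real behaviors.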

  30. Non-inclusive Protocols: History Variables [Figure: Remote Cluster 1 and Home Cluster in detail, each with L1 caches and an L2 Cache+Local Dir; Remote Cluster 1, Home Cluster, and Remote Cluster 2 abstracted (L2 Cache+Local Dir’, RAC), plus a Global Dir and Main Mem]

  31. Experimental Results

  32. Outline • Background • Proposed solutions • High level hierarchical coherence protocol verification • Refinement check: specifications vs. RTL implementations • Conclusion

  33. Our Approach • Use a hardware language • Hardware Murphi • Develop a formal theory of refinement check • Develop a compositional approach • Abstraction • Assume guarantee • Develop a practical tool

  34. Hardware Murphi • Murphi extension by S. German and G. Janssen • A concurrent shared variable language • On each cycle • Multiple transitions execute concurrently • Exclusive write to a variable • Shared reads to variables • Write immediately visible within the same transition • Write visible to other transitions on the next cycle • Support transactions, signals, etc
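The cycle semantics listed above can be mimicked in a few lines of Python. This is a sketch of the stated rules only, not of the Hardware Murphi tool itself; `TransitionView` and `run_cycle` are hypothetical names. Each enabled transition reads last cycle's values, sees its own writes immediately, and may not write a variable that another transition writes in the same cycle.

```python
class TransitionView(dict):
    """Per-transition view: last cycle's values plus the transition's own writes."""

    def __init__(self, snapshot):
        self.written = set()
        super().__init__(snapshot)

    def __setitem__(self, var, value):
        self.written.add(var)
        super().__setitem__(var, value)


def run_cycle(state, transitions):
    """Run all enabled (name, guard, action) transitions for one cycle."""
    next_state = dict(state)
    owner = {}  # variable -> name of the transition that wrote it this cycle
    for name, guard, action in transitions:
        if not guard(state):
            continue
        view = TransitionView(state)  # shared reads of the previous cycle
        action(view)                  # own writes are visible inside the action
        for var in view.written:
            if var in owner:          # exclusive-write rule violated
                raise ValueError(
                    f"write-write conflict on {var!r}: {owner[var]} and {name}"
                )
            owner[var] = name
            next_state[var] = view[var]
    return next_state


def always(_state):
    return True


def copy_b_to_a(view):
    view["a"] = view["b"]


def copy_a_to_b(view):
    view["b"] = view["a"]


SWAP = [("t1", always, copy_b_to_a), ("t2", always, copy_a_to_b)]
```

Running `run_cycle({"a": 1, "b": 2}, SWAP)` swaps the two values, because neither transition sees the other's write until the next cycle.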

  35. Transaction • Groups multiple implementation steps: Transaction Rule-1 … Rule-6 … End; [Figure: steps 1 to 6]

  36. Workflow of Our Refinement Check • Murphi spec model + Hardware Murphi impl model → product model in Hardware Murphi → Muv → product model in VHDL → property check • Check that low-level correctly implements high-level

  37. Full List of Assertions for Refinement Check • Serializability for specifications • No write-write conflicts • Initial states containment • Write set variables containment • Enabledness for specifications • Joint variables match at the end of transactions

  38. An Example • Impl transaction: Transaction Rule-1 guard1 → action1; Rule-2 guard2 → action2; Rule-3 guard3 → action3; End; • Spec rule: Rule spec_guard → spec_action;

  39. An Example (Cont’d) • Transaction Rule-1 guard1 → action1; assert spec_guard; spec_action; Rule-2 guard2 → action2; Rule-3 guard3 → action3; End; • assert impl_var1 = spec_var1; assert impl_var2 = spec_var2; …
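The instrumented transaction above can be sketched in Python as a product of a toy implementation and a toy specification. All names and the counter example are hypothetical: the spec adds 2 to `x` in one step, while the implementation does it in three rules through a temporary `tmp`. The spec rule fires together with Rule-1, and the joint variable `x` is compared at the end of the transaction.

```python
def spec_guard(spec):
    return spec["x"] < 10


def spec_action(spec):
    spec["x"] += 2


def rule1(impl):
    impl["tmp"] = impl["x"] + 1


def rule2(impl):
    impl["tmp"] = impl["tmp"] + 1


def rule3(impl):
    impl["x"] = impl["tmp"]


def always(_impl):
    return True


GOOD_TRANSACTION = [(lambda s: s["x"] < 10, rule1), (always, rule2), (always, rule3)]
BUGGY_TRANSACTION = [(lambda s: s["x"] < 10, rule1), (always, rule3)]  # drops rule2


def run_product(impl, spec, transaction, joint_vars):
    """Run one impl transaction alongside the spec rule and check refinement."""
    for index, (guard, action) in enumerate(transaction):
        if not guard(impl):
            return False  # transaction not enabled in this toy model
        action(impl)
        if index == 0:
            # The spec rule must be enabled whenever the transaction starts.
            assert spec_guard(spec), "spec rule not enabled"
            spec_action(spec)
    for var in joint_vars:
        # Joint variables must match at the end of the transaction.
        assert impl[var] == spec[var], f"mismatch on {var}"
    return True
```

With `BUGGY_TRANSACTION` the final comparison fails, which is the kind of discrepancy the generated assertions are meant to expose.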

  40. Driving Benchmark [Figure: two nodes, each with Dir, Cache, Mem, Local Buf, Home Buf, and Remote Buf, and a Router] S. German and G. Janssen, IBM Research Tech Report 2006

  41. Bugs Found with Refinement Check • Benchmark satisfies cache coherence already • Bugs still found • Bug 1: router unit loses messages • Bug 2: home unit replies twice for one request • Bug 3: cache unit gets updated twice from one reply • Refinement check is an automatic way of constructing checks

  42. Model Checking Approaches • Monolithic • Straightforward property check • Compositional • Divide and conquer [Figure: product model in VHDL, checked monolithically vs. compositionally]

  43. Compositional Refinement Check • Reduce the verification complexity • Basic Techniques • Abstraction • Removing details to make verification easier • Assume guarantee • A simple form of induction which introduces assumptions and justifies them

  44. In More Detail • Abstraction • Change variables to free input variables • E.g. change a latch to a free input signal • Assume guarantee • For reads of a transaction, assume (spec.Var = impl.Var) holds
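The effect of turning a latch into a free input can be seen on a small example. The Python sketch below uses a hypothetical circuit (a one-shot latch that enables a 2-bit counter), not the benchmark design. With the latch freed, the reachable set can only grow, so a property proved on the abstraction also holds on the original circuit, while a failure on the abstraction may be spurious.

```python
def concrete_reachable_counts():
    """Concrete circuit: the latch starts high and then stays low (one-shot)."""
    seen, frontier = set(), [(True, 0)]
    while frontier:
        latch, count = frontier.pop()
        if (latch, count) in seen:
            continue
        seen.add((latch, count))
        next_count = (count + 1) % 4 if latch else count
        frontier.append((False, next_count))  # latch' = False
    return {count for _latch, count in seen}


def abstract_reachable_counts():
    """Abstraction: the latch is a free input, so both values are tried each cycle."""
    seen, frontier = set(), [0]
    while frontier:
        count = frontier.pop()
        if count in seen:
            continue
        seen.add(count)
        for latch in (False, True):
            frontier.append((count + 1) % 4 if latch else count)
    return seen
```

Here the property "count is at most 3" holds on the abstraction and therefore on the concrete circuit, whereas "count is at most 1" holds concretely but fails on the abstraction. Ruling out such spurious failures is where assumptions like (spec.Var = impl.Var) come in.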

  45. Experimental Results • Configurations • 2 nodes, 2 addresses, SixthSense [Figure: verification time (30 min to 1 day) vs. datapath width (1-bit to 10-bit), monolithic approach vs. compositional approach]

  46. Outline • Background • Proposed solutions • High level hierarchical coherence protocol verification • Refinement check: specifications vs. RTL implementations • Conclusion

  47. Thank you.

  48. Related Work • Parameterized verification • Chou et al. • Bluespec • Arvind et al. • Aggregation of distributed actions • Park and Dill • Compositional verification • Many previous works including McMillan, Jones, etc.
