1 / 17

Verification of cache-coherence protocols with TLA+

Verification of cache-coherence protocols with TLA+. Homayoon Akhiani, Damien Doligez, Paul Harter, Leslie Lamport, Joshua Scheid, Mark Tuttle, Yuan Yu Compaq Computer Corporation. TLA+. A formal specification language based on set theory, first-order logic, temporal logic

nbeal
Download Presentation

Verification of cache-coherence protocols with TLA+

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Verification of cache-coherence protocols with TLA+ Homayoon Akhiani, Damien Doligez, Paul Harter, Leslie Lamport, Joshua Scheid, Mark Tuttle, Yuan Yu Compaq Computer Corporation Compaq Computer Corporation

  2. TLA+ • A formal specification language based on set theory, first-order logic, temporal logic • Hierarchical style clarifies written • specifications: becomes • proofs: becomes • Engineers find reading easy, writing not too hard <1>1. <2>1. CASE <2>2. CASE <2>3. QED Compaq Computer Corporation

  3. Used TLA+ to demonstrate formal methods to engineering • Analyzed cache-coherence protocols for • EV6: Alpha 21264 processor • EV7: Alpha 21364 processor • Built TLC, a model-checker for TLA+ • Analyzed proposals for industry standards • PCI-X, … Compaq Computer Corporation

  4. Cache coherence protocols cache cache processor processor memory x=2 cachex=1 cachex=2 processor processor Alpha memory model defines ordering of reads and writes to x. Cache coherence protocol enforces the Alpha memory model. Goal: prove the cache coherence protocol is correct. Compaq Computer Corporation

  5. EV6 cache coherence in “three easy steps”+“two-man years” Model Alpha memory model.(200 lines) Prove implementation (550 lines, 2 months, informal) Model abstract protocol.(500 lines) Prove implementation (5500 lines, 4+ months, incomplete) Model complete protocol.(2000 lines, 3 months) Compaq Computer Corporation

  6. Step 1: Alpha memory model We specified the Alpha memory memory model: • The official specification is an informal description of the allowed sequences of reads and writes. • We needed a precise, state-based specification. • We specified a slightly simplified memory model. Compare the specifications: • Official, English specification: 12 pages • Logical, precise specification: 200 lines Compaq Computer Corporation

  7. Step 2: Model abstract protocol protocol = abstract protocol + implementation junk Surprisingly, • abstract protocol’s correctness was far from obvious • we discovered a bug… in the memory model Proved hardest part of correctness: • 35-line invariant based on 300 lines of definitions • 550-line proof, cases nested 10 levels deep Compaq Computer Corporation

  8. Step 3: Model complete protocol Obstacle 1: find a single, complete description • English documents: 20 documents, 4-inch stack • Lisp simulator: crucial to understanding some details Obstacle 2: algorithm complexity • 60 different kinds of messages • 15 “quarks” could combine to model all 60 messages Protocol: 9 man-months, 1900 lines of TLA+ Partial proof: 7 man-months, 1000-line invariant Compaq Computer Corporation

  9. Results: one bug • Quite unexpected to find only one bug! • Heavy simulation had found the easy bugs • Demonstrating our bug requires • four processors • two memory locations • fifteen messages • Hand proof appears essential to finding this bug: • extensive simulation did not find it • state space too large for exhaustive model checking Compaq Computer Corporation

  10. Lessons learned • The designers had no trouble reading our spec. • The level of rigorous analysis resulting even from a partial proof delighted the designers • The demonstration convinced engineers to consider doing the same thing on their own... • The basic methodology worked as expected • Tools, even simple tools, are essential… Compaq Computer Corporation

  11. Check for Invariant false Deadlock TLC model checker State machine in rich subset of TLA+ (Initial, NextState) Configuration file making state machine finite Minimal state trace from an initial state to a bad state Invariant Compaq Computer Corporation

  12. TLC implementation • Require no changes to TLA+ specifications • use the richness of TLA+, no primitive language • use configuration files instead • Interpret specifications, don’t compile them • better user interaction possible • Use explicit state representation, not BDDs • BDD encoding of TLA+ formulas difficult • use canonical state representation + fingerprinting • use efficient disk-based state set and queue implem. Compaq Computer Corporation

  13. TLC status • 20,000 lines of Java • Compaq internal distribution available now • Performance is good, sometimes slow: threaded and distributed implementations now exist. • Liveness checking/livelock detection coming • Coverage analysis is desired: What does lack of an error mean: a correct spec or a buggy spec? Compaq Computer Corporation

  14. EV7 cache coherence • First intense application of TLC model checker • First TLA+ specification written by engineers • Specification is 1800 lines • Specification accepted by TLC w/o modification • State space reduced 50% by adding 15 lines to remove a lot of symmetry in state space Compaq Computer Corporation

  15. Results • 73 bugs found (90% found by TLC): • 37 minor: typos, type errors, etc • 12 bugs: wrong message/wrong state • 14 missing cases • 7 spurious cases (dead code) • 3 miscellaneous (1 TLA+, 1 MC, 1 spec design) • War story: Find bug B by hand; find bug B’ like B by simulation; find bug B’’ in bug-fix for B; find “???” written in original documentation! Compaq Computer Corporation

  16. Lessons learned • Learning TLA+ is not a major task, but writing good specifications still requires experience • EV6 verification was • humbling: only one error actually found • encouraging: the basic method works as expected • EV7 verification was very satisfying: • TLA+ specifications can be written by engineers • TLC can handle industrial-sized specifications • Formal specification belongs in design process… Compaq Computer Corporation

More Related