0 likes | 11 Views
Presented by Michael Kuperstein, this presentation discusses the use of partial-coherence abstractions in relaxed memory models for programs. It covers concepts like mutual exclusion algorithms, store buffer-based models, memory fences, automatic solutions for enforcing memory consistency, abstract memory models, and fence inference methods for precise analysis. The goal is to verify program correctness without arbitrary bounds and synthesize optimal fences for mutual exclusion primitives.
E N D
1 PARTIAL-COHERENCE ABSTRACTIONS FOR RELAXED MEMORY MODELS Presented by Michael Kuperstein, Technion Joint work with Martin Vechev, IBM Research and Eran Yahav, Technion
Sequential Consistency We expect our programs to have “Interleaving semantics” Consistent with program order “The result of any execution is the same as if the operations of all the processors were executed in some sequential order, and the operations of each individual processor appear in this sequence in the order specified by its program.” – Leslie Lamport, 1973 2
Dekker’s Algorithm for Mutual Exclusion Process 0: flag[0] := true while flag[1] = true { if turn ≠ 0 { flag[0] := false while turn ≠ 0 { } flag[0] := true } } // critical section turn := 1 flag[0] := false Process 1: flag[1] := true while flag[0] = true { if turn ≠ 1 { flag[1] := false while turn ≠ 1 { } flag[1] := true } } // critical section turn := 0 flag[1] := false Specification: mutual exclusion over critical section 3
Store Buffer Based Models TSO & PSO x86 ~ TSO fence store flush … X Y Z Memory Fences Restore order 1 2 3 … … P0 Main Memory Every store before the fence becomes globally visible before anything after the fence executes … X Y Z … … P1 load 4
Memory Fences Process 0: flag[0] := true fence while flag[1] = true { if turn ≠ 0 { flag[0] := false fence while turn ≠ 0 { } flag[0] := true fence } } // critical section turn := 1 fence flag[0] := false fence Fences are expensive 10s-100s of cycles Practical Significance Data structures Linux Kernel spinlocks Placing fences manually Overfencing: hurts performance Underfencing: subtle bugs 5
Memory Fences Process 0: flag[0] := true fence while flag[1] = true { if turn ≠ 0 { flag[0] := false while turn ≠ 0 { } flag[0] := true } } // critical section turn := 1 flag[0] := false Fences are expensive 10s-100s of cycles Practical Significance Data structures Linux Kernel spinlocks Placing fences manually Overfencing: hurts performance Underfencing: subtle bugs 6
Automatic Solutions Equivalence to Sequential Consistency Reduce program behaviors to sequentially consistent (SC) runs High-level specifications are ignored Goes back to Shasha & Snir [TOPLAS ’88] Place fences to satisfy provided specification Using specification may forbid less executions May require fewer fences SC PSO Safe 7
Goal Finite-State Program P Safety Specification S Program P’ with Fences Memory Model M P’ satisfies the specification S under M 8
General Recipe Compute reachable states 1. Compute weakest constraints that guarantee all “bad states” are avoided 2. Implement the constraints with fences 3. 9
Constraints Constraint language Not every transition can be prevented using a fence 10 A B C A B C Unavoidable P1: X P1: X 1 2 3 1 2 3 P2 : (D) LOAD R1 = X X X P2: P2: A B C A B C P1: X P1: X 1 2 3 1 2 3 P1 : (D) LOAD R1 = X X X P2: P2: [A < D] [B < D] [C < D] 10
Concrete Transition System Building transition system under TSO/PSO is hard No a-priori bound on buffer length Unbounded state-space Even for programs that were finite-state under SC Reachability has non-primitive recursive complexity [Atig et al., POPL ’10] 11
Abstract Memory Models (AMM) Bounded approximation of unbounded buffers Strictly weaker than concrete TSO/PSO Finite-state programs remain finite-state Reachability becomes effectively computable Construct finite (abstract) transition system Apply fence inference Can also be used for verification PSO SC Safe AMM 12
Partial Coherence Abstractions Record what values appeared (without order or number) Allows precise fence semantics Keeps the analysis precise for “well behaved” programs Allows precise loads from buffer Recent value Unordered elements Bounded length k X … X Y Z Z … … Y P0 P0 Main Main Memory Memory X … X Y X Z … … Y P1 P1 13
Partial Coherence Abstractions Concrete 1 1 2 2 3 3 4 4 5 5 6 6 7 7 Abstract {2,3,4,5} 14
Abstract Fence Inference Compute reachable abstract states 1. Compute constraints. Precision depends on abstraction. 2. Implement the constraints with fences 3. 15
Fence Inference Results Program FD k=0 FD k=1 FD k=2 PD k=0 PD k=1 PD k=2 Sense0 Pet0 Dek0 Lam0 Fast0 Fast1a Fast1b Fast1c Benchmarks are mutual exclusion primitives k - the bound on the FIFO part of the abstract buffer PD more “aggressive” than FD 16
Summary Partial-coherence abstractions Verification without arbitrary bounds Abstraction precision affects quality of results Synthesis of fences Can infer optimal fences for mutual exclusion primitives P S P’ M 17
Questions 18
Related Work Under-approximation CheckFence [Burckhardt et al., PLDI ’07] Fender [KVY, FMCAD ’10] And more… Over-approximation Equivalence to SC Very imprecise Goes back to Shasha & Snir [TOPLAS ‘88] Abstract Interpretation Varying precision Regular Abstraction [Linden et al., SPIN ’10] Partial-Coherence [KVY, PLDI ’11] 19