
Open Source Model Checking Radu Grosu SUNY at Stony Brook


Presentation Transcript


  1. Open Source Model Checking. Radu Grosu, SUNY at Stony Brook. Joint work with X. Huang, S. Jain and S. A. Smolka.

  2. GCC Compiler • Early stages: a modest C compiler. • Translation: source code translated directly to RTL. • Optimization: at the low RTL level. • High-level information lost: calls, structures, fields, etc. • Nowadays: a full-blown, multi-language compiler generating code for more than 30 architectures. • Input: C, C++, Objective-C, Fortran, Java and Ada. • Tree-SSA: added GENERIC, GIMPLE and SSA ILs. • Optimization: at GENERIC, GIMPLE, SSA and RTL levels. • Verification: Tree-SSA API suitable for verification, too.

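  3. GCC Compilation Process [Figure: C, C++ and Java files are fed to the C, C++ and Java parsers (among others), each producing a parse tree. Genericize turns the parse tree into a GEN AST, Gimplify turns that into a GPL AST, Build CFG produces the SSA/GPL CFG, and the rest of compilation produces RTL code, from which Code Gen emits object code.]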

  4. C Program and its GIMPLE IL • C program: int main() { int a,b,c; a = 5; b = a + 10; c = a + foo(a,b); if (a > c) c = b++/a + b*a; bar(a,b,c); } • GIMPLE IL: int main() { int a,b,c; int T1,T2,T3,T4; a = 5; b = a + 10; T1 = foo(a,b); T2 = a + T1; if (a <= T2) goto fi; T3 = b / a; T4 = b * a; c = T3 + T4; b = b + 1; fi: bar(a,b,c); }

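  5. Associated GIMPLE CFG [Figure: the FUNCTION DECL node declares a, b, c, T1, T2, T3 and T4, all of type int. The CFG runs from Entry to block A (a = 5; b = a + 10; T1 = foo(a,b); T2 = a + T1; if (a > T2) goto B;), whose true edge leads to block B (T3 = b / a; T4 = b * a; c = T3 + T4; b = b + 1;) and whose false edge leads to block C (bar(a,b,c); return;), followed by Exit. Each statement is stored as an expression tree, for example the call expression T1 = foo(a,b) and the condition a > T2.]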

  6. GCC Model Checking (GMC) • GMC: a suite of analysis and verification tools we are developing for the Tree-SSA level of GCC. Currently: • Intra-procedural slicer: inter-procedural slicing is in progress. • Symbolic execution engine: for Boolean C programs. • Interpreter: traverses the CFG using Tree-SSA iterators. • Monte Carlo MC (GMC2): OSE, randomized algorithm for LTL MC. • GMC2: a newly developed technique that uses the theory of geometric random variables, statistical hypothesis testing and random sampling of lassos.

  7. LTL MC: Finding Accepting Lassos • Explore all lassos in the computation tree (CT). • DDFS, SCC: time efficient. • DFS: memory efficient. [Figure: computation tree (CT) with its lassos and the recurrence diameter marked.]

  8. Randomized Algorithms • The next step taken by the algorithm may depend on a random choice (coin flip). • Benefits: simplicity, efficiency, and symmetry breaking. • Monte Carlo: may produce an incorrect result, but with bounded error probability. • Example: election result prediction. • Las Vegas: always gives the correct result, but the running time is a random variable. • Example: randomized Quicksort.
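
The two classes can be contrasted with a small sketch built on the slide's own examples. The function names `randomized_quicksort` and `poll_majority` and the 0/1 vote encoding are illustrative assumptions, not part of the talk.

```python
import random

def randomized_quicksort(xs, rng):
    """Las Vegas: the result is always sorted; only the running time is random."""
    if len(xs) <= 1:
        return list(xs)
    pivot = xs[rng.randrange(len(xs))]  # random pivot choice (the coin flip)
    smaller = [x for x in xs if x < pivot]
    equal = [x for x in xs if x == pivot]
    larger = [x for x in xs if x > pivot]
    return randomized_quicksort(smaller, rng) + equal + randomized_quicksort(larger, rng)

def poll_majority(votes, sample_size, rng):
    """Monte Carlo: predicts whether 1-votes hold a majority from a random sample.

    The running time is fixed by sample_size, but the answer may be wrong
    when the sample is not representative.
    """
    sample = [votes[rng.randrange(len(votes))] for _ in range(sample_size)]
    return 2 * sum(sample) > sample_size

rng = random.Random(0)
print(randomized_quicksort([5, 3, 8, 1, 3], rng))
print(poll_majority([1] * 60 + [0] * 40, sample_size=25, rng=rng))
```

Monte Carlo model checking sits in the first class: its running time is fixed by the number of samples, while a "no error found" answer carries a bounded error probability.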

  9. Monte Carlo Approach • Flip a k-sided coin. • Explore N(ε, δ) independent lassos in the CT. • Error margin ε and confidence ratio δ. [Figure: computation tree (CT) with its lassos and the recurrence diameter marked.]

  10. Bernoulli Random Variable Z (coin flip) [Figure: example automaton over states 1 to 4 whose lassos have probabilities ½, ¼, ⅛ and ⅛.] • Probability mass function: $p(1) = P[Z=1] = p_Z = 1/8$, $p(0) = P[Z=0] = q_Z = 7/8$

  11. Geometric Random Variable • Value of geometric RV X with parameter $p_Z$: number of independent lassos until success. • Probability mass function: $p(N) = P[X = N] = q_Z^{N-1} p_Z$ • Cumulative distribution function: $F(N) = P[X \le N] = \sum_{i \le N} p(i) = 1 - q_Z^{N} = 1 - (1 - p_Z)^{N}$
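
The closed form of the cumulative distribution function follows from the finite geometric series. The short derivation below spells out that step, writing $q_Z = 1 - p_Z$ and assuming $0 < p_Z \le 1$:

```latex
\begin{align*}
F(N) &= \sum_{i=1}^{N} q_Z^{\,i-1}\, p_Z
      = p_Z \cdot \frac{1 - q_Z^{N}}{1 - q_Z}
      = 1 - q_Z^{N}
      = 1 - (1 - p_Z)^{N}.
\end{align*}
```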

  12. How Many Lassos? • Requiring $1 - (1 - p_Z)^{N} = 1 - \delta$ yields $N = \ln(\delta) / \ln(1 - p_Z)$. • Lower bound on the number of trials N needed to achieve success with confidence ratio δ.

  13. What If $p_Z$ Is Unknown? • Requiring $p_Z \ge \varepsilon$ yields $M = \ln(\delta) / \ln(1 - \varepsilon) \ge N = \ln(\delta) / \ln(1 - p_Z)$ and therefore $P[X \le M] \ge 1 - \delta$. • Lower bound on the number of trials M needed to achieve success with confidence ratio δ and error margin ε.
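
To get a feel for the size of this bound, the sketch below evaluates it numerically. The function name `num_samples` is hypothetical, and rounding up to an integer number of trials is an assumption; the slide states only the real-valued expression.

```python
import math

def num_samples(epsilon, delta):
    """Smallest integer M with (1 - epsilon)**M <= delta.

    Rounding up is an assumption made here so that M is a usable trial count.
    """
    if not (0.0 < epsilon < 1.0 and 0.0 < delta < 1.0):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    return math.ceil(math.log(delta) / math.log(1.0 - epsilon))

# The bound grows roughly like ln(1/delta) / epsilon.
for eps, dlt in [(0.1, 0.05), (0.01, 0.01), (0.001, 0.01)]:
    print(eps, dlt, num_samples(eps, dlt))
```

The sample size depends only on ε and δ, not on the size of the state space.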

  14. Statistical Hypothesis Testing • Null hypothesis $H_0$: $p_Z \ge \varepsilon$ • Alternative hypothesis $H_1$: $p_Z < \varepsilon$ • If no success after M trials, then reject $H_0$. • Type I error: $\alpha = P[X > M \mid H_0] < \delta$ • Since: $P[X \le M \mid H_0] \ge 1 - \delta$

  15. Monte Carlo Model Checking (MC2) • input: B = (Σ, Q, Q0, δ, F), ε, δ • N = ln(δ) / ln(1 - ε) • for (i = 1; i ≤ N; i++) if (RL(B) == 1) return (1, error-trace); • return (0, “reject H0 with α = Pr[X > N | H0] < δ”); • where RL(B) performs a uniform random walk through B to obtain a random lasso.
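
The loop above can be sketched in a few lines of Python. This is a minimal illustration and not the GMC2 implementation. It assumes the automaton is given explicitly as a successor map `succ` in which every state has at least one successor, that a lasso is accepting when its cycle contains an accepting state, and that N is rounded up to an integer. The names `random_lasso` and `mc2` are hypothetical.

```python
import math
import random

def random_lasso(succ, init, accepting, rng):
    """Uniform random walk from init until some state repeats.

    Returns (is_accepting, path, loop_start), where path[loop_start:] is the cycle.
    """
    path, position = [], {}
    state = init
    while state not in position:
        position[state] = len(path)
        path.append(state)
        state = rng.choice(succ[state])  # uniform choice among successors
    loop_start = position[state]
    is_accepting = any(s in accepting for s in path[loop_start:])
    return is_accepting, path, loop_start

def mc2(succ, init, accepting, epsilon, delta, rng):
    """Sample N lassos; return (True, witness) on the first accepting one."""
    n = math.ceil(math.log(delta) / math.log(1.0 - epsilon))
    for _ in range(n):
        found, path, loop_start = random_lasso(succ, init, accepting, rng)
        if found:
            return True, (path, loop_start)  # error trace
    return False, None  # reject H0 with type I error below delta

# Toy automaton: state 2 is accepting and lies on the cycle 1 -> 2 -> 1.
succ = {0: [0, 1], 1: [2], 2: [1]}
print(mc2(succ, 0, {2}, epsilon=0.1, delta=0.01, rng=random.Random(0)))
```

Only the current path is stored, which is why the space usage is bounded by the length of one lasso.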

  16. GCC MC2 (GMC2) • Input: a set of CFGs. • Main function: a specifically designated CFG. • Random walks in the Büchi automaton: generated on-the-fly. • Initial state: of the main routine + bookkeeping information. • Next state: choose process + call interpreter on its CFG. • Processes: created by using the fork primitive. • Optimization: interpreter returns only upon context switch. • Lassos: detected by using a hierarchical hash table. • Local variables: removed upon return from a procedure.

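  17. Program State [Figure: a program state consists of the shared variables valuation (channels & semaphores) and a list of process states p1, p2, p3, … Each process state has a control state and a data state. The control state is a CFG name and a statement #.]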

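  18. Program State [Figure: a program state consists of the shared variables valuation (channels & semaphores) and a list of process states p1, p2, p3, … Each process state has a control state and a data state. The data state is a heap, a global variables valuation and a frame stack f1, f2, … Each frame holds a return control state and a local variables valuation.]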
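
The layout on these two slides maps naturally onto nested records. The sketch below mirrors it with Python dataclasses. All class and field names are assumptions chosen for illustration, and `return_from_call` is a hypothetical helper showing how a frame's local variables disappear when a procedure returns.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

@dataclass
class ControlState:
    cfg_name: str    # which CFG (function) is executing
    statement: int   # index of the next statement in that CFG

@dataclass
class Frame:
    return_to: ControlState                                  # return control state
    local_vars: Dict[str, Any] = field(default_factory=dict)  # local variables valuation

@dataclass
class DataState:
    heap: Dict[int, Any] = field(default_factory=dict)
    global_vars: Dict[str, Any] = field(default_factory=dict)
    frame_stack: List[Frame] = field(default_factory=list)

@dataclass
class ProcessState:
    control: ControlState
    data: DataState

@dataclass
class ProgramState:
    shared_vars: Dict[str, Any] = field(default_factory=dict)  # channels & semaphores
    processes: List[ProcessState] = field(default_factory=list)

def return_from_call(process: ProcessState) -> None:
    """Pop the top frame (dropping its locals) and resume at the saved control state."""
    frame = process.data.frame_stack.pop()
    process.control = frame.return_to

# A process inside foo(), called from statement 3 of main.
proc = ProcessState(
    control=ControlState("foo", 0),
    data=DataState(frame_stack=[Frame(ControlState("main", 4), {"x": 1})]),
)
return_from_call(proc)
print(proc.control)
```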

  19. Interpreter • Interprets GIMPLE statements according to their semantics. Interesting cases: • Inter-procedural: call(), return(). Manipulate the frame stack. • Catches and interprets function calls to various modeling and concurrency primitives: • Modeling: toss(), assert(). Nondeterminism and checks. • Processes: fork(), … Manipulate the process list. • Communication: send(), recv(). Manipulate shared variables. May involve a context switch.

  20. Results: TCAS

  21. DPh: Symmetric Fair Version (Deadlock freedom)

  22. Needham-Schroeder Protocol • Quite sophisticated C implementation. • However, of a sequential nature: essentially executes only one round of a reactive system.

  23. Related Work • Software model checkers for concurrent C/C++: VeriSoft, Spin, Blast (Slam), Magic, C-Wolf. Bogor? • Cooperative Bug Isolation [Liblit, Naik & Zheng]: • Compile-time instrumentation. Distribute binaries/collect bugs. • Statistical analysis to isolate erroneous code segments. • Random interpretation [Gulwani & Necula]: • Execute random paths and merge with random linear operators. • Monte Carlo and abstract interpretation [Monniaux]: • Analyze programs with probabilistic and nondeterministic input.

  24. Conclusions • Presented GMC2: a software MC for GCC based on Monte Carlo MC: • At Tree-SSA level: applicable to C, C++, Ada, Java, etc. • Open source: freely available for usage/critique/extension. • Ongoing and future work: create a software MC branch of GCC, which also includes: • Automated abstraction/refinement/interpolation techniques. • Currently we manually apply a form of bounded-range abstraction (e.g. in TCAS).

  25. Talk Outline • Model Checking • Randomized Algorithms • LTL Model Checking • Probability Theory Primer • Monte Carlo Model Checking • Implementation & Results • Conclusions & Open Problem

  26. Linear Temporal Logic • LTL formula: made up inductively of • atomic propositions p, boolean connectives ¬, ∧, ∨ • temporal modalities X (neXt) and U (Until). • Safety: “nothing bad ever happens” • E.g. $G(\neg(pc_1 = cs \wedge pc_2 = cs))$ where G is a derived modality (Globally). • Liveness: “something good eventually happens” • E.g. $G(req \rightarrow F\, serviced)$ where F is a derived modality (Finally).
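
Both example properties are easy to evaluate on a lasso-shaped run, that is, a finite prefix followed by a cycle that repeats forever. The sketch below does this directly for the two formula shapes rather than for arbitrary LTL. The function names and the dictionary encoding of states are illustrative assumptions.

```python
def holds_globally(prop, prefix, cycle):
    """G prop on the infinite word prefix . cycle^omega (cycle must be non-empty)."""
    return all(prop(s) for s in prefix + cycle)

def holds_response(trigger, response, prefix, cycle):
    """G(trigger -> F response) on the infinite word prefix . cycle^omega."""
    word = prefix + cycle
    response_in_cycle = any(response(s) for s in cycle)
    for i, state in enumerate(word):
        if trigger(state):
            # F response holds at position i if a response occurs later in the
            # unrolled word, or anywhere on the cycle, which repeats forever.
            if not (response_in_cycle or any(response(s) for s in word[i:])):
                return False
    return True

# Mutual exclusion (safety) on a run where the two processes alternate.
run_prefix = [{"pc1": "idle", "pc2": "idle"}]
run_cycle = [{"pc1": "cs", "pc2": "idle"}, {"pc1": "idle", "pc2": "cs"}]
mutex = lambda s: not (s["pc1"] == "cs" and s["pc2"] == "cs")
print(holds_globally(mutex, run_prefix, run_cycle))

# Response (liveness): a request that is never serviced on the cycle.
req_prefix = [{"req": True, "serviced": False}]
req_cycle = [{"req": False, "serviced": False}]
print(holds_response(lambda s: s["req"], lambda s: s["serviced"], req_prefix, req_cycle))
```

A violation of the safety property shows up at some state of the run, whereas a violation of the liveness property needs the cycle: the request stays unserviced forever.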

  27. Model Checking • S is a nondeterministic/concurrent system. • $\varphi$ is a temporal logic formula, in our case Linear Temporal Logic (LTL). • Basic idea: intelligently explore S’s state space in an attempt to establish $S \models \varphi$.

  28. LTL Model Checking • Every LTL formula $\varphi$ can be translated to a Büchi automaton $B_\varphi$ such that $L(\varphi) = L(B_\varphi)$. • Automata-theoretic approach: $S \models \varphi$ iff $L(B_S) \subseteq L(B_\varphi)$ iff $L(B_S \times B_{\neg\varphi}) = \emptyset$. • Checking non-emptiness is equivalent to finding a reachable accepting cycle (lasso).

  29. Emptiness Checking • Checking non-emptiness is equivalent to finding an accepting cycle reachable from the initial state (lasso). • The Double Depth-First Search (DDFS) algorithm can be used to search for such cycles, and this can be done on-the-fly! [Figure: DFS1 follows the path s1, s2, s3, …, sk-2, sk-1, sk; DFS2 continues from sk through sk+1, sk+2, sk+3, …, sn.]
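
A compact recursive sketch of the double (nested) depth-first search is given below. It assumes an explicit successor map and a set of accepting states, whereas an on-the-fly checker would generate successors lazily. The name `has_accepting_lasso` is hypothetical.

```python
def has_accepting_lasso(succ, init, accepting):
    """Nested DFS: is there an accepting state, reachable from init, that lies on a cycle?"""
    visited_outer, visited_inner = set(), set()

    def inner(state, seed):
        # Second search (DFS2): look for a path leading back to the seed.
        visited_inner.add(state)
        for nxt in succ.get(state, []):
            if nxt == seed:
                return True
            if nxt not in visited_inner and inner(nxt, seed):
                return True
        return False

    def outer(state):
        # First search (DFS1): explore all successors before testing this state.
        visited_outer.add(state)
        for nxt in succ.get(state, []):
            if nxt not in visited_outer and outer(nxt):
                return True
        # Post-order: start DFS2 from an accepting state once its subtree is done.
        return state in accepting and inner(state, state)

    return outer(init)

succ = {0: [1], 1: [2], 2: [1]}
print(has_accepting_lasso(succ, 0, {2}))  # accepting state 2 is on the cycle 1 -> 2 -> 1
print(has_accepting_lasso(succ, 0, {0}))  # accepting state 0 is not on any cycle
```

Starting the second search in post-order is what allows the inner visited set to be shared across all seeds, so each state is visited at most twice. For large state spaces an iterative version with an explicit stack would avoid Python's recursion limit.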

  30. Randomized Algorithms • Huge impact on CS: (distributed) algorithms, complexity theory, cryptography, etc. • The next step taken by the algorithm may depend on a random choice (coin flip). • Benefits of randomization include simplicity, efficiency, and symmetry breaking.

  31. Lassos Probability Space • Sample space: lassos in $B_S \times B_{\neg\varphi}$ • Bernoulli random variable Z: • Outcome = 1 if the randomly chosen lasso is accepting • Outcome = 0 otherwise • $p_Z = \sum_i p_i Z_i$ (expectation of an accepting lasso), where $p_i$ is the probability of lasso i under a uniform random walk.
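
For a small explicit automaton the sum can be computed exactly by enumerating every lasso together with its random-walk probability. The sketch below does so with exact fractions. The four-state chain is a hypothetical example chosen so that the lasso probabilities come out as ½, ¼, ⅛ and ⅛; it is not necessarily the automaton drawn in the talk. The function names are also assumptions.

```python
from fractions import Fraction

def lasso_probabilities(succ, init):
    """Enumerate all lassos from init as (path, loop_start, probability)."""
    results = []

    def walk(path, prob):
        choices = succ[path[-1]]
        for nxt in choices:
            step = prob / len(choices)  # uniform choice among successors
            if nxt in path:
                results.append((path, path.index(nxt), step))  # walk closed a cycle
            else:
                walk(path + [nxt], step)

    walk([init], Fraction(1))
    return results

def accepting_probability(succ, init, accepting):
    """p_Z: total probability of the lassos whose cycle contains an accepting state."""
    return sum(
        (p for path, start, p in lasso_probabilities(succ, init)
         if any(s in accepting for s in path[start:])),
        Fraction(0),
    )

# Hypothetical chain: each state either loops on itself or moves on; state 4 is accepting.
succ = {1: [1, 2], 2: [2, 3], 3: [3, 4], 4: [4]}
for path, start, p in lasso_probabilities(succ, 1):
    print(path, "cycle from index", start, "probability", p)
print("p_Z =", accepting_probability(succ, 1, {4}))
```

Exhaustive enumeration is exponential in general and serves here only to make the definition concrete; the Monte Carlo approach samples from this distribution instead of enumerating it.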

  32. Bernoulli Random Variable (coin flip) • Value of Bernoulli RV Z: Z = 1 (success) and Z = 0 (failure) • Probability mass function: $p(1) = \Pr[Z=1] = p_Z$, $p(0) = \Pr[Z=0] = 1 - p_Z = q_Z$ • Expectation: $E[Z] = p_Z$

  33. Statistical Hypothesis Testing • Example: given a fair and a biased coin. • Null hypothesis $H_0$: fair coin selected. • Alternative hypothesis $H_1$: biased coin selected. • Hypothesis testing: perform N trials. • If the number of heads is LOW, reject $H_0$. • Else fail to reject $H_0$.

  34. Statistical Hypothesis Testing

  35. Random Lasso (RL) Algorithm

  36. Correctness of MC2 • Theorem: Given a Büchi automaton B, error margin ε, and confidence ratio δ, if MC2 rejects $H_0$, then its type I error has probability $\alpha = P[X > M \mid H_0] < \delta$.

  37. Complexity of MC2 • Theorem: Given a Büchi automaton B having diameter D, error margin ε, and confidence ratio δ, MC2 runs in time $O(N \cdot D)$ and uses space $O(D)$, where $N = \ln(\delta) / \ln(1 - \varepsilon)$. • Cf. DDFS, which runs in $O(2^{|S| + |\varphi|})$ time for $B = B_S \times B_{\neg\varphi}$.

  38. Alternative Sampling Strategies • Multilasso sampling: ignores backedges that do not lead to an accepting lasso. $\Pr[L_n] = O(2^{-n})$ • Probabilistic systems: there is a natural way to assign a probability to a RL. • Input partitioning: partition the input into classes that trigger the same behavior (guards). [Figure: chain of states 0, 1, …, n-1, n.]
