
DDEC: Data-Driven Equivalence Checking

Rahul Sharma, Eric Schkufza, Berkeley Churchill, Alex Aiken


Presentation Transcript


  1. Rahul Sharma, Eric Schkufza, Berkeley Churchill, Alex Aiken DDEC: Data-Driven Equivalence Checking

  2. Equivalence checking • Prove two programs are equivalent • Compiler optimizations • Validate refactorings • Cross-checking different implementations • Old and well-studied problem • Undecidable in general • Major challenge: proving equivalence of loops • Straight-line programs are relatively easy

  3. Motivating applications • Prove equivalence of two binaries [Figure: the same source (… while …) compiled by a trustworthy compiler (CompCert, gcc -O0) and by an optimizing compiler (gcc -O3, icc -O3); goal: the confidence of the former with the performance of the latter.]

  4. Stochastic superoptimization [Figure labels: Straight Line Code; … while …; Trustworthy Compiler (CompCert, gcc -O0); STOKE (ASPLOS 13); Random mutations.]

  5. Previous work • Do not support “while” loops: [CHR00], [FH02], [FH05], [AEF+05], [SBC+05], [MSF06] • Do not reason about termination: [SDE+08], [GS09], [RE11], [LHM+12], [PY13], [LMS+13] • Translation validation: [Nec00], [GZB05], … • Need information from the compiler

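  6. Simulation relation • Decompose the proof [Figure: Target and Rewrite side by side. Target: movq 8(rsp), rdi; loop annotated #rdi != 0: movq 8(rsp), rdi; decq rdi; movq rdi, 8(rsp); then retq. Rewrite: movq 8(rsp), r9; loop annotated #r9 != 0: decq r9; then retq. Cutpoints a, b, c in the target are paired with a', b', c' in the rewrite: (a, a') states equal; (b, b') 8(rsp) = rdi = r9'; (c, c') live outs equal.]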
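As a rough illustration of the relation on slide 6, the two listings can be modelled in Python and the relation checked on concrete runs. This is only a sketch: the function names, the 64-bit wraparound, and the reading of the `#rdi != 0` / `#r9 != 0` annotations as loop guards are assumptions, since the slide omits the jump instructions. It tests the relation on sample inputs; it does not prove it.

```python
MASK = (1 << 64) - 1  # assumed 64-bit registers and memory cell


def target_cutpoints(x):
    """Model of the target; records (label, mem, rdi) at cutpoints b and c."""
    mem = x & MASK                # the stack slot 8(rsp)
    rdi = mem                     # movq 8(rsp), rdi
    states = []
    while True:
        states.append(("b", mem, rdi))   # loop head
        if rdi == 0:              # assumed guard: #rdi != 0
            break
        rdi = mem                 # movq 8(rsp), rdi
        rdi = (rdi - 1) & MASK    # decq rdi
        mem = rdi                 # movq rdi, 8(rsp)
    states.append(("c", mem, rdi))       # retq
    return states


def rewrite_cutpoints(x):
    """Model of the rewrite; records (label, mem, r9) at cutpoints b' and c'."""
    mem = x & MASK
    r9 = mem                      # movq 8(rsp), r9
    states = []
    while True:
        states.append(("b'", mem, r9))   # loop head
        if r9 == 0:               # assumed guard: #r9 != 0
            break
        r9 = (r9 - 1) & MASK      # decq r9
    states.append(("c'", mem, r9))       # retq
    return states


def relation_holds(x):
    """Check on one input that cutpoints pair up and 8(rsp) = rdi = r9' at (b, b')."""
    t, r = target_cutpoints(x), rewrite_cutpoints(x)
    if len(t) != len(r):
        return False
    for (t_label, t_mem, rdi), (r_label, _r_mem, r9) in zip(t, r):
        if t_label + "'" != r_label:
            return False          # executions are not synchronized
        if t_label == "b" and not (t_mem == rdi == r9):
            return False          # invariant violated at the loop head
    return True
```

Calling `relation_holds(x)` for a handful of small inputs gives evidence that the relation is plausible; the actual proof obligation is discharged separately by a decision procedure.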

  7. Inference • Given a simulation relation, proofs for loops reduce to proofs for loop-free fragments • Use decision procedures • Main challenge: infer a simulation relation • Infer synchronization points • Infer invariants • We use compilers as black boxes • Mine relations from concrete executions

  8. Runtime information • Run some tests to get data • From executions, unit tests, random tests, etc.

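  9. Runtime information • Ensure the loops run for an equal number of iterations • Use data to align the loops of the target and the rewrite [Figure: the target loop body B runs 2n times and the rewrite loop body B' runs n times; pairing two target iterations (B;B, n times) with one iteration of B' aligns the loops. Both end in retq.]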

  10. Runtime information • Attempt to detect synchronization points • Number of times program points are executed • Values align [Figure: the Target and Rewrite listings from slide 6, with their program points annotated by execution counts (1, n, n+1).]
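The execution-count heuristic on slide 10 can be sketched as follows. The trace format (a list of program-point labels per test), the label names, and the helper functions are hypothetical, chosen only for illustration; the sketch covers the counting criterion, not the "values align" criterion.

```python
from collections import Counter


def target_trace(n):
    """Hypothetical trace of the target for a loop that iterates n times."""
    return ["a"] + ["b", "body"] * n + ["b", "c"]


def rewrite_trace(n):
    """Hypothetical trace of the rewrite for the same input."""
    return ["a'"] + ["b'", "body'"] * n + ["b'", "c'"]


def candidate_sync_points(target_traces, rewrite_traces):
    """Pair program points whose execution counts agree on every test."""
    t_counts = [Counter(t) for t in target_traces]
    r_counts = [Counter(r) for r in rewrite_traces]
    t_points = sorted(set().union(*target_traces))
    r_points = sorted(set().union(*rewrite_traces))
    pairs = []
    for p in t_points:
        for q in r_points:
            if all(tc[p] == rc[q] for tc, rc in zip(t_counts, r_counts)):
                pairs.append((p, q))
    return pairs


tests = [2, 5]  # loop iteration counts observed in two test runs
pairs = candidate_sync_points([target_trace(n) for n in tests],
                              [rewrite_trace(n) for n in tests])
```

On these two tests the loop heads `b` and `b'` (executed n+1 times) are paired only with each other. The entry and exit points all execute once, so counts alone cannot tell `a` from `c`; that ambiguity is what the second criterion, comparing observed values, is meant to resolve.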

  11. Invariants • Invariants are restricted to equalities • Infer invariants from observed data values [Figure: Target listing: movq 8(rsp), rdi; loop annotated #rdi != 0: movq 8(rsp), rdi; decq rdi; movq rdi, 8(rsp); then retq.]

  12. Invariants • Invariants are restricted to equalities • Infer invariants from observed data values [Figure: Rewrite listing: movq 8(rsp), r9; loop annotated #r9 != 0: decq r9; then retq.]

  13. Linear algebra • Mine all equalities • Find all $w$ s.t. $Aw = 0$, where the rows of $A$ are the observed data values • Nullspace or kernel of $A$
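To make the nullspace step concrete, here is a minimal sketch that mines equalities from a small data matrix. It works over the rationals with Gauss-Jordan elimination from the standard library, which is a simplification: the slides name libIML, an integer matrix library, for the real computation. The column layout `[8(rsp), rdi, r9, 1]` and the sample rows are assumptions made for the example.

```python
from fractions import Fraction


def nullspace(rows):
    """Basis of {w : A w = 0} over the rationals, via Gauss-Jordan elimination."""
    a = [[Fraction(v) for v in row] for row in rows]
    ncols = len(a[0])
    pivots = []                       # pivot column of each reduced row
    r = 0
    for c in range(ncols):
        # Find a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, len(a)) if a[i][c] != 0), None)
        if p is None:
            continue                  # no pivot here: c is a free column
        a[r], a[p] = a[p], a[r]
        piv = a[r][c]
        a[r] = [v / piv for v in a[r]]
        for i in range(len(a)):
            if i != r and a[i][c] != 0:
                f = a[i][c]
                a[i] = [vi - f * vr for vi, vr in zip(a[i], a[r])]
        pivots.append(c)
        r += 1
        if r == len(a):
            break
    free = [c for c in range(ncols) if c not in pivots]
    basis = []
    for fcol in free:
        # One basis vector per free column: set it to 1, solve for the pivots.
        w = [Fraction(0)] * ncols
        w[fcol] = Fraction(1)
        for row_idx, pcol in enumerate(pivots):
            w[pcol] = -a[row_idx][fcol]
        basis.append(w)
    return basis


# Each row is one state observed at the paired loop heads: [8(rsp), rdi, r9, 1].
data = [[3, 3, 3, 1],
        [2, 2, 2, 1],
        [1, 1, 1, 1],
        [0, 0, 0, 1]]
basis = nullspace(data)
# basis == [[-1, 1, 0, 0], [-1, 0, 1, 0]], i.e. rdi = 8(rsp) and r9 = 8(rsp)
```

The constant column lets the same computation find equalities with an offset, such as x = y + 1. Every vector in the nullspace is an equality that holds on all observed states, so it is only a candidate invariant until it has been checked.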

  14. Check simulation relation • The executions are synchronized • The invariants are maintained [Figure: the Target and Rewrite listings from slide 6 with paired cutpoints (a, a'), (b, b'), (c, c'); states equal at (a, a') and live outs equal at (c, c').]

  15. Check simulation relation • The executions are synchronized • The invariants are maintained • Queries in quantifier-free bit-vector arithmetic • SMT solvers are complete for this theory! • Incorporate counter-examples in relations • Sound but not complete • If checking succeeds then the programs are equivalent • Can fail to infer a sound simulation relation
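The two checks, synchronization and invariant maintenance, can be illustrated on the running example without an SMT solver by enumerating every state at a small bit width. This brute-force search is a stand-in chosen to keep the sketch self-contained; it is not the bit-vector SMT query the slides describe, and the 4-bit width, function names, and loop-guard reading of the `#rdi != 0` / `#r9 != 0` annotations are assumptions.

```python
BITS = 4                      # small width so exhaustive search is cheap
MASK = (1 << BITS) - 1


def target_step(mem, rdi):
    """One target loop iteration: load, decrement, store."""
    rdi = mem                 # movq 8(rsp), rdi
    rdi = (rdi - 1) & MASK    # decq rdi
    mem = rdi                 # movq rdi, 8(rsp)
    return mem, rdi


def rewrite_step(r9):
    """One rewrite loop iteration."""
    return (r9 - 1) & MASK    # decq r9


def inv_full(mem, rdi, r9):
    return mem == rdi == r9   # the relation 8(rsp) = rdi = r9'


def inv_weak(mem, rdi, r9):
    return rdi == r9          # deliberately too weak: forgets memory


def check(invariant):
    """Return a counterexample (mem, rdi, r9), or None if both checks pass."""
    for mem in range(1 << BITS):
        for rdi in range(1 << BITS):
            for r9 in range(1 << BITS):
                if not invariant(mem, rdi, r9):
                    continue
                # Synchronization: both loops must agree on whether to continue.
                if (rdi != 0) != (r9 != 0):
                    return (mem, rdi, r9)
                if rdi != 0:
                    # Maintenance: one iteration of each loop preserves the invariant.
                    new_mem, new_rdi = target_step(mem, rdi)
                    new_r9 = rewrite_step(r9)
                    if not invariant(new_mem, new_rdi, new_r9):
                        return (mem, rdi, r9)
    return None
```

Here `check(inv_full)` finds no counterexample, while `check(inv_weak)` returns a state in which memory disagrees with the registers. A state of that kind is the sort of counter-example that can be fed back into inference as additional data.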

  16. Limitations • Insufficient data to infer a sound relation • Expressiveness of invariants • Inequalities, quantifiers, etc. • Expressiveness of SMT solver • Floating point, multiply, divide, etc.

  17. Implementation • Run tests and generate data • https://github.com/eschkufz/x64asm • Nullspace computation • libIML: integer matrix library • SMT solver: Z3

  18. Case studies • Compute kernel inside OpenSSL • Validating CompCert against gcc • Stochastic optimization for loops

  19. OpenSSL • Multiplication kernel • Extensive performance tests • Run the kernel ~15 million times • Choose 16 random tests for inference • Compile with gcc -O0 and gcc -O3 • Successfully prove equivalence

  20. Cross compiler validation

  21. STOKE

  22. Optimization results

  23. Conclusion • Prove equivalence of loops in two stages • Infer simulation relation • Check the inferred relation using SMT solvers • Use runtime data for inference • No change required to the compilers • Better verifiers lead to better optimizers

  24. Inference from concrete states • M. D. Ernst, J. H. Perkins, P. J. Guo, S. McCamant, C. Pacheco, M. S. Tschantz, and C. Xiao. The Daikon system for dynamic detection of likely invariants. Sci. Comput. Program., 69(1-3):35–45, 2007 • T. Nguyen, D. Kapur, W. Weimer, and S. Forrest. Using dynamic analysis to discover polynomial and array invariants. ICSE 2012 • P. Garg, C. Löding, P. Madhusudan, D. Neider: Learning Universally Quantified Invariants of Linear Data Structures. CAV 2013 • R. Sharma, S. Gupta, B. Hariharan, A. Aiken, P. Liang, A. V. Nori: A Data Driven Approach for Algebraic Loop Invariants. ESOP 2013 • R. Sharma, S. Gupta, B. Hariharan, A. Aiken, A. V. Nori: Verification as Learning Geometric Concepts. SAS 2013 • A.V. Nori, R. Sharma: Termination proofs from tests. ESEC/SIGSOFT FSE 2013
