
Translation Validation for an Optimizing Compiler

Guy Erez. Based on the article by George C. Necula (ACM SIGPLAN 2000). Advanced Programming Languages Seminar, Winter 2000.


Presentation Transcript


  1. Translation Validation for an Optimizing Compiler • Guy Erez • Based on the article by George C. Necula (ACM SIGPLAN 2000) • Advanced Programming Languages Seminar, Winter 2000

  2. In a Nutshell • The Problem: Verify that the optimized and source code are equivalent • Partial (heuristic) Solution: Independently prove the validity of each translation pass • Motivation: Optimizer Testing

  3. Outline • Introduction • Intermediate Language • An extensive example • Simulation Relation • Execution Pair • Equivalence Checking • Branch Navigation • Results and Limitations

  4. Methods of Proving Compiler Correctness • Prove the general correctness of the compiler: • absolute • tedious • impractical for large programs • highly dependent on the compiler code

  5. Methods of Proving Compiler Corr. (cont.) • Show that each translation phase was valid • weaker • proof per program • applicable for large programs • independent of compiler code

  6. Compilation Process • Source Code → Intermediate Language (IL) → Target Code

  7. Optimization Process • [Diagram: optimization passes take IL Code 0 to IL Code 1 and on to IL Code n, with the Validator attached to each pass]

  8. The IL in GNU C (subset) • Instructions: • Expressions: • Operators:

  9. An Example • extern int g; extern int a[…]; main() { int n = …; /* n contains the length of the array */ int i; for (i = 0; i < n; i++) a[i] = g*i + 3; return i; }

  10. And in IL… • for (i = 0; i < n; i++) a[i] = g*i + 3; return i;

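  11. After Transformation… • Use registers • Transform the while loop into a repeat loop • [Diagram: source and target blocks joined by "?<==>", marking equivalences that still have to be proved]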
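The transformation on this slide can be illustrated with a small C sketch. It is not the IL from the talk: `N_MAX`, the initial value of `g`, the local `r_g` and the helper `same_behaviour` are assumptions made up for this illustration, since the slides elide the array length and show the IL only as images. The sketch places the top-tested loop next to a guarded bottom-tested (repeat) loop that caches `g` in a local standing in for a register.

```c
#include <assert.h>

/* Hypothetical stand-ins for the slide's globals; the slides elide the sizes. */
enum { N_MAX = 8 };
int g = 5;
int a[N_MAX];

/* Source form: top-tested loop, as on the example slide. */
int source_loop(int n)
{
    int i;
    assert(n >= 0 && n <= N_MAX);
    for (i = 0; i < n; i++)
        a[i] = g * i + 3;
    return i;
}

/* Target form (assumed shape): g cached in a local "register" and the
   loop rotated into a guarded repeat loop. */
int target_loop(int n)
{
    int i = 0;
    int r_g = g;
    assert(n >= 0 && n <= N_MAX);
    if (i < n) {
        do {
            a[i] = r_g * i + 3;
            i++;
        } while (i < n);
    }
    return i;
}

/* Runs both forms on one concrete n and compares return value and array.
   Returns 1 when they agree, 0 otherwise. */
int same_behaviour(int n)
{
    int saved[N_MAX];
    int k, rs, rt;

    rs = source_loop(n);
    for (k = 0; k < n; k++) {
        saved[k] = a[k];
        a[k] = 0;              /* clear so the target run must rewrite it */
    }
    rt = target_loop(n);
    if (rs != rt)
        return 0;
    for (k = 0; k < n; k++)
        if (saved[k] != a[k])
            return 0;
    return 1;
}
```

Calling `same_behaviour(5)` compares the two forms on a single input only. That is testing, not validation: the approach in the talk instead proves the equivalences symbolically for every execution pair.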

  12. Equivalence • x1,…,xn: variables in the source • y1,…,ym: variables in the target • Variable equivalence: x1 = y3 • Expression equivalence: x1+x2 = y3+6

  13. Simulation Relation • A set of equivalences between a source block and a target block

  14. Execution Pair • Definition: An execution path in the source and its corresponding path in the target • [Diagram: a source path beside its target path]

  15. Checking Equivalence • Equivalence is checked at the end of a specific execution pair • A variable's value after the run is marked with a prime • Symbolic substitution: for example, x’ = x+1 on the source path and y’ = y*3 on the target path

  16. Equivalence Simplification • An equivalence can be simplified using: • Arithmetic rules • Already proven equivalences • Example: If x’ = x+1 and y’ = y*5 then: 3*x’ = y’ → 3*(x+1) = y*5 → 3*x+3 = y*5 • An equivalence holds if it can be simplified to an already proven equivalence
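The substitution and simplification steps on this slide can be sketched in C for the restricted case of linear expressions over one variable. The type `struct linear` and the functions `lin`, `substitute`, `same_linear` and `equivalence_holds` are hypothetical names introduced for this illustration; they are not taken from the article, which handles a much richer expression language.

```c
#include <assert.h>

/* A linear expression coef*var + konst over one named variable. */
struct linear {
    char var;    /* variable name, e.g. 'x' or 'y' */
    long coef;
    long konst;
};

struct linear lin(char var, long coef, long konst)
{
    struct linear e;
    e.var = var;
    e.coef = coef;
    e.konst = konst;
    return e;
}

/* e is written over the primed variable v'; def gives v' = def.coef*v + def.konst.
   The result is e rewritten over the unprimed v, with constants folded. */
struct linear substitute(struct linear e, struct linear def)
{
    assert(e.var == def.var);
    return lin(def.var, e.coef * def.coef, e.coef * def.konst + e.konst);
}

int same_linear(struct linear p, struct linear q)
{
    return p.var == q.var && p.coef == q.coef && p.konst == q.konst;
}

/* The equivalence lhs = rhs (over primed variables) holds if, after
   substitution, it matches an already proven equivalence known_lhs = known_rhs. */
int equivalence_holds(struct linear lhs, struct linear lhs_def,
                      struct linear rhs, struct linear rhs_def,
                      struct linear known_lhs, struct linear known_rhs)
{
    return same_linear(substitute(lhs, lhs_def), known_lhs)
        && same_linear(substitute(rhs, rhs_def), known_rhs);
}
```

With the slide's numbers, `substitute(lin('x', 3, 0), lin('x', 1, 1))` rewrites 3*x’ as 3*x+3, and `substitute(lin('y', 1, 0), lin('y', 5, 0))` rewrites y’ as y*5. The check then succeeds only if 3*x+3 = y*5 is among the equivalences already proven at the start of the execution pair.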

  17. Checking Simulation Relations • A relation is correct if for each execution pair entering it, all of its equivalences hold • [Diagram: source variable x, target variable y, relation x = y+1]

  18. Something fishy • What’s the point of proving something using the same rules that created it? • Simpler • Provides an independent perspective on the final code

  19. Showtime… • A. Element #1 holds • B. There is only one execution pair (no conditional) • C. Prove element #2 (trivial)

  20. Element #5 • Two execution pairs: • b3-b1-b2 and b7-b5

  21. Element #5 (cont.) • The other pair: • b3-b1-b3 and b7-b7

  22. Known Equivalences • Equivalences from the start of the run: • Equivalences at the end of run:

  23. Need to Prove • The path condition is correct: • The equivalences hold, mainly:

  24. Elem #5: Path Cond.

  25. Elem #5: The Equivalence • Proof steps: distributivity, commutativity • Q.E.D.

  26. Algorithm Parts • Inferring Simulation Relations • Finding execution pairs • Solving Constraints

  27. Navigating Branches • An optimizer might eliminate or reverse branches • Problem: did branch B’ in the target originate from branch B in the source? • Solution: Use heuristics

  28. A Typical Case

  29. Similarity • The similarity between two branches depends on the similarity of their: • preceding instruction sequences • boolean conditions • two branching sequences

  30. Similarity (cont.) • Formally: • ~ is a numeric relation with values in 0..1 • “and” is multiplication • “or” is maximum
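The two combination rules above can be sketched in C as follows. Only "and is multiplication" and "or is maximum" come from the slide. The function `branch_similarity` and its three parameters are an assumed way of combining scores, since the formal definition did not survive in this transcript: it multiplies the score of the preceding instructions with the better of two readings, one where the branch kept its orientation and one where the optimizer reversed it.

```c
#include <assert.h>

/* Similarity scores are numbers in the range 0..1. */
int is_score(double s)
{
    return s >= 0.0 && s <= 1.0;
}

/* "and" is multiplication. */
double sim_and(double p, double q)
{
    assert(is_score(p) && is_score(q));
    return p * q;
}

/* "or" is maximum. */
double sim_or(double p, double q)
{
    assert(is_score(p) && is_score(q));
    return p > q ? p : q;
}

/* Hypothetical combination, not the article's formula:
   prefix   - similarity of the preceding instruction sequences
   straight - similarity assuming the branch kept its orientation
   reversed - similarity assuming the branch was reversed */
double branch_similarity(double prefix, double straight, double reversed)
{
    return sim_and(prefix, sim_or(straight, reversed));
}
```

Under these assumptions, `branch_similarity(0.5, 0.25, 1.0)` gives 0.5: the reversed reading wins the "or", and the weaker prefix score scales the result down.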

  31. Boolean Similarity • Branches are similar if: • one can be simplified into the other using simple transforms, such as:

  32. Instruction Similarity • Instruction sequences are compared by: • the number of function calls • whether they lead to already related branches (in that case, similarity is 1.0)

  33. Instruction Similarity… • gcc-specific features • IL instruction serial numbers • source line number information (for code duplication detection)

  34. Results • Detected a known bug in gcc 2.7.2.2 • Used on large programs: • Compile time increased by a factor of 4

  35. Limitations • Cannot handle loop unrolling • Cannot resolve all types of equivalences • Produces several false alarms (e.g. the gcc bug was accompanied by 3 false alarms)

  36. Conclusion • Automatically infer equivalences • Uses: • simple rules and substitution • heuristics • Good results • Problems: • false alarms • runtime overhead
