CS412/413


  1. CS412/413 Introduction to Compilers and Translators
     May 3, 1999
     Lecture 34: Compiler-like Systems
     [Title-slide figure labels: bytecode interpreter, src-to-src translator, bytecode compiler, JIT]

  2. Administration
     • Project code must be turned in Friday even if you haven’t finished the programming assignment -- at minimum, turn in something that can compile programs.
     • If parts of the program are incomplete, include an explanation of the design and of the components that are working.
     CS 412/413 Introduction to Compilers and Translators -- Spring '99 Andrew Myers

  3. A Compiler
     [Figure: the “Compiler Classic” pipeline. Stages, in order: lexical analysis, parsing, HIR optimization, intermediate code generation, MIR optimization, code generation, LIR optimization, linking & loading. The pipeline is divided into a front end and a back end. Labels alongside it mark the alternative systems: source-to-source translator, bytecode compiler, JIT compiler, interpreter.]

  4. Other options
     • Compile from source to another high-level language (source-to-source translator) and let the other compiler deal with code generation, etc.
     • Compile from source to an intermediate code format for which a back end already exists (ucode, RTL, LCC, ...)
     • Compile from source to an intermediate code format executable by an interpreter:
       • abstract syntax tree
       • bytecodes
       • threaded code (call- or pointer-threaded)
     • Advantage: architecture independence

  5. Source-to-source translator
     • Idea: choose a well-supported high-level language (e.g. C, C++, Java) as the target
     • Translate the AST to high-level language constructs instead of to IR, then pass the translated code to the underlying compiler
     • Advantage: easy, and can leverage good underlying compiler technology. Examples: C++ (to C), PolyJ (to Java), Toba (JVM to C)
     • Disadvantages: the target language won’t support all features, optimization is harder in the target language, and the target language may impose extra checks

  6. Compiling to C
     • C doesn’t impose extra checks, is reasonably close to assembly (but can’t support static exception tables!), and is widely available
     • Mismatch: C doesn’t allow statements underneath expressions, so we need to translate and canonicalize in one step
     • The translation of an expression into C is:
       • a sequence of statements to be executed
       • an expression to be evaluated afterward
       E → S1; …; Sn , E’

  7. Translation rules
     • Translation can still be performed by a recursive traversal of the AST
     • IotaC rules:

       assignment:   E → S , E’
                     ------------------------
                     id = E; → S , id = E’;

       block:        Si → Si’, Ei’
                     ---------------------------------------
                     { S1; …; Sn } → S1’; …; Sn’ , En’
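
The two rules above can be sketched as a recursive traversal in Java. This is a minimal illustration, not the course's IotaC translator: the node types (`Id`, `Num`, `Assign`, `Block`), the `Translated` pair, and the class name `ToC` are all hypothetical. It also assumes that each non-final Ei’ in a block is emitted as an expression statement so that side effects such as assignment are kept, and that an empty block has the value `0`.

```java
import java.util.ArrayList;
import java.util.List;

final class ToC {
    // Hypothetical mini-AST; these names are illustrative only.
    interface Expr {}
    record Id(String name) implements Expr {}
    record Num(int value) implements Expr {}
    record Assign(String id, Expr rhs) implements Expr {}
    record Block(List<Expr> body) implements Expr {}

    // Result of translating one expression: C statements to run first,
    // then a C expression to evaluate afterward (the "S , E'" pair).
    record Translated(List<String> stmts, String expr) {}

    static Translated translate(Expr e) {
        if (e instanceof Id id) {
            return new Translated(List.of(), id.name());
        }
        if (e instanceof Num n) {
            return new Translated(List.of(), Integer.toString(n.value()));
        }
        if (e instanceof Assign a) {
            // Assignment rule: keep the statements of E, wrap E' in "id = ...".
            Translated rhs = translate(a.rhs());
            return new Translated(rhs.stmts(), a.id() + " = " + rhs.expr());
        }
        if (e instanceof Block b) {
            // Block rule: concatenate the Si'; the last Ei' is the block's value.
            List<String> stmts = new ArrayList<>();
            String last = "0"; // assumption: value of an empty block
            for (int i = 0; i < b.body().size(); i++) {
                Translated t = translate(b.body().get(i));
                stmts.addAll(t.stmts());
                if (i < b.body().size() - 1) {
                    stmts.add(t.expr() + ";"); // assumption: keep side effects
                } else {
                    last = t.expr();
                }
            }
            return new Translated(stmts, last);
        }
        throw new IllegalArgumentException("unknown node: " + e);
    }
}
```

For the source expression `x = { a = 1; a }`, this sketch hoists the statement `a = 1;` out from under the assignment and leaves the C expression `x = a`, which is the "translate and canonicalize in one step" idea from slide 6.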

  8. Translating to Java
     • Same problems as C, plus: Java is type-safe (good in a HLL, not so good in an intermediate language!)
     • May need to use cast and instanceof expressions in generated code: slow in Java

  9. Back Ends
     • Several standard intermediate code formats exist with back ends for various architectures -- can reuse back ends
       • p-code: very old stack machine format
       • UCODE: old Stanford/MIPS stack machine format
       • Java bytecode: new stack machine format
       • RTL: GNU gcc, etc.
       • SUIF: Stanford format for optimization
       • LCC: Lightweight C compiler

  10. Intermediate code formats
     • Quadruples
       • compact, similar to machine code, good for standard optimization techniques
     • Stack-machine
       • easy to generate code for
       • not compact (contrary to popular opinion)
       • can be converted back into quadruples
       • used by some (sort of) high-level languages: FORTH, PostScript, HP calculators

  11. Stack machine format
     • Code is a sequence of stack operations (not necessarily on the same stack as the call stack)
       push CONST : push CONST onto the top of the stack
       pop : discard the top of the stack
       store : store the element just below the top of the stack into the memory location specified by the top of the stack
       load : replace the top of the stack with the contents of the memory location it points to
       +, *, /, -, … : replace the top two elements with the result of the operation

  12. Generating code
     • Stack operations mostly don’t name operands (they are implicit): can be encoded in 1 byte
     • An expression is translated to code that leaves its value on the top of the stack
     • Translation of E1 + E2:
       ⟦E1 + E2⟧ → ⟦E1⟧; ⟦E2⟧; +
     • Translation of id = E:
       ⟦id = E⟧ → ⟦E⟧; push addr(id); store
     • Good for very simple code generation
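
A minimal Java sketch of the two translation rules above, emitting stack code as strings. The node types and the class name `StackGen` are hypothetical. The translation of a variable read as `push addr(id); load` is an assumption based on the `load` operation described on slide 11; the slides do not give that rule explicitly.

```java
import java.util.ArrayList;
import java.util.List;

final class StackGen {
    // Hypothetical AST node types, for illustration only.
    interface Expr {}
    record Num(int value) implements Expr {}
    record Var(String name) implements Expr {}
    record Add(Expr left, Expr right) implements Expr {}
    record Assign(String id, Expr rhs) implements Expr {}

    // Appends stack code for e to out. For Num, Var, and Add the code
    // leaves the value of e on top of the stack.
    static void gen(Expr e, List<String> out) {
        if (e instanceof Num n) {
            out.add("push " + n.value());
        } else if (e instanceof Var v) {
            // Assumption: a variable read is "push its address, then load".
            out.add("push addr(" + v.name() + ")");
            out.add("load");
        } else if (e instanceof Add a) {
            // [[E1 + E2]] -> [[E1]]; [[E2]]; +
            gen(a.left(), out);
            gen(a.right(), out);
            out.add("+");
        } else if (e instanceof Assign s) {
            // [[id = E]] -> [[E]]; push addr(id); store
            gen(s.rhs(), out);
            out.add("push addr(" + s.id() + ")");
            out.add("store");
        } else {
            throw new IllegalArgumentException("unknown node: " + e);
        }
    }

    static List<String> compile(Expr e) {
        List<String> out = new ArrayList<>();
        gen(e, out);
        return out;
    }
}
```

Note that the generator never allocates registers or names temporaries; each node only appends its own operation after its operands' code, which is why stack code is so easy to emit.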

  13. Verbosity
     • Values get “trapped” down low on the stack, especially with subexpression elimination
     • Need an instruction that re-pushes the element at a known stack index onto the top; this instruction is used often
     • Might as well have abstract register operands!
     • Result: less compact than a register-based format; extra copies of data too

  14. Stack machine → register machine
     • At each point in the code, keep track of the stack depth (if possible)
     • Assign temporaries according to depth
     • Replace stack operations with quadruples using these temporaries

       Stack code     Quadruple         Stack afterward (slots 0 1 2)
       push a ; 0     t0 = a            a
       push b ; 1     t1 = b            a  b
       push c ; 2     t2 = c            a  b  c
       *      ; 1     t1 = t1 * t2      a  b*c
       +      ; 0     t0 = t0 + t1      a+b*c
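
The depth-tracking conversion above can be sketched in Java as follows. This is an illustrative sketch, assuming straight-line code made only of `push`, `pop`, and binary operators written as plain strings; the class name `StackToQuads` and the string encoding of instructions are hypothetical. Code with branches would additionally need the stack depth to agree at every join point.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class StackToQuads {
    private static final Set<String> BINARY_OPS = Set.of("+", "-", "*", "/");

    // Converts straight-line stack code into quadruples, using temporary
    // t<k> to hold whatever value sits in stack slot k.
    static List<String> convert(List<String> stackCode) {
        List<String> quads = new ArrayList<>();
        int depth = 0; // number of values currently on the simulated stack
        for (String insn : stackCode) {
            if (insn.startsWith("push ")) {
                // The pushed value lands in slot 'depth'.
                quads.add("t" + depth + " = " + insn.substring(5));
                depth++;
            } else if (insn.equals("pop")) {
                if (depth < 1) throw new IllegalStateException("stack underflow");
                depth--; // the temporary simply becomes dead
            } else if (BINARY_OPS.contains(insn)) {
                if (depth < 2) throw new IllegalStateException("stack underflow");
                depth--; // two operands are replaced by one result
                String dst = "t" + (depth - 1);
                quads.add(dst + " = " + dst + " " + insn + " t" + depth);
            } else {
                throw new IllegalArgumentException("unsupported: " + insn);
            }
        }
        return quads;
    }
}
```

Feeding it the slide's sequence `push a; push b; push c; *; +` reproduces the quadruples in the table above.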

  15. JIT compilers
     • Particularly widely available back end with a well-defined intermediate code (JVM bytecode)
     • Generate code by reconstructing registers as discussed
     • Compilation is done on the fly: generating code quickly is essential, so generated code quality is usually low
     • HotSpot: new Sun JIT. High-quality optimization (esp. inlining and specialization), but used sparingly

  16. Interpreters
     • “Why generate machine code at all? Just run it. Processors are really fast”
     • Options (with approximate slowdown):
       • token interpreters (parsing on the fly) -- really slow (>1000x)
       • AST interpreters -- 300x
       • threaded interpreters -- 20-50x
       • bytecode interpreters -- 10-30x

  17. AST interpreters
     “Yet another recursive traversal”
     • For every node type in the AST, add a method Object evaluate(RunTimeContext r)
     • The evaluate method is implemented recursively:
       Object PlusNode.evaluate(r) { return left.evaluate(r).plus(right.evaluate(r)); }
     • Variables, etc. are looked up in r; some help from the AST yields big speed-ups (e.g. pre-computed variable locations)
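
A runnable Java sketch of the idea above, including pre-computed variable locations. To stay short it makes several assumptions that differ from the slide: values are plain `int` rather than `Object`, `RunTimeContext` is reduced to an array of variable slots, and the node names (`Const`, `VarRef`, `Plus`, `Assign`) are hypothetical. Slot indices are assumed to have been assigned by an earlier pass, so no name lookup happens at run time.

```java
final class AstInterp {
    // Stand-in for the slide's RunTimeContext: variables live in numbered slots.
    static final class RunTimeContext {
        final int[] slots;
        RunTimeContext(int slotCount) { slots = new int[slotCount]; }
    }

    // Every node type implements evaluate, as on the slide.
    interface Node { int evaluate(RunTimeContext r); }

    record Const(int value) implements Node {
        public int evaluate(RunTimeContext r) { return value; }
    }

    // The slot index is computed before execution (pre-computed location).
    record VarRef(int slot) implements Node {
        public int evaluate(RunTimeContext r) { return r.slots[slot]; }
    }

    record Plus(Node left, Node right) implements Node {
        public int evaluate(RunTimeContext r) {
            return left.evaluate(r) + right.evaluate(r);
        }
    }

    record Assign(int slot, Node rhs) implements Node {
        public int evaluate(RunTimeContext r) {
            int v = rhs.evaluate(r);
            r.slots[slot] = v;
            return v; // assumption: an assignment evaluates to the stored value
        }
    }
}
```

Usage: after `new AstInterp.Assign(0, new AstInterp.Const(3)).evaluate(r)`, evaluating `new AstInterp.Plus(new AstInterp.VarRef(0), new AstInterp.Const(4))` in the same context `r` gives 7.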

  18. Threaded interpreters
     • Code is represented as a bunch of blocks of memory connected by pointers. Blocks are either code blocks or pointer blocks.
     • Execution is (mostly) a depth-first traversal
     • Each block starts with a single jump instruction
     [Figure: linked code and pointer blocks; labels: JMP, next, machine code, RET]

  19. Threaded interpreters
     • Can call blocks directly regardless of kind
     • Conditional branches and other control structures are embedded in special machine code blocks that munge the stack
     • Interpreter proper is ~10 instructions of code: very space-efficient!
     • Long favored by FORTH fans: whole language runtime fits in 4K memory

  20. Call-threaded interpreters
     • Instead of pointers in pointer blocks, store CALL instructions
     • Depth-first traversal is done by the processor automatically
     • Faster; only trivial code generation is required
     [Figure: blocks of CALL instructions ending in RET, leading to machine code blocks ending in RET]

  21. Bytecode interpreters
     • Problem with threaded interpreters: operations with immediate arguments require multiple words (code bloat)
     • Observation: most interpreter instructions can be encoded in less than a word, many even in just a byte (esp. with a stack-machine model)
     • Tradeoff: interpreter size vs. executable code size

  22. Implementing a bytecode interpreter
     • A bytecode interpreter simulates a simple architecture (either a stack or a register machine)
     • Interpreter state:
       • current code pointer
       • current simulated return stack
       • current registers or stack & stack pointer
     • Interpreter code is a big loop containing a switch over kinds of bytecode instructions
       • one big function: optimization techniques work well. Ideally: no recursion on function calls
     • Result: about 10x slowdown if done right
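
The loop-and-switch structure described above might look like this in Java for a stack machine. It is a sketch under stated assumptions: the opcode numbering, the one-byte immediate for `PUSH`, the fixed 64-entry operand stack, and the class name `BytecodeVM` are all invented for illustration. Function calls, and therefore the simulated return stack, are left out to keep it short.

```java
final class BytecodeVM {
    // Hypothetical one-byte opcodes.
    static final byte PUSH = 0, ADD = 1, MUL = 2, LOAD = 3, STORE = 4, HALT = 5;

    // Runs code until HALT and returns the top of the stack (0 if empty).
    static int run(byte[] code, int[] memory) {
        int[] stack = new int[64]; // simulated operand stack
        int sp = 0;                // stack pointer: index of next free slot
        int pc = 0;                // current code pointer
        while (true) {
            byte op = code[pc++];
            switch (op) {
                case PUSH: // operand is a one-byte immediate
                    stack[sp++] = code[pc++];
                    break;
                case ADD: {
                    int b = stack[--sp];
                    int a = stack[--sp];
                    stack[sp++] = a + b;
                    break;
                }
                case MUL: {
                    int b = stack[--sp];
                    int a = stack[--sp];
                    stack[sp++] = a * b;
                    break;
                }
                case LOAD: // replace address on top with the value it points to
                    stack[sp - 1] = memory[stack[sp - 1]];
                    break;
                case STORE: { // top is the address, the element below is the value
                    int addr = stack[--sp];
                    int value = stack[--sp];
                    memory[addr] = value;
                    break;
                }
                case HALT:
                    return sp > 0 ? stack[sp - 1] : 0;
                default:
                    throw new IllegalStateException("bad opcode " + op);
            }
        }
    }
}
```

Usage: the program `{PUSH, 2, PUSH, 3, PUSH, 4, MUL, ADD, HALT}` computes 2 + 3*4 and returns 14. Keeping `pc`, `sp`, and `stack` as locals of one function, rather than fields reached through calls, is what lets an optimizing compiler or JIT keep the interpreter state in registers.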

  23. Summary
     • Building a new system for executing code doesn’t require construction of a full compiler
     • Cost-effective strategies: source-to-source translation or translation to an existing intermediate code format
     • Most material covered in this course is still helpful for these approaches
     • High performance: translate to C
     • Portability, extensibility: translate to Java or JVM
