Compiler Architecture: Interpretation vs Compilation

Course Overview
PART I: overview material
  1 Introduction
  2 Language processors (tombstone diagrams, bootstrapping)
  3 Architecture of a compiler
PART II: inside a compiler
  4 Syntax analysis
  5 Contextual analysis
  6 Runtime organization
  7 Code generation
PART III: conclusion
  8 Interpretation
  9 Review

Why Interpretation?
• Compiler: Large overhead before the code can be run
• Alternative: Direct interpretation of the code (immediate execution, no time-consuming compilation)
• Applications:
  • Interactive systems (SQL, shell, etc.)
  • Simple programming languages (Basic, etc.)
  • Scripting languages (Perl, Python, etc.)
  • Programming languages with special requirements (Scheme, Prolog, Smalltalk, etc.)
  • Write once, run once

Two Kinds of Interpreters
• Iterative interpretation: Well suited for quite simple languages, and fast (at most 10 times slower than compiled languages)
• Recursive interpretation: Well suited for more complex languages, but slower (up to 100 times slower than compiled languages)

Compilation and Interpretation
• Due to the slow speed of recursive interpretation, complex languages (such as Java) are often compiled to simpler languages (such as JVM code) that can be interpreted iteratively
[Figure: tombstone diagrams involving Tetris in Java, a Java-->JVM compiler, Tetris in JVM code, and JVM interpreters on x86 and PPC]
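The split between a recursive source language and an iteratively interpreted target can be illustrated with a toy example. This is only a sketch: the expression tree, the three-instruction stack code (PUSH, ADD, MUL), and all class names are hypothetical and are not JVM code. A recursive pass flattens the tree once, and a plain loop then executes the flat code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy languages, for illustration only (not JVM code).
class CompileThenInterpret {
    // Recursive source form: an expression tree.
    static abstract class Expr {}
    static class Num extends Expr {
        final int value;
        Num(int value) { this.value = value; }
    }
    static class Bin extends Expr {
        final char op;              // '+' or '*'
        final Expr left, right;
        Bin(char op, Expr left, Expr right) { this.op = op; this.left = left; this.right = right; }
    }

    // Flat target form: one instruction per slot, no nesting.
    static final int PUSH = 0, ADD = 1, MUL = 2;
    static class Instr {
        final int op, operand;
        Instr(int op, int operand) { this.op = op; this.operand = operand; }
    }

    // Compilation: a single recursive pass that emits postfix stack code.
    static void compile(Expr e, List<Instr> out) {
        if (e instanceof Num) {
            out.add(new Instr(PUSH, ((Num) e).value));
        } else {
            Bin b = (Bin) e;
            compile(b.left, out);
            compile(b.right, out);
            out.add(new Instr(b.op == '+' ? ADD : MUL, 0));
        }
    }

    // Interpretation: an iterative fetch-analyze-execute loop over the flat code.
    static int run(List<Instr> code) {
        int[] stack = new int[code.size()];
        int sp = 0;
        for (int pc = 0; pc < code.size(); pc++) {
            Instr in = code.get(pc);
            switch (in.op) {
                case PUSH: stack[sp++] = in.operand; break;
                case ADD:  sp--; stack[sp - 1] += stack[sp]; break;
                case MUL:  sp--; stack[sp - 1] *= stack[sp]; break;
                default: throw new IllegalStateException("bad opcode " + in.op);
            }
        }
        return stack[sp - 1];
    }

    public static void main(String[] args) {
        Expr e = new Bin('+', new Num(2), new Bin('*', new Num(3), new Num(4)));  // 2 + 3 * 4
        List<Instr> code = new ArrayList<>();
        compile(e, code);
        System.out.println(run(code));  // prints 14
    }
}
```

The recursion is paid once, at compile time; the interpreter loop itself never recurses.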

Iterative Interpretation of Machine Code
• General pattern for iterative interpreters:
    while (true) {
        fetch();
        analyze();
        execute();
    }
• Simulate machine with:
  • Memory (use arrays for storing code and data)
  • I/O (directly)
  • CPU (use variables for registers)

Iterative Interpretation of Machine Code
• Fetch: get the next instruction from the code store array at the position pointed to by the instruction pointer, then increment the instruction pointer
• Analyze: separate the instruction into an opcode and its operands
• Execute: use a switch statement with one case per opcode; update memory and registers as specified by the particular instruction

Hypo: a Hypothetical Abstract Machine
• 4096-word code store and 4096-word data store
• PC: program counter (register), initially 0
• ACC: general purpose accumulator (register), initially 0
• 4-bit opcode and 12-bit operand
• Instruction set:
    Opcode  Instruction  Meaning
    0       STORE d      word at address d := ACC
    1       LOAD d       ACC := word at address d
    2       LOADL d      ACC := d
    3       ADD d        ACC := ACC + word at address d
    4       SUB d        ACC := ACC - word at address d
    5       JUMP d       PC := d
    6       JUMPZ d      if ACC = 0 then PC := d
    7       HALT         stop execution
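A 4-bit opcode and a 12-bit operand fit exactly into one 16-bit word, which is what the analyze step has to take apart. The sketch below assumes the opcode occupies the high 4 bits; the slides do not specify the layout, and the class and method names are hypothetical.

```java
// Assumed layout (not stated in the slides): opcode in bits 15..12, operand in bits 11..0.
class HypoWord {
    // Pack a 4-bit opcode and a 12-bit operand into one 16-bit word.
    static int encode(int op, int d) {
        if (op < 0 || op > 15 || d < 0 || d > 4095) {
            throw new IllegalArgumentException("field out of range");
        }
        return (op << 12) | d;
    }

    // Analyze: split a word back into its two fields.
    static int opcode(int word)  { return (word >>> 12) & 0xF; }
    static int operand(int word) { return word & 0xFFF; }

    public static void main(String[] args) {
        int word = encode(3, 100);  // ADD 100
        System.out.println(opcode(word) + " " + operand(word));  // prints 3 100
    }
}
```

The 12-bit operand is also why 4096 words is the natural size for both stores: every address from 0 to 4095 is reachable.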

Implementation of Hypo in Java
    public class HypoInstruction {
        public byte op;     // opcode field
        public short d;     // operand field
        public static final byte    // possible opcodes
            STOREop=0, LOADop=1, LOADLop=2, ADDop=3,
            SUBop=4, JUMPop=5, JUMPZop=6, HALTop=7;
    }

Implementation of Hypo in Java
    public class HypoState {
        public static final short CODESIZE=4096;
        public static final short DATASIZE=4096;
        public HypoInstruction[] code=new HypoInstruction[CODESIZE];
        public short[] data=new short[DATASIZE];
        public short PC;
        public short ACC;
        public byte status;
        public static final byte RUNNING=0, HALTED=1, FAILED=2;
    }

Implementation of Hypo in Java
    public class HypoInterpreter extends HypoState {
        public void load() {...}    // load program into memory
        public void emulate() {
            PC=0; ACC=0; status=RUNNING;
            do {
                // fetch:
                HypoInstruction instr=code[PC++];
                // analyze:
                byte op=instr.op;
                short d=instr.d;    // a 12-bit operand does not fit in a byte
                // execute:
                switch (op) {
                    ...    // see details on next slide
                }
            } while (status==RUNNING);
        }
    }

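Implementation of Hypo in Java
    // execute (the opcode constants are declared in HypoInstruction,
    // so they are qualified here)
    switch (op) {
        case HypoInstruction.STOREop: data[d]=ACC; break;
        case HypoInstruction.LOADop:  ACC=data[d]; break;
        case HypoInstruction.LOADLop: ACC=d; break;
        case HypoInstruction.ADDop:   ACC+=data[d]; break;
        case HypoInstruction.SUBop:   ACC-=data[d]; break;
        case HypoInstruction.JUMPop:  PC=d; break;
        case HypoInstruction.JUMPZop: if (ACC==0) PC=d; break;
        case HypoInstruction.HALTop:  status=HALTED; break;
        default: status=FAILED;
    }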
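To see the fetch-analyze-execute loop run end to end, here is a compact, self-contained variant of the emulator together with a small Hypo program. It is a sketch, not the course's implementation: instructions are represented as {opcode, operand} pairs instead of HypoInstruction objects, the class name HypoDemo and the sample program sumTo5 are hypothetical, and the emulator returns the final ACC instead of setting a status field.

```java
// Hypothetical condensed Hypo emulator; instructions are {opcode, operand} pairs.
class HypoDemo {
    static final int STORE = 0, LOAD = 1, LOADL = 2, ADD = 3,
                     SUB = 4, JUMP = 5, JUMPZ = 6, HALT = 7;

    // Runs the program and returns the value of ACC when HALT is reached.
    static short run(int[][] code) {
        short[] data = new short[4096];
        short pc = 0, acc = 0;
        while (true) {
            if (pc < 0 || pc >= code.length) {
                throw new IllegalStateException("PC out of range: " + pc);
            }
            int[] instr = code[pc++];       // fetch
            int op = instr[0];              // analyze
            short d = (short) instr[1];
            switch (op) {                   // execute
                case STORE: data[d] = acc; break;
                case LOAD:  acc = data[d]; break;
                case LOADL: acc = d; break;
                case ADD:   acc += data[d]; break;
                case SUB:   acc -= data[d]; break;
                case JUMP:  pc = d; break;
                case JUMPZ: if (acc == 0) pc = d; break;
                case HALT:  return acc;
                default: throw new IllegalStateException("bad opcode " + op);
            }
        }
    }

    // Sample program: sum = 5 + 4 + 3 + 2 + 1, using data[0]=n, data[1]=sum, data[2]=1.
    static int[][] sumTo5() {
        return new int[][] {
            {LOADL, 1}, {STORE, 2},     //  0: data[2] := 1
            {LOADL, 5}, {STORE, 0},     //  2: n := 5
            {LOADL, 0}, {STORE, 1},     //  4: sum := 0
            {LOAD, 0},  {JUMPZ, 14},    //  6: loop: if n = 0 go to 14
            {ADD, 1},   {STORE, 1},     //  8: sum := n + sum
            {LOAD, 0},  {SUB, 2},       // 10: ACC := n - 1
            {STORE, 0}, {JUMP, 6},      // 12: n := ACC; repeat
            {LOAD, 1},  {HALT, 0}       // 14: ACC := sum; stop
        };
    }

    public static void main(String[] args) {
        System.out.println(run(sumTo5()));  // prints 15
    }
}
```

Since Hypo has no immediate subtract, the program first stores the constant 1 in the data store so that SUB can use it.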

Iterative Interpretation of Mini-Basic
• Programming languages can be interpreted iteratively unless they have recursive syntactic structures such as
    Command ::= if Expression then Command else Command
• EBNF for Mini-Basic:
    Program ::= Command*
    Command ::= Variable = Expression
              | read Variable
              | write Variable
              | go Label
              | if Expression RelationalOp Expression go Label
              | stop

Iterative Interpretation of Mini-Basic
• EBNF for Mini-Basic (continued):
    Expression ::= PrimaryExpression
                 | Expression ArithmeticOp PrimaryExpression
    PrimaryExpression ::= Numeral | Variable | ( Expression )
    ArithmeticOp ::= + | - | * | /
    RelationalOp ::= = | \= | < | > | =< | >=
    Variable ::= a | b | c | … | z
    Label ::= Digit Digit*
    Numeral ::=
• The symbol Numeral denotes floating-point literals
• 26 predefined variables: a, b, c, …, z

Mini-Basic Interpreter
• Mini-Basic example code (each command is labelled by its position, which is what the go commands refer to):
    0 read a
    1 b=a/2
    2 go 4
    3 b=(a/b+b)/2
    4 d=b*b-a
    5 if d>=0 go 7
    6 d=0-d
    7 if d>=0.01 go 3
    8 write b
    9 stop
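The example program is easier to follow once its jumps are read as a loop: it repeatedly refines b until b*b is within 0.01 of a, i.e. it approximates the square root of a. A rough Java rendering of the same control flow is shown below; the class and method names are hypothetical, and the comments give the Mini-Basic command positions 0 to 9 in listing order. For negative input the Mini-Basic program (and this rendering) may not terminate, so the sketch is meant for a >= 0.

```java
// Hypothetical Java rendering of the Mini-Basic example; float matches Mini-Basic's values.
class MiniBasicSqrt {
    static float approxSqrt(float a) {       // 0: read a
        float b = a / 2;                      // 1: b=a/2   (2: go 4)
        while (true) {
            float d = b * b - a;              // 4: d=b*b-a
            if (!(d >= 0)) {                  // 5: if d>=0 go 7
                d = 0 - d;                    // 6: d=0-d
            }
            if (!(d >= 0.01f)) {              // 7: if d>=0.01 go 3
                return b;                     // 8: write b   9: stop
            }
            b = (a / b + b) / 2;              // 3: b=(a/b+b)/2
        }
    }

    public static void main(String[] args) {
        System.out.println(approxSqrt(16f));
    }
}
```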

Mini-Basic Interpreter
• Mini-Basic abstract machine:
  • Data store: array of 26 floating-point values
  • Code store: array of commands
• Possible representations for each command:
  • Character string (yields slowest execution)
  • Sequence of tokens (good compromise)
  • AST (yields slowest response time)
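The token-sequence representation means each command is scanned once when the program is loaded, so the interpreter loop never has to re-read raw characters. A minimal scanner sketch follows. It is an assumption-laden illustration: it returns only token spellings (no token kinds), the class name is hypothetical, and it recognises just enough of the Mini-Basic lexicon (letters, numerals, and the one- and two-character operators from the grammar).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical scanner: splits one Mini-Basic command into token spellings.
class MiniBasicScanner {
    static List<String> scan(String line) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < line.length()) {
            char c = line.charAt(i);
            if (Character.isWhitespace(c)) { i++; continue; }
            int start = i;
            if (Character.isDigit(c)) {
                // numeral or label: digits, possibly with a decimal point
                while (i < line.length()
                        && (Character.isDigit(line.charAt(i)) || line.charAt(i) == '.')) i++;
            } else if (Character.isLetter(c)) {
                // keyword (read, write, go, if, stop) or single-letter variable
                while (i < line.length() && Character.isLetter(line.charAt(i))) i++;
            } else if (i + 1 < line.length() && isTwoCharOp(line.substring(i, i + 2))) {
                i += 2;
            } else {
                i++;    // single-character operator or parenthesis
            }
            tokens.add(line.substring(start, i));
        }
        return tokens;
    }

    // Two-character relational operators from the grammar: >=, =<, \=
    static boolean isTwoCharOp(String s) {
        return s.equals(">=") || s.equals("=<") || s.equals("\\=");
    }

    public static void main(String[] args) {
        System.out.println(scan("if d>=0.01 go 3"));  // prints [if, d, >=, 0.01, go, 3]
    }
}
```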

Implementing a Mini-Basic Interpreter in Java
    class Token {
        byte kind;
        String spelling;
    }
    class ScannedCommand {
        Token[] tokens;
    }
    public abstract class Command {
        // a method without a body must itself be declared abstract
        public abstract void execute(MiniBasicState state);
    }

Implementing a Mini-Basic Interpreter in Java
    public class MiniBasicState {
        public static final short CODESIZE=4096;
        public static final short DATASIZE=26;
        public ScannedCommand[] code=new ScannedCommand[CODESIZE];
        public float[] data=new float[DATASIZE];
        public short PC;
        public byte status;
        public static final byte RUNNING=0, HALTED=1, FAILED=2;
    }

Implementing a Mini-Basic Interpreter in Java
    public class MiniBasicInterpreter extends MiniBasicState {
        public void load() {...}    // load program into memory
        public static Command parse(ScannedCommand scannedCom) {...}    // return a Command AST
        public void run() {
            PC=0; status=RUNNING;
            do {
                // fetch:
                ScannedCommand scannedCom=code[PC++];
                // analyze:
                Command analyzedCom=parse(scannedCom);
                // execute:
                analyzedCom.execute(this);
            } while (status==RUNNING);
        }
    }

Implementing a Mini-Basic Interpreter in Java
    public class AssignCommand extends Command {
        byte V;          // left side
        Expression E;    // right side
        public void execute(MiniBasicState state) {    // lower-case name, matching Command.execute
            state.data[V]=E.evaluate(state);
        }
    }
    public class GoCommand extends Command {
        short L;    // destination label
        public void execute(MiniBasicState state) {
            state.PC=L;
        }
    }
    // ReadCommand, WriteCommand, IfCommand, StopCommand, Expression, etc.
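The slide leaves IfCommand and StopCommand to the reader. One possible shape is sketched below. Everything here is an assumption made so the block stands alone: State is a cut-down stand-in for MiniBasicState (a boolean instead of the status byte), Expression is reduced to a one-method interface, and the relational operator is kept as its spelling.

```java
// Hypothetical, self-contained sketch of two Mini-Basic commands the slides omit.
class MiniBasicCommands {
    // Cut-down stand-in for MiniBasicState.
    static class State {
        float[] data = new float[26];
        int pc;
        boolean running = true;
    }

    interface Expression { float evaluate(State s); }

    static abstract class Command { abstract void execute(State s); }

    // if Expression RelationalOp Expression go Label
    static class IfCommand extends Command {
        final Expression e1, e2;
        final String relOp;
        final int label;
        IfCommand(Expression e1, String relOp, Expression e2, int label) {
            this.e1 = e1; this.relOp = relOp; this.e2 = e2; this.label = label;
        }
        void execute(State s) {
            float x = e1.evaluate(s), y = e2.evaluate(s);
            boolean holds;
            switch (relOp) {
                case "=":   holds = x == y; break;
                case "\\=": holds = x != y; break;
                case "<":   holds = x < y;  break;
                case ">":   holds = x > y;  break;
                case "=<":  holds = x <= y; break;
                case ">=":  holds = x >= y; break;
                default: throw new IllegalArgumentException("unknown operator " + relOp);
            }
            if (holds) s.pc = label;    // otherwise fall through to the next command
        }
    }

    // stop
    static class StopCommand extends Command {
        void execute(State s) { s.running = false; }
    }

    public static void main(String[] args) {
        State s = new State();
        s.data[3] = 0.5f;                                   // variable d
        new IfCommand(st -> st.data[3], ">=", st -> 0.01f, 3).execute(s);
        System.out.println(s.pc);                           // prints 3
    }
}
```

Like GoCommand, IfCommand only assigns the program counter; the jump takes effect at the next fetch.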

Recursive Interpretation
• Recursively defined languages cannot be interpreted iteratively (fetch-analyze-execute), because each command can contain any number of other commands
• Both analysis and execution must be recursive (similar to the parsing phase when compiling a high-level language)
• Hence, the entire analysis must precede the entire execution:
  • Step 1: Fetch and analyze (recursively)
  • Step 2: Execute (recursively)
• Execution is a traversal of the decorated AST, hence we can use a new visitor class
• Values (variables and constants) are handled internally

Recursive Interpretation of Mini-Triangle
    public abstract class Value { }
    public class IntValue extends Value {
        public short i;
    }
    public class BoolValue extends Value {
        public boolean b;
    }
    public class UndefinedValue extends Value { }

Recursive Interpretation of Mini-Triangle
    public class MiniTriangleState {
        public static final short DATASIZE=...;
        Program program;    // code store is the decorated AST
        Value[] data=new Value[DATASIZE];
        public byte status;
        public static final byte RUNNING=0, HALTED=1, FAILED=2;
    }

Recursive Interpretation of Mini-Triangle
    public class MiniTriangleProcessor extends MiniTriangleState implements Visitor {
        public void fetchAnalyze() {
            Parser parser=new Parser(...);
            Checker checker=new Checker(...);
            StorageAllocator allocator=new StorageAllocator();
            program=parser.parse();
            checker.check(program);
            allocator.allocateAddresses(program);
        }
        public void run() {
            program.C.visit(this,null);
        }
    }

Recursive Interpretation of Mini-Triangle
    public Object visitIfCommand(IfCommand com, Object arg) {
        BoolValue val=(BoolValue) com.E.visit(this,null);
        if (val.b)
            com.C1.visit(this,null);
        else
            com.C2.visit(this,null);
        return null;
    }

Recursive Interpretation of Mini-Triangle
    public Object visitConstDeclaration(ConstDeclaration decl, Object arg) {
        KnownAddress entity=(KnownAddress) decl.entity;
        Value val=(Value) decl.E.visit(this,null);
        data[entity.address]=val;
        return null;
    }
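The visitor methods above depend on the Mini-Triangle AST classes, which are not shown in these slides. To make the pattern concrete, here is a much smaller self-contained sketch in the same spirit. All names are hypothetical, the extra Object argument of visit is dropped for brevity, and values are boxed Integer and Boolean objects instead of the Value class hierarchy.

```java
// Hypothetical miniature of a visitor-based recursive interpreter.
class MiniVisitorDemo {
    interface Visitor {
        Object visitLit(Lit e);
        Object visitLess(Less e);
        Object visitAssign(Assign c);
        Object visitIf(If c);
    }

    static abstract class Node { abstract Object visit(Visitor v); }

    static class Lit extends Node {                 // integer literal
        final int value;
        Lit(int value) { this.value = value; }
        Object visit(Visitor v) { return v.visitLit(this); }
    }
    static class Less extends Node {                // left < right
        final Node left, right;
        Less(Node left, Node right) { this.left = left; this.right = right; }
        Object visit(Visitor v) { return v.visitLess(this); }
    }
    static class Assign extends Node {              // data[address] := e
        final int address; final Node e;
        Assign(int address, Node e) { this.address = address; this.e = e; }
        Object visit(Visitor v) { return v.visitAssign(this); }
    }
    static class If extends Node {                  // if e then c1 else c2
        final Node e, c1, c2;
        If(Node e, Node c1, Node c2) { this.e = e; this.c1 = c1; this.c2 = c2; }
        Object visit(Visitor v) { return v.visitIf(this); }
    }

    // Execution is a recursive traversal: each visit method may visit sub-nodes.
    static class Interpreter implements Visitor {
        final Object[] data = new Object[4];
        public Object visitLit(Lit e) { return e.value; }
        public Object visitLess(Less e) {
            int l = (Integer) e.left.visit(this);
            int r = (Integer) e.right.visit(this);
            return l < r;
        }
        public Object visitAssign(Assign c) {
            data[c.address] = c.e.visit(this);
            return null;
        }
        public Object visitIf(If c) {
            boolean cond = (Boolean) c.e.visit(this);
            if (cond) c.c1.visit(this); else c.c2.visit(this);
            return null;
        }
    }

    public static void main(String[] args) {
        Interpreter in = new Interpreter();
        // if 1 < 2 then data[0] := 10 else data[0] := 20
        new If(new Less(new Lit(1), new Lit(2)),
               new Assign(0, new Lit(10)), new Assign(0, new Lit(20))).visit(in);
        System.out.println(in.data[0]);  // prints 10
    }
}
```

Because an If node can hold further If nodes in either branch, no flat fetch loop can execute it; the call stack of the visit methods mirrors the nesting of the program.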

Case Study: TAM Interpreter
• Variable for each register
• Array for code store (Instruction data type)
• Array for data store (short type; used for both stack and heap)
• Iterative interpretation similar to Hypo
• Addressing:
    private static short relative(short d, byte r) {
        switch (r) {
            // short + short yields an int in Java, so each sum is cast back to short
            case SBr: return (short) (d+SB);
            case LBr: return (short) (d+LB);
            case L1r: return (short) (d+data[LB]);
            case L2r: return (short) (d+data[data[LB]]);
            ...
        }
    }

Case Study: TAM Interpreter
• Use of addressing:
    switch (op) {
        case LOADop: {     // push onto stack
            short addr=relative(d,r);
            data[ST++]=data[addr];
            break;
        }
        case STOREop: {    // pop from stack
            short addr=relative(d,r);
            data[addr]=data[--ST];
            break;
        }
        ...
    }
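The interplay of register-relative addressing and the stack top can be tried out in isolation. The sketch below is a simplified stand-in, not the TAM interpreter itself: it models only the SB and LB base registers, a 64-word data store, and the LOAD and STORE cases; the class name and the numeric register codes are assumptions.

```java
// Hypothetical cut-down model of TAM-style addressing; not the real TAM interpreter.
class TamAddressingDemo {
    static final int SBr = 0, LBr = 1;      // assumed register codes

    short[] data = new short[64];           // data store (stack only, in this sketch)
    short SB = 0, LB = 0, ST = 0;           // stack base, local base, stack top

    // Turn (displacement, base register) into an absolute address.
    short relative(short d, int r) {
        switch (r) {
            case SBr: return (short) (d + SB);
            case LBr: return (short) (d + LB);
            default: throw new IllegalArgumentException("unsupported register " + r);
        }
    }

    // LOAD: push the addressed word onto the stack.
    void load(short d, int r) { data[ST++] = data[relative(d, r)]; }

    // STORE: pop the top word into the addressed location.
    void store(short d, int r) { data[relative(d, r)] = data[--ST]; }

    public static void main(String[] args) {
        TamAddressingDemo m = new TamAddressingDemo();
        m.data[0] = 7;                      // a global at 0[SB]
        m.ST = 2;                           // two words already on the stack
        m.LB = 1;                           // current frame starts at address 1
        m.load((short) 0, SBr);             // push global: data[2] = 7, ST = 3
        m.store((short) 0, LBr);            // pop into 0[LB]: data[1] = 7, ST = 2
        System.out.println(m.data[1] + " " + m.ST);  // prints 7 2
    }
}
```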

Usage of the TAM Interpreter
• First write a Triangle program. Assume it is stored in a file “example.tri” within the same folder that contains the Triangle and TAM subfolders.
• Next compile your Triangle program. The command shown below produces an equivalent TAM program in the default file “obj.tam”:
    java Triangle/Compiler example.tri
• To run this TAM program:
    java TAM/Interpreter obj.tam
• To view the TAM program in human-readable form:
    java TAM/Disassembler obj.tam

For further information • More details about all these interpreters (Hypo, Mini-Basic, Mini-Triangle, TAM) can be found in the textbook
