340 likes | 450 Views
6.001 SICP Compilation. Context: special purpose vs. universal machines Compilation vs. Interpretation Evolution from ec-eval to compiler Why is compiled code more efficient than interpreter? Some ideas in compiler implementation Target, linkages, & register management.
E N D
6.001 SICPCompilation • Context: special purpose vs. universal machines • Compilation vs. Interpretation • Evolution from ec-eval to compiler • Why is compiled code more efficient than interpreter? • Some ideas in compiler implementation • Target, linkages, & register management
Example: gcd register machine a b a, b hold integers as input a holds gcd(a,b) rem Register Machine t Special-Purpose Machines Input Output
Universal Machine (gcd 12 8) 4 MIT Scheme (define (gcd a b) (if (= b 0) a (gcd b (remainder a b)))) Scheme Evaluator as Universal Machine • Able to reconfigure (or program) our universal machine to perform the desired computation
Universal Machine (gcd 12 8) 4 MIT Scheme mc-eval (define (gcd a b) (if (= b 0) a (gcd b (remainder a b)))) MC-Eval is a Universal Machine
C Language version of Scheme Evaluator Alternative Language versions possible Universal Machine MIT Scheme is a C program! 4 (gcd 12 8) (define (gcd a b) (if (= b 0) a (gcd b (remainder a b))))
Universal Machine Universal Register Machine (gcd 12 8) 4 ec-eval: Stored Machine Instructions (define (gcd a b) (if (= b 0) a (gcd b (remainder a b)))) EC-Eval – Register Machine Code
12 8 4 compiler gcd: Stored Machine Instructions Not necessarily a universal machine (define (gcd a b) (if (= b 0) a (gcd b (remainder a b)))) Compiler Universal Register Machine
Compilation – A Plan • Use same register machine as ec-eval... except we don't hard-wire in the ec-eval controller • Look at the machine instructions that ec-eval used • save these instructions for later execution • Compiler implementation: • ev-xxx serve as initial template for compile-xxx • We'll see that this is grossly inefficient • many machine instructions are executed during interpretation that are not necessary during compilation • so we will examine compiler ideas that lead us toward more efficient generated machine code
Register Machine for Compiler • 7 registers • exp temporary register (rarely used) • env current environment • continue return point • val resulting value • unev temporary register (rarely used) • proc operator value • argl argument values • Abstract operations • environment model, primitive procedures • syntax operations now needed by compiler but not in output compiled code
prepare to eval operator (save env, operands, continue) figure out what operator is... ah, it's a variable so look up ;ev-appl-did-operator store operator in proc Watching ec-eval on (f x) – pg. 1 • (save continue) • (save env) • (assign unev (op operands) (reg exp)) • (save unev) • (assign exp (op operator) (reg exp)) • (assign continue (label ev-appl-did-operator)) • (goto (label eval-dispatch)) • (test (op self-evaluating?) (reg exp)) • (branch (label ev-self-eval)) • (test (op variable?) (reg exp)) • (branch (label ev-variable)) • (assign val (op lookup-variable-value) (reg exp) (reg env)) • (goto (reg continue)) • (restore unev) • (restore env) • (assign argl (op empty-arglist)) • (assign proc (reg val)) • (test (op no-operands?) (reg unev))
branch never taken eval each of the operands... figure out how to evalfirst operand goto accum-last-arg at line 33 add to arg list have proc and argl; ready to apply Watching ec-eval on (f x) – pg. 2 • (branch (label apply-dispatch)) • (save proc) • (save argl) • (assign exp (op first-operand) (reg unev)) • (test (op last-operand?) (reg unev)) • (branch (label ev-appl-last-arg)) • (assign continue (label ev-appl-accum-last-arg)) • (goto (label eval-dispatch)) • (test (op self-evaluating?) (reg exp)) • (branch (label ev-self-eval)) • (test (op variable?) (reg exp)) • (branch (label ev-variable)) • (assign val (op lookup-variable-value) (reg exp) (reg env)) • (goto (reg continue)) • (restore argl) • (assign argl (op adjoin-arg) (reg val) (reg argl)) • (restore proc); computation proceeds at apply-dispatch
Key difference: compiling vs. interpreting • Interpreter/evaluator • See expression for first time at run-time • Have to test to determine what kind of expression • Compiler: • We see expression at compile-time • Can pre-determine what registers needed or changed • can be much more efficient in saves/restores!
save assign restore reference Register Usage – (f x) exp env val continue unev proc argl • eliminated syntax instructions used in ec-eval which can bepre-computed by the compiler • (save continue) • (save env) • (assign unev (op operands) (reg exp)) • (save unev) • (assign exp (op operator) (reg exp)) • (assign continue (label ev-appl-did-operator)) • (assign val (op lookup-variable-value) (reg exp) (reg env)) • (restore unev) • (restore env) • (assign argl (op empty-arglist)) • (assign proc (reg val)) • (save proc) • (save argl) • (assign exp (op first-operand) (reg unev)) • (assign continue (label ev-appl-accum-last-arg)) • (assign val (op lookup-variable-value) (reg exp) (reg env)) • (restore argl) • (assign argl (op adjoin-arg) (reg val) (reg argl)) • (restore proc)
save assign restore reference Don't need continuations: exp env val continue unev proc argl • no internal branches; don't need to save or assign continue register • (save continue) • (save env) • (assign unev (op operands) (reg exp)) • (save unev) • (assign exp (op operator) (reg exp)) • (assign continue (label ev-appl-did-operator)) • (assign val (op lookup-variable-value) (reg exp) (reg env)) • (restore unev) • (restore env) • (assign argl (op empty-arglist)) • (assign proc (reg val)) • (save proc) • (save argl) • (assign exp (op first-operand) (reg unev)) • (assign continue (label ev-appl-accum-last-arg)) • (assign val (op lookup-variable-value) (reg exp) (reg env)) • (restore argl) • (assign argl (op adjoin-arg) (reg val) (reg argl)) • (restore proc)
save assign restore reference (const f) (const x) Recognize form of (f x) exp env val continue unev proc argl • compiler knows operator & operands are variables (ec-eval had to discover this at runtime) • (save env) • (assign unev (op operands) (reg exp)) • (save unev) • (assign exp (op operator) (reg exp)) • (assign val (op lookup-variable-value)(reg exp) (reg env)) • (restore unev) • (restore env) • (assign argl (op empty-arglist)) • (assign proc (reg val)) • (save proc) • (save argl) • (assign exp (op first-operand) (reg unev)) • (assign val (op lookup-variable-value) (reg exp) (reg env)) • (restore argl) • (assign argl (op adjoin-arg) (reg val) (reg argl)) • (restore proc)
save assign restore reference Remove extras saves and restores exp env val continue unev proc argl • detect when no change to reg between save and restore • (save env) • (assign val (op lookup-variable-value) (const f) (reg env)) • (restore env) • (assign argl (op empty-arglist)) • (assign proc (reg val)) • (save proc) • (save argl) • (assign val (op lookup-variable-value) (const x) (reg env)) • (restore argl) • (assign argl (op adjoin-arg) (reg val) (reg argl)) • (restore proc)
save assign restore reference proc Optimize Register Target exp env val continue unev proc argl • assign to proc directly (no needto assign to val, then from val to proc) • (assign val (op lookup-variable-value) (const f) (reg env)) • (assign argl (op empty-arglist)) • (assign proc (reg val)) • (save argl) • (assign val (op lookup-variable-value) (const x) (reg env)) • (restore argl) • (assign argl (op adjoin-arg) (reg val) (reg argl))
save assign restore reference Optimize argument list exp env val continue unev proc argl • recognize that we have only one argument (adjoin directly onto empty list argl) • (assign proc (op lookup-variable-value) (const f) (reg env)) • (assign argl (op empty-arglist)) • (save argl) • (assign val (op lookup-variable-value) (const x) (reg env)) • (restore argl) • (assign argl (op list) (reg val))
Our compiler: (compile '(f x) 'val 'next) results ; Same three lines from our register usage exercise: (assign proc (op lookup-variable-value) (const f) (reg env)) (assign val (op lookup-variable-value) (const x) (reg env)) (assign argl (op list) (reg val)) ; standard pattern for <apply-dispatch> (test (op primitive-procedure?) (reg proc)) (branch (label primitive-branch9)) compiled-branch8 (assign continue (label after-call7)) (assign val (op compiled-procedure-entry) (reg proc)) (goto (reg val)) primitive-branch9 (assign val (op apply-primitive-procedure) (reg proc) (reg argl)) after-call7 ; falls through to "next" code
Machine Instruction Sequence (compile exp target linkage) Compiler Implementation Concepts • InstructionSequences • compile a given expression • produce and append instruction sequences together as needed • Target • the register desired to hold value at end of instruction sequence • Linkages • how to connect instruction sequence to other instruction sequences (next, return, named label) (f x)
Examples – Targets & Linkages (compile <exp> <target><linkage>) • Different Targets: (compile 'x 'val'next) (assign val (op lookup-variable-value) (const x) (reg env)) (compile 'x 'proc'next) (assign proc (op lookup-variable-value) (const x) (reg env)) • Linkage 'return: • (compile 'x 'val 'return) • (assign val (op lookup-variable-value) (const x) (reg env)) • (goto (reg continue)) • Linkage '<named-label>: • (compile 'x 'val 'after-x-lookup) • (assign val (op lookup-variable-value) (const x) (reg env)) • (goto (label after-x-lookup))
compiler – dispatch on expression type (define (compile exp target linkage) (cond ((self-evaluating? exp) (compile-self-evaluating exp target linkage)) ((quoted? exp) (compile-quoted exp target linkage)) ((variable? exp) (compile-variable exp target linkage)) ((assignment? exp) (compile-assignment exp target linkage)) ((definition? exp) (compile-definition exp target linkage)) ((if? exp) (compile-if exp target linkage)) ((lambda? exp) (compile-lambda exp target linkage)) ((begin? exp) (compile-sequence (begin-actions exp) target linkage)) ((application? exp) (compile-application exp target linkage)) (else (error "Unknown exp type -- COMPILE” exp))))
Look at a basic case: • Back in ec-eval, we had: ev-self-eval (assign val (reg exp)) (goto (reg continue)) • Compiled code template: (assign <target> (const <expression>)) <linkage> • Compiler procedure (partial) that handles linkage: (define (compile-self-evaluating exp targetlinkage) (end-with-linkage linkage `((assign ,target (const ,exp)))))
Compiling Linkages (define (end-with-linkage linkage instruction-sequence) (preserving '(continue) ;ignore this for now instruction-sequence (compile-linkage linkage))) (define (compile-linkage linkage) (cond ((eq? linkage 'return) (make-instruction-sequence '(continue) '() '((goto (reg continue))))) ((eq? linkage 'next) (empty-instruction-sequence)) (else (make-instruction-sequence '() '() `((goto (label ,linkage)))))))
Compiling Application – More General Case • In general, may need to save/restore registers;e.g. eval of operator may itself modify env, continue [(save continue)] [(save env)] <evaluate operator; result in proc> [(restore env)] [(save proc)] <evaluate operands; result in argl> [(restore proc)] [(restore continue)] <apply procedure in proc to args in argl, and link> • Paranoid/simple approach: save & restore anyway • More efficient approach: manage register usage carefully
registers modified by sequence registers needed (used by sequence) machine statements (machine code) Managing Registers – Encode the Contract! ; needs: exp expression to evaluate ; env environment ; continue return point ; output: val value of expression ; modifies: unev ; stack: unchanged (make-instruction-sequence needsmodifiesstatements) instructionsequence • "instruction sequence" data structure: • remember registers needed and modified in addition to code
needs-1 UNION (needs-2 – modifies-1) code-1 code-2 code-1 code-2 modifies-1 modifies-2 modifies-1 UNION modifies-2 needs-1 needs-2 Appending Machine Instructions • If we know code-1 and code-2 don't interfere, just append code • Update merged "needs" and "modifies" data for combined sequence (append-instruction-sequences seq-1 seq-2)
needs-1 UNION (needs-2 – modifies-1) restore code-1 code-2 code-1 code-2 modifies-1 modifies-2 modifies-1 UNION modifies-2 needs-1 needs-2 save regs (needs-2 ANDmod-1) Preserving Needed Registers • A "smart" append of two code sequences which "preserves" registers • save any of specified registers if (and only if) seq-1 clobbers them (preserving regs seq-1 seq-2)
Now Understand Compile of Application [(save continue)] [(save env)] <evaluate operator; result in proc> [(restore env)] [(save proc)] <evaluate operands; result in argl> [(restore proc)] [(restore continue)] <apply procedure in proc to args in argl, and link> (define (compile-application exp target linkage) (let ((proc-code (compile (operator exp) 'proc 'next)) (arg-codes (map (lambda (arg) (compile arg 'val 'next)) (operands exp)))) (preserving '(env continue) proc-code (preserving '(proc continue) (construct-arglist arg-codes) (compile-procedure-call target linkage))))) Same template as before
decompile Machine Instruction Sequence Scheme expression De-compilation & Optimization (assign proc (op lookup-variable-value) (const +) (reg env)) (assign val (const 10)) (assign argl (op list) (reg val)) (test (op primitive-procedure?) (reg proc)) (branch (label primitive-branch9)) compiled-branch8 (assign continue (label after-call7)) (assign val (op compiled-procedure-entry) (reg proc)) (goto (reg val)) primitive-branch9 (assign val (op apply-primitive-procedure) (reg proc) (reg argl)) after-call7 (+ 10) • Optimization: constant '(10) directly into argl • Optimization?: replace whole thing with value 10
Summary • Compilation vs. Interpretation • Interpretation: evaluate expressions at run-time • Compilation: analyze program code and generate register machine instructions for later execution • Some ideas in compilation – towards efficient code • Templates, targets & linkages • Strategies for register management • Many further optimizations possible!
May 3, 2000 Recitation Problem 6.001 • Decompile the following register code (what expression was compiled to produce these machine instructions)? (assign proc (op lookup-variable-value) (const list) (reg env)) (assign val (op lookup-variable-value) (const y) (reg env)) (assign argl (op list) (reg val)) (assign val (op lookup-variable-value) (const x) (reg env)) (assign argl (op cons) (reg val) (reg argl)) (assign val (op lookup-variable-value) (const +) (reg env)) (assign argl (op cons) (reg val) (reg argl)) ;; proceed with apply-dispatch pattern
Watching ec-eval on (f x) – pg. 1 • (save continue) • (save env) • (assign unev (op operands) (reg exp)) • (save unev) • (assign exp (op operator) (reg exp)) • (assign continue (label ev-appl-did-operator)) • (goto (label eval-dispatch)) • (test (op self-evaluating?) (reg exp)) • (branch (label ev-self-eval)) • (test (op variable?) (reg exp)) • (branch (label ev-variable)) • (assign val (op lookup-variable-value) (reg exp) (reg env)) • (goto (reg continue)) • (restore unev) • (restore env) • (assign argl (op empty-arglist)) • (assign proc (reg val)) • (test (op no-operands?) (reg unev)) prepare to eval operator (save env, operands, continue) figure out what operator is... ah, it's a variable so look up store operand in proc
Watching ec-eval on (f x) – pg. 2 branch never taken • (branch (label apply-dispatch)) • (save proc) • (save argl) • (assign exp (op first-operand) (reg unev)) • (test (op last-operand?) (reg unev)) • (branch (label ev-appl-last-arg)) • (assign continue (label ev-appl-accum-last-arg)) • (goto (label eval-dispatch)) • (test (op self-evaluating?) (reg exp)) • (branch (label ev-self-eval)) • (test (op variable?) (reg exp)) • (branch (label ev-variable)) • (assign val (op lookup-variable-value) (reg exp) (reg env)) • (goto (reg continue)) • (restore argl) • (assign argl (op adjoin-arg) (reg val) (reg argl)) • (restore proc); computation proceeds at apply-dispatch eval each of the operands... figure out how to evalfirst operand goto accum-last-arg at line 33 add to arg list have proc and argl; ready to apply