Introduction to Compilers

Presentation Transcript


  1. Compiler Lecture Note, Intermediate Language. Introduction to Compilers, Chapter 9: Intermediate Language

  2. Contents: Introduction, Polish Notation, Three Address Code, Tree Structured Code, Abstract Machine Code, Concluding Remarks

  3. Introduction
     • Compiler Model
       Source Program → Front-End → IL → Back-End → Object Program
       Front-End: Lexical Analyzer → (tokens) → Syntax Analyzer → (AST) → Semantic Analyzer → Intermediate Code Generator
       Back-End: Code Optimizer → (IC) → Target Code Generator
     • Front-End - language-dependent part
     • Back-End - machine-dependent part

  4. • Why an IL is needed
       • Modular Construction
       • Automatic Construction
       • Easy Translation
       • Portability
       • Optimization
       • Bootstrapping
     • Classification of ILs
       • Polish Notation --- Postfix, IR
       • Three Address Code --- Quadruple, Triple, Indirect triple
       • Tree Structured Code --- PT, AST, TCOL
       • Abstract Machine Code --- P-code, EM-code, U-code, Bytecode

  5. • Two-level Code Generation
       • ILS
         • a form that can be obtained from the source automatically
         • source-language dependent and high level
       • ILT
         • a form that an automated back-end can translate to the target machine very easily
         • target-machine dependent and low level
       • ILS to ILT
         • translating ILS to ILT is the main task
       Source → Front-End → ILS → ILS-ILT → ILT → Back-End → Target

  6. Polish Notation
     ☞ The Polish mathematician Łukasiewicz invented the parenthesis-free notation.
     • Postfix (Suffix) Polish Notation
       • earliest IL
       • popular for interpreted languages - SNOBOL, BASIC
       • general form: e1 e2 ... ek OP (k ≥ 1)
         where OP: k-ary operator
               ei: any postfix expression (1 ≤ i ≤ k)

  7. • example: if a then if c-d then a+c else a*c else a+b
       ⇒ a L1 BZ c d - L2 BZ a c + L3 BR L2: a c * L3 BR L1: a b + L3:
     • note
       1) high level: source to IL - fast & easy translation; IL to target - difficult
       2) easy evaluation - operand stack
       3) unsuitable for optimization - translation to another IL is needed
       4) parentheses-free notation - arithmetic expression
     • suitable for interpreted languages
       Source → Translator → Postfix → Evaluator → Result
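Note 2 (easy evaluation with an operand stack) can be illustrated with a minimal sketch. The reading of the branch operators is an assumption, not something the slide spells out: `BZ` pops a label and a condition and jumps when the condition is zero, `BR` pops a label and jumps unconditionally, and a token ending in `:` defines a label. The function name `eval_postfix` and the `env` dictionary of variable values are hypothetical.

```python
def eval_postfix(tokens, env):
    """Evaluate a postfix token list with an operand stack (illustrative sketch)."""
    # Assumption: a token ending in ":" defines a label at that position.
    labels = {tok[:-1]: pos for pos, tok in enumerate(tokens) if tok.endswith(":")}
    arithmetic = {
        "+": lambda x, y: x + y,
        "-": lambda x, y: x - y,
        "*": lambda x, y: x * y,
    }
    stack = []
    pc = 0
    while pc < len(tokens):
        tok = tokens[pc]
        pc += 1
        if tok.endswith(":"):          # label definition: nothing to execute
            continue
        if tok in arithmetic:          # binary operator: pop two operands, push result
            right = stack.pop()
            left = stack.pop()
            stack.append(arithmetic[tok](left, right))
        elif tok == "BZ":              # assumed meaning: branch to label if condition is zero
            target = stack.pop()
            condition = stack.pop()
            if condition == 0:
                pc = labels[target]
        elif tok == "BR":              # assumed meaning: unconditional branch
            pc = labels[stack.pop()]
        elif tok in labels:            # label used as an operand
            stack.append(tok)
        else:                          # variable: push its value
            stack.append(env[tok])
    return stack.pop()


# The postfix code from the slide for:
#   if a then if c-d then a+c else a*c else a+b
CODE = "a L1 BZ c d - L2 BZ a c + L3 BR L2: a c * L3 BR L1: a b + L3:".split()

result = eval_postfix(CODE, {"a": 1, "b": 2, "c": 5, "d": 3})
```

With `a = 1`, `c = 5`, `d = 3` both conditions are nonzero, so the evaluator takes the `a+c` branch; no backtracking or tree is needed, only the stack and a position counter.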

  8. • Internal Representation (IR)
       • low-level prefix Polish notation - addressing structure of target machine
       • compiler-compiler IL - table-driven code generation
       • IR program - a sequence of root-level IR expressions
       • IR expression: OP e1 e2 ... ek (k ≥ 1)
         where OP: k-ary operator - 1-1 correspondence with target machine instruction
               ┌─ root-level operator - does not appear in an operand ⇒ root-level IR expression
               └─ internal operator - appears in an operand ⇒ internal IR expression
               ei: operand --- single symbol or internal IR expression

  9. • example: D := E ⇔ := + d r ↑ + e r
       where r: local base register
             d, e: locations of variables D and E
             +: additive operator
             ↑: unary operator giving the value of the location
             :=: assignment operator (root-level)
     • example: FOR D := E TO F DO Loop body;
       expansion:
         D := E; TEMP := F; GOTO 2
         1: Loop body
            D := D + 1;
         2: IF D <= TEMP THEN GOTO 1;
       IR:
         := + d r ↑+ e r
         := + temp r ↑+ f r
         j L2
         :L1
         Loop body
         := + d r + ↑+ d r 1
         :L2
         <= L1 ? ↑+ d r ↑+ temp r

  10. • Note
       1) Shift-reduce parser --- prefix: fewer states than postfix
       2) Several addressing modes
          ┌─ prefix: decided by looking at the operator alone (no backup)
          └─ postfix: backup needed
          ex) assumption: first operand computed in register r.
              r.1 ::= (/ d.1 r.2)
              r.1 ::= (+ r.1 r.2)
              ┌ prefix - [r -> / . d r]
              │          first operand changed to d and continue
              └ postfix - [r -> . d r /] [r -> . r r +]
                         shift r, shift r and block ([r -> r r . +]) ⇒ backup
       3) Easy translation: IR to target - easy; source to IR - difficult

  11. Three Address Code
     • most popular IL, optimizing compiler
     • General form: A := B op C
       where A: result address
             B, C: operand addresses
             op: operator
       (1) Quadruple - 4-tuple notation: <operator>,<operand1>,<operand2>,<result>
       (2) Triple - 3-tuple notation: <operator>,<operand1>,<operand2>
       (3) Indirect triple - execution order table & triples

  12. • example
       • A ← B + C * D / E
       • F ← C * D
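The table for this example did not survive in the transcript, so the following is only a sketch of how the three forms could look for these two statements. Several choices are assumptions: temporaries are named `T1`, `T2`, ...; the assignment is emitted as its own `:=` tuple; a triple refers to an earlier triple by its integer index; and the indirect-triple table shares identical triples, which is safe here only because `C` and `D` are not reassigned between the two statements. The function names are hypothetical.

```python
# Statements as (target, expression); an expression is a name or (op, left, right).
# Usual precedence gives B + ((C * D) / E).
STMTS = [
    ("A", ("+", "B", ("/", ("*", "C", "D"), "E"))),
    ("F", ("*", "C", "D")),
]


def gen_quads(stmts):
    """Quadruples: (operator, operand1, operand2, result)."""
    quads = []
    count = 0

    def walk(node):
        nonlocal count
        if isinstance(node, str):
            return node
        op, left, right = node
        l, r = walk(left), walk(right)
        count += 1
        temp = f"T{count}"              # assumed naming of temporaries
        quads.append((op, l, r, temp))
        return temp

    for target, expr in stmts:
        value = walk(expr)
        quads.append((":=", value, None, target))
    return quads


def gen_triples(stmts):
    """Triples: (operator, operand1, operand2); an int operand is a triple index."""
    triples = []

    def walk(node):
        if isinstance(node, str):
            return node
        op, left, right = node
        l, r = walk(left), walk(right)
        triples.append((op, l, r))
        return len(triples) - 1

    for target, expr in stmts:
        value = walk(expr)
        triples.append((":=", target, value))
    return triples


def gen_indirect_triples(stmts):
    """Indirect triples: a triple table plus an execution-order table of indices."""
    table, order = [], []

    def emit(triple):
        if triple not in table:         # share an identical triple instead of duplicating it
            table.append(triple)
        index = table.index(triple)
        order.append(index)
        return index

    def walk(node):
        if isinstance(node, str):
            return node
        op, left, right = node
        return emit((op, walk(left), walk(right)))

    for target, expr in stmts:
        emit((":=", target, walk(expr)))
    return table, order


quads = gen_quads(STMTS)
triples = gen_triples(STMTS)
table, order = gen_indirect_triples(STMTS)
```

In this sketch the quadruple form needs four temporaries, the triple form needs none, and the indirect form stores `(*, C, D)` once while the order table lists it twice.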

  13. • Note
       • Quadruple vs. Triple
         • quadruple - easy to optimize
         • triple - removal of temporary addresses ⇒ Indirect Triple
       • extensive code optimization is easy
         • the IL can be rearranged (except for triples)
       • easy translation - source to IL
       • difficult to generate good code
         • quadruple to two-address machine
         • triple to three-address machine

  14. Tree Structured Code
     • Abstract Syntax Tree
       • removes redundant information from the parse tree
       • ┌ leaf node --- variable name, constant
         └ internal node --- operator
       • [Example 8] --- Text p.377
         { x = 0; y = z + 2 * y; while ((x<n) and (v[x] != z)) x = x+1; return x; }
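The tree for Example 8 is not reproduced in the transcript. As a rough sketch of the leaf/internal-node split, the block can be written as a small tree whose internal nodes are operators and whose leaves are variable names and constants. The operator names (`block`, `assign`, `add`, `mul`, `lt`, `ne`, `index`, `while`, `return`) and the helpers `n`, `to_sexpr`, and `leaves` are assumptions for illustration, not the textbook's notation.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Internal AST node: an operator with child subtrees or leaves."""
    op: str
    children: list = field(default_factory=list)


def n(op, *children):
    """Hypothetical shorthand for building an internal node."""
    return Node(op, list(children))


def to_sexpr(tree):
    """Render a tree in parenthesized prefix form; leaves print as themselves."""
    if not isinstance(tree, Node):
        return str(tree)
    return "(" + " ".join([tree.op] + [to_sexpr(c) for c in tree.children]) + ")"


def leaves(tree):
    """Collect leaves left to right: variable names (str) and constants (int)."""
    if not isinstance(tree, Node):
        return [tree]
    return [leaf for child in tree.children for leaf in leaves(child)]


# { x = 0; y = z + 2 * y; while ((x<n) and (v[x] != z)) x = x+1; return x; }
AST = n(
    "block",
    n("assign", "x", 0),
    n("assign", "y", n("add", "z", n("mul", 2, "y"))),
    n(
        "while",
        n("and", n("lt", "x", "n"), n("ne", n("index", "v", "x"), "z")),
        n("assign", "x", n("add", "x", 1)),
    ),
    n("return", "x"),
)

second_statement = to_sexpr(AST.children[1])
```

Braces, semicolons, and grouping parentheses from the source text have no node of their own, which is the sense in which the AST drops the parse tree's redundant information.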

  15. • Tree Structured Common Language (TCOL)
       • Variant of AST - contains the result of semantic analysis.
       • TCOL operator - type & context specific operator
       • Context
         ┌ value ----- rhs of assignment statement
         ├ location ----- lhs of assignment statement
         ├ boolean ----- conditional control statement
         └ statement ----- statement
         ex) . : operand --- location; result --- value
             while : operand --- boolean, statement; result --- statement

  16. Example) int a; float b; ... b = a + 1;
       AST (prefix form): assign b (add a 1)
       TCOL (prefix form): assign b (float (addi (. a) 1))
     • Representation ----- graph orientation
       ┌ internal notation ------ efficient
       └ external notation ------ debug, interface; linear graph notation

  17. • Note
       • AST ----- automatic AST generation (output of parser)
         Parser Generator
         ┌ leaf node specification
         └ operator node specification
       • TCOL ----- automatic code generation: PQCC
         (1) intermediate level: high level --- parse-tree-like notation, control structure
                                 low level --- data access
         (2) semantic specification: dereferencing, coercion, type-specific operator, dynamic subscript and type checking
         (3) loop optimization ----- high-level control structure, easy reconstruction
         (4) extensibility ----- define new TCOL operator

  18. Abstract Machine Code
     • Motivation
       • ┌ rapid development of machine architectures
         └ proliferation of programming languages
       • portable & adaptable compiler design --- P_CODE
         • porting --- rewriting only back-end
       • compiler building system --- EM_CODE
         M front-ends + N back-ends → M compilers for N target machines

  19. • Model
       source program → front-end → abstract machine code (interface)
       abstract machine code → back-end → target code → target machine
       abstract machine code → abstract machine interpreter

  20. • Pascal-P Code
       • Pascal P Compiler --- portable compiler producing P_CODE for an abstract machine (P_Machine).
       • P_Machine ----- hypothetical stack machine designed for the Pascal language.
         (1) Instruction --- closely related to the Pascal language.
         (2) Registers
             ┌ PC --- program counter
             │ NP --- new pointer
             │ SP --- stack pointer
             └ MP --- mark pointer
         (3) Memory
             ┌ CODE --- instruction part
             └ STORE --- data part (constant area, stack, heap)

  21. [Figure: P_Machine memory. CODE (PC); STORE: stack with the current activation record (MP, SP), heap (NP), constant area]

  22. Ucode
     • Ucode
       • the intermediate form used by the Stanford Portable Pascal compiler.
       • stack-based and defined in terms of a hypothetical stack machine.
       • Ucode Interpreter: Appendix B.
     • Addressing
       • stack addressing ===> a tuple: (B, O)
         • B: the block number containing the address
         • O: the offset in words from the beginning of the block; offsets start at 1.
     • label
       • any Ucode instruction can be labeled through its label field.
       • All targets of jumps and procedures must be labeled.
       • All labels must be unique for the entire program.

  23. • Example:
       • Consider the following skeleton:
           program main
             procedure P
               procedure Q
                 var i : integer;
                     j : integer;
       • block number
         • main : 1
         • P : 2
         • Q : 3
       • variable addressing
         • i : (3,1)
         • j : (3,2)
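The (B, O) numbering above can be reproduced with a short sketch. It assumes blocks are numbered from 1 in the order they are listed and that every variable occupies exactly one word, so offsets simply count up from 1; the function `assign_addresses` and the `blocks` list are hypothetical.

```python
def assign_addresses(blocks):
    """Map (block name, variable) to a Ucode-style (B, O) address."""
    table = {}
    # Assumption: block numbers follow listing order, starting at 1.
    for block_no, (name, variables) in enumerate(blocks, start=1):
        # Assumption: each variable takes one word; offsets start at 1.
        for offset, var in enumerate(variables, start=1):
            table[(name, var)] = (block_no, offset)
    return table


blocks = [("main", []), ("P", []), ("Q", ["i", "j"])]
addresses = assign_addresses(blocks)
```

For the skeleton on this slide, `addresses[("Q", "i")]` comes out as `(3, 1)` and `addresses[("Q", "j")]` as `(3, 2)`.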

  24. • Ucode Operations (35 in total)
       • Unary --- notop, neg
       • Binary --- add, sub, mult, divop, modop, swp, andop, orop, gt, lt, ge, le, eq, ne
       • Stack Operations --- lod, str, ldr, ldp
       • Immediate Operation --- ldc
       • Control Flow --- ujp, tjp, fjp, cal, ret
       • Range Checking --- chkh, chkl
       • Indirect Addressing --- ixa, sta
       • Procedure Specification --- proc, endop
       • Program Specification --- bgn
       • Procedure Calling Sequence --- cal
       • Symbol Table Information --- sym

  25. • Example:
       • x = a + b * c;
           lod 1 1    /* a */
           lod 1 2    /* b */
           lod 1 3    /* c */
           mult
           add
           str 1 4    /* x */
       • if (a>b) a = a + b;
           lod 1 1    /* a */
           lod 1 2    /* b */
           gt
           fjp next
           lod 1 1    /* a */
           lod 1 2    /* b */
           add
           str 1 1    /* a */
         next
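To see how these two fragments execute on a stack, here is a minimal sketch of an interpreter for just the handful of operations they use. It is not the `ucodei` interpreter from Appendix B. The assumptions: memory is a dictionary keyed by (block, offset), comparisons push 1 or 0, `fjp` jumps when the popped value is 0, and the trailing `next` label is attached to a `nop` placeholder instruction. The tuple layout `(label, opcode, operands)` and the name `run_ucode` are hypothetical.

```python
def run_ucode(program, memory):
    """Run a list of (label, opcode, operands) tuples on an operand stack."""
    labels = {label: i for i, (label, _op, _args) in enumerate(program) if label}
    binary = {
        "add": lambda x, y: x + y,
        "sub": lambda x, y: x - y,
        "mult": lambda x, y: x * y,
        "gt": lambda x, y: int(x > y),   # assumption: true is 1, false is 0
        "lt": lambda x, y: int(x < y),
    }
    stack = []
    pc = 0
    while pc < len(program):
        _label, op, args = program[pc]
        pc += 1
        if op == "lod":                  # push the value stored at (block, offset)
            stack.append(memory[tuple(args)])
        elif op == "ldc":                # push an immediate constant
            stack.append(args[0])
        elif op == "str":                # pop and store at (block, offset)
            memory[tuple(args)] = stack.pop()
        elif op in binary:
            right = stack.pop()
            left = stack.pop()
            stack.append(binary[op](left, right))
        elif op == "fjp":                # jump if the popped condition is false
            if not stack.pop():
                pc = labels[args[0]]
        elif op == "ujp":                # unconditional jump
            pc = labels[args[0]]
        elif op == "nop":                # placeholder so a label has an instruction
            pass
        else:
            raise ValueError(f"unsupported opcode in this sketch: {op}")
    return memory


# x = a + b * c;  with a at (1,1), b at (1,2), c at (1,3), x at (1,4)
ASSIGN = [
    ("", "lod", (1, 1)), ("", "lod", (1, 2)), ("", "lod", (1, 3)),
    ("", "mult", ()), ("", "add", ()), ("", "str", (1, 4)),
]

# if (a>b) a = a + b;
COND = [
    ("", "lod", (1, 1)), ("", "lod", (1, 2)), ("", "gt", ()),
    ("", "fjp", ("next",)),
    ("", "lod", (1, 1)), ("", "lod", (1, 2)), ("", "add", ()), ("", "str", (1, 1)),
    ("next", "nop", ()),
]

mem = run_ucode(ASSIGN, {(1, 1): 2, (1, 2): 3, (1, 3): 4})
```

Because `mult` runs before `add`, the stack order alone encodes the precedence of `b * c` over `a + ...`; with a = 2, b = 3, c = 4 the cell (1,4) ends up holding 14.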

  26. • Indirect Addressing
       • used to access both array elements and var parameters.
       • ixa --- indirect load
         • replaces stacktop by the value of the item at location stacktop.
         • to retrieve A[i]:
             lod i    /* actually (Bi, Oi) */
             ldr A    /* also (block number, offset) */
             add      /* effective address */
             ixa      /* indirect load gets contents of A[i] */
         • to retrieve var parameter x:
             lod x    /* loads address of actual - since x is var */
             ixa      /* indirect load */

  27. • sta --- indirect store
         • sta stores stacktop into the address at stack[stacktop-1]; both items are popped.
         • A[i] = j;
             lod i
             ldr A
             add
             lod j
             sta
         • x := y, where x is a var parameter
             lod x
             lod y
             sta
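The effect of `ixa` and `sta` on the stack can be traced with a small sketch. To keep it short it flattens the (block, offset) addresses into plain integer indices of a memory list and resolves names through a hypothetical `addr` table; both are simplifications assumed for illustration, as is the layout (i at 0, j at 1, array A at 2..6, var parameter x at 7).

```python
def execute(program, memory, addr):
    """Run (opcode, operand...) tuples; returns the final operand stack."""
    stack = []
    for op, *args in program:
        if op == "lod":                  # push the value of a named cell
            stack.append(memory[addr[args[0]]])
        elif op == "ldr":                # push the address of a named cell
            stack.append(addr[args[0]])
        elif op == "add":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "ixa":                # indirect load: replace address by its contents
            stack.append(memory[stack.pop()])
        elif op == "sta":                # indirect store: pop value, then address
            value = stack.pop()
            address = stack.pop()
            memory[address] = value
        else:
            raise ValueError(f"unsupported opcode in this sketch: {op}")
    return stack


# Assumed layout: i at 0, j at 1, array A at 2..6, var parameter x at 7.
addr = {"i": 0, "j": 1, "A": 2, "x": 7}
memory = [0] * 8
memory[addr["i"]] = 3                    # i = 3
memory[addr["j"]] = 42                   # j = 42
memory[addr["x"]] = addr["j"]            # x is a var parameter bound to j

# A[i] = j;
execute([("lod", "i"), ("ldr", "A"), ("add",), ("lod", "j"), ("sta",)], memory, addr)

# retrieve A[i]
loaded = execute([("lod", "i"), ("ldr", "A"), ("add",), ("ixa",)], memory, addr)

# retrieve var parameter x (its cell holds the address of the actual)
via_x = execute([("lod", "x"), ("ixa",)], memory, addr)
```

The same `lod` instruction yields a value for `i` but an address for `x`, because the cell of a var parameter stores the address of its actual; `ixa` then turns that address into the value.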

  28. • Procedure Calling Sequence
       • procedure definition:
         • procedure A(var a : integer; b,c : integer);
       • procedure call:
         • A(x, expr1, expr2);
       • calling sequence:
           ldp
           ldr x    /* load the address of actual for var parameter */
           …        /* code to evaluate expr1 --- left on the stack */
           …        /* code to evaluate expr2 --- left on the stack */
           cal A

  29. • Ucode Interpreter
       • The Ucode interpreter is called ucodei; its source is on plac.dongguk.ac.kr.
       • The interpreter uses the following files:
         • *.ucode : file containing the Ucode program.
         • *.lst : Ucode listing and output from the program.
       • Ucode format
           field:    label-field   op-code   operand-field
           columns:  1-10          12-m      m+2
         • m is exactly enough to hold the opcode.
         • label field --- a 10-character label (make sure it is 10 characters, padded with blanks)
         • op-code --- starts at column 12.
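A small helper makes the column rule concrete. This is a sketch under assumptions: the label is left-justified and blank-padded to columns 1-10, column 11 is a blank, the opcode starts at column 12, one blank follows the opcode, and multiple operands are separated by single blanks (the slide does not state the operand separator). The name `format_ucode` is hypothetical.

```python
def format_ucode(label, opcode, *operands):
    """Lay out one Ucode source line in the fixed-column format."""
    if len(label) > 10:
        raise ValueError("label field holds at most 10 characters")
    # Columns 1-10: blank-padded label; column 11: blank; column 12 onward: opcode.
    line = f"{label:<10} {opcode}"
    if operands:
        # Operand field starts two columns after the last opcode column (m+2).
        line += " " + " ".join(str(x) for x in operands)
    return line


lines = [
    format_ucode("", "lod", 1, 1),
    format_ucode("", "fjp", "next"),
    format_ucode("next", "nop"),
]
```

An unlabeled instruction still needs its eleven leading blanks, which is the detail the slide warns about.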

  30. Programming Assignment #3
     • Install the Ucode interpreter listed in Appendix B on your own PC and write a Ucode program that finds the prime numbers up to 100.
       • You may instead submit a program for a different problem.
       • Submit the output listing of the Ucode interpreter.
     • For reference:
       • #1 : recursive-descent parser
       • #2 : MiniPascal LR parser

  31. Concluding Remarks
     • IL criteria
       • intermediate level
         • input language --- high level
         • output machine --- low level
       • efficient processing
         • translation --- source to IL, IL to target
         • interpretation
         • optimization
       • extensibility
       • external representation
       • clean separation
         • language dependence & machine dependence

  32. A : good, B : average, C : poor
