
CSC 415: Translators and Compilers Spring 2009


norina

Presentation Transcript


  1. CSC 415: Translators and Compilers, Spring 2009 • Chapter 6: Run-time Organization

  2. Run-time Organization • Marshal the resources of the target machine (instructions, storage, and system software) in order to implement the source language

  3. Chapter 6: Run-time Organization • Data Representation • How should we represent the values of each source-language type in the target machine? • Expression Evaluation • How should we organize the evaluation of expressions, taking care of intermediate results? • Static Storage Allocation • How should we organize storage for variables, taking into account the different lifetimes of global, local, and heap variables? • Stack Storage Allocation • Routines • How should we implement procedures, functions, and parameters, in terms of low-level routines? • Heap Storage Allocation • Run-time Organization for Object-oriented Languages • How should we represent objects and methods? • Case Study: The Abstract Machine TAM

  4. Data Representation • How should we represent the values of each source-language type in the target machine? • High-Level Data Types • Truth values • Integers • Characters • Records • Arrays • Operations over these types • Machine Data Types • Bits • Bytes • Words • Double-words • Low-level arithmetic and logical operations • We need to bridge the semantic gap between high-level types and machine-level types

  5. Data Representation -- Fundamental Principles • Non-confusion • Different values of a given type should have different representations • If two different values are confused, i.e., have the same representation, then comparison of these values will incorrectly treat them as equal • Example: approximate representation of real numbers • Real numbers that are slightly different mathematically might have the same approximate representation • Difficult to avoid, so care is needed during compiler design • Must avoid confusion in the representations of discrete types such as truth values, characters, and integers • For statically typed languages we need only be concerned with values of the same type • The bit string $00\ldots00_2$ may represent false, the integer 0, or the real number 0.0 • Compile-time type checks distinguish values of different types

  6. Data Representation -- Fundamental Principles • Uniqueness • Each value should always have the same representation • Example of non-uniqueness • Ones-complement representation of integers, in which zero is represented both by $00\ldots00_2$ and by $11\ldots11_2$ (+0 and –0) • A simple bit-string comparison would incorrectly treat these values as unequal • A more specialized integer comparison must be used • The alternative twos-complement representation gives us unique representations of integers
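The uniqueness problem can be made concrete with a small Python sketch. The helper names `ones_complement` and `twos_complement` and the 8-bit width are illustrative assumptions, not part of the slides; the sketch only shows that zero has two bit strings in ones complement and one in twos complement.

```python
def ones_complement(value: int, bits: int, negative_zero: bool = False) -> str:
    """Return the ones-complement bit string of value.

    negative_zero selects the all-ones encoding of zero (-0).
    Hypothetical helper; no range checking is done.
    """
    mask = (1 << bits) - 1
    if value < 0 or (value == 0 and negative_zero):
        # A negative number is the bitwise inverse of its magnitude.
        return format(~abs(value) & mask, f"0{bits}b")
    return format(value, f"0{bits}b")


def twos_complement(value: int, bits: int) -> str:
    """Return the twos-complement bit string of value (no range checking)."""
    return format(value & ((1 << bits) - 1), f"0{bits}b")


plus_zero = ones_complement(0, 8)
minus_zero = ones_complement(0, 8, negative_zero=True)
# The two encodings of zero differ, so a plain bit-string comparison fails.
print(plus_zero, minus_zero, plus_zero == minus_zero)
# Twos complement has a single encoding of zero.
print(twos_complement(0, 8))
```

Because the two ones-complement zeros differ as bit strings, equality on such a machine needs a dedicated integer comparison rather than a raw bit-string comparison.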

  7. Data Representation – Pragmatic Issues • Constant-size representation • The representations of all values of a given type should occupy the same amount of space • This makes it possible for the compiler to plan the allocation of storage • Knowing the type of a variable, but not its actual value, the compiler knows exactly how much storage space the variable will occupy

  8. Data Representation – Pragmatic Issues • Direct representation vs. indirect representation • Should the values of a given type be represented directly, or indirectly through pointers? • Direct representation • Just the binary representation of the value consisting of one or more bits, bytes, words • Indirect representation • A handle that points to the storage area which has the binary representation of the value • Essential for types whose values vary greatly in size • List or dynamic array

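  9. Direct representation vs. indirect representation • [Figure: a variable x represented directly, next to variables x and y each represented indirectly through a handle; y has the same type as x but requires more space]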

  10. Notation • #T: cardinality of type T • Number of distinct values of type T • #[[Boolean]] = 2 • size T: amount of space (in bits, bytes, or words) occupied by each value of type T • For indirect representation, only the handle is counted • For direct representation of type T • $\text{size}\ T \ge \log_2(\#T)$, or equivalently $2^{\text{size}\ T} \ge \#T$ • Here size T is expressed in bits • In $n$ bits we can represent at most $2^n$ distinct values if we are to avoid confusion (this follows from the non-confusion requirement)
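As a quick check of the size bound, the following Python sketch computes the smallest number of bits that satisfies the non-confusion requirement for a given cardinality. The function name `min_size_bits` is an assumption made for illustration.

```python
def min_size_bits(cardinality: int) -> int:
    """Smallest n such that 2**n >= cardinality (hypothetical helper)."""
    if cardinality < 1:
        raise ValueError("a type must have at least one value")
    # (k - 1).bit_length() is the ceiling of log2(k) for k >= 1.
    return (cardinality - 1).bit_length()


for name, card in [("Boolean", 2), ("7-bit characters", 2 ** 7), ("257 values", 257)]:
    print(name, min_size_bits(card))
```

Using integer arithmetic avoids the rounding errors that a floating-point `log2` could introduce for large cardinalities.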

  11. Primitive Types • Cannot be decomposed into simpler values • Most programming languages provide these primitive types • Boolean, Char, Integer • Also provide elementary logical and arithmetic operations • Machines typically support the above primitive types, so choice of representation is straightforward

  12. Primitive Types Representation • Boolean • true and false • Since #[[Boolean]] = 2, $\text{size}[[\text{Boolean}]] \ge 1$ bit • Can represent Boolean with one bit, one byte, or one word • For a single bit: 0 for false and 1 for true • For a byte or word: $00\ldots00_2$ for false and either $00\ldots01_2$ or $11\ldots11_2$ for true • Negation, conjunction, and disjunction are implemented by the machine's NOT, AND, and OR operations

  13. Primitive Types Representation • Char • The source language can specify the character set • Ada: ISO-Latin1 character set ($2^8$ distinct characters) • Java: Unicode character set ($2^{16}$ distinct characters) • Most languages do not specify one • This allows compiler writers to choose the machine’s native character set ($2^7$ or $2^8$ distinct characters) • ISO defines the character representation for “A” to be $01000001_2$ • Can represent a character by one byte or one word

  14. Primitive Types Representation • Integer • Denotes an implementation-defined bounded range of integers • Defined by the individual language processor • Binary representation is determined by the target machine’s arithmetic unit and almost always occupies one word • Can implement the language’s integer operations with the machine's integer operations • Pascal and Triangle • $-\text{maxint}, \ldots, -1, 0, +1, \ldots, +\text{maxint}$ • maxint is implementation defined • $\#[[\text{Integer}]] = 2 \times \text{maxint} + 1$ • $2^{\text{size}[[\text{Integer}]]} \ge 2 \times \text{maxint} + 1$ • For a word size of $w$ bits, $\text{size}[[\text{Integer}]] = w$ and $\text{maxint} = 2^{w-1} - 1$ • Java • int denotes $-2^{31}, \ldots, -1, 0, +1, \ldots, +2^{31} - 1$ • $\#[[\text{int}]] = 2^{32}$
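The relation between word size and maxint can be checked numerically. This is a minimal sketch; the function names and the sample word sizes (16 and 32 bits) are assumptions chosen for illustration.

```python
def maxint(word_bits: int) -> int:
    """Largest integer of the symmetric range -maxint..+maxint in w bits."""
    return 2 ** (word_bits - 1) - 1


def integer_cardinality(word_bits: int) -> int:
    """Number of distinct values in -maxint..+maxint."""
    return 2 * maxint(word_bits) + 1


for w in (16, 32):
    # The cardinality must fit in w bits: 2**w >= 2 * maxint + 1.
    assert 2 ** w >= integer_cardinality(w)
    print(w, maxint(w), integer_cardinality(w))
```

The symmetric range leaves one bit pattern unused, which is why Java's `int` range, with one extra negative value, has exactly $2^{32}$ members.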

  15. Record Type • Consists of several fields, each of which has an identifier • All records of a particular type have fields with the same identifiers and types • Fundamental operation on records is field selection • Use one field identifier to access the corresponding field • Simple representation • Juxtapose the fields so that they occupy consecutive positions in storage • Allows us to predict the total size of each record and the position of each field relative to the base of the record

  16. Record Type • Consider the following: type T = record $I_1$: $T_1$, …, $I_n$: $T_n$ end; var r: T • $\text{size}\ T = \text{size}\ T_1 + \cdots + \text{size}\ T_n$ • If $\text{size}\ T_1, \ldots, \text{size}\ T_n$ are all constant, then size T is also constant • Implementation of field selection • $\text{address}[[r.I_i]] = \text{address}\ r + (\text{size}\ T_1 + \cdots + \text{size}\ T_{i-1})$ • Some machines have alignment restrictions, which force unused space to be left between record fields; these equations cannot be used on such machines • [Figure: record r laid out as consecutive fields $r.I_1$, $r.I_2$, …, $r.I_n$, holding values of types $T_1$, $T_2$, …, $T_n$ respectively]
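The field-selection equation amounts to a running sum of field sizes. The Python sketch below illustrates it under two assumptions that are not stated on this slide: there are no alignment restrictions, and each Integer field occupies one word. The function name `record_layout` is hypothetical.

```python
def record_layout(fields):
    """Compute field offsets for juxtaposed fields.

    fields is a list of (identifier, size) pairs in declaration order.
    Returns (offsets, total_size). Alignment padding is deliberately ignored.
    """
    offsets = {}
    offset = 0
    for identifier, size in fields:
        offsets[identifier] = offset  # sum of the sizes of all earlier fields
        offset += size
    return offsets, offset


# A three-field record of one-word Integers.
offsets, size = record_layout([("y", 1), ("m", 1), ("d", 1)])
base = 100  # assumed address of the record variable
print(size, {name: base + off for name, off in offsets.items()})
```

On a machine with alignment restrictions, the loop would also have to round `offset` up to each field's alignment before recording it.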

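  17. Disjoint Unions • Tag and a variant part • Value of the tag determines the type of the variant part • $T = T_1 + \cdots + T_n$ • In each value of type T, the variant part is a value chosen from one of the types $T_1, \ldots, T_n$; the tag indicates which one • $\text{size}\ T = \text{size}\ T_{tag} + \max(\text{size}\ T_1, \ldots, \text{size}\ T_n)$ • $\text{address}[[u.I_{tag}]] = \text{address}\ u + 0$ • $\text{address}[[u.I_i]] = \text{address}\ u + \text{size}\ T_{tag}$ • Variants smaller than the largest one leave wasted space • [Figure: three alternative layouts of u, each with the tag $u.I_{tag}$ (a value of type $T_{tag}$) followed by a variant $u.I_1$, $u.I_2$, … or $u.I_n$ (a value of type $T_1$, $T_2$, … or $T_n$); the variant area has size $\max(\text{size}\ T_1, \ldots, \text{size}\ T_n)$, and any remainder is wasted space]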
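The size formula and the wasted space can be illustrated with a short Python sketch. The variant names and sizes below are made up for the example, and `union_layout` is a hypothetical helper.

```python
def union_layout(tag_size, variant_sizes):
    """Layout of a disjoint union: a tag followed by the largest variant.

    variant_sizes maps each variant identifier to its size.
    Returns (total_size, variant_offset, wasted), where wasted gives the
    unused space left when each variant is the one stored.
    """
    largest = max(variant_sizes.values())
    total_size = tag_size + largest
    variant_offset = tag_size  # every variant starts right after the tag
    wasted = {name: largest - size for name, size in variant_sizes.items()}
    return total_size, variant_offset, wasted


# Assumed example: a one-word tag with a one-word and a two-word variant.
print(union_layout(1, {"exact": 1, "approx": 2}))
```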

  18. Static Arrays • Consists of several elements, all of the same type • Bounded range of indices – usually integers • Each index has exactly one element • Fundamental operation on arrays is indexing • Access an individual element by giving its index • Index evaluated at run-time • Static Array • Index bounds are known at compile-time • Direct representation is to juxtapose the array elements, in order of increasing indices. • Implemented by run-time address computation

  19. Static Arrays (lower index bound is 0) • Consider the following example: type T = array n of $T_{elem}$; var a: T • $\text{size}\ T = n \times \text{size}\ T_{elem}$ • The number of elements n is constant, so if size $T_{elem}$ is constant, then size T is also constant • $\text{address}[[a[i]]] = \text{address}\ a + (i \times \text{size}\ T_{elem})$ • Since i is known only at run-time, array indexing implies a run-time address computation • [Figure: elements a[0], a[1], a[2], …, a[n-1] stored consecutively, each a value of type $T_{elem}$]

  20. Static Arrays (programmer chooses lower and upper array bounds) • Consider the following example: type T = array [l..u] of $T_{elem}$; var a: T • $\text{size}\ T = (u - l + 1) \times \text{size}\ T_{elem}$ • The number of elements (u – l + 1) is constant, so if size $T_{elem}$ is constant, then size T is also constant • $\text{address}[[a[i]]] = \text{address}\ a + ((i - l) \times \text{size}\ T_{elem}) = \text{address}\ a - (l \times \text{size}\ T_{elem}) + (i \times \text{size}\ T_{elem})$ • $\text{address}[[a[0]]] = \text{address}\ a - (l \times \text{size}\ T_{elem})$ • $\text{address}[[a[i]]] = \text{address}[[a[0]]] + (i \times \text{size}\ T_{elem})$ • Since i is known only at run-time, array indexing implies a run-time address computation • The index check must ensure that $l \le i \le u$ • [Figure: elements a[l], a[l+1], a[l+2], …, a[u] stored consecutively, each a value of type $T_{elem}$]
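The address computation with an arbitrary lower bound can be sketched in Python as follows. The function name `static_element_address` and the sample numbers are assumptions; the point is that the origin (the address that a[0] would have) is a compile-time constant, so only one multiplication and one addition remain at run-time.

```python
def static_element_address(base, lower, upper, elem_size, i):
    """Address of a[i] for an array with bounds lower..upper stored at base."""
    if not lower <= i <= upper:
        raise IndexError(f"index {i} outside {lower}..{upper}")
    # The origin is where a[0] would be; it is known at compile-time and
    # may lie outside the array when lower > 0.
    origin = base - lower * elem_size
    return origin + i * elem_size


# Assumed example: array [5..9] of two-word elements stored at address 100.
print(static_element_address(100, 5, 9, 2, 7))
```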

  21. Dynamic Arrays • An array whose index bounds are not known until run-time • Different dynamic arrays of the same type may have different index bounds, and therefore different numbers of elements • Need to satisfy the constant-size requirement • Create an array descriptor or handle • Pointer to the array’s elements • Index bounds • The handle has constant size

  22. Dynamic Arrays • Ada example: type T is array (Integer range <>) of $T_{elem}$; a: T (E1 .. E2); • $\text{size}\ T = \text{address:size} + 2 \times \text{size}[[\text{Integer}]]$ • address:size is the amount of space required to store an address, usually one word • Satisfies the constant-size requirement • Declaration of array variable a: • E1 and E2 are evaluated to yield a’s index bounds (say l and u) • Space is allocated for (u – l + 1) elements, juxtaposed and separate from a’s handle • $\text{address}[[a(0)]] = \text{address}[[a(l)]] - (l \times \text{size}\ T_{elem})$ • Values for address[[a(0)]], l, and u are stored in a’s handle • The element with index i is addressed as follows: • $\text{address}[[a(i)]] = \text{address}[[a(0)]] + (i \times \text{size}\ T_{elem}) = \text{content}(\text{address}[[a]]) + (i \times \text{size}\ T_{elem})$ • The index check is $l \le i \le u$, where l = content(address[[a]] + address:size) and u = content(address[[a]] + address:size + size[[Integer]])

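  23. Dynamic Arrays • [Figure: the handle of a holds three items: the origin (the address of a[0]), the lower bound l, and the upper bound u. The elements a[l], a[l+1], a[l+2], …, a[u], each of type $T_{elem}$, are stored separately from the handle]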
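The handle scheme can be sketched in Python as below. The class `ArrayHandle`, the functions `make_handle` and `dynamic_element_address`, and the sample addresses are all illustrative assumptions. The handle always holds three items (origin, lower bound, upper bound), so its size does not depend on the number of elements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArrayHandle:
    origin: int  # address that element 0 would have
    lower: int   # lower index bound, known only at run-time
    upper: int   # upper index bound, known only at run-time


def make_handle(elements_base: int, lower: int, upper: int, elem_size: int) -> ArrayHandle:
    """Build the handle once the bounds have been evaluated at run-time."""
    return ArrayHandle(elements_base - lower * elem_size, lower, upper)


def dynamic_element_address(handle: ArrayHandle, elem_size: int, i: int) -> int:
    """Address of a(i): index check against the stored bounds, then origin + i * size."""
    if not handle.lower <= i <= handle.upper:
        raise IndexError(f"index {i} outside {handle.lower}..{handle.upper}")
    return handle.origin + i * elem_size


# Assumed example: bounds 3..6, one-word elements allocated at address 200.
a = make_handle(200, 3, 6, 1)
print(a, dynamic_element_address(a, 1, 3), dynamic_element_address(a, 1, 6))
```

Storing the origin rather than the address of the first element saves a subtraction on every indexing operation, at the cost of one subtraction when the array is created.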

  24. Status • Chapter 6: Run-time Organization • Data Representations • Primitive types • Record types • Disjoint unions • Static arrays • Dynamic arrays • Recursive types • Expression Evaluation • Register machine • Stack machine • Static Storage Allocation • Global variables • Stack Storage Allocation • Local variables

  25. Recursive Types • Defined in terms of itself • Values of recursive type T have components that are themselves of type T • Examples • List with tail being itself a list • Tree with the sub-trees themselves being trees

  26. Recursive Types • Consider the Pascal declaration: type IntList = ^IntNode; IntNode = record head: Integer; tail: IntList end; var primes: IntList • size[[IntList]] = address:size (usually 1 word) • Always use pointers to represent values of the recursive type • [Figure: the variable primes holds a handle that points to the first node of the list]

  27. Expression Evaluation – Register Machine • How should we organize the evaluation of expressions? • The problem is the need to keep intermediate results somewhere • Consider the expression a * b + (1 – (c * 2)) • It has intermediate results for a * b, c * 2, and 1 – (c * 2) • For a register-based machine (non-stack machine) • Use the registers to store intermediate results • A problem arises when there are not enough registers for all intermediate results

  28. Expression Evaluation – Example • a * b + (1 – (c * 2)) • LOAD R1 a; MULT R1 b; LOAD R2 #1; LOAD R3 c; MULT R3 #2; SUB R2 R3; ADD R1 R2 • a, b, c are the memory addresses of the values of a, b, c
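To see how the registers hold the intermediate results, the instruction sequence can be run on a toy interpreter. This Python sketch is only an illustration: the instruction semantics (first operand is the destination register; `#n` is a literal; a bare name is a memory location) are inferred from the example, and the sample values of a, b, and c are assumptions.

```python
import operator

ARITHMETIC = {"ADD": operator.add, "SUB": operator.sub, "MULT": operator.mul}


def operand_value(token, registers, memory):
    """Resolve a source operand: literal (#n), loaded register, or memory name."""
    if token.startswith("#"):
        return int(token[1:])
    if token in registers:
        return registers[token]
    return memory[token]


def run_register_code(program, memory):
    """Execute 'OP Rdest src' instructions and return the final registers."""
    registers = {}
    for instruction in program:
        op, dest, src = instruction.split()
        value = operand_value(src, registers, memory)
        if op == "LOAD":
            registers[dest] = value
        else:
            registers[dest] = ARITHMETIC[op](registers[dest], value)
    return registers


program = ["LOAD R1 a", "MULT R1 b", "LOAD R2 #1", "LOAD R3 c",
           "MULT R3 #2", "SUB R2 R3", "ADD R1 R2"]
registers = run_register_code(program, {"a": 3, "b": 4, "c": 5})
print(registers["R1"])  # value of a * b + (1 - (c * 2)) for the sample values
```

Three registers are live at once here (R1 for a * b, R2 for 1, R3 for c * 2), which is the pressure that becomes a problem for larger expressions.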

  29. Expression Evaluation – Stack Machine • The machine provides a stack for holding intermediate results • For the expression a * b + (1 – (c * 2)): LOAD a; LOAD b; MULT; LOADL 1; LOAD c; LOADL 2; MULT; SUB; ADD

  30. Expression Evaluation – Stack Machine Example • a * b + (1 – (c * 2)) • Stack contents, listed from bottom to top, after each instruction: • (1) After LOAD a: value of a • (2) After LOAD b: value of a, value of b • (3) After MULT: value of a*b • (4) After LOADL 1: value of a*b, 1 • (5) After LOAD c: value of a*b, 1, value of c • (6) After LOADL 2: value of a*b, 1, value of c, 2 • (7) After MULT: value of a*b, 1, value of c*2 • (8) After SUB: value of a*b, value of 1-(c*2) • (9) After ADD: value of (a*b)+(1-(c*2)) • The rest of the stack is unused space • Operands of different types (and therefore different sizes) can be evaluated in just the same way, e.g., AND, OR, function calls, etc. • Each operation takes its operands from the top of the stack and places its result onto the top of the stack
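The snapshots above can be reproduced with a toy stack machine. This Python sketch assumes the instruction meanings suggested by the slides (LOAD pushes the value of a variable, LOADL pushes a literal, and each arithmetic instruction pops two operands and pushes one result); the sample values of a, b, and c are made up.

```python
import operator

ARITHMETIC = {"ADD": operator.add, "SUB": operator.sub, "MULT": operator.mul}


def run_stack_code(program, memory):
    """Execute the program and return the stack contents after each instruction."""
    stack = []
    trace = []
    for instruction in program:
        parts = instruction.split()
        op = parts[0]
        if op == "LOAD":
            stack.append(memory[parts[1]])
        elif op == "LOADL":
            stack.append(int(parts[1]))
        else:
            right = stack.pop()  # the top of the stack is the right operand
            left = stack.pop()
            stack.append(ARITHMETIC[op](left, right))
        trace.append(list(stack))
    return trace


program = ["LOAD a", "LOAD b", "MULT", "LOADL 1", "LOAD c",
           "LOADL 2", "MULT", "SUB", "ADD"]
for step, contents in enumerate(run_stack_code(program, {"a": 3, "b": 4, "c": 5}), 1):
    print(step, contents)
```

No instruction names a register or a temporary, so the code generator never has to decide where an intermediate result lives.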

  31. Static Storage Allocation – Global Variables • Each variable in the source program requires enough storage to contain any value that might be assigned to it • As a consequence of constant-size representation, the compiler knows how much storage needs to be allocated to a variable, based on its type (size T) • Global variables • Variables that exist and take up storage throughout the program’s run-time • Static storage allocation: the compiler locates these variables at fixed positions in storage (it decides each global variable’s address relative to the base of the storage region in which global variables are located)

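  32. Static Storage Allocation – Global Variables: Example • let type Date = record y: Integer, m: Integer, d: Integer end; var a: array 3 of Integer; var b: Boolean; var c: Char; var t: Date in . . . • [Figure: storage layout of the globals in consecutive locations: a[0], a[1], a[2] (together forming a), then b, then c, then t.y, t.m, t.d (together forming t), followed by unused space]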
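The layout in this example can be derived mechanically from the declared types. The Python sketch below assumes that Integer, Boolean, and Char each occupy one word, which the slide does not state; the tuple encoding of types and the function names are hypothetical.

```python
PRIMITIVE = ("prim",)  # assumed to occupy one word


def size_of(type_):
    """size T, computed recursively from a small tuple encoding of types."""
    kind = type_[0]
    if kind == "prim":
        return 1
    if kind == "array":   # ("array", element_count, element_type)
        return type_[1] * size_of(type_[2])
    if kind == "record":  # ("record", [(field_identifier, field_type), ...])
        return sum(size_of(field_type) for _, field_type in type_[1])
    raise ValueError(f"unknown type kind: {kind}")


def allocate_globals(declarations):
    """Assign each global an address relative to the base of the globals region."""
    addresses = {}
    next_free = 0
    for name, type_ in declarations:
        addresses[name] = next_free
        next_free += size_of(type_)
    return addresses, next_free


DATE = ("record", [("y", PRIMITIVE), ("m", PRIMITIVE), ("d", PRIMITIVE)])
addresses, total = allocate_globals([
    ("a", ("array", 3, PRIMITIVE)),
    ("b", PRIMITIVE),
    ("c", PRIMITIVE),
    ("t", DATE),
])
print(addresses, total)
```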

  33. Stack Storage Allocation – Local Variables • A local variable v is one that is declared inside a procedure (or function) • Lifetime of v: the variable v exists (occupies storage) only during an activation of that procedure • If the same procedure is activated several times • v will have several lifetimes • Each activation creates a distinct variable

  34. Stack Storage Allocation – Local Variables: An Example • let var a: array 3 of Integer; var b: Boolean; var c: Char; proc Y () ~ let var d: Integer; var e: record c: Char, n: Integer end in . . . ; proc Z () ~ let var f: Integer in begin …; Y(); … end in begin …; Y(); …; Z(); … end

  35. Stack Storage Allocation – Local Variables: An Example • [Figure: timeline of variable lifetimes. Events in order: program calls Y, return from Y, program calls Z, Z calls Y, return from Y, return from Z, program stops. The lifetime of the global variables spans the whole run; the variables local to Y have one lifetime per activation of Y; the lifetime of the variables local to Z encloses the second activation of Y] • Observations: • Global variables are the only ones that exist throughout the program’s run-time • Use static allocation for global variables • Lifetimes of local variables are properly nested • Use a stack for local variables

  36. Stack Storage Allocation – Stack Frames: An Example • [Figure: seven snapshots of the stack. (1) After program starts: globals only. (2) After program calls Y: globals, frame for Y. (3) After return from Y: globals only. (4) After program calls Z: globals, frame for Z. (5) After Z calls Y: globals, frame for Z, frame for Y. (6) After return from Y: globals, frame for Z. (7) After return from Z: globals only. SB marks the base of the globals, LB the base of the topmost frame, and ST the top of the stack; dynamic links connect each frame to the frame beneath it] • Registers • SB: Stack Base, the location of the global variables • LB: Local Base, the local variables of the currently running procedure • ST: Stack Top, the very top of the stack

  37. Stack Storage Allocation • The stack varies in size • For example, the frames for Y’s two activations are at two different locations • The position of a frame within the stack cannot be predicted in advance • Need registers dedicated to pointing to the frames • Registers (addresses of variables are found relative to these registers) • SB: stack base – is fixed, pointing to the base of the stack. This is where the global variables are located. • LB: local base – points to the base of the topmost frame in the stack. This frame always contains the variables of the currently running procedure. • ST: stack top – points to the very top of the stack. ST keeps track of the frame boundary as expressions are evaluated and the top of the stack expands and contracts.

  38. Stack Storage Allocation • Frame contents • Space for local variables • Link data • Return address – the code address to which control will be returned at the end of the procedure activation. It is the address of the instruction following the call instruction that activated the procedure in the first place. • Dynamic link – the pointer to the base of the underlying frame in the stack. It is the old content of LB and will be restored at the end of the procedure activation. • Since there are two words of link data, local variable addresses are offset by 2 • This only considers access to local or global variables, not nested variables • [Figure: a frame consists of the link data (dynamic link, then return address) followed by the local data]
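The call and return protocol described above can be sketched as a small Python simulation. The class `FrameStack`, its method names, the choice of LB = SB before any call, and the code addresses used as return addresses are assumptions for illustration; the frame layout (dynamic link, return address, then locals) follows the slide.

```python
class FrameStack:
    """Toy model of a stack with registers SB, LB, and ST (word-addressed)."""

    LINK_WORDS = 2  # dynamic link + return address

    def __init__(self, globals_size):
        self.words = [0] * globals_size
        self.SB = 0             # fixed base of the stack, where the globals live
        self.LB = 0             # assumed to equal SB before any call
        self.ST = globals_size  # first free word above the stack top

    def call(self, return_address, locals_size):
        """Push a frame: link data first, then space for the local variables."""
        new_base = self.ST
        self.words.append(self.LB)            # dynamic link: old content of LB
        self.words.append(return_address)     # where to resume after return
        self.words.extend([0] * locals_size)
        self.LB = new_base
        self.ST = len(self.words)

    def ret(self):
        """Pop the topmost frame, restore LB, and yield the return address."""
        dynamic_link = self.words[self.LB]
        return_address = self.words[self.LB + 1]
        del self.words[self.LB:]
        self.ST = self.LB
        self.LB = dynamic_link
        return return_address

    def local_address(self, offset):
        """Address of a local variable: LB plus the link words plus its offset."""
        return self.LB + self.LINK_WORDS + offset


stack = FrameStack(globals_size=5)           # a (3 words), b, c
stack.call(return_address=100, locals_size=3)  # program calls Y
print("in Y:", stack.LB, stack.ST)
stack.ret()                                  # return from Y
stack.call(return_address=200, locals_size=1)  # program calls Z
stack.call(return_address=300, locals_size=3)  # Z calls Y
print("in Y called from Z:", stack.LB, stack.ST)
stack.ret()
stack.ret()
print("back in main program:", stack.LB, stack.ST)
```

The two activations of Y get different frame bases, which is why locals are addressed relative to LB rather than at fixed addresses.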
