Symbol Tables in Compiler Design

Compilers 8. Symbol Table
Chih-Hung Wang

References
1. C. N. Fischer, R. K. Cytron, and R. J. LeBlanc. Crafting a Compiler. Pearson Education Inc., 2010.
2. D. Grune, H. Bal, C. Jacobs, and K. Langendoen. Modern Compiler Design. John Wiley & Sons, 2000.
3. Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman. Compilers: Principles, Techniques, and Tools. Addison-Wesley, 1986 (2nd ed., 2006).

Symbol Table
• The symbol table is a useful abstraction that helps the compiler ascertain and verify the semantics, or meaning, of a piece of code. It keeps track of the names, types, locations, and properties of the symbols encountered in the program.
• It makes the compiler more efficient, since the file doesn’t need to be re-parsed to recover previously processed information.

What Should be Stored?
• Constants
• Variables
• Types (user-defined)
• Sub-programs
• Classes
• Inheritance
• Arrays
• Records
• Modules

Common Symbol Table

Complex Type

Symbol Table Organization (1)
A symbol table must support the following operations:
1) Find a record by name (e.g. access a variable, as in "x = 5")
2) Find a record by name in a specific scope (e.g. access a field of a record, as in "a.x = 5")
3) Find a record by its relationship with another record (e.g. get the type of a variable or the parameters of a function)
4) Insert a new record
5) Update an existing record
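
The five operations can be illustrated with a minimal Python sketch. The names `Record`, `SymbolTable`, `find_in`, and `type_of` are hypothetical and chosen only for this example; a real compiler would store far more per record and would handle scoping separately.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Record:
    name: str
    kind: str                         # e.g. "variable", "type", "field"
    type_name: Optional[str] = None   # relationship: name of this symbol's type
    fields: dict = field(default_factory=dict)  # members, if this is a record type


class SymbolTable:
    """Illustrative flat table; scoping is deliberately left out."""

    def __init__(self):
        self._records = {}

    def insert(self, record):
        # Operation 4: insert a new record.
        if record.name in self._records:
            raise KeyError(f"duplicate symbol: {record.name}")
        self._records[record.name] = record

    def find(self, name):
        # Operation 1: find a record by name.
        return self._records.get(name)

    def type_of(self, name):
        # Operation 3: follow the relationship from a symbol to its type record.
        return self.find(self.find(name).type_name)

    def find_in(self, owner, member):
        # Operation 2: find a name inside a specific scope, as in "a.x".
        return self.type_of(owner).fields.get(member)

    def update(self, name, **changes):
        # Operation 5: update an existing record in place.
        record = self._records[name]
        for attribute, value in changes.items():
            setattr(record, attribute, value)


table = SymbolTable()
table.insert(Record("int", "type"))
table.insert(Record("point", "type", fields={"x": Record("x", "field", "int")}))
table.insert(Record("a", "variable", "point"))
```

With this setup, `table.find_in("a", "x")` resolves the field `x` through the type of `a`, which is the lookup needed for an access such as "a.x = 5".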

Symbol Table Organization (2)
• We can split the symbol table implementation into two layers, where each layer represents a certain data abstraction.
• The bottom layer is an associative array, and is concerned with storing records and retrieving records by name.
• The top layer organizes the records into groups and determines how the scoping of the language is maintained.
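
As a rough sketch of this split, the Python below uses a plain `dict` as the bottom-layer associative array and a hypothetical `ScopedSymbolTable` class as the top layer. Grouping records as a stack of dictionaries is an assumption made for illustration; it is only one of several ways the top layer could be organized.

```python
class ScopedSymbolTable:
    """Top layer: groups records by scope and decides which group a lookup sees."""

    def __init__(self):
        # Bottom layer: each scope is a plain dict (an associative array)
        # that stores records and retrieves them by name.
        self._scopes = [{}]          # index 0 is the outermost (global) scope

    def enter_scope(self):
        self._scopes.append({})

    def exit_scope(self):
        if len(self._scopes) == 1:
            raise RuntimeError("cannot exit the outermost scope")
        self._scopes.pop()

    def insert(self, name, record):
        # New records always go into the innermost open scope.
        self._scopes[-1][name] = record

    def lookup(self, name):
        # Search from the innermost scope outward; the first match wins.
        for scope in reversed(self._scopes):
            if name in scope:
                return scope[name]
        return None


symbols = ScopedSymbolTable()
symbols.insert("x", "global x")
symbols.enter_scope()
symbols.insert("x", "local x")      # shadows the global x inside this scope
inner = symbols.lookup("x")
symbols.exit_scope()
outer = symbols.lookup("x")
```

Because the bottom layer only stores and retrieves by name, it could be replaced by another associative structure without changing the scoping logic in the top layer.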

Symbol Table Organization (3) • A library

Symbol Table Organization (4) • Linked list and binary tree

Symbol Table Organization (5) • Hash table

Scoping (1) • Example

Scoping (2) • Scope-by-number
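
Scope-by-number is commonly described as giving every scope a unique number and using the pair (name, scope number) as the key into a single table. The sketch below follows that reading, which is an assumption rather than the slide's own code; the class and method names are hypothetical.

```python
class ScopeByNumberTable:
    """One shared table; records are keyed by (name, scope number)."""

    def __init__(self):
        self._records = {}
        self._last_number = 0        # scope 0 is the outermost scope
        self._active = [0]           # numbers of the currently open scopes

    def enter_scope(self):
        # Every newly opened scope gets a number that is never reused.
        self._last_number += 1
        self._active.append(self._last_number)

    def exit_scope(self):
        # Records stay in the table but become unreachable by lookup.
        self._active.pop()

    def insert(self, name, record):
        self._records[(name, self._active[-1])] = record

    def lookup(self, name):
        # Try the name with each active scope number, innermost first.
        for number in reversed(self._active):
            record = self._records.get((name, number))
            if record is not None:
                return record
        return None


numbered = ScopeByNumberTable()
numbered.insert("x", "outer x")
numbered.enter_scope()               # scope 1
numbered.insert("x", "inner x")
seen_inside = numbered.lookup("x")
numbered.exit_scope()
numbered.enter_scope()               # scope 2, a sibling of scope 1
seen_in_sibling = numbered.lookup("x")
```

In the sibling scope the lookup falls back to the outer `x`, because the key ("x", 1) is no longer paired with an active scope number.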

Scoping (3) • Scope-by-location

Filling the Symbol Table (1)
• In the first case, information about a construct may appear in the source code before the kind of construct can be identified.
• Consider the declaration int x, y; in C. Until the parser sees the token that follows x (a comma or semicolon rather than an opening parenthesis), the compiler cannot tell whether x is a variable or a function.
• By the time it has identified x as a variable, the data type for that variable (int) has long since been processed.
• This means that some information will have to be stored in a temporary location until it can be used.
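
A minimal Python sketch of this first case, working on an already tokenized, heavily simplified C-style declaration. The function name `declare_c_style` and the record layout are assumptions for illustration; a real parser would do this inside its grammar actions.

```python
def declare_c_style(tokens):
    """Handle a simplified declaration such as: int x , y ;"""
    table = {}
    pending_type = tokens[0]     # seen first, held until each name's kind is known
    pos = 1
    while True:
        name = tokens[pos]
        if tokens[pos + 1] == "(":
            # An opening parenthesis after the name marks a function.
            kind = "function"
            pos = tokens.index(")", pos) + 1
        else:
            kind = "variable"
            pos += 1
        # Only now can the stored type be attached to a record.
        table[name] = {"kind": kind, "type": pending_type}
        if tokens[pos] == ";":
            return table
        pos += 1                 # skip the comma before the next name


variables = declare_c_style("int x , y ;".split())
mixed = declare_c_style("int f ( ) , z ;".split())
```

The type token is read once and kept in `pending_type`, which plays the role of the temporary location mentioned above.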

Filling the Symbol Table (2)
• In the second case, additional information about a construct may not appear until after the record for that construct has been created.
• In SAL and Pascal the syntax for variable declarations is var x, y : int;. We know that x is a variable as soon as its name is encountered, and its symbol table record should be created immediately.
• The data type will be encountered later in the source code, and x’s record will then need to be updated with its type information.
• In this case, a reference to the record should be kept so that it can be updated when more information becomes available.
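
The second case can be sketched the same way in Python, again on a simplified token list. The function name `declare_pascal_style` and the record layout are hypothetical; the point is only that records are created early and completed later through saved references.

```python
def declare_pascal_style(tokens):
    """Handle a simplified declaration such as: var x , y : int ;"""
    table = {}
    pending = []                 # references to records still missing a type
    pos = 1                      # skip the leading 'var'
    while tokens[pos] != ":":
        if tokens[pos] != ",":
            # The name is known to be a variable, so create its record now.
            record = {"kind": "variable", "type": None}
            table[tokens[pos]] = record
            pending.append(record)
        pos += 1
    declared_type = tokens[pos + 1]
    for record in pending:
        # The type has finally been seen; update every waiting record.
        record["type"] = declared_type
    return table


declared = declare_pascal_style("var x , y : int ;".split())
```

Here `pending` holds references to the same dictionaries stored in `table`, so updating through the list also updates the table entries.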

Implementation in Lex & Yacc (0) -Simple

Implementation in Lex & Yacc (1) • Declaration (1)

Implementation in Lex & Yacc (4) • Parser (in Yacc) (1)

Implementation in Lex & Yacc (5) • Parser (in Yacc) (2)

Implementation in Lex & Yacc (6) • Token

Implementation in Lex & Yacc (7) • Grammar rule (in Yacc) (1)

Implementation in Lex & Yacc (8) • Scanner (in Lex)
