
CS 3304 Comparative Languages

CS 3304 Comparative Languages. Lecture 7: Syntax Tree 7 February 2012. Introduction. We can tie this discussion back into the earlier issue of separated phases versus on-the-fly semantic analysis and/or code generation.

manaj


  1. CS 3304 Comparative Languages • Lecture 7: Syntax Tree • 7 February 2012

  2. Introduction • We can tie this discussion back into the earlier issue of separated phases versus on-the-fly semantic analysis and/or code generation. • If semantic analysis and/or code generation are interleaved with parsing, then the translation scheme we use to evaluate attributes must be L-attributed. • If we break semantic analysis and code generation out into separate phase(s), then the code that builds the parse/syntax tree must still use a left-to-right (L-attributed) translation scheme. • However, the later phases are free to use a fancier translation scheme if they want.

  3. Translation Scheme • There are automatic tools that construct a semantic analyzer (attribute evaluator) for a given attribute grammar. In other words, they generate translation schemes for context-free grammars or tree grammars (which describe the possible structure of a syntax tree). • These tools are heavily used in syntax-based editors and incremental compilers. • Most production compilers, however, use an ad hoc, handwritten translation scheme: • Interleave parsing with at least the initial construction of a syntax tree. • Possibly all of semantic analysis and intermediate code generation. • Since the attributes of each production are evaluated as the production is parsed, there is no need for the full parse tree.

  4. Action Routines I • An ad hoc translation scheme that is interleaved with parsing takes the form of a set of action routines: • An action routine is a semantic function that we tell the compiler to execute at a particular point in the parse. • If semantic analysis and code generation are interleaved with parsing, then action routines can be used to perform semantic checks and generate code. • LL parser generator: an action routine can appear anywhere within a right-hand side. • Implementation: when the parser predicts a production, it pushes all of the right-hand side onto the stack, including pointers to any action routines, and calls an action routine when its pointer reaches the top of the stack.
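
The stack discipline described on this slide can be sketched in Python. This is a minimal illustration, not the lecture's code: the toy grammar (E → num TT, TT → + num TT | ε), the function name `parse_sum`, and the action routines `push_num` and `add` are all assumptions made for the example. Callables stand in for the "pointers to action routines" on the parse stack.

```python
def parse_sum(tokens):
    """Table-free LL(1) driver for the assumed grammar
         E  -> num {push_num} TT
         TT -> + num {push_num} {add} TT | epsilon
    The parse stack holds terminals, nonterminals, and action routines."""
    tokens = list(tokens) + ["$$"]
    pos = 0
    values = []  # attribute space managed explicitly by the action routines

    def kind(tok):
        return "num" if tok.isdigit() else tok

    def push_num():
        # Runs right after a num was matched, so the number is the previous token.
        values.append(int(tokens[pos - 1]))

    def add():
        right = values.pop()
        left = values.pop()
        values.append(left + right)

    def predict(nonterminal, lookahead):
        if nonterminal == "E":
            return ["num", push_num, "TT"]
        if nonterminal == "TT" and lookahead == "+":
            return ["+", "num", push_num, add, "TT"]
        if nonterminal == "TT" and lookahead == "$$":
            return []  # epsilon production
        raise SyntaxError(f"unexpected {lookahead!r}")

    stack = ["$$", "E"]
    while stack:
        top = stack.pop()
        if callable(top):            # action routine reached the top: call it
            top()
        elif top in ("E", "TT"):     # nonterminal: predict and push the RHS reversed
            stack.extend(reversed(predict(top, kind(tokens[pos]))))
        else:                        # terminal: must match the input
            if kind(tokens[pos]) != top:
                raise SyntaxError(f"expected {top!r}, saw {tokens[pos]!r}")
            pos += 1
    return values.pop()


print(parse_sum(["1", "+", "2", "+", "3"]))  # prints 6
```

Because each action routine sits at a fixed position inside the pushed right-hand side, it fires exactly when the symbols to its left have been fully parsed.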

  5. Action Routines II • If semantic analysis and code generation are broken out as separate phases, then action routines can be used to build a syntax tree: • A parse tree could be built completely automatically. • We wouldn't need action routines for that purpose. • Later compilation phases can then consist of ad hoc tree traversal(s), or can use an automatic tool to generate a translation scheme. • The PL/0 compiler uses ad-hoc traversals that are almost (but not quite) left-to-right.

  6. Action Routines Example (Figure 4.9) • For our LL(1) attribute grammar (Figure 4.6), we could put in explicit action routines:

  7. Constructing Syntax Tree • Productions 2-5: term_tail procedure for a syntax tree: • The parameter is a pointer to the syntax tree fragment in TT1. • Inspects the upcoming input token. • Calls add_op to parse that symbol. • Calls term to parse the attribute grammar's T. • Calls make_bin_op to create a new tree node. • Passes that node to term_tail, which parses TT2. • Returns the result.

     procedure term_tail(lhs : tree_node_ptr)
         case input_token of
             +, - :
                 op : string := add_op
                 return term_tail(make_bin_op(op, lhs, term))
                     -- term is a recursive call with no arguments
             ), id, read, write, $$ :    -- epsilon production
                 return lhs
             otherwise parse_error
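
A runnable Python rendering of the term_tail procedure may help. It is a sketch under simplifying assumptions: `term` accepts only a single identifier or integer literal (the slide's grammar has factors and multiplication as well), tree nodes are plain tuples, and the `Parser` class and its helper names other than `term_tail`, `term`, `add_op`, and `make_bin_op` are invented for the example.

```python
def make_bin_op(op, left, right):
    """Create a binary-operator tree node (a tuple, by assumption)."""
    return (op, left, right)


class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$$"]  # $$ marks end of input
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def advance(self):
        tok = self.tokens[self.pos]
        self.pos += 1
        return tok

    def expr(self):
        # E -> T TT: the tree for T becomes the left operand handed to TT.
        return self.term_tail(self.term())

    def term(self):
        # Simplification: a term is one identifier or integer literal.
        tok = self.advance()
        if tok in ("read", "write") or not (tok.isidentifier() or tok.isdigit()):
            raise SyntaxError(f"expected operand, saw {tok!r}")
        return tok

    def add_op(self):
        return self.advance()

    def term_tail(self, lhs):
        tok = self.peek()
        if tok in ("+", "-"):
            op = self.add_op()
            # Build the node for "lhs op term", then pass it on as the new lhs.
            return self.term_tail(make_bin_op(op, lhs, self.term()))
        if tok in (")", "$$") or tok.isidentifier():  # ), id, read, write, $$
            return lhs  # epsilon production
        raise SyntaxError(f"unexpected {tok!r}")


print(Parser(["a", "-", "b", "+", "c"]).expr())
# prints ('+', ('-', 'a', 'b'), 'c')
```

Passing the partially built tree down as `lhs` is what makes the result left-associative even though the grammar is right-recursive.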

  8. Bottom-Up Evaluation • An LR parser does not in general know what production it is in until it has seen all or most of the yield, so action routines cannot be embedded at arbitrary places in a right-hand side. • Action routines are allowed only after the point at which the production is identified unambiguously (the trailing part of the right-hand side). • The ambiguous part is the left corner. • If the attribute flow is strictly bottom-up, then executing action routines at the end of the right-hand side is all that is needed. • If the action routines are doing a lot of semantic analysis, they need some contextual information. • That requires access to inherited attributes or to information outside the current production.

  9. Space Management for Attributes • If there is a parse tree, the attributes can be stored in nodes. • For a bottom-up parser with an S-attributed grammar, maintain an attribute stack mirroring the parse stack: • Next to every state number is an attribute record for the symbol shifted when entering the state. • Entries are pushed and popped automatically. • For a top-down parser with an L-attributed grammar: • Automatic: an attribute stack that does not mirror the parse stack. • Short-cutting copy rules: action routines allocate and deallocate space for attributes explicitly. • Contextual information: • Symbol table that always represents the current referencing environment.
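
The attribute stack for a bottom-up parser with an S-attributed grammar can be illustrated as follows. This sketch assumes the LR driver has already chosen the shift and reduce actions, so the action trace is written out by hand for the input 2 + 3 * 4 under the usual grammar E → E + T | T, T → T * F | F, F → num. The function name `run_actions` and the tuple encoding of actions are hypothetical.

```python
def run_actions(actions):
    """Maintain an attribute stack in step with a (simulated) parse stack.

    ("shift", value)          pushes the attribute record of the shifted symbol.
    ("reduce", rhs_len, rule) pops rhs_len records, applies the semantic rule,
                              and pushes the record for the left-hand side."""
    attr_stack = []
    for action in actions:
        if action[0] == "shift":
            attr_stack.append(action[1])
        else:
            _, rhs_len, rule = action
            start = len(attr_stack) - rhs_len
            args = attr_stack[start:]
            del attr_stack[start:]
            attr_stack.append(rule(*args))
    return attr_stack


def copy(value):
    return value  # unit productions just pass the attribute up


trace = [
    ("shift", 2), ("reduce", 1, copy),      # F -> num
    ("reduce", 1, copy),                    # T -> F
    ("reduce", 1, copy),                    # E -> T
    ("shift", "+"),
    ("shift", 3), ("reduce", 1, copy),      # F -> num
    ("reduce", 1, copy),                    # T -> F
    ("shift", "*"),
    ("shift", 4), ("reduce", 1, copy),      # F -> num
    ("reduce", 3, lambda t, _, f: t * f),   # T -> T * F
    ("reduce", 3, lambda e, _, t: e + t),   # E -> E + T
]
print(run_actions(trace))  # prints [14]
```

Every semantic rule here runs at the end of its right-hand side and reads only the records it pops, which is exactly what a strictly bottom-up attribute flow permits.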

  10. Example: Calculator Language • A calculator language with types and declarations. • Declarations are intermixed with statements. • Differentiate between integer and real constants. • Explicit conversion between integer and real operands is required. • Every identifier should be declared before use. • The types should not be mixed in computations. • Constructing the syntax tree: adding semantic functions or action routines to the context free grammar for the calculator language.

  11. Context Free Grammar

  12. Example: Syntax Tree • Syntax tree for a simple program to print an average of an integer and a real (Figure 4.12).

  13. Tree Grammar • Represents the possible structure of syntax trees. • A tree grammar production represents a possible relationship between a parent and its children in the tree. • No need for parsing. • Tree grammars provide a framework for the decoration of syntax trees. • Can be used to perform static semantic checking.

  14. Example: Tree Grammar • Tree grammar representing structure of syntax tree in Figure 4.12. • The notation A : B on the left-hand side of a production means that A is one variant of B and may appear anywhere a B is expected on a right-hand side.

  15. Example: Complete Tree Grammar • A sample from a complete tree grammar representing structure of syntax tree in Figure 4.12. • Constructed using node classes, variants, and attributes (inherited and synthesized), Figure 4.13. • Classes: program, item, and expr. • item variants: int_decl, real_decl, read, write, :=, and null.

  16. Using Tree Grammar • The program node at the root of the syntax tree contains a list (synthesized attribute) of all static semantic errors. • Each item or expr node has an inherited attribute symtab that contains a list (with types) of all identifiers declared to the left in the tree. • Each item node has: • An inherited attribute errors_in that lists all static semantic errors found to its left in the tree. • A synthesized attribute errors_out to propagate the final error list back to the root. • Each expr node has: • A synthesized attribute that indicates its type. • A synthesized attribute that contains a list of any semantic errors found inside.

  17. Decorating Syntax Tree (Figure 4.15) • Symbol table information flows along the chain of items and down into expr trees. • Type information is synthesized at id : expr leaves (by lookup in the symbol table). • The information then propagates upward within an expression tree and is used to type-check operators and assignments. • Error messages flow along the chain of items via the errors_in attributes. • Error messages flow back to the root via the errors_out attributes. • Messages also flow up out of expr trees. • Whenever a type check is performed, the type attribute may be used to help create a new message and append it to a list.
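
The attribute flow on this slide can be sketched as a small Python checker. Everything about the encoding is an assumption made for illustration: nodes are tagged tuples rather than the node classes of the tree grammar, the symbol table is a dict, and the error message wording is invented. The parameters `symtab` and `errors_in` play the role of inherited attributes, and the return values play the role of synthesized ones.

```python
def check_expr(expr, symtab):
    """Return (type, errors) for an expr node; both are synthesized."""
    kind = expr[0]
    if kind == "int_const":
        return "int", []
    if kind == "real_const":
        return "real", []
    if kind == "id":
        name = expr[1]
        if name in symtab:
            return symtab[name], []
        return "error", [f"{name} undeclared"]
    if kind in ("float", "trunc"):  # explicit conversions
        expected, result = ("int", "real") if kind == "float" else ("real", "int")
        operand_type, errors = check_expr(expr[1], symtab)
        if operand_type not in (expected, "error"):
            errors = errors + [f"{kind} expects {expected}"]
        return result, errors
    if kind == "bin":
        _, op, left, right = expr
        left_type, left_errors = check_expr(left, symtab)
        right_type, right_errors = check_expr(right, symtab)
        errors = left_errors + right_errors
        if "error" in (left_type, right_type):
            return "error", errors  # avoid cascading messages
        if left_type != right_type:
            return "error", errors + [f"type clash at {op}"]
        return left_type, errors
    raise ValueError(f"unknown expr node {kind!r}")


def check_item(item, symtab, errors_in):
    """Return (symtab for the next item, errors_out)."""
    kind = item[0]
    if kind in ("int_decl", "real_decl"):
        name = item[1]
        if name in symtab:
            return symtab, errors_in + [f"{name} redeclared"]
        declared_type = "int" if kind == "int_decl" else "real"
        return {**symtab, name: declared_type}, errors_in
    if kind == "read":
        if item[1] not in symtab:
            return symtab, errors_in + [f"{item[1]} undeclared"]
        return symtab, errors_in
    if kind == "write":
        _, errors = check_expr(item[1], symtab)
        return symtab, errors_in + errors
    if kind == "assign":
        _, name, expr = item
        expr_type, errors = check_expr(expr, symtab)
        errors = errors_in + errors
        if name not in symtab:
            errors = errors + [f"{name} undeclared"]
        elif expr_type not in (symtab[name], "error"):
            errors = errors + [f"type clash in assignment to {name}"]
        return symtab, errors
    raise ValueError(f"unknown item node {kind!r}")


def check_program(items):
    """The root's synthesized attribute: all static semantic errors."""
    symtab, errors = {}, []
    for item in items:
        symtab, errors = check_item(item, symtab, errors)
    return errors


# int a; read a; real b; read b; write (float(a) + b) / 2.0
average = [
    ("int_decl", "a"), ("read", "a"), ("real_decl", "b"), ("read", "b"),
    ("write", ("bin", "/",
               ("bin", "+", ("float", ("id", "a")), ("id", "b")),
               ("real_const", 2.0))),
]
print(check_program(average))  # prints []
```

Dropping the `float` conversion from the example program would make `check_program` report a type clash at `+`, since integer and real operands may not be mixed.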

  18. Summary • Most compilers rely on action routines that evaluate attribute rules at specific points in a parse. • The automatic approach is easier to maintain; the ad hoc approach is slightly faster and more flexible. • In a one-pass compiler, semantic functions or action routines are responsible for all of semantic analysis and code generation; more commonly, they build a syntax tree.
