
Environment-Passing Interpreters

This presentation covers the SLLGEN parsing system. Scanning divides a character sequence into units such as whitespace, comments, identifiers, and numbers; the scanner returns tokens that carry a lexical class, descriptive data, and an indication of the input position.

awilliams




Presentation Transcript


  1. Environment-Passing Interpreters • Programming Language Essentials • 2nd edition • Appendix: The SLLGEN Parsing System

  2. Scanning
     • divide the character sequence into units such as whitespace, comments, identifiers, numbers, …
     • commonly expressed through regular expressions, i.e., patterns.
     • the scanner should return a token: lexical class, descriptive data, and an indication of the input position:

       class        Scheme data
       identifier   symbol
       number       numerical value
       literal      string value

  3. Scanner
       (define the-lexical-spec
         '((white-sp (whitespace) skip)
           (comment ("%" (arbno (not #\newline))) skip)
           (id (letter (arbno (or letter digit "?"))) symbol)
           (number (digit (arbno digit)) number)))

  4. Lexical Specification
       ((class (regexp ..) outcome) .. )
     • class: a symbol that will be used in the grammar specification.
     • regexp: a pattern to be matched.
     • outcome: one of
         skip     to ignore the token,
         symbol   to return a Scheme symbol as data,
         number   to return a Scheme number as data,
         string   to return a Scheme string as data.
     • the longest match wins; on a tie, string takes precedence over symbol.

  5. regexp
       regexp: string
             : letter | digit | whitespace | any
             : (not character)
             : (or regexp ..)
             : (arbno regexp)
             : (concat regexp ..)
     • this is more than grep and less than egrep or lex.
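The lexical specification above can be tried out directly. The following is a minimal sketch, assuming Racket's `eopl` language (which bundles SLLGEN) rather than the original distribution files. The second `number` entry and `placeholder-grammar` are hypothetical additions for illustration and do not come from the slides.

```scheme
#lang eopl
;; Assumption: Racket's `eopl` language, which provides the sllgen: procedures.

;; The lexical specification from the slides, plus one hypothetical extra
;; entry: a second `number` pattern that accepts a leading minus sign.
(define the-lexical-spec
  '((white-sp (whitespace) skip)
    (comment ("%" (arbno (not #\newline))) skip)
    (id (letter (arbno (or letter digit "?"))) symbol)
    (number (digit (arbno digit)) number)
    (number ("-" digit (arbno digit)) number)))

;; Hypothetical placeholder grammar: the scanner constructor takes a grammar
;; as well, because strings that appear in the grammar become keyword tokens.
(define placeholder-grammar
  '((program (number) a-program)))

(define scan
  (sllgen:make-string-scanner the-lexical-spec placeholder-grammar))

;; Whitespace and the trailing comment are skipped; the identifier and the
;; two numbers come back as tokens.
(define tokens (scan "count? -42 7 % trailing comment"))
(display tokens)
(newline)
```

Listing `number` twice gives the class two alternative patterns, which avoids nesting `or` and `concat` for a simple case like this.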

  6. Parsing
     • organize the token sequence into an abstract syntax tree over a defined datatype, based on a context-free grammar

       grammar                 representation
       nonterminal             datatype
       each alternative rhs    variant in datatype
       identifier in rhs       field with symbol
       number                  field with number
       nonterminal             field with AST value
       string                  [not collected]

  7. Parser
       (define the-grammar
         '((program (expression) a-program)
           (expression (number) lit-exp)
           (expression (id) var-exp)
           (expression
             (primitive "(" (separated-list expression ",") ")")
             primapp-exp)
           (primitive ("+") add-prim)
           (primitive ("-") subtract-prim)
           (primitive ("*") mult-prim)
           (primitive ("add1") incr-prim)
           (primitive ("sub1") decr-prim)))

  8. Grammar Specification
       ((nonterminal (item ..) variant) .. )
     • nonterminal: a symbol representing a nonterminal; the first one is the start symbol.
     • item ..: a sequence defining one alternative right-hand side.
     • variant: a symbol to be used as the variant name for the datatype representing the nonterminal.
     • alternative right-hand sides are specified by repeating the nonterminal and using a different variant.

  9. item
       item: nonterminal | class | string
           : (arbno item ..)
           : (separated-list item .. string)
     • this represents extended BNF, without notations for optional items, items to be repeated at least once, or alternatives.
     • SLLGEN checks that the grammar is LL(1).
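To see `arbno` and `separated-list` side by side, the grammar from the slides can be extended with a `let` form. This is a sketch, assuming Racket's `eopl` language; the `let-exp` production is an illustrative addition rather than part of the slides, and the list of primitives is shortened to keep the example small.

```scheme
#lang eopl
;; Assumption: Racket's `eopl` language, which provides the sllgen: procedures.

(define the-lexical-spec
  '((white-sp (whitespace) skip)
    (comment ("%" (arbno (not #\newline))) skip)
    (id (letter (arbno (or letter digit "?"))) symbol)
    (number (digit (arbno digit)) number)))

(define the-grammar
  '((program (expression) a-program)
    (expression (number) lit-exp)
    (expression (id) var-exp)
    ;; separated-list: zero or more expressions with "," between them
    (expression
      (primitive "(" (separated-list expression ",") ")")
      primapp-exp)
    ;; arbno: zero or more repetitions of  id "=" expression
    ;; (illustrative production, not in the slides)
    (expression
      ("let" (arbno id "=" expression) "in" expression)
      let-exp)
    (primitive ("+") add-prim)
    (primitive ("add1") incr-prim)))

;; Create the AST datatypes first, then build the parser.
(sllgen:make-define-datatypes the-lexical-spec the-grammar)

(define scan&parse
  (sllgen:make-string-parser the-lexical-spec the-grammar))

(define ast (scan&parse "let x = 1 y = add1(x) in +(x, y)"))
(display ast)
(newline)
```

Each alternative starts with a different token (a number, an identifier, a primitive, or the keyword `let`), which is what lets the grammar pass the LL(1) check.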

  10. Operations
       (sllgen:list-define-datatypes scan parse)
       (sllgen:make-define-datatypes scan parse)
     • display or create the AST datatype
       (sllgen:make-string-scanner scan parse)
       (sllgen:make-string-parser scan parse)
     • return functions accepting a string and returning a token list or an AST
       (sllgen:make-rep-loop prompt eval
         (sllgen:make-stream-parser scan parse))
     • returns a parameterless function running a read-evaluate-print loop
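The operations above fit together roughly as follows. This is a sketch, assuming Racket's `eopl` language; `eval-program`, `eval-expression`, and `apply-primitive` are hypothetical helper names for a deliberately small evaluator with no environment, so variables are reported as unbound.

```scheme
#lang eopl
;; Assumption: Racket's `eopl` language, which provides sllgen:, cases, eopl:error.

(define the-lexical-spec
  '((white-sp (whitespace) skip)
    (comment ("%" (arbno (not #\newline))) skip)
    (id (letter (arbno (or letter digit "?"))) symbol)
    (number (digit (arbno digit)) number)))

(define the-grammar
  '((program (expression) a-program)
    (expression (number) lit-exp)
    (expression (id) var-exp)
    (expression
      (primitive "(" (separated-list expression ",") ")")
      primapp-exp)
    (primitive ("+") add-prim)
    (primitive ("-") subtract-prim)
    (primitive ("*") mult-prim)
    (primitive ("add1") incr-prim)
    (primitive ("sub1") decr-prim)))

;; Show the generated define-datatype forms, then actually define them.
(display (sllgen:list-define-datatypes the-lexical-spec the-grammar))
(newline)
(sllgen:make-define-datatypes the-lexical-spec the-grammar)

;; String -> AST.
(define scan&parse
  (sllgen:make-string-parser the-lexical-spec the-grammar))

;; Hypothetical evaluator: walks the AST with cases.
(define apply-primitive
  (lambda (prim args)
    (cases primitive prim
      (add-prim () (+ (car args) (cadr args)))
      (subtract-prim () (- (car args) (cadr args)))
      (mult-prim () (* (car args) (cadr args)))
      (incr-prim () (+ (car args) 1))
      (decr-prim () (- (car args) 1)))))

(define eval-expression
  (lambda (exp)
    (cases expression exp
      (lit-exp (datum) datum)
      (var-exp (id) (eopl:error 'eval-expression "unbound variable ~s" id))
      (primapp-exp (prim rands)
        (apply-primitive prim (map eval-expression rands))))))

(define eval-program
  (lambda (pgm)
    (cases program pgm
      (a-program (body) (eval-expression body)))))

;; A parameterless procedure; calling (read-eval-print) starts the loop.
(define read-eval-print
  (sllgen:make-rep-loop "--> " eval-program
    (sllgen:make-stream-parser the-lexical-spec the-grammar)))

(display (eval-program (scan&parse "add1(+(3,4))")))
(newline)
```

The loop is only constructed here, not started, so the file can be loaded without blocking on input.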

  11. AST Mapping
       ((nonterminal (item ..) variant) .. )
       item: nonterminal | class | string
           : (arbno item ..)
           : (separated-list item .. string)

       (define-datatype nonterminal nonterminal?
         (variant
           (field-name nonterminal?)
           ..)
         ..)

  12. AST Mapping
       ((nonterminal (item ..) variant) .. )
       item: nonterminal | class | string
           : (arbno item ..)
           : (separated-list item .. string)

       (define-datatype nonterminal nonterminal?
         (variant
           (field-name symbol?)
           ..)
         ..)

  13. AST Mapping
       ((nonterminal (item ..) variant) .. )
       item: nonterminal | class | string
           : (arbno item ..)
           : (separated-list item .. string)
     • string, i.e., a keyword, is not represented in the AST; if necessary, the grammar has to map a string to a nonterminal for representation.

  14. AST Mapping
       ((nonterminal (item ..) variant) .. )
       item: nonterminal | class | string
           : (arbno nt class ..)
           : (separated-list nt class .. string)

       (define-datatype nonterminal nonterminal?
         (variant
           (field-name1 (list-of nt?))
           (field-name2 (list-of symbol?))
           ..)
         ..)

  15. AST Mapping
       ((nonterminal (item ..) variant) .. )
       item: (arbno
               (separated-list nt class .. string))

       (define-datatype nonterminal nonterminal?
         (variant
           (name1 (list-of (list-of nt?)))
           (name2 (list-of (list-of symbol?)))
           ..)
         ..)
     • the symbol sequence is flattened into the variant
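As a concrete instance of the mapping, consider a production of the form `("let" (arbno id "=" expression) "in" expression)`, used here purely as an illustration. The sketch below writes out by hand the kind of datatype such a production corresponds to, assuming Racket's `eopl` language. The field names `ids`, `rands`, and `body` and the helper `let-ids` are hypothetical; SLLGEN chooses its own field names.

```scheme
#lang eopl
;; Assumption: Racket's `eopl` language (define-datatype, cases, list-of).

;; Hand-written counterpart of a datatype for:
;;   (expression (number) lit-exp)
;;   (expression (id) var-exp)
;;   (expression ("let" (arbno id "=" expression) "in" expression) let-exp)
;; The keywords "let", "=" and "in" leave no trace in the variant.
;; The arbno body contributes one list field per collected item, in item
;; order: a list of symbols and a list of expressions.
(define-datatype expression expression?
  (lit-exp (datum number?))
  (var-exp (id symbol?))
  (let-exp
    (ids (list-of symbol?))
    (rands (list-of expression?))
    (body expression?)))

;; What a parse of "let x = 1 y = 2 in x" would correspond to.
(define sample
  (let-exp '(x y) (list (lit-exp 1) (lit-exp 2)) (var-exp 'x)))

;; Hypothetical helper: pull the bound identifiers out of a let-exp.
(define let-ids
  (lambda (exp)
    (cases expression exp
      (let-exp (ids rands body) ids)
      (else '()))))

(display (let-ids sample))
(newline)
```

The bindings are stored as two parallel lists rather than as a list of pairs, which is the flattening the last slide refers to.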
