This chapter explores Chomsky's formal model of language from the 1950s, introducing a hierarchy of grammars classified by complexity: Regular, Context-free, Context-sensitive, and Unrestricted. It highlights the development of Backus-Naur Form (BNF) by John Backus and Peter Naur in 1960 as a notation for context-free grammars, noting that it is more expressive than regular expressions. The text also covers lexical analysis, defining the tokens and grammatical categories that make up a programming language's lexicon, and the role of regular expressions in specifying a language at the lexical level.
CHAPTER TWO: Syntax
Programming Languages – Principles and Paradigms, by Allen Tucker and Robert Noonan
Chomsky’s formal model:
• Noam Chomsky developed a formal model of language types in the late 1950s.
• Grammars were classified in a hierarchy of ascending complexity.
• The original four classifications were:
  • Regular
  • Context-free
  • Context-sensitive
  • Unrestricted
Backus-Naur Form (BNF):
• Created by John Backus and Peter Naur in 1960 to help define the syntax of the programming language Algol.
• Based on Chomsky’s work.
• BNF is a notation for writing context-free grammars.
• More powerful than regular expressions: it can describe nested structures, such as balanced parentheses, that no regular expression can.
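Balanced parentheses are a standard example of a language that is context-free but not regular. The sketch below is a minimal Python recognizer for the grammar S -> ( S ) S | empty. Both the grammar and the name `is_balanced` are chosen here for illustration and are not taken from the chapter.

```python
def is_balanced(text):
    """Recognize the language of the grammar  S -> '(' S ')' S | empty."""

    def parse_s(pos):
        """Match one S starting at pos.

        Returns the position just after the matched S, or -1 on failure.
        """
        if pos < len(text) and text[pos] == "(":
            inner_end = parse_s(pos + 1)      # the S inside the parentheses
            if inner_end == -1 or inner_end >= len(text) or text[inner_end] != ")":
                return -1                     # missing closing parenthesis
            return parse_s(inner_end + 1)     # the S that follows ')'
        return pos                            # the empty alternative

    # The whole input must be consumed by a single S.
    return parse_s(0) == len(text)


print(is_balanced("(()())"))  # True
print(is_balanced("(()"))     # False
```

The recursion in `parse_s` mirrors the recursion in the grammar rule. It is this ability to nest to arbitrary depth that a regular expression lacks.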
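Backus-Naur Form (BNF):
Each BNF rule rewrites a nonterminal symbol as a sequence of terminal and nonterminal symbols; repeated rewriting eventually yields a string made up of terminal symbols only.
Example rules:
Integer -> Digit | Integer Digit
Digit -> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9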
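The rewriting process can be made concrete by listing every intermediate form in a derivation. The following Python sketch applies the two example rules to derive a given digit string, always rewriting the leftmost nonterminal. The function name `derive_integer` is hypothetical, and the approach (expand Integer first, then replace each Digit) is one of several valid derivation orders.

```python
DIGITS = "0123456789"


def derive_integer(digits):
    """Return the sentential forms of a leftmost derivation of `digits`."""
    if not digits or any(ch not in DIGITS for ch in digits):
        raise ValueError(f"not derivable from Integer: {digits!r}")

    current = ["Integer"]
    forms = [current]

    # Apply  Integer -> Integer Digit  once for every digit after the first.
    for _ in range(len(digits) - 1):
        current = ["Integer", "Digit"] + current[1:]
        forms.append(current)

    # Apply  Integer -> Digit  to the remaining leftmost Integer.
    current = ["Digit"] + current[1:]
    forms.append(current)

    # Apply  Digit -> d  from left to right.
    for i, ch in enumerate(digits):
        current = current[:i] + [ch] + current[i + 1:]
        forms.append(current)

    return [" ".join(form) for form in forms]


print("\n=> ".join(derive_integer("352")))
# Integer
# => Integer Digit
# => Integer Digit Digit
# => Digit Digit Digit
# => 3 Digit Digit
# => 3 5 Digit
# => 3 5 2
```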
Figure 2.1: Parse Tree for 352 as an Integer
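Since the figure itself is not reproduced here, the tree that the Integer grammar assigns to 352 can be rebuilt in a few lines of Python. This is a sketch only: the nested-tuple representation and the names `parse_integer` and `render` are assumptions made for illustration, and the printed layout is not meant to match the figure in the book.

```python
DIGITS = "0123456789"


def parse_integer(text):
    """Build a parse tree for `text` as nested tuples (label, child, ...)."""
    if not text or any(ch not in DIGITS for ch in text):
        raise ValueError(f"not an Integer: {text!r}")
    tree = ("Integer", ("Digit", text[0]))          # Integer -> Digit
    for ch in text[1:]:
        tree = ("Integer", tree, ("Digit", ch))     # Integer -> Integer Digit
    return tree


def render(tree, depth=0):
    """Return the tree as a list of indented lines, one node per line."""
    label, *children = tree
    lines = ["  " * depth + label]
    for child in children:
        if isinstance(child, str):                  # a terminal symbol
            lines.append("  " * (depth + 1) + child)
        else:                                       # a nested nonterminal
            lines.extend(render(child, depth + 1))
    return lines


print("\n".join(render(parse_integer("352"))))
# Integer
#   Integer
#     Integer
#       Digit
#         3
#     Digit
#       5
#   Digit
#     2
```

Because the rule Integer -> Integer Digit is left-recursive, the tree grows down its left side: the first digit sits deepest and the last digit hangs directly off the root.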
BNF and Lexical Analysis:
• A programming language’s lexicon is the set of all grammatical categories that define strings (tokens) from which a program may be written.
• Typical categories include:
  • Operators (+, -, *, /, etc.)
  • Separators (;, ., {, }, etc.)
  • Keywords (for, else, if, int, etc.)
  • Literals (numbers and string constants)
  • Identifiers (variable and function names)
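The categories listed above can be illustrated with a small Python function that assigns a single lexeme to one of them. The sets below contain only the examples from the list, and the rules for literals and identifiers (digit strings, double-quoted strings, a letter followed by letters or digits) are simplifying assumptions rather than the lexical syntax of any particular language.

```python
KEYWORDS = {"for", "else", "if", "int"}
OPERATORS = {"+", "-", "*", "/"}
SEPARATORS = {";", ".", "{", "}"}
DIGITS = "0123456789"


def classify(lexeme):
    """Return the grammatical category of one lexeme."""
    if lexeme in KEYWORDS:           # checked before identifiers: "if" looks like a name
        return "Keyword"
    if lexeme in OPERATORS:
        return "Operator"
    if lexeme in SEPARATORS:
        return "Separator"
    if lexeme and all(ch in DIGITS for ch in lexeme):
        return "Literal"             # numeric literal
    if len(lexeme) >= 2 and lexeme[0] == '"' and lexeme[-1] == '"':
        return "Literal"             # string constant
    if lexeme and lexeme[0].isalpha() and lexeme.isalnum():
        return "Identifier"
    return "Unknown"


for lexeme in ["if", "+", ";", "42", "count"]:
    print(lexeme, "->", classify(lexeme))
# if -> Keyword
# + -> Operator
# ; -> Separator
# 42 -> Literal
# count -> Identifier
```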
Figure 2.3: A Simple Lexical Syntax for a Small Language, Jay
Figure 2.4: Major Stages in the Compiling Process
Figure 2.5: Skeleton Lexical Analysis Method That Returns Tokens
Regular Expressions and Lexical Analysis:
• An alternative to BNF for specifying a language at the lexical level.
• Widely used as a tool for building lexical scanners (routines that split source text into a stream of tokens).
• Not powerful enough for more complex syntactic structures.
• Regular expressions can describe regular languages; BNF can describe context-free languages.
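As a rough illustration of a scanner driven by regular expressions, the Python sketch below combines one pattern per token category into a single expression using the standard `re` module. The patterns, the category names, and the keyword set are assumptions made for this example (including treating `=` as an operator); they are not the lexical syntax of Jay or of any other specific language.

```python
import re

# One (category, pattern) pair per token class; order matters for the alternation.
TOKEN_SPEC = [
    ("Literal",    r"\d+"),
    ("Identifier", r"[A-Za-z][A-Za-z0-9]*"),
    ("Operator",   r"[+\-*/=]"),
    ("Separator",  r"[;.{}()]"),
    ("Skip",       r"\s+"),      # whitespace between tokens
    ("Mismatch",   r"."),        # any other single character
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))
KEYWORDS = {"for", "else", "if", "int"}


def tokenize(source):
    """Split source text into a list of (category, lexeme) pairs."""
    tokens = []
    for match in MASTER.finditer(source):
        kind, text = match.lastgroup, match.group()
        if kind == "Skip":
            continue
        if kind == "Mismatch":
            raise ValueError(f"unexpected character {text!r} at position {match.start()}")
        if kind == "Identifier" and text in KEYWORDS:
            kind = "Keyword"     # keywords match the identifier pattern
        tokens.append((kind, text))
    return tokens


print(tokenize("int x = 42;"))
# [('Keyword', 'int'), ('Identifier', 'x'), ('Operator', '='), ('Literal', '42'), ('Separator', ';')]
```

Each pattern here describes a flat sequence of characters, which is all a regular expression can do. Checking that braces or parentheses nest correctly is left to the syntactic analysis stage.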
Figure 2.6: Conventions for Writing Regular Expressions
Next time… Syntactic Analysis