1 / 58

Compilation 0368-3133 (Semester A, 2013/14)

Compilation 0368-3133 (Semester A, 2013/14). Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky. Slides credit: Roman Manevich , Mooly Sagiv , Eran Yahav. What is a Compiler?.

raleigh
Download Presentation

Compilation 0368-3133 (Semester A, 2013/14)

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Compilation 0368-3133 (Semester A, 2013/14) Lecture 6a: Syntax (Bottom–up parsing) Noam Rinetzky Slides credit: Roman Manevich, MoolySagiv, EranYahav

  2. What is a Compiler? “A compiler is a computer program that transforms source code written in a programming language (source language) into another language (target language). The most common reason for wanting to transform source code is to create an executable program.” --Wikipedia

  3. Source text txt Executable code exe Conceptual Structure of a Compiler Compiler Frontend Semantic Representation Backend LexicalAnalysis Syntax Analysis Parsing Semantic Analysis IntermediateRepresentation (IR) Code Generation words sentences

  4. Op(*) Op(+) Id(b) Num(23) Num(7) From scanning to parsing program text ((23 + 7) * x) Lexical Analyzer token stream Grammar: E ... | Id Id  ‘a’ | ... | ‘z’ Parser valid syntaxerror Abstract Syntax Tree

  5. Top-Down vs Bottom-Up AaAb|c aacbb$ A already read… to be read… a b A a A c b Top-down (predict match/scan-complete )

  6. Top-Down vs Bottom-Up AaAb|c aacbb$ already read… to be read… c • Bottom-up (shift reduce) A a b A a A b Top-down (predict match/scan-complete )

  7. Model of an LR parser Remainder of text to be processed Input • State controls decisions Top LR Parser state Output (AST/Error) symbol Terminals and Non-terminals Initial stack contains q0 Stack Control Tables

  8. LR(0) parser tables Empty cell =error move ACTION Table GOTO Table

  9. LR(0) parser tables • Shift action row • Tells which state to GOTO for current token • Blank entry indicates an error • Reduce action row • Tells which rule to reduce with • Independent of current token • GOTO entries are blank

  10. Shift Move shift id + id $ + id $ input input stack stack q0 q0 id q5 Remove first token from input Push it on the stack Compute next state based on GOTO table Push new state on the stack If new state is error – report error

  11. Reduce Move (using Nα) Reduce:T  id + id $ + id $ input input stack stack q0 id q5 q0 T Symbols in α and their following states are removed from stack New state computed based on GOTO table (using top of stack, before pushing N) N is pushed on the stack New state pushed on top of N

  12. Reduce Move (using Nα) Reduce:T  id + id $ + id $ input input stack stack q0 id q5 q0 T q6 Symbols in α and their following states are removed from stack New state computed based on GOTO table (using top of stack, before pushing N) N is pushed on the stack New state pushed on top of N

  13. GOTO/ACTION table Z  E $ E  T E  E + T T  i T ( E ) Warning: numbers mean different things! rn = reduce using rule number n sm = shift to state m

  14. (1) S E $ (2) E  T (3) E  E+ T (4) T id (5) T  (E) Parsing id+id$ Initialize with state 0

  15. (1) S E $ (2) E  T (3) E  E+ T (4) T id (5) T  (E) Parsing id+id$ Initialize with state 0

  16. (1) S E $ (2) E  T (3) E  E+ T (4) T id (5) T  (E) Parsing id+id$

  17. (1) S E $ (2) E  T (3) E  E+ T (4) T id (5) T  (E) Parsing id+id$ pop id 5

  18. (1) S E $ (2) E  T (3) E  E+ T (4) T id (5) T  (E) Parsing id+id$ push T 6

  19. (1) S E $ (2) E  T (3) E  E+ T (4) T id (5) T  (E) Parsing id+id$

  20. (1) S E $ (2) E  T (3) E  E+ T (4) T id (5) T  (E) Parsing id+id$

  21. (1) S E $ (2) E  T (3) E  E+ T (4) T id (5) T  (E) Parsing id+id$

  22. (1) S E $ (2) E  T (3) E  E+ T (4) T id (5) T  (E) Parsing id+id$

  23. (1) S E $ (2) E  T (3) E  E+ T (4) T id (5) T  (E) Parsing id+id$

  24. (1) S E $ (2) E  T (3) E  E+ T (4) T id (5) T  (E) Parsing id+id$

  25. Constructing an LR parsing table • Construct a transition diagram (deterministic FSM) • States = sets of LR(0) items • Transitions = one-step derivation • If there are conflicts – stop • Fill table entries from diagram

  26. Terminology: Reductions & Handles • The opposite of derivation is called reduction • Let Aα be a production rule • Derivation: βAµβαµ • Reduction: βαµβAµ • A handleis the reduced substring • αis the handles for βαµ

  27. LR(0) Items Grammar LR(0) items (1) S E $ (2) E  T (3) E  E + T (4) T id (5) T  (E ) The items of a grammar are obtained by placing a dot at every position in every production

  28. LR(0) Item - Intuition To be matched Already matched Input N  αβ Hypothesis about αβ is the rule being reduced, and so far we’ve matched α and we expect to see β

  29. Types of LR(0) items N  αβ Shift Item N  αβ Reduce Item

  30. LR(0) automaton example reduce state q6 shift state E  T T T q7 q0 T  (E) E  T E  E + T T  i T  (E) Z  E$ E  T E  E + T T  i T  (E) ( q5 i i T  i E E ( ( i q1 q8 q3 Z  E$ E  E+ T T  (E) E  E+T E  E+T T  i T  (E) + + $ ) q9 q2 Z  E$ T  (E)  T q4 E  E + T

  31. Computing item sets • Initial set • Z is in the start symbol • -closure({ Zα | Zα is in the grammar } ) • Next set from a set S and the next symbol X • step(S,X) = { NαXβ | NαXβ in the item set S} • nextSet(S,X) = -closure(step(S,X))

  32. Operations for transition diagram construction Initial = {S’S$} For an item set IClosure(I) = Closure(I) ∪ {Xµ is in grammar| NαXβ in I} Goto(I, X) = { NαXβ | NαXβ in I}

  33. Initial example Grammar (1) S E $ (2) E  T (3) E  E + T (4) T id (5) T  ( E ) Initial = {S E $}

  34. Closure example Grammar (1) S E $ (2) E  T (3) E  E + T (4) T id (5) T  ( E ) Initial = {S E $} Closure({S E $}) = { S E $ E T E E + T T id T  ( E ) }

  35. Goto example Grammar (1) S E $ (2) E  T (3) E  E + T (4) T id (5) T  ( E ) Initial = {S E $} Closure({S E $}) = { S E $ E T E E + T T id T  ( E ) } Goto({S E $ , E E + T, T id}, E) = {S  E $, E  E + T}

  36. Constructing the transition diagram • Start with state 0 containing itemClosure({S E $}) • Repeat until no new states are discovered • For every state p containing item set Ip, and symbol N, compute state q containing item setIq = Closure(goto(Ip, N))

  37. LR(0) automaton example reduce state shift state q6 E  T T T q7 q0 T  (E) E  T E  E + T T  i T  (E) Z  E$ E  T E  E + T T  i T  (E) ( q5 i i T  i E E ( ( i q1 q8 q3 Z  E$ E  E+ T T  (E) E  E+T E  E+T T  i T  (E) + + $ ) q9 q2 Z  E$ T  (E)  T q4 E  E + T

  38. Automaton construction example (1) S E $ (2) E  T (3) E  E + T (4) T id (5) T  ( E ) q0 S  E$ Initialize

  39. Automaton construction example (1) S E $ (2) E  T (3) E  E + T (4) T id (5) T  ( E ) q0 S  E$ E  T E  E + T T  i T  (E) applyClosure

  40. Automaton construction example (1) S E $ (2) E  T (3) E  E + T (4) T id (5) T  ( E ) q6 E  T T q0 T  (E) E  T E  E + T T  i T  (E) S  E$ E  T E  E + T T  i T  (E) ( q5 i T  i E q1 S  E$ E  E+ T

  41. Automaton construction example (1) S E $ (2) E  T (3) E  E + T (4) T id (5) T  ( E ) q6 E  T q7 T q0 T T  (E) E  T E  E + T T  i T  (E) S  E$ E  T E  E + T T  i T  (E) non-terminal transition corresponds to goto action in parse table ( q5 i i T  i E E ( ( i q1 q8 q3 Z  E$ E  E+ T T  (E) E  E+T E  E+T T  i T  (E) terminal transition corresponds to shift action in parse table + + $ ) q9 q2 S  E$ T  (E)  T q4 E  E + T a single reduce item corresponds to reduce action

  42. Are we done? Can make a transition diagram for any grammar Can make a GOTO table for every grammar Cannot make a deterministic ACTION table for every grammar

  43. LR(0) conflicts … T q0 Z  E$ E  T E  E + T T  i T  (E) T  i[E] ( … q5 i T  i T  i[E] E Shift/reduce conflict … Z  E $ E  T E  E + T T  i T  ( E ) T  i[E]

  44. LR(0) conflicts T q0 Z  E$ E  T E  E + T T  I V  i T  (E) T  i[E] … ( … q5 i T  i V  i E reduce/reduce conflict … Z  E $ E  T E  E + T T  i V  iT  ( E )

  45. LR(0) conflicts • Any grammar with an -rule cannot be LR(0) • Inherent shift/reduce conflict • A  – reduce item • P αAβ – shift item • A  can always be predicted from P αAβ

  46. Conflicts Can construct a diagram for every grammar but some may introduce conflicts shift-reduce conflict: an item set contains at least one shift item and one reduce item reduce-reduce conflict: an item set contains two reduce items

  47. LR variants • LR(0) – what we’ve seen so far • SLR(0) • Removes infeasible reduce actions via FOLLOW set reasoning • LR(1) • LR(0) with one lookahead token in items • LALR(0) • LR(1) with merging of states with same LR(0) component

  48. LR (0) GOTO/ACTIONS tables GOTO table is indexed by state and a grammar symbol from the stack ACTION Table GOTO Table ACTION table determined only by state, ignores input

  49. SLR parsing • A handle should not be reduced to a non-terminal N if the lookahead is a token that cannot follow N • A reduce item N  α is applicable only when the lookahead is in FOLLOW(N) • If b is not in FOLLOW(N) we just proved there is no derivation S * βNb. • Thus, it is safe to remove the reduce item from the conflicted state • Differs from LR(0) only on the ACTION table • Now a row in the parsing table may contain both shift actions and reduce actions and we need to consult the current token to decide which one to take

  50. Lookahead token from the input SLR action table vs. SLR – use 1 token look-ahead LR(0) – no look-ahead … as before… T  i T  i[E]

More Related