
CSC 3315 Lexical and Syntax Analysis

CSC 3315 Lexical and Syntax Analysis. Hamid Harroud School of Science and Engineering, Akhawayn University http://www.aui.ma/~H.Harroud/csc3315/. Lexical Analysis. Convert source file characters into token stream. Remove content-free characters (comments, whitespace, ...)


Presentation Transcript


  1. CSC 3315 Lexical and Syntax Analysis • Hamid Harroud • School of Science and Engineering, Akhawayn University • http://www.aui.ma/~H.Harroud/csc3315/

  2. Lexical Analysis • Convert source file characters into token stream. • Remove content-free characters (comments, whitespace, ...) • Detect lexical errors (badly-formed literals, illegal characters, ...) • Output of lexical analysis is input to syntax analysis. • Idea: Look for patterns in input character sequence, convert to tokens with attributes, and pass them to parser in stream.

  3. Lexical Analysis Example

  4. Specifying Lexical Analyzers • A lexical analyzer can be defined via a list of pairs (regular expression, action), where the regular expression describes a token pattern and the action is a piece of code, parameterized by the matching lexeme, that returns a (token, attribute) pair. • Example: • (digit+, {return new Token(NUM,parseInt(lexeme));}) • (alpha(alpha|digit)∗, {return new Token(ID,lexeme);}) • (space|tab|newline, {}) • (.,.) • So R.E.s can help us specify scanners.
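The (regular expression, action) list above can be sketched in Python as follows. This is a minimal illustration, not the course's implementation: the names `Token`, `RULES`, and `tokenize` are hypothetical, `alpha` is assumed to mean ASCII letters, and the first matching rule wins (scanner generators typically prefer the longest match instead).

```python
import re
from typing import NamedTuple


class Token(NamedTuple):
    kind: str
    attribute: object


# Each rule pairs a pattern with an action. The action receives the lexeme
# and returns a Token, or None when the lexeme should be discarded.
RULES = [
    (re.compile(r"[0-9]+"), lambda lexeme: Token("NUM", int(lexeme))),
    (re.compile(r"[A-Za-z][A-Za-z0-9]*"), lambda lexeme: Token("ID", lexeme)),
    (re.compile(r"[ \t\n]"), lambda lexeme: None),
]


def tokenize(source):
    """Yield tokens from source, trying the rules in order at each position."""
    pos = 0
    while pos < len(source):
        for pattern, action in RULES:
            match = pattern.match(source, pos)
            if match:
                token = action(match.group())
                if token is not None:
                    yield token
                pos = match.end()
                break
        else:
            # No rule matched: report a lexical error (illegal character).
            raise ValueError(f"illegal character {source[pos]!r} at offset {pos}")


print(list(tokenize("x1 42")))
```

The whitespace rule returns no token, which is how content-free characters are dropped before the parser sees the stream.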

  5. Regular Expressions • A regular expression (R.E.) is a concise formal characterization of a regular language. • Example: The regular language containing all IDENTs is described by the regular expression letter (letter | digit)∗ where “|” means “or” and “e∗” means “zero or more copies of e.” • Regular languages are one particular kind of formal language.
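As a quick check of the IDENT expression, it can be written in Python's `re` syntax. This sketch assumes `letter` means an ASCII letter and `digit` means 0 to 9; the helper name `is_ident` is hypothetical.

```python
import re

# letter (letter | digit)* written in Python's regex syntax.
IDENT = re.compile(r"[A-Za-z][A-Za-z0-9]*")


def is_ident(text):
    # fullmatch requires the whole string to belong to the language.
    return IDENT.fullmatch(text) is not None


for candidate in ["x", "count2", "2count", ""]:
    print(repr(candidate), is_ident(candidate))
```

The first two candidates are in the language; "2count" is rejected because it starts with a digit, and the empty string is rejected because at least one letter is required.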

  6. Finite Automaton • Input: a string • Output: “Accept” or “Reject”

  7. Transition Graph • Diagram labels: initial state, accepting state, transition, state

  8. Initial Configuration Input String

  9. Reading the Input

  10. Reading the Input

  11. Reading the Input

  12. Reading the Input

  13. Reading the Input Input finished accept

  14. Rejection

  15. Rejection

  16. Rejection

  17. Rejection

  18. Rejection Input finished reject

  19. Another Rejection

  20. Another Rejection reject

  21. Another Example

  22. Another Example

  23. Another Example

  24. Another Example

  25. Another Example accept

  26. Rejection Example

  27. Rejection Example

  28. Rejection Example

  29. Rejection Example

  30. Rejection Example Input finished reject

  31. Languages Accepted by FAs • FA M • The language L(M) contains all input strings accepted by M • L(M) = { strings that bring M to an accepting state }

  32. Example accept

  33. Example accept accept accept

  34. Formal Definition • Finite Automaton (FA): M = (Q, Σ, δ, q₀, F) • Q: set of states • Σ: input alphabet • δ: transition function • q₀: initial state • F: set of accepting states
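The five components of the definition (set of states, input alphabet, transition function, initial state, set of accepting states) map directly onto a small data structure. The sketch below is illustrative only: the class `FiniteAutomaton` and the example machine `ENDS_IN_B` are hypothetical and not taken from the slides.

```python
from dataclasses import dataclass


@dataclass
class FiniteAutomaton:
    states: frozenset
    alphabet: frozenset
    delta: dict          # maps (state, symbol) -> next state
    initial: str
    accepting: frozenset

    def accepts(self, text):
        """Run the automaton on text and report whether it ends in an accepting state."""
        state = self.initial
        for symbol in text:
            if symbol not in self.alphabet:
                return False  # symbols outside the alphabet are rejected
            state = self.delta[(state, symbol)]
        return state in self.accepting


# Hypothetical example: strings over {a, b} that end in "b".
ENDS_IN_B = FiniteAutomaton(
    states=frozenset({"q0", "q1"}),
    alphabet=frozenset({"a", "b"}),
    delta={
        ("q0", "a"): "q0", ("q0", "b"): "q1",
        ("q1", "a"): "q0", ("q1", "b"): "q1",
    },
    initial="q0",
    accepting=frozenset({"q1"}),
)

print(ENDS_IN_B.accepts("ab"), ENDS_IN_B.accepts("ba"))
```

The transition function is total here: every (state, symbol) pair has an entry, so the run never gets stuck on a symbol from the alphabet.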

  35. Input Alphabet

  36. Set of States

  37. Initial State

  38. Set of Accepting States

  39. Transition Function

  40. Transition Function

  41. Transition Function

  42. Transition Function

  43. Transition Function

  44. Extended Transition Function

  45. Extended Transition Function
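The slide content for the extended transition function did not survive extraction. Under the standard definition, it lifts the one-symbol transition function to whole strings: reading the empty string leaves the state unchanged, and reading a string w followed by a symbol a means first reading w, then taking one ordinary transition on a. A minimal sketch, assuming a hypothetical two-state automaton over {a, b} whose transition table is `DELTA`:

```python
# Hypothetical transition table: (state, symbol) -> next state.
DELTA = {
    ("q0", "a"): "q0", ("q0", "b"): "q1",
    ("q1", "a"): "q0", ("q1", "b"): "q1",
}


def delta_star(delta, state, word):
    """Extended transition function: the state reached from state after reading word."""
    if word == "":
        return state  # base case: the empty string changes nothing
    # Recursive case: process all but the last symbol, then one ordinary step.
    before_last = delta_star(delta, state, word[:-1])
    return delta[(before_last, word[-1])]


print(delta_star(DELTA, "q0", "aab"))
```

With this function, a string is accepted exactly when `delta_star` applied to the initial state and the string yields an accepting state.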

  46. Example = { all strings with prefix } accept

  47. Example = { all strings without substring }

  48. Example

  49. Regular Languages • Definition: A language L is regular if there is an FA M such that L = L(M) • Observation: All languages accepted by FAs form the family of regular languages
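To make the definition concrete, the sketch below builds a small automaton for the identifier language letter (letter | digit)∗ and compares it with the corresponding regular expression on every string up to a bounded length. The alphabet is shrunk to one letter `a` and one digit `1` as a simplifying assumption, and all names are hypothetical. Agreement on short strings is evidence, not a proof, that the two describe the same language.

```python
import re
from itertools import product

ALPHABET = "a1"                   # assumption: "a" stands for letters, "1" for digits
IDENT_RE = re.compile(r"a[a1]*")  # letter (letter | digit)* over this alphabet

# Transition table of a three-state automaton for the same language.
DELTA = {
    ("start", "a"): "ident", ("start", "1"): "dead",
    ("ident", "a"): "ident", ("ident", "1"): "ident",
    ("dead", "a"): "dead",   ("dead", "1"): "dead",
}
ACCEPTING = {"ident"}


def fa_accepts(text):
    state = "start"
    for symbol in text:
        state = DELTA[(state, symbol)]
    return state in ACCEPTING


def mismatches(max_length):
    """Return every string up to max_length on which the FA and the regex disagree."""
    found = []
    for length in range(max_length + 1):
        for letters in product(ALPHABET, repeat=length):
            text = "".join(letters)
            if fa_accepts(text) != (IDENT_RE.fullmatch(text) is not None):
                found.append(text)
    return found


print(mismatches(5))  # prints []
```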

  50. Examples of Regular Languages • { all strings with prefix } • { all strings without substring } • There exist automata that accept these languages (see previous slides). • There exist languages which are not regular: there is no FA that accepts such a language.
