Compiler Construction

    Presentation Transcript
    1. Compiler Construction Sohail Aslam Lecture 5

    2. Lexical Analysis

    3. Recall: Front-End • Output of lexical analysis is a stream of tokens. [Diagram: source code → scanner → tokens → parser → IR; the scanner and the parser both report errors]

    4. Tokens Example: if( i == j ) z = 0; else z = 1;

    5. Tokens • Input is just a sequence of characters:

    6. Tokens Goal: • partition input string into substrings • classify them according to their role

    7. Tokens • A token is a syntactic category • Natural language: “He wrote the program” • Words: “He”, “wrote”, “the”, “program”

    8. Tokens • Programming language: “if(b == 0) a = b” • Words: “if”, “(”, “b”, “==”, “0”, “)”, “a”, “=”, “b”

    9. Tokens • Identifiers: x y11 maxsize • Keywords: if else while for • Integers: 2 1000 -44 5L • Floats: 2.0 0.0034 1e5 • Symbols: ( ) + * / { } < > == • Strings: "enter x" "error"
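The token classes listed above map naturally onto an enumeration plus a small value type that pairs a class with the matched text. The following is a minimal sketch in Java; the names `TokenKind`, `Tok`, `kind`, and `lexeme` are assumptions made for illustration and do not come from the lecture.

```java
// Hypothetical token representation: a syntactic category plus the matched text.
enum TokenKind { IDENTIFIER, KEYWORD, INTEGER, FLOAT, SYMBOL, STRING }

final class Tok {
    final TokenKind kind;   // syntactic category of the token
    final String lexeme;    // substring of the input that was matched

    Tok(TokenKind kind, String lexeme) {
        this.kind = kind;
        this.lexeme = lexeme;
    }

    @Override
    public String toString() {
        return kind + "(" + lexeme + ")";
    }
}

class TokDemo {
    public static void main(String[] args) {
        // The statement "if(b == 0) a = b" from slide 8, classified by hand.
        Tok[] toks = {
            new Tok(TokenKind.KEYWORD, "if"), new Tok(TokenKind.SYMBOL, "("),
            new Tok(TokenKind.IDENTIFIER, "b"), new Tok(TokenKind.SYMBOL, "=="),
            new Tok(TokenKind.INTEGER, "0"), new Tok(TokenKind.SYMBOL, ")"),
            new Tok(TokenKind.IDENTIFIER, "a"), new Tok(TokenKind.SYMBOL, "="),
            new Tok(TokenKind.IDENTIFIER, "b"),
        };
        for (Tok t : toks) System.out.println(t);
    }
}
```

A parser typically inspects only the kind; the lexeme matters later, for example when an identifier is entered into a symbol table.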

    10. Ad-hoc Lexer • Hand-write code to generate tokens. • Partition the input string by reading left-to-right, recognizing one token at a time

    11. Ad-hoc Lexer • Look-ahead required to decide where one token ends and the next token begins.
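One way to picture look-ahead is a cursor that can inspect the next character without consuming it. The sketch below is an assumption-laden illustration, not the lecture's code: `CharCursor`, `peek`, `advance`, and the `'\0'` end-of-input sentinel are all hypothetical, and the input is an in-memory string rather than a stream.

```java
// Hypothetical one-character look-ahead over an in-memory string.
final class CharCursor {
    static final char EOF = '\0';   // assumed end-of-input sentinel
    private final String text;
    private int pos = 0;

    CharCursor(String text) { this.text = text; }

    // Look at the next character without consuming it.
    char peek() {
        return pos < text.length() ? text.charAt(pos) : EOF;
    }

    // Consume and return the next character.
    char advance() {
        char c = peek();
        if (pos < text.length()) pos++;
        return c;
    }
}

class CursorDemo {
    // Reads a run of digits; the first non-digit is inspected but left unconsumed,
    // so it is still available as the start of the next token.
    static String readDigits(CharCursor in) {
        StringBuilder sb = new StringBuilder();
        while (Character.isDigit(in.peek())) sb.append(in.advance());
        return sb.toString();
    }

    public static void main(String[] args) {
        CharCursor in = new CharCursor("42+x");
        System.out.println(readDigits(in));  // prints 42
        System.out.println(in.peek());       // prints +
    }
}
```

The `+` is what tells the reader that the number has ended, yet it belongs to the following token; that is why it must not be consumed.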

    12-16. Ad-hoc Lexer
        class Lexer {
            InputStream s;  // input character stream
            char next;      // look-ahead: first character not yet consumed
            Lexer(InputStream _s) throws IOException {
                s = _s;
                next = (char) s.read();  // prime the look-ahead
            }

    17-20. Ad-hoc Lexer
        Token nextToken() throws IOException {
            // Digits are tested first because idChar() also accepts digits.
            if( isNumber(next) ) return readNumber();
            if( idChar(next) ) return readId();
            if( next == '"' ) return readString();
            ... ...

    21-27. Ad-hoc Lexer
        Token readId() throws IOException {
            String id = "" + next;           // the look-ahead is the identifier's first character
            while(true){
                next = (char) s.read();      // advance the look-ahead
                if( !idChar(next) )          // next now holds the first character after the identifier
                    return new Token(TID, id);
                id = id + next;
            }
        }

    28. Ad-hoc Lexer
        boolean idChar(char c) {
            if( isAlpha(c) ) return true;
            if( isDigit(c) ) return true;
            if( c == '_' ) return true;
            return false;
        }

    29-31. Ad-hoc Lexer
        Token readNumber() throws IOException {
            String num = "" + next;          // the look-ahead is the number's first digit
            while(true){
                next = (char) s.read();      // advance the look-ahead
                if( !isNumber(next) )        // next now holds the first character after the number
                    return new Token(TNUM, num);
                num = num + next;
            }
        }
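The fragments on slides 12 through 31 can be assembled into one runnable class. The following is a sketch under stated assumptions rather than the lecture's implementation: it reads from a `Reader` instead of an `InputStream`, skips whitespace, returns `null` at end of input, and emits any other character as a one-character `TSYM` token. The names `AdHocLexer`, `SimpleToken`, `TSYM`, and `tokenize` are hypothetical. Because `idChar` accepts digits, the sketch tests for a digit before testing for an identifier character.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class SimpleToken {
    final String type;   // "TID", "TNUM", or "TSYM" (TSYM is an assumption)
    final String text;
    SimpleToken(String type, String text) { this.type = type; this.text = text; }
    @Override public String toString() { return type + ":" + text; }
}

final class AdHocLexer {
    private final Reader s;
    private int next;    // look-ahead character, -1 at end of input

    AdHocLexer(Reader s) throws IOException {
        this.s = s;
        next = s.read();  // prime the look-ahead
    }

    static boolean isNumber(int c) { return c >= '0' && c <= '9'; }

    static boolean idChar(int c) {
        return c >= 0 && (Character.isLetter(c) || isNumber(c) || c == '_');
    }

    SimpleToken readId() throws IOException {
        StringBuilder id = new StringBuilder();
        while (idChar(next)) {      // a character is consumed only if it belongs to the token
            id.append((char) next);
            next = s.read();
        }
        return new SimpleToken("TID", id.toString());
    }

    SimpleToken readNumber() throws IOException {
        StringBuilder num = new StringBuilder();
        while (isNumber(next)) {
            num.append((char) next);
            next = s.read();
        }
        return new SimpleToken("TNUM", num.toString());
    }

    // Returns the next token, or null at end of input.
    SimpleToken nextToken() throws IOException {
        while (next != -1 && Character.isWhitespace(next)) next = s.read();
        if (next == -1) return null;
        if (isNumber(next)) return readNumber();  // digits first: idChar also accepts digits
        if (idChar(next)) return readId();
        SimpleToken sym = new SimpleToken("TSYM", String.valueOf((char) next));
        next = s.read();
        return sym;
    }

    // Convenience helper: tokenizes a whole string and renders each token as text.
    static List<String> tokenize(String input) {
        try {
            AdHocLexer lx = new AdHocLexer(new StringReader(input));
            List<String> out = new ArrayList<>();
            for (SimpleToken t = lx.nextToken(); t != null; t = lx.nextToken()) {
                out.add(t.toString());
            }
            return out;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(tokenize("z = x1 + 42;"));
        // prints [TID:z, TSYM:=, TID:x1, TSYM:+, TNUM:42, TSYM:;]
    }
}
```

Note that this sketch still reports `if` as `TID` and splits `==` into two `TSYM` tokens, which is exactly the kind of problem the next slides raise.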

    32. Ad-hoc Lexer Problems: • From the first character alone, we do not know what kind of token we are about to read.

    33. Ad-hoc Lexer Problems: • If token begins with “i”, is it an identifier “i” or keyword “if”? • If token begins with “=”, is it “=” or “==”?
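A common hand-written workaround for both ambiguities is to read the longest possible lexeme first and classify it afterwards: an identifier-shaped word is looked up in a keyword table, and `=` is followed by one character of look-ahead to check for `==`. The sketch below illustrates that idea only; `MunchLexer`, the keyword set, and the tag names `KEYWORD`, `ID`, `EQ`, `ASSIGN`, and `OTHER` are assumptions, not part of the lecture.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class MunchLexer {
    // Assumed keyword table, taken from the keyword examples on slide 9.
    static final Set<String> KEYWORDS =
        new HashSet<>(Arrays.asList("if", "else", "while", "for"));

    static boolean isIdPart(char c) {
        return Character.isLetterOrDigit(c) || c == '_';
    }

    static List<String> tokenize(String in) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < in.length()) {
            char c = in.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (Character.isLetter(c) || c == '_') {
                // Read the whole word first, then decide keyword vs. identifier.
                int start = i;
                while (i < in.length() && isIdPart(in.charAt(i))) i++;
                String word = in.substring(start, i);
                out.add((KEYWORDS.contains(word) ? "KEYWORD:" : "ID:") + word);
            } else if (c == '=') {
                // One character of look-ahead separates "=" from "==".
                if (i + 1 < in.length() && in.charAt(i + 1) == '=') {
                    out.add("EQ:==");
                    i += 2;
                } else {
                    out.add("ASSIGN:=");
                    i++;
                }
            } else {
                out.add("OTHER:" + c);
                i++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("if (i == j) iff = i"));
        // prints [KEYWORD:if, OTHER:(, ID:i, EQ:==, ID:j, OTHER:), ID:iff, ASSIGN:=, ID:i]
    }
}
```

This works, but every new operator or token class adds another special case, which motivates the more principled approach on the next slide.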

    34. Ad-hoc Lexer • Need a more principled approach • Use a lexer generator that generates an efficient tokenizer automatically.
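The lecture does not name a specific generator at this point. To give a flavor of the declarative style such tools use, the sketch below describes each token class as a regular expression and lets `java.util.regex` do the matching. This is not a lexer generator and makes no claim about efficiency; it only illustrates specifying tokens as patterns instead of hand-written branches. The class name `RegexTokenizer`, the rule table `SPEC`, and the rule names are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexTokenizer {
    // Each rule is {name, regular expression}. Earlier rules win ties,
    // so KEYWORD is listed before ID and "==" before "=".
    private static final String[][] SPEC = {
        {"KEYWORD", "(?:if|else|while|for)\\b"},
        {"ID",      "[A-Za-z_][A-Za-z0-9_]*"},
        {"NUM",     "[0-9]+"},
        {"SYM",     "==|[=;(){}<>+*/]"},
        {"WS",      "\\s+"},
    };

    private static final Pattern PATTERN = buildPattern();

    // Joins all rules into one alternation of named groups.
    private static Pattern buildPattern() {
        StringBuilder sb = new StringBuilder();
        for (String[] rule : SPEC) {
            if (sb.length() > 0) sb.append('|');
            sb.append("(?<").append(rule[0]).append('>').append(rule[1]).append(')');
        }
        return Pattern.compile(sb.toString());
    }

    static List<String> tokenize(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = PATTERN.matcher(input);
        int pos = 0;
        while (pos < input.length()) {
            m.region(pos, input.length());
            if (!m.lookingAt()) {
                throw new IllegalArgumentException("unexpected character at index " + pos);
            }
            // Find which named group matched; whitespace is matched but dropped.
            for (String[] rule : SPEC) {
                if (m.group(rule[0]) != null) {
                    if (!rule[0].equals("WS")) out.add(rule[0] + ":" + m.group());
                    break;
                }
            }
            pos = m.end();
        }
        return out;
    }

    public static void main(String[] args) {
        // The example statement from slide 4.
        System.out.println(tokenize("if( i == j ) z = 0; else z = 1;"));
    }
}
```

Adding a token class here means adding one row to `SPEC`; a real generator takes a similar specification and compiles it ahead of time into a table-driven scanner.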