
Chap 6: Type Checking/Semantic Analysis

Chap 6: Type Checking/Semantic Analysis. Prof. Steven A. Demurjian Computer Science & Engineering Department The University of Connecticut 371 Fairfield Way, Unit 2155 Storrs, CT 06269-3155. steve@engr.uconn.edu http://www.engr.uconn.edu/~steve (860) 486 - 4818.

Presentation Transcript


  1. Chap 6: Type Checking/Semantic Analysis Prof. Steven A. Demurjian Computer Science & Engineering Department The University of Connecticut 371 Fairfield Way, Unit 2155 Storrs, CT 06269-3155 steve@engr.uconn.edu http://www.engr.uconn.edu/~steve (860) 486 - 4818 Material for course thanks to: Laurent Michel Aggelos Kiayias Robert LeBarre

  2. Overview • Type Checking and Semantic Analysis are Critical Features of Compilers and Compilation • Passing a Syntax Check (Parsing) is not Sufficient • Type Checking Provides Vital Input • Software Engineers are Assisted in the Debugging Process • We’ll Focus on Classical Type Checking Issues • Background and Motivation • Type Analysis • The Notion of a Type System • Examining a Simple Type Checker • Other Key Typing Concepts • Concluding Remarks/Looking Ahead

  3. Background and Motivation • Recall....

  4. Background and Motivation • What we have achieved • All the “words” (Tokens) are known • The tree is syntactically correct • What we still do not know... • Does the program make sense ? • What we will not try to find out • Is the program correct ? • This is Impossible! • Our Concern: • Does it Compile? • Are all Semantic Errors Removed? • Do all Types and their Usage Make Sense?

  5. Background and Motivation • The program makes “sense” • Programmer’s intent is clear • Program is semantically unambiguous • Data-wise • We know what each name denotes • We know how to represent everything • Flow-wise • We know how to execute all the statements • Structure-wise • Nothing is missing • Nothing is multiply defined • The program is correct • It will produce the expected output

  6. Tasks To Perform • Scope Analysis • Figure out what each name refers to • Understand where Scope Exists (See Chapter 7) • Type Analysis • Figure out the type of each name • Names are functions, variables, types, etc. • Completeness Analysis • Check that everything is defined • Check that nothing is multiply defined

  7. Output ? • What does the analysis produce? • Data structures “on the side” • To describe the types (resolve the types) • To describe what each name denotes (resolve the scopes) • A Decorated tree • Add annotations in the tree • Possibly.... Semantic Errors!

  8. Pictorially

  9. Type Analysis • Purpose • Find the type of every construction • Local variables • Actuals for calls • Formals of method calls • Objects • Methods • Expressions • Rationale • Types are useful to catch bugs!

  10. Type Analysis • Why Bother ? A type system is a tractable syntactic method for proving the absence of certain program behaviors by classifying phrases according to the kind of values they compute.

  11. Uses • Many! • Error detection • Detect early • Detect automatically • Detect obvious and subtle flaws • Abstraction • The skeleton to provide modularity • Signature/structure/interface/class/ADT/.... • Documentation • Programs are easier to read • Language Safety guarantee • Memory layout • Bound checking • Efficiency

  12. How It works ? • Classify programs according to the kind of values computed • [Figure: sets labeled Set of All Programs, Set of All Reasonable Programs, Set of All Type-Safe Programs, Set of All Correct Programs]

  13. How do we do this ? • Compute the type of every sentence • On the tree • With a tree traversal • Some information will flow up (synthesized) • Some information will flow down (inherited) • Questions to answer • What is a type ? • How do I create types ? • How do I compute types ?

  14. Types... • Types form a language! • With • Terminals... • Non-terminals.... • And a grammar! • Alternatively • Types can be defined inductively • Base types (a.k.a. the terminals) • Inductive types (a.k.a. grammatical productions)

  15. Base Types • What are the base types ? • int • float • double • char • void • bool • error

  16. Inductive Type Definition • Purpose • Define a type in terms of other simple/smaller types • Example • array • pointer • reference • Pair (products in the book) • structure • function • methods • classes • ...

  17. Relation to Grammar ? • Type → array(Type, Type) | pair(Type, Type) | tuple(Type+) | struct(FieldType+) | fun(Type) : Type | method(ClassType, Type) : Type | pointer(Type) | reference(Type) | ClassType | BasicType • ClassType → class(name [, Type]) • FieldType → name : Type • BasicType → int | bool | char | float | double | void | error

  18. Type Terms • What is that ? • It is a sentence in the type language • Example • int • pair(int,int) • tuple(int,bool,float) • array(int,int) • fun(int) : int • fun(tuple(int,char)) : int • class(“Foo”) • method(class(“Foo”), tuple(int,char)) : int

  19. So... fun(tuple(int,char)) : int • If • we have Type Term • we have a Type Language • We can parse it and obtain.... • Type Trees!
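
A minimal Python sketch of how a type term can be held as a type tree. The class names (`Base`, `Tuple`, `Fun`) and the `show` helper are illustrative assumptions, not names from the slides; only three of the constructors from the type grammar are covered.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Base:
    name: str            # a base type such as "int" or "char"

@dataclass(frozen=True)
class Tuple:
    parts: tuple         # component types, in order

@dataclass(frozen=True)
class Fun:
    arg: object          # domain type
    ret: object          # range type

def show(t):
    """Render a type tree back into its type term."""
    if isinstance(t, Base):
        return t.name
    if isinstance(t, Tuple):
        return "tuple(" + ",".join(show(p) for p in t.parts) + ")"
    if isinstance(t, Fun):
        return "fun(" + show(t.arg) + ") : " + show(t.ret)
    raise TypeError("unknown type node")

INT, CHAR = Base("int"), Base("char")

# Tree for the term on the slide: fun(tuple(int,char)) : int
tree = Fun(Tuple((INT, CHAR)), INT)
print(show(tree))
```

Because the nodes are frozen dataclasses, two trees built the same way compare equal with `==`, which is one simple way to get a structural comparison of types.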

  20. The Notion of a Type System • Logical Placement of Type Checker: Token Stream → Parser → Syntax Tree → Type Checker → Syntax Tree → Int. Code Generator → Interm. Repres. • Role of Type Checker is to Verify Semantic Contexts • Incompatible Operator/Operands • Existence of Flow-of-Control Info (Goto/Labels) • Uniqueness w.r.t. Variable Definition, Case Statement Labels/Ranges, etc. • Naming Checks Across Blocks (Begin/End) • Function Definition vs. Function Call • Type Checking can Occur as “Side-Effect” of Parsing via a Judicious Use of Attribute Grammars! • Employ Type Synthesis

  21. Example of type synthesis • Assume the program if (1+1 == 2) then 1 + 3 else 2 * 3
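
A minimal Python sketch of type synthesis on this program: each node's type is computed bottom-up from the types of its children. The nested-tuple tree encoding, the `synth` name, and the rule that both branches of an `if` must have the same type are assumptions made for illustration.

```python
def synth(e):
    """Return the type of expression e, computed bottom-up (synthesized)."""
    kind = e[0]
    if kind == "num":
        return "int"
    if kind in ("+", "*"):
        ok = synth(e[1]) == "int" and synth(e[2]) == "int"
        return "int" if ok else "error"
    if kind == "==":
        lt, rt = synth(e[1]), synth(e[2])
        return "bool" if lt == rt and lt != "error" else "error"
    if kind == "if":
        c, t, f = synth(e[1]), synth(e[2]), synth(e[3])
        return t if c == "bool" and t == f and t != "error" else "error"
    return "error"

# if (1+1 == 2) then 1 + 3 else 2 * 3
prog = ("if",
        ("==", ("+", ("num", 1), ("num", 1)), ("num", 2)),
        ("+", ("num", 1), ("num", 3)),
        ("*", ("num", 2), ("num", 3)))
print(synth(prog))
```

The condition synthesizes to `bool` and both branches to `int`, so the whole expression gets type `int`.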

  22. Yes... But.... • What about identifiers ? • Key Idea • Types for identifiers are inherited attributes! • Inherited • From the definition • To the use site • Example: int n; .... if (n == 0) then 1 else n

  23. Example int n; .... if (n == 0) then 1 else n
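
A minimal Python sketch of the same idea with identifiers: the declaration `int n;` is recorded in an environment that is passed down the tree (inherited), while expression types still flow up (synthesized). The `check` function, the tuple encoding, and the dictionary environment are illustrative assumptions.

```python
def check(e, env):
    """Type of e. env maps declared names to types and is passed down (inherited)."""
    kind = e[0]
    if kind == "num":
        return "int"
    if kind == "id":
        return env.get(e[1], "error")      # undeclared name gives "error"
    if kind == "==":
        lt, rt = check(e[1], env), check(e[2], env)
        return "bool" if lt == rt and lt != "error" else "error"
    if kind == "if":
        c, t, f = (check(x, env) for x in e[1:4])
        return t if c == "bool" and t == f and t != "error" else "error"
    return "error"

env = {"n": "int"}                         # from the declaration: int n;
# if (n == 0) then 1 else n
prog = ("if", ("==", ("id", "n"), ("num", 0)), ("num", 1), ("id", "n"))
print(check(prog, env))
```

With an empty environment the same tree yields `error`, since the use of `n` has no definition to inherit a type from.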

  24. The Notion of a Type System • Type System/Checker Based on: • Syntactic Language Constructs • The Notion of Types • Rules for Assigning Types to Language Constructs • Strength of Type Checking (Strong vs. Weak) • Strong vs. Weak • Dynamic vs. Static • OODBS/OOPLS Offer Many Variants • Every Expression in the Language MUST have an Associated Type • Basic (int, real, char, etc.) • Constructed (from basic and constructed types) • How are Type Expressions Defined and Constructed?

  25. Type Expressions • A Basic Type is a Type Expression • Examples: Boolean, Integer, Char, Real • Note: TypeError is Basic Type to Represent Errors • A Type Expression may have a Type Name which is also a Type Expression • A Type Constructor Applied to Type Expression Results in a Type Expression • Array(I,T): I is Integer Range, T is Type Expr. • Product: T1 × T2 is a Type Expr. if T1, T2 are Type Exprs. • Record: Tuple of Field Names & Respective Types • Pointer(T): T is a Type Expr., Pointer(T) also

  26. Type Expressions • A Type Constructor Applied to Type Expression Results in a Type Expression (Continued) • Functions: • May be Mathematically Characterized with Domain Type D and Range Type R • F: D → R • int × int → int • char × char → pointer(int) • A Type Expression May Contain Variables whose Values are Type Expressions • Called Type Variables • We’ll Omit from our Discussion …

  27. Key Issues for Type System • Classical Type System Approaches • Static Type Checking (Compile Time) • Dynamic Type Checking (Run Time) • How is each Handled in C? C++? Java? • Language Level Issues: • Sound Type System (ML) • No Dynamic Type Checking is Required • All Type Errors are Determined Statically • Strongly Typed Language (Java, Ada) • Compiler Guarantees no Type Errors During Execution • Weakly Typed Language (C, LISP) • Allows you to Break Rules at Runtime • What about Today’s Web-based Languages?

  28. The Notion of a Type System • Type System: Rules Used by the Type Checker to Assign Types to Expressions and Verify Consistency • Type Systems are Language/Compiler Dependent • Different Versions of Pascal have Different Type Systems • Same Language Can have Multiple Levels of Type Systems (C Compiler vs. Lint in Unix) • Different Compilers for Same Language May Implement Type Checking Differently • GNU C++ vs. MS Studio C++ • Sun Java vs. MS Java (until Sun forced off market) • What are the Key Issues?

  29. First Example: Simple Type Checker • Consider Simplistic Language: • P → D ; E • D → D ; D | id : T • T → char | int | array [ num ] of T | ↑T • E → literal | number | id | E mod E | E [ E ] | E↑ • What does this Represent? • Key: integer; Key MOD 999; • X: character; B: integer; B MOD X; • A: array [100] of char; A[20]; A[200] • Are all of these Legal?

  30. First: Add Typing into Symbol Table • P → D ; E • D → D ; D • D → id : T {addtype(id.entry, T.type)} • T → char {T.type := char} • T → int {T.type := int} • T → array [ num ] of T1 {T.type := array(1..num.val, T1.type)} • T → ↑T1 {T.type := pointer(T1.type)} • E → literal {E.type := char} • E → number {E.type := integer} • E → id {E.type := lookup(id.entry)} • Notes: • Assume Lexical recognition of id (in Lexical Analyzer) Adds id to Symbol Table • Thus – we Augment this with T.type

  31. Remaining Typing More Complex • An Expression Built from E1, E2 May be a Mod, Array, or Ptr Expression • Useful Extensions would Include Boolean Type and Extending Expression with Rel Ops, AND, OR, etc. • E → E1 mod E2 {E.type := if E1.type = integer and E2.type = integer then integer else type_error} • E → E1 [ E2 ] {E.type := if E2.type = integer and E1.type = array(s, t) then t else type_error} • E → E1↑ {E.type := if E1.type = pointer(t) then t else type_error}
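
The expression rules of the simple type checker can be sketched in Python as a recursive function over a small tree. The tuple encodings (`("array", size, elem)`, `("pointer", t)`), the `expr_type` name, and the dictionary used as a symbol table are assumptions for illustration; the declarations mirror the `Key`, `X`, `B`, and `A` examples given with the simplistic language.

```python
TYPE_ERROR = "type_error"

def expr_type(e, symtab):
    """Apply the E-productions' typing rules to expression tree e."""
    kind = e[0]
    if kind == "literal":
        return "char"
    if kind == "number":
        return "integer"
    if kind == "id":
        return symtab.get(e[1], TYPE_ERROR)      # lookup(id.entry)
    if kind == "mod":                            # E -> E1 mod E2
        l, r = expr_type(e[1], symtab), expr_type(e[2], symtab)
        return "integer" if l == "integer" and r == "integer" else TYPE_ERROR
    if kind == "index":                          # E -> E1 [ E2 ]
        a, i = expr_type(e[1], symtab), expr_type(e[2], symtab)
        if i == "integer" and isinstance(a, tuple) and a[0] == "array":
            return a[2]                          # element type t of array(s, t)
        return TYPE_ERROR
    if kind == "deref":                          # E -> E1 followed by the pointer arrow
        p = expr_type(e[1], symtab)
        if isinstance(p, tuple) and p[0] == "pointer":
            return p[1]
        return TYPE_ERROR
    return TYPE_ERROR

symtab = {"Key": "integer", "X": "char", "B": "integer",
          "A": ("array", 100, "char")}

key_mod = ("mod", ("id", "Key"), ("number", 999))
b_mod_x = ("mod", ("id", "B"), ("id", "X"))
a_20 = ("index", ("id", "A"), ("number", 20))
a_200 = ("index", ("id", "A"), ("number", 200))
```

Under these rules `Key MOD 999` is `integer`, `B MOD X` is a type error, and both `A[20]` and `A[200]` get type `char`: the rule for indexing only checks that the index is an integer, so the out-of-range index is not caught by this type check.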

  32. Extending Example to Statements • These Extensions are More Complex from a Type Checking Perspective • Right Now, only Individual Statements are Checked • P → D ; S • S → id := E {S.type := if id.type = E.type then void else type_error} • S → if E then S1 {S.type := if E.type = boolean then S1.type else type_error} • S → while E do S1 {S.type := if E.type = boolean then S1.type else type_error} • S → S1 ; S2 {S.type := if S1.type = void and S2.type = void then void else type_error}
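
A minimal Python sketch of the statement rules, where a well-typed statement has type `void`. The tuple encoding, the function names, and the cut-down `expr_type` (numbers and identifiers only) are assumptions for illustration. One small deviation from the rules as written: an assignment whose two sides are both `type_error` is rejected here rather than accepted.

```python
VOID, TYPE_ERROR = "void", "type_error"

def expr_type(e, symtab):
    """Cut-down expression typing: numbers and declared identifiers only."""
    if e[0] == "number":
        return "integer"
    if e[0] == "id":
        return symtab.get(e[1], TYPE_ERROR)
    return TYPE_ERROR

def stmt_type(s, symtab):
    """Apply the S-productions' typing rules to statement tree s."""
    kind = s[0]
    if kind == "assign":                         # S -> id := E
        lhs = symtab.get(s[1], TYPE_ERROR)
        rhs = expr_type(s[2], symtab)
        return VOID if lhs == rhs and lhs != TYPE_ERROR else TYPE_ERROR
    if kind in ("if", "while"):                  # S -> if E then S1 | while E do S1
        cond = expr_type(s[1], symtab)
        return stmt_type(s[2], symtab) if cond == "boolean" else TYPE_ERROR
    if kind == "seq":                            # S -> S1 ; S2
        a, b = stmt_type(s[1], symtab), stmt_type(s[2], symtab)
        return VOID if a == VOID and b == VOID else TYPE_ERROR
    return TYPE_ERROR

symtab = {"x": "integer", "done": "boolean"}

# x := 1 ; while done do x := 2
good = ("seq",
        ("assign", "x", ("number", 1)),
        ("while", ("id", "done"), ("assign", "x", ("number", 2))))
# if x then x := 1   (condition is integer, not boolean)
bad = ("if", ("id", "x"), ("assign", "x", ("number", 1)))
```

Here `good` checks to `void`, while `bad` checks to `type_error` because its condition is not `boolean`.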

  33. What are Main Issues in Type Checking? • Type Equivalence: • Conditions under which Types are the Same • Tracking of Scoping – Nested Declarations • Type Compatibility • Conversion/casting, Nonconverting casts, Coercion • Type Inference • Determining the Type of a Complex Expression • Reviewing Remaining Concepts of Note • Overloading, Polymorphism, Generics • In OO Case: • Classes are the OO version of a type • Issues Need to Consider the Way we Program in OO • In Older Languages like C, these are Critical

  34. Structural vs. Name Equivalence of Types • Two Types are “Structurally Equivalent” iff they are Equivalent Under Following 3 Rules: • SE1: A Type Name is Structurally Equivalent to Itself • SE2: T1 and T2 are Structurally Equivalent if they are Formed by Applying the Same Type Constructors to Structurally Equivalent Types • SE3: After a Type Declaration: Type n=T, the Type Name n is Structurally Equivalent to T • SE3 is “Name Equivalence” • What Do Programming Languages Use? • C: All Three Rules • Pascal: Omits SE2 and Restricts SE3 so that a Type Name can only be Structurally Equivalent to Other Type Names

  35. Type Equivalence • Structural equivalence: equivalent if built in the same way (same parts, same order) • Name equivalence: distinctly named types are always different • Structural equivalence questions • What parts constitute a structural difference? • Storage: record fields, array size • Naming of storage: field names, array indices • Field order • How to distinguish between intentional vs. incidental structural similarities? • An argument for name equivalence: “They’re different because the programmer said so; if they’re

  36. Type Equivalence Records and Arrays • Would record types with identical fields, but different name order, be structurally equivalent? • When are arrays with the same number of elements structurally equivalent? type PascalRec = record a : integer; b : integer end; val MLRec = { a = 1, b = 2 }; val OtherRec = { b = 2, a = 1 }; type str = array [1..10] of integer; type str = array [1..2 * 5] of integer; type str = array [0..9] of integer;

  37. Consider Name Equivalence in Pascal • How are Following Compared: type link = ↑cell; var next : link; last : link; p : ↑cell; q, r : ↑cell; • By Rules SE1, SE2, SE3, all are Equivalent! • However: • Some Implementations of Pascal • next, last – Equivalent • p, q, r – Equivalent • Other Implementations of Pascal • next, last – Equivalent • q, r – Equivalent • How is Following Interpreted? type link = ↑cell; np = ↑cell; npr = ↑cell; var next : link; last : link; p : np; q, r : npr;

  38. What about Classes and Equivalence? • Are these SE1? SE2? Or SE3? • What Does Java Require? public class person { private String lastname, firstname; private String loginID; private String password; }; public class user { private String lastname, firstname; private String loginID; private String password; };

  39. Checking Structural Equivalence • Employ a Recursive Algorithm to Check SE2: • Algorithm Adaptable for Other Versions of SE • Constructive Equivalence Means Following are Same: • X: array[1..10] of int; • Y: array[1..10] of int;
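
A minimal Python sketch of a recursive check for rule SE2. Type expressions are encoded as a base-type name (a string) or a tuple whose first element names the constructor; this encoding and the `sequiv` name are assumptions for illustration. The sketch compares array index ranges exactly and does not handle recursive (cyclic) types.

```python
def sequiv(s, t):
    """True if type expressions s and t are structurally equivalent."""
    if not isinstance(s, tuple) or not isinstance(t, tuple):
        return s == t                  # basic types or bounds: must be identical
    if s[0] != t[0] or len(s) != len(t):
        return False                   # different constructors or arity
    # Same constructor: every pair of components must be equivalent.
    return all(sequiv(a, b) for a, b in zip(s[1:], t[1:]))

# X: array[1..10] of int;  Y: array[1..10] of int;
X = ("array", ("range", 1, 10), "int")
Y = ("array", ("range", 1, 10), "int")
Z = ("array", ("range", 0, 9), "int")      # same length, different bounds

print(sequiv(X, Y), sequiv(X, Z))
```

Whether `X` and `Z` should count as equivalent is a design choice for the language; this sketch treats differing bounds as a structural difference.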

  40. Alias Types and Name Equivalence • Alias types are types that purely consist of a different name for another type: TYPE Stack_Element = INTEGER; TYPE Level = INTEGER; TYPE Celsius = REAL; TYPE Fahrenheit = REAL; • Is Integer assignable to a Stack_Element? A Level? • Can a Celsius and a Fahrenheit be assigned to each other? • Strict name equivalence: aliased types are distinct • Loose name equivalence: aliased types are equivalent • Ada allows additional explicit equivalence control: subtype Stack_Element is integer; type Celsius is new real; type Fahrenheit is new real;

  41. Why is Degree of Type Equivalence Critical? • Governs how Software Engineers Develop Code… Why? • SE2 Alone Doesn’t Promote Well Designed, Thought Out, Software … Why? • Impacts on Team-Oriented Software Development… How? • With SE2 Alone, Errors are Harder to Locate and Correct… Why? • Increases Compilation Time with SE2 Alone … Why?

  42. Scoping • What is the problem ? • Consider this example program class Foo { int n; Foo() { n = 0;} int run(int n) { int i; int j; i = 0; j = 0; while (i < n) { int n; n = i * 2; j = j + n; } return j; } };

  43. Resolving the Issue • Observation • Scopes are always properly nested • Each new definition could have a different type • Idea • Make the typing environment sensitive to scopes • New operations on typing env. • Entering a scope • Effect: New declarations shadow (hide) the previous ones • Leaving a scope • Effect: Old declarations become current again • What are the Issues? • Activating and Tracking Scopes!

  44. Scoping • The Scopes class Foo { int n; Foo() { n = 0;} int run(int n) { int i; int j; i = 0; j = 0; while (i < n) { int n; n = i * 2; j = j + n; } return j; } }; • Scopes, Outermost to Innermost: Class Scope, Method Scope, Body Scope, Block Scope • Key point: Non-shadowed names remain visible

  45. Handling Scopes • From a declarative standpoint • Introduce a new typing environment • Initially equal to the copy of the original • Then augmented with the new declarations • Discard environment when leaving the scope • From an implementation point of view • Environment directly accounts for scoping • How ? • Scope chaining!

  46. Scope Chaining • Key Ideas • One scope = One hashtable • Scope chaining = A linked list of scopes • Abstract Data Type • Semantic Environment • pushScope • Add a new scope in front of the linked list • popScope • Remove the scope at the front of the list • lookup(name) • Search for an entry for name. If nothing in first scope, start scanning the subsequent scopes in the linked list.

  47. Scope Chaining • Advantages • Updates are non-destructive • When we pop a scope, the previous list is unchanged since additions are only done in the top scope • The current list of scopes can be saved (when needed)
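
A minimal Python sketch of a semantic environment built by scope chaining. The `Env` class and its method names are illustrative assumptions; a Python list of dictionaries stands in for the linked list of hashtables, with the innermost scope at the front. The `float` redeclaration in the usage is hypothetical, chosen only so that shadowing is visible in the result.

```python
class Env:
    """Semantic environment: a chain of scopes, innermost first."""

    def __init__(self):
        self.scopes = [{}]             # one dict (hashtable) per scope

    def push_scope(self):
        self.scopes.insert(0, {})      # new scope at the front of the chain

    def pop_scope(self):
        self.scopes.pop(0)             # outer scopes are left untouched

    def declare(self, name, typ):
        self.scopes[0][name] = typ     # additions only go in the top scope

    def lookup(self, name):
        for scope in self.scopes:      # innermost scope first, then outward
            if name in scope:
                return scope[name]
        return None                    # name is not declared anywhere

env = Env()
env.declare("n", "int")
env.declare("j", "int")
env.push_scope()
env.declare("n", "float")              # hypothetical redeclaration shadows outer n
inner_n, inner_j = env.lookup("n"), env.lookup("j")
env.pop_scope()
outer_n = env.lookup("n")
```

Inside the inner scope `n` resolves to the new declaration while the non-shadowed `j` is still found in the outer scope; after `pop_scope` the old declaration of `n` is current again.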

  48. Entering & Leaving Scopes • Easy to find out... • Use the tree structure! • Entering scope • When entering a class • When entering a method • When entering a block • Leaving scope • End of class • End of method • End of block • We’ll Revisit in Chapter 7 on Runtime Environment!

  49. Type Conversion • Certain contexts in certain languages may require exact matches with respect to types: • aVar := anExpression • value1 + value2 • foo(arg1, arg2, arg3, … , argN) • Type conversion seeks to follow these exact match rules while allowing programmers some flexibility in the values used • Using structurally-equivalent types in a name-equivalent language • Types whose value ranges may be distinct but intersect (e.g. subranges) • Distinct types with sensible/meaningful corresponding values (e.g. integers and floats)

  50. Type Conversion • Refers to the Conversion Between Different Types to Carry out Some Action in a Program • Often Abused within a Programming Language (C) • Typically Used in Arithmetic/Boolean Expressions • r := i + r; (Pascal) • f := i + c; (C) • Two Kinds of Conversion: • Implicit: Automatically done by Compiler • Explicit: Type-Casts: Programmer Initiated (Ord, Chr, Trunc) • If X is a real array, which works faster? Why? • for I:=1 to N do X[I] := 1; • for I:=1 to N do X[I] := 1.0; • A Good Optimizing Compiler will Convert the 1st Option (the Constant 1 Becomes 1.0 at Compile Time)!
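
A minimal Python sketch of implicit conversion during type checking: when an addition mixes `integer` and `real`, the checker rewrites the tree so the integer operand is converted. The `coerce` and `to_real` names, the tuple encoding, and the `inttoreal` node are assumptions for illustration; only `integer` and `real` are handled, and every identifier is assumed to be declared in `env`.

```python
def to_real(e):
    """Convert an integer-typed expression to real."""
    if e[0] == "int":
        return ("real", float(e[1]))   # constant: converted at compile time
    return ("inttoreal", e)            # otherwise: conversion node for run time

def coerce(e, env):
    """Return (rewritten expression, type), inserting int-to-real conversions."""
    kind = e[0]
    if kind == "int":
        return e, "integer"
    if kind == "real":
        return e, "real"
    if kind == "id":
        return e, env[e[1]]
    if kind == "+":
        l, lt = coerce(e[1], env)
        r, rt = coerce(e[2], env)
        if lt == rt:
            return ("+", l, r), lt
        # Mixed integer/real: widen whichever side is the integer.
        if lt == "integer":
            l = to_real(l)
        else:
            r = to_real(r)
        return ("+", l, r), "real"
    raise ValueError("unsupported expression")

env = {"i": "integer", "r": "real"}
mixed = coerce(("+", ("id", "i"), ("id", "r")), env)     # i + r
const = coerce(("+", ("int", 1), ("id", "r")), env)      # 1 + r
```

For `i + r` the result contains an `inttoreal` node around `i`; for `1 + r` the constant is simply replaced by `1.0`, which is the compile-time conversion an optimizing compiler can apply to `X[I] := 1`.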
