Programming Languages and Compilers (CS 421)
Programming Languages and Compilers (CS 421) • Munawar Hafiz, 2219 SC, UIUC • http://www.cs.illinois.edu/class/cs421/ • Based in part on slides by Mattox Beckman, as updated by Vikram Adve and Gul Agha
Type Inference - Example • Eliminate ζ: [f : δ; x : β] |- f : β → ε   [f : δ; x : β] |- x : β   [f : δ; x : β] |- (f x) : ε   [x : β] |- (fun f -> f x) : γ   [ ] |- (fun x -> fun f -> f x) : α • α ≡ (β → γ); γ ≡ (δ → ε); δ ≡ (β → ε)
Type Inference Algorithm • Let has_type (Γ, e, τ) = S • Γ is a typing environment • e is an expression • τ is a (generalized) type • S is a set of equations between generalized types • Idea: S is the set of constraints on type variables necessary for Γ |- e : τ • Let Unif(S) be a substitution of generalized types for type variables solving S • Solution: Unif(S)(Γ) |- e : Unif(S)(τ)
Type Inference Algorithm • has_type (Γ, exp, τ) = • Case exp of • Var v --> return {τ ≡ Γ(v)} • Const c --> return {τ ≡ σ} where Γ |- c : σ by the constant rules • fun x -> e --> • Let α, β be fresh variables • Let S = has_type ([x : α] + Γ, e, β) • Return {τ ≡ α → β} ∪ S
Type Inference Algorithm (cont) • Case exp of • App (e1 e2) --> • Let α be a fresh variable • Let S1 = has_type(Γ, e1, α → τ) • Let S2 = has_type(Γ, e2, α) • Return S1 ∪ S2
Type Inference Algorithm (cont) • Case exp of • If e1 then e2 else e3 --> • Let S1 = has_type(Γ, e1, bool) • Let S2 = has_type(Γ, e2, τ) • Let S3 = has_type(Γ, e3, τ) • Return S1 ∪ S2 ∪ S3
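The case analysis of has_type can be sketched in Python as a constraint generator. This is only an illustration under assumptions that are not in the slides: expressions are encoded as tuples such as ("var", "x") or ("app", e1, e2), type variables are strings, type constructors are tuples, and the helper names fresh, arrow, BOOL and INT are hypothetical.

```python
import itertools

_fresh_ids = itertools.count()

def fresh():
    """Return a new type variable name (hypothetical naming: t0, t1, ...)."""
    return f"t{next(_fresh_ids)}"

def arrow(a, b):
    """Build the function type a -> b."""
    return ("->", a, b)

# Type constants are zero-argument constructors (an encoding assumption).
BOOL, INT = ("bool",), ("int",)

def has_type(env, exp, ty):
    """Return the list of equations (lhs, rhs) required for env |- exp : ty."""
    kind = exp[0]
    if kind == "var":                       # Var v
        return [(ty, env[exp[1]])]
    if kind == "const":                     # Const c: only bools and ints here
        return [(ty, BOOL if isinstance(exp[1], bool) else INT)]
    if kind == "fun":                       # fun x -> e
        _, x, body = exp
        a, b = fresh(), fresh()
        return [(ty, arrow(a, b))] + has_type({**env, x: a}, body, b)
    if kind == "app":                       # App (e1 e2)
        _, e1, e2 = exp
        a = fresh()
        return has_type(env, e1, arrow(a, ty)) + has_type(env, e2, a)
    if kind == "if":                        # if e1 then e2 else e3
        _, e1, e2, e3 = exp
        return (has_type(env, e1, BOOL)
                + has_type(env, e2, ty)
                + has_type(env, e3, ty))
    raise ValueError(f"unknown expression form: {kind!r}")

# fun x -> fun f -> f x, with "r" standing for the unknown result type
example = ("fun", "x", ("fun", "f", ("app", ("var", "f"), ("var", "x"))))
equations = has_type({}, example, "r")
```

Each returned pair is one equation between generalized types; solving the list with a unifier gives the substitution that the slides call Unif(S).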
Unification Problem • Given a set of pairs of terms (“equations”) {(s1, t1), (s2, t2), …, (sn, tn)} (the unification problem), does there exist a substitution σ (the unification solution) of terms for variables such that σ(si) = σ(ti), for all i = 1, …, n?
Unification Algorithm • Let S = {(s1, t1), (s2, t2), …, (sn, tn)} be a unification problem. • Case S = { }: Unif(S) = Identity function (i.e., no substitution) • Case S = {(s, t)} ∪ S’: Four main steps
Unification Algorithm • Delete: if s = t (they are the same term) then Unif(S) = Unif(S’) • Decompose: if s = f(q1, … , qm) and t = f(r1, … , rm) (same f, same m!), then Unif(S) = Unif({(q1, r1), …, (qm, rm)} ∪ S’) • Orient: if t = x is a variable, and s is not a variable, Unif(S) = Unif({(x, s)} ∪ S’)
Unification Algorithm • Eliminate: if s = x is a variable, and x does not occur in t (the occurs check), then • Let φ = {x ↦ t} • Let ψ = Unif(φ(S’)) • Unif(S) = {x ↦ ψ(t)} o ψ • Note: {x ↦ a} o {y ↦ b} = {y ↦ ({x ↦ a}(b))} o {x ↦ a} if y not in a
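The four steps (Delete, Decompose, Orient, Eliminate) can be sketched in Python as follows. The encoding is an assumption made for illustration: variables are strings, a compound term f(q1, …, qm) is the tuple ("f", q1, …, qm), a substitution is a dict, and failure is reported by returning None. The function names are hypothetical.

```python
def is_var(t):
    """Variables are plain strings; compound terms are tuples."""
    return isinstance(t, str)

def occurs(x, t):
    """Occurs check: does variable x appear anywhere in term t?"""
    if is_var(t):
        return x == t
    return any(occurs(x, arg) for arg in t[1:])

def apply(subst, t):
    """Apply a substitution (dict from variable to term) to term t."""
    if is_var(t):
        return subst.get(t, t)
    return (t[0],) + tuple(apply(subst, arg) for arg in t[1:])

def unify(equations):
    """Return a substitution solving the equations, or None if none exists."""
    if not equations:
        return {}                                    # identity substitution
    (s, t), rest = equations[0], list(equations[1:])
    if s == t:                                       # Delete
        return unify(rest)
    if not is_var(s) and not is_var(t):              # Decompose
        if s[0] != t[0] or len(s) != len(t):
            return None                              # different f or different m
        return unify(list(zip(s[1:], t[1:])) + rest)
    if not is_var(s):                                # Orient (t is the variable)
        return unify([(t, s)] + rest)
    if occurs(s, t):                                 # occurs check fails
        return None
    phi = {s: t}                                     # Eliminate
    psi = unify([(apply(phi, a), apply(phi, b)) for a, b in rest])
    if psi is None:
        return None
    return {**psi, s: apply(psi, t)}                 # {x -> psi(t)} o psi

solvable = [(("f", "x"), ("f", ("g", "y", "z"))),
            (("g", "y", ("f", "y")), "x")]
unsolvable = [(("f", "x", ("g", "y")), ("f", ("h", "y"), "x"))]
solution = unify(solvable)
failure = unify(unsolvable)
```

Merging the two dicts in the last line is a valid stand-in for the composition only because x never occurs in ψ, which the occurs check and the earlier substitution guarantee.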
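Example • S = {(f(x), f(g(y,z))), (g(y,f(y)), x)} • Solved by {x ↦ g(y,f(y))} o {z ↦ f(y)} • First equation: f(g(y,f(y))) = f(g(y,f(y))), replacing x on the left and z on the right • Second equation: g(y,f(y)) = g(y,f(y)), replacing x on the right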
Example of Failure • S = {(f(x,g(y)), f(h(y),x))} • Decompose • S -> {(x,h(y)), (g(y),x)} • Orient • S -> {(x,h(y)), (x,g(y))} • Eliminate (substitute for x) • S -> {(h(y), g(y))} with {x ↦ h(y)} • No rule to apply! Decompose fails because h and g are different function symbols
Example Regular Expressions • (0 ∨ 1)*1 • The set of all strings of 0’s and 1’s ending in 1, {1, 01, 11, …} • a*b(a*) • The set of all strings of a’s and b’s with exactly one b • ((01) ∨ (10))* • You tell me • Regular expressions (equivalently, regular grammars) are important for lexing, that is, breaking strings into recognized words
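The three expressions on this slide can be tried out with Python's re module. This is a sketch rather than the course's tooling: Python writes choice as |, and the names ends_in_one, exactly_one_b, blocks and accepts are made up for the example.

```python
import re

# Strings of 0's and 1's ending in 1
ends_in_one = re.compile(r"(0|1)*1")
# Strings of a's and b's with exactly one b
exactly_one_b = re.compile(r"a*ba*")
# Zero or more repetitions of the two-character blocks 01 and 10
blocks = re.compile(r"((01)|(10))*")

def accepts(regex, s):
    """True when the whole string s is in the language of regex."""
    return regex.fullmatch(s) is not None

samples = {
    "ends_in_one": [s for s in ["1", "01", "10", ""] if accepts(ends_in_one, s)],
    "exactly_one_b": [s for s in ["b", "aabaa", "abab", "aa"] if accepts(exactly_one_b, s)],
    "blocks": [s for s in ["", "01", "0110", "0011"] if accepts(blocks, s)],
}
```

fullmatch is used instead of search because a regular expression here describes a set of whole strings, not a pattern to find inside a longer string.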
Example FSA • [Figure: state diagram with a start state and two final states; transitions are labeled 0 and 1]
Ocamllex Regular Expression • Single quoted characters for letters: 'a' • _: (underscore) matches any character • eof: special “end_of_file” marker • Concatenation same as usual • "string": concatenation of sequence of characters • e1 | e2: choice, previously written e1 ∨ e2
Ocamllex Regular Expression • [c1 - c2]: choice of any character between first and second inclusive, as determined by character codes • [^c1 - c2]: choice of any character NOT in set • e*: same as before • e+: same as e e* • e?: option, previously written e ∨ ε
Ocamllex Regular Expression • e1 # e2: the characters in e1 but not in e2; e1 and e2 must describe just sets of characters • ident: abbreviation for earlier reg exp in let ident = regexp • e1 as id: binds the result of e1 to id to be used in the associated action
Sample Grammar • Language: Parenthesized sums of 0’s and 1’s • <Sum> ::= 0 • <Sum> ::= 1 • <Sum> ::= <Sum> + <Sum> • <Sum> ::= (<Sum>)
BNF Derivations • Pick a rule and substitute: • <Sum> ::= <Sum> + <Sum> • <Sum> => <Sum> + <Sum>
Example cont. • 1 * 1 + 0: parse tree <exp>[ <factor>[ <bin>[1] ] * <exp>[ <factor>[ <bin>[1] ] + <factor>[ <bin>[0] ] ] ] • Fringe of tree is string generated by grammar
Example: Ambiguous Grammar • 0 + 1 + 0 has two parse trees • <Sum>[ <Sum>[ <Sum>[0] + <Sum>[1] ] + <Sum>[0] ] • <Sum>[ <Sum>[0] + <Sum>[ <Sum>[1] + <Sum>[0] ] ]
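One way to see the ambiguity concretely is to count the parse trees that the <Sum> grammar assigns to a string. The sketch below assumes single-character tokens and uses a memoized count over substrings; the function name count_parses is hypothetical.

```python
from functools import lru_cache

def count_parses(text):
    """Count parse trees for text under
    <Sum> ::= 0 | 1 | <Sum> + <Sum> | ( <Sum> )."""
    toks = tuple(c for c in text if not c.isspace())

    @lru_cache(maxsize=None)
    def count(i, j):
        """Number of parse trees deriving toks[i:j] from <Sum>."""
        n = 0
        if j - i == 1 and toks[i] in ("0", "1"):      # <Sum> ::= 0 | 1
            n += 1
        if j - i >= 3 and toks[i] == "(" and toks[j - 1] == ")":
            n += count(i + 1, j - 1)                  # <Sum> ::= ( <Sum> )
        for k in range(i + 1, j - 1):                 # <Sum> ::= <Sum> + <Sum>
            if toks[k] == "+":
                n += count(i, k) * count(k + 1, j)
        return n

    return count(0, len(toks))

two_trees = count_parses("0 + 1 + 0")
one_tree = count_parses("(0 + 1) + 0")
```

A grammar is ambiguous as soon as some string has a count greater than one; parentheses remove the ambiguity for an individual string but not for the grammar.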
Two Major Sources of Ambiguity • Lack of determination of operator precedence • Lack of determination of operator associativity • Not the only sources of ambiguity
How to Enforce Associativity • Have at most one recursive call per production • When two or more recursive calls would be natural, leave the right-most one for right associativity, the left-most one for left associativity
Example • <Sum> ::= 0 | 1 | <Sum> + <Sum> | (<Sum>) • Becomes • <Sum> ::= <Num> | <Num> + <Sum> • <Num> ::= 0 | 1 | (<Sum>)
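A small recursive-descent parser for the rewritten grammar shows that it groups sums to the right. This is a sketch with assumed conventions: tokens are single characters, trees are nested tuples ("+", left, right), and the function names are hypothetical.

```python
def parse_sum(tokens, pos=0):
    """<Sum> ::= <Num> | <Num> + <Sum>; returns (tree, next position)."""
    left, pos = parse_num(tokens, pos)
    if pos < len(tokens) and tokens[pos] == "+":
        # The only recursive call is on the right, so + groups to the right.
        right, pos = parse_sum(tokens, pos + 1)
        return ("+", left, right), pos
    return left, pos

def parse_num(tokens, pos):
    """<Num> ::= 0 | 1 | ( <Sum> ); returns (tree, next position)."""
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok = tokens[pos]
    if tok in ("0", "1"):
        return int(tok), pos + 1
    if tok == "(":
        tree, pos = parse_sum(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError("expected ')'")
        return tree, pos + 1
    raise SyntaxError(f"unexpected token {tok!r}")

def parse(text):
    """Parse a whole string, rejecting trailing input."""
    tokens = [c for c in text if not c.isspace()]
    tree, pos = parse_sum(tokens)
    if pos != len(tokens):
        raise SyntaxError("trailing input")
    return tree

right_grouped = parse("0 + 1 + 0")
forced_left = parse("(0 + 1) + 0")
```

With this grammar every string has exactly one tree; left grouping is still available, but only by writing the parentheses explicitly.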
Operator Precedence • Operators of highest precedence evaluated first (bind more tightly). • Precedence for infix binary operators given in following table • Needs to be reflected in grammar
Precedence in Grammar • Higher precedence translates to longer derivation chain • Example: <exp> ::= <id> | <exp> + <exp> | <exp> * <exp> • Becomes <exp> ::= <mult_exp> | <exp> + <mult_exp>   <mult_exp> ::= <id> | <mult_exp> * <id>
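The stratified grammar can be exercised with a short parser. Note the swap: the rules <exp> ::= <exp> + <mult_exp> and <mult_exp> ::= <mult_exp> * <id> are left recursive, so this sketch replaces each with a loop that accepts the same strings and builds the same left-grouped trees. Single-letter identifiers and the name parse_exp are assumptions for the example.

```python
def parse_exp(text):
    """Parse text with <exp> / <mult_exp> / <id>, returning a nested tuple tree."""
    tokens = [c for c in text if not c.isspace()]
    pos = 0

    def ident():
        """<id>: a single letter (an assumption made for this sketch)."""
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isalpha():
            raise SyntaxError("identifier expected")
        pos += 1
        return tokens[pos - 1]

    def mult_exp():
        """<mult_exp> ::= <id> | <mult_exp> * <id>, written as a loop."""
        nonlocal pos
        tree = ident()
        while pos < len(tokens) and tokens[pos] == "*":
            pos += 1
            tree = ("*", tree, ident())
        return tree

    # <exp> ::= <mult_exp> | <exp> + <mult_exp>, written as a loop
    tree = mult_exp()
    while pos < len(tokens) and tokens[pos] == "+":
        pos += 1
        tree = ("+", tree, mult_exp())
    if pos != len(tokens):
        raise SyntaxError("trailing input")
    return tree

times_binds_tighter = parse_exp("a + b * c")
left_assoc_plus = parse_exp("a + b + c")
```

Because * is only reachable through <mult_exp>, one level further down the derivation chain than +, every product ends up as a subtree of the enclosing sum.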
Problems for Recursive-Descent Parsing • Left Recursion: A ::= Aw translates to a subroutine that loops forever • Indirect Left Recursion: A ::= Bw B ::= Av causes the same problem
Problems for Recursive-Descent Parsing • Parser must always be able to choose the next action based only on the very next token • Pairwise Disjointedness Test: Can we always determine which rule (in the non-extended BNF) to choose based on just the first token?
Pairwise Disjointedness Test • For each rule A ::= y, calculate FIRST(y) = {a | y =>* aw} ∪ {ε | y =>* ε} • For each pair of rules A ::= y and A ::= z, require FIRST(y) ∩ FIRST(z) = { }
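The test can be sketched in Python with a fixed-point computation of FIRST sets. The grammar encoding is an assumption: a dict from nonterminal to a list of right-hand sides, each a list of symbols, with the empty list standing for ε; any symbol that is not a key of the dict is treated as a terminal. All names are hypothetical.

```python
EPS = "ε"

def first_of_seq(seq, first):
    """FIRST of a sequence of symbols, given FIRST sets for the nonterminals."""
    out = set()
    for sym in seq:
        f = first[sym] if sym in first else {sym}    # a terminal starts with itself
        out |= f - {EPS}
        if EPS not in f:
            return out
    out.add(EPS)              # every symbol can vanish (or seq is empty)
    return out

def first_sets(grammar):
    """Iterate until no FIRST set grows any further."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, rules in grammar.items():
            for rhs in rules:
                new = first_of_seq(rhs, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first

def pairwise_disjoint(grammar):
    """True when, for every nonterminal, the FIRST sets of its rules are disjoint."""
    first = first_sets(grammar)
    for rules in grammar.values():
        sets = [first_of_seq(rhs, first) for rhs in rules]
        for i in range(len(sets)):
            for j in range(i + 1, len(sets)):
                if sets[i] & sets[j]:
                    return False
    return True

unfactored = {"expr": [["term"], ["term", "+", "expr"]], "term": [["n"]]}
factored = {"expr": [["term", "e"]], "e": [["+", "term", "e"], []], "term": [["n"]]}
```

The unfactored grammar fails because both <expr> rules start with whatever <term> starts with; the factored one passes because the choice is postponed to <e>, whose two rules start with + and ε.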
Factoring Grammar • Test too strong: Can’t handle <expr> ::= <term> [ ( + | - ) <expr> ] • Answer: Add new non-terminal and replace above rules by <expr> ::= <term><e>   <e> ::= + <term><e>   <e> ::= ε • You are delaying the decision point
Example • Both <A> and <B> have problems: <S> ::= <A> a <B> b   <A> ::= <A> b | b   <B> ::= a <B> | a • Transform grammar to: <S> ::= <A> a <B> b   <A> ::= b <A1>   <A1> ::= b <A1> | ε   <B> ::= a <B1>   <B1> ::= a <B1> | ε
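The rewrite applied to <A> is the usual removal of immediate left recursion: A ::= A α | β becomes A ::= β A1 and A1 ::= α A1 | ε. A minimal sketch, assuming rules are lists of symbols, the empty list stands for ε, and the new nonterminal is named by appending "1" (all hypothetical conventions):

```python
def remove_left_recursion(nt, rules):
    """Rewrite the rules of nonterminal nt so that none starts with nt itself.

    rules is a list of right-hand sides, each a list of symbols.
    Returns a dict from nonterminal to its new list of right-hand sides.
    """
    # Tails alpha of the rules nt ::= nt alpha
    recursive = [rhs[1:] for rhs in rules if rhs and rhs[0] == nt]
    # Rules beta that do not start with nt
    others = [rhs for rhs in rules if not rhs or rhs[0] != nt]
    if not recursive:
        return {nt: rules}                 # nothing to do
    new = nt + "1"
    return {
        nt: [beta + [new] for beta in others],
        new: [alpha + [new] for alpha in recursive] + [[]],   # [] is ε
    }

transformed_a = remove_left_recursion("A", [["A", "b"], ["b"]])
```

This handles only immediate left recursion. The <B> rules are a different problem (two rules starting with the same token a), which the slide fixes by factoring out the common prefix rather than by this transformation.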