Symbol Tables • The symbol table contains information about • variables • functions • class names • type names • temporary variables • etc.
Symbol Tables • What kind of information is usually stored in a symbol table? • type • storage class • size • scope • stack frame offset • register
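As a rough illustration, the attributes listed above could be grouped into one record per identifier. This is a minimal sketch: the class name `SymbolEntry`, the field names, and the sample values are all hypothetical, not taken from any particular compiler.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SymbolEntry:
    """One symbol table record (hypothetical layout for illustration)."""
    name: str
    type_name: str                       # e.g. "int", "float*"
    storage_class: str                   # e.g. "auto", "static", "extern"
    size: int                            # size in bytes
    scope: int                           # nesting depth, 0 = global
    frame_offset: Optional[int] = None   # stack frame offset, if stored in memory
    register: Optional[str] = None       # register name, if register-allocated


# A local int stored 8 bytes below the frame pointer (made-up values).
entry = SymbolEntry("count", "int", "auto", 4, 1, frame_offset=-8)
```

A variable normally has either a frame offset or a register, which is why both fields are optional here.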
Symbol Tables • How is a symbol table implemented? • array • simple, but linear lookup time • However, we may use a sorted array for reserved words, since they are generally few and known in advance. • tree • O(lg n) lookup time if kept balanced • hash table • most common implementation • expected O(1) lookup time
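The sorted-array idea for reserved words can be sketched with a binary search. The word list below is a made-up sample, and `is_reserved` is a hypothetical helper name.

```python
from bisect import bisect_left

# Reserved words are few and known in advance, so a sorted array suffices.
RESERVED = sorted(["else", "for", "if", "int", "return", "while"])


def is_reserved(word):
    """Binary search the sorted array: O(lg n) comparisons."""
    i = bisect_left(RESERVED, word)
    return i < len(RESERVED) and RESERVED[i] == word
```

Because the array never changes after start-up, the cost of keeping it sorted is paid only once.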
Symbol Tables • Hash tables • use array of size m to store elements • given key k (the identifier name), use a function h to compute index h(k) for that key • collisions are possible • two keys hash into the same slot. • Hash functions • A good hash function • is easy to compute • avoids collisions (by breaking up patterns in the keys and uniformly distributing the hash values)
Symbol Tables • Hash functions • A common hash function is h(k) = ⌊m*(k*c − ⌊k*c⌋)⌋, for some constant 0<c<1 • In English • Multiply the key k by the constant c • Take the fractional part of k*c • Multiply that by the table size m • Take the floor of the result • A good value for c:
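The four steps above can be sketched as follows. The constant c defaults to (√5 − 1)/2 ≈ 0.618, a value commonly attributed to Knuth. Treat that choice, the helper `key_to_int`, and its multiplier 31 as assumptions of this sketch.

```python
import math

# Assumed constant: (sqrt(5) - 1) / 2, roughly 0.618.
C = (math.sqrt(5) - 1) / 2


def key_to_int(name):
    """Hypothetical helper: fold an identifier into a 32-bit integer key.

    Keeping k below 2**32 leaves enough floating-point precision for the
    fractional part of k*c to be meaningful.
    """
    k = 0
    for ch in name:
        k = (k * 31 + ord(ch)) % 2**32
    return k


def mult_hash(k, m, c=C):
    """Multiplication method: floor(m * frac(k * c))."""
    frac = (k * c) % 1.0          # fractional part of k*c
    return math.floor(m * frac)   # scale to the table size, take the floor


slot = mult_hash(key_to_int("counter"), 64)   # always in the range 0..63
```

For example, with m = 10 the keys 1, 2 and 3 land in slots 6, 2 and 8, so consecutive keys are spread apart rather than placed in adjacent slots.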
Resolving collisions • Chaining • Put all the elements that collide in a chain (list) attached to the slot. • Insert/Delete/Lookup in expected O(1) time • However, this assumes that the chains are kept small. • If the chains start becoming too long, the table must be enlarged and all the keys rehashed.
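A minimal sketch of chaining with table growth is shown below. It assumes Python's built-in `hash` as the hash function and a maximum load factor of 2.0. The class name and the threshold are illustrative choices.

```python
class ChainedTable:
    """Hash table with chaining; doubles in size when chains get long."""

    def __init__(self, m=8, max_load=2.0):
        self.slots = [[] for _ in range(m)]   # one chain (list) per slot
        self.count = 0
        self.max_load = max_load              # assumed threshold

    def _chain(self, key):
        return self.slots[hash(key) % len(self.slots)]

    def insert(self, key, value):
        chain = self._chain(key)
        for i, (k, _) in enumerate(chain):
            if k == key:                      # key already present: update
                chain[i] = (key, value)
                return
        chain.append((key, value))
        self.count += 1
        if self.count / len(self.slots) > self.max_load:
            self._rehash(2 * len(self.slots))

    def lookup(self, key):
        for k, v in self._chain(key):
            if k == key:
                return v
        return None

    def delete(self, key):
        chain = self._chain(key)
        for i, (k, _) in enumerate(chain):
            if k == key:
                del chain[i]
                self.count -= 1
                return True
        return False

    def _rehash(self, new_m):
        """Enlarge the table and rehash every key into the new slots."""
        old = self.slots
        self.slots = [[] for _ in range(new_m)]
        for chain in old:
            for k, v in chain:
                self.slots[hash(k) % new_m].append((k, v))
```

Bounding the load factor keeps the average chain length constant, which is what the expected O(1) bound relies on.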
Resolving collisions • Open addressing • Store all elements within the table • The space we save from the chain pointers is used up to make the array larger. • If there is a collision, probe the table in a systematic way to find an empty slot. • If the table fills up, we need to enlarge it and rehash all the keys. • Open addressing with linear probing • Probe the slots in a linear manner • Simple but Bad: results in clustering (long sequences of used slots build up very fast)
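Open addressing with linear probing might look like the sketch below. It assumes Python's built-in `hash`, grows the table once it would become more than half full (an arbitrary threshold), and leaves out deletion.

```python
class LinearProbeTable:
    """Open addressing: all entries live in the array itself."""

    def __init__(self, m=8):
        self.keys = [None] * m
        self.values = [None] * m
        self.count = 0

    def _probe(self, key):
        """Yield slots h(k), h(k)+1, h(k)+2, ... (mod m)."""
        m = len(self.keys)
        start = hash(key) % m
        for i in range(m):
            yield (start + i) % m

    def insert(self, key, value):
        if self.count + 1 > len(self.keys) // 2:   # assumed load limit of 1/2
            self._grow()
        for j in self._probe(key):
            if self.keys[j] is None:               # empty slot: claim it
                self.keys[j], self.values[j] = key, value
                self.count += 1
                return
            if self.keys[j] == key:                # existing key: update
                self.values[j] = value
                return

    def lookup(self, key):
        for j in self._probe(key):
            if self.keys[j] is None:               # empty slot ends the search
                return None
            if self.keys[j] == key:
                return self.values[j]
        return None

    def _grow(self):
        """Double the array and rehash all keys."""
        old = [(k, v) for k, v in zip(self.keys, self.values) if k is not None]
        m = 2 * len(self.keys)
        self.keys, self.values, self.count = [None] * m, [None] * m, 0
        for k, v in old:
            self.insert(k, v)
```

Note that `lookup` stops at the first empty slot, which is why a long run of occupied slots (a cluster) makes both lookups and inserts slower.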
Resolving collisions • Open addressing with double hashing • Use a second hash function. • The probe sequence is: (h(k) + i*h2(k)) mod m, with i = 0, 1, 2, ... • Good performance • Since we use a second function, keys that originally collide will usually have different probe sequences afterwards. • Avoids the clustering seen with linear probing • A good choice for h2(k) is p - (k mod p), where p is a prime less than m; this value is never 0, so the probe always advances.
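The probe sequence can be illustrated with a small sketch. It assumes integer keys, h(k) = k mod m as the primary hash, and the sample sizes m = 13 and p = 11. Only the formula for h2 comes from the slide.

```python
def probe_sequence(k, m, p):
    """Return the m probe slots (h(k) + i*h2(k)) mod m for i = 0..m-1."""
    h = k % m              # assumed primary hash
    h2 = p - (k % p)       # second hash, always in the range 1..p
    return [(h + i * h2) % m for i in range(m)]


# Keys 5 and 18 collide on the first probe (both hash to slot 5 when m = 13)
# but then follow different sequences.
seq_a = probe_sequence(5, 13, 11)    # starts 5, 11, 4, 10
seq_b = probe_sequence(18, 13, 11)   # starts 5, 9, 0, 4
```

Because m is prime in this example, every step size h2(k) is relatively prime to m, so each sequence visits all 13 slots exactly once.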
Scope issues • Block-structured languages allow nested name scopes. • Usual visibility rules • Only names created in the current or enclosing scopes are visible • When there is a conflict, the innermost declaration takes precedence.
Scope issues • One idea is to have a global hash table and save the scope information for each entry. • When an identifier goes out of scope, scan the table and remove the corresponding entries • We may even link all same-scope entries together for easier removal. • Careful: deleting from a hash table that uses open addressing is tricky • We must mark a slot as Deleted, rather than Empty, otherwise later LookUp operations may fail.
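A single global table with per-entry scope information could be sketched as follows. A Python `dict` stands in for the hash table, and the class and method names are hypothetical. Each name maps to a stack of declarations, and the names declared in each open scope are linked together in a list so they can be removed without scanning the whole table.

```python
class ScopedGlobalTable:
    """One global table; every entry records the scope that declared it."""

    def __init__(self):
        self.table = {}           # name -> list of (scope_level, info), innermost last
        self.scope_names = [[]]   # per open scope: names declared in that scope

    def enter_scope(self):
        self.scope_names.append([])

    def declare(self, name, info):
        level = len(self.scope_names) - 1
        self.table.setdefault(name, []).append((level, info))
        self.scope_names[-1].append(name)      # link same-scope entries together

    def lookup(self, name):
        entries = self.table.get(name)
        return entries[-1][1] if entries else None   # innermost declaration wins

    def exit_scope(self):
        # Remove only the entries created in the scope being closed.
        for name in self.scope_names.pop():
            entries = self.table[name]
            entries.pop()
            if not entries:
                del self.table[name]
```

The built-in `dict` handles its own deletions, so the Deleted-marker problem mentioned above only arises when the open-addressing table is written by hand.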
Scope issues • Another idea is to maintain a separate, local hash table for each scope. • We may store the tables in a tree or a stack (that mirrors the stack frames).
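The stack variant might look like this minimal sketch, again using a Python `dict` for each local table. The class and method names are hypothetical.

```python
class ScopeStack:
    """One local table per scope, kept on a stack."""

    def __init__(self):
        self.tables = [{}]                 # bottom of the stack: global scope

    def enter_scope(self):
        self.tables.append({})             # push a fresh table for the new block

    def exit_scope(self):
        self.tables.pop()                  # drops every name in the scope at once

    def declare(self, name, info):
        self.tables[-1][name] = info       # always declare in the current scope

    def lookup(self, name):
        # Search from the innermost scope outwards.
        for table in reversed(self.tables):
            if name in table:
                return table[name]
        return None
```

Leaving a scope is a single pop, at the cost of a lookup that may have to search one table per enclosing scope.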
Structure tables • Where should we store struct field names? • Separate mini symbol table for each struct • Conceptually easy • Separate table for all struct field names • We need to somehow uniquely map each name to its structure (e.g. by concatenating the field name with the struct name) • No special storage • struct field names are stored in the regular symbol table. • Again we need to be able to map each name to its structure.
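The first two options can be contrasted in a short sketch. The struct names, field names, types and offsets are invented, and the `"struct.field"` key format is just one possible way to concatenate the two names.

```python
# Option 1: a separate mini symbol table for each struct.
struct_tables = {
    "point": {"x": ("int", 0), "y": ("int", 4)},
    "pixel": {"x": ("short", 0), "color": ("int", 4)},
}


def lookup_field_v1(struct_name, field_name):
    return struct_tables[struct_name].get(field_name)


# Option 2: one table for all field names, keyed by struct name + field name.
field_table = {}


def field_key(struct_name, field_name):
    """Concatenate the names so that each field maps to exactly one struct."""
    return f"{struct_name}.{field_name}"


for sname, fields in struct_tables.items():
    for fname, info in fields.items():
        field_table[field_key(sname, fname)] = info


def lookup_field_v2(struct_name, field_name):
    return field_table.get(field_key(struct_name, field_name))
```

Both structs declare a field named x, and in either layout the two fields stay distinct because the struct name takes part in the lookup.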