Bernd Fischer RW713: Compiler and Software Language Engineering
Contextual Analysis
Appel’s view of a compiler
Contextual Analysis
  “Semantic” analysis is a misnomer
    – context-sensitive syntax
  Determine whether program is well-formed
    – scope rules (i.e., visibility)
    – type rules (i.e., compatibility)
  Two phases
    – name resolution: what does an identifier refer to?
    – type checking: can you do that with it?
Name Resolution
  Nested block structure gives multiple scope levels
  Match identifier use with corresponding declaration:

    int i, j;
    ...
    void f(int k) {
      char i;
      ...
      for (int i = 0; i < 10; i++) {
        ...
        if (i > 5) { ... }
      }
      i = "a";
    }
    ...
    j = i + 1;
Names vs. Symbols
  Operations on strings are expensive
    – comparison, hashing, …
  Instead:
    – enter all names into common area (lexeme pool)
    – can be done by lexer
    – work with symbols
Representing Symbols

    package Symbol;

    public class Symbol {
      private String name;
      private Symbol(String n) { name = n; }

      // pool of all symbols created so far, keyed by interned name
      private static java.util.Dictionary pool = new java.util.Hashtable();

      public String toString() { return name; }

      // returns the unique Symbol for n, creating it on first use
      public static Symbol toSymbol(String n) {
        String u = n.intern();
        Symbol s = (Symbol) pool.get(u);
        if (s == null) {
          s = new Symbol(u);
          pool.put(u, s);
        }
        return s;
      }
    }
Symbol Tables
  Mapping symbols (identifiers) ↦ attributes
    – constant: type, value, …
    – variable: type, pointer to declaration, …
    – method: return and argument types, modifiers, …
    – class: pointer to (public) symbol table, …
  Also called environments = set of bindings
  Must cope with
    – nested scopes: block structure
    – parallel scopes: multiple name spaces
  Efficiency is important (but not paramount…)
    – lookup is most common operation
Symbol Table Interface (signatures only)

    public class Table {
      public Table();
      public void put(Symbol key, Object value);
      public Object get(Symbol key);
      public void beginScope();
      public void endScope();
      public java.util.Enumeration keys();
    }
Symbol Table Implementation
  Use hash table for efficiency
    – allow multiple entries with same key (external chaining)
  New binding for identifier x added to head of list for hash table bucket
    – “hides” earlier occurrences
  Uses auxiliary stack to implement “undo”:
    – beginScope() pushes marker onto stack
    – subsequent entries recorded on stack
    – endScope() pops symbols from stack, removes binding from table
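The undo-stack scheme above can be sketched roughly as follows. This is only an illustration under some assumptions: ScopedTable and ScopedTableDemo are hypothetical names (not the Table class from the slides), keys are plain Strings rather than Symbols to keep the sketch self-contained, and java.util.HashMap with a per-key stack of bindings stands in for hand-written bucket chaining.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a scoped symbol table; values must be non-null.
class ScopedTable<V> {
    // Sentinel pushed by beginScope(); compared by identity.
    private static final Object MARKER = new Object();

    // Each key maps to a stack of bindings; the top one is the visible one.
    private final Map<String, Deque<V>> bindings = new HashMap<>();
    // Undo stack: keys in insertion order, interleaved with scope markers.
    private final Deque<Object> undo = new ArrayDeque<>();

    void beginScope() {
        undo.push(MARKER);
    }

    void put(String key, V value) {
        // The new binding goes on top and hides any earlier one.
        bindings.computeIfAbsent(key, k -> new ArrayDeque<>()).push(value);
        undo.push(key);
    }

    V get(String key) {
        Deque<V> stack = bindings.get(key);
        return (stack == null || stack.isEmpty()) ? null : stack.peek();
    }

    void endScope() {
        // Pop keys until the marker of the innermost scope is reached.
        while (!undo.isEmpty()) {
            Object top = undo.pop();
            if (top == MARKER) {
                return;
            }
            String key = (String) top;
            Deque<V> stack = bindings.get(key);
            stack.pop();
            if (stack.isEmpty()) {
                bindings.remove(key);
            }
        }
    }
}

class ScopedTableDemo {
    public static void main(String[] args) {
        ScopedTable<String> env = new ScopedTable<>();
        env.put("i", "int");               // global: int i
        env.put("j", "int");               // global: int j
        env.beginScope();                  // body of f
        env.put("k", "int");               // parameter k
        env.put("i", "char");              // local char i hides global i
        System.out.println(env.get("i"));  // prints char
        env.endScope();
        System.out.println(env.get("i"));  // prints int
        System.out.println(env.get("k"));  // prints null
    }
}
```

The demo mirrors the Name Resolution example: inside f the local char i is visible, and after endScope() the global int i is visible again while k is gone.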
Hyperscope
  Standard Environment
    – standard collection of types, functions, etc.
    – can be used without declaration / import
    – e.g., built-in Pascal functions, java.lang
  Initialize symbol table with these
    – at outermost scope level
    – watch for re-definitions (if required by language)
Multiple Name Spaces
  In many languages, a single symbol table is not sufficient:

    let type a = int
        var a : a = 5
        var b : a = a
    in ... end

  Type and variable declarations must be visible at the same time!
  Solutions:
    – change symbol table type: Tag × Symbol
        public void put(Tag tbl, Symbol key, Object value);
        public Object get(Tag tbl, Symbol key);
    – separate tables for types and variables
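A minimal sketch of the Tag × Symbol variant, assuming two name spaces only. NameSpaceTag, TaggedTable and TaggedTableDemo are hypothetical names, keys are Strings instead of Symbols, and scoping is left out to keep the example short.

```java
import java.util.EnumMap;
import java.util.HashMap;
import java.util.Map;

// Hypothetical tag type: one constant per name space.
enum NameSpaceTag { TYPE, VAR }

// Hypothetical table keyed by (tag, name); no scope handling in this sketch.
class TaggedTable {
    private final Map<NameSpaceTag, Map<String, Object>> spaces =
            new EnumMap<>(NameSpaceTag.class);

    void put(NameSpaceTag tbl, String key, Object value) {
        spaces.computeIfAbsent(tbl, t -> new HashMap<>()).put(key, value);
    }

    Object get(NameSpaceTag tbl, String key) {
        Map<String, Object> space = spaces.get(tbl);
        return space == null ? null : space.get(key);
    }
}

class TaggedTableDemo {
    public static void main(String[] args) {
        TaggedTable env = new TaggedTable();
        env.put(NameSpaceTag.TYPE, "a", "int");  // type a = int
        env.put(NameSpaceTag.VAR, "a", "a");     // var a : a  (value = declared type name)
        env.put(NameSpaceTag.VAR, "b", "a");     // var b : a

        // The same name resolves differently depending on the name space.
        String declared = (String) env.get(NameSpaceTag.VAR, "a");
        Object resolved = env.get(NameSpaceTag.TYPE, declared);
        System.out.println(declared);  // prints a
        System.out.println(resolved);  // prints int
    }
}
```

Both bindings for a coexist because the tag is part of the key, so neither put overwrites the other.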
Multiple Name Spaces (2)
  Modules/classes have separate name spaces
    – accessible by qualification / import
    – separate tables for each compilation unit

    class A { public static int x = 2; }
    class B { public static int x = A.x; }
    class Test {
      public static void main(String[] args) {
        System.out.println("" + (A.x + B.x));
      }
    }
Multiple Name Spaces (2)
  Separate compilation requires persistent symbol tables
    – Modula-2: .sym files
    – C: delayed to linker, no real error check
Type Checking
  Checking that identifiers are used correctly (i.e., in accordance with their [declared] type)
  Statically-typed languages
    – (Fortran, C), Ada, C++, Java, ML, Haskell, …
    – types always known at compile time
  Dynamically-typed languages
    – Lisp, Scheme, Smalltalk, Python, …
    – types determined at run time

    int x;
    void f() { f = x(2); }
Type Checking (2)
  Recursive walk over tree
  Uses current environment

          AssignmentStmt
           /          \
      Variable     Expression

    – look up type of Variable
    – check and determine type of Expression
    – check for compatibility
  Compatibility rules
    – subtyping
    – coercion: insert operations to enforce compatibility
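The three steps for an assignment can be sketched as a recursive check over a tiny AST. Everything here is an assumption for illustration: the names Ty, TypeError, Expr, IntLit, VarRef, Add, AssignStmt and TypeCheckDemo are hypothetical, the environment is a plain Map, and the only compatibility rule modelled is an int-to-double coercion (no subtyping).

```java
import java.util.HashMap;
import java.util.Map;

// All names below are hypothetical and exist only for this sketch.
enum Ty { INT, DOUBLE, BOOLEAN }

class TypeError extends RuntimeException {
    TypeError(String message) { super(message); }
}

interface Expr {
    // Returns the type of this expression in the given environment.
    Ty check(Map<String, Ty> env);
}

class IntLit implements Expr {
    public Ty check(Map<String, Ty> env) { return Ty.INT; }
}

class VarRef implements Expr {
    final String name;
    VarRef(String name) { this.name = name; }
    public Ty check(Map<String, Ty> env) {
        Ty t = env.get(name);  // look up the declared type
        if (t == null) throw new TypeError("undeclared variable " + name);
        return t;
    }
}

class Add implements Expr {
    final Expr left;
    final Expr right;
    Add(Expr left, Expr right) { this.left = left; this.right = right; }
    public Ty check(Map<String, Ty> env) {
        Ty l = left.check(env);   // recursive walk over the subtrees
        Ty r = right.check(env);
        if (l == Ty.BOOLEAN || r == Ty.BOOLEAN) {
            throw new TypeError("+ needs numeric operands");
        }
        return (l == Ty.DOUBLE || r == Ty.DOUBLE) ? Ty.DOUBLE : Ty.INT;
    }
}

class AssignStmt {
    final VarRef target;
    final Expr value;
    AssignStmt(VarRef target, Expr value) { this.target = target; this.value = value; }

    // Returns true if an int-to-double coercion would have to be inserted.
    boolean check(Map<String, Ty> env) {
        Ty lhs = target.check(env);  // type of Variable
        Ty rhs = value.check(env);   // type of Expression
        if (lhs == rhs) return false;
        if (lhs == Ty.DOUBLE && rhs == Ty.INT) return true;
        throw new TypeError("cannot assign " + rhs + " to " + lhs);
    }
}

class TypeCheckDemo {
    public static void main(String[] args) {
        Map<String, Ty> env = new HashMap<>();
        env.put("x", Ty.DOUBLE);
        env.put("n", Ty.INT);

        // x = n + 1: compatible, but needs a coercion
        AssignStmt ok = new AssignStmt(new VarRef("x"), new Add(new VarRef("n"), new IntLit()));
        System.out.println(ok.check(env));  // prints true

        // n = x: rejected
        AssignStmt bad = new AssignStmt(new VarRef("n"), new VarRef("x"));
        try {
            bad.check(env);
        } catch (TypeError e) {
            System.out.println(e.getMessage());  // prints cannot assign DOUBLE to INT
        }
    }
}
```

A real checker would rewrite the tree to insert the coercion node; this sketch only reports that one is needed.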
Type Checking Function Calls
  Type of greater is int × int → boolean
  To check applied occurrence:
    – look up greater in symbol table
    – check types of actual parameters match
    – result type of function (boolean) becomes result type of function call

    boolean greater(int x, int y) { return (x > y); }
    ...
    if (greater(a, b)) { ... }
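Checking the call greater(a, b) could look roughly like this. The sketch assumes exact matching of parameter types (no coercion or subtyping), and BaseType, FunSig, CallChecker and CallCheckDemo are hypothetical names introduced only for the example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical names throughout; parameter types must match exactly.
enum BaseType { INT, BOOLEAN }

// Function type: parameter types and result type.
class FunSig {
    final List<BaseType> params;
    final BaseType result;
    FunSig(BaseType result, BaseType... params) {
        this.result = result;
        this.params = Arrays.asList(params);
    }
}

class CallChecker {
    private final Map<String, FunSig> functions = new HashMap<>();

    void declare(String name, FunSig sig) {
        functions.put(name, sig);
    }

    // Returns the type of the call expression, or throws if it is ill-typed.
    BaseType checkCall(String name, List<BaseType> actuals) {
        FunSig sig = functions.get(name);  // look up function in the table
        if (sig == null) {
            throw new IllegalArgumentException("undeclared function " + name);
        }
        if (!sig.params.equals(actuals)) {  // compare actuals with formals
            throw new IllegalArgumentException(
                    name + " expects " + sig.params + " but got " + actuals);
        }
        return sig.result;  // result type becomes the type of the call
    }
}

class CallCheckDemo {
    public static void main(String[] args) {
        CallChecker checker = new CallChecker();
        // greater : int × int → boolean
        checker.declare("greater", new FunSig(BaseType.BOOLEAN, BaseType.INT, BaseType.INT));

        BaseType t = checker.checkCall("greater", Arrays.asList(BaseType.INT, BaseType.INT));
        System.out.println(t);  // prints BOOLEAN
    }
}
```

Since the call has type boolean, it is acceptable as the condition of the if statement in the slide example.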
Name-Binding Languages
Type Systems
Structural Operational Semantics
Reference Attribute Grammars
NaBL (Name Binding Language)
  domain-specific language to describe scope rules
  part of compiler-generation tool suite (Spoofax)
  compiled into name resolution algorithm
  semantics based on scope graphs
Definitions and References
Unique and Non-Unique Definitions
Namespaces
Scope
C# Namespaces are Scopes
Imports
Interaction with Type System (1)
Interaction with Type System (2)
Interaction with Type System (3)
NaBL use cases
  static checking
  editor services
  transformation
  refactoring
  code generation
NaBL use cases – reference resolution
NaBL use cases – code completion