COMPILERS Semantic Analysis hussein suleman uct csc3003s 2009.

Slides:



Advertisements
Similar presentations
Symbol Table.
Advertisements

CSC 4181 Compiler Construction Scope and Symbol Table.
Programming Languages and Paradigms
CSI 3120, Implementing subprograms, page 1 Implementing subprograms The environment in block-structured languages The structure of the activation stack.
Subprograms The basic abstraction mechanism. Functions correspond to the mathematical notion of computation input output procedures affect the environment,
Chapter 9 Subprogram Control Consider program as a tree- –Each parent calls (transfers control to) child –Parent resumes when child completes –Copy rule.
ISBN Chapter 10 Implementing Subprograms.
5. Semantic Analysis Prof. O. Nierstrasz Jorge Ressia Thanks to Jens Palsberg and Tony Hosking for their kind permission to reuse and adapt the CS132 and.
1 Pertemuan 20 Run-Time Environment Matakuliah: T0174 / Teknik Kompilasi Tahun: 2005 Versi: 1/6.
Run time vs. Compile time
Context-sensitive Analysis, II Ad-hoc syntax-directed translation, Symbol Tables, andTypes.
The environment of the computation Declarations introduce names that denote entities. At execution-time, entities are bound to values or to locations:
Chapter 9: Subprogram Control
1 Run time vs. Compile time The compiler must generate code to handle issues that arise at run time Representation of various data types Procedure linkage.
Cs164 Prof. Bodik, Fall Symbol Tables and Static Checks Lecture 14.
Chapter 7: Runtime Environment –Run time memory organization. We need to use memory to store: –code –static data (global variables) –dynamic data objects.
5. Semantic Analysis Prof. O. Nierstrasz Thanks to Jens Palsberg and Tony Hosking for their kind permission to reuse and adapt the CS132 and CS502 lecture.
COMPILERS Semantic Analysis hussein suleman uct csc3005h 2006.
1 Semantic Analysis Aaron Bloomfield CS 415 Fall 2005.
Names and Scope. Scope Suppose that a name is used many times for different entities in text of the program, or in the course of execution. When the name.
Compiler Construction
COMPILERS Symbol Tables hussein suleman uct csc3003s 2007.
Basic Semantics Associating meaning with language entities.
Implementing Subprograms What actions must take place when subprograms are called and when they terminate? –calling a subprogram has several associated.
BİL 744 Derleyici Gerçekleştirimi (Compiler Design)1 Run-Time Environments How do we allocate the space for the generated target code and the data object.
COMP3190: Principle of Programming Languages
A.Alzubair Hassan Abdullah Dept. Computer Sciences Kassala University A.Alzubair Hassan Abdullah Dept. Computer Sciences Kassala University NESTED SUBPROGRAMS.
Concepts of programming languages Chapter 5 Names, Bindings, and Scopes Lec. 12 Lecturer: Dr. Emad Nabil 1-1.
Bernd Fischer RW713: Compiler and Software Language Engineering.
7. Runtime Environments Zhang Zhizheng
1 Structure of Compilers Lexical Analyzer (scanner) Modified Source Program Parser Tokens Semantic Analysis Syntactic Structure Optimizer Code Generator.
Names, Scope, and Bindings Programming Languages and Paradigms.
1 Compiler Construction Run-time Environments,. 2 Run-Time Environments (Chapter 7) Continued: Access to No-local Names.
Run-Time Environments Presented By: Seema Gupta 09MCA102.
Code Generation Instruction Selection Higher level instruction -> Low level instruction Register Allocation Which register to assign to hold which items?
Implementing Subprograms
Lecture 9 Symbol Table and Attributed Grammars
Chapter 10 : Implementing Subprograms
Implementing Subprograms Chapter 10
Implementing Subprograms
Context-Sensitive Analysis
Type Checking, and Scopes
Constructing Precedence Table
Compiler Construction (CS-636)
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
CS 326 Programming Languages, Concepts and Implementation
Implementing Subprograms
Semantic Analysis with Emphasis on Name Analysis
Subprograms The basic abstraction mechanism.
Implementing Subprograms
Chapter 10: Implementing Subprograms Sangho Ha
Implementing Subprograms
Implementing Subprograms
CMPE 152: Compiler Design October 4 Class Meeting
Chapter 6 Intermediate-Code Generation
Lecture 15 (Notes by P. N. Hilfinger and R. Bodik)
Implementing Subprograms
PZ09A - Activation records
Activation records Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Section
UNIT V Run Time Environments.
Names and Binding In Text: Chapter 5.
Run Time Environments 薛智文
COMPILERS Semantic Analysis
Runtime Environments What is in the memory?.
Activation records Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Section
Implementing Subprograms
Activation records Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Section
Activation records Programming Language Design and Implementation (4th Edition) by T. Pratt and M. Zelkowitz Prentice Hall, 2001 Section
Implementing Subprograms
Chapter 10 Def: The subprogram call and return operations of
Presentation transcript:

COMPILERS Semantic Analysis hussein suleman uct csc3003s 2009

Semantic Analysis  The compilation process is driven by the syntactic structure of the program as discovered by the parser.  Semantic routines: interpret meaning of the program based on its syntactic structure it has two functions:  finish analysis by deriving context-sensitive information  begin synthesis by generating the IR or target code  Associated with individual productions of a context free grammar or subtrees of a syntax tree.

Context Sensitive Analysis - Why  What context-sensitive issues can be determined? Is X declared before it is used? Are any names declared but not used? Which declaration of X does this reference? Is an expression type-consistent? Do the dimensions of a reference match the declaration? Where can X be stored? (heap, stack,,,, ) Does *p reference the result of a malloc()? Is X defined before it is used? Is an array reference in bounds? Does function foo(…) produce a constant value?

Context Sensitive Analysis - How  How to check symbols and their semantics at various points in the program? Process program linearly (roughly, in-order tree traversal). Maintain a list of currently defined symbols and what they mean as the program is processed – this is called a Symbol Table.

Symbol Tables  Associate lexical names (symbols) with their attributes.  Can contain: variable names defined constants procedure/function/method names literal constants and strings source text labels compiler-generated temporaries subtables for structure layouts (types) (field offsets and lengths)

Symbol Table Attributes  Attributes are internal representations of declarations.  The following attributes would be kept in a symbol table: textual name data type dimension information (for aggregates) declaring procedure lexical level of declaration storage class (base address) offset in storage if record, pointer to structure table if parameter, by-reference or by-value? can it be aliased? to what other names? number and type of arguments to functions/methods

Attribute Differentiation  Names may have different attributes depending on their meaning: variables: type, procedure level, frame offset types: type descriptor, data size/alignment constants: type, value methods: formals (names/types), result type, block information (local decls.), frame size

Binding  As the declarations of types, variables, and functions are processed, identifiers are bound to “meanings” in the symbol table.  A symbol table is a set of bindings.  … But this binding is not static – it changes over the course of the program.

Scope  An identifier has scope when it is visible and can be referenced.  An out-of-scope identifier cannot be referenced.  Identifiers in open scopes may override older/outer scopes temporarily.  2 Types of scope: Static scope is when visibility is due to the lexical nesting of subprograms/blocks. Dynamic scope is when visibility is due to the call sequence of subprograms.

Basic Static Scope  Usually, a name begins life where it is declared and ends at the end of its block. void foo() { int k; ……. }

Why Scope?  Scope is not necessary. Languages such as assembler have exactly one scope: the whole program.  Modern programming languages have more than one scope. Information hiding and modularity.  Goal of any language is to make the programmer’s job simpler. One way: keep things isolated. Make each thing only affect a limited area. Make it hard to break something far away.

Changing Scope  Identifiers come into scope at the beginning of a subprogram/block and go out of scope at the end.  Example (in C++): void testfunc () { int a; // a enters scope; for ( int b=1; b<10; b++ ) // b in scope for for { int c; // c enters scope … } // b,c leave scope … } // a leaves scope

Types of Scope  Static Scope Accessibility of variables depends on lexical structure of program. Determined at compile time / “programming time”.  Dynamic Scope Accessibility of variables depends on call sequence of program. Determined at runtime.

Static Scope  Consider the Pascal program (which uses static scoping): program test; var a : integer; procedure proc1; var b : integer; begin end; procedure proc2; var a, c : integer; begin proc1; end; begin proc2; end. in scope: a (from test) in scope: a, c (from proc2) in scope: b (from proc1), a (from test)

Dynamic Scope  Consider the Pascal-like code (assume dynamic scoping): program test; var a : integer; procedure proc1; var b : integer; begin end; procedure proc2; var a, c : integer; begin proc1; end; begin proc2; end. in scope: a (from test) in scope: a, c (from proc2) in scope: b (from proc1) a, c (from proc2)

Static vs. Dynamic Scope  Dynamic scope makes it easier to access variables with lifetime, but it is difficult to understand the semantics of code outside the context of execution.  Static scope is more restrictive – therefore easier to read – but may force the use of more subprogram parameters or global identifiers to enable visibility when required.

Scope in a symbol table  Most modern programming languages have nested static scope. The symbol table must reflect this.  What additional information can reflect nested scope? A name query must access the most recent declaration, from the current scope or some enclosing scope. Innermost scope overrides declarations from outer scopes.

Scope and Symbol Table Operations  What symbol table operations do we need? void put (Symbol key, Object value)  binds key to value Object get(Symbol key)  returns value bound to key void beginScope()  remembers current state of table void endScope()  restores table to state at most recent scope that has not been ended

Symbol Table Implementation  Implemented as a collection of dictionaries in which each symbol is placed.  Many different possible data structures: array linked list hash table binary tree

Symbol Table Lookup  Basic operation is to find the entry for a given symbol.  Each symbol table may have a pointer to its parent scope.  Lookup: if symbol in current table, return it, otherwise look in parent.  Hash tables and binary trees can be used more efficiently.

Types of Implementation  Imperative Auxiliary data structures are modified as the analysis progresses, always reflecting only the current state.  Functional Auxiliary data structures are maintained intact as the analysis progresses, with new versions created when needed – thus previous and current states are all available at any time.

Hash Table - Imperative  Principle: Use a stack of scope pointers to lists of entries in each scope.  put Chain new entries to beginning of table, thus overriding older entries. New entries are linked together, headed by scope pointer.  beginScope Create new list pointer for new entries. Save old pointer on stack  endScope Using list pointer, remove entries from head of each linked list.

Hash Table - Functional  Principle: Create copies of array for every open scope.  put Chain new entries to beginning of table, thus overriding older entries.  beginScope Functional - Create copy of hash table array.  endScope Functional – Dispose of array.

Binary Tree - Functional  Principle: Duplicate node list to root.  put Insert new entries into a new subtree, duplicating nodes up to the root.  endScope Delete all nodes in new subtree. valueC 3 valueA 4 valueA 2 valueB 1 root before second “valueA” root after second “valueA”

Symbols vs. Names  Names are the textual entities found in the source code.  Symbols are entities assigned to each name for more efficient processing during compilation.  Example: Name: valueA  Symbol: V001 Name: valueB  Symbol: V002  Remember perfect hash functions?

Type Checking  Static semantics should be checked after/as the symbol table is populated. Is every name defined before it is used? Does the type of each subexpression conform to what is expected? Are the types on either side of an assignment compatible?  The tree can be walked/visited to perform these checks. May need multiple passes – so retain symbol table across passes.

Type Equivalence  Two approaches: Name equivalence: each type name is a distinct type. Structural equivalence: two types are equivalent iff. they have the same structure (after substituting type expressions for type names).  Example (structural): typedef int bignumber; int c; bignumber b =c;

Error Handling  If errors are detected, correct program representation and continue analysis to detect other errors.  Example: int a, b; String c; c = a; b = a;