6. Semantic Analysis. Semantic Analysis Phase – Purpose: compute additional information needed for compilation that is beyond the capabilities of Context-

Slides:



Advertisements
Similar presentations
CS3012: Formal Languages and Compilers Static Analysis the last of the analysis phases of compilation type checking - is an operator applied to an incompatible.
Advertisements

CPSC 388 – Compiler Design and Construction
Semantic Analysis and Symbol Tables
Symbol Table.
Semantics Static semantics Dynamic semantics attribute grammars
Intermediate Code Generation
Chapter 6 Type Checking. The compiler should report an error if an operator is applied to an incompatible operand. Type checking can be performed without.
Semantic Analysis Chapter 6. Two Flavors  Static (done during compile time) –C –Ada  Dynamic (done during run time) –LISP –Smalltalk  Optimization.
Programming Languages and Paradigms
Control Structures Any mechanism that departs from straight-line execution: –Selection: if-statements –Multiway-selection: case statements –Unbounded iteration:
1 Compiler Construction Intermediate Code Generation.
Names and Bindings.
INF 212 ANALYSIS OF PROG. LANGS Type Systems Instructors: Crista Lopes Copyright © Instructors.
Chapter 5: Elementary Data Types Properties of types and objects –Data objects, variables and constants –Data types –Declarations –Type checking –Assignment.
Type Checking.
Compiler Construction
Compiler Principle and Technology Prof. Dongming LU Mar. 28th, 2014.
Semantic analysis Parsing only verifies that the program consists of tokens arranged in a syntactically-valid combination, we now move on to semantic analysis,
Tutorial 6 & 7 Symbol Table
Hash Tables1 Part E Hash Tables  
Hash Tables1 Part E Hash Tables  
Chapter 3 Program translation1 Chapt. 3 Language Translation Syntax and Semantics Translation phases Formal translation models.
Introducing Hashing Chapter 21 Copyright ©2012 by Pearson Education, Inc. All rights reserved.
1 ES 314 Advanced Programming Lec 2 Sept 3 Goals: Complete the discussion of problem Review of C++ Object-oriented design Arrays and pointers.
Compiler Construction Semantic Analysis I Rina Zviel-Girshin and Ohad Shacham School of Computer Science Tel-Aviv University.
Cs164 Prof. Bodik, Fall Symbol Tables and Static Checks Lecture 14.
CSC 8310 Programming Languages Meeting 2 September 2/3, 2014.
1. 2 Problem RT&T is a large phone company, and they want to provide enhanced caller ID capability: –given a phone number, return the caller’s name –phone.
2.2 A Simple Syntax-Directed Translator Syntax-Directed Translation 2.4 Parsing 2.5 A Translator for Simple Expressions 2.6 Lexical Analysis.
CSc 453 Semantic Analysis Saumya Debray The University of Arizona Tucson.
Symbol Tables Symbol tables are used by compilers to keep track of information about variables functions class names type names temporary variables etc.
CSC3315 (Spring 2009)1 CSC 3315 Programming Languages Hamid Harroud School of Science and Engineering, Akhawayn University
1 Semantic Analysis Aaron Bloomfield CS 415 Fall 2005.
Semantic Analysis1 Checking what parsers cannot.
Lesson 11 CDT301 – Compiler Theory, Spring 2011 Teacher: Linus Källberg.
COMPILER CONSTRUCTION
1 Compiler Construction (CS-636) Muhammad Bilal Bashir UIIT, Rawalpindi.
Compiler Construction Compiler Construction Semantic Analysis I.
1 Mooly Sagiv and Greta Yorsh School of Computer Science Tel-Aviv University Modern Compiler Design.
410/510 1 of 18 Week 5 – Lecture 1 Semantic Analysis Compiler Construction.
Unit-1 Introduction Prepared by: Prof. Harish I Rathod
Chapter 10 Hashing. The search time of each algorithm depend on the number n of elements of the collection S of the data. A searching technique called.
1 Compiler Construction (CS-636) Muhammad Bilal Bashir UIIT, Rawalpindi.
CS536 Semantic Analysis Introduction with Emphasis on Name Analysis 1.
Chapter 3 Syntax, Errors, and Debugging Fundamentals of Java.
Chapter 1 Introduction Major Data Structures in Compiler
COSC 2007 Data Structures II Chapter 13 Advanced Implementation of Tables IV.
1 A Simple Syntax-Directed Translator CS308 Compiler Theory.
1 CSCD 326 Data Structures I Hashing. 2 Hashing Background Goal: provide a constant time complexity method of searching for stored data The best traditional.
1 Compiler Construction (CS-636) Muhammad Bilal Bashir UIIT, Rawalpindi.
Semantic Analysis II Type Checking EECS 483 – Lecture 12 University of Michigan Wednesday, October 18, 2006.
CPS120: Introduction to Computer Science Variables and Constants.
Compiler Construction CPCS302 Dr. Manal Abdulaziz.
LECTURE 3 Compiler Phases. COMPILER PHASES Compilation of a program proceeds through a fixed series of phases.  Each phase uses an (intermediate) form.
CS412/413 Introduction to Compilers Radu Rugina Lecture 11: Symbol Tables 13 Feb 02.
Scope of Variable. 2 Scope and visibility Scope (visibility) of identifier = portion of program where identifier can be referred to Lexical scope = textual.
Compiler Principle and Technology Prof. Dongming LU Apr. 15th, 2015.
Compiler Design Lecture 10 Semantic Analysis. int aintegers int a[2][3]array(2, array(3, integer)) int f(int, float, char) int x float x char  int Int.
Semantic Analysis Chapter 6. Two Flavors  Static (done during compile time) –C –Ada  Dynamic (done during run time) –LISP –Smalltalk  Optimization.
Compiler Design Lecture 10 Semantic Analysis. int aintegers int a[2][3]array(2, array(3, integer)) int f(int, float, char) int x float x char  int Int.
Lecture 3: More Java Basics Michael Hsu CSULA. Recall From Lecture Two  Write a basic program in Java  The process of writing, compiling, and running.
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
Context-Sensitive Analysis
CS 326 Programming Languages, Concepts and Implementation
Hashing CSE 2011 Winter July 2018.
Semantic Analysis Chapter 6.
Semantic Analysis Chapter 6.
Compiler Construction
Compilers Principles, Techniques, & Tools Taught by Jing Zhang
Compiler Construction
Presentation transcript:

6. Semantic Analysis

Semantic Analysis Phase – Purpose: compute additional information needed for compilation that is beyond the capabilities of Context- Free Grammars and Standard Parsing Algorithms – Static semantic analysis: Take place prior to execution (Such as building a symbol table performing type inference and type checking) Classification – Analysis of a program required by the rules of the programming language to establish its correctness and to guarantee proper execution – Analysis performed by a compiler to enhance the efficiency of execution of the translated program

Description of the static semantic analysis – Attribute grammar identify attributes of language entities that must be computed and to write attribute equations or semantic rules that express how the computation of such attributes is related to the grammar rules of the language. – Which is most useful for languages that obey the principle of Syntax-Directed Semantics – Abstract syntax as represented by an abstract syntax tree

Contents 6.1 Attributes and Attribute Grammars 6.3 The Symbol Table 6.4 Data Types and Type Checking

6.1 Attributes and Attribute Grammars

Attributes – Any property of a programming language construct such as The data type of a variable The value of an expression The location of a variable in memory The object code of a procedure The number of significant digits in a number Binding of the attribute – The process of computing an attribute and associating its computed value with the language construct Binding time – The time during the compilation/execution process when the binding of an attribute occurs – Based on the difference of the binding time, attributes is divided into Static attributes (be bound prior to execution) and Dynamic attributes (be bound during execution)

Example: The binding time and significance during compilation of the attributes. – Attribute computations are extremely varied – Type checker A Type Checker is a semantic analyzer that computes the data type attribute of all language entities for which data types are defined and verifies that these types conform to the type rules of the language. In a language like C or Pascal, is an important part of semantic analysis;. While in a language like LISP, data types are dynamic, LISP compiler must generate code to compute types and perform type checking during program execution. – The values of expressions Usually dynamic and the be computed during execution; But sometime can also be evaluated during compilation (constant folding).

6.1.1 Attribute Grammars

In syntax-directed semantics, attributes are associated directly with the grammar symbols of the language. X.a means the value of ‘ a ’ associated to ‘ X ’ – X is a grammar symbol and a is an attribute associated to X Syntax-directed semantics: – Attributes are associated directly with the grammar symbols of the language. – Given a collection of attributes a 1, …, a k, it implies that for each grammar rule X 0  X 1 X 2 … X n (X 0 is a nonterminal), – The values of the attributes X i.a j of each grammar symbol X i are related to the values of the attributes of the other symbols in the rule.

An attribute grammar for attributes a 1, a 2, …,a k is the collection of all attribute equations or semantic rules of the following form, – for all the grammar rules of the language. Xi.aj = f ij (X 0.a 1, …,X 0.a k, …,X 1.a l,….. X 1.a k …, X n-1.a 1, … X n.a k ) Where f ij is a mathematical function of its arguments Typically, attribute grammars are written in tabular form as follows: Grammar Rule Semantic Rules Rule 1 Associated attribute equations... Rule n Associated attribute equation

Example 6.1 consider the following simple grammar for unsigned numbers: Number  number digit | digit Digit  0|1|2|3|4|5|6|7|8|9 The most significant attribute: numeric value (write as val), and the responding attribute grammar is as follows:

The parse tree showing attribute computations for the number 345 is given as follows

Example 6.2 consider the following grammar for simple integer arithmetic expressions: Exp  exp + term | exp-term | term Term  term*factor | factor Factor  (exp)| number The principal attribute of an exp (or term or factor) is its numeric value (write as val) and the attribute equations for the val attribute are given as follows

Given the expression (34-3)*42, the computations implied by this attribute grammar by attaching equations to nodes in a parse tree is as follows

Example 6.3 consider the following simple grammar of variable declarations in a C-like syntax: Decl  type var-list Type  int | float Var-list  id, var-list |id Define a data type attribute for the variables given by the identifiers in a declaration and write equations expressing how the data type attribute is related to the type of the declaration as follows: (We use the name dtype to distinguish the attribute from the nonterminal type)

Note that there is no equation involving the dtype of the nonterminal decl. It is not necessary for the value of an attribute to be specified for all grammar symbols Parse tree for the string float x,y showing the dtype attribute as specified by the attribute grammar above is as follows:

Example 6.4 consider the following grammar, where numbers may be octal or decimal, suppose this is indicated by a one-character suffix o(for octal) or d(for decimal): Based-num  num basechar Basechar  o|d Num  num digit | digit Digit  0|1|2|3|4|5|6|7|8|9 In this case num and digit require a new attribute base, which is used to compute the val attribute. The attribute grammar for base and val is given as follows.

345o

6.3 Symbol Table

The symbol table is the major inherited attribute in a compiler and, after syntax tree, forms the major data structure. The operations of the Symbol table are insert: is used to store the information provided by name declarations when processing these declarations. lookup: is needed to retrieve the information associated to a name when that name is used to associate code delete: is needed to remove the information provided by a declaration when that declaration no longer applies

The structure of the symbol table The symbol table in a compiler is a typical dictionary data structure include linear lists :constant time insert operations, can be chosen if speed is not a concern various search tree structures( binary search trees, AVL trees, B trees):do not provide best case efficiency and complexity of delete operations hash tables: best choice because all three operations can be performed at constant time.

The best scheme for compiler construction is to open addressing, called separate chaining. In this method each bucket is actually a linear list, and collisions are resolved by inserting the new item into the bucket list.

separate chaining. – The size of the linear list ranges from few hundred to over a thousand – Size of the bucket array should be chosen to be a prime number. Hash function for a symbol table – Convert the character string in to an integer between 0…..size-1 Convert each character in to a non negative integer – C language uses ASCII values – Pascal uses ord function Combine all the integers in to a single integer – Programmers choice Resulting integer should be scaled to the range of 0……size-1

separate chaining. – Mod function is used to scale a number to fall in to the range of 0…size-1. – Combine all the integers in to a single integer One simple method is to consider first few characters or first middle and last characters and ignore the rest – Inadequate: as programmers can use variables like temp1,temp2… and will cause frequent collisions Chosen method should reduce the collisions One good method is to multiplying factor(some constant number ) – If ci is the numeric value of the (i+1)th character and hi is the partial is the partial hash value computed at the (i)th step then » h(i+1) = h i +c » This can be generalized to » H=( n-1 c 1 + n-2 c 2 +……. c n-1 +c n )mod size

separate chaining – The choice of the constant in this formula has significant effect on the out come A reasonable choice for the constant is a number which is power of 2 (like 2,4,8,16,32,64,128…)

Declarations. – There is no specific standard to create a symbol table for all compilers. Symbol table will vary from compiler to compiler and it depends on – How each compiler Calls insert and delete operations – How the variables are declared in each compiler. – How long the compiler wants the symbol table to exist – The behavior of the symbol table heavily depends on the properties of declarations of the language being translated. – There are 4 basic kinds of declarations Constant, – Const int SIZE=199 type – Structures and unions variable – Variable declarations :int a,b[100]

Declarations. – procedure/function Are constant declaration of procedure/function type (similar to other variables) These variables are explicit in nature (needs to be declared before use) In few compliers they are implicit in nature (need not be declared before use) – Fortran,basic – We can create one single symbol table for all above 4 declarations Constant declarations(value binding) associate values to names. – Saves them in the symbol table and their value will never change in symbol table – In the execution phase the compiler will change these constant variables names to values by picking them from the symbol table

Declarations. – We can create one single symbol table for all above 4 declarations – Type declarations, Variable declaration and function/procedure declarations also will be saved in the symbol table with minute difference between each other

Scope rules and Block structure: In the above program there are five blocks First block : Variables int i,j and the function f has scope for entire program(global). Second block: the variable size has one scope Third block: variables i and temp has different scope

Scope rules and Block structure: In the above program there are five blocks Third :variables i and temp has different scope » Variable i is of type char but in global scope it is of type int » Until we come out of this 3 rd block (function f) the type of i will be char. When we come out of 3 rd block again variable i will be of type int. » Compiler is giving more priority to local variables. When we come out of the block the variable i of type char will be deleted and again variable i of type int will come in to picture.

Scope rules and Block structure: In the above program there are five blocks Third :variables i and temp has different scope – When processing this 3 rd block the symbol table will look like

Scope rules and Block structure: Variable i is saved twice with both types Variable i of type char is saved first, so when try to look up for variable i, it will return the char variable only (but not int variable) When we come out of the block, i (char) and temp (char) will be deleted from the symbol table

Scope rules and Block structure: In the above program there are five blocks Fourth block :variables j of type double will be added to the table and the symbol table looks like

Scope rules and Block structure: In the above program there are five blocks At the end of the function f block all the variables declared in it are deleted and the symbol table will be

Scope rules and Block structure: One more type of scope declaration is using different symbol tables for each block (as seen in the previous slides) and the practical implementation will look like

37 class Foo { int value; int test() { int b = 3; return value + b; } void setValue(int c) { value = c; { int d = c; c = c + d; value = c; } class Bar { int value; void setValue(int c) { value = c; } scope of value scope of b scope of c scope of value scope of d Symbol table example block1

38 class Foo { int value; int test() { int b = 3; return value + b; } void setValue(int c) { value = c; { int d = c; c = c + d; value = c; } class Bar { int value; void setValue(int c) { value = c; } Symbol table example SymbolKindTypeProperties valuefieldint… testmethod-> int setValuemethodint -> void (Foo) SymbolKindTypeProperties bvarint… (test) SymbolKindTypeProperties cvarint… (setValue) SymbolKindTypeProperties dvarint… (block1)

39 SymbolKindTypeProperties valuefieldint… testmethod-> int setValuemethodint -> void SymbolKindTypeProperties bvarint… SymbolKindTypeProperties cvarint… SymbolKindTypeProperties dvarint… (Foo) (test) … Symbol table example cont. (setValue) (block1)

40 Checking scope rules SymbolKindTypeProperties valuefieldint… testmethod-> int setValuemethodint -> void SymbolKindTypeProperties bvarint… SymbolKindTypeProperties cvarint… SymbolKindTypeProperties dvarint… (Foo) (test)(setValue) (block1) void setValue(int c) { value = c; { int d = c; c = c + d; value = c; } lookup(value)

41 SymbolKindTypeProperties valuefieldint… testmethod-> int setValuemethodint -> void SymbolKindTypeProperties bvarint… SymbolKindTypeProperties cvarint… SymbolKindTypeProperties dvarint… (Foo) (test)(setValue) (block1) void setValue(int c) { value = c; { int d = c; c = c + d; myValue = c; } lookup(myValue) Error ! undefined symbol Catching semantic errors

6.4 Data Types and Type Checking

Principal Task of Compiler Type inference The computation and maintenance of information on data types Type checking Type checking uses logical rules to reason about the behavior of a program at run time. Specifically, it ensures that the types of the operands match the type expected by an operator. For example, the && operator in Java expects its two operands to be booleans; the result is also of type boolean. The use of the information to ensure that each part of the program makes sense under the type rules of the language. These two tasks are related, performed together, and referred as type checking.

Data Type Data type information may be either static or dynamic, or a combination of two. Ex: In LISP type inference and type checking will be done during the execution(dynamic) Ex: In Pascal, C and Ada type information is primarily static, and is used as the principal mechanism for checking the correctness of the program before execution(static)