CS 404 Introduction to Compiler Design, Lecture 9: Semantic Analysis (Ahmed Ezzat)



AST: Abstract Syntax Tree

Semantic Analysis
- Semantic analysis computes additional information about the meaning of the program once its syntactic structure is known; that is, it tries to verify that the program "makes sense."
- After parsing (and constructing the AST), the next phases are semantic analysis and intermediate code generation.
- In typed languages such as C, semantic analysis involves adding information to the symbol table and performing type checking.

Semantic Analysis
- The information to be computed is beyond the capabilities of standard parsing techniques, so it is not regarded as syntax.
- As with lexical and syntax analysis, semantic analysis needs both a representation formalism and an implementation mechanism.
- The representation formalism illustrated in this lecture is Syntax-Directed Translation.

Semantic Analysis: Goals
- The compiler must do more than recognize whether a sentence belongs to the language.
- Find the remaining errors that would make the program invalid:
  - undefined variables and types
  - type errors that can be caught statically
- Figure out useful information for later phases:
  - types of all expressions
  - data layout
- Terminology:
  - Static checks: done by the compiler
  - Dynamic checks: done at run time

Semantic Analysis: Kinds of Checks
- Uniqueness checks
  - Certain names must be unique
  - Many languages require variable declarations
- Flow-of-control checks
  - Match control-flow operators with structures
  - Example: break applies to the innermost loop/switch
- Type checks
  - Check compatibility of operators and operands
- Logical checks
  - The program is syntactically and semantically correct, but does not do the "correct" thing

Semantic Analysis: Examples of Reported Errors
- Undeclared identifier
- Multiply declared identifier
- Index out of bounds
- Wrong number or types of arguments to a call
- Incompatible types for an operation
- Break statement outside a switch/loop
- Goto with no label

Semantic Analysis: Examples of Reported Errors
- How do these checks help compilers?
  - Allocate the right amount of space for variables
  - Select the right machine operations
  - Implement control structures properly
- Try compiling this code:

    void main() {
        int i = 21, j = 42;
        printf("Hello World\n");
        printf("Hello World, N=%d\n");              /* %d has no matching argument */
        printf("Hello World\n", i, j);              /* two arguments, no conversions */
        printf("Hello World, I=%d, J=%d\n", i, j);
    }

Semantic Analysis: Typical Semantic Errors
- Multiple declarations: a variable should be declared (in the same scope) at most once
- Undeclared variable: a variable should not be used before being declared
- Type mismatch: the type of the LHS of an assignment should match the type of the RHS
- Wrong arguments: methods should be called with the right number and types of arguments

Semantic Analysis: Type Checking and Code Generation
- You can type check and generate code as part of the semantic actions, but:
  - the result is difficult to read and maintain
  - the compiler must analyze the program in the same order in which it is parsed
- Instead, we split the work into two tasks, each traversing the AST created by the parser:
  1. For each scope in the program:
     - process the declarations: add new entries to the symbol table and report any variables that are multiply declared
     - process the statements: find uses of undeclared variables, and update the "ID" nodes of the AST to point to the appropriate symbol-table entry
  2. Process all of the statements in the program again:
     - use the symbol-table information to determine the type of each expression and to find type errors
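The first of the two passes (name analysis) can be sketched in Python as a stack of scopes. This is a minimal illustration, not the lecture's implementation: the class name SymbolTable, its method names, and the error strings are all assumptions made for the example.

```python
class SymbolTable:
    """A stack of scopes; each scope maps a name to its declared type."""

    def __init__(self):
        self.scopes = [{}]          # start with the global scope
        self.errors = []

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        self.scopes.pop()

    def declare(self, name, typ):
        scope = self.scopes[-1]
        if name in scope:
            self.errors.append(f"multiply declared: {name}")
        else:
            scope[name] = typ

    def lookup(self, name):
        # Search from the innermost scope outwards.
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        self.errors.append(f"undeclared: {name}")
        return None


table = SymbolTable()
table.declare("i", "int")
table.declare("i", "float")      # second declaration in the same scope
table.enter_scope()
table.declare("i", "char")       # legal: shadows the outer i
inner = table.lookup("i")
table.exit_scope()
outer = table.lookup("i")
table.lookup("k")                # never declared
```

In a fuller compiler, lookup would return a symbol-table entry that the corresponding "ID" node of the AST then points to, so that the second pass can read the type without searching again.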

Semantic Analysis: Examples I
- Check uniqueness of name declarations
  - E.g., declaring both a variable and a function named f, as in int f=2; int f() {return 1;}, is not allowed in Pascal but may be allowed in other languages
- Type checking
  - E.g., int i;
    i = 100;     okay
    i = "abc";   not okay

Semantic Analysis: Examples II
- Is the expression well-formed?
  - [...]: okay in C, not okay in some stricter languages
  - 6 + "abc": a type error in many languages, okay in some others (C, for instance, compiles it as pointer arithmetic on the string literal)
- The same name must appear two or more times
  - Example: defining a block:
    Begin B
    End B

Semantic Analysis: Examples III
- Function calls
  - Types of arguments match the definition
  - Number of arguments matches the definition
  - Return type matches the definition
- Flow-of-control checks
  - "break" only within a loop
  - "return" only within a function
  - "default" or "error" statement for "switch" in C
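A flow-of-control check such as "break only within a loop" can be sketched as a tree walk that carries the current loop nesting depth. The tuple-based AST shape and the function name check_breaks below are assumptions made for this example; switch statements are left out to keep it short.

```python
def check_breaks(node, loop_depth=0, errors=None):
    """Walk a tiny AST and report every 'break' not enclosed by a loop."""
    if errors is None:
        errors = []
    kind = node[0]
    if kind == "break":
        if loop_depth == 0:
            errors.append("break statement outside loop")
    elif kind == "while":
        # ("while", body): the body is checked one loop level deeper.
        check_breaks(node[1], loop_depth + 1, errors)
    elif kind == "block":
        # ("block", [stmt, ...]): statements share the enclosing depth.
        for stmt in node[1]:
            check_breaks(stmt, loop_depth, errors)
    return errors


good = ("block", [("while", ("block", [("break",)]))])
bad = ("block", [("break",), ("while", ("block", [("break",)]))])
```

The loop depth is information passed down from the context, which is the same idea as an inherited attribute later in the lecture.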

Static vs. Dynamic Checking
- Semantic checking at compile time is called "static checking"
- Semantic checking at execution time is called "dynamic checking"
- The more checking is done at compile time, the fewer opportunities for error remain at run time

Semantic Specifications
- Checking against what? Each language has a specification (spec)
- The spec gives the expected types (or other values) of a construct in its context:
  - The C spec used to be "loose", so it left room for different implementations
  - The Java spec is very thorough

How to Check?
- Together with parsing
  - Syntax-Directed Translation (SDT) is one technique for implementing semantic analysis
- As a separate pass
  - Between parsing and code generation
  - Syntax tree → checker → syntax tree

Type Checking
- Most programming languages have basic types and constructed (user-defined) types, e.g., struct in C
- Basic: atomic types
  - E.g., boolean, integer, string
- Constructed: built from basic types
  - E.g., arrays, records, sets
- Not clearly defined: pointers can fall into either category, depending on how they are implemented

Type Expressions
- A basic type is a type expression
- A type name is a type expression
- Type expressions can be constructed from other type expressions:
  - Arrays: array(I, T), contains I elements of type T
  - Products: Cartesian product T1 × T2
  - Records or structs: record(t), struct(t)
  - Pointers: pointer(t)

Common Type Expressions (cont.)
- Function types: TD → TR
  - The domain is the set of possible input values of the function; its type is TD
  - The range is the set of possible output values of the function; its type is TR
  - Example: Z = X mod Y; the type of mod is (int × int) → int
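Type expressions map naturally onto small immutable records. The sketch below uses Python dataclasses; the class names (Basic, Array, Pointer, Product, Function) and the helper apply are assumptions chosen for the example, mirroring array(I, T), pointer(t), T1 × T2 and TD → TR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Basic:
    name: str


@dataclass(frozen=True)
class Array:
    size: int
    elem: object


@dataclass(frozen=True)
class Pointer:
    target: object


@dataclass(frozen=True)
class Product:
    left: object
    right: object


@dataclass(frozen=True)
class Function:
    domain: object
    result: object


INT = Basic("int")
# mod : (int x int) -> int
mod_type = Function(Product(INT, INT), INT)


def apply(fn_type, arg_type):
    """Return the result type of a call, or None on a type error."""
    if isinstance(fn_type, Function) and fn_type.domain == arg_type:
        return fn_type.result
    return None
```

Because the dataclasses are frozen and compare field by field, two separately built type expressions with the same shape compare equal, which is convenient when checking a call against a function type.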

Type System
- A type expression can be represented using a graph
- A type system is a collection of rules for assigning type expressions
- Type checking: verify that operands have the correct types

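Recovery from Type Errors
- Report errors
- Type coercion (implicit casting) is an implicit change of type, e.g., int i, j; float f; i = j + f;
  - In C, j is converted to float, the sum is computed in floating point, and the result is truncated to int on assignment; no error is generated
  - In another language this may generate an error
  - Prefer an explicit conversion that states the intent, e.g., i = (int)(j + f); note that i = (int) f + j; truncates f before the addition, which can give a different result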

A Simple Type Checker
- Uses the syntax-directed translation (SDT) method
- Types are defined as synthesized attributes
- Handles arrays, pointers, statements, and functions, plus the basic types char and int
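As a rough sketch of types as synthesized attributes, the checker below computes the type of an expression bottom-up and applies an int-to-float coercion rule. It is not the lecture's checker: the tuple AST, the type names as strings, the function names, and the diagnostic messages are assumptions for illustration, and it covers only variables, binary operators, and assignment.

```python
def type_of(expr, env):
    """Synthesize the type of an expression bottom-up.

    expr is a variable name (str) or a tuple (op, left, right).
    """
    if isinstance(expr, str):
        return env[expr]
    op, left, right = expr
    lt, rt = type_of(left, env), type_of(right, env)
    if lt == rt:
        return lt
    if {lt, rt} == {"int", "float"}:
        return "float"            # implicit widening of the int operand
    raise TypeError(f"incompatible types for {op}: {lt}, {rt}")


def check_assign(target, expr, env):
    """Return a list of diagnostics for the assignment target = expr."""
    lhs, rhs = env[target], type_of(expr, env)
    if lhs == rhs:
        return []
    if lhs == "int" and rhs == "float":
        return ["warning: float value truncated to int"]
    if lhs == "float" and rhs == "int":
        return []                 # widening on assignment is accepted
    return [f"error: cannot assign {rhs} to {lhs}"]


env = {"i": "int", "j": "int", "f": "float", "s": "string"}
diagnostics = check_assign("i", ("+", "j", "f"), env)
```

Whether the narrowing assignment is silent, a warning, or an error is a language design decision; the warning here is just one possible policy.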

Equivalence of Types
- Structural equivalence: same base type, same constructions (possibly different names)
- Name equivalence:
  - Same type name → equivalent
  - No type name → equivalent only if declared together
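The difference between the two notions can be seen on two record types that are built the same way but carry different names. The type names Point and Pair, the Record class, and the type_defs table below are hypothetical, introduced only for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    fields: tuple        # tuple of (field_name, type_name) pairs


# Two type names bound to records with the same construction.
type_defs = {
    "Point": Record((("x", "int"), ("y", "int"))),
    "Pair": Record((("x", "int"), ("y", "int"))),
}


def name_equivalent(a, b):
    """Types are equivalent only if they have the same name."""
    return a == b


def structurally_equivalent(a, b):
    """Types are equivalent if their definitions have the same shape."""
    return type_defs[a] == type_defs[b]
```

Under structural equivalence a Point may be assigned to a Pair; under name equivalence it may not.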

Overloading and Polymorphism
- Overloading: a symbol has different meanings depending on its context
- Polymorphism: a function whose arguments may have different types in different executions

Attribute Grammars
- Both semantic analysis and intermediate code generation can be described in terms of annotation, or "decoration", of a parse tree or syntax tree
- Attribute grammars provide a formal framework for decorating such a tree
- Attribute grammars are associated with action routines
- We start with decoration of the parse tree and then consider the syntax tree

Syntax-Directed Translation (SDT)
- Syntax-Directed Translation (SDT)
- Syntax-Directed Definition (SDD)
- Implementing SDD:
  - Dependency Graph
  - S-Attributed Definition
  - L-Attributed Definition
- Translation Schemes

Syntax-Directed Translation (SDT): Overview
- The principle of syntax-directed translation states that the meaning of an input sentence is related to its syntactic structure, i.e., to its parse tree.
- Syntax-directed translations are formalisms for specifying translations of programming language constructs, guided by context-free grammars.
  - We associate attributes with the grammar symbols representing the language constructs.
  - Values for attributes are computed by semantic rules associated with grammar productions.

Syntax-Directed Translation (SDT): Overview
- Evaluation of semantic rules may:
  - generate code
  - insert information into the symbol table
  - perform semantic checks
  - issue error messages
  - etc.
- There are two notations for attaching semantic rules:
  1. Syntax-Directed Definitions: a high-level specification hiding many implementation details (also called attribute grammars).
  2. Translation Schemes: more implementation oriented; they indicate the order in which semantic rules are to be evaluated.

Syntax-Directed Definition (SDD): Overview
- Syntax-directed definitions are a generalization of context-free grammars in which:
  1. Grammar symbols have an associated set of attributes.
  2. Productions are associated with semantic rules for computing the values of attributes.
- Such a formalism generates annotated parse trees, where each node of the tree is a record with a field for each attribute (e.g., X.a denotes the attribute a of the grammar symbol X).

Syntax-Directed Definition (SDD): Overview
- The value of an attribute of a grammar symbol at a given parse-tree node is defined by a semantic rule associated with the production used at that node.
- We distinguish between two kinds of attributes:
  1. Synthesized attributes: computed from the values of the attributes of the children nodes.
  2. Inherited attributes: computed from the values of the attributes of both the siblings and the parent nodes.

Syntax-Directed Definition (SDD): Overview
- The process of evaluating attributes is called annotation, or DECORATION, of the parse tree
  - When a parse tree under this grammar is fully decorated, the value of the expression will be in the val attribute of the root
- The code fragments for the rules are called SEMANTIC FUNCTIONS
  - Strictly speaking, they should be cast as functions, e.g., E1.val = sum(E2.val, T.val)

Syntax-Directed Definition (SDD): Overview - Example
- Consider the grammar for arithmetic expressions.
- The syntax-directed definition associates with each nonterminal a synthesized attribute called val.

Implementing Syntax-Directed Definitions
- Dependency Graphs
- S-Attributed Definitions
- L-Attributed Definitions

Implementing Syntax-Directed Definitions: Dependency Graphs
- Implementing a syntax-directed definition consists primarily of finding an order for the evaluation of attributes
  - Each attribute value must be available when a computation that uses it is performed.
- Dependency graphs are the most general technique for evaluating syntax-directed definitions with both synthesized and inherited attributes.
- A dependency graph shows the interdependencies among the attributes of the various nodes of a parse tree.
  - There is a node for each attribute.
  - If attribute b depends on an attribute c, there is a link from the node for c to the node for b (b ← c).
- Dependency rule: if an attribute b depends on an attribute c, then we need to fire the semantic rule for c first and then the semantic rule for b.
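The dependency rule amounts to evaluating attributes in a topological order of the dependency graph. The sketch below uses Python's standard graphlib module (Python 3.9 or later). The three attribute names and their rules are a hand-built example for a declaration such as "real id1, id2", not output from a real parser.

```python
from graphlib import TopologicalSorter

# Each attribute maps to the set of attributes it depends on.
depends_on = {
    "T.type": set(),
    "L.in": {"T.type"},
    "L1.in": {"L.in"},
}

# One semantic rule per attribute; each reads already-computed values.
rules = {
    "T.type": lambda v: "real",
    "L.in": lambda v: v["T.type"],
    "L1.in": lambda v: v["L.in"],
}

# static_order() yields every attribute after all of its dependencies.
order = list(TopologicalSorter(depends_on).static_order())

values = {}
for attr in order:
    values[attr] = rules[attr](values)
```

If the graph contains a cycle, no valid evaluation order exists; graphlib signals this by raising CycleError.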

Implementing Syntax-Directed Definitions: Synthesized Attributes
- Evaluation order: semantic rules in an S-attributed definition can be evaluated by a bottom-up, or PostOrder, traversal of the parse tree.
- Example: the arithmetic grammar above is an example of an S-attributed definition.
[Figure: annotated parse tree for the input 3*5+4 n]
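A postorder evaluation of the synthesized attribute val can be sketched as a short recursive function. For brevity the example works on a syntax tree of tuples rather than the full parse tree, and the tuple layout is an assumption made for this sketch.

```python
def val(node):
    """Compute the synthesized attribute val by a postorder traversal."""
    if isinstance(node, int):        # a digit leaf: val is its lexical value
        return node
    op, left, right = node
    lval, rval = val(left), val(right)   # children are evaluated first
    return lval + rval if op == "+" else lval * rval


# Syntax tree for 3*5+4: the multiplication sits below the addition.
tree = ("+", ("*", 3, 5), 4)
result = val(tree)
```

Each call returns only after both children have been evaluated, so every rule sees the attribute values it needs without any explicit ordering step.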

Implementing Syntax-Directed Definitions: Synthesized Attributes
- The attributes above are called synthesized attributes: they are calculated only from the attributes of things below them in the parse tree
- Another kind of attribute is the inherited attribute:
  - Inherited attributes may depend on things above or to the side of them in the parse tree
  - Tokens have only synthesized attributes, initialized by the scanner (name of an identifier, value of a constant, etc.)
  - Inherited attributes of the start symbol constitute run-time parameters of the compiler

Implementing Syntax-Directed Definitions: Inherited Attributes
- Inherited attributes are useful for expressing the dependence of a construct on the context in which it appears.
- Note: it is always possible to rewrite a syntax-directed definition to use only synthesized attributes, but it is often more natural to use both synthesized and inherited attributes.
- Evaluation order: inherited attributes cannot, in general, be evaluated by a simple PreOrder traversal of the parse tree:
  - Unlike synthesized attributes, the order in which the inherited attributes of the children are computed matters
  - Inherited attributes of the children can depend on both left and right siblings

Implementing Syntax-Directed Definitions: Inherited + Synthesized Attributes Example
- The nonterminal T has a synthesized attribute, type, determined by the token int or real in the corresponding production.
- The production D → T L is associated with the semantic rule L.in := T.type, which sets the inherited attribute L.in.
- Note: in the production L → L1, id the subscript distinguishes the two occurrences of L.

Implementing Syntax-Directed Definitions: Attributes
- Synthesized attributes can be evaluated by a PostOrder traversal.
- Inherited attributes that do not depend on right children can be evaluated by a PreOrder traversal.
[Figure: annotated parse tree for the input real id1, id2, id3]
- L.in is inherited top-down through the tree by the other L-nodes.
- At each L-node the procedure addtype inserts the type of the identifier into the symbol table.
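The way L.in travels down the tree for real id1, id2, id3 can be sketched by passing the inherited attribute as a function parameter. The nested-tuple encoding of the L-nodes and the function names visit_D and visit_L are assumptions for this example; addtype follows the name used in the lecture but is implemented here as a plain dictionary update.

```python
symbol_table = {}


def addtype(name, typ):
    """Record the type of an identifier in the symbol table."""
    symbol_table[name] = typ


def visit_L(node, inherited):
    """node is (L1_or_None, id_name); inherited plays the role of L.in."""
    child, ident = node
    if child is not None:
        visit_L(child, inherited)      # L1.in := L.in
    addtype(ident, inherited)          # addtype(id.entry, L.in)


def visit_D(type_token, l_node):
    t_type = type_token                # T.type, synthesized from the token
    visit_L(l_node, t_type)            # L.in := T.type


# Left-nested L-nodes for the list id1, id2, id3.
visit_D("real", (((None, "id1"), "id2"), "id3"))
```

Passing inherited attributes as parameters and returning synthesized attributes as results is a common way to implement an L-attributed definition by hand.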

Implementing Syntax-Directed Definitions: Summary
- An S-attributed grammar (only synthesized attributes) is evaluated purely bottom-up, which matches SLR(1) parsing rather than LL(1)
- An LL(1) grammar requires inherited attributes
- An L-attributed (L stands for Left) grammar contains both synthesized and inherited attributes, but no dependency graph needs to be built to evaluate them

Translation Schemes
- Translation schemes are more implementation oriented than syntax-directed definitions, since they indicate the order in which semantic rules and attributes are to be evaluated.
- Definition: a translation scheme is a context-free grammar in which
  - attributes are associated with grammar symbols;
  - semantic actions are enclosed between braces { } and are inserted within the right-hand side of productions.
- Note: Yacc uses translation schemes.

Translation Schemes
- Translation schemes deal with both synthesized and inherited attributes.
- Semantic actions are treated as terminal symbols: annotated parse trees contain semantic actions as children of the node standing for the corresponding production.
- Translation schemes are useful for evaluating L-attributed definitions at parsing time, although they are a general mechanism.
  - An L-attributed syntax-directed definition can be turned into a translation scheme.

Translation Schemes: Example
Consider the translation scheme for the L-attributed definition for "type declarations":
  D → T {L.in := T.type} L
  T → int {T.type := integer}
  T → real {T.type := real}
  L → {L1.in := L.in} L1, id {addtype(id.entry, L.in)}
  L → id {addtype(id.entry, L.in)}

Translation Schemes: Example
[Figure: parse tree with semantic actions for the input real id1, id2, id3]
- Traversing the parse tree in depth-first order (PostOrder), we can evaluate the attributes.
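One way to execute the actions of this translation scheme at parsing time is a hand-written recursive-descent parser in which each action sits at the point where the scheme places it. The sketch below is an illustration under stated assumptions: the token list format and the function names are invented for the example, the left-recursive production for L is replaced by a loop (as a top-down parser requires), and identifiers are not validated.

```python
def parse_decl(tokens):
    """Parse 'T id {, id}' and return the symbol table built by the actions."""
    table = {}
    pos = 0

    def parse_T():
        nonlocal pos
        tok = tokens[pos]
        if tok not in ("int", "real"):
            raise SyntaxError(f"expected a type, got {tok!r}")
        pos += 1
        return "integer" if tok == "int" else "real"   # {T.type := ...}

    def parse_L(l_in):
        nonlocal pos
        while True:
            if pos >= len(tokens):
                raise SyntaxError("expected an identifier")
            table[tokens[pos]] = l_in                  # {addtype(id.entry, L.in)}
            pos += 1
            if pos < len(tokens) and tokens[pos] == ",":
                pos += 1                               # consume ',' and continue
            else:
                break

    t_type = parse_T()      # synthesized attribute, available once T is parsed
    parse_L(t_type)         # {L.in := T.type} takes effect before L is parsed
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return table


decls = parse_decl(["real", "id1", ",", "id2", ",", "id3"])
```

The inherited attribute becomes an argument and the synthesized attribute becomes a return value, so each value is available exactly when the corresponding action runs.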

Translation Schemes: Summary
- When designing a translation scheme, we must be sure that an attribute value is available when a semantic action is executed.
- When the semantic action involves only synthesized attributes, the action can be put at the end of the production.
  - Example: the production and semantic rule
      T → T1 * F        T.val := T1.val * F.val
    yield the translation scheme
      T → T1 * F {T.val := T1.val * F.val}

Translation Schemes: Summary
- When the semantic action involves inherited attributes of a grammar symbol, the action must be put before the symbol itself.
  - Example: the production and semantic rule
      D → T L        L.in := T.type
    yield the translation scheme
      D → T {L.in := T.type} L

END