4. Semantic Processing and Attribute Grammars


Semantic Processing
The parser checks only the syntactic correctness of a program.
Tasks of semantic processing
- Symbol table handling
  - Maintaining information about declared names
  - Maintaining information about types
  - Maintaining scopes
- Checking context conditions
  - Scoping rules
  - Type checking
- Invocation of code generation routines
Semantic actions are integrated into the parser and are described with attribute grammars.

Semantic Actions
So far: analysis of the input
    Expr = Term { "+" Term }.
The parser checks whether the input is syntactically correct.
Now: translation of the input (semantic processing), e.g. we want to count the terms in the expression
    Expr = Term          (. int n = 1; .)
      { "+" Term         (. n++; .)
      }                  (. Console.WriteLine(n); .)
      .
Semantic actions
- arbitrary statements of the host language (C# in the examples here) between (. and .)
- are executed by the parser at the position where they occur in the grammar
"Translation" here:
    1+2+3  →  3
    47+1   →  2
    909    →  1
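The effect of the embedded actions can be imitated in a hand-written recursive-descent parser. The following is a minimal Python sketch, not the code a parser generator would emit; the names tokenize and count_terms are made up for this illustration, and characters other than digits and "+" are simply ignored.

```python
import re

def tokenize(text):
    """Split the input into number and '+' tokens (other characters are ignored in this sketch)."""
    return re.findall(r"\d+|\+", text)

def count_terms(text):
    """Parse Expr = Term { "+" Term } and return the number of terms."""
    tokens = tokenize(text)
    pos = 0

    def term():
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isdigit():
            raise SyntaxError("number expected")
        pos += 1

    term()
    n = 1                      # semantic action after the first Term
    while pos < len(tokens) and tokens[pos] == "+":
        pos += 1
        term()
        n += 1                 # semantic action after every further Term
    if pos != len(tokens):
        raise SyntaxError("unexpected token")
    return n                   # the grammar prints n; the sketch returns it instead
```

Returning the count instead of printing it keeps the sketch easy to check: count_terms("1+2+3") yields 3.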

Attributes
Syntax symbols can return values (a kind of output parameter)
    Term<int val>        Term returns its numeric value as an output attribute
Attributes are useful in the translation process, e.g. we want to compute the value of an expression
    Expr                 (. int sum, val; .)
    = Term<sum>
      { "+" Term<val>    (. sum += val; .)
      }                  (. Console.WriteLine(sum); .)
      .
"Translation" here:
    1+2+3  →  6
    47+1   →  48
    909    →  909

Input Attributes
Nonterminal symbols can also have input attributes (parameters that are passed from the "calling" production)
    Expr<bool printHex>
printHex: print the result of the addition in hexadecimal (otherwise in decimal)
Example
    Expr<bool printHex>  (. int sum, val; .)
    = Term<sum>
      { "+" Term<val>    (. sum += val; .)
      }                  (. if (printHex) Console.WriteLine("{0:X}", sum);
                            else Console.WriteLine("{0:D}", sum); .)
      .
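Input and output attributes map naturally onto parameters and return values. The sketch below assumes Python as the host language; parse_term and parse_expr are hypothetical names, the output attribute of Term becomes a return value, and the input attribute printHex becomes the parameter print_hex. The formatted result is returned rather than printed.

```python
import re

def parse_term(tokens, pos):
    """Term<int val>: returns (val, new_pos); val plays the role of the output attribute."""
    if pos >= len(tokens) or not tokens[pos].isdigit():
        raise SyntaxError("number expected")
    return int(tokens[pos]), pos + 1

def parse_expr(text, print_hex):
    """Expr<bool printHex>: print_hex is the input attribute passed in by the caller."""
    tokens = re.findall(r"\d+|\+", text)
    total, pos = parse_term(tokens, 0)
    while pos < len(tokens) and tokens[pos] == "+":
        val, pos = parse_term(tokens, pos + 1)
        total += val                       # (. sum += val; .)
    if pos != len(tokens):
        raise SyntaxError("unexpected token")
    # final semantic action: choose the output format from the input attribute
    return format(total, "X") if print_hex else format(total, "d")
```

For example, parse_expr("47+1", True) gives "30", the hexadecimal form of 48.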

Attribute Grammars
Notation for describing translation processes; it consists of three parts
1. Productions in EBNF
       Expr = Term { "+" Term }.
2. Attributes (parameters of syntax symbols)
       Term<int val>         output attributes (synthesized): yield the translation result
       Expr<bool printHex>   input attributes (inherited): provide context from the caller
3. Semantic actions
       (. ... arbitrary statements of the host language ... .)

Example
ATG for processing declarations
    VarDecl                  (. Struct type; .)
    = Type<type>
      IdentList<type>
      ";" .

    IdentList<Struct type>
    = ident                  (. Tab.Insert(token.str, type); .)
      { "," ident            (. Tab.Insert(token.str, type); .)
      } .
This is translated to parsing methods as follows
    static void VarDecl () {
        Struct type;
        Type(out type);
        IdentList(type);
        Check(Token.SEMICOLON);
    }

    static void IdentList (Struct type) {
        Check(Token.IDENT);
        Tab.Insert(token.str, type);
        while (la == Token.COMMA) {
            Scan();
            Check(Token.IDENT);
            Tab.Insert(token.str, type);
        }
    }
ATGs are shorter and more readable than parsing methods.
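To see the declaration ATG at work, here is a small Python sketch of the same two parsing methods. It rests on assumptions not found in the slides: a type is taken to be a single identifier (the Type production is not shown), the symbol table is a plain dict, and parse_var_decl is a made-up name.

```python
import re

def parse_var_decl(text, table):
    """VarDecl = Type IdentList ";" : enter every declared name with its type into table."""
    tokens = re.findall(r"[A-Za-z_]\w*|[,;]", text)
    pos = 0

    def ident():
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] in (",", ";"):
            raise SyntaxError("identifier expected")
        pos += 1
        return tokens[pos - 1]

    def ident_list(typ):              # typ is the input attribute of IdentList
        nonlocal pos
        table[ident()] = typ          # Tab.Insert(token.str, type)
        while pos < len(tokens) and tokens[pos] == ",":
            pos += 1
            table[ident()] = typ

    typ = ident()                     # Type<type>, assumed here to be one identifier
    ident_list(typ)
    if pos >= len(tokens) or tokens[pos] != ";":
        raise SyntaxError("';' expected")
    return table
```

Calling parse_var_decl("int a, b, c;", {}) enters a, b and c with the type name "int".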

Example: Processing of Constant Expressions
input: 3 * (2 + 4)
desired result: 18
    Expr<int val>            (. int val1; .)
    = Term<val>
      { "+" Term<val1>       (. val += val1; .)
      | "-" Term<val1>       (. val -= val1; .)
      }.

    Term<int val>            (. int val1; .)
    = Factor<val>
      { "*" Factor<val1>     (. val *= val1; .)
      | "/" Factor<val1>     (. val /= val1; .)
      }.

    Factor<int val>
    = number                 (. val = t.val; .)
    | "(" Expr<val> ")".
[Figure: syntax tree for 3 * (2 + 4); the val attributes are 3 for the first Factor, 2 and 4 inside the parentheses, 6 for the parenthesized Expr, and 18 for the outer Term and Expr]

Transforming an ATG into a Parser
Production
    Expr<int val>            (. int val1; .)
    = Term<val>
      { "+" Term<val1>       (. val += val1; .)
      | "-" Term<val1>       (. val -= val1; .)
      }.
Parsing method
    static void Expr (out int val) {
        int val1;
        Term(out val);
        for (;;) {
            if (la == Token.PLUS) {
                Scan();
                Term(out val1);
                val += val1;
            } else if (la == Token.MINUS) {
                Scan();
                Term(out val1);
                val -= val1;
            } else break;
        }
    }
Mapping
    input attribute   →  parameter
    output attribute  →  out parameter
    semantic actions  →  embedded C# code
Terminal symbols have no input attributes. In our form of ATGs they also have no output attributes, but their value is computed from token.str or token.val.
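The same transformation, applied to all three productions of the constant-expression grammar, might look as follows in Python. This is a sketch under assumptions: the class name ConstExprParser and the helper evaluate are invented here, output attributes are modelled as return values, and "/" is integer division with Python's floor semantics.

```python
import re

class ConstExprParser:
    def __init__(self, text):
        self.tokens = re.findall(r"\d+|[-+*/()]", text)
        self.pos = 0

    def la(self):
        """Lookahead token, or None at the end of the input."""
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def scan(self):
        self.pos += 1

    def expr(self):                      # Expr<int val>
        val = self.term()
        while self.la() in ("+", "-"):
            op = self.la()
            self.scan()
            val1 = self.term()
            val = val + val1 if op == "+" else val - val1
        return val

    def term(self):                      # Term<int val>
        val = self.factor()
        while self.la() in ("*", "/"):
            op = self.la()
            self.scan()
            val1 = self.factor()
            val = val * val1 if op == "*" else val // val1
        return val

    def factor(self):                    # Factor<int val>
        tok = self.la()
        if tok is not None and tok.isdigit():
            self.scan()
            return int(tok)
        if tok == "(":
            self.scan()
            val = self.expr()
            if self.la() != ")":
                raise SyntaxError("')' expected")
            self.scan()
            return val
        raise SyntaxError("number or '(' expected")

def evaluate(text):
    parser = ConstExprParser(text)
    val = parser.expr()
    if parser.la() is not None:
        raise SyntaxError("unexpected token")
    return val
```

With this sketch, evaluate("3 * (2 + 4)") produces the desired result 18.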

Example: Sales Statistics
ATGs can also be used in areas other than compiler construction.
Example: given a file with sales numbers
    File = { Article }.
    Article = Code { Amount } "END".
    Code = number.
    Amount = number.
Whenever the input is syntactically structured, ATGs are a good notation to describe its processing.
Input for example:
    3451 2 5 3 7 END
    3452 4 8 1 END
    3453 1 1 END
    ...
Desired output:
    3451 17
    3452 13
    3453 2

ATG for the Sales Statistics
    File                         (. int code, amount; .)
    = { Article<code, amount>    (. Write(code + " " + amount); .)
      }.

    Article<int code, int amount>
    = Value<code>                (. amount = 0; .)
      {                          (. int x; .)
        Value<x>                 (. amount += x; .)
      }
      "END".

    Value<int x>
    = number                     (. x = token.val; .)
      .
Parser code
    static void File () {
        int code, amount;
        while (la == number) {
            Article(out code, out amount);
            Write(code + " " + amount);
        }
    }

    static void Article (out int code, out int amount) {
        Value(out code);
        amount = 0;
        while (la == number) {
            int x;
            Value(out x);
            amount += x;
        }
        Check(end);
    }

    static void Value (out int x) {
        Check(number);
        x = token.val;
    }
Terminal symbols: number, end, eof
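A runnable counterpart of this parser can be sketched in a few lines of Python. The function name sales_totals is an assumption, tokens are simply whitespace-separated words, and the output lines are collected in a list instead of being written.

```python
def sales_totals(text):
    """File = { Article }.  Article = Code { Amount } "END".  Returns 'code total' lines."""
    tokens = text.split()
    pos = 0
    lines = []
    while pos < len(tokens):                      # File = { Article }.
        code = int(tokens[pos])                   # Code = number.
        pos += 1
        amount = 0                                # the sum starts at zero for every article
        while pos < len(tokens) and tokens[pos] != "END":
            amount += int(tokens[pos])            # Amount = number.
            pos += 1
        if pos == len(tokens):
            raise SyntaxError("END expected")
        pos += 1                                  # skip "END"
        lines.append(f"{code} {amount}")          # Write(code + " " + amount)
    return lines
```

Applied to the sample input above, it returns the three lines "3451 17", "3452 13" and "3453 2".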

Example: Image Description Language
[Figure: polygon with the corner points (10,40), (50,90), (40,45), (50,0)]
described by:
    POLY (10,40) (50,90) (40,45) (50,0) END
Input syntax:
    Polygon = "POLY" Point {Point} "END".
    Point = "(" number "," number ")".
We want a program that reads the input and draws the polygon
    Polygon                  (. Pt p, q; .)
    = "POLY"
      Point<p>               (. Turtle.start(p); .)
      { Point<q>             (. Turtle.move(q); .)
      }
      "END"                  (. Turtle.move(p); .)
      .

    Point<Pt p>              (. int x, y; .)
    = "(" number             (. x = t.val; .)
      "," number             (. y = t.val; .)
      ")"                    (. p = new Pt(x, y); .)
      .
We use "Turtle Graphics" for drawing
    Turtle.start(p);   sets the turtle (pen) to point p
    Turtle.move(q);    moves the turtle to q, drawing a line
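Without a drawing library, the turtle calls can be replaced by recording line segments. The Python sketch below makes that substitution; polygon_segments is a hypothetical name, and each segment is a pair of (x, y) tuples standing in for one Turtle.move call.

```python
import re

def polygon_segments(text):
    """Polygon = "POLY" Point {Point} "END". Returns the line segments the turtle would draw."""
    tokens = re.findall(r"POLY|END|\d+|[(),]", text)
    pos = 0

    def check(expected):
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] != expected:
            raise SyntaxError(f"{expected!r} expected")
        pos += 1

    def number():
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isdigit():
            raise SyntaxError("number expected")
        pos += 1
        return int(tokens[pos - 1])

    def point():                              # Point<Pt p>
        check("(")
        x = number()
        check(",")
        y = number()
        check(")")
        return (x, y)

    segments = []
    check("POLY")
    start = current = point()                 # Turtle.start(p)
    while pos < len(tokens) and tokens[pos] == "(":
        q = point()
        segments.append((current, q))         # Turtle.move(q)
        current = q
    check("END")
    segments.append((current, start))         # Turtle.move(p) closes the polygon
    return segments
```

For the sample input the sketch yields four segments, the last one leading from (50,0) back to the start point (10,40).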

Example: Transform Infix to Postfix Expressions
Arithmetic expressions in infix notation are to be transformed to postfix notation
    3 + 4 * 2    →  3 4 2 * +
    (3 + 4) * 2  →  3 4 + 2 *

    Expr = Term
      { "+" Term             (. Write("+"); .)
      | "-" Term             (. Write("-"); .)
      }.

    Term = Factor
      { "*" Factor           (. Write("*"); .)
      | "/" Factor           (. Write("/"); .)
      }.

    Factor = number          (. Write(token.val); .)
      | "(" Expr ")".
[Figure: syntax tree for 3 + 4 * 2; the actions are executed in the order Write 3, Write 4, Write 2, Write *, Write +]
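The postfix translation needs no attributes at all, only actions placed after the operands. A minimal Python sketch follows; to_postfix is a made-up name, and the output is collected in a list instead of calling Write.

```python
import re

def to_postfix(text):
    """Translate an infix expression to postfix by emitting operators after their operands."""
    tokens = re.findall(r"\d+|[-+*/()]", text)
    pos = 0
    out = []

    def la():
        return tokens[pos] if pos < len(tokens) else None

    def expr():
        nonlocal pos
        term()
        while la() in ("+", "-"):
            op = la()
            pos += 1
            term()
            out.append(op)          # Write("+") or Write("-") after the second Term

    def term():
        nonlocal pos
        factor()
        while la() in ("*", "/"):
            op = la()
            pos += 1
            factor()
            out.append(op)          # Write("*") or Write("/") after the second Factor

    def factor():
        nonlocal pos
        tok = la()
        if tok is not None and tok.isdigit():
            pos += 1
            out.append(tok)         # Write(token.val)
        elif tok == "(":
            pos += 1
            expr()
            if la() != ")":
                raise SyntaxError("')' expected")
            pos += 1
        else:
            raise SyntaxError("number or '(' expected")

    expr()
    if la() is not None:
        raise SyntaxError("unexpected token")
    return " ".join(out)
```

Both sample translations can be reproduced with it, e.g. to_postfix("(3 + 4) * 2") gives "3 4 + 2 *".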

Attribute Grammars According to Knuth

Idea
ATGs so far: procedural descriptions (translation algorithms)
- Every production is processed from left to right
- In doing so, attributes are computed and semantic actions are executed
ATGs according to Donald Knuth (1968)
[Figure: scanner → parser → syntax tree (nonterminals NT, terminals T) → attributation → "decorated" syntax tree with attributes]
Nonterminal symbols have attributes
- static properties (do not change after their evaluation)
- examples: type of an expression, address of a variable, ...
Attributation
- The syntax tree is traversed (possibly several times up and down) until all attributes have been computed.

Attribute Evaluation Rules
For every production they define
- the input attributes of all symbols on the right-hand side of the production
- the output attributes of the symbol on the left-hand side of the production
- any context conditions if necessary
Example
    Production p:  A<a, b> = B<c, d> C<e, f> .
    (a, c, e are input attributes; b, d, f are output attributes)
Production p represents a section of the syntax tree. We must define all attributes that leave p (i.e. b, c, e).
    Attribute evaluation rules R(p), e.g.:  c = a;  e = d + foo(a);  b = d + f;
    Context condition CC(p), e.g.:          d >= f

Example: Computing the Value of a Hex Number
Grammar (must be in BNF so that we can build a syntax tree)
    Number = Digits.         // decimal number
    Number = Digits "H".     // hexadecimal number
    Digits = hex.            // hex ... 0..9, A..F
    Digits = Digits hex.
Attributes
    Number:  val (output)
    Digits:  base (input), val (output)
    hex:     val (output)
Syntax tree for the input: 1BH
[Figure: Number derives Digits "H"; this Digits derives Digits hex (B); the inner Digits derives hex (1). The attributes have not yet been evaluated.]

Attribute Evaluation Rules
Production 1
    Number<val> = Digits<base, val>.
    Digits.base = 10;
    Number.val = Digits.val;
Production 2
    Number<val> = Digits<base, val> "H".
    Digits.base = 16;
    Number.val = Digits.val;
Production 3
    Digits<base, val> = hex<val>.
    Digits.val = hex.val;
    CC: Digits.base == 10 && 0 <= hex.val <= 9 || Digits.base == 16 && 0 <= hex.val <= 15
Production 4
    Digits<base, val> = Digits1<base, val> hex<val>.
    Digits1.base = Digits.base;
    Digits.val = Digits1.val * Digits.base + hex.val;
    CC: Digits.base == 10 && 0 <= hex.val <= 9 || Digits.base == 16 && 0 <= hex.val <= 15
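These four rule sets can be tried out on an explicit syntax tree. The Python sketch below is one possible evaluator, not Knuth's general algorithm: it pushes the inherited attribute base down the tree and computes the synthesized attribute val on the way back up. The class Digits and the functions build_tree, attribute and number_value are names invented for this illustration, and the tree is built directly from the text instead of by a parser.

```python
class Digits:
    """One Digits node: production 4 if rest is set, production 3 otherwise."""
    def __init__(self, hex_val, rest=None):
        self.rest = rest          # nested Digits node (Digits1) or None
        self.hex_val = hex_val    # hex.val, filled in by the "scanner"
        self.base = None          # input (inherited) attribute
        self.val = None           # output (synthesized) attribute

def build_tree(text):
    """Build the left-recursive Digits chain; returns (outermost node, has_H_suffix)."""
    is_hex = text.endswith("H")
    digits_text = text[:-1] if is_hex else text
    node = None
    for ch in digits_text:
        node = Digits(int(ch, 16), node)
    return node, is_hex

def attribute(node, base):
    node.base = base                                   # set by the parent production
    if not 0 <= node.hex_val <= base - 1:              # context condition CC
        raise ValueError("digit out of range for base")
    if node.rest is None:
        node.val = node.hex_val                        # production 3
    else:
        attribute(node.rest, node.base)                # Digits1.base = Digits.base
        node.val = node.rest.val * node.base + node.hex_val   # production 4
    return node.val

def number_value(text):
    digits, is_hex = build_tree(text)
    return attribute(digits, 16 if is_hex else 10)     # productions 2 and 1
```

For the input 1BH the sketch computes 1 * 16 + 11 = 27, while the input 1B without the H suffix violates the context condition and raises an error.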

Attributation of the Tree
The scanner fills in the attribute values of the terminal symbols.
[Figure: syntax tree for 1BH with hex.val = 1 for the digit 1 and hex.val = 11 for the digit B; all other attributes are still undefined]

Attributation of the Tree (cont.)
The tree is traversed top-down. For every production we check which attribute evaluation rules are ready to be executed.
Production 2
    Number<val> = Digits<base, val> "H".
    Digits.base = 16;
    Number.val = Digits.val;
[Figure: the upper Digits node now has base = 16; hex.val = 11 and hex.val = 1 as before]

Attributation of the Tree (cont.)
The tree is traversed top-down. For every production we check which attribute evaluation rules are ready to be executed.
Production 4
    Digits<base, val> = Digits1<base, val> hex<val>.
    Digits1.base = Digits.base;
    Digits.val = Digits1.val * Digits.base + hex.val;
    CC: Digits.base == 10 && 0 <= hex.val <= 9 || Digits.base == 16 && 0 <= hex.val <= 15
[Figure: the lower Digits node now also has base = 16]

Attributation of the Tree (cont.)
The tree is traversed top-down. For every production we check which attribute evaluation rules are ready to be executed.
Production 3
    Digits<base, val> = hex<val>.
    Digits.val = hex.val;
    CC: Digits.base == 10 && 0 <= hex.val <= 9 || Digits.base == 16 && 0 <= hex.val <= 15
[Figure: the lower Digits node now has base = 16 and val = 1]

Attributation of the Tree (cont.)
The tree is traversed bottom-up. For every production we check which attribute evaluation rules are ready to be executed.
Production 4
    Digits<base, val> = Digits1<base, val> hex<val>.
    Digits1.base = Digits.base;
    Digits.val = Digits1.val * Digits.base + hex.val;
    CC: Digits.base == 10 && 0 <= hex.val <= 9 || Digits.base == 16 && 0 <= hex.val <= 15
[Figure: the upper Digits node now has base = 16 and val = 1 * 16 + 11 = 27]

Attributation of the Tree (cont.)
The tree is traversed bottom-up. For every production we check which attribute evaluation rules are ready to be executed.
Production 2
    Number<val> = Digits<base, val> "H".
    Digits.base = 16;
    Number.val = Digits.val;
[Figure: Number now has val = 27]
All attributes have been computed → end of the tree traversal

Definition of ATGs According to Knuth
Context-free grammar
    CFG = (T, N, P, S)
    T ... terminal symbols
    N ... nonterminal symbols
    P ... productions
    S ... start symbol
Attribute grammar
    ATG = (CFG, A, R, CC)
    CFG ... context-free grammar
    A ... set of attributes
    R ... set of attribute evaluation rules
    CC ... set of context conditions
Attributes
    A(X) ... attributes of the symbol X (written as X.a, X.b, ...)
    AS(X) ... output attributes of X (synthesized)
    AI(X) ... input attributes of X (inherited)
Attribute evaluation rules
    R(p) ... attribute evaluation rules for production p: X0 = X1 ... Xn
    R(p) = {Xi.a = f(Xj.b, ..., Xk.c)} for all AS of the left-hand side and all AI of the right-hand side of p
Context conditions
    CC(p) ... context conditions of production p: X0 = X1 ... Xn
    in the form of a Boolean expression B(Xi.a, ..., Xj.b)
    check the "static semantics", i.e. whether the input is semantically correct

Complete ATGs
Definition
An ATG is called complete if for all productions p: X = Y1 ... Yn the following condition holds: all AS(X) and all AI(Yi) are computed in R(p).

Well-defined ATGs (WAGs)
Definition
An ATG is called well-defined (WAG) if the ATG is complete and if the relations between attributes are non-circular in every possible syntax tree.
In other words: we can find an attribute evaluation order for every possible syntax tree.
Example
[Figure: two example syntax trees, one whose attribute dependencies are acyclic (well-defined) and one whose dependencies form a cycle (circular → not well-defined)]
Checking for well-definedness is expensive: the circularity test is EXPTIME-complete, i.e. it needs exponential time in the worst case. However, there are subclasses of WAG for which this check can be simplified.
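For a single, given syntax tree the question of circularity is easy: the attribute instances and their dependencies form a graph, and an evaluation order exists exactly when that graph has a topological order. The Python sketch below uses graphlib from the standard library (Python 3.9 or later). It checks one tree only and is not the grammar-wide well-definedness test; the function name evaluation_order and the attribute names in the usage note are assumptions for illustration.

```python
from graphlib import TopologicalSorter, CycleError

def evaluation_order(dependencies):
    """dependencies maps each attribute instance to the set of instances it depends on.

    Returns a list in which every attribute appears after the attributes it depends on,
    or None if the dependencies are circular.
    """
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError:
        return None
```

With rules such as c = a, e = d + foo(a), b = d + f the call evaluation_order({"c": {"a"}, "e": {"d", "a"}, "b": {"d", "f"}}) returns some valid order, whereas adding a rule that makes d depend on e yields None.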

Ordered ATGs (OAGs)
Definition
An ATG is called ordered (OAG) if a fixed attribute evaluation order can be specified for every production regardless of its context in the syntax tree.
The attributation code of a production p can be specified by the following operations:
    comp i ... execute attribute evaluation rule i from R(p)
    up ... go to the parent in the syntax tree
    down i ... go to child i in the syntax tree
Example
Production A = B C with the attributes A: a1, a2, a3, a4; B: b1, b2; C: c1, c2
Attributation code
    comp (b1 = a1)
    down B
    comp (a3 = b2)
    up
    comp (c1 = a2)
    down C
    comp (a4 = c2)
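One way to experiment with such attributation code is to model every node's code as a Python generator: "up" becomes a yield, and "down" resumes the child's generator with next. This is only an illustrative sketch. The rules of the leaves B and C (b2 = b1 + 1, c2 = c1 * 2) and the context rule a2 = a3 * 10 are invented here, as are all function names.

```python
from types import SimpleNamespace

def visit_leaf(node, inp, out, rule):
    """Attributation code of a leaf: compute one output attribute, then go up."""
    setattr(node, out, rule(getattr(node, inp)))
    yield                               # up

def visit_a(a, b, c):
    """Attributation code of production A = B C, as listed on the slide."""
    b.b1 = a.a1                         # comp (b1 = a1)
    next(b.code)                        # down B
    a.a3 = b.b2                         # comp (a3 = b2)
    yield                               # up: the context may now compute a2 from a3
    c.c1 = a.a2                         # comp (c1 = a2)
    next(c.code)                        # down C
    a.a4 = c.c2                         # comp (a4 = c2)
    yield                               # up

def run_example():
    a, b, c = SimpleNamespace(), SimpleNamespace(), SimpleNamespace()
    b.code = visit_leaf(b, "b1", "b2", lambda v: v + 1)   # hypothetical rule for B
    c.code = visit_leaf(c, "c1", "c2", lambda v: v * 2)   # hypothetical rule for C
    a.code = visit_a(a, b, c)
    a.a1 = 5                            # context: first input attribute of A
    next(a.code)                        # first visit of A
    a.a2 = a.a3 * 10                    # context rule that needs a3 (hypothetical)
    next(a.code)                        # second visit of A
    return a.a3, a.a4
```

The two visits of A show why a single "down" is not enough here: a2 only becomes available after a3 has been handed to the context.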

Counter-Example
ATG which is not ordered: the attributation order of the production A = B C (attributes A: a1, a2, a3, a4; B: b1, b2; C: c1, c2) depends on where A occurs in the syntax tree.
Attributation code in one context
    comp (c1 = a2)
    down C
    comp (a4 = c2)
    up
    comp (b1 = a1)
    down B
    comp (a3 = b2)
Attributation code in another context
    comp (b1 = a1)
    down B
    comp (a3 = b2)
    up
    comp (c1 = a2)
    down C
    comp (a4 = c2)
Checking whether a grammar is an OAG can be done in polynomial time.

L-Attributed ATGs (LAGs)
Definition
An ATG is called L-attributed (LAG) if all attributes in the syntax tree can be evaluated in a single sweep (down and up, left to right).
In other words: the attributes can be computed during syntax analysis.
For LAGs it is not even necessary to build a syntax tree (this corresponds to our procedural ATGs).
Example
[Figure: syntax tree with the nodes A (a1, a2), B (b1, b2), C (c1, c2), D (d1, d2), E (e1, e2), attributed in one left-to-right sweep]
Information from the front of a program can be propagated towards its end, but not vice versa.

Relations Between Classes of ATGs
    ATG ⊃ WAG ⊃ OAG ⊃ LAG
In other words
- every LAG is ordered
- every OAG is well-defined
- WAG is more powerful than OAG
- OAG is more powerful than LAG