Compiler Structures 8. Attribute Grammars Objectives


Compiler Structures, 242-437, Semester 2, 2018-2019
8. Attribute Grammars
Objectives: describe semantic analysis with attribute grammars, as applied in yacc and recursive descent parsers.

Overview
1. What is an Attribute Grammar?
2. Parse Tree Evaluation
3. Attributes
4. Attribute Grammars and yacc
5. A Grid Grammar
6. Recursive Descent and Attributes

In this lecture
Front End: Source Program → Lexical Analyzer → Syntax Analyzer → Semantic Analyzer → Int. Code Generator → Intermediate Code
Back End: Intermediate Code → Code Optimizer → Target Code Generator → Target Lang. Prog.
This lecture concentrates on the Semantic Analyzer, and in particular on attribute grammars.

1. What is an Attribute Grammar?
An attribute grammar is a context free grammar with semantic actions attached to some of the productions (semantic = meaning).
An action specifies the meaning of a production in terms of its body terminals and nonterminals.

Example Attribute Grammar
    Production       Semantic Action
    L → E            printf(Ebody.val)
    E → E + T        E.val := Ebody.val + Tbody.val
    E → T            E.val := Tbody.val
    T → T * F        T.val := Tbody.val * Fbody.val
    T → F            T.val := Fbody.val
    F → ( E )        F.val := Ebody.val
    F → num          F.val := value(num)

2. Parse Tree Evaluation One way of understanding semantic actions is as extra information (attributes) attached to the nodes of the parse tree for the input. The semantic action specifies the parent node attribute in terms of the attributes of its children.

Basic Parse Tree
Input: 9 * 5 + 2
Grammar: L → E; E → E + T | T; T → T * F | F; F → ( E ) | num
Parse tree (children are indented under their parent):
    L
      E
        E
          T
            T
              F
                9
            *
            F
              5
        +
        T
          F
            2

Adding Meaning to the Tree
What is the meaning of "9 * 5 + 2"? The answer is to evaluate it, to get 47.
Add attributes to the tree, starting from the leaves and working up to the root, using the semantic actions to get the attribute values.

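Parse Tree with Actions
Each node's val is computed from its children by the semantic action of its production: printf(Ebody.val); E.val := Ebody.val + Tbody.val; E.val := Tbody.val; T.val := Tbody.val * Fbody.val; T.val := Fbody.val; F.val := Ebody.val; F.val := value(num).
Annotated tree for 9 * 5 + 2, evaluated bottom-up (val in parentheses):
    L                  printf 47
      E (47)
        E (45)
          T (45)
            T (9)
              F (9)
                9
            *
            F (5)
              5
        +
        T (2)
          F (2)
            2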
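The bottom-up evaluation above can be sketched in C as a post-order walk. This is only an illustration, not code from the lecture: the Node type and the names evalTree() and evalExample() are hypothetical, and the tree is condensed so that the chain productions (E → T, T → F, F → num) collapse into single nodes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type: a leaf holds a number, an inner node an operator. */
typedef struct Node {
    char op;                    /* '+', '*', or 0 for a num leaf */
    int val;                    /* the synthesized attribute */
    struct Node *left, *right;
} Node;

/* Post-order walk: compute the children's val first, then the parent's. */
int evalTree(Node *n)
{
    assert(n != NULL);
    if (n->op == 0)
        return n->val;          /* F.val := value(num) */
    int l = evalTree(n->left);
    int r = evalTree(n->right);
    n->val = (n->op == '+') ? l + r : l * r;
    return n->val;
}

/* Builds the condensed tree for 9 * 5 + 2 and evaluates it. */
int evalExample(void)
{
    Node nine = {0, 9, NULL, NULL};
    Node five = {0, 5, NULL, NULL};
    Node two  = {0, 2, NULL, NULL};
    Node mul  = {'*', 0, &nine, &five};   /* T -> T * F */
    Node add  = {'+', 0, &mul, &two};     /* E -> E + T */
    return evalTree(&add);
}
```

Calling evalExample() yields 47, the value that the L → E action would print.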

3. Attributes
Attribute values can be numbers, strings, any data structures, code, or assembly language instructions.
It's not always necessary to build a parse tree in order to evaluate the grammar's actions.

Kinds of Attribute
There are two main kinds of attribute: synthesized and inherited.
The value of a synthesized attribute is calculated from the attribute values of the symbols in the production's body, as in the previous example.

Synthesized Attributes in a Tree
Example: production T → T * F, semantic action T.val := Tbody.val * Fbody.val.
With Tbody.val = 9 and Fbody.val = 5, the parent T gets val 45 (evaluated bottom-up).

Inherited Attributes
An inherited attribute for a body symbol (i.e. terminal, non-terminal) gets its value from the other body symbols and the parent value.
Inherited attributes are often used for evaluating more complex programming language features.

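Inherited Attributes in a Tree
Two examples, for a parent A with body symbols X and Y:
    X.x := function(A.a, Y.y)
    Y.y := function(A.a, X.x)
The direction of evaluation is down from the parent and across from the sibling: in the first example the value flows into X from A and Y, in the second into Y from A and X.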
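As a rough illustration of a value flowing downwards, the sketch below gives every tree node a depth attribute that is computed from its parent's depth. It is a simplified case, assuming an attribute that depends only on the parent (not on siblings); TNode, assignDepth() and depthExample() are hypothetical names, not part of the lecture code.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_KIDS 2

typedef struct TNode {
    int depth;                      /* the inherited attribute */
    struct TNode *kid[MAX_KIDS];    /* NULL where there is no child */
} TNode;

/* Top-down walk: a node receives its value from its parent,
   then passes a value derived from it on to its own children. */
void assignDepth(TNode *n, int depthFromParent)
{
    assert(n != NULL);
    n->depth = depthFromParent;
    for (int i = 0; i < MAX_KIDS; i++)
        if (n->kid[i] != NULL)
            assignDepth(n->kid[i], n->depth + 1);
}

/* root -> mid -> leaf: the leaf ends up with depth 2. */
int depthExample(void)
{
    TNode leaf = {0, {NULL, NULL}};
    TNode mid  = {0, {&leaf, NULL}};
    TNode root = {0, {&mid, NULL}};
    assignDepth(&root, 0);
    return leaf.depth;
}
```

Note the contrast with a synthesized attribute: here the parent must be processed before its children, so the walk is top-down rather than bottom-up.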

4. Attribute Grammars and yacc
yacc supports (synthesized) attribute grammars: yacc actions are semantic actions.
No parse tree is needed, since yacc evaluates the actions using the parser's built-in stack.

expr.y Again
The %token line is the declarations section; $$, $1, $2 and $3 are attributes; the code in braces is the actions.
    %token NUMBER
    %%
    exprs: expr '\n'          { printf("Value = %d\n", $1); }
         | exprs expr '\n'    { printf("Value = %d\n", $2); }
         ;
    expr: expr '+' term       { $$ = $1 + $3; }
        | expr '-' term       { $$ = $1 - $3; }
        | term                { $$ = $1; }

expr.y continued (more actions)
    term: term '*' factor     { $$ = $1 * $3; }
        | term '/' factor     { $$ = $1 / $3; }   /* integer division */
        | factor
        ;
    factor: '(' expr ')'      { $$ = $2; }
          | NUMBER

expr.y continued (C code section)
    %%
    #include "lex.yy.c"
    int yyerror(char *s)
    { fprintf(stderr, "%s\n", s);
      return 0;
    }
    int main(void)
    { yyparse();    // the syntax analyzer
      return 0;
    }

Evaluation in yacc
Input: 3 * 5 + 4\n
    Stack       val     Input        Stack Action         Semantic Action
    $           _       3*5+4\n$     shift
    $ 3         3       *5+4\n$      reduce F → num       $$ = $1 (implicit)
    $ F         3       *5+4\n$      reduce T → F         $$ = $1 (implicit)
    $ T         3       *5+4\n$      shift
    $ T *       3       5+4\n$       shift
    $ T * 5     3 5     +4\n$        reduce F → num       $$ = $1 (implicit)
    $ T * F     3 5     +4\n$        reduce T → T * F     $$ = $1 * $3
    $ T         15      +4\n$        reduce E → T         $$ = $1 (implicit)
    $ E         15      +4\n$        shift
    $ E +       15      4\n$         shift
    $ E + 4     15 4    \n$          reduce F → num       $$ = $1 (implicit)
    $ E + F     15 4    \n$          reduce T → F         $$ = $1 (implicit)
    $ E + T     15 4    \n$          reduce E → E + T     $$ = $1 + $3
    $ E         19      \n$          shift
    $ E \n      19      $            reduce Es → E \n     printf $1
    $ Es        19      $            accept
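The way the value stack moves during this trace can be imitated in a few lines of C. This is a hand-written simulation under simplifying assumptions, not the code that yacc generates: the names vals, push(), reduceBinary() and simulate() are hypothetical, and the single-symbol reductions (F → num, T → F, E → T) are modelled by simply leaving the value where it is, which is the effect of the implicit $$ = $1.

```c
#include <assert.h>

#define STACK_MAX 32

static int vals[STACK_MAX];   /* the attribute (value) stack */
static int top = 0;           /* number of values on the stack */

static void push(int v)
{
    assert(top < STACK_MAX);
    vals[top++] = v;
}

/* Reduce a three-symbol body such as T -> T * F.
   $1 and $3 are the operand slots; the operator's slot ($2) is unused. */
static void reduceBinary(char op)
{
    assert(top >= 3);
    int d1 = vals[top - 3];
    int d3 = vals[top - 1];
    top -= 3;                             /* pop the body */
    push(op == '*' ? d1 * d3 : d1 + d3);  /* push $$ */
}

/* Replays the shifts and reductions for the input 3 * 5 + 4. */
int simulate(void)
{
    top = 0;
    push(3);             /* shift 3 */
    push(0);             /* shift '*' (its value is never read) */
    push(5);             /* shift 5 */
    reduceBinary('*');   /* T -> T * F, $$ = $1 * $3 */
    push(0);             /* shift '+' */
    push(4);             /* shift 4 */
    reduceBinary('+');   /* E -> E + T, $$ = $1 + $3 */
    return vals[top - 1];
}
```

simulate() returns 19, the value left on the stack when Es → E \n is reduced.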

5. A Grid Grammar A robot starts at (0,0) on a grid, and is given compass directions: n = north, s = south, e = east, w = west Evaluate the sequence of directions to work out the final position of the robot.

Example
The robot receives the directions: n e e n n w
What is the 'meaning' (semantics) of the directions? The 'meaning' is the final robot position, (1,3).
[Figure: a grid showing the robot's path from start to final, with a compass marking n, s, e, w.]

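5.1. Grid Grammar
    robot → path
    path → path step | ε
    step → e | w | s | n
Input: n w s s
[Figure: parse tree for n w s s. Because path is left-recursive, the tree leans to the left: the innermost path is ε, and each enclosing path adds one step, in the order n, w, s, s.]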

Grid Attribute Grammar
    Production           Semantic Actions
    robot → path         printf( pathbody.(x,y) )
    path → path step     path.x := pathbody.x + stepbody.dx
                         path.y := pathbody.y + stepbody.dy
    path → ε             path.(x,y) := (0,0)
    step → e             step.(dx,dy) := (1,0)
    step → w             step.(dx,dy) := (-1,0)
    step → s             step.(dx,dy) := (0,-1)
    step → n             step.(dx,dy) := (0,1)
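The semantic actions of the grid grammar can be checked with a small stand-alone C sketch that applies them directly to a string of directions, without any parser. The type and function names (Pos, Offset, stepOffset(), evalPath()) are hypothetical; skipping spaces is an assumption made so that spaced input such as "n e e n n w" can be used.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int x, y; } Pos;        /* attribute of path */
typedef struct { int dx, dy; } Offset;   /* attribute of step */

/* step -> e | w | s | n : produce the (dx,dy) attribute. */
static Offset stepOffset(char dir)
{
    Offset o = {0, 0};
    switch (dir) {
    case 'e': o.dx = 1;  break;
    case 'w': o.dx = -1; break;
    case 's': o.dy = -1; break;
    case 'n': o.dy = 1;  break;
    default:  assert(!"unknown direction");
    }
    return o;
}

/* path -> path step | empty : start at (0,0) and add each step in turn. */
Pos evalPath(const char *dirs)
{
    assert(dirs != NULL);
    Pos p = {0, 0};                      /* path -> empty */
    for (; *dirs != '\0'; dirs++) {
        if (*dirs == ' ')
            continue;                    /* ignore spaces between steps */
        Offset o = stepOffset(*dirs);
        p.x += o.dx;                     /* path.x := pathbody.x + stepbody.dx */
        p.y += o.dy;                     /* path.y := pathbody.y + stepbody.dy */
    }
    return p;
}
```

For example, evalPath("n e e n n w") gives (1,3) and evalPath("nwss") gives (-1,-1). The left-to-right loop mirrors the left-recursive path rule, where the innermost path is evaluated first.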

Data Types
The path rules use (x,y), the position of the robot. The step rules use (dx,dy), the step taken by the robot. Implementing these data types requires new features of yacc.

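Parse Tree with Actions
Input: n w s s
Evaluating bottom-up:
    path → ε                    path = (0,0)
    path → path step (n)        (0,0) + (0,1) = (0,1)
    path → path step (w)        (0,1) + (-1,0) = (-1,1)
    path → path step (s)        (-1,1) + (0,-1) = (-1,0)
    path → path step (s)        (-1,0) + (0,-1) = (-1,-1)
    robot → path                printf (-1,-1)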

5.2. Non-integer Yacc Attributes The default yacc attributes (e.g. $$, $1, etc) are integers. We want data structures for (x,y) and (dx,dy), coded as two struct types.

Defining New Types
The new types are collected together inside a %union in the yacc definitions section:
    %union{
      type1 name1;
      type2 name2;
      . . .
    }
For the grid:
    %union{
      struct { int x; int y; } pos;
      struct { int dx; int dy; } offset;
    }

Using the Types
The non-terminals that return the new types must be listed. Any tokens that use the types must be listed.
For the grid:
    %type <offset> step
    %type <pos> path
These non-terminals return values of the specified type.

Using Typed Variables
If an attribute (variable) is a record, then dotted-name notation is used to refer to its fields, e.g. $$.dx, $1.y
The default action ($$ = $1) will cause an error if $$ and $1 are not the same type.

5.3. Grid Compiler
flex turns grid.l (a flex file) into lex.yy.c. bison turns grid.y (a bison file) into grid.tab.c, which #includes lex.yy.c. gcc compiles grid.tab.c into gridEval, the executable.
    $ flex grid.l
    $ bison grid.y
    $ gcc grid.tab.c -o gridEval

Usage
    $ ./gridEval
    nwss
    Robot is at (-1,-1)
    n n n w w w s e
    Robot is at (-2,2)
    $
The lines of directions were typed by the user, followed by ctrl-D.

grid.l
    %%
    [nN]       {return NORTH;}
    [sS]       {return SOUTH;}
    [eE]       {return EAST;}
    [wW]       {return WEST;}
    [ \n\t]    ;
    %%
    int yywrap(void) { return 1; }

grid.y
    %union{                          /* type definitions */
      struct { int x; int y; } pos;
      struct { int dx; int dy; } offset;
    }
    %token EAST WEST NORTH SOUTH
    %type <offset> step              /* types used by the non-terminals */
    %type <pos> path
    %%

grid.y continued
    robot: path   { printf("Robot is at (%d,%d)\n", $1.x, $1.y); }
         ;
    path: path step   {$$.x = $1.x + $2.dx; $$.y = $1.y + $2.dy;}
        |             {$$.x = 0; $$.y = 0;}
    step: EAST    {$$.dx = 1; $$.dy = 0;}
        | WEST    {$$.dx = -1; $$.dy = 0;}
        | SOUTH   {$$.dx = 0; $$.dy = -1;}
        | NORTH   {$$.dx = 0; $$.dy = 1;}
    %%

grid.y continued (C code section)
    #include "lex.yy.c"
    int yyerror(char *s)
    { fprintf(stderr, "%s\n", s);
      return 0;
    }
    int main(void)
    { yyparse();
      return 0;
    }

6. Recursive Descent and Attributes
It is easy to add semantic actions to a recursive descent parser. In many cases, there's no need for the parser to build a parse tree in order to evaluate the attributes.
The basic translation strategy: each production becomes a function.

The function (e.g. f()) calls other functions representing its body non-terminals. Those functions return values (attributes) to f(). f() combines the values, and returns a value (attribute).
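The strategy just described can be sketched as a tiny stand-alone evaluator in C, where each grammar function returns its attribute as an int. This is a simplified illustration, not exprParse1.c: it reads characters straight from a string instead of using a scanner, handles only integers and parentheses (no variables), and uses assert() in place of real syntax-error reporting. The names cur, skipSpaces() and evalString() are hypothetical. A zero divisor leaves the result unchanged, which is an assumption chosen to mirror the "using 1 instead" behaviour shown in the usage output.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

static const char *cur;              /* next unread input character */

static void skipSpaces(void) { while (*cur == ' ') cur++; }

static int expr(void);               /* forward declaration: fact() calls it */

/* Fact => '(' Expr ')' | Int : returns expr.val or int.val */
static int fact(void)
{
    skipSpaces();
    if (*cur == '(') {
        cur++;
        int v = expr();
        skipSpaces();
        assert(*cur == ')');         /* stand-in for a syntax error */
        cur++;
        return v;
    }
    assert(isdigit((unsigned char)*cur));
    int v = 0;
    while (isdigit((unsigned char)*cur))
        v = v * 10 + (*cur++ - '0');
    return v;
}

/* Term => Fact ( (* | /) Fact )* : combines the fact values left to right */
static int term(void)
{
    int result = fact();
    for (;;) {
        skipSpaces();
        if (*cur == '*') { cur++; result *= fact(); }
        else if (*cur == '/') {
            cur++;
            int d = fact();
            if (d != 0)              /* on a zero divisor keep result as it is */
                result /= d;
        }
        else return result;
    }
}

/* Expr => Term ( (+ | -) Term )* : combines the term values left to right */
static int expr(void)
{
    int result = term();
    for (;;) {
        skipSpaces();
        if (*cur == '+') { cur++; result += term(); }
        else if (*cur == '-') { cur++; result -= term(); }
        else return result;
    }
}

/* Evaluate one expression held in the string s. */
int evalString(const char *s)
{
    assert(s != NULL);
    cur = s;
    return expr();
}
```

For instance, evalString("9 * 5 + 2") returns 47. No parse tree is built at any point: the attribute values travel back up through the function return values.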

6.1. The Expressions Parser Again
The basic LL(1) grammar:
    Stats => ( [ Stat ] \n )*
    Stat  => let ID = Expr | Expr
    Expr  => Term ( (+ | - ) Term )*
    Term  => Fact ( (* | / ) Fact ) *
    Fact  => '(' Expr ')' | Int | Id

An Expressions Program (test3.txt)
    5 + 6                            ← give answer
    let x = 2                        ← declare variable
    3 + ( (x*y)/2)    // comments
    // y
    let x = 5
    let y = x /0      // comments    ← error

6.2. Parsing with Actions
exprParse1.c is a recursive descent parser using the expressions language. It differs from exprParse0.c by having semantic actions attached to its productions. These actions evaluate the expressions, and assign values to expression variables.

Grammar with Actions
    Productions                         Actions
    Stats => ( [ Stat ] \n )*           ---
    Stat => let ID = Expr               add id to symbol table; id.val = expr.val; print( id.val );
    Stat => Expr                        print( expr.val );

    Expr => Term ( (+ | - ) Term )*     return term1.val (+| -) term2.val (+| -) ... termn.val;
    Term => Fact ( (* | / ) Fact ) *    return fact1.val (*| /) fact2.val (*| /) ... factn.val;

    Fact => '(' Expr ')'                return expr.val;
    Fact => Int                         return int.val;
    Fact => Id                          lookup id; if not found then add (id, 0) to table; return id.val;

The Symbol Table
The symbol table is a data structure used to store expression variables and their values. In exprParse1.c, it's an array of structs, with each struct holding the name of the variable and its current integer value.
[Figure: the syms[] array, each element holding an id and a value.]

6.3. Usage
    $ gcc -Wall -o exprParse1 exprParse1.c
    $ ./exprParse1 < test3.txt
    == 11
    x being declared
    x = 2
    y being declared
    == 3
    x = 5
    Error: Division by zero; using 1 instead
    y = 5
    $

6.4. exprParse1.c Callgraph
[Figure: callgraph with three labelled regions: same as in exprParse0.c; generated from grammar (now with actions); symbol table (new).]

6.5. Symbol Table Data Structures
    #define MAX_SYMS 15    // max no of variables
    typedef struct SymInfo {
      char *id;            // name of variable
      int value;           // value (an integer)
    } SymbolInfo;
    int symNum = 0;        // number of symbols stored
    SymbolInfo syms[MAX_SYMS];
[Figure: the syms[] array of id/value pairs, with 14 as its last index.]

Symbol Table Functions
    SymbolInfo *getIDEntry(void)
    /* find _OR_ create symbol table entry for current
       tokString; return a pointer to it */
    {
      SymbolInfo *si = NULL;
      if ((si = lookupID(tokString)) != NULL)   // already declared
        return si;
      // add id to table
      printf("%s being declared\n", tokString);
      return addID(tokString, 0);   // 0 is default value
    } // end of getIDEntry()

    SymbolInfo *lookupID(char *nm)
    /* is nm in the symbol table?
       return pointer to struct or NULL */
    {
      int i;
      for(i=0; i<symNum; i++)
        if (!strcmp(syms[i].id, nm))
          return &syms[i];
      return NULL;
    } // end of lookupID()

    SymbolInfo *addID(char *nm, int value)
    /* add nm and value to the symbol table;
       return pointer to struct */
    {
      if (symNum == MAX_SYMS) {
        printf("Symbol table full; cannot add %s\n", nm);
        exit(1);
      }
      syms[symNum].id = (char *) malloc(strlen(nm)+1);
      strcpy(syms[symNum].id, nm);
      syms[symNum].value = value;
      SymbolInfo *si = &syms[symNum];
      symNum++;
      return si;
    } // end of addID()

Using the Symbol Table
The grammar functions use the symbol table via the matchId() function.
    SymbolInfo *matchId(void)
    // checks current ID with symbol table
    {
      SymbolInfo *si;
      dprint("Parsing ident\n");
      if ((si = getIDEntry()) == NULL) {
        printf("Error: id is NULL on line %d\n", lineNum);
        exit(1);
      }
      match(ID);   // ok, so consume ID token
      return si;
    } // end of matchId()

6.6. Translating the Grammar Rules The same translation is carried out as before, but the code is augmented with actions. The semantic actions are translated into extra C code in the grammar functions.

The Grammar Functions main() and statements() are unchanged from exprParse0.c since they don't have any semantic actions. Functions with extra actions: statement(), expression(), term(), factor()

Unchanged Functions
    void statements(void)
    // statements ::= { [ statement] '\n' }
    {
      dprint("Parsing statements\n");
      while (currToken != SCANEOF) {
        if (currToken != NEWLINE)
          statement();
        match(NEWLINE);
      }
    } // end of statements()
    int main(void)
    {
      nextToken();
      statements();
      match(SCANEOF);
      return 0;
    }

statement() Before and After
Before, with no semantic actions:
    void statement(void)
    // statement ::= ( 'let' ID '=' EXPR ) | EXPR
    {
      if (currToken == LET) {
        match(LET);
        match(ID);
        match(ASSIGNOP);
        expression();
      }
      else
        expression();
    } // end of statement()

After, with the actions: add id to table; id.val = expr.val; print( id.val ); or print( expr.val );
    void statement(void)
    // statement ::= ( 'let' ID '=' EXPR ) | EXPR
    {
      SymbolInfo *si;
      int value;
      dprint("Parsing statement\n");
      if (currToken == LET) {
        match(LET);
        si = matchId();        // was match(ID);
        match(ASSIGNOP);
        value = expression();
        si->value = value;
        printf("%s = %d\n", si->id, value);
      }
      else {   // expression
        value = expression();
        printf("== %d\n", value);
      }
    } // end of statement()

expression() Before and After
Before, with no semantic actions:
    void expression(void)
    // expression ::= term ( ('+'|'-') term )*
    {
      term();
      while((currToken == PLUSOP) || (currToken == MINUSOP)) {
        match(currToken);
        term();
      }
    } // end of expression()

After, with the action: return term1.val (+| -) term2.val (+| -) ... termn.val;
    int expression(void)
    // expression ::= term ( ('+'|'-') term )*
    {
      int result, v2;
      int isAddOp;
      dprint("Parsing expression\n");
      result = term();
      while((currToken == PLUSOP) || (currToken == MINUSOP)) {
        isAddOp = (currToken == PLUSOP) ? 1 : 0;
        match(currToken);
        v2 = term();
        if (isAddOp == 1)   // addition
          result += v2;
        else                // subtraction
          result -= v2;
      }
      return result;
    } // end of expression()

term() Before and After
Before, with no semantic actions:
    void term(void)
    // term ::= factor ( ('*'|'/') factor )*
    {
      factor();
      while((currToken == MULTOP) || (currToken == DIVOP)) {
        match(currToken);
        factor();
      }
    } // end of term()

After, with the action: return fact1.val (*| / ) fact2.val (*| / ) ... factn.val;
    int term(void)
    // term ::= factor ( ('*'|'/') factor )*
    {
      int result, v2;
      int isMultOp;
      dprint("Parsing term\n");
      result = factor();
      while((currToken == MULTOP) || (currToken == DIVOP)) {
        isMultOp = (currToken == MULTOP) ? 1 : 0;
        match(currToken);
        v2 = factor();
        if (isMultOp == 1)   // multiplication
          result *= v2;
        else {               // division
          if (v2 == 0)
            printf("Error: Division by zero; using 1 instead\n");
          else
            result = result / v2;
        }
      }
      return result;
    } // end of term()

factor() Before and After
Before, with no semantic actions:
    void factor(void)
    // factor ::= '(' expression ')' | INT | ID
    {
      if(currToken == LPAREN) {
        match(LPAREN);
        expression();
        match(RPAREN);
      }
      else if(currToken == INT)
        match(INT);
      else if (currToken == ID)
        match(ID);
      else
        syntax_error(currToken);
    } // end of factor()

After, with the actions: return expr.val; or return int.val; or add id to table (if new); return id.val;
    int factor(void)
    // factor ::= '(' expression ')' | INT | ID
    {
      int result = 0;
      dprint("Parsing factor\n");
      if(currToken == LPAREN) {
        match(LPAREN);
        result = expression();
        match(RPAREN);
      }
      else if(currToken == INT) {
        match(INT);
        result = currTokValue;
      }
      else if (currToken == ID) {
        SymbolInfo *si = matchId();
        result = si->value;
      }
      else
        syntax_error(currToken);
      return result;
    } // end of factor()