CH4.1 CSE244 Bottom Up Translation (revisited) Aggelos Kiayias Computer Science & Engineering Department The University of Connecticut 371 Fairfield Road,


CH4.1 CSE244 Bottom Up Translation (revisited) Aggelos Kiayias Computer Science & Engineering Department The University of Connecticut 371 Fairfield Road, Unit 1155 Storrs, CT

CH4.2 CSE244 The Picture So Far
• We have already discussed "top-down translation."
  • It works exceptionally well with L-attributed definitions (and their corresponding translation schemes).
  • Nevertheless, we must make sure that the underlying grammar is suitable for predictive parsing: the grammar has no left recursion and is left-factored.
• We also discussed "bottom-up translation" for S-attributed definitions.
  • Simple idea: use additional stack space to store attribute values for nonterminals. During reduce actions, update the attributes on the stack accordingly.

CH4.3 CSE244 Inherited Attributes and Bottom Up Translation
• Some inherited attributes might not be available when we are reducing by a certain production. Consider the translation scheme:

    PRODUCTION       SEMANTIC RULE
    D → T L          L.in = T.type
    T → int          T.type = integer
    T → real         T.type = real
    L → L1 , id      L1.in = L.in
                     addtype(id.entry, L.in)
    L → id           addtype(id.entry, L.in)

• Attempt bottom-up parsing over: real id1, id2, id3

CH4.4 CSE244 Parsing Example

    STACK                                             INPUT         ACTION
    $                                                 real a,b,c$   SHIFT
    $ [real, lexval='real']                           a,b,c$        REDUCE T → real (modify type)
    $ [T, type='real']                                a,b,c$        SHIFT
    $ [T, type='real'] [id, lexval='a']               ,b,c$         REDUCE L → id (requires L.in)
    $ [T, type='real'] [L, …]                         ,b,c$         SHIFT
    $ [T, type='real'] [L, …] , [id, lexval='b']      ,c$           REDUCE L → L , id (requires L.in)
    $ [T, type='real'] [L, …]                         ,c$           SHIFT
    $ [T, type='real'] [L, …] , [id, lexval='c']      $             REDUCE L → L , id (requires L.in)

The value of L.in does not need to be stored: at every reduction that requires it, T.type is still on the stack, so we can look into the stack and recover the intended value.

CH4.5 CSE244 Bottom Up Translation with Inherited Attributes
Try to predict the location in the stack where you can recover the value of the inherited attribute you need.

    PRODUCTION       SEMANTIC RULE
    D → T L
    T → int          val[ntop] = integer
    T → real         val[ntop] = real
    L → L1 , id      addtype(id.entry, val[top-3])
    L → id           addtype(id.entry, val[top-1])

Dangerous stuff!!!
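The stack offsets in the table above can be checked with a small hand-driven simulation. The following is only a sketch in C, not the parser's actual machinery: the `Entry` layout, the fixed-size symbol table, and the function names (`reduce_L_id`, `reduce_L_L_comma_id`, `declare_real_abc`, `type_of`) are assumptions made for illustration. It replays the moves for `real a, b, c` and reads T.type from `top-1` and `top-3`, exactly where the semantic rules expect it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_DEPTH 16
#define MAX_IDS 8

/* One stack entry: a grammar symbol plus its single attribute value. */
typedef struct {
    char symbol;        /* 'T', 'L', ',' or 'i' (for id) */
    const char *val;    /* type name for T, lexeme for id, NULL otherwise */
} Entry;

Entry stack[MAX_DEPTH];
int top = -1;

/* Hypothetical symbol table: parallel arrays of names and types. */
const char *names[MAX_IDS];
const char *types[MAX_IDS];
int n_ids = 0;

void push(char symbol, const char *val) {
    assert(top + 1 < MAX_DEPTH);
    top++;
    stack[top].symbol = symbol;
    stack[top].val = val;
}

void addtype(const char *name, const char *type) {
    assert(n_ids < MAX_IDS);
    names[n_ids] = name;
    types[n_ids] = type;
    n_ids++;
}

/* Reduce by L -> id: the stack is ... T id, so T.type sits at top-1. */
void reduce_L_id(void) {
    assert(top >= 1 && stack[top - 1].symbol == 'T');
    addtype(stack[top].val, stack[top - 1].val);
    top -= 1;                 /* pop id */
    push('L', NULL);          /* push the left-hand side */
}

/* Reduce by L -> L , id: the stack is ... T L , id, so T.type sits at top-3. */
void reduce_L_L_comma_id(void) {
    assert(top >= 3 && stack[top - 3].symbol == 'T');
    addtype(stack[top].val, stack[top - 3].val);
    top -= 3;                 /* pop L , id */
    push('L', NULL);
}

/* Replays the parser moves for the input: real a, b, c */
int declare_real_abc(void) {
    top = -1;
    n_ids = 0;
    push('T', "real");        /* shift real, then reduce T -> real */
    push('i', "a"); reduce_L_id();
    push(',', NULL); push('i', "b"); reduce_L_L_comma_id();
    push(',', NULL); push('i', "c"); reduce_L_L_comma_id();
    return n_ids;
}

/* Looks up the recorded type of a name, or NULL if it was never declared. */
const char *type_of(const char *name) {
    for (int i = 0; i < n_ids; i++)
        if (strcmp(names[i], name) == 0)
            return types[i];
    return NULL;
}
```

The asserts inside the reduce functions make the danger concrete: the offsets are only valid because every path that reaches these reductions has T directly below L (or below id) on the stack. A grammar change that breaks that invariant silently breaks the offsets.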

CH4.6 CSE244 A Brief Look into Yacc
• Terminals and nonterminals in Yacc each have a "semantic value" (one attribute).
  • The attribute type is specified by YYSTYPE.
  • Different types for different attributes are achieved through %union.
• The semantic value of the current terminal is passed by Lex in the variable yylval (set in the Lex actions).
• Semantic values for a certain production are specified by the symbols $$, $1, $2, $3, …
  • For a certain production, $$ is the semantic value of the LHS, and $1, $2, $3, … denote the semantic values of the items in the RHS.
  • One can also use $0, $-1, $-2, … to peek into the Yacc stack (beyond the current production).
  • Dangerous stuff!!!

CH4.7 CSE244 A Brief Look into Yacc, II
• Frequently one needs more versatility in defining the semantic values of nonterminals (i.e., several different attributes).
• We define a collection of data types using %union.
• We determine the attribute type of each symbol by using the %type declaration.

CH4.8 CSE244 Defining Attributes in Lex/Yacc

    %{
    #include <stdio.h>
    typedef struct { int value; } myattribute;
    %}

    %union {
        int number_type;
        int ident_type;
        myattribute myattribute_type;
    }

    %token <ident_type> ID
    %token <number_type> NUM
    %type <myattribute_type> expr
    %left '+'
    %%
    expr : ID            { printf("%d\n", $1); $$.value = $1; }
         | NUM           { printf("%d\n", $1); $$.value = $1; }
         | expr '+' expr { $$.value = $1.value + $3.value;
                           printf("%d\n", $$.value); }
         ;
    %%

    *************** LEX FILE *****************

    id   [A-Za-z][A-Za-z0-9]*
    num  [0-9]+
    ws   [ \t]+
    %%
    {ws}    /* do nothing */
    {id}    { yylval.ident_type = 44; return ID; }
    {num}   { yylval.number_type = atoi(yytext); return NUM; }
    .       { return yytext[0]; }
    %%
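It may help to see what the %union declaration amounts to on the C side. Yacc turns it into a C union used as the semantic value type, and a typed `$1` becomes an access to the matching member. The sketch below mimics that by hand; the type name `SemValue` and the helper functions are assumptions for illustration, not the names Yacc generates.

```c
#include <assert.h>

typedef struct { int value; } myattribute;

/* Hand-written stand-in for the union that %union describes. */
typedef union {
    int number_type;
    int ident_type;
    myattribute myattribute_type;
} SemValue;

/* What the lexer does for a NUM token: fill in the number_type member. */
SemValue lex_num(int n) {
    SemValue v;
    v.number_type = n;
    return v;
}

/* expr : NUM { $$.value = $1; }
 * $1 is read through number_type, $$ is written through myattribute_type. */
SemValue reduce_expr_num(SemValue num) {
    SemValue result;
    result.myattribute_type.value = num.number_type;
    return result;
}

/* expr : expr '+' expr { $$.value = $1.value + $3.value; } */
SemValue reduce_expr_plus(SemValue left, SemValue right) {
    SemValue result;
    result.myattribute_type.value =
        left.myattribute_type.value + right.myattribute_type.value;
    return result;
}
```

Because all members share the same storage, the declared tag of each symbol is what tells the generated parser which member to read; declaring the wrong tag would reinterpret the bytes rather than convert the value.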

CH4.9 CSE244 The Stack of Yacc
• Yacc employs and maintains a stack that contains all semantic values/attributes.
• When Yacc makes a shift action:
  • It pushes onto the stack the corresponding token identifier along with its semantic value, whose type is determined by YYSTYPE and/or the %token declaration. The value is provided by yylval.
• When Yacc makes a reduce action for a production

      A : X1 X2 { $$ = f($1, $2); } ;

  • the stack is interpreted as: … [X1, $1] [X2, $2]
  • I.e., Yacc pops two stack elements and uses their semantic values to fill $1 and $2.
  • After the action: … [A, $$]
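The shift and reduce steps described above can be sketched in a few lines of C. This is a simplification under stated assumptions: a real Yacc parser keeps parser states on its stack rather than bare symbols, and the names `Slot`, `shift`, `reduce2`, and `add` are hypothetical.

```c
#include <assert.h>

#define STACK_MAX 32

/* One stack slot: a symbol code and its semantic value. */
typedef struct {
    int symbol;
    int value;
} Slot;

Slot slots[STACK_MAX];
int depth = 0;

/* Shift: push the token together with the value the lexer left in yylval. */
void shift(int token, int yylval_value) {
    assert(depth < STACK_MAX);
    slots[depth].symbol = token;
    slots[depth].value = yylval_value;
    depth++;
}

/* Reduce by A : X1 X2 { $$ = f($1, $2); }
 * The two topmost slots supply $1 and $2; they are replaced by [A, $$]. */
void reduce2(int lhs, int (*f)(int, int)) {
    assert(depth >= 2);
    int v1 = slots[depth - 2].value;   /* $1 */
    int v2 = slots[depth - 1].value;   /* $2 */
    depth -= 2;                        /* pop X1 and X2 */
    slots[depth].symbol = lhs;
    slots[depth].value = f(v1, v2);    /* $$ */
    depth++;
}

/* Example semantic function f. */
int add(int a, int b) {
    return a + b;
}
```

For instance, shifting two tokens with values 4 and 6 and then calling `reduce2` with `add` leaves a single slot holding the left-hand side symbol with value 10.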

CH4.10 CSE244 Attributes and Yacc
• The standard rule is that we cannot refer to any semantic value or attribute to the "right" of the action.
  • E.g., the following will produce an error:

        A : B C { $4 = f($1, $2) } D ;

• Usually we do the computation for $$ at the end of a production.
• Attributes in Yacc can be inherited in the following sense:

        A : B C { $2 = f($1) } ;

  But this is of limited use.

CH4.11 CSE244 Attributes and Yacc, II

    PRODUCTION       SEMANTIC RULE
    D → T L          L.in = T.type
    T → int          T.type = integer
    T → real         T.type = real
    L → L1 , id      L1.in = L.in
                     addtype(id.entry, L.in)
    L → id           addtype(id.entry, L.in)

    %type T
    %type L
    %%
    D : T { $2 = $1 } L ;

This is no good. Instead we opt to look into the stack:

    L : ID_TOKEN { addtype($1, $0) } ;

Dangerous stuff!!!

CH4.12 CSE244 … but there is a more "sane" way to go
• Use variables. When you code these productions:

    T → int          T.type = integer
    T → real         T.type = real

  write:

    T : REAL_TOKEN { current_type = $1 } ;
    T : INT_TOKEN  { current_type = $1 } ;

• Then, when the time comes for the production

    L → id           addtype(id.entry, L.in)

  code it as:

    L : ID_TOKEN { addtype($1, current_type) } ;

• It is easy to see that current_type will hold the most recent type occurrence.
• As a rule of thumb, keep in mind the DFS traversal of the parse tree.
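The variable-based approach can be mimicked outside Yacc to see why it works. The sketch below is an illustration only: `on_type` and `on_id` stand in for the bodies of the two kinds of actions, and the small array-based symbol table with `lookup_type` is an assumption, not part of the slides.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { MAX_SYMBOLS = 16 };

/* Set when a type keyword is reduced, read when an identifier is reduced. */
const char *current_type = NULL;

/* Hypothetical symbol table: parallel arrays of names and types. */
const char *sym_name[MAX_SYMBOLS];
const char *sym_type[MAX_SYMBOLS];
int sym_count = 0;

/* Body of  T : REAL_TOKEN { ... }  and  T : INT_TOKEN { ... }. */
void on_type(const char *type_name) {
    current_type = type_name;
}

/* Body of  L : ID_TOKEN { addtype($1, current_type) }. */
void on_id(const char *name) {
    assert(current_type != NULL);      /* a type must have been seen first */
    assert(sym_count < MAX_SYMBOLS);
    sym_name[sym_count] = name;
    sym_type[sym_count] = current_type;
    sym_count++;
}

/* Returns the recorded type of a name, or NULL if it is unknown. */
const char *lookup_type(const char *name) {
    for (int i = 0; i < sym_count; i++)
        if (strcmp(sym_name[i], name) == 0)
            return sym_type[i];
    return NULL;
}
```

The calls happen in the order of the reductions, which follows the depth-first traversal of the parse tree: the type keyword is reduced before any identifier in its declaration, so `current_type` is always set in time.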

CH4.13 CSE244 Always prefer the sane way.
• Unless you want to prove to your friends what a yacc-freak you are,
  • + you want to show that you really understand grammars,
  • + you want to make it really hard for other people to understand your Yacc programs.
• IN THIS CASE prefer using $0, $-1, $-2, …
• The most negative index used in a piece of Yacc code earns the higher yacc-geekiness degree.

CH4.14 CSE244 Translation with Yacc (Looking Ahead)
• For a programming language:
  • Target an intermediate language.
  • Not really assembly, but very close to it.
• Restricted set of commands:
  • Assignments with two operands, e.g. X=Y+Z
  • Goto (jump statements)
  • Conditional gotos using only two variables, e.g. If X>Y Goto LABEL
  • Push and pop statements (stack)

CH4.15 CSE244 Translation with Yacc
• Define the main attribute of any construct to be a char buffer of a certain size, plus any additional attributes:

    typedef struct { char* translation; int var; } myattribute;

• Then define the semantic actions appropriately, e.g.:

    expr : NUM
             { varcounter++;
               append($$.translation, "a", varcounter, "=", $1);
               $$.var = varcounter; }
         | expr '+' expr
             { varcounter++;
               append($$.translation, $1.translation);
               append($$.translation, $3.translation);
               append($$.translation, "a", varcounter, "=",
                      "a", $1.var, "+", "a", $3.var);
               $$.var = varcounter; }
         ;
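The two actions above can be tried out as plain C functions. This is a sketch under assumptions, not the course's implementation: it uses a fixed-size array instead of a `char*` buffer, replaces the slide's `append` helper with `snprintf`, separates instructions with newlines, and the names `Attr`, `gen_num`, and `gen_add` are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define BUF_SIZE 256

/* Attribute of an expression: its translation so far and the number of
 * the temporary variable that holds its result. */
typedef struct {
    char translation[BUF_SIZE];
    int var;
} Attr;

int varcounter = 0;

/* expr : NUM   emits  a<k>=<n>  for a fresh temporary a<k>. */
Attr gen_num(int n) {
    Attr out;
    out.var = ++varcounter;
    int len = snprintf(out.translation, BUF_SIZE, "a%d=%d\n", out.var, n);
    assert(len >= 0 && len < BUF_SIZE);   /* translation must fit */
    return out;
}

/* expr : expr '+' expr   emits the code of both operands, followed by
 * a<k>=a<i>+a<j>  for a fresh temporary a<k>. */
Attr gen_add(const Attr *left, const Attr *right) {
    Attr out;
    out.var = ++varcounter;
    int len = snprintf(out.translation, BUF_SIZE, "%s%sa%d=a%d+a%d\n",
                       left->translation, right->translation,
                       out.var, left->var, right->var);
    assert(len >= 0 && len < BUF_SIZE);   /* translation must fit */
    return out;
}
```

Starting from `varcounter == 0`, translating 2 + 3 this way yields the three instructions `a1=2`, `a2=3`, `a3=a1+a2`, each an assignment with at most two operands, as the intermediate language requires.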