CH4.1 CSE244 Bottom Up Translation Aggelos Kiayias Computer Science & Engineering Department The University of Connecticut 371 Fairfield Road, Unit 1155 Storrs, CT
CH4.2 CSE244 S-Attributed Definitions
PRODUCTION          SEMANTIC RULE
L -> E n            print(E.val)
E -> E1 + T         E.val = E1.val + T.val
E -> T              E.val = T.val
T -> T1 * F         T.val = T1.val * F.val
T -> F              T.val = F.val
F -> ( E )          F.val = E.val
F -> digit          F.val = digit.lexval
CH4.3 CSE244 Semantic Rules Using the Stack of the S/R parser
PRODUCTION          SEMANTIC RULE
L -> E n            print(val[top-1])
E -> E1 + T         val[ntop] = val[top-2] + val[top]
E -> T              val[ntop] = val[top]
T -> T1 * F         val[ntop] = val[top-2] * val[top]
T -> F              val[ntop] = val[top]
F -> ( E )          val[ntop] = val[top-1]
F -> digit          val[ntop] = val[top]
top = top of the stack before the reduction.
ntop = top of the stack after the right-hand side of the production is popped and the left-hand side is pushed (ntop = top - r + 1 for a right-hand side of r symbols).
val[…] = attribute value of the stack contents.
CH4.5 CSE244 A trace of a Shift Reduce Parser with Attribute Evaluation
STACK                        INPUT       ACTION
$                            3*5+4n$     shift
$[digit,3]                   *5+4n$      reduce F -> digit
$[F,3]                       *5+4n$      reduce T -> F
$[T,3]                       *5+4n$      shift
$[T,3][*,.]                  5+4n$       shift
$[T,3][*,.][digit,5]         +4n$        reduce F -> digit
$[T,3][*,.][F,5]             +4n$        reduce T -> T*F
$[T,15]                      +4n$        reduce E -> T
$[E,15]                      +4n$        shift
$[E,15][+,.]                 4n$         shift
$[E,15][+,.][digit,4]        n$          reduce F -> digit
$[E,15][+,.][F,4]            n$          reduce T -> F
$[E,15][+,.][T,4]            n$          reduce E -> E+T
$[E,19]                      n$          shift
$[E,19][n,.]                 $           reduce L -> E n ; print(19)
$[L,.]                       $           ACCEPT
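The trace above can be replayed by hand in C. This is only a minimal sketch, not a generated parser: the names val, top, shift, reduce_binary and replay_trace are assumed for illustration, and the sequence of moves is hard-coded from the trace rather than driven by a parse table.

```c
#include <assert.h>

#define STACK_MAX 64

static int val[STACK_MAX];  /* attribute values, parallel to the parser stack */
static int top = -1;        /* index of the top stack entry; -1 means empty   */

/* Shift: push a symbol's attribute (0 stands for the "." of * + n). */
static void shift(int attribute) {
    assert(top + 1 < STACK_MAX);
    val[++top] = attribute;
}

/* Reduce by X -> Y op Z: the right-hand side occupies val[top-2..top].
 * ntop is the top after the three RHS entries are replaced by X. */
static void reduce_binary(char op) {
    assert(top >= 2);
    int ntop = top - 2;
    val[ntop] = (op == '+') ? val[top - 2] + val[top]
                            : val[top - 2] * val[top];
    top = ntop;
}

/* Unit reductions (F -> digit, T -> F, E -> T) copy val[top] to val[ntop]
 * with ntop == top, so they leave the value stack unchanged and are
 * only mentioned in the comments below. */

/* Replays the moves of the trace for the input 3*5+4 n. */
int replay_trace(void) {
    top = -1;
    shift(3);            /* digit 3; then F -> digit, T -> F  */
    shift(0);            /* *                                 */
    shift(5);            /* digit 5; then F -> digit          */
    reduce_binary('*');  /* T -> T * F; then E -> T           */
    shift(0);            /* +                                 */
    shift(4);            /* digit 4; then F -> digit, T -> F  */
    reduce_binary('+');  /* E -> E + T                        */
    shift(0);            /* n                                 */
    return val[top - 1]; /* L -> E n: E's value sits just below n */
}
```

Calling replay_trace() walks the value stack through the same states as the table, ending with the value that the L -> E n action would print.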
CH4.6 CSE244 Inherited Attributes and Bottom Up Translation
Some inherited attributes might not be available when we are reducing by a certain production. Consider the translation scheme:
PRODUCTION          SEMANTIC RULE
D -> T L            L.in = T.type
T -> int            T.type = integer
T -> real           T.type = real
L -> L1 , id        L1.in = L.in
                    addtype(id.entry, L.in)
L -> id             addtype(id.entry, L.in)
Attempt bottom-up parsing over: real id1, id2, id3
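CH4.7 CSE244 Example
[Figure: annotated parse tree for the input real id1, id2, id3. The root D has children T (type = real) and L (in = real). Every L node carries in = real, the id leaves carry entry = id1, id2, id3, and the actions addtype(id1, real), addtype(id2, real), addtype(id3, real) are attached to the L nodes.]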
CH4.8 CSE244 Parsing Example
STACK                                               INPUT      ACTION
$                                                   real a,b,c$  SHIFT
$[real, lexval='real']                              a,b,c$     REDUCE T -> real (modify type)
$[T, type='real']                                   a,b,c$     SHIFT
$[T, type='real'] [id, lexval='a']                  ,b,c$      REDUCE L -> id (requires L.in)
$[T, type='real'] [L, …]                            ,b,c$      SHIFT
$[T, type='real'] [L, …] , [id, lexval='b']         ,c$        REDUCE L -> L , id (requires L.in)
$[T, type='real'] [L, …]                            ,c$        SHIFT
$[T, type='real'] [L, …] , [id, lexval='c']         $          REDUCE L -> L , id (requires L.in)
Storing the value of L.in is not necessary: we might look into the stack and recover its intended value from the T entry.
CH4.9 CSE244 Bottom Up Translation with Inherited Attributes
Try to predict the stack location from which you can recover the value of the inherited attribute you need.
PRODUCTION          SEMANTIC RULE
D -> T L
T -> int            val[ntop] = integer
T -> real           val[ntop] = real
L -> L1 , id        addtype(id.entry, val[top-3])
L -> id             addtype(id.entry, val[top-1])
Dangerous Stuff !!!
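The fixed offsets work because T always sits directly below the right-hand side of L on the stack. The C sketch below replays the parse of real id1, id2, id3 under that assumption. The Entry type, the array-based symbol table behind addtype, and the helper names are all hypothetical; the asserts make the "T is where we expect it" assumption explicit.

```c
#include <assert.h>
#include <string.h>

#define STACK_MAX 16
#define MAX_IDS 8

/* One stack entry: a grammar symbol plus its single attribute.
 * For T the attribute is the type name, for id it is the identifier name. */
typedef struct { char sym; const char *attr; } Entry;

static Entry stack[STACK_MAX];
static int top = -1;

/* Hypothetical symbol table: parallel arrays filled by addtype. */
static const char *id_name[MAX_IDS];
static const char *id_type[MAX_IDS];
static int n_ids = 0;

static void addtype(const char *name, const char *type) {
    assert(n_ids < MAX_IDS);
    id_name[n_ids] = name;
    id_type[n_ids] = type;
    n_ids++;
}

static void push(char sym, const char *attr) {
    assert(top + 1 < STACK_MAX);
    top++;
    stack[top].sym = sym;
    stack[top].attr = attr;
}

/* L -> id: the stack is ... T id, so T.type is at top-1. */
static void reduce_L_id(void) {
    assert(top >= 1 && stack[top - 1].sym == 'T');
    addtype(stack[top].attr, stack[top - 1].attr);
    top -= 1;           /* pop id */
    push('L', NULL);    /* push L */
}

/* L -> L , id: the stack is ... T L , id, so T.type is at top-3. */
static void reduce_L_list(void) {
    assert(top >= 3 && stack[top - 3].sym == 'T');
    addtype(stack[top].attr, stack[top - 3].attr);
    top -= 3;           /* pop L , id */
    push('L', NULL);    /* push L */
}

/* Replays the parse of: real id1, id2, id3 */
void declare_three(void) {
    top = -1;
    n_ids = 0;
    push('T', "real");                                   /* after T -> real */
    push('i', "id1"); reduce_L_id();
    push(',', NULL); push('i', "id2"); reduce_L_list();
    push(',', NULL); push('i', "id3"); reduce_L_list();
}

/* Returns the recorded type of an identifier, or NULL if it is unknown. */
const char *lookup_type(const char *name) {
    for (int i = 0; i < n_ids; i++)
        if (strcmp(id_name[i], name) == 0) return id_type[i];
    return NULL;
}
```

If the grammar changed so that some other symbol could appear between T and L, the asserts would fire, which is exactly why the offsets are "dangerous stuff".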
CH4.10 CSE244 A Brief Look into Yacc
Terminals and non-terminals in Yacc each have a "semantic value" (one attribute).
The attribute type is specified by YYSTYPE. Different typing for attributes is achieved through %union.
The semantic value of the current terminal is passed from Lex in the variable yylval (assigned in the Lex actions).
Semantic values for a certain production are referred to by the symbols $$, $1, $2, $3, …
For a certain production, $$ is the semantic value of the LHS, and $1, $2, $3, … denote the semantic values of the items in the RHS.
One can also use $0, $-1, $-2, … to peek into the Yacc stack (beyond the current production). Dangerous Stuff !!!
CH4.11 CSE244 A Brief Look into Yacc, II
Frequently one needs more versatility in defining the semantic values of non-terminals (i.e., several different attributes).
We define a collection of data types using %union.
We determine the attribute type of each non-terminal with a %type declaration, and of each terminal with a typed %token declaration.
CH4.12 CSE244 Defining Attributes in Lex/Yacc
%{
#include <stdio.h>   /* printf */
typedef struct { int value; } myattribute;
%}
%union {
    int number_type;
    int ident_type;
    myattribute myattribute_type;
}
%token <ident_type> ID
%token <number_type> NUM
%type <myattribute_type> expr
%left '+'
%%
expr : ID              { printf("%d\n", $1); $$.value = $1; }
     | NUM             { printf("%d\n", $1); $$.value = $1; }
     | expr '+' expr   { $$.value = $1.value + $3.value; printf("%d\n", $$.value); }
     ;
%%
*************** LEX FILE *****************
id  [A-Za-z][A-Za-z0-9]*
num [0-9]+
ws  [ \t]+
%%
{ws}    /* do nothing */
{id}    { yylval.ident_type = 44; return ID; }
{num}   { yylval.number_type = atoi(yytext); return NUM; }
.       { return yytext[0]; }
%%
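Conceptually, %union turns YYSTYPE into a C union, and the tag attached to a symbol selects which member a reference such as $1 or $$ denotes. The following hand-written C sketch illustrates that idea; it is not actual Yacc output, and the names SemValue, action_num, action_add and evaluate_sum are assumed for illustration.

```c
typedef struct { int value; } myattribute;

/* Plays the role of YYSTYPE as declared by the %union on the slide. */
typedef union {
    int number_type;
    int ident_type;
    myattribute myattribute_type;
} SemValue;

/* expr : NUM   { $$.value = $1; }
 * $1 belongs to NUM, so it is read through the number_type member;
 * $$ belongs to expr, so it is written through myattribute_type. */
SemValue action_num(SemValue dollar1) {
    SemValue result;
    result.myattribute_type.value = dollar1.number_type;
    return result;
}

/* expr : expr '+' expr   { $$.value = $1.value + $3.value; } */
SemValue action_add(SemValue dollar1, SemValue dollar3) {
    SemValue result;
    result.myattribute_type.value =
        dollar1.myattribute_type.value + dollar3.myattribute_type.value;
    return result;
}

/* Mimics lexer and parser cooperating on an input of the form NUM + NUM:
 * the lexer stores each number in the number_type member (as the Lex rule
 * for {num} does with yylval), then the parser applies the two actions. */
int evaluate_sum(int a, int b) {
    SemValue tok_a, tok_b;
    tok_a.number_type = a;
    tok_b.number_type = b;
    SemValue sum = action_add(action_num(tok_a), action_num(tok_b));
    return sum.myattribute_type.value;
}
```

Reading a union through the wrong member is the C-level mistake that a wrong or missing tag in a %token or %type declaration would cause.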
CH4.13 CSE244 The Stack of Yacc
Yacc maintains a stack that holds all semantic values (attributes).
When Yacc makes a shift action, it pushes onto the stack the corresponding token identifier along with its semantic value, typed as determined by YYSTYPE and/or the %token declaration. The value is provided by yylval.
When Yacc makes a reduce action for a production
A: X1 X2 { $$ = f($1, $2) };
the stack is interpreted as: … [X1, $1] [X2, $2]
I.e., Yacc pops two stack elements and uses their semantic values to fill $1, $2.
After the action: … [A, $$]
CH4.14 CSE244 Attributes and Yacc
The standard rule is that an action cannot refer to any semantic value or attribute to its "right", i.e. to a symbol that has not been parsed yet when the action runs.
E.g. the following will produce an error:
A: B C { $4 = f($1, $2) } D;
Usually we do the computation for $$ at the end of a production.
Attributes in Yacc can be inherited in the following sense:
A: B C { $2 = f($1) };
But this is of limited use.
CH4.15 CSE244 Attributes and Yacc, II
PRODUCTION          SEMANTIC RULE
D -> T L            L.in = T.type
T -> int            T.type = integer
T -> real           T.type = real
L -> L1 , id        L1.in = L.in
                    addtype(id.entry, L.in)
L -> id             addtype(id.entry, L.in)
%type T
%type L
%%
D: T { $3 = $1 } L;
This is no good. Instead we opt to look into the stack:
L: ID_TOKEN { addtype($1, $0) };
Dangerous Stuff !!!
CH4.16 CSE244 … but there is another way to go
Use variables.
When you code these productions:
T -> int            T.type = integer
T -> real           T.type = real
write them as:
T: REAL_TOKEN { current_type = $1 };
T: INT_TOKEN { current_type = $1 };
Then, when the time comes for the production:
L -> id             addtype(id.entry, L.in)
code it as:
L: ID_TOKEN { addtype($1, current_type) };
It is easy to see that current_type will hold the most recent type occurrence.
As a rule of thumb, keep in mind the DFS traversal of the parse tree: actions run in the order in which that traversal finishes the corresponding nodes.
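The effect of the global variable can be imitated in plain C by calling the actions in the order a bottom-up parser would run them. This is a sketch under assumptions: the Type enum, the array-based symbol table behind addtype, and the idea of parsing two declarations in a row (real a, b followed by int c) are all added for illustration and do not come from the grammar above, which has a single D.

```c
#include <string.h>

#define MAX_IDS 8

enum Type { TYPE_NONE, TYPE_INT, TYPE_REAL };

static enum Type current_type = TYPE_NONE;  /* set by the T actions */

/* Hypothetical symbol table: parallel arrays filled by addtype. */
static const char *id_name[MAX_IDS];
static enum Type id_type[MAX_IDS];
static int n_ids = 0;

/* Stand-in for the slide's addtype; silently ignores overflow. */
static void addtype(const char *name, enum Type type) {
    if (n_ids < MAX_IDS) {
        id_name[n_ids] = name;
        id_type[n_ids] = type;
        n_ids++;
    }
}

/* T: REAL_TOKEN { current_type = $1 };  and  T: INT_TOKEN { current_type = $1 }; */
void action_T(enum Type token_value) { current_type = token_value; }

/* L: ID_TOKEN { addtype($1, current_type) };
 * The same body would serve the production L -> L , id. */
void action_L_id(const char *name) { addtype(name, current_type); }

/* Actions in reduction order for: real a, b   then   int c */
void run_declarations(void) {
    n_ids = 0;
    action_T(TYPE_REAL);
    action_L_id("a");
    action_L_id("b");
    action_T(TYPE_INT);
    action_L_id("c");
}

/* Returns the recorded type of an identifier, or TYPE_NONE if unknown. */
enum Type lookup_type(const char *name) {
    for (int i = 0; i < n_ids; i++)
        if (strcmp(id_name[i], name) == 0) return id_type[i];
    return TYPE_NONE;
}
```

Because T is always reduced before any id of the same declaration, each identifier picks up the type that was set most recently.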
CH4.17 CSE244 Which way to use?
If you want to prove to your friends what a yacc-freak you are
+ you want to show that you really understand grammars
+ you want to make it really hard for other people to understand your Yacc programs,
IN THIS CASE prefer using $0, $-1, $-2, …
The most negative index used in your Yacc code earns the highest yacc-geekiness degree.
CH4.18 CSE244 Translation with Yacc (Looking Ahead)
For a programming language: target an Intermediate Language.
Not really assembly, but very close to it.
Restricted set of commands:
    Assignments with two operands, e.g. X=Y+Z
    Goto (jump statements)
    Conditional Goto's using only two variables, e.g. If X>Y Goto LABEL
    Push, Pop statements (stack)
CH4.19 CSE244 Translation with Yacc
Define the main attribute of any construct to be a char buffer of a certain size, plus any additional fields:
typedef struct { char* translation; int var; } myattribute;
Then define the semantic actions appropriately, e.g.:
expr : NUM             { varcounter++;
                         append($$.translation, "a", varcounter, "=", $1);
                         $$.var = varcounter; }
     | expr '+' expr   { varcounter++;
                         append($$.translation, $1.translation);
                         append($$.translation, $3.translation);
                         append($$.translation, "a", varcounter, "=", "a", $1.var, "+", "a", $3.var);
                         $$.var = varcounter; }
     ;
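The two actions above can be tried out as ordinary C functions. The sketch below makes several assumptions that are not on the slide: the translation is a fixed 256-byte array rather than a char*, append takes a single string and silently truncates when the buffer is full, each generated instruction ends with a newline, and the function names make_num and make_add are invented for the example.

```c
#include <stdio.h>
#include <string.h>

#define BUF_SIZE 256

/* Fixed-size buffer in place of the slide's char*; the size is an assumption. */
typedef struct { char translation[BUF_SIZE]; int var; } myattribute;

static int varcounter = 0;  /* number of temporaries a1, a2, ... issued so far */

/* Append text to a NUL-terminated translation buffer, truncating if full. */
static void append(char *buf, const char *text) {
    size_t used = strlen(buf);
    snprintf(buf + used, BUF_SIZE - used, "%s", text);
}

/* expr : NUM   emits  a<k>=<lexval>  for a fresh temporary a<k>. */
myattribute make_num(int lexval) {
    myattribute result;
    char line[64];
    result.translation[0] = '\0';
    result.var = ++varcounter;
    snprintf(line, sizeof line, "a%d=%d\n", result.var, lexval);
    append(result.translation, line);
    return result;
}

/* expr : expr '+' expr   emits both operand translations, then
 * a<k>=a<i>+a<j>  for a fresh temporary a<k>. */
myattribute make_add(const myattribute *left, const myattribute *right) {
    myattribute result;
    char line[64];
    result.translation[0] = '\0';
    result.var = ++varcounter;
    append(result.translation, left->translation);
    append(result.translation, right->translation);
    snprintf(line, sizeof line, "a%d=a%d+a%d\n",
             result.var, left->var, right->var);
    append(result.translation, line);
    return result;
}
```

Starting from a fresh counter, building 3 + 4 with make_num(3), make_num(4) and make_add yields the three instructions a1=3, a2=4 and a3=a1+a2, each an assignment with at most two operands as the intermediate language requires.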