1
CSCE 531 Compiler Construction
Lecture 17: Control Flow
Topics
  Debugging
  Review: Positional Encoding of Booleans
No class Thursday; we will reschedule!
Readings: 5.7, 6.3, 7.4
March 29, 2018
2
Overview
Last Time
  Pascal-like declarations
  Hierarchical Symbol Tables
  Numeric Implementation of Booleans
  Positional Encoding of Booleans
  Short Circuit Evaluation
Last Time Didn’t Finish
  Debugging YACC generated parsers
Today’s Lecture
  Review
  Semantic Attributes and actions for Boolean expressions
  Markers
  Attribute for Control Flow
  If statement
  If else (parsing first; should have done this earlier)
  If else semantic actions
3
/class/csce531
cocsce-l1d39-11> tree -L 1 csce531
csce531
├── Postfix
├── SimpleYacc
├── Table
├── TestLexAnalyzer
├── TestPrograms
└── Tree
4
Compiler Construction under UNIX Schreiner and Friedman
Postfix/mmmerror.c
extern int yylineno;

void mywhere(){
    fprintf(stderr, "line %d", yylineno);
}

// FILE *yyerfp = stdout;
yyerror(char *s, char *t)
{
    extern int yynerrs;
    static int list = 0;
    if(s || ! list){
        fprintf(stderr, "[error %d] ", yynerrs+1);
        mywhere();
        …
5
tree Makefile Tree.output clean:
6
TestLexAnalyzer
cocsce-l1d39-11> tree TestLexAnalyzer/
TestLexAnalyzer/
├── core2.tab.h
├── lex.yy.c
├── lex.yy.o
├── Makefile
├── regexp.py
├── struct.h
├── test8
├── testLex
├── testLex.c
├── toName.c
└── toName.o
7
cocsce-l1d39-11> ./testLex < test8
Token returned = 267   lexeme=program   TokenDefinedConstant=PROGRAM
Token returned = 274   lexeme=int       TokenDefinedConstant=UNRECOGNIZED TOKEN CODE!
Token returned = 278   lexeme=a         TokenDefinedConstant=ID
Token returned = 44    lexeme=,         TokenDefinedConstant=,
Token returned = 278   lexeme=b         TokenDefinedConstant=ID
Token returned = 278   lexeme=c         TokenDefinedConstant=ID
Token returned = 278   lexeme=d         TokenDefinedConstant=ID
Token returned = 59    lexeme=;         TokenDefinedConstant=;
Token returned = 268   lexeme=begin     TokenDefinedConstant=BEGN
Token returned = 260   lexeme=if        TokenDefinedConstant=IF
8
testLex.c
#include <stdio.h>
int yylex();
#include "struct.h"
#include "core2.tab.h"

YYSTYPE yylval;
extern char *yytext;
char *toTokenName();

int main(void){
    int tokenCode = 0;
    /* yylex() returns 0 at end of input */
    while((tokenCode = yylex()) != 0){
        printf("Token returned = %d\tlexeme=%s\t TokenDefinedConstant=%s\n",
               tokenCode, yytext, toTokenName(tokenCode));
    }
    return 0;
}
9
core2.tab.h
…
enum yytokentype {
    INT = 258,
    RELOP = 259,
    IF = 260,
    THEN = 261,
    ELSE = 262,
    AND = 263,
    ASSIGNOP = 264,
    OR = 265,
    NOT = 266,
    PROGRAM = 267,
    BEGN = 268,
    END = 269,
    WHILE = 270,
    DO = 271,
    ENDLOOP = 272,
    …
10
toName.c
#include <string.h>   /* strdup */

char *
toTokenName(int tokenCode){
    char ascii[2];
    ascii[1]=(char)0;   // EOS (End Of String)
    if(tokenCode < 128){
        /* single-character tokens are returned as their own ASCII code */
        ascii[0] = (char) tokenCode;
        return(strdup((char *) ascii));
    }
    switch (tokenCode){
        case 258: return("INT");
        case 259: return("RELOP");
        case 260: return("IF");
        case 261: return("THEN");
        …
        case 278: return("ID");
        default: return("UNRECOGNIZED TOKEN CODE!");
    }
}
11
Python3 regular expressions
import re, then fairly standard regular expression patterns
Grouping using parentheses
  m = re.search('[0-9]{3}-[0-9]{2}-([0-9]{4})', string)
re.search: search for pattern (regexpr) in string
Return match in 'm' (a match object)
Note parentheses specify grouping
  last4 = m.group(1)
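A minimal sketch of the grouping idea on this slide. The input string and the variable name `text` are made up for illustration; only `re.search` and `group` come from the slide.

```python
import re

# Hypothetical input; any string containing a ddd-dd-dddd pattern would do.
text = "id on file: 123-45-6789"

# The parentheses create group 1, which captures only the last four digits.
m = re.search(r"[0-9]{3}-[0-9]{2}-([0-9]{4})", text)

# re.search returns None when nothing matches, so check before using m.
last4 = m.group(1) if m else None
whole = m.group(0) if m else None   # group(0) is the entire matched text
print(last4, whole)
```

Guarding against `None` matters in scripts that scan many lines, since most lines will not match.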
12
import re
filename = "core2.tab.h"
…
# compile the regular expression to do searches etc.
# (raw string, so the backslashes reach the regex engine unchanged)
dfa = re.compile(r"\s\s\s\s([A-Z][A-Z]+)\s*=\s*([0-9][0-9][0-9])")
# open the file "core2.tab.h" and process each line by matching the reg expr
f = open(filename)
for line in f:
    match = dfa.search(line)
    if match:
        print("\t\t case "+ match.group(2) + ": return(\"" + match.group(1) + "\");")
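The script above can be tried without the real header. The sketch below runs the same pattern over a hypothetical in-memory excerpt; `SAMPLE_HEADER` and the helper `case_lines` are names introduced here, not part of the course files.

```python
import re

# Hypothetical excerpt standing in for the contents of core2.tab.h.
SAMPLE_HEADER = """\
   enum yytokentype {
     INT = 258,
     RELOP = 259,
     IF = 260,
   };
"""

# Same pattern as the slide: four whitespace characters, an upper-case
# name of two or more letters, '=', then a three-digit token code.
TOKEN_DEF = re.compile(r"\s\s\s\s([A-Z][A-Z]+)\s*=\s*([0-9][0-9][0-9])")


def case_lines(lines):
    """Return one C 'case' line for every token definition found."""
    result = []
    for line in lines:
        match = TOKEN_DEF.search(line)
        if match:
            name, code = match.group(1), match.group(2)
            result.append('\t\t case ' + code + ': return("' + name + '");')
    return result


generated = case_lines(SAMPLE_HEADER.splitlines())
for text in generated:
    print(text)
```

Collecting the lines in a list before printing makes the generator easy to check against a known header excerpt.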
13
Generate header of the function
# create the header part of the function
print("char *")
print("toTokenName(int tokenCode){")
print("\t char ascii[2];")
print("\t ascii[1]=(char)0; // EOS (End Of String) ")
print("\t if(tokenCode < 128){")
print("\t\t ascii[0] = (char) tokenCode;")
print("\t\t return(strdup((char *) ascii));")
print("\t }")
print("\t switch(tokenCode){")
15
bison --report=all
State 8
    1 task: task expr ';' .
    $default  reduce using rule 1 (task)
State 9
    4 expr: expr . '+' expr
    4     | expr '+' expr .  ['+', ';']
    5     | expr . '*' expr
    '*'  shift, and go to state 7
    $default  reduce using rule 4 (expr)
    Conflict between rule 4 and token '+' resolved as reduce (%left '+').
    Conflict between rule 4 and token '*' resolved as shift ('+' < '*').
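The two "resolved as" lines follow bison's precedence rule for shift/reduce conflicts: compare the precedence of the rule with that of the lookahead token, and fall back on associativity when they are equal. A small sketch of that decision is below. The `PRECEDENCE` table is an assumption reconstructed from the report ('+' lower than '*', '+' left-associative; the associativity of '*' is a guess), and `resolve` is a hypothetical helper, not a bison API.

```python
# Assumed declarations: %left '+' followed by %left '*'
# (a later declaration gets a higher precedence level).
PRECEDENCE = {"+": (1, "left"), "*": (2, "left")}


def resolve(rule_operator, lookahead):
    """Decide a shift/reduce conflict as 'reduce', 'shift', or 'error'.

    rule_operator is the last terminal of the rule that could be reduced;
    it gives the rule its precedence.
    """
    rule_level, _ = PRECEDENCE[rule_operator]
    token_level, token_assoc = PRECEDENCE[lookahead]
    if rule_level > token_level:
        return "reduce"
    if rule_level < token_level:
        return "shift"
    # Equal precedence: associativity of the token decides.
    return {"left": "reduce", "right": "shift", "nonassoc": "error"}[token_assoc]


plus_then_plus = resolve("+", "+")   # rule 4 against lookahead '+'
plus_then_star = resolve("+", "*")   # rule 4 against lookahead '*'
print(plus_then_plus, plus_then_star)
```

With these assumed declarations the sketch reproduces the report: reduce on '+', shift on '*'.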
16
Debugging Parsers written with Yacc
Debug the grammar
  Rewrite the grammar to eliminate reduce/reduce conflicts and as many shift/reduce conflicts as you can.
  Tracing parses using
    -t option to bison or yacc
    -DYYDEBUG compile option
    int yydebug=1; in the flex specification (C definitions section %{ … %})
    extern int yydebug; in the yacc specification
Debug the semantic actions
  Compile with the -g option; set CFLAGS=-g in the Makefile and use gcc $(CFLAGS) … as the compile (or rely on the builtin rules)
  Use gdb (GNU debugger) to debug the program
17
Parsing Traces with yydebug=1
Setting up for traces
  -t option to bison or yacc
  -DYYDEBUG compile option
  int yydebug=1; in the lex specification (C definitions section %{ … %})
  extern int yydebug; in the yacc specification
Generate the .output file and print it out so you can follow along
18
Stack now
Entering state 10
Stack now
Entering state 9
Reading a token: Next token is token ';' ()
Reducing stack by rule 4 (line 55):
   $1 = nterm expr ()
   $2 = token '+' ()
   $3 = nterm expr ()
-> $$ = nterm expr ()
Stack now 0 1
Entering state 5
Next token is token ';' ()
Shifting token ';' ()
Entering state 8
Reducing stack by rule 1 (line 46):
   $1 = nterm task ()
   $2 = nterm expr ()
   $3 = token ';' ()
./tree
Dumping tree
+
ID x
*
ID y
ID z
19
Common Mistakes
Segmentation fault: you have referenced a memory location that is outside the memory segment of your program. You have a pointer with a bad value!
  First, make sure that every time you copy a string value you use strdup.
  Several people have had errors with strcat(s,t) where they did not allocate space for the string "s".
  Use gdb and bt (backtrace) to track down the pointer with the bad value.
20
GDB - Essential Commands
gdb program [core]     debug program [using coredump core]
b [file:]function      set breakpoint at function [in file]
run [arglist]          start your program [with arglist]
bt                     backtrace: display program stack
p expr                 display the value of an expression
c                      continue running your program
n                      next line, stepping over function calls
s                      next line, stepping into function calls
21
Example using gdb
deneb> make
bison -d decaf.y
decaf.y contains 51 shift/reduce conflicts.
gcc -c -g decaf.tab.c
flex decaf.l
gcc -DYYDEBUG -g -c lex.yy.c
gcc -c -g tree.c
gcc -DYYDEBUG -g decaf.tab.o lex.yy.o tree.o -ly -o decaf
deneb> ./decaf < t1
Keyword int
Segmentation Fault (core dumped) !!!
22
Note the use of the -g option in every gcc command of the build on the previous slide (CFLAGS=-g in the Makefile). Without it, gdb cannot map addresses back to source lines.
23
Getting in and out of GDB
deneb> gdb decaf
GNU gdb 4.18
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, …
(gdb) run < t1
Starting program: /class/csce /submissions/huang27/review01/decaf < t1
before treeprint
demo=Program info=X
demo= info=
Program received signal SIGSEGV, Segmentation fault.
0xff2b3474 in strlen () from /usr/lib/libc.so.1
(gdb) quit
The program is running. Exit anyway? (y or n) y
deneb>
24
Backtrace(bt) To see the Activation Stack
(gdb) run < t1
Starting program: /class/csce /submissions/huang27/review01/decaf < t1
before treeprint
demo=Program info=X
demo= info=
Program received signal SIGSEGV, Segmentation fault.
0xff2b3474 in strlen () from /usr/lib/libc.so.1
(gdb) bt
#0 0xff2b3474 in strlen () from /usr/lib/libc.so.1
#1 0xff in _doprnt () from /usr/lib/libc.so.1
#2 0xff3072cc in printf () from /usr/lib/libc.so.1
#3 0x199e8 in treeprint (p=0x32718) at tree.c:9
#4 0x19a0c in treeprint (p=0x32738) at tree.c:11
#5 0x19a0c in treeprint (p=0x32748) at tree.c:11
#6 0x11414 in yyparse () at decaf.y:91
#7 0xff3804e0 in main () from /usr/lib/liby.so.1
25
Printing values
(gdb) up
#1 0xff in _doprnt () from /usr/lib/libc.so.1
#2 0xff3072cc in printf () from /usr/lib/libc.so.1
#3 0x199e8 in treeprint (p=0x32718) at tree.c:9
printf("demo=%s info=%s\n", p->demo, p->info);
(gdb) print *p
$1 = {tag = VDECL, child = 0x326e8, next = 0x1, info = 0x1adf8 "", tagname = 0x7 <Address 0x7 out of bounds>, demo = 0x0, value = {i = 1, d = e-314, s = 0x1 <Address 0x1 out of bounds>}}
(gdb)
26
Markers
Markers: grammar symbols that derive only ε, used to perform some semantic action and usually to save some attribute value.
Example: S → A B M C D
  In this case we want to perform some action when the production M → ε is reduced, like saving the starting place of the code that evaluates C.
Markers for handling Booleans using the positional encoding
  M → ε (M.quad marks the start of a piece of code)
  N → ε (N.next is a list consisting of a single quad that needs to have its target field filled in later)
27
Semantic Actions for B → B or M B
B: B OR M B {
       backpatch($1.false, $3);    /* a false left operand falls into the right operand */
       $$.false = $4.false;
       $$.true = merge($1.true, $4.true);
   }
 ;
M: { $$ = nextquad; }   /* ε production */
 ;
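To see what these actions do to the quad array, here is a small Python model of `a < b or c < d`. It is a sketch under assumptions: the quad layout `[op, arg1, arg2, target]`, the two-quad encoding of each relational test, and the Python versions of `gen`, `makelist`, `merge`, and `backpatch` are illustrative, not the project's C code.

```python
quads = []   # each quad is [op, arg1, arg2, target]; target None = not yet filled


def gen(op, arg1=None, arg2=None, target=None):
    quads.append([op, arg1, arg2, target])


def nextquad():
    return len(quads)


def makelist(index):
    return [index]


def merge(first, second):
    return first + second


def backpatch(quad_list, target):
    """Fill in the jump target of every quad on the list."""
    for index in quad_list:
        quads[index][3] = target


def relop(op, left, right):
    """Emit 'if left op right goto _' and 'goto _'; return (true, false) lists."""
    gen("if" + op, left, right)
    gen("goto")
    return makelist(nextquad() - 2), makelist(nextquad() - 1)


b1_true, b1_false = relop("<", "a", "b")   # $1: quads 0 and 1
m_quad = nextquad()                        # $3: marker M records quad 2
b2_true, b2_false = relop("<", "c", "d")   # $4: quads 2 and 3

backpatch(b1_false, m_quad)                # false exit of $1 jumps to $4's code
b_false = b2_false                         # $$.false
b_true = merge(b1_true, b2_true)           # $$.true
print(quads, b_true, b_false)
```

After the reduction only quad 1 has a target; the true and false lists of the whole expression stay open until an enclosing statement backpatches them.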
28
S: ID ASSIGNOP expr {
       gen(ASSIGNOP, $<place>3, VOID, $1);
       $$ = NULL;
   }
 | IF B THEN M S N ELSE M S {
       backpatch($2.true, $4);     /* true exits go to the then-branch */
       backpatch($2.false, $8);    /* false exits go to the else-branch */
       tmplist = merge($5, $6);
       $$ = merge(tmplist, $9);
   }
 ;
M: { $$ = nextquad; }
 ;
N: { gen(GOTO, VOID, VOID, VOID);
     $$ = makelist(nextquad - 1);
   }
 ;
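The if-else action can be traced by hand on `if a < b then x := 1 else x := 2`. The sketch below does that trace in Python. The `QuadArray` class, the quad layout, and the operator spellings are assumptions made for illustration; the order of calls follows the grammar rule above.

```python
class QuadArray:
    """Toy quad array; quads are [op, arg1, arg2, target-or-result]."""

    def __init__(self):
        self.quads = []

    def nextquad(self):
        return len(self.quads)

    def gen(self, op, arg1=None, arg2=None, target=None):
        self.quads.append([op, arg1, arg2, target])

    def backpatch(self, quad_list, target):
        for index in quad_list:
            self.quads[index][3] = target


def makelist(index):
    return [index]


def merge(first, second):
    return first + second


q = QuadArray()

# $2  B: a < b  ->  quad 0 (conditional jump) and quad 1 (goto)
q.gen("if<", "a", "b")
q.gen("goto")
b_true, b_false = makelist(0), makelist(1)

m_then = q.nextquad()                  # $4  M before the then-branch
q.gen(":=", "1", None, "x")            # $5  S: x := 1, with an empty next list
s_then_next = []
q.gen("goto")                          # $6  N: jump over the else-branch
n_next = makelist(q.nextquad() - 1)
m_else = q.nextquad()                  # $8  M before the else-branch
q.gen(":=", "2", None, "x")            # $9  S: x := 2
s_else_next = []

# The action for IF B THEN M S N ELSE M S
q.backpatch(b_true, m_then)
q.backpatch(b_false, m_else)
s_next = merge(merge(s_then_next, n_next), s_else_next)

# Whatever follows the statement later fills in the remaining jump.
q.backpatch(s_next, q.nextquad())
print(q.quads)
```

The only jump left open by the if-else action is the one N generated, which is why `N.next` has to travel up in `$$`.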
29
Project Two and Three
Due tonight (dropbox project two)
  Test cases, using your login name followed by a numeral as the names, e.g., matthews1, matthews2, matthews3
  Your grammar file, without necessarily having any semantic actions except for expressions and undefined variables
  Your Lex file and Makefile (not necessarily any routines for semantic actions)
  We will create a TestDir with all of these files that you can get to. Note: I will not vouch for the validity of any test program.
Project Three
  Add Booleans to the grammar: && (AND), || (OR), and ! (NOT)
  Quad array
  List-manipulating functions: makelist, merge, and backpatch
  Add semantic actions for control-flow statements
  Due Sunday April 7!
30
Test 2 April 19th !! (last day I can give it.) Exam May 8