1 Using Yacc: Part II
2 main(): How do I activate the parser generated by yacc from main()? –See mglyac.y
3 Place this code in the user subroutines section of the yacc file:

    (Definition section)
    %%
    (Rules section)
    %%
    (User subroutines section)

    /* progname, usage, DEFAULT_OUTFILE and end_file() are defined elsewhere in mglyac.y */
    int main(int argc, char **argv)
    {
        char *outfile;
        char *infile;
        extern FILE *yyin, *yyout;

        progname = argv[0];

        if (argc > 3) {
            fprintf(stderr, usage, progname);
            exit(1);
        }
        if (argc > 1) {
            infile = argv[1];
            /* open for read */
            yyin = fopen(infile, "r");
            if (yyin == NULL) {   /* open failed */
                fprintf(stderr, "%s: cannot open %s\n", progname, infile);
                exit(1);
            }
        }
        if (argc > 2) {
            outfile = argv[2];
        } else {
            outfile = DEFAULT_OUTFILE;
        }
        yyout = fopen(outfile, "w");
        if (yyout == NULL) {      /* open failed */
            fprintf(stderr, "%s: cannot open %s\n", progname, outfile);
            exit(1);
        }

        yyparse();    /* run the parser generated by yacc */
        end_file();   /* write out any final information */
        exit(0);      /* no error */
    }
4 Yacc Ambiguities and Conflicts Read Chapter 8 of “lex & yacc” –It focuses on finding and correcting conflicts within a yacc grammar –Rather than pointing at the conflicting rules in your grammar file, yacc describes each conflict in y.output, in terms of parser states –You can generate y.output by running yacc with the -v (verbose) option: yacc -dv filename.y
5 Shift/Reduce Conflicts Identifying a shift/reduce conflict is a little harder than identifying a reduce/reduce conflict. To identify the conflict, we will do the following: –Find the shift/reduce error in y.output –Pick out the reduce rule –Pick out the relevant shift rules –See where the reduce rule reduces to –Deduce the token stream that will produce the conflict
6 Common Examples of Conflicts –Expression grammars –IF-THEN-ELSE: the so-called dangling else problem in C-like programming languages (Fortran 95, Ada, and Perl do not have this problem, because they close every if explicitly with END IF, end if, or mandatory braces) –Nested list grammars
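The dangling else can be seen directly in C. In the sketch below (the function name which_branch is made up for illustration) the inner statements carry no braces, so the compiler attaches the else to the nearest unmatched if. This is the same outcome yacc produces when it resolves the shift/reduce conflict in favor of the shift, which is its default.

```c
/* Returns 0 if neither inner branch ran, 1 for the inner "then", 2 for the else. */
int which_branch(int a, int b)
{
    int branch = 0;
    if (a)
        if (b)
            branch = 1;
        else            /* belongs to "if (b)", not to "if (a)" */
            branch = 2;
    return branch;
}
```

With a = 0 the function returns 0 whatever b is: the else never runs, because it is not paired with the outer if. Many compilers warn about this construct, which is a hint that braces are the safer style.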
7 How Do I Fix the Conflict?
8 Error Reporting and Recovery Read Chapter 9 of “lex & yacc” –How the parser and lexical analyzer detect errors –Yacc provides the error token and the yyerror() routine
9 Error reporting The default yacc error handling only reports that a syntax error exists and then stops parsing –Actually, it calls yyerror("syntax error");
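Since the generated parser only calls yyerror(), the routine itself has to be supplied by the user. A minimal sketch follows; the error_count variable is an assumption added for illustration and is not part of yacc.

```c
#include <stdio.h>

/* Hypothetical counter (not provided by yacc): number of errors reported so far. */
static int error_count = 0;

/* The generated parser calls this with the message "syntax error". */
void yyerror(const char *msg)
{
    error_count++;
    fprintf(stderr, "%s\n", msg);
}
```

Keeping a count like this lets main() decide after yyparse() returns whether later compiler stages should run at all.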
10 Error reporting (Cont’d) The duty for error correction does not lie with yacc alone, however. Many fundamental errors are better detected by lex. –Check page 244 for the example of using lex to detect an unterminated quoted string –Note that $ means end of line in lex.
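The check a lexer makes for an unterminated string can be sketched in plain C. This is a hand-written illustration, not the lex rule from the book; the function name scan_quoted and its return convention are assumptions.

```c
#include <stddef.h>

/* Scan a double-quoted string that starts at s[0].
   Returns the token length including both quotes,
   0 if s does not start with a quote, and
   -1 if a newline or the end of input comes before the closing quote. */
int scan_quoted(const char *s)
{
    size_t i;

    if (s[0] != '"')
        return 0;
    for (i = 1; s[i] != '\0' && s[i] != '\n'; i++) {
        if (s[i] == '"')
            return (int)(i + 1);
    }
    return -1;   /* unterminated: the lexer can report this precisely */
}
```

Because the lexer sees the raw characters, it can say "unterminated string" at the point where the line ends, instead of leaving the parser to fail later with a generic syntax error.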
11 Error reporting (Cont’d) It is a lot more useful to tell the user that the string is the wrong type rather than just saying “syntax error” –In the example shown on p. 245, when the yacc grammar detects an improper string literal or identifier, it can pinpoint the type of error by introducing a new non-terminal
12 Better Lex Error Reports How to report the line number whenever a syntax error happens? –Trace error1.y and error1.l (you should download ErrHandling.tar) –Note that you may need to use yyless(1)

    ~/Compilers $ more errori1
    abc=1
    abc
    bc=abc*2/3
    bc
    d=abc+bc
    d
    ~/Compilers $ more errori2
    abc=1
    abc
    bc=abc*2/3
    bc=bc+1-+2
    d=abc+bc
    d
    ~/Compilers $ ./error1 < errori1
    = 1
    =
    =
    ~/Compilers $ ./error1 < errori2
    = 1
    syntax error happened at line 4
    ~/Compilers $
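The idea behind the line-numbered message can be sketched as follows. The lexer bumps a counter each time it matches a newline, and the error routine includes that counter in its message. The names lineno, count_newline and format_error are assumptions for illustration and are not taken from error1.l or error1.y.

```c
#include <stdio.h>

/* Hypothetical line counter; the lexer's newline rule would call count_newline(). */
static int lineno = 1;

void count_newline(void)
{
    lineno++;
}

/* Build the message an error routine could print, e.g. from yyerror(). */
int format_error(char *buf, size_t size, const char *msg)
{
    return snprintf(buf, size, "%s happened at line %d", msg, lineno);
}
```

After three newlines have been counted, format_error() called with "syntax error" produces a message of the same shape as the one in the transcript above.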
13 To reprocess input The function yyless() resets the end point of the current token. yyless() takes a single integer argument: yyless(n) causes the current token to consist of the first n characters of what was originally matched (that is, up to yytext[n-1]). The remaining yyleng-n characters are returned to the input stream. –Check the error3.y and error3.l
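The effect of yyless(n) can be modelled with a small scanner structure. This is a simplified stand-in for lex's buffering, not its real implementation; the struct and the names take and my_yyless are assumptions.

```c
#include <string.h>

struct scanner {
    const char *input;   /* whole input buffer */
    size_t pos;          /* index of the next unread character */
    char text[64];       /* current token, playing the role of yytext */
    size_t leng;         /* its length, playing the role of yyleng */
};

/* Match the next len characters as the current token. */
void take(struct scanner *s, size_t len)
{
    if (len >= sizeof s->text)
        len = sizeof s->text - 1;
    memcpy(s->text, s->input + s->pos, len);
    s->text[len] = '\0';
    s->leng = len;
    s->pos += len;
}

/* Keep the first n characters of the token and return the rest to the input. */
void my_yyless(struct scanner *s, size_t n)
{
    if (n > s->leng)
        return;
    s->pos -= s->leng - n;   /* the remaining leng - n characters become unread */
    s->text[n] = '\0';
    s->leng = n;
}
```

If the scanner has matched the two characters "-+" and then calls my_yyless(&s, 1), the token shrinks to "-" and the "+" is scanned again on the next call.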
14 Error Recovery Why is error recovery necessary? –It may be possible to recover from the error and continue examining the file for additional errors, stopping the compiler before invoking the next stage. –This technique improves the productivity of the programmer by shortening the edit-compile-test cycle, since several errors can be repaired in each iteration of the cycle.
15 Yacc Error Recovery Yacc has some provision for error recovery, by using the error token. After reporting a syntax error, a yacc parser discards any partially parsed rules until it finds one in which it can shift an error token. It then reads and discards input tokens until it finds one which can follow the error token in the grammar.
16 Yacc Error Recovery (Cont’d) When an error occurs, the parser stops unless you provide error-handling subroutines. To continue processing the input to find more errors, restart the parser at a point in the input stream where the parser can try to recognize more input. One way to restart the parser when an error occurs is to discard some of the tokens following the error. Then try to restart the parser at that point in the input stream.
17 Yacc Error Recovery (Cont’d) The yacc command uses a special token name, error, for error handling. Put this token in the rules file at places where an input error might occur so that you can provide a recovery subroutine. If an input error occurs in this position, the parser executes the action for the error token, rather than the normal action.
18 Yacc Error Recovery (Cont’d) For example, consider a rule of the following form: –stat : error ';' It tells the parser that when there is an error, it should skip the offending token and all following tokens until it finds the next semicolon. All tokens after the error and before the next semicolon are discarded. After finding the semicolon, the parser reduces this rule and performs any cleanup action associated with it.
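What such a rule does to the input can be sketched in plain C. This only models the token-skipping step, not yacc's state stack, and it assumes for simplicity that every character is one token; skip_to_sync is a hypothetical helper name.

```c
#include <stddef.h>
#include <string.h>

/* Starting at the error position pos, discard tokens up to and including
   the next synchronizing token (';' for the rule above) and return the
   index where parsing would resume. Returns strlen(tokens) if no
   synchronizing token is left in the input. */
size_t skip_to_sync(const char *tokens, size_t pos, char sync)
{
    size_t len = strlen(tokens);

    while (pos < len && tokens[pos] != sync)
        pos++;
    return pos < len ? pos + 1 : len;
}
```

For the input "a=+*;b=1;" with an error detected at index 2, parsing resumes at index 5, so the second statement is still checked.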
19 Yacc Error Recovery (Cont’d) However, in this example, the parser stays in the error state until three input tokens following the error have been read and shifted successfully, and it reports no new errors during that time. To leave the error state early, use the following yacc statement in the recovery action. When the parser executes this statement, it leaves the error state and resumes normal processing. –yyerrok;
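The three-token rule and the effect of yyerrok can be modelled with a small counter. This is a hypothetical model for illustration, loosely following how generated parsers track the error state; it is not yacc's actual code, and all names are assumptions.

```c
/* Hypothetical model of a yacc parser's error state. */
struct recovery {
    int errstatus;   /* tokens still to be shifted before errors are reported again */
};

/* Called when a syntax error is detected.
   Returns 1 if the error should be reported, 0 if it is suppressed. */
int on_error(struct recovery *r)
{
    int report = (r->errstatus == 0);
    r->errstatus = 3;
    return report;
}

/* Called each time a token is shifted successfully. */
void on_shift(struct recovery *r)
{
    if (r->errstatus > 0)
        r->errstatus--;
}

/* Models yyerrok: leave the error state immediately. */
void errok(struct recovery *r)
{
    r->errstatus = 0;
}
```

Without errok(), a second error that arrives after only one shifted token is silently swallowed; with it, the parser reports that error as usual.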
20 Example:

    statement : NAME '=' expression { $1->value = $3; }
              | expression          { printf("= %g\n", $1); }
              | error '\n'          { yyerrok; printf("Something Wrong\n"); }
              ;

   Here '\n' is the terminator that ends the skipped input, and the action { yyerrok; ... } is the error handling routine.
21 Yacc Error Recovery (Cont’d) Check the code –error2.y and error2.l –./error2 < errori3
22 Yacc Error Recovery (Cont’d) Clearing the Look-Ahead Token –The look-ahead token is the next token that the parser examines. –When an error occurs, the look-ahead token becomes the token at which the error was detected. –However, if the error recovery action includes code to find the correct place to start processing again, that code must also change the look-ahead token. –To clear the look-ahead token, include the following statement in the error-recovery action: yyclearin;
23 Yacc Error Recovery (Cont’d) More complicated example:

    input : error '\n' { yyerrok; printf(" Reenter last line: "); }
            input      { $$ = $4; }
          ;

   The action in the middle of the rule counts as $3, so the second input is $4.