1
The Recursive Descent Algorithm
A useful predictive parser for many applications. Under Construction (Nov 16)
2
The Recursive Descent Algorithm
The recursive descent algorithm directly implements a grammar written as EBNF rules. The rules should not contain left recursion.
There is one function (method) for each EBNF rule. Each method parses the input corresponding to its EBNF rule and returns a value. The value may be:
- a node in the abstract syntax tree of the input
- a value computed by evaluating the input (e.g. a calculator)
Recursive descent is a predictive parser. Limited look-ahead ("peek" at the next token) can be incorporated.
3
Recursive-descent intro (0)
Grammar:
  expr   => expr + term | expr - term | term
  term   => term * factor | factor
  factor => '(' expr ')' | number
4
Recursive-descent intro (0.5)
Grammar in EBNF (no "self-recursion"):
  expr   => term { (+ | -) term }
  term   => factor { * factor }
  factor => '(' expr ')' | number
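A function written directly from the left-recursive rule expr => expr - term would call itself before consuming any input and never return. The EBNF repetition avoids that: the braces become a loop, and the loop still groups operators from left to right. The following is a minimal Java sketch of that idea, assuming single-digit numbers and no whitespace; the class EbnfLoopDemo and its members are illustrative names, not part of the original slides.

```java
/** Sketch: the EBNF rule  expr => digit { (+|-) digit }  coded as a loop. */
class EbnfLoopDemo {
    private final String input;  // text being parsed (assumed: digits, '+', '-')
    private int pos = 0;         // index of the next unread character

    EbnfLoopDemo(String input) { this.input = input; }

    /** Next unread character, or '\0' at end of input. */
    private char peek() { return pos < input.length() ? input.charAt(pos) : '\0'; }

    /** digit => '0' | ... | '9' */
    private int digit() {
        char c = peek();
        if (!Character.isDigit(c)) throw new IllegalStateException("digit expected at " + pos);
        pos++;
        return c - '0';
    }

    /** expr => digit { (+|-) digit } ; the braces become the while loop. */
    int expr() {
        int value = digit();
        while (peek() == '+' || peek() == '-') {
            char op = input.charAt(pos++);
            int right = digit();
            // Folding into 'value' on every pass groups the operators left to right.
            value = (op == '+') ? value + right : value - right;
        }
        return value;
    }

    static int eval(String s) { return new EbnfLoopDemo(s).expr(); }

    public static void main(String[] args) {
        System.out.println(eval("9-4-3")); // evaluated as (9-4)-3, not 9-(4-3)
    }
}
```

Because the running value is updated inside the loop, 9-4-3 is evaluated as (9-4)-3, which is the grouping the original left-recursive grammar describes.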
5
Recursive-descent intro (1)
Grammar:
  expr   => term { + term }
  term   => factor { * factor }
  factor => '(' expr ')' | number

Generic C code for concept only (don't use this):

  expr() {
      term();
      while (token == '+') {
          match('+');
          term();
      }
  }

  term() {
      factor();
      while (token == '*') {
          match('*');
          factor();
      }
  }
6
Recursive-descent intro (2)
Grammar:
  expr   => term { + term }
  term   => factor { * factor }
  factor => '(' expr ')' | number

Factor and number:

  factor() {
      if (token == '(') {
          match('(');
          expr( );
          match(')');
      }
      else number( );
  }

  number() {
      if ( isNumber(token) ) {
          add_to_parse_tree();
          nextToken( );
      }
      else error("invalid number");
  }
7
Recursive-descent intro (3)
match(value) is a utility that requires a match: if the current token matches the argument, consume the token and get the next token. Otherwise print an error. ... and then what?

  void match(char what) {
      if ( *token == what ) {
          nextToken( );
      }
      else {
          /* 'printf' style error function */
          error("expected %c got %s", what, token);
      }
  }
8
Where's the token?
In this algorithm, token is a global variable that always contains the next unread token. nextToken() returns true if there are more tokens, and also sets the token variable.

  boolean nextToken( ) {
      token = scanner( );
      return ( token != EOF );
  }

Another utility function is match(value):
1) if value matches token, get a new token
2) if value doesn't match, raise an error condition.
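The "next unread token" convention can be tried out with a very small scanner. The Java sketch below is an assumption-laden illustration, not the scanner used in the course: TinyScanner, its EOF marker, and the tokens() helper are hypothetical names, and a field stands in for the global variable.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch of the "current token" idea: token always holds the next unread token. */
class TinyScanner {
    static final String EOF = "<EOF>";  // assumed end-of-input marker

    private final String input;
    private int pos = 0;
    String token;                       // plays the role of the slide's global variable

    TinyScanner(String input) { this.input = input; }

    /** Reads the next token into 'token'; returns false once the input is exhausted. */
    boolean nextToken() {
        while (pos < input.length() && Character.isWhitespace(input.charAt(pos))) pos++;
        if (pos >= input.length()) { token = EOF; return false; }
        int start = pos;
        if (Character.isDigit(input.charAt(pos))) {
            // A number: a run of digits and '.' characters (not validated here).
            while (pos < input.length()
                    && (Character.isDigit(input.charAt(pos)) || input.charAt(pos) == '.')) pos++;
        } else {
            pos++;  // any other character is a one-character token
        }
        token = input.substring(start, pos);
        return true;
    }

    /** Convenience for experiments: all tokens of a string, in order. */
    static List<String> tokens(String s) {
        TinyScanner sc = new TinyScanner(s);
        List<String> out = new ArrayList<>();
        while (sc.nextToken()) out.add(sc.token);
        return out;
    }
}
```

A parser built on this would call nextToken() once before parsing starts, so that token is already valid when the first rule inspects it.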
9
Where's the output?
In the generic algorithm, the result is a global variable. The methods must either return a value or accumulate a value as a side effect. Rules which have terminal values should return the terminal value.

  factor => ( expr ) | number

  number() {
      if ( isNumber(token) ) {
          // add token to the parse tree
          // or return a value
      }
      else error("invalid number");
  }
10
Recursive Descent Example (1)
Let's look at recursive descent code for a calculator. We will modify the generic algorithm so that each function returns a double value.

  input:  expr '\n'
  expr:   term { (+|-) term }
  term:   factor { (*|/) factor }
  factor: '(' expr ')' | number
11
Recursive Descent Example (1)
Example: here is a modified expr( ) function.

Grammar Rule:
  expr: term { (+|-) term }

  double expr() {
      double expr = term();
      while ( token == '+' || token == '-' ) {
          if (token == '+') {
              match('+');
              expr = expr + term();
          }
          else {
              match('-');
              expr = expr - term();
          }
      }
      return expr;
  }
12
Recursive Descent Example (2)
The rule for factor is more interesting: we must check the first token to decide which alternative to use.

Grammar Rule:
  factor: '(' expr ')' | number

  double factor() {
      double fact;
      if ( token == '(' ) {
          nextToken( );
          fact = expr( );
          match( ')' );
          return fact;
      }
      else {
          fact = number( );
          return fact;
      }
  }
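Putting the four rules together gives a complete calculator. The Java sketch below follows the grammar above, with one method per rule, each returning a double. It is a simplified illustration under stated assumptions: tokens are single characters except for numbers, whitespace is stripped before parsing, and the names CharCalc, token(), and eval() are hypothetical.

```java
/** Sketch of the calculator grammar; each rule is a method that returns a double. */
class CharCalc {
    private final String src;
    private int pos = 0;  // position in the whitespace-stripped text

    // Assumption: whitespace carries no meaning, so it is dropped up front.
    CharCalc(String src) { this.src = src.replaceAll("\\s+", ""); }

    /** The next unread character; '\n' marks the end of the line. */
    private char token() { return pos < src.length() ? src.charAt(pos) : '\n'; }

    /** Requires the current token to be 'what', then consumes it. */
    private void match(char what) {
        if (token() != what)
            throw new IllegalArgumentException("expected '" + what + "' at position " + pos);
        pos++;
    }

    /** expr: term { (+|-) term } */
    double expr() {
        double value = term();
        while (token() == '+' || token() == '-') {
            if (token() == '+') { match('+'); value += term(); }
            else                { match('-'); value -= term(); }
        }
        return value;
    }

    /** term: factor { (*|/) factor } */
    double term() {
        double value = factor();
        while (token() == '*' || token() == '/') {
            if (token() == '*') { match('*'); value *= factor(); }
            else                { match('/'); value /= factor(); }
        }
        return value;
    }

    /** factor: '(' expr ')' | number */
    double factor() {
        if (token() == '(') {
            match('(');
            double value = expr();
            match(')');
            return value;
        }
        return number();
    }

    /** number: a run of digits and '.' characters, converted by Double.parseDouble */
    double number() {
        int start = pos;
        while (Character.isDigit(token()) || token() == '.') pos++;
        if (start == pos)
            throw new IllegalArgumentException("invalid number at position " + pos);
        return Double.parseDouble(src.substring(start, pos));
    }

    /** input: expr '\n' */
    static double eval(String line) {
        CharCalc c = new CharCalc(line);
        double value = c.expr();
        c.match('\n');  // the whole line must have been consumed
        return value;
    }
}
```

Calling CharCalc.eval("2 * 3 + ( 4 - 5 ) / 6") exercises every rule, including the parenthesized alternative of factor.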
13
Recursive Descent Example (3)
Input: 2 * 3 + ( 4 - 5 ) / 6
Progress: token = "2"

  input line:  token = nextToken(); ans = expr( );
14
Recursive Descent Example (4)
Input: 2 * 3 + ( 4 - 5 ) / 6
Progress: token = "2"

  input line:  ans = expr( );
  expr:        expr = term( );

  expr( ) {
      expr = term( );
      while ( token=='+' || token=='-' ) {
          if ( token=='+' ) { match('+'); expr = expr + term( ); }
          else { match('-'); expr = expr - term( ); }
      }
      return expr;
  }
15
Recursive Descent Example (5)
Input: 2 * 3 + ( 4 - 5 ) / 6
Progress: token = "2"

  input line:  ans = expr( );
  expr:        expr = term( );
  term:        term = factor( );

  term( ) {
      term = factor( );
      while ( token=='*' || token=='/' ) {
          if ( token=='*' ) { match('*'); term = term * factor( ); }
          else { match('/'); term = term / factor( ); }
      }
      return term;
  }
16
Recursive Descent Example (6)
Input: 2 * 3 + ( 4 - 5 ) / 6
Progress: token = "*"

  input line:  ans = expr( );
  expr:        expr = term( );
  term:        term = factor( );
  factor:      fact = number( ); /* token = '*' */ return fact;

  factor( ) {
      if ( token=='(' ) { match('('); fact = expr( ); match(')'); }
      else { fact = number( ); }
      return fact;
  }
17
Recursive Descent Example (7)
Input: 2 * 3 + ( 4 - 5 ) / 6
Progress: token = "3"

  input line:  ans = expr( );
  expr:        expr = term( );
  term:        term = 2; term = term * factor( );

  term( ) {
      term = factor( );
      while ( token=='*' || token=='/' ) {
          if ( token=='*' ) { match('*'); term = term * factor( ); }
          else { match('/'); term = term / factor( ); }
      }
      return term;
  }
18
Recursive Descent Example (8)
Input: 2 * 3 + ( 4 - 5 ) / 6
Progress: token = "+"

  input line:  ans = expr( );
  expr:        expr = term( );
  term:        term = term * factor( );
  factor:      fact = number( ); /* = 3, token = '+' */ return fact;

  factor( ) {
      if ( token=='(' ) { match('('); fact = expr( ); match(')'); }
      else { fact = number( ); }
      return fact;
  }
19
Recursive Descent Example (9)
Input: 2 * 3 + ( 4 - 5 ) / 6
Progress: token = "+"

  input line:  ans = expr( );
  expr:        expr = term( );
  term:        term = term * 3; return term;

  term( ) {
      term = factor( );
      while ( token=='*' || token=='/' ) {
          if ( token=='*' ) { match('*'); term = term * factor( ); }
          else { match('/'); term = term / factor( ); }
      }
      return term;
  }
20
Recursive Descent Example (10)
Input: 2 * 3 + ( 4 - 5 ) / 6
Progress: token = "("

  input line:  ans = expr( );
  expr:        expr = 6; /* token = '+' */ match('+'); expr = expr + term( );

  expr( ) {
      expr = term( );
      while ( token=='+' || token=='-' ) {
          if ( token=='+' ) { match('+'); expr = expr + term( ); }
          else { match('-'); expr = expr - term( ); }
      }
      return expr;
  }
21
Recursive Descent Example (11)
Input: 2 * 3 + ( 4 - 5 ) / 6
Progress: token = "("

  input line:  ans = expr( );
  expr:        expr = expr + term( );
  term:        term = factor( );

  term( ) {
      term = factor( );
      while ( token=='*' || token=='/' ) {
          if ( token=='*' ) { match('*'); term = term * factor( ); }
          else { match('/'); term = term / factor( ); }
      }
      return term;
  }
22
Recursive Descent Example (12)
Input: 2 * 3 + ( 4 - 5 ) / 6
Progress: token = "4"

  input line:  ans = expr( );
  expr:        expr = expr + term( );
  term:        term = factor( );
  factor:      match('('); fact = expr( );

  factor( ) {
      if ( token=='(' ) { match('('); fact = expr( ); match(')'); }
      else { fact = number( ); }
      return fact;
  }
23
Recursive Descent Example (13)
Input: 2 * 3 + ( 4 - 5 ) / 6
Progress: token = "4"

  input line:  ans = expr( );
  expr:        expr = expr + term( );
  term:        term = factor( );
  factor:      fact = expr( );
  expr:        expr = term( );

  expr( ) {
      expr = term( );
      while ( token=='+' || token=='-' ) {
          if ( token=='+' ) { match('+'); expr = expr + term( ); }
          else { match('-'); expr = expr - term( ); }
      }
      return expr;
  }
24
Recursive Descent Example (14)
Input: 2 * 3 + ( 4 - 5 ) / 6
Progress: token = "-"

  input line:  ans = expr( );
  expr:        expr = expr + term( );
  term:        term = factor( );
  factor:      fact = expr( );
  expr:        expr = term( );
  term:        term = factor( );
  factor:      fact = number( ); /* = 4, token = "-" */
25
Recursive Descent Example (15)
Input: 2 * 3 + ( 4 - 5 ) / 6
Progress: token = "-", then token = "5"

  input line:  ans = expr( );
  expr:        expr = expr + term( );
  term:        term = factor( );
  factor:      fact = expr( );
  expr:        match('-'); expr = expr - term( );

  expr( ) {
      expr = term( );
      while ( token=='+' || token=='-' ) {
          if ( token=='+' ) { match('+'); expr = expr + term( ); }
          else { match('-'); expr = expr - term( ); }
      }
      return expr;
  }
26
Recursive Descent Example (16)
Input: 2 * 3 + ( 4 - 5 ) / 6
Progress: token = "5"

  input line:  ans = expr( );
  expr:        expr = expr + term( );
  term:        term = factor( );
  factor:      fact = expr( );
  expr:        expr = expr - term( );
  term:        term = factor( );
  factor:      fact = number( ); /* = 5, token = ")" */
27
Recursive Descent Example (16)
Input: 2 * 3 + ( 4 - 5 ) / 6
Progress: token = "/"

  input line:  ans = expr( );
  expr:        expr = expr + term( );
  term:        term = factor( );
  factor:      fact = expr( ); match(')'); return fact;
  expr:        expr = 4 - 5; return expr;
  term:        term = 5; return term;
  factor:      return 5;

  term( ) {
      term = factor( );
      while ( token=='*' || token=='/' ) {
          if ( token=='*' ) { ... }
      }
  }
28
Recursive Descent Example (17)
Input: 2 * 3 + ( 4 - 5 ) / 6
Progress: token = "6"

  input line:  ans = expr( );
  expr:        expr = expr + term( );
  term:        term = -1; match('/'); term = term / factor( );

  term( ) {
      term = factor( );
      while ( token=='*' || token=='/' ) {
          if ( token=='*' ) { match('*'); term = term * factor( ); }
          else { match('/'); term = term / factor( ); }
      }
      return term;
  }
29
Imperative Approach to Parsing
In the generic algorithm, the token is a global variable, and the results of the parse are a side effect (a change to global variables or structures).
- bison and flex operate this way, too.
- Programs are difficult to understand and maintain.
- No error recovery in the generic algorithm.

  /* yylex uses global variables / constants. */
  int yylex( ) {
      ...
      if ( isdigit(c) ) {
          ungetc(c, stdin);
          scanf("%lf", &yylval);
          return INT;
      }
      ...
  }
30
O-O Approach to Parsing
In the O-O approach, we can return an object to allow a scanner and parser without global variables. First, let's look at the overall design.

  <<interface>> Iterator
      hasNext( )
      next( )

  <<enum>> TokenType
      regex : Pattern
      IDENTIFIER
      OPERATOR
      NUMBER

  Scanner
      instream : InputStream
      token : Token
      hasNext( ) : boolean
      next( ) : Token

  Token
      type
      value

  Parser
      parseTree : TreeSet
      token : Token
      scanner : Iterator
      expression( ) : Node
      term( ) : Node
      factor( ) : Node
      match( String ) : boolean
31
O-O Scanner
The Scanner should provide two services: test for more tokens and return the next token. In this view, a Scanner looks like an Iterator<Token>. A "token" has both a type and a value.

  /** Token class */
  public class Token {
      Type type;            /* consider an enumeration */
      public Object value;  /* can be anything */
      public Token(Type type, Object value) {...}
      public Object getValue( ) { ... }
  }
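To make the "Scanner looks like an Iterator<Token>" view concrete, here is a small Java sketch. It fills in details the slides leave open, so treat them as assumptions: the scanner reads from a String rather than an InputStream, numbers are stored as Double and everything else as String, and the class name TokenScanner is hypothetical.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

/** Token kinds, following the TokenType enumeration in the design. */
enum TokenType { IDENTIFIER, OPERATOR, NUMBER }

/** A token pairs a type with a value. */
final class Token {
    final TokenType type;
    final Object value;  // assumption: Double for NUMBER, String otherwise
    Token(TokenType type, Object value) { this.type = type; this.value = value; }
    Object getValue() { return value; }
    @Override public String toString() { return type + ":" + value; }
}

/** Sketch of a scanner that is an Iterator<Token>; reads from a String for simplicity. */
class TokenScanner implements Iterator<Token> {
    private final String input;
    private int pos = 0;

    TokenScanner(String input) { this.input = input; }

    private void skipWhitespace() {
        while (pos < input.length() && Character.isWhitespace(input.charAt(pos))) pos++;
    }

    /** True if at least one more token remains. */
    @Override public boolean hasNext() {
        skipWhitespace();
        return pos < input.length();
    }

    /** Returns the next token and advances past it. */
    @Override public Token next() {
        if (!hasNext()) throw new NoSuchElementException("no more tokens");
        int start = pos;
        char c = input.charAt(pos);
        if (Character.isDigit(c)) {
            // A run of digits and '.' characters; malformed runs fail in Double.valueOf.
            while (pos < input.length()
                    && (Character.isDigit(input.charAt(pos)) || input.charAt(pos) == '.')) pos++;
            return new Token(TokenType.NUMBER, Double.valueOf(input.substring(start, pos)));
        }
        if (Character.isLetter(c)) {
            while (pos < input.length() && Character.isLetterOrDigit(input.charAt(pos))) pos++;
            return new Token(TokenType.IDENTIFIER, input.substring(start, pos));
        }
        pos++;  // anything else is treated as a one-character operator
        return new Token(TokenType.OPERATOR, String.valueOf(c));
    }
}
```

Because the scanner is an ordinary Iterator, a parser can hold it in a field of type Iterator<Token> and never needs to know where the characters come from.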
32
O-O Parser
The Parser implements the parsing algorithm. The result is either a parse tree or a value (calculator application). Use an attribute to represent the next token.

  /** Parser class */
  public class Parser {
      Iterator<Token> scanner;
      private Token token;
      private TreeNode result;  /* parse tree */
      TreeNode expression( ) { ... };
      TreeNode term( ) { ... };
      TreeNode factor( ) { ... };
      boolean match( String what ) { ... };
      boolean match( Type what ) { ... };
  }
33
O-O Parser for Calculator
For a calculator, the parser can compute the result directly. It can use a primitive data type for expression, etc.

  /** Parser class */
  public class Parser {
      Iterator<Token> scanner;
      private Token token;
      private double result;
      double expression( ) { ... };
      double term( ) { ... };
      double factor( ) { ... };
      boolean match( String what ) {...};
  }
34
Observation: match
In the generic algorithm, the token is almost always tested before calling match. Eliminate the redundancy by redefining match(value) to return a boolean: true if the token matches, in which case the token is also consumed, and false otherwise.

  private boolean match( String what ) {
      if ( ! (token.value instanceof String) ) return false;
      if ( what.equals( (String)(token.value) ) ) {
          token = scanner.next( );
          return true;
      }
      return false;
  }
35
O-O Parser for Calculator (2)
Example method: expression

EBNF:
  expr ::= term { (+ | -) term }

  private double expression( ) {
      double result = term( );
      while( true ) {
          if ( match("+") ) result += term( );
          else if ( match("-") ) result -= term( );
          else break;  /* why not error( )? */
      }
      return result;
  }
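The test-and-consume style of match can be carried through the other rules as well. The Java sketch below is one possible completion, not the course's reference parser. To stay self-contained it assumes the tokens are plain strings supplied by an Iterator<String>; MatchCalc, advance(), and eval() are hypothetical names.

```java
import java.util.Arrays;
import java.util.Iterator;

/** Sketch: calculator parser whose match() both tests and consumes. */
class MatchCalc {
    private final Iterator<String> scanner;  // assumption: tokens are plain strings
    private String token;                    // next unread token, or null at end of input

    MatchCalc(Iterator<String> scanner) {
        this.scanner = scanner;
        advance();
    }

    private void advance() { token = scanner.hasNext() ? scanner.next() : null; }

    /** If the current token equals 'what', consume it and return true; otherwise leave it. */
    private boolean match(String what) {
        if (!what.equals(token)) return false;
        advance();
        return true;
    }

    /** expr ::= term { (+ | -) term } */
    double expression() {
        double result = term();
        while (true) {
            if (match("+")) result += term();
            else if (match("-")) result -= term();
            else break;  // not an error: the token may belong to an enclosing rule, e.g. ")"
        }
        return result;
    }

    /** term ::= factor { (* | /) factor } */
    double term() {
        double result = factor();
        while (true) {
            if (match("*")) result *= factor();
            else if (match("/")) result /= factor();
            else break;
        }
        return result;
    }

    /** factor ::= '(' expr ')' | number */
    double factor() {
        if (match("(")) {
            double result = expression();
            if (!match(")")) throw new IllegalStateException("missing right parenthesis");
            return result;
        }
        if (token == null) throw new IllegalStateException("unexpected end of input");
        double value = Double.parseDouble(token);  // NumberFormatException if not a number
        advance();
        return value;
    }

    /** Helper for experiments: tokens must be separated by whitespace. */
    static double eval(String spaced) {
        return new MatchCalc(Arrays.asList(spaced.trim().split("\\s+")).iterator()).expression();
    }
}
```

The break in expression() is deliberate: a token that is neither + nor - is not necessarily wrong, since it may be the ")" that the calling factor() is waiting for.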
36
O-O Parser: Top-Level
What is the top-level routine of the parser? Look at standard bison code for inspiration:

  %%
  /* Bison grammar rules */
  input : /* empty input */
        | input line
        ;
  line  : expr '\n'   { output( $1 ); }
37
Parsing Errors
How are you going to handle parsing errors? You might have many levels of function calls...

  input line:  result = expr( );
  expr:        expr = term( ) { +|- term( ) };
  term:        term = factor( ) { *|/ factor( ) };
  factor:      factor = '(' expr() ')' | number() ...;

Using recursive descent, parse errors are usually detected at the bottom of the tree: in factor, number, etc.

  expr
  term
  factor       <- Parse error found here
38
Parsing Errors
If you set an error flag or return an error result, then all the methods must check for this condition... This error checking will make your methods longer and harder to understand.

  input line:  if ( error ) print "parse error";
  expr:        if ( error ) return /* what value? */;
  term:        if ( error ) return /* what value? */;
  factor:      if ( error ) return /* what value? */;
  expr:        if ( error ) return /* what value? */;
  term:        if ( error ) return /* what value? */;
  factor:      <- Parse error found here
39
Throwing an Exception
Your code will be simpler if the methods simply throw an exception and let the top-most method catch it. Let someone else handle it!

  input line:  try { result = expr( ); } catch (ParseException e) { /*error*/ }
  expr:        expr( ) throws ParseException { ... }
  term:        term( ) throws ParseException { ... }
  factor:      factor( ) throws ParseException { ... }
  expr:        expr( ) throws ParseException { ... }
  term:        term( ) throws ParseException { ... }
  factor:      throw new ParseException( )
40
Using Java's ParseException
Java has a ParseException class you can use: java.text.ParseException. The constructor requires two parameters:

  new ParseException("error message", offset);

Example:

  number( ) {
      /* parse a number */
      whitespace();
      token = tokenizer.next();
      if ( token.type != TokenType.NUMBER )
          throw new ParseException( "invalid number", cptr);
  }
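The sketch below shows the throw-at-the-bottom, catch-at-the-top pattern end to end with java.text.ParseException, whose two-argument constructor and getErrorOffset() method are part of the standard library. Everything else is an assumption made for illustration: the parser works on characters instead of tokens, accepts only unsigned integers, and the names ThrowingCalc and run() are hypothetical.

```java
import java.text.ParseException;

/** Sketch: report errors by throwing java.text.ParseException from the lowest rule. */
class ThrowingCalc {
    private final String src;
    private int pos = 0;  // offset of the next unread character

    ThrowingCalc(String src) { this.src = src; }

    private void skipSpaces() {
        while (pos < src.length() && src.charAt(pos) == ' ') pos++;
    }

    /** Skips blanks, then consumes 'what' if it is the next character. */
    private boolean match(char what) {
        skipSpaces();
        if (pos < src.length() && src.charAt(pos) == what) { pos++; return true; }
        return false;
    }

    double expr() throws ParseException {
        double result = term();
        while (true) {
            if (match('+')) result += term();
            else if (match('-')) result -= term();
            else return result;
        }
    }

    double term() throws ParseException {
        double result = factor();
        while (true) {
            if (match('*')) result *= factor();
            else if (match('/')) result /= factor();
            else return result;
        }
    }

    double factor() throws ParseException {
        if (match('(')) {
            double result = expr();
            if (!match(')')) throw new ParseException("missing right parenthesis", pos);
            return result;
        }
        return number();
    }

    double number() throws ParseException {
        skipSpaces();
        int start = pos;
        while (pos < src.length() && Character.isDigit(src.charAt(pos))) pos++;
        if (start == pos) throw new ParseException("invalid number", pos);
        return Double.parseDouble(src.substring(start, pos));
    }

    /** Top level: the only place that catches the exception. */
    static String run(String line) {
        ThrowingCalc c = new ThrowingCalc(line);
        try {
            double result = c.expr();
            c.skipSpaces();
            if (c.pos < line.length()) throw new ParseException("unexpected input", c.pos);
            return String.valueOf(result);
        } catch (ParseException e) {
            return e.getMessage() + " at offset " + e.getErrorOffset();
        }
    }
}
```

None of expr(), term(), or factor() contains any error-checking code of its own; each simply declares throws ParseException and lets the exception pass through to run().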
41
Defining your own ParseException
You can define a new Exception type for your own use.

  import java.io.IOException;

  class ParseException extends IOException {
      /* constructors */
      ParseException() { super("Parse Error"); }
      ParseException(String msg) { super(msg); }
      ParseException(String msg, int column) {
          super(msg + " in column " + column);
      }
  }
42
Using ParseException
You should try to return useful error messages, such as...

  factor( ) {
      if ( match('(') ) {
          result = expr( );
          if ( ! match(')') )
              throw new ParseException("missing right parenthesis");
      }
      ...
  }

The getMessage( ) method returns the error message...

  try {
      ...
  } catch(ParseException e) {
      println( e.getMessage() );
  }

Including the column number in error messages can be helpful.
43
Parsing Unary Minus Sign
Parsing negative numbers and unary minus can also be tricky. The following are valid expressions in most languages:

  sum = sum + -1;
  sum = sum - -2;
  sum = sum * -x;

The GNU C compiler (gcc) allows a space after the unary "-":

  sum = sum - - 2;

Exponentiation has higher precedence than unary minus, so it should be incorporated in a rule at the bottom of your grammar rules:

  -2 ^ 3 means -(2 ^ 3)
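One way to obtain this precedence is to split the bottom of the grammar into three rules. The rule names and the choice to make ^ right-associative are assumptions for this sketch, not something the slides specify:

  factor  => '-' factor | power
  power   => primary [ '^' factor ]
  primary => '(' expr ')' | number

A minimal Java sketch of that grammar follows; UnaryMinusCalc and eval() are hypothetical names.

```java
/** Sketch: unary minus binds more loosely than '^', so -2^3 is -(2^3). */
class UnaryMinusCalc {
    private final String src;
    private int pos = 0;

    UnaryMinusCalc(String src) { this.src = src; }

    /** Skips blanks, then consumes 'what' if it is the next character. */
    private boolean match(char what) {
        while (pos < src.length() && src.charAt(pos) == ' ') pos++;
        if (pos < src.length() && src.charAt(pos) == what) { pos++; return true; }
        return false;
    }

    /** expr => term { (+|-) term } */
    double expr() {
        double result = term();
        while (true) {
            if (match('+')) result += term();
            else if (match('-')) result -= term();
            else return result;
        }
    }

    /** term => factor { (*|/) factor } */
    double term() {
        double result = factor();
        while (true) {
            if (match('*')) result *= factor();
            else if (match('/')) result /= factor();
            else return result;
        }
    }

    /** factor => '-' factor | power */
    double factor() {
        if (match('-')) return -factor();  // the minus applies to everything power() returns
        return power();
    }

    /** power => primary [ '^' factor ] ; the exponent may carry its own sign */
    double power() {
        double base = primary();
        if (match('^')) return Math.pow(base, factor());
        return base;
    }

    /** primary => '(' expr ')' | number */
    double primary() {
        if (match('(')) {
            double value = expr();
            if (!match(')')) throw new IllegalArgumentException("missing right parenthesis");
            return value;
        }
        // match('(') above has already skipped any leading blanks.
        int start = pos;
        while (pos < src.length()
                && (Character.isDigit(src.charAt(pos)) || src.charAt(pos) == '.')) pos++;
        if (start == pos) throw new IllegalArgumentException("invalid number at " + pos);
        return Double.parseDouble(src.substring(start, pos));
    }

    static double eval(String s) { return new UnaryMinusCalc(s).expr(); }
}
```

Because the exponent is parsed by factor(), an input such as 2^-1 is accepted, and because unary minus is handled above power(), the sign in -2^3 is applied after the exponentiation. The blank-skipping in match() also accepts the gcc-style "5 - - 2".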
44
What's Next?
Later we will add to the implementation...
- symbol table and assignments
    x = 3.5E7
    a = 5
    b = 0.1
    y = ( a*x + b ) / ( a*x - b )
- built-in functions
    y = sqrt( x )
- user defined functions
    function f(x) = a*x + b
    f(0.5)