Top-Down Parsing Identify a leftmost derivation for an input string

Why? By always replacing the leftmost non-terminal symbol via a production rule, we are guaranteed to develop the parse tree in a left-to-right fashion that is consistent with scanning the input.
Example: A ⇒ aBc ⇒ adDc ⇒ adec (scan a, scan d, scan e, scan c: accept!)
Topics:
- Recursive-descent parsing concepts
- Predictive parsing
- Recursive / brute-force technique
- Non-recursive / table-driven
- Error recovery
- Implementation

Top-Down Parsing From Grammar to Parser, take I

Recursive Descent Parsing
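General category of top-down parsing.
Choose a production rule based on the input symbol.
May require backtracking to correct a wrong choice.
Example:
S → c A d
A → ab | a
Input: cad
Parse steps (from the sequence of parse trees on the slide):
1. Expand S → c A d and match c.
2. Try A → ab: a matches, but b does not match the next input symbol d.
3. Problem: backtrack. Undo the expansion of A and reset the input pointer to a.
4. Try A → a: a matches, then d matches, and the input cad is accepted.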
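The backtracking behaviour can be sketched in a few lines of Python. This is only an illustration for the grammar S → c A d, A → ab | a; the function names `parse_A` and `parse_S` are made up for this sketch and do not come from the slides. The generator yields one candidate end position per alternative of A, and moving on to the next candidate is the backtrack.

```python
def parse_A(s, pos):
    """Yield every position where A can end, trying A -> a b before A -> a."""
    if s.startswith("ab", pos):   # first alternative: A -> a b
        yield pos + 2
    if s.startswith("a", pos):    # second alternative: A -> a
        yield pos + 1


def parse_S(s):
    """Return True if the whole string s is derivable from S -> c A d."""
    if not s.startswith("c"):
        return False
    # Each further value drawn from parse_A is a backtrack: the input
    # position is reset to 1 and the next alternative for A is tried.
    for end in parse_A(s, 1):
        if s.startswith("d", end) and end + 1 == len(s):
            return True
    return False


print(parse_S("cad"))   # first alternative fails on 'b', second succeeds
```

Trying alternatives in order and rewinding the input is what makes this approach potentially expensive, which is the motivation for predictive parsing.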

Top-Down Parsing From Grammar to Parser, take II

Predictive Parsing Backtracking is bad!
To eliminate backtracking, what must we do / be sure of for the grammar?
- no left recursion
- apply left factoring (frequently)
When the grammar satisfies the above conditions, the current input symbol in conjunction with the current non-terminal uniquely determines the production that needs to be applied.
Utilize transition diagrams. For each non-terminal of the grammar do the following:
1. Create an initial and final state.
2. If A → X1 X2 … Xn is a production, add a path with edges X1, X2, … , Xn.
Once transition diagrams have been developed, apply a straightforward technique to turn them into procedures, possibly recursive.

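Transition Diagrams
Unlike their lexical equivalents, each edge represents a token.
A transition implies: if the edge is a token, match the input; else call the procedure for the non-terminal.
Recall the earlier grammar and its associated transition diagrams:
E → TE’
E’ → + TE’ | ε
T → FT’
T’ → * FT’ | ε
F → ( E ) | id
Diagrams (state numbers as on the slide; the start state of E is taken to be 0):
E: 0 -T-> 1 -E’-> 2
E’: 3 -+-> 4 -T-> 5 -E’-> 6, and 3 -ε-> 6
T: 7 -F-> 8 -T’-> 9
T’: 10 -*-> 11 -F-> 12 -T’-> 13, and 10 -ε-> 13
F: 14 -(-> 15 -E-> 16 -)-> 17, and 14 -id-> 17
Questions:
- How are transition diagrams used?
- Are ε-moves a problem?
- Can we simplify transition diagrams?
- Why is simplification critical?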

How are Transition Diagrams Used ?
main() { TD_E(); }
TD_E() { TD_T(); TD_E’(); }
TD_E’() {
  token = get_token();
  if token = ‘+’ then { TD_T(); TD_E’(); }
  /* ε-move: else unget() and terminate */
}
TD_T() { TD_F(); TD_T’(); }
TD_T’() {
  token = get_token();
  if token = ‘*’ then { TD_F(); TD_T’(); }
  /* ε-move: else unget() and terminate */
}
TD_F() {
  token = get_token();
  if token = ‘(’ then { TD_E(); match(‘)’); }
  else if token.value <> id then { error + EXIT }
  ...
}
What happened to ε-moves? … “else unget() and terminate”
NOTE: not all error conditions have been represented.
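As a runnable counterpart to the TD_ procedures, here is a minimal sketch in Python. It assumes the input is already a list of token strings such as "id", "+", "(" and uses one-token lookahead (`peek`) instead of get_token/unget; the class and method names are illustrative, not taken from the slides.

```python
class ParseError(Exception):
    """Raised when the input does not match the grammar."""


class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]   # "$" marks end of input
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def match(self, expected):
        if self.peek() != expected:
            raise ParseError(f"expected {expected!r}, found {self.peek()!r}")
        self.pos += 1

    def E(self):            # E -> T E'
        self.T()
        self.E_prime()

    def E_prime(self):      # E' -> + T E' | epsilon
        if self.peek() == "+":
            self.match("+")
            self.T()
            self.E_prime()
        # otherwise take the epsilon-move: consume nothing and return

    def T(self):            # T -> F T'
        self.F()
        self.T_prime()

    def T_prime(self):      # T' -> * F T' | epsilon
        if self.peek() == "*":
            self.match("*")
            self.F()
            self.T_prime()

    def F(self):            # F -> ( E ) | id
        if self.peek() == "(":
            self.match("(")
            self.E()
            self.match(")")
        else:
            self.match("id")


def accepts(tokens):
    """Return True if the token list is a sentence of the expression grammar."""
    parser = Parser(tokens)
    try:
        parser.E()
        parser.match("$")
        return True
    except ParseError:
        return False


print(accepts(["id", "+", "id", "*", "id"]))
```

Because the lookahead token is inspected without being consumed, the ε-moves need no explicit unget step: the procedure simply returns.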

How can Transition Diagrams be Simplified ?
[Figure: transition diagram for E’: 3 -+-> 4 -T-> 5 -E’-> 6, with an ε-edge from 3 to 6.]

How can Transition Diagrams be Simplified ? (2)
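[Figure sequence, built up over several slides:]
Step 1: start from E’: 3 -+-> 4 -T-> 5 -E’-> 6, with 3 -ε-> 6.
Step 2: the edge labeled E’ leaving state 5 is a tail-recursive call, so replace it with an ε-edge from 5 back to 3: 3 -+-> 4 -T-> 5 -ε-> 3, with 3 -ε-> 6.
Step 3: remove that ε-edge by merging state 5 into state 3: 3 -+-> 4 -T-> 3, with 3 -ε-> 6.
Step 4: substitute the result for the E’ edge in the diagram for E, giving E: T leads into state 3, then 3 -+-> 4 -T-> 3, with 3 -ε-> 6.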

Additional Transition Diagram Simplifications
Similar steps for T and T’.
Simplified transition diagrams:
T: 7 -F-> 10, 10 -*-> 7, with 10 -ε-> 13
T’: 10 -*-> 11 -F-> 10, with 10 -ε-> 13
F: 14 -(-> 15 -E-> 16 -)-> 17, and 14 -id-> 17
Why is simplification important? How does the code change?
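One plausible answer to "How does the code change?": once the tail-recursive edges for E’ and T’ have been turned into loops, the procedures for E’ and T’ disappear and E and T become simple while-loops. The sketch below assumes fully simplified diagrams (E: T, then loop on +; T: F, then loop on *) and a token list as input; function names and the use of `ValueError` are choices made for this illustration.

```python
def parse_F(toks, i):
    """F -> ( E ) | id. Return the index just past F."""
    if toks[i] == "(":
        i = parse_E(toks, i + 1)
        if toks[i] != ")":
            raise ValueError(f"expected ')' at token {i}")
        return i + 1
    if toks[i] == "id":
        return i + 1
    raise ValueError(f"unexpected token {toks[i]!r} at {i}")


def parse_T(toks, i):
    """T: one F, then loop back on every '*' (replaces the T' procedure)."""
    i = parse_F(toks, i)
    while toks[i] == "*":
        i = parse_F(toks, i + 1)
    return i


def parse_E(toks, i):
    """E: one T, then loop back on every '+' (replaces the E' procedure)."""
    i = parse_T(toks, i)
    while toks[i] == "+":
        i = parse_T(toks, i + 1)
    return i


def accepts(tokens):
    toks = list(tokens) + ["$"]          # "$" is never consumed by a parse_* call
    try:
        return parse_E(toks, 0) == len(toks) - 1
    except ValueError:
        return False


print(accepts(["id", "*", "(", "id", "+", "id", ")"]))
```

Fewer procedures and no recursion for the repeated operators means fewer calls per token, which is one reason the simplification matters.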

Top-Down Parsing From Grammar to Parser, take III

Motivating Table-Driven Parsing
1. Left-to-right scan of the input
2. Find a leftmost derivation
Grammar:
E → TE’
E’ → +TE’ | ε
T → id
Input: id + id $ ($ is the terminator)
Derivation: E ⇒ …
Processing stack:

Non-Recursive / Table Driven
[Figure: an input buffer (… + b $), a stack (X Y Z $), the predictive parsing program, its output, and the parsing table M[A,a].]
Input: string + terminator
Stack: NT + T symbols of the CFG; $ is the empty-stack symbol
Parsing table M[A,a]: what actions the parser should take based on stack / input
General parser behavior (X: top of stack, a: current input):
1. When X = a = $: halt, accept, success.
2. When X = a ≠ $: POP X off the stack, advance the input, go to 1.
3. When X is a non-terminal, examine M[X,a]:
   - if it is an error, call the recovery routine;
   - if M[X,a] = {X → UVW}, POP X, PUSH W, V, U. DO NOT expend any input.

Algorithm for Non-Recursive Parsing
Set ip to point to the first symbol of w$;   /* ip: input pointer */
repeat
  let X be the top stack symbol and a the symbol pointed to by ip;
  if X is a terminal or $ then
    if X = a then pop X from the stack and advance ip
    else error()
  else   /* X is a non-terminal */
    if M[X,a] = X → Y1Y2…Yk then begin
      pop X from the stack;
      push Yk, Yk-1, … , Y1 onto the stack, with Y1 on top;
      output the production X → Y1Y2…Yk   /* may also execute other code based on the production used */
    end
    else error()
until X = $   /* stack is empty */
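The loop above translates almost line for line into Python. The sketch below is an illustration, not the slides' implementation: it hard-codes the standard LL(1) table for the expression grammar E → TE’, E’ → +TE’ | ε, T → FT’, T’ → *FT’ | ε, F → (E) | id, encodes ε as an empty tuple, and uses made-up names (`TABLE`, `ll1_parse`).

```python
EPS = ()   # assumed encoding: an empty right-hand side stands for epsilon

NONTERMINALS = {"E", "E'", "T", "T'", "F"}

# M[X, a] -> right-hand side of the production to apply
TABLE = {
    ("E", "id"): ("T", "E'"),      ("E", "("): ("T", "E'"),
    ("E'", "+"): ("+", "T", "E'"), ("E'", ")"): EPS, ("E'", "$"): EPS,
    ("T", "id"): ("F", "T'"),      ("T", "("): ("F", "T'"),
    ("T'", "+"): EPS,              ("T'", "*"): ("*", "F", "T'"),
    ("T'", ")"): EPS,              ("T'", "$"): EPS,
    ("F", "id"): ("id",),          ("F", "("): ("(", "E", ")"),
}


def ll1_parse(tokens, table=TABLE, start="E"):
    """Return the list of productions used; raise ValueError on a syntax error."""
    inp = list(tokens) + ["$"]
    stack = ["$", start]            # top of stack is the end of the list
    ip = 0
    output = []
    while True:
        X, a = stack[-1], inp[ip]
        if X == "$" and a == "$":
            return output           # accept
        if X not in NONTERMINALS:   # X is a terminal or $
            if X != a:
                raise ValueError(f"expected {X!r}, found {a!r}")
            stack.pop()
            ip += 1                 # expend one input symbol
        else:
            rhs = table.get((X, a))
            if rhs is None:
                raise ValueError(f"no entry M[{X}, {a}]")
            stack.pop()
            stack.extend(reversed(rhs))   # push Yk ... Y1, leaving Y1 on top
            output.append((X, rhs))


for lhs, rhs in ll1_parse(["id", "+", "id", "*", "id"]):
    print(lhs, "->", " ".join(rhs) or "ε")
```

The printed productions, read top to bottom, are the steps of a leftmost derivation of the input.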

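Example
Our well-worn example!
E → TE’
E’ → + TE’ | ε
T → FT’
T’ → * FT’ | ε
F → ( E ) | id
Table M (rows: non-terminal; columns: input symbol; blank entries are errors)
Non-terminal | id       | +          | *          | (         | )       | $
E            | E → TE’  |            |            | E → TE’   |         |
E’           |          | E’ → +TE’  |            |           | E’ → ε  | E’ → ε
T            | T → FT’  |            |            | T → FT’   |         |
T’           |          | T’ → ε     | T’ → *FT’  |           | T’ → ε  | T’ → ε
F            | F → id   |            |            | F → (E)   |         |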

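Trace of Example
Input: id + id * id$. Rows with a blank OUTPUT entry follow a step that matched a terminal and expended input.
STACK     | INPUT          | OUTPUT
$E        | id + id * id$  |
$E’T      | id + id * id$  | E → TE’
$E’T’F    | id + id * id$  | T → FT’
$E’T’id   | id + id * id$  | F → id
$E’T’     | + id * id$     |
$E’       | + id * id$     | T’ → ε
$E’T+     | + id * id$     | E’ → +TE’
$E’T      | id * id$       |
$E’T’F    | id * id$       | T → FT’
$E’T’id   | id * id$       | F → id
$E’T’     | * id$          |
$E’T’F*   | * id$          | T’ → *FT’
$E’T’F    | id$            |
$E’T’id   | id$            | F → id
$E’T’     | $              |
$E’       | $              | T’ → ε
$         | $              | E’ → ε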

Leftmost Derivation for the Example
The leftmost derivation for the example is as follows:
E ⇒ TE’ ⇒ FT’E’ ⇒ id T’E’ ⇒ id E’ ⇒ id + TE’ ⇒ id + FT’E’ ⇒ id + id T’E’ ⇒ id + id * FT’E’ ⇒ id + id * id T’E’ ⇒ id + id * id E’ ⇒ id + id * id

What’s the Missing Puzzle Piece ?
Constructing the Parsing Table M!
1st: Calculate First & Follow for the grammar.
2nd: Apply the construction algorithm for the parsing table (we’ll see this shortly).
Basic tools:
First: Let α be a string of grammar symbols. First(α) is the set that includes every terminal that appears leftmost in α or in any string originating from α. NOTE: If α ⇒* ε, then ε is in First(α).
Follow: Let A be a non-terminal. Follow(A) is the set of terminals a that can appear directly to the right of A in some sentential form (S ⇒* αAaβ, for some α and β). NOTE: If S ⇒* αA, then $ is in Follow(A).

Motivation Behind First & Follow
First and Follow are used to help find the appropriate production to apply, given the top-of-the-stack non-terminal and the current input symbol.
First: Example: If A → α, and a is in First(α), then when a = input, replace A with α (in the stack). (a is one of the first symbols of α, so when A is on the stack and a is the input, POP A and PUSH α.)
Follow: Is used when First has a conflict, to resolve choices, or when First gives no suggestion. When α = ε or α ⇒* ε, what follows A dictates the next choice to be made.
Example: If A → α, and b is in Follow(A), then when α ⇒* ε and b is the input character, we expand A with α, which will eventually expand to ε, which b then follows. (α ⇒* ε means that First(α) contains ε.)

An example.
S → a B C d
B → CB | ε | S a
C → b
STACK | INPUT  | OUTPUT
$S    | abbd$  |

Computing First(X) : All Grammar Symbols
1. If X is a terminal, First(X) = {X}.
2. If X → ε is a production rule, add ε to First(X).
3. If X is a non-terminal and X → Y1Y2…Yk is a production rule:
   Place First(Y1) in First(X);
   if Y1 ⇒* ε, place First(Y2) in First(X);
   if Y2 ⇒* ε, place First(Y3) in First(X);
   …
   if Yk-1 ⇒* ε, place First(Yk) in First(X).
   NOTE: As soon as some Yi does not derive ε, stop.
Repeat the above steps until no more elements are added to any First( ) set.
Checking “Yj ⇒* ε?” essentially amounts to checking whether ε belongs to First(Yj).
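The rules above describe a fixed-point computation, which can be sketched as follows. The grammar encoding is an assumption of this sketch (a dict from non-terminal to a list of right-hand-side tuples, with the empty tuple standing for ε), as are the names `first_sets` and `EXPR`. Following the usual convention, ε from First(Yi) is not copied into First(X) unless every Yi can derive ε.

```python
EPS = "ε"


def first_sets(grammar):
    """grammar: dict non-terminal -> list of right-hand sides (tuples); () is epsilon."""
    first = {nt: set() for nt in grammar}

    def first_of(sym):
        # Rule 1: a terminal's First set is the terminal itself.
        return first[sym] if sym in grammar else {sym}

    changed = True
    while changed:                      # repeat until no set grows
        changed = False
        for nt, alternatives in grammar.items():
            for rhs in alternatives:
                before = len(first[nt])
                for sym in rhs:
                    f = first_of(sym)
                    first[nt] |= f - {EPS}
                    if EPS not in f:    # Yi cannot vanish: stop here
                        break
                else:
                    # Rule 2, or every Yi derives epsilon
                    first[nt].add(EPS)
                if len(first[nt]) != before:
                    changed = True
    return first


# The expression grammar used throughout these slides.
EXPR = {
    "E":  [("T", "E'")],
    "E'": [("+", "T", "E'"), ()],
    "T":  [("F", "T'")],
    "T'": [("*", "F", "T'"), ()],
    "F":  [("(", "E", ")"), ("id",)],
}

for nt, fs in first_sets(EXPR).items():
    print(nt, sorted(fs))
```

The `for ... else` clause runs only when the inner loop finishes without `break`, which is exactly the case where all symbols of the right-hand side can derive ε.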

Computing First(X) : All Grammar Symbols - continued
Informally, suppose we want to compute
First(X1 X2 … Xn) = First(X1)
   “+” First(X2) if ε is in First(X1)
   “+” First(X3) if ε is in First(X2)
   “+” …
   “+” First(Xn) if ε is in First(Xn-1)
Note 1: Only add ε to First(X1 X2 … Xn) if ε is in First(Xi) for all i.
Note 2: For First(X1), if X1 → Z1 Z2 … Zm, then we need to compute First(Z1 Z2 … Zm)!
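A small sketch of First for a string of symbols, assuming the First sets of the individual non-terminals are already known. The `FIRST` dictionary below holds the sets for the expression grammar E → TE’, E’ → +TE’ | ε, T → FT’, T’ → *FT’ | ε, F → (E) | id; the function name and the dictionary layout are illustrative choices.

```python
EPS = "ε"

# First sets of the non-terminals of the expression grammar.
FIRST = {
    "E":  {"(", "id"},
    "E'": {"+", EPS},
    "T":  {"(", "id"},
    "T'": {"*", EPS},
    "F":  {"(", "id"},
}


def first_of_string(symbols, first):
    """First(X1 X2 ... Xn); any symbol not in `first` is treated as a terminal."""
    result = set()
    for sym in symbols:
        f = first.get(sym, {sym})
        result |= f - {EPS}
        if EPS not in f:
            return result        # Xi cannot vanish, so later symbols do not matter
    result.add(EPS)              # Note 1: every Xi can derive epsilon
    return result


print(sorted(first_of_string(("T'", "E'"), FIRST)))
```

For the empty string the loop body never runs, so the result is {ε}, which matches the convention that First(ε) = {ε}.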

Example 1
Given the production rules:
S → i E t SS’ | a
S’ → eS | ε
E → b
Verify that
First(S) = { i, a }
First(S’) = { e, ε }
First(E) = { b }

Example 2
Computing First for:
E → TE’
E’ → + TE’ | ε
T → FT’
T’ → * FT’ | ε
F → ( E ) | id
First(E) = First(TE’) = First(T) “+” First(E’); First(E’) is not included since T does not derive ε.
First(T) = First(FT’) = First(F) “+” First(T’); First(T’) is not included since F does not derive ε.
First(F) = First((E)) “+” First(id) = { “(“, “id” }
Overall:
First(E) = { ( , id } = First(F)
First(E’) = { + , ε }
First(T’) = { * , ε }
First(T) = First(F) = { ( , id }
