Published by Toby Allen. Modified over 9 years ago.
1
© Chuen-Liang Chen, NTUCS&IE
CONTEXT-FREE GRAMMARS
Chuen-Liang Chen
Department of Computer Science and Information Engineering
National Taiwan University
Taipei, TAIWAN
2
Parsing
function: checking the syntactic validity of the input string; producing the structure of the corresponding parse tree
callee: scanner (when a token is needed); semantic routine (when a production rule is matched)
theoretical basis: context-free grammar
executor: parser, syntax analyzer
top-down parsing
– begins at the start symbol and expands nonterminals in depth-first manner (predictive in nature)
– left-most derivation
– pre-order traversal of the parse tree
– e.g., LL(k) [read from Left; Left-most derivation; k lookaheads], recursive descent parsing
bottom-up parsing
– begins from the terminal string and determines the production used to generate the leaves
– right-most derivation in reverse order
– post-order traversal of the parse tree
– e.g., LR(k) [read from Left; Right-most derivation; k lookaheads]
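To make the top-down strategy concrete, here is a minimal recursive descent recognizer, offered as a sketch only. It assumes the small expression grammar E → Prefix ( E ) | V Tail, Prefix → F | λ, Tail → + E | λ, and a hypothetical one-character token encoding (F, V, (, ), +) with the string terminator standing in for end of input. The function names are illustrative, not taken from the slides.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed token encoding: 'F', 'V', '(', ')', '+'; '\0' marks end of input. */
static const char *cursor;            /* next unread token (the lookahead) */

static int parse_E(void);

/* Prefix -> F | lambda: consume F only when the lookahead is 'F'. */
static int parse_Prefix(void) {
    if (*cursor == 'F') cursor++;
    return 1;                         /* the lambda alternative always succeeds */
}

/* Tail -> + E | lambda */
static int parse_Tail(void) {
    if (*cursor == '+') { cursor++; return parse_E(); }
    return 1;
}

/* E -> Prefix ( E ) | V Tail: one token of lookahead selects the alternative. */
static int parse_E(void) {
    if (*cursor == 'V') { cursor++; return parse_Tail(); }
    if (*cursor == 'F' || *cursor == '(') {
        if (!parse_Prefix()) return 0;
        if (*cursor != '(') return 0;
        cursor++;
        if (!parse_E()) return 0;
        if (*cursor != ')') return 0;
        cursor++;
        return 1;
    }
    return 0;                         /* no production can start with this token */
}

/* Returns 1 if the whole token string is derivable from E, 0 otherwise. */
int accepts(const char *tokens) {
    assert(tokens != NULL);
    cursor = tokens;
    return parse_E() && *cursor == '\0';
}
```

Each nonterminal becomes one function, and the call sequence follows a left-most derivation: a function is entered when its nonterminal is predicted and returns when the corresponding subtree is complete.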
3
Definitions about context-free grammar (1/2)
context-free grammar -- G = (V_t, V_n, S, P)
V_t -- set of terminal symbols
V_n -- set of nonterminal symbols
– a, b, c, ... ∈ V_t
– A, B, C, ... ∈ V_n
– U, V, W, ... ∈ V = V_t ∪ V_n
– u, v, w, ... ∈ V_t*
– α, β, γ, ... ∈ V*
S -- start symbol, goal symbol; S ∈ V_n
P -- set of production rules of the form A → α
derivation by production rule A → γ
– one-step derivation: α A β ⇒ α γ β
– left-most derivation: u A β ⇒_lm u γ β
– right-most derivation: α A v ⇒_rm α γ v
– derivation in one or more steps: ⇒+, ⇒+_lm, ⇒+_rm
– derivation in zero or more steps: ⇒*, ⇒*_lm, ⇒*_rm
4
Definitions about context-free grammar (2/2)
set of sentential forms -- SF(G) = { α | S ⇒* α }
left-most sentential form -- an α such that S ⇒*_lm α
right-most sentential form -- an α such that S ⇒*_rm α
context-free language -- L(G) = SF(G) ∩ V_t*
parse tree, derivation tree -- graphic representation of derivations
– root -- start symbol
– leaf nodes -- grammar symbols or λ
– interior nodes -- nonterminals
– offspring of a nonterminal -- a production
for a given sentential form --
– phrase -- a sequence of symbols derived from a single nonterminal
– simple phrase, prime phrase -- minimal phrase
– handle -- left-most simple phrase
5
Example of context-free grammar
grammar G_0 --
E → Prefix ( E ) | V Tail
Prefix → F | λ
Tail → + E | λ
left-most derivation --
E ⇒_lm Prefix ( E )
  ⇒_lm F ( E )
  ⇒_lm F ( V Tail )
  ⇒_lm F ( V + E )
  ⇒_lm F ( V + V Tail )
  ⇒_lm F ( V + V )
right-most derivation --
E ⇒_rm Prefix ( E )
  ⇒_rm Prefix ( V Tail )
  ⇒_rm Prefix ( V + E )
  ⇒_rm Prefix ( V + V Tail )
  ⇒_rm Prefix ( V + V )
  ⇒_rm F ( V + V )
right-most sentential forms --
1. E
2. Prefix ( E )
3. Prefix ( V Tail )
4. Prefix ( V + E )
5. Prefix ( V + V Tail )
6. Prefix ( V + V )
7. F ( V + V )
8. and so on
L(G_0) ⊇ { F ( V + V ) }
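The two derivations of F ( V + V ) can be replayed mechanically. The sketch below assumes a hypothetical one-letter encoding of the example grammar (E, P for Prefix, T for Tail as nonterminals; F, V, (, ), + as terminals) and numbers the productions 0 to 5 in an order chosen here for illustration. It rewrites either the left-most or the right-most nonterminal of a sentential form and prints each step.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed encoding of the example grammar; production numbers are illustrative. */
static const struct { char lhs; const char *rhs; } G0[] = {
    { 'E', "P(E)" },   /* 0: E      -> Prefix ( E ) */
    { 'E', "VT"   },   /* 1: E      -> V Tail       */
    { 'P', "F"    },   /* 2: Prefix -> F            */
    { 'P', ""     },   /* 3: Prefix -> lambda       */
    { 'T', "+E"   },   /* 4: Tail   -> + E          */
    { 'T', ""     },   /* 5: Tail   -> lambda       */
};
#define NUM_PRODS ((int)(sizeof G0 / sizeof G0[0]))

static int is_nonterminal(char c) { return c == 'E' || c == 'P' || c == 'T'; }

/* Rewrite the left-most (leftmost != 0) or right-most nonterminal of `form`
 * in place using production `prod`. Returns 1 on success, 0 if the chosen
 * nonterminal does not match the production or the buffer is too small. */
int derive_step(char *form, size_t cap, int prod, int leftmost) {
    size_t n = strlen(form), i, pos = n, r;
    assert(prod >= 0 && prod < NUM_PRODS);
    for (i = 0; i < n; i++) {
        if (is_nonterminal(form[i])) {
            pos = i;
            if (leftmost) break;      /* keep scanning to find the right-most one */
        }
    }
    if (pos == n || form[pos] != G0[prod].lhs) return 0;
    r = strlen(G0[prod].rhs);
    if (n + r > cap) return 0;        /* new length n - 1 + r plus the terminator */
    memmove(form + pos + r, form + pos + 1, n - pos);  /* tail including '\0' */
    memcpy(form + pos, G0[prod].rhs, r);
    return 1;
}

/* Apply a sequence of productions, printing every sentential form. */
int derive(char *form, size_t cap, const int *prods, size_t count, int leftmost) {
    size_t i;
    for (i = 0; i < count; i++) {
        if (!derive_step(form, cap, prods[i], leftmost)) return 0;
        printf("%s %s\n", leftmost ? "=>lm" : "=>rm", form);
    }
    return 1;
}
```

Starting from "E", the production sequence 0, 2, 1, 4, 1, 5 applied left-most and the sequence 0, 1, 4, 1, 5, 2 applied right-most both end in F(V+V). The same productions are used in both cases; only the order of application differs.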
6
Example of left-most derivation
parse trees of left-most derivations
blue symbols: left-most sentential forms
[Figure: the parse tree after each step of the left-most derivation, growing from E to the full tree]
8
Example of top-down parsing
trace of top-down parsing (left-most derivation)
legend -- orange: just derived (predicted); blue: just read (matched); black: derived or read; green: un-processed (parse stack)
[Figure: the parse tree after each step of the top-down parse]
10
Example of right-most derivation (1/2)
parse trees of right-most derivations and the corresponding sentential forms, phrases, simple phrases, and handles
legend -- blue symbols: sentential form; marked regions: phrase, simple phrase, handle
sentential forms shown -- E; Prefix ( E ); Prefix ( V Tail ); Prefix ( V + E )
[Figure: one parse tree per sentential form]
11
Example of right-most derivation (2/2)
sentential forms shown -- Prefix ( V + V Tail ); Prefix ( V + V ); F ( V + V )
[Figure: one parse tree per sentential form]
13
Example of bottom-up parsing
trace of bottom-up parsing (inverse order of right-most derivation)
legend -- blue: just read (shifted); orange: just derived (reduced to); pink: not read; green: derived or read (parse stack)
[Figure: the partial parse trees after each shift or reduce step for the input F ( V + V )]
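A bottom-up trace of this kind can be reproduced with a small shift-reduce loop. The following is a hand-coded sketch for the example grammar only, not a table-driven LR parser: the reduce decisions are written out by hand, and the one-letter encoding (E, P for Prefix, T for Tail; tokens F, V, (, ), +) is an assumption made here. Every reduction replaces the right-hand side of a production by its left-hand side, so the function can only accept strings that are derivable from E.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define STACK_MAX 128

static char stack[STACK_MAX];   /* parse stack, kept NUL-terminated */
static size_t top;              /* number of symbols on the stack */

static int ends_with(const char *suffix) {
    size_t n = strlen(suffix);
    return top >= n && memcmp(stack + top - n, suffix, n) == 0;
}

/* Pop the handle (handle_len symbols) and push the left-hand side. */
static void replace_top(size_t handle_len, char lhs) {
    top -= handle_len;
    stack[top++] = lhs;
    stack[top] = '\0';
}

/* Try one reduction given lookahead `la` ('\0' at end of input). */
static int reduce(char la) {
    if (ends_with("P(E)")) { replace_top(4, 'E'); return 1; } /* E -> Prefix ( E ) */
    if (ends_with("VT"))   { replace_top(2, 'E'); return 1; } /* E -> V Tail */
    if (ends_with("+E"))   { replace_top(2, 'T'); return 1; } /* Tail -> + E */
    if (ends_with("F"))    { replace_top(1, 'P'); return 1; } /* Prefix -> F */
    if (ends_with("V") && la != '+') {                        /* Tail -> lambda */
        replace_top(0, 'T'); return 1;
    }
    if (la == '(' && !ends_with("P")) {                       /* Prefix -> lambda */
        replace_top(0, 'P'); return 1;
    }
    return 0;
}

/* Returns 1 if the token string reduces to E, 0 otherwise; prints the trace. */
int shift_reduce_accepts(const char *input) {
    assert(input != NULL);
    /* At most one lambda symbol is pushed per token, so 2 * length suffices. */
    if (2 * strlen(input) + 1 > STACK_MAX) return 0;
    top = 0;
    stack[0] = '\0';
    for (;;) {
        if (reduce(*input)) { printf("reduce  %s\n", stack); continue; }
        if (*input == '\0') break;          /* nothing left to shift */
        stack[top++] = *input++;
        stack[top] = '\0';
        printf("shift   %s\n", stack);
    }
    return strcmp(stack, "E") == 0;
}
```

For the input F(V+V) the reductions occur in the order F to Prefix, λ to Tail, V Tail to E, + E to Tail, V Tail to E, and finally Prefix ( E ) to E, which is the right-most derivation read backwards.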
14
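Examples - pork chop noodle special (排骨麵特餐)
example 1
pork-chop-noodle-special → iced-black-tea pork-chop-noodles sliced-orange
pork-chop-noodles → fried-pork-chop noodle-soup
lookahead is unnecessary
example 2
pork-chop-noodle-special → iced-black-tea pork-chop-noodles service sliced-orange
pork-chop-noodles → fried-pork-chop noodle-soup | noodle-soup fried-pork-chop
service → taro-ice | forget-it (λ)
lookahead is required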
15
Ambiguity of grammar
a string with two different parse trees (i.e., two different structures)
example: <expr> → <expr> - <expr> | id
[Figure: two different parse trees for the same string of id and - symbols]
for an unambiguous grammar, the parse trees of the left-most derivation and the right-most derivation are the same
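One way to see ambiguity is to count parse trees. The sketch below assumes the classic ambiguous grammar E → E - E | id (an assumption made here for illustration) and counts the distinct parse trees of a string id - id - ... - id containing n occurrences of id. Any count above 1 means the string has more than one structure.

```c
#include <assert.h>

/* Number of distinct parse trees of "id - id - ... - id" (n ids)
 * under the assumed grammar E -> E - E | id. */
unsigned long count_parse_trees(int n) {
    unsigned long total = 0;
    int left;
    assert(n >= 1 && n <= 15);   /* keep the plain recursion cheap */
    if (n == 1) return 1;        /* only E -> id applies */
    /* E -> E - E: choose which '-' is the root; `left` ids go to its left. */
    for (left = 1; left < n; left++)
        total += count_parse_trees(left) * count_parse_trees(n - left);
    return total;
}
```

With three ids the count is 2, corresponding to the groupings (id - id) - id and id - (id - id), which evaluate differently for subtraction.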
16
First set and Follow set (1/2)
First(α) = { a ∈ V_t | α ⇒* a β } ∪ ( if α ⇒* λ then { λ } else ∅ )
– set of all terminals that can begin a sentential form derived from α
First_k(α) -- set of k-symbol terminal strings that can begin a sentential form derived from α
QUIZ: for what?
Follow(A) = { a ∈ V_t | S ⇒+ ... A a ... } ∪ ( if S ⇒+ α A then { λ } else ∅ )
– set of all terminals that may follow A in some sentential form
Follow_k(A) -- set of k-symbol terminal strings that may follow A in some sentential form
QUIZ: for what?
17
First set and Follow set (2/2)
example 1 --
E → Prefix ( E )
E → V Tail
Prefix → F | λ
Tail → + E | λ
example 2 --
S → a S e | B
B → b B e | C
C → c C e | d
example 3 --
S → A B c
A → a | λ
B → b | λ
18
Algorithms for First & Follow sets (1/6)
    typedef int symbol;   /* a symbol in the grammar */

    /*
     * The symbolic constants used below, NUM_TERMINALS, NUM_NONTERMINALS,
     * and NUM_PRODUCTIONS, are determined by the grammar.
     * MAX_RHS_LENGTH should simply be "big enough."
     */
    #define VOCABULARY (NUM_NONTERMINALS + NUM_TERMINALS)

    typedef struct gram {
        symbol terminals[NUM_TERMINALS];
        symbol nonterminals[NUM_NONTERMINALS];
        symbol start_symbol;
        int num_productions;
        struct prod {
            symbol lhs;
            int rhs_length;
            symbol rhs[MAX_RHS_LENGTH];
        } productions[NUM_PRODUCTIONS];
        symbol vocabulary[VOCABULARY];
    } grammar;

    typedef struct prod production;
    typedef symbol terminal;
    typedef symbol nonterminal;
19
Algorithms for First & Follow sets (2/6)
    typedef short boolean;
    typedef boolean marked_vocabulary[VOCABULARY];

    /*
     * Mark those vocabulary symbols found to derive λ (directly or indirectly).
     * The result points to a static array, because a C function cannot
     * return an array by value.
     */
    boolean *mark_lambda(const grammar g)
    {
        static marked_vocabulary derives_lambda;
        boolean changes;              /* any changes during last iteration? */
        boolean rhs_derives_lambda;   /* does the RHS derive λ? */
        symbol v;                     /* a word in the vocabulary */
        production p;                 /* a production in the grammar */
        int i, j;                     /* loop variables */

        for (v = 0; v < VOCABULARY; v++)
            derives_lambda[v] = FALSE;   /* initially, nothing is marked */
20
Algorithms for First & Follow sets (3/6)
        do {
            changes = FALSE;
            for (i = 0; i < g.num_productions; i++) {
                p = g.productions[i];
                if (!derives_lambda[p.lhs]) {
                    if (p.rhs_length == 0) {
                        /* derives λ directly */
                        changes = derives_lambda[p.lhs] = TRUE;
                        continue;
                    }
                    /* does each part of the RHS derive λ? */
                    rhs_derives_lambda = derives_lambda[p.rhs[0]];
                    for (j = 1; j < p.rhs_length; j++)
                        rhs_derives_lambda = rhs_derives_lambda
                                             && derives_lambda[p.rhs[j]];
                    if (rhs_derives_lambda)
                        changes = derives_lambda[p.lhs] = TRUE;
                }
            }
        } while (changes);
        return derives_lambda;
    }
21
Algorithms for First & Follow sets (4/6)
    /* Set operations (∪, ∈, -, SET_OF) are written abstractly here. */
    typedef set_of_terminal_or_lambda termset;

    termset follow_set[NUM_NONTERMINALS];
    termset first_set[VOCABULARY];
    boolean *derives_lambda;   /* set to mark_lambda(g), as defined above, before use */

    termset compute_first(string_of_symbols alpha)
    {
        int i, k;
        termset result;

        k = length(alpha);
        if (k == 0)
            result = SET_OF( λ );
        else {
            result = first_set[alpha[0]] - SET_OF( λ );
            for (i = 1; i < k && λ ∈ first_set[alpha[i-1]]; i++)
                result = result ∪ ( first_set[alpha[i]] - SET_OF( λ ) );
            if (i == k && λ ∈ first_set[alpha[k - 1]])
                result = result ∪ SET_OF( λ );
        }
        return result;
    }
22
Algorithms for First & Follow sets (5/6)
    extern grammar g;

    void fill_first_set(void)
    {
        nonterminal A;
        terminal a;
        production p;
        boolean changes;
        int i, j;

        for (i = 0; i < NUM_NONTERMINALS; i++) {
            A = g.nonterminals[i];
            if (derives_lambda[A])
                first_set[A] = SET_OF( λ );
            else
                first_set[A] = ∅;
        }
        for (i = 0; i < NUM_TERMINALS; i++) {
            a = g.terminals[i];
            first_set[a] = SET_OF( a );
            for (j = 0; j < NUM_NONTERMINALS; j++) {
                A = g.nonterminals[j];
                if (there exists a production A → a β)
                    first_set[A] = first_set[A] ∪ SET_OF( a );
            }
        }
        do {
            changes = FALSE;
            for (i = 0; i < g.num_productions; i++) {
                p = g.productions[i];
                first_set[p.lhs] = first_set[p.lhs] ∪ compute_first(p.rhs);
                if ( first_set changed )
                    changes = TRUE;
            }
        } while (changes);
    }
QUIZ: termination?
QUIZ: correctness?
23
Algorithms for First & Follow sets (6/6)
    void fill_follow_set(void)
    {
        nonterminal A, B;
        int i;
        boolean changes;

        for (i = 0; i < NUM_NONTERMINALS; i++) {
            A = g.nonterminals[i];
            follow_set[A] = ∅;
        }
        follow_set[g.start_symbol] = SET_OF( λ );
        do {
            changes = FALSE;
            for (each production A → α B β) {
                /*
                 * I.e. for each production and each occurrence
                 * of a nonterminal in its right-hand side.
                 */
                follow_set[B] = follow_set[B] ∪ (compute_first(β) - SET_OF( λ ));
                if ( λ ∈ compute_first(β) )
                    follow_set[B] = follow_set[B] ∪ follow_set[A];
                if ( follow_set[B] changed )
                    changes = TRUE;
            }
        } while (changes);
    }
QUIZ: termination?
QUIZ: correctness?
24
Tracing examples
example 1 --
E → Prefix ( E )
E → V Tail
Prefix → F | λ
Tail → + E | λ
example 2 --
S → a S e | B
B → b B e | C
C → c C e | d
example 3 --
S → A B c
A → a | λ
B → b | λ
[Figure: step-by-step trace annotations on the three grammars; the markers did not survive extraction]
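For tracing by machine, a compact runnable variant of the First and Follow computation is sketched below for example 3 (S → A B c, A → a | λ, B → b | λ). It is a simplified reworking under stated assumptions, not the slides' code: sets are bitmasks, one extra bit (LAMBDA_BIT, a name introduced here) stands for λ, and λ reaches the First sets through the empty right-hand sides instead of a separate marking pass. The symbol names T_a, N_S and so on are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Terminals first, then nonterminals (assumed numbering). */
enum { T_a, T_b, T_c, N_S, N_A, N_B, NUM_SYMBOLS };
#define NUM_TERMINALS 3
#define LAMBDA_BIT (1u << NUM_TERMINALS)   /* stands for lambda */
#define MAX_RHS 3

typedef unsigned termset;                  /* bit t set: terminal t is a member */

struct production { int lhs; int rhs_length; int rhs[MAX_RHS]; };

static const struct production prods[] = {
    { N_S, 3, { N_A, N_B, T_c } },         /* S -> A B c  */
    { N_A, 1, { T_a } },                   /* A -> a      */
    { N_A, 0, { 0 } },                     /* A -> lambda */
    { N_B, 1, { T_b } },                   /* B -> b      */
    { N_B, 0, { 0 } },                     /* B -> lambda */
};
#define NUM_PRODS (sizeof prods / sizeof prods[0])

termset first_set[NUM_SYMBOLS];
termset follow_set[NUM_SYMBOLS];

/* First of the symbol string syms[0..n-1], using the current first_set[]. */
static termset first_of_string(const int *syms, int n) {
    termset result = 0;
    int i;
    assert(n >= 0);
    for (i = 0; i < n; i++) {
        result |= first_set[syms[i]] & ~LAMBDA_BIT;
        if (!(first_set[syms[i]] & LAMBDA_BIT))
            return result;                 /* syms[i] cannot vanish: stop here */
    }
    return result | LAMBDA_BIT;            /* every symbol can derive lambda */
}

void fill_first_sets(void) {
    int t, changes;
    size_t i;
    for (t = 0; t < NUM_SYMBOLS; t++)
        first_set[t] = (t < NUM_TERMINALS) ? (1u << t) : 0;
    do {                                   /* iterate until nothing changes */
        changes = 0;
        for (i = 0; i < NUM_PRODS; i++) {
            termset updated = first_set[prods[i].lhs]
                | first_of_string(prods[i].rhs, prods[i].rhs_length);
            if (updated != first_set[prods[i].lhs]) {
                first_set[prods[i].lhs] = updated;
                changes = 1;
            }
        }
    } while (changes);
}

void fill_follow_sets(void) {
    int t, j, changes;
    size_t i;
    for (t = 0; t < NUM_SYMBOLS; t++) follow_set[t] = 0;
    follow_set[N_S] = LAMBDA_BIT;          /* nothing follows the start symbol */
    do {
        changes = 0;
        for (i = 0; i < NUM_PRODS; i++) {
            const struct production *p = &prods[i];
            for (j = 0; j < p->rhs_length; j++) {
                int sym = p->rhs[j];
                termset rest, updated;
                if (sym < NUM_TERMINALS) continue;   /* only nonterminals */
                rest = first_of_string(p->rhs + j + 1, p->rhs_length - j - 1);
                updated = follow_set[sym] | (rest & ~LAMBDA_BIT);
                if (rest & LAMBDA_BIT) updated |= follow_set[p->lhs];
                if (updated != follow_set[sym]) {
                    follow_set[sym] = updated;
                    changes = 1;
                }
            }
        }
    } while (changes);
}
```

Calling fill_first_sets() and then fill_follow_sets() gives First(S) = { a, b, c }, First(A) = { a, λ }, First(B) = { b, λ }, Follow(A) = { b, c }, and Follow(B) = { c }. Both loops terminate because each pass can only add members to finite sets.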
25
From extended BNF to CFG
{ }
QUIZ: how, systematically?
26
Other types of grammars
regular grammar -- A → a B or C → λ
QUIZ: how?
context-free grammar -- A → α
context-sensitive grammar -- α A β → α γ β
type 0 grammar -- α → β
regular grammar: too simple, e.g., it cannot specify { [^i ]^i | i ≥ 1 }
QUIZ: how to specify { [^i ]^i | i ≥ 1 } by context-free grammar?
context-sensitive, type 0: no adequate parser available
context-free grammar: a balance between generality and practicality
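The nested-bracket language shows what context-free rules add over regular ones. One possible grammar, assumed here for illustration, is S → [ S ] | [ ]. The recursion in the rule is what matches each opening bracket with its closing partner, and a recognizer for it is only a few lines long.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed grammar: S -> [ S ] | [ ]
 * Returns a pointer just past the phrase derived from S, or NULL on error. */
static const char *parse_S(const char *p) {
    if (*p != '[') return NULL;
    p++;
    if (*p == '[') {                 /* lookahead '[' selects S -> [ S ] */
        p = parse_S(p);
        if (p == NULL) return NULL;
    }
    if (*p != ']') return NULL;      /* every '[' needs its matching ']' */
    return p + 1;
}

/* Returns 1 if s consists of i opening brackets followed by i closing
 * brackets for some i >= 1, and 0 otherwise. */
int is_nested_brackets(const char *s) {
    const char *end;
    assert(s != NULL);
    end = parse_S(s);
    return end != NULL && *end == '\0';
}
```

The depth of the call stack equals the number of unmatched opening brackets seen so far. That count is unbounded, which is the information a finite automaton, and hence a regular grammar, cannot keep.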