Context Free Grammars & Parsing CPSC 388 Fall 2001 Ellen Walker Hiram College
Syntax Analysis Program structure is a sequence of tokens Structure is too complex to be regular –Parentheses & braces are nested –If-else pairs need to match up Structure is context free
Context Free Grammar Rules are more general than in a regular grammar (RG): –Right side can be any sequence of terminals and non-terminals –Left side must be a single non-terminal S -> abScTdef (context free) AS -> bSAc (not context free) Every Regular Grammar is also a Context Free Grammar
Context Free Language The language of a CFG is the set of all strings of terminals that can be generated by repeatedly applying rules, starting from the start symbol. The sequence of rule applications is called a derivation.
Example CFG This grammar generates the strings of a*b* where the number of a’s = number of b’s S -> aSb | ε Derivation of aaabbb: S -> aSb -> aaSbb -> aaaSbbb -> aaabbb
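The derivation on this slide can be replayed mechanically. The sketch below is only an illustration: the function name `derive_anbn` is made up for this example, and the empty string stands in for ε.

```python
def derive_anbn(n: int) -> list[str]:
    """Return every sentential form in the derivation of a^n b^n.

    Uses the two rules S -> aSb and S -> ε from the slide.
    """
    current = "S"
    forms = [current]
    for _ in range(n):
        current = current.replace("S", "aSb", 1)  # apply S -> aSb
        forms.append(current)
    current = current.replace("S", "", 1)          # apply S -> ε
    forms.append(current)
    return forms


print(" -> ".join(derive_anbn(3)))
# S -> aSb -> aaSbb -> aaaSbbb -> aaabbb
```

Each pass through the loop adds one a on the left and one b on the right, which is why the counts always match.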
Same # a’s as b’s Grammar –S -> aSb | bSa | abS | baS | ε Derivation –“aabbabab” –S -> aSb -> aabSb -> aabbaSb -> aabbabaSb -> aabbabab
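One way to experiment with this grammar is to enumerate what it derives, up to a length limit. The sketch below assumes the five right sides aSb, bSa, abS, baS and ε (written as the empty string); the name `generate` is hypothetical. It checks one direction only: that every derived string has as many a's as b's. It does not show that every such string is derivable.

```python
from collections import deque

# Right sides of the rules for S; "" stands for ε.
RULES = ["aSb", "bSa", "abS", "baS", ""]


def generate(max_len: int) -> set[str]:
    """Collect every terminal string of length <= max_len derivable from S."""
    results = set()
    seen = {"S"}
    queue = deque(["S"])
    while queue:
        form = queue.popleft()
        if "S" not in form:
            results.add(form)          # no non-terminal left: a finished string
            continue
        for rhs in RULES:
            new = form.replace("S", rhs, 1)
            # Terminals never disappear, so forms that are too long are pruned.
            if len(new.replace("S", "")) <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return results


strings = generate(8)
print("aabbabab" in strings)                               # True
print(all(s.count("a") == s.count("b") for s in strings))  # True
```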
Formal Description of a CFG A set of terminals (e.g. {a,b}) A set of non-terminals (e.g. {S}) A start non-terminal (e.g. S) A set of rules with a single nonterminal on the left side.
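The four parts of the formal description map directly onto a small data structure. This is a sketch under assumptions: the class name `CFG`, its field names, and the choice to store each right side as a tuple of symbols are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CFG:
    terminals: frozenset      # e.g. {a, b}
    nonterminals: frozenset   # e.g. {S}
    start: str                # the start non-terminal
    rules: tuple              # pairs (left, right); right is a tuple of symbols

    def validate(self) -> None:
        """Raise ValueError if the four parts do not fit together."""
        if self.terminals & self.nonterminals:
            raise ValueError("a symbol cannot be both terminal and non-terminal")
        if self.start not in self.nonterminals:
            raise ValueError("start symbol must be a non-terminal")
        symbols = self.terminals | self.nonterminals
        for left, right in self.rules:
            if left not in self.nonterminals:
                raise ValueError(f"left side {left!r} is not a single non-terminal")
            for symbol in right:
                if symbol not in symbols:
                    raise ValueError(f"unknown symbol {symbol!r} in rule for {left}")


# The grammar S -> aSb | ε; the empty tuple stands for ε.
anbn = CFG(
    terminals=frozenset({"a", "b"}),
    nonterminals=frozenset({"S"}),
    start="S",
    rules=(("S", ("a", "S", "b")), ("S", ())),
)
anbn.validate()  # passes silently
```

A rule such as AS -> bSAc would fail the left-side check, since "AS" is not one of the non-terminals.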
Notation In formal (theory) CFGs, non-terminals are usually capital letters E -> E O E In programming-language & compiler CFGs, non-terminals are usually names, sometimes in angle brackets (BNF) -> Terminals can be single or multiple-character symbols, or token types
Parse Tree S (start symbol) is the root The leaves, read left to right, spell the string Intermediate nodes are non-terminals
Parse Tree Example S -> aSb -> aabSb -> aabbaSb -> aabbab
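The tree for this derivation can be written down as nested data. The sketch below assumes a made-up representation: an interior node is a pair (non-terminal, list of children), a leaf is a plain string, and the empty string marks an ε leaf. The helper names `tree_yield` and `show` are hypothetical.

```python
def tree_yield(tree) -> str:
    """Concatenate the leaves from left to right."""
    if isinstance(tree, str):
        return tree                      # a leaf: a terminal, or "" for ε
    _symbol, children = tree
    return "".join(tree_yield(child) for child in children)


def show(tree, depth: int = 0) -> list[str]:
    """Return the tree as indented lines, one node per line."""
    indent = "  " * depth
    if isinstance(tree, str):
        return [indent + (tree or "ε")]
    symbol, children = tree
    lines = [indent + symbol]
    for child in children:
        lines.extend(show(child, depth + 1))
    return lines


# S -> aSb, then S -> abS, then S -> baS, then S -> ε
tree = ("S", ["a", ("S", ["a", "b", ("S", ["b", "a", ("S", [""])])]), "b"])

print(tree_yield(tree))  # aabbab
print("\n".join(show(tree)))
```

The root is S, every interior node is a non-terminal, and the yield is the derived string, matching the three properties listed on the previous slide.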
Another Grammar S -> abScB | ε B -> bB | b Strings to generate: –abcb –ababccb Give both derivation and parse tree Find a string of length 5 that is accepted Find a string with 2 c’s that is accepted
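To try candidate strings against this grammar, a small recognizer helps. The sketch below uses recursive descent, a technique chosen here for illustration and not taken from the slide. It assumes the rules S -> abScB | ε and B -> bB | b, and the function name `accepts` is made up.

```python
def accepts(text: str) -> bool:
    """Return True if text can be derived from S."""
    pos = 0  # index of the next unread character

    def parse_S() -> bool:
        nonlocal pos
        if text.startswith("ab", pos):       # choose S -> abScB
            pos += 2
            if not parse_S():
                return False
            if not text.startswith("c", pos):
                return False
            pos += 1
            return parse_B()
        return True                           # choose S -> ε

    def parse_B() -> bool:
        nonlocal pos
        if not text.startswith("b", pos):
            return False
        while text.startswith("b", pos):     # B -> bB | b means one or more b's
            pos += 1
        return True

    return parse_S() and pos == len(text)


print(accepts("abcb"))  # True
print(accepts("abc"))   # False
```

Choosing S -> abScB whenever the input continues with "ab" is safe here because nothing that can follow S in this grammar starts with a.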
Grammar for Expression E -> E O E | ( E ) | a | b O -> + | - | *
Example Derivation E -> E O E -> ( E ) O E -> ( E O E ) O E -> ( a O E ) O E -> ( a + E ) O E -> ( a + b ) O E -> ( a + b ) * E -> ( a + b ) * a
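Every step in this derivation rewrites the leftmost non-terminal, so it can be replayed from a list of rule choices. The sketch below is an illustration under assumptions: symbols are kept as list items, and the names `RULES` and `leftmost_derivation` are made up.

```python
# The expression grammar: each non-terminal maps to its possible right sides.
RULES = {
    "E": [["E", "O", "E"], ["(", "E", ")"], ["a"], ["b"]],
    "O": [["+"], ["-"], ["*"]],
}


def leftmost_derivation(start: str, choices: list) -> list[str]:
    """Apply each chosen right side to the leftmost non-terminal in turn."""
    form = [start]
    steps = [" ".join(form)]
    for rhs in choices:
        index = next((i for i, s in enumerate(form) if s in RULES), None)
        if index is None:
            raise ValueError("no non-terminal left to rewrite")
        if rhs not in RULES[form[index]]:
            raise ValueError(f"{form[index]} -> {' '.join(rhs)} is not a rule")
        form = form[:index] + rhs + form[index + 1:]
        steps.append(" ".join(form))
    return steps


choices = [["E", "O", "E"], ["(", "E", ")"], ["E", "O", "E"],
           ["a"], ["+"], ["b"], ["*"], ["a"]]
print(" -> ".join(leftmost_derivation("E", choices)))
# E -> E O E -> ( E ) O E -> ( E O E ) O E -> ( a O E ) O E -> ( a + E ) O E -> ( a + b ) O E -> ( a + b ) * E -> ( a + b ) * a
```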
More CFL’s Palindromes over {a,b,c} Strings with twice as many a’s as b’s Strings with 3 more a’s than b’s (hint: create a non-terminal that generates equal #’s of a’s and b’s, then add 3 more a’s)