CFGs and PDAs Sipser 2 (pages )
Long long ago…
CS 311 Fall
Context-free grammars
A context-free grammar G is a quadruple (V, Σ, R, S), where
– V is a finite set called the variables
– Σ is a finite set, disjoint from V, called the terminals
– R is a finite subset of V × (V ∪ Σ)* called the rules
– S ∈ V is called the start symbol
For any A ∈ V and u ∈ (V ∪ Σ)*, we write A →_G u whenever (A, u) ∈ R.
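The quadruple can be written down almost literally as a data structure. The following is a minimal sketch, not part of the original slides: the names `Grammar`, `is_well_formed`, and `derives_in_one_step` are hypothetical, and the sample grammar for { aⁿbⁿ : n ≥ 0 } is chosen only for illustration.

```python
from typing import NamedTuple

class Grammar(NamedTuple):
    variables: frozenset   # V
    terminals: frozenset   # Σ
    rules: frozenset       # R: pairs (A, u) with u a tuple of symbols
    start: str             # S

def is_well_formed(g: Grammar) -> bool:
    """Check the side conditions stated in the definition."""
    symbols = g.variables | g.terminals
    return (
        g.variables.isdisjoint(g.terminals)              # Σ disjoint from V
        and g.start in g.variables                       # S ∈ V
        and all(a in g.variables and all(x in symbols for x in u)
                for a, u in g.rules)                     # R ⊆ V × (V ∪ Σ)*
    )

def derives_in_one_step(g: Grammar, a: str, u) -> bool:
    """A →_G u holds exactly when (A, u) ∈ R."""
    return (a, tuple(u)) in g.rules

# Illustrative grammar (an assumption of this sketch): S → aSb | ε
g = Grammar(
    variables=frozenset({"S"}),
    terminals=frozenset({"a", "b"}),
    rules=frozenset({("S", ("a", "S", "b")), ("S", ())}),
    start="S",
)
```

Representing each right-hand side as a tuple keeps the empty string ε distinct from "no rule": ε is simply the empty tuple.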
Arithmetic expressions and parse trees
Consider G = (V, Σ, R, S), where
– V = {⟨EXPR⟩, ⟨TERM⟩, ⟨FACTOR⟩}
– Σ = {a, +, ×, (, )}
– R = { ⟨EXPR⟩ →_G ⟨EXPR⟩ + ⟨TERM⟩ | ⟨TERM⟩,
        ⟨TERM⟩ →_G ⟨TERM⟩ × ⟨FACTOR⟩ | ⟨FACTOR⟩,
        ⟨FACTOR⟩ →_G ( ⟨EXPR⟩ ) | a }
– S = ⟨EXPR⟩
What about a × a + a?
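To see how this grammar groups a × a + a, here is a small hand-written parser sketch. It is an illustration under stated assumptions rather than a construction from the slides: the function names `parse`, `expr`, `term`, and `factor` are hypothetical, the left-recursive rules are replaced by loops (a standard workaround, since naive recursive descent would not terminate on them), and the result is a simplified operator tree, not a full parse tree with one node per variable.

```python
def parse(tokens):
    """Parse a list of one-character tokens; return a nested-tuple tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def eat(tok):
        nonlocal pos
        if peek() != tok:
            raise ValueError(f"expected {tok!r} at position {pos}")
        pos += 1

    def expr():                      # sums: term (+ term)*
        tree = term()
        while peek() == "+":
            eat("+")
            tree = ("+", tree, term())   # group to the left
        return tree

    def term():                      # products: factor (× factor)*
        tree = factor()
        while peek() == "×":
            eat("×")
            tree = ("×", tree, factor())
        return tree

    def factor():                    # parenthesized expression or a
        if peek() == "(":
            eat("(")
            tree = expr()
            eat(")")
            return tree
        eat("a")
        return "a"

    tree = expr()
    if pos != len(tokens):
        raise ValueError(f"unexpected {peek()!r} at position {pos}")
    return tree

print(parse(list("a×a+a")))   # ('+', ('×', 'a', 'a'), 'a')
```

The product ends up below the sum in the tree, so the layered rules force × to bind more tightly than +.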
Leftmost derivation
A derivation of a string in a grammar is a leftmost derivation if, at every step, the leftmost remaining variable is the one replaced.
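The definition can be checked mechanically: a step is legal only if the rule's left-hand side is the leftmost variable of the current sentential form. The sketch below replays a leftmost derivation of a × a + a in the arithmetic-expression grammar. The single-letter names E, T, F are hypothetical stand-ins for that grammar's expression, term, and factor variables, and `leftmost_step` and `derive` are illustrative helpers, not names from the course.

```python
VARIABLES = {"E", "T", "F"}   # hypothetical short names for the variables
RULES = {("E", "E+T"), ("E", "T"), ("T", "T×F"), ("T", "F"),
         ("F", "(E)"), ("F", "a")}

def leftmost_step(form, lhs, rhs):
    """Apply rule lhs → rhs to the leftmost variable of a sentential form."""
    if (lhs, rhs) not in RULES:
        raise ValueError(f"{lhs} → {rhs} is not a rule")
    for i, symbol in enumerate(form):
        if symbol in VARIABLES:
            if symbol != lhs:
                raise ValueError(f"leftmost variable is {symbol}, not {lhs}")
            return form[:i] + list(rhs) + form[i + 1:]
    raise ValueError("no variable left to replace")

# Rule choices for one leftmost derivation of a × a + a
STEPS = [("E", "E+T"), ("E", "T"), ("T", "T×F"), ("T", "F"),
         ("F", "a"), ("F", "a"), ("T", "F"), ("F", "a")]

def derive(steps, start="E"):
    """Return every sentential form produced by the given rule choices."""
    form = [start]
    forms = ["".join(form)]
    for lhs, rhs in steps:
        form = leftmost_step(form, lhs, rhs)
        forms.append("".join(form))
    return forms

print(" ⇒ ".join(derive(STEPS)))
# E ⇒ E+T ⇒ T+T ⇒ T×F+T ⇒ F×F+T ⇒ a×F+T ⇒ a×a+T ⇒ a×a+F ⇒ a×a+a
```

Trying to rewrite T in the form E+T raises an error, because E is still the leftmost variable.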
Needlessly complicated?
How about just
⟨EXPR⟩ →_G ⟨EXPR⟩ + ⟨EXPR⟩ | ⟨EXPR⟩ × ⟨EXPR⟩ | ( ⟨EXPR⟩ ) | a
A grammar G is ambiguous if some string w has two or more different leftmost derivations.
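Leftmost derivations correspond one-to-one with parse trees, so ambiguity can be exhibited by counting trees. The sketch below assumes the single-variable grammar has the four alternatives sum, product, parenthesized expression, and a; the function name `count_parse_trees` is hypothetical.

```python
from functools import lru_cache

def count_parse_trees(w):
    """Count parse trees of w in E → E+E | E×E | (E) | a."""
    @lru_cache(maxsize=None)
    def count(i, j):
        """Number of parse trees deriving the substring w[i:j]."""
        n = 0
        if j - i == 1 and w[i] == "a":                    # E → a
            n += 1
        if j - i >= 3 and w[i] == "(" and w[j - 1] == ")":
            n += count(i + 1, j - 1)                      # E → (E)
        for k in range(i + 1, j - 1):                     # E → E+E | E×E
            if w[k] in "+×":
                n += count(i, k) * count(k + 1, j)
        return n
    return count(0, len(w))

print(count_parse_trees("a×a+a"))   # 2
```

The two trees for a × a + a correspond to the groupings (a × a) + a and a × (a + a), so this grammar is ambiguous.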
Regular languages are context-free
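One standard way to justify this claim is to turn a DFA into a grammar: use each state as a variable, add a rule q → a r whenever the DFA moves from q to r on a, and add q → ε for every accept state q. The sketch below illustrates that construction under assumptions of its own: the function name `dfa_to_cfg` and the example DFA (strings over {a, b} with an even number of a's) are hypothetical, and state names are assumed to be distinct from input symbols.

```python
def dfa_to_cfg(states, alphabet, delta, start, accepting):
    """Build (V, Σ, R, S) from a DFA; states double as grammar variables."""
    rules = set()
    for q in states:
        for a in alphabet:
            rules.add((q, (a, delta[q, a])))   # q → a r  where δ(q, a) = r
        if q in accepting:
            rules.add((q, ()))                 # q → ε  for accept states
    return set(states), set(alphabet), rules, start

# Hypothetical DFA: accepts strings with an even number of a's
delta = {
    ("even", "a"): "odd",  ("even", "b"): "even",
    ("odd", "a"): "even",  ("odd", "b"): "odd",
}
V, sigma, R, S = dfa_to_cfg({"even", "odd"}, {"a", "b"}, delta,
                            "even", {"even"})
for lhs, rhs in sorted(R):
    print(lhs, "→", " ".join(rhs) or "ε")
```

Every derivation in the resulting grammar traces a run of the DFA, and it can finish only by erasing a variable that is an accept state.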
Chomsky normal form
A context-free grammar G is in Chomsky normal form if every rule is of the form
– A → BC
– A → a
where A, B, C ∈ V, B ≠ S, C ≠ S, and a ∈ Σ.
In addition, we permit the rule S → ε.
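The definition translates directly into a rule-by-rule check. This is a minimal sketch with a hypothetical function name `is_cnf`; rules are assumed to be pairs (A, u) with u a tuple of symbols and A ∈ V.

```python
def is_cnf(variables, terminals, rules, start):
    """Return True if every rule has one of the permitted shapes."""
    for lhs, rhs in rules:
        if len(rhs) == 2 and all(x in variables and x != start for x in rhs):
            continue                 # A → BC with B ≠ S and C ≠ S
        if len(rhs) == 1 and rhs[0] in terminals:
            continue                 # A → a
        if len(rhs) == 0 and lhs == start:
            continue                 # S → ε is permitted
        return False
    return True

# Hypothetical grammar for {ε, ab}, already in Chomsky normal form
cnf_rules = {("S", ("A", "B")), ("S", ()), ("A", ("a",)), ("B", ("b",))}
print(is_cnf({"S", "A", "B"}, {"a", "b"}, cnf_rules, "S"))      # True

# S → ASA | aA, A → b | ε violates the form in several ways
other_rules = {("S", ("A", "S", "A")), ("S", ("a", "A")),
               ("A", ("b",)), ("A", ())}
print(is_cnf({"S", "A"}, {"a", "b"}, other_rules, "S"))         # False
```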
Chomsky normal form
Theorem 2.9: Any context-free language is generated by a context-free grammar in Chomsky normal form.
Proof:
1. Make sure S appears only on the left
2. Remove empty rules: A → ε
3. Handle unit rules: A → B
4. Fix all the rest…
For instance:
– S →_G ASA | aA
– A →_G b | ε
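Step 2 of the proof can be sketched in code. The version below uses the common "nullable set" formulation: find every variable that can derive ε, then add a copy of each rule for every way of dropping nullable occurrences. It is an illustration under assumptions, not the course's implementation: the function names are hypothetical, step 1 is skipped for brevity, and the unit rule S → S that appears in the output is left for step 3.

```python
from itertools import product

def nullable_variables(rules):
    """Variables that derive ε, computed as a least fixed point."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            if lhs not in nullable and all(x in nullable for x in rhs):
                nullable.add(lhs)
                changed = True
    return nullable

def remove_epsilon_rules(rules, start):
    """Drop A → ε rules, compensating in every rule that mentions A."""
    nullable = nullable_variables(rules)
    new_rules = set()
    for lhs, rhs in rules:
        # Each nullable occurrence is either kept or dropped.
        options = [((x,), ()) if x in nullable else ((x,),) for x in rhs]
        for choice in product(*options):
            new_rhs = tuple(s for part in choice for s in part)
            if new_rhs:                          # never re-add an ε rule
                new_rules.add((lhs, new_rhs))
    if start in nullable:
        new_rules.add((start, ()))               # only S may keep S → ε
    return new_rules

rules = {("S", tuple("ASA")), ("S", tuple("aA")),
         ("A", tuple("b")), ("A", ())}
for lhs, rhs in sorted(remove_epsilon_rules(rules, "S")):
    print(lhs, "→", "".join(rhs))
```

For the example grammar this yields S → ASA | SA | AS | S | aA | a and A → b.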
Balanced Brackets
The grammar G = (V, Σ, R, S), where
– V = {S}
– Σ = {[, ]}
– R = { S →_G ε, S →_G SS, S →_G [S] }
generates all strings of balanced brackets.
Is the language L(G) regular?
– Why/Why not?
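To get a feel for L(G), the three rules can be read as closure conditions and iterated until nothing new appears. The sketch below enumerates every string of L(G) up to a length bound; the function name `balanced_up_to` is hypothetical. Restricting to short strings is safe because every substring generated inside a derivation is no longer than the final string.

```python
def balanced_up_to(n):
    """All strings of L(G) with length at most n."""
    lang = {""}                                           # S → ε
    while True:
        new = set(lang)
        new |= {x + y for x in lang for y in lang
                if len(x) + len(y) <= n}                  # S → SS
        new |= {"[" + x + "]" for x in lang
                if len(x) + 2 <= n}                       # S → [S]
        if new == lang:                                   # fixed point
            return lang
        lang = new

print(sorted(balanced_up_to(4), key=lambda s: (len(s), s)))
# ['', '[]', '[[]]', '[][]']
```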
Recognizing Context-Free Languages
Grammars are language generators. It is not immediately clear how they might be used as language recognizers. The language L(G) of balanced brackets is not regular. It cannot be recognized by a finite state automaton. However, it is very similar to the BEGIN…END blocks of many procedural languages and, therefore, must be recognizable by some compiler or interpreter!
Auxiliary storage
We could recognize the language L(G) of balanced brackets by reading left to right, if we could remember left brackets along the way.
Example: in [[][[]]], each right bracket must match some left bracket seen along the way.
Pushdown Automata
Each right bracket matches the most recent left bracket that is still unmatched: last in, first out. This suggests a stack storage mechanism.
[Figure: a finite control with a reading head over the input [[[[]][[[[]]]]]] and a stack (pushdown store) holding left brackets above the bottom marker $]
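The stack idea can be tried out directly before any formal machinery. This is a minimal sketch with a hypothetical function name `is_balanced`: push on every left bracket, pop on every right bracket, and accept only if the stack is empty at the end.

```python
def is_balanced(s):
    """Recognize balanced bracket strings with an explicit stack."""
    stack = []
    for ch in s:
        if ch == "[":
            stack.append(ch)          # remember the left bracket
        elif ch == "]":
            if not stack:             # nothing left to match
                return False
            stack.pop()               # matches the most recent left bracket
        else:
            raise ValueError(f"unexpected symbol {ch!r}")
    return not stack                  # every left bracket was matched

print(is_balanced("[[][[]]]"))   # True
print(is_balanced("[]]["))       # False
```

Because the stack only ever holds left brackets, a counter would do for this language. The stack becomes essential once there are several kinds of brackets.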
Describing a pushdown machine
Formally…
A pushdown automaton is a sextuple M = (Q, Σ, Γ, δ, q_0, F), where
– Q is a finite set of states
– Σ is a finite alphabet (the input symbols)
– Γ is a finite alphabet (the stack symbols)
– δ: Q × Σ_ε × Γ_ε → P(Q × Γ_ε) is the transition function, where Σ_ε = Σ ∪ {ε} and Γ_ε = Γ ∪ {ε}
– q_0 ∈ Q is the initial state, and
– F ⊆ Q is the set of accept states
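The sextuple can be simulated by exploring every reachable configuration (state, input position, stack). The sketch below is one possible way to do this, under several assumptions of its own: the empty string `EPS` stands for ε, input and stack symbols are single characters, `delta` is a dictionary from (state, input symbol or ε, popped symbol or ε) to a set of (state, pushed symbol or ε) pairs, and the states q0, q1, q2 of the example machine for balanced brackets are hypothetical. The `max_configs` guard is there because a PDA with ε-moves that keep pushing could otherwise generate configurations forever.

```python
from collections import deque

EPS = ""   # stands for ε

def accepts(delta, start, accept_states, w, max_configs=100_000):
    """Breadth-first search over PDA configurations."""
    first = (start, 0, ())
    seen = {first}
    queue = deque([first])
    while queue:
        state, pos, stack = queue.popleft()
        if pos == len(w) and state in accept_states:
            return True
        reads = [EPS] + ([w[pos]] if pos < len(w) else [])
        pops = [EPS] + ([stack[-1]] if stack else [])
        for a in reads:
            for x in pops:
                for new_state, push in delta.get((state, a, x), ()):
                    rest = stack[:-1] if x != EPS else stack
                    new_stack = rest + ((push,) if push != EPS else ())
                    config = (new_state, pos + len(a), new_stack)
                    if config not in seen:
                        if len(seen) >= max_configs:
                            raise RuntimeError("configuration limit reached")
                        seen.add(config)
                        queue.append(config)
    return False

# Hypothetical PDA for balanced brackets, with $ as the bottom marker
delta = {
    ("q0", EPS, EPS): {("q1", "$")},    # mark the bottom of the stack
    ("q1", "[", EPS): {("q1", "[")},    # push each left bracket
    ("q1", "]", "["): {("q1", EPS)},    # pop on a matching right bracket
    ("q1", EPS, "$"): {("q2", EPS)},    # stack empty again: may accept
}

print(accepts(delta, "q0", {"q2"}, "[[][[]]]"))   # True
print(accepts(delta, "q0", {"q2"}, "[]]"))        # False
```

Acceptance requires both that the whole input has been read and that the machine is in an accept state. The move to q2 can be taken early, but then no further input can be consumed.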