1
Context-Free Grammar
CSCI-GA.2590 – Lecture 3
Ralph Grishman, NYU
2
A Grammar Formalism
We have informally described the basic constructs of English grammar. Now we want to introduce a formalism for representing these constructs – a formalism that we can use as input to a parsing procedure.
(1/16/14, NYU)
3
Context-Free Grammar
A context-free grammar consists of
– a set of non-terminal symbols A, B, C, … ∈ N
– a set of terminal symbols a, b, c, … ∈ T
– a start symbol S ∈ N
– a set of productions P, each of the form A → α, where A ∈ N and α ∈ (N ∪ T)*
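As a (non-lecture) illustration, the four components can be written down directly in Python, using the symbols of the simple grammar on the next slide; `well_formed` is a hypothetical helper name:

```python
# A CFG as a 4-tuple (N, T, start, P) -- a sketch, not part of the lecture.
N = {"S", "NP", "VP"}                                  # non-terminals
T = {"cats", "mice", "the", "old", "sleep", "chase"}   # terminals
start = "S"                                            # start symbol, in N
P = [("S", ("NP", "VP")),                              # productions A -> alpha
     ("NP", ("cats",)), ("NP", ("the", "cats")),
     ("NP", ("the", "old", "cats")), ("NP", ("mice",)),
     ("VP", ("sleep",)), ("VP", ("chase", "NP"))]

def well_formed(N, T, start, P):
    """Check each production has the form A -> alpha with A in N, alpha in (N|T)*."""
    return start in N and all(
        lhs in N and all(sym in N | T for sym in rhs) for lhs, rhs in P)
```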
4
A Simple Context-Free Grammar
A simple CFG:
S → NP VP
NP → cats
NP → the cats
NP → the old cats
NP → mice
VP → sleep
VP → chase NP
5
Derivation and Language
If A → β is a production of the grammar, we can rewrite α A γ as α β γ.
A derivation is a sequence of such rewrite operations, e.g.
S ⇒ NP VP ⇒ cats VP ⇒ cats chase NP ⇒ cats chase mice
The language generated by a CFG is the set of strings (sequences of terminals) in T* which can be derived from the start symbol S.
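The rewrite process can be sketched as a small Python generator of the language (our code, not part of the lecture), using the slide-4 grammar; bounding the length of sentential forms keeps the search finite:

```python
from collections import deque

def generate(grammar, start="S", max_len=8):
    """Enumerate the (here, finite) language by leftmost rewriting.
    `grammar` maps a non-terminal to a list of right-hand sides (tuples)."""
    language, queue = set(), deque([(start,)])
    while queue:
        form = queue.popleft()
        nts = [i for i, sym in enumerate(form) if sym in grammar]
        if not nts:                          # all terminals: a derived string
            language.add(" ".join(form))
            continue
        i = nts[0]                           # rewrite the leftmost non-terminal
        for rhs in grammar[form[i]]:
            new = form[:i] + tuple(rhs) + form[i + 1:]
            if len(new) <= max_len:          # bound keeps the search finite
                queue.append(new)
    return language

grammar = {"S":  [("NP", "VP")],
           "NP": [("cats",), ("the", "cats"), ("the", "old", "cats"), ("mice",)],
           "VP": [("sleep",), ("chase", "NP")]}
```

The derivation above, S ⇒ NP VP ⇒ cats VP ⇒ cats chase NP ⇒ cats chase mice, is one path through this search.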
6
Preterminals
It is convenient to include a set of symbols called preterminals (corresponding to the parts of speech) which can be directly rewritten as terminals (words). This allows us to separate the productions into a set which generates sequences of preterminals (the “grammar”) and a set which rewrites the preterminals as terminals (the “dictionary”).
7
A Grammar with Preterminals
grammar:
S → NP VP
NP → N
NP → ART N
NP → ART ADJ N
VP → V
VP → V NP
dictionary:
N → cats
N → mice
ADJ → old
ART → the
V → sleep
V → chase
8
Grouping Alternates
To make the grammar more compact, we group productions with the same left-hand side:
S → NP VP
NP → N | ART N | ART ADJ N
VP → V | V NP
9
A grammar can be used to
– generate
– recognize
– parse
Why parse?
– parsing assigns the sentence a structure that may be helpful in determining its meaning
10
CFG vs Finite-State Languages
CFGs are more powerful than finite-state grammars (regular expressions):
– an FSG cannot generate center embeddings, e.g. S → ( S ) | x
– even if an FSG can capture the language, it may be unable to assign the nested structures we want
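As a sketch of why a finite-state device fails here (our example, not from the slides): the language S → ( S ) | x needs unbounded counting of nesting depth, which a recursive recognizer gets for free from its call stack:

```python
def recognize(s):
    """Recursive recognizer for the center-embedding grammar S -> ( S ) | x."""
    def match(i):
        # return the position just after one S starting at i, or -1 on failure
        if i < len(s) and s[i] == "x":
            return i + 1
        if i < len(s) and s[i] == "(":
            j = match(i + 1)              # the nested S
            if j != -1 and j < len(s) and s[j] == ")":
                return j + 1
        return -1
    return match(0) == len(s)
```

A regular expression with a fixed number of states cannot check that the `(`s and `)`s balance at arbitrary depth.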
11
A slightly bigger CFG
sentence → np vp
np → ngroup | ngroup pp
ngroup → n | art n | art adj n
vp → v | v np | v vp | v np pp      (v vp covers auxiliary verbs)
pp → p np      (pp = prepositional phrase)
12
Ambiguity
Most sentences will have more than one parse. Generally, different parses will reflect different meanings, e.g. “I saw the man with a telescope.” We can attach the pp (“with a telescope”) under either the np or the vp.
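With the slightly bigger grammar of slide 11 we can count the parses of the example sentence; the code below is our sketch (in particular, treating “I” as an n in the lexicon is our assumption):

```python
from functools import lru_cache

grammar = {"sentence": [("np", "vp")],
           "np": [("ngroup",), ("ngroup", "pp")],
           "ngroup": [("n",), ("art", "n"), ("art", "adj", "n")],
           "vp": [("v",), ("v", "np"), ("v", "vp"), ("v", "np", "pp")],
           "pp": [("p", "np")]}
lexicon = {"I": {"n"}, "saw": {"v"}, "the": {"art"}, "man": {"n"},
           "with": {"p"}, "a": {"art"}, "telescope": {"n"}}

def count_parses(words, start="sentence"):
    @lru_cache(maxsize=None)
    def seq(symbols, i, j):
        """Number of ways the symbol sequence derives words[i:j]."""
        if not symbols:
            return 1 if i == j else 0
        total = 0
        for k in range(i, j + 1):
            left = sym(symbols[0], i, k)
            if left:                       # short-circuit avoids useless recursion
                total += left * seq(symbols[1:], k, j)
        return total

    @lru_cache(maxsize=None)
    def sym(s, i, j):
        if s in grammar:                   # non-terminal: sum over its productions
            return sum(seq(tuple(rhs), i, j) for rhs in grammar[s])
        # preterminal: covers exactly one word of the right part of speech
        return 1 if j == i + 1 and s in lexicon.get(words[i], ()) else 0

    return sym(start, 0, len(words))
```

The two parses of the telescope sentence correspond to attaching the pp under the np (the vp is v np) and under the vp (the vp is v np pp).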
13
A CFG with just 2 non-terminals
S → NP V | NP V NP
NP → N | ART N | ART ADJ N
We will use this grammar for tracing our parsers.
14
Top-down parser
repeat:
– expand the leftmost non-terminal using the first production (save any alternative productions on the backtrack stack)
– if we have matched the entire sentence, quit (success)
– if we have generated a terminal which doesn't match the sentence, pop a choice point from the stack (if the stack is empty, quit (failure))
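The procedure above can be sketched as a recursive-descent parser in Python (our code, not the lecture's; the grammar is slide 13's, and the dictionary extends slide 7's with “cat” and “chases” so the traced sentence is covered). The call stack and the `any(...)` over alternatives play the role of the backtrack stack:

```python
# Grammar of slide 13; dictionary of slide 7 plus "cat"/"chases" (our additions).
grammar = {"S":  [("NP", "V"), ("NP", "V", "NP")],
           "NP": [("N",), ("ART", "N"), ("ART", "ADJ", "N")]}
lexicon = {"the": {"ART"}, "old": {"ADJ"}, "cat": {"N"}, "cats": {"N"},
           "mice": {"N"}, "sleep": {"V"}, "chase": {"V"}, "chases": {"V"}}

def match(symbols, words):
    """Can the symbol sequence be rewritten to exactly `words`?
    Trying productions in order and returning False on a mismatch is the
    backtracking of the slide, with the call stack as the backtrack stack."""
    if not symbols:
        return not words                  # success iff every word is consumed
    first, rest = symbols[0], symbols[1:]
    if first in grammar:                  # non-terminal: try each alternative
        return any(match(list(rhs) + rest, words) for rhs in grammar[first])
    # preterminal: must cover the next word
    return bool(words) and first in lexicon.get(words[0], set()) \
        and match(rest, words[1:])

def parse(sentence):
    return match(["S"], sentence.split())
```

This works because the grammar has no left-recursive productions; a rule like NP → NP PP would send such a parser into an infinite loop.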
15–23
Top-down parser: trace of “the cat chases mice”
(The original slides show the growing parse tree and backtrack table as diagrams; the steps they depict are:)
– Slide 15: start with node 0: S.
– Slide 16: expand S → NP V (nodes 1: NP, 2: V); save the alternative S → NP V NP on the backtrack table.
– Slide 17: expand NP → N (node 3: N); save NP → ART N and NP → ART ADJ N. N does not match “the”, so backtrack.
– Slide 18: try NP → ART N (nodes 3: ART, 4: N): ART matches “the” and N matches “cat”, but after V matches “chases”, “mice” is left unmatched, so backtrack.
– Slide 19: try NP → ART ADJ N (nodes 3: ART, 4: ADJ, 5: N): ADJ does not match “cat”, so backtrack.
– Slide 20: pop S → NP V NP (nodes 1: NP, 2: V, 3: NP).
– Slide 21: expand the first NP → N; N does not match “the”, so backtrack.
– Slide 22: try NP → ART N (nodes 4: ART, 5: N), matching “the cat”; V matches “chases”.
– Slide 23: expand the second NP → N (node 6: N), matching “mice” – parse!
24
Bottom-up parser
Builds a table where each row represents a parse tree node spanning the words from start up to end:
symbol  start  end  constituents
N       0      1    -
25
Bottom-up parser
We initialize the table with the parts of speech of each word …
symbol  start  end  constituents
ART     0      1    -
N       1      2    -
V       2      3    -
N       3      4    -
26
Bottom-up parser
We initialize the table with the parts of speech of each word … remembering that many English words have several parts of speech:
symbol  start  end  constituents
ART     0      1    -
N       1      2    -
V       2      3    -
N       2      3    -
N       3      4    -
27
Bottom-up parser
Then, if there is a production A → B C and we have entries for B and C with end of B = start of C, we add an entry for A with start = start of B and end = end of C. [See lecture notes for handling general productions.]
node #  symbol  start  end  constituents
0       ART     0      1    -
1       N       1      2    -
2       V       2      3    -
3       N       2      3    -
4       N       3      4    -
5       NP      0      2    [0, 1]
28
Bottom-up parser
node #  symbol  start  end  constituents
0       ART     0      1    -
1       N       1      2    -
2       V       2      3    -
3       N       2      3    -
4       N       3      4    -
5       NP      0      2    [0, 1]
6       NP      1      2    [1]
7       NP      2      3    [3]
8       NP      3      4    [4]
9       S       0      4    [5, 2, 8]
10      S       1      4    [6, 2, 8]
… several more S’s … parse!
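The table construction can be sketched as a fixpoint loop in Python (our code; unlike the binary rule on the slide, it matches a whole right-hand side left to right, and it records only (symbol, start, end) entries rather than constituent lists):

```python
def bottom_up(words, grammar, lexicon):
    """Return the set of (symbol, start, end) entries derivable for `words`."""
    # initialize with the parts of speech of each word
    table = {(pos, i, i + 1) for i, w in enumerate(words) for pos in lexicon[w]}

    def ends(rhs, start, snapshot):
        # positions where the symbol sequence `rhs` can end, starting at `start`
        if not rhs:
            yield start
            return
        for sym, s, e in snapshot:
            if sym == rhs[0] and s == start:
                yield from ends(rhs[1:], e, snapshot)

    changed = True
    while changed:                           # iterate until no new entries appear
        changed = False
        snapshot = list(table)
        for lhs, alternatives in grammar.items():
            for rhs in alternatives:
                for start in range(len(words)):
                    for end in set(ends(rhs, start, snapshot)):
                        if (lhs, start, end) not in table:
                            table.add((lhs, start, end))
                            changed = True
    return table

# The 2-non-terminal grammar of slide 13, with "chases" as both V and N:
grammar = {"S": [("NP", "V"), ("NP", "V", "NP")],
           "NP": [("N",), ("ART", "N"), ("ART", "ADJ", "N")]}
lexicon = {"the": {"ART"}, "cat": {"N"}, "chases": {"V", "N"}, "mice": {"N"}}
table = bottom_up("the cat chases mice".split(), grammar, lexicon)
```

The resulting table contains the NP and S rows shown above, including the spurious S spanning only words 0–3 (“the cat chases”), which is discarded because it does not cover the whole sentence.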