
1 General Information on Context-free and Probabilistic Context-free Grammars İbrahim Hoça CENG784, Fall 2013

2 Outline
- Basics of Context-free Grammars (CFGs)
- Tree Structure
- Convenience of Tree Structures
- Natural Language Examples
- Probabilistic Context-free Grammars (PCFGs)
- PCFG Rules
- Computing the Probabilities in PCFGs
- Some Aspects of PCFGs

3 Basics of Context-free Grammars
A CFG is a quadruple (V, Σ, P, S) where
- V is a finite set of variables,
- Σ (the alphabet) is a finite set of terminal symbols,
- P is a finite set of rules, and
- S is a distinguished element of V called the start symbol.

4 Basics of Context-free Grammars
- A rule is an element of the set V × (V ∪ Σ)*.
- The rule [A, w] is written as A → w.
- Lambda (null) rules are also possible: A → λ.
- Rules are written using the shorthand A → u | v to abbreviate A → u and A → v, the vertical bar being 'or'.

5 Sample Derivations
G = (V, Σ, P, S), V = {S, A}, Σ = {a, b}
P: S → AA
   A → AAA | bA | Ab | a
(a) S => AA => aA => aAAA => abAAA => abaAA => ababAA => ababaA => ababaa
(b) S => AA => AAAA => aAAA => abAAA => abaAA => ababAA => ababaA => ababaa
(c) S => AA => Aa => AAAa => AAbAa => AAbaa => AbAbaa => Ababaa => ababaa
(d) S => AA => aA => aAAA => aAAa => abAAa => abAbAa => ababAa => ababaa
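To make the mechanics concrete, here is a minimal sketch (Python is an assumption; the slides use no programming language) that encodes G as a rule table and replays derivation (a) one rewrite at a time, checking each step against the rules:

```python
# Minimal sketch: the grammar G from slide 5 as a rule table,
# plus a helper that replays derivation (a) step by step.
RULES = {
    "S": ["AA"],
    "A": ["AAA", "bA", "Ab", "a"],
}

def rewrite(form, index, replacement):
    """Rewrite the non-terminal at position `index` using a rule of G."""
    symbol = form[index]
    assert replacement in RULES.get(symbol, []), "not a rule of G"
    return form[:index] + replacement + form[index + 1:]

# Derivation (a): S => AA => aA => aAAA => abAAA => abaAA => ababAA => ababaA => ababaa
form = "S"
steps = [(0, "AA"), (0, "a"), (1, "AAA"), (1, "bA"),
         (2, "a"), (3, "bA"), (4, "a"), (5, "a")]
for index, replacement in steps:
    form = rewrite(form, index, replacement)
    print(form)
```

Running it prints the successive sentential forms AA, aA, aAAA, …, ababaa, exactly as in derivation (a).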

6 Tree Structure Trees corresponding to the derivations in the previous slide.

7 Implementing CFG on Natural Language
Let's consider the following sentence: 'nice dogs like cats'
Rules:
S → NP VP
NP → Adj N
NP → N
VP → V NP
N → dogs | cats
V → like
Adj → nice
Tree:
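The tree itself is drawn on the slide; as a quick way to reproduce it, here is a minimal sketch assuming the NLTK toolkit (not mentioned on the slide) that encodes these rules and parses the sentence:

```python
import nltk

# The grammar from slide 7, written in NLTK's rule syntax.
grammar = nltk.CFG.fromstring("""
    S  -> NP VP
    NP -> Adj N | N
    VP -> V NP
    N  -> 'dogs' | 'cats'
    V  -> 'like'
    Adj -> 'nice'
""")

parser = nltk.ChartParser(grammar)
for tree in parser.parse("nice dogs like cats".split()):
    print(tree)   # (S (NP (Adj nice) (N dogs)) (VP (V like) (NP (N cats))))
```

The chart parser enumerates every tree the grammar licenses; for this sentence there is exactly one.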

8 Convenience of Tree Structures
- Natural languages have a recursive structure.
- Tree structures, and hence CFGs, allow us to extend the context according to the properties of the relevant head nodes rather than limiting it to an arbitrary number of adjacent words.

9 Convenience of Tree Structures
Consider the verb agreement in the following construction: 'Velocity of the seismic waves rises to …'
bigram: P(rises | waves)
trigram: P(rises | seismic waves)
quadrigram: P(rises | the seismic waves)

10 Convenience of Tree Structures
- The verb 'rises' agrees with a singular noun, which is 'velocity' in this case.
- A CFG allows us to capture this relationship between non-adjacent words.

11 Probabilistic Context-free Grammars (PCFGs)
- A PCFG is simply a CFG with probabilities added to the rules, indicating how likely different rewritings are.

12 PCFG
A PCFG G consists of:
- A set of terminals: {w^k}, k = 1, …, V
- A set of non-terminals: {N^i}, i = 1, …, n
- A designated start symbol: N^1
- A set of rules: {N^i → ζ^j}, where ζ^j is a sequence of terminals and non-terminals
- A corresponding set of probabilities on the rules such that, for every i: Σ_j P(N^i → ζ^j) = 1

13 PCFG Rules
S → NP VP [1.0]
NP → NP PP [0.4]
PP → P NP [1.0]
VP → V NP [0.7]
VP → VP PP [0.3]
NP → astronomers [0.1]
NP → ears [0.18]
NP → saw [0.04]
NP → stars [0.18]
NP → telescopes [0.1]
V → saw [1.0]
P → with [1.0]
Note that the NP rules are chosen so that the grammar complies with Chomsky Normal Form, which only allows rules of the forms
A → B C
A → w
A → λ (the latter only for the start symbol, when the language contains the empty string)
where A, B, and C are non-terminals, and w is a terminal symbol.
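As a small check of the constraint from slide 12, the following sketch (hypothetical code, not from the slides) sums the probabilities of all rules sharing a left-hand side; each sum should come out to 1:

```python
from collections import defaultdict

# The rules above as (left-hand side, right-hand side, probability) triples.
RULES = [
    ("S", "NP VP", 1.0), ("PP", "P NP", 1.0),
    ("VP", "V NP", 0.7), ("VP", "VP PP", 0.3),
    ("NP", "NP PP", 0.4), ("NP", "astronomers", 0.1),
    ("NP", "ears", 0.18), ("NP", "saw", 0.04),
    ("NP", "stars", 0.18), ("NP", "telescopes", 0.1),
    ("V", "saw", 1.0), ("P", "with", 1.0),
]

# Sum the probabilities of all expansions of each non-terminal.
totals = defaultdict(float)
for lhs, rhs, prob in RULES:
    totals[lhs] += prob

for lhs, total in sorted(totals.items()):
    print(f"{lhs}: {total:.2f}")   # every non-terminal should print 1.00
```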

14 Computing the Probabilities in PCFG
P(w_1m) = Σ_t P(w_1m, t) = Σ_{t: yield(t) = w_1m} P(t)
where t ranges over parse trees and w_1m is the sentence from w_1 to w_m.
- This formula gives us the total probability of a sentence.
- The probability of each tree is found by multiplying the probabilities of the rules that created the tree.

15 Computing the Probabilities in PCFG
P(t_1) = 1.0 × 0.1 × 0.7 × 1.0 × 0.4 × 0.18 × 1.0 × 1.0 × 0.18 = 0.0009072

16 Computing the Probabilities in PCFG
P(t_2) = 1.0 × 0.1 × 0.3 × 0.7 × 1.0 × 0.18 × 1.0 × 1.0 × 0.18 = 0.0006804

17 Computing the Probabilities in PCFG
P(w_15) = P(t_1) + P(t_2) = 0.0009072 + 0.0006804 = 0.0015876
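The trees t_1 and t_2 themselves appear only as figures on slides 15 and 16; judging from the factors above, they are the two PP-attachment readings of the standard 'astronomers saw stars with ears' example (an assumption, since the transcript does not state the sentence). The following sketch multiplies the rule probabilities for each assumed tree and reproduces the numbers on slides 15–17:

```python
from math import prod

# Rule probabilities from slide 13.
P = {
    ("S", "NP VP"): 1.0, ("PP", "P NP"): 1.0,
    ("VP", "V NP"): 0.7, ("VP", "VP PP"): 0.3,
    ("NP", "NP PP"): 0.4, ("NP", "astronomers"): 0.1,
    ("NP", "ears"): 0.18, ("NP", "saw"): 0.04,
    ("NP", "stars"): 0.18, ("NP", "telescopes"): 0.1,
    ("V", "saw"): 1.0, ("P", "with"): 1.0,
}

# Rules used by each tree (assumed): t1 attaches 'with ears' to the object NP,
# t2 attaches it to the VP.
t1 = [("S", "NP VP"), ("NP", "astronomers"), ("VP", "V NP"), ("V", "saw"),
      ("NP", "NP PP"), ("NP", "stars"), ("PP", "P NP"), ("P", "with"), ("NP", "ears")]
t2 = [("S", "NP VP"), ("NP", "astronomers"), ("VP", "VP PP"), ("VP", "V NP"),
      ("V", "saw"), ("NP", "stars"), ("PP", "P NP"), ("P", "with"), ("NP", "ears")]

p1 = prod(P[rule] for rule in t1)   # ≈ 0.0009072 (slide 15)
p2 = prod(P[rule] for rule in t2)   # ≈ 0.0006804 (slide 16)
print(p1, p2, p1 + p2)              # the sum ≈ 0.0015876 (slide 17)
```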

18 Some Aspects of PCFG
+ As grammars expand to cover a large and diverse corpus of text, they become increasingly ambiguous. A PCFG gives some idea of the plausibility of different parses.
- Nevertheless, a PCFG does not offer a very good idea of plausibility by itself, since its probability estimates are based purely on structural factors and do not include lexical co-occurrence.

19 Some Aspects of PCFG
+ Real text tends to contain grammatical mistakes, disfluencies, and errors. A PCFG can deal with this to some extent: rather than ruling such sentences out entirely, it simply assigns implausible sentences a low probability.
- In a PCFG, the probability of a smaller tree is greater than that of a larger tree. For instance, the most frequent length for Wall Street Journal sentences is around 23 words, yet a PCFG gives too much of the probability mass to very short sentences.

