10. Lexicalized and Probabilistic Parsing (Speech and Language Processing)
Presenter: 정영임
Date: October 6, 2007
Table of Contents
12.1 Probabilistic Context-Free Grammars
12.2 Problems with PCFGs
12.3 Probabilistic Lexicalized CFGs
Introduction
Goal: to build probabilistic models of sophisticated syntactic information, and to use this probabilistic information in an efficient probabilistic parser.
Uses of a probabilistic parser:
- Disambiguation: the Earley algorithm can represent the ambiguities of a sentence, but it cannot resolve them; a probabilistic grammar can choose the most probable interpretation.
- Language modeling: for speech recognition, N-gram models have been used to predict upcoming words and help constrain the search for words; a probabilistic version of a more sophisticated grammar can provide additional predictive power to speech recognition.
12.1 Probabilistic Context-Free Grammars
A probabilistic context-free grammar (PCFG) is also known as a stochastic context-free grammar (SCFG).
A PCFG is defined by a 5-tuple G = (N, Σ, P, S, D):
1. A set of non-terminal symbols (or "variables") N
2. A set of terminal symbols Σ (disjoint from N)
3. A set of productions P, each of the form A → β, where A is a non-terminal and β is a string of symbols from the infinite set of strings (Σ ∪ N)*
4. A designated start symbol S
5. A function D assigning a probability to each rule in P, written P(A → β) or P(A → β | A)
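As a rough illustration (the representation, grammar, and probabilities below are my own, not from the text), a PCFG can be stored as a mapping from each non-terminal to its possible expansions with their probabilities:

```python
# Toy PCFG: non-terminal -> list of (right-hand side, probability).
# The grammar and probabilities are illustrative only.
pcfg = {
    "S":   [(("NP", "VP"), 1.0)],
    "NP":  [(("Det", "N"), 0.7), (("she",), 0.3)],
    "VP":  [(("V", "NP"), 0.6), (("V",), 0.4)],
    "Det": [(("the",), 1.0)],
    "N":   [(("fish",), 1.0)],
    "V":   [(("eats",), 1.0)],
}

# A well-formed D makes each non-terminal's expansion probabilities sum to 1.
for lhs, expansions in pcfg.items():
    total = sum(p for _, p in expansions)
    assert abs(total - 1.0) < 1e-9, lhs
```

The check at the end reflects the requirement that the probabilities of all expansions of a given non-terminal form a proper distribution.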
12.1 Probabilistic Context-Free Grammars
Sample PCFG for a miniature grammar (grammar table not reproduced).
12.1 Probabilistic Context-Free Grammars
Probability of a particular parse T: the product of the probabilities of all the rules r used to expand each node n in the parse tree:
P(T, S) = ∏_{n ∈ T} p(r(n))
By the definition of conditional probability, P(T, S) = P(T) P(S | T). Since a parse tree includes all the words of the sentence, P(S | T) = 1, and therefore P(T, S) = P(T).
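The product over rule probabilities can be sketched as follows (the tree encoding and probability values here are invented for illustration, not taken from the text's miniature grammar):

```python
# Rule probabilities keyed by (lhs, rhs); the values are illustrative.
rule_prob = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("she",)):    0.3,
    ("VP", ("V", "NP")): 0.6,
    ("V", ("eats",)):    1.0,
    ("NP", ("fish",)):   0.2,
}

def tree_prob(node):
    """P(T) = product over every node n in T of p(r(n))."""
    label, children = node[0], node[1:]
    # The rule at this node: label -> child labels (or the terminal itself).
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = rule_prob[(label, rhs)]
    for c in children:
        if not isinstance(c, str):
            p *= tree_prob(c)
    return p

tree = ("S", ("NP", "she"), ("VP", ("V", "eats"), ("NP", "fish")))
print(tree_prob(tree))  # 1.0 * 0.3 * 0.6 * 1.0 * 0.2 ≈ 0.036
```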
12.1 Probabilistic Context-Free Grammars
Example comparing candidate parses of an ambiguous sentence: the parse with the higher probability is preferred (parse trees not reproduced).
12.1 Probabilistic Context-Free Grammars
Formalization of selecting the parse with the highest probability: the best tree T̂ for a sentence S, out of the set of parse trees for S (which we'll call τ(S)), is
T̂(S) = argmax_{T ∈ τ(S)} P(T | S) = argmax_{T ∈ τ(S)} P(T, S) / P(S)
Since P(S) is constant for every candidate tree, we can eliminate it:
T̂(S) = argmax_{T ∈ τ(S)} P(T, S) = argmax_{T ∈ τ(S)} P(T)   (since P(T, S) = P(T))
12.1 Probabilistic Context-Free Grammars
Probability of an ambiguous sentence: the sum of the probabilities of all the parse trees for the sentence:
P(S) = Σ_{T ∈ τ(S)} P(T, S) = Σ_{T ∈ τ(S)} P(T)
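A small illustration of both formulas (the parse names and probabilities are made up): once P(T) has been computed for each candidate parse, the best parse is the argmax over trees and the sentence probability is the sum:

```python
# Hypothetical P(T) values for three candidate parses of one
# ambiguous sentence; the names and numbers are illustrative.
parses = {"attach-to-verb": 0.0018, "attach-to-noun": 0.0006, "flat": 0.0001}

best = max(parses, key=parses.get)   # T-hat(S) = argmax_T P(T)
p_sentence = sum(parses.values())    # P(S) = sum over T in tau(S) of P(T)

print(best)        # attach-to-verb
print(p_sentence)  # ≈ 0.0025
```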
Other issues with PCFGs
Prefix probabilities:
- Jelinek and Lafferty (1991) give an algorithm for efficiently computing the probability of a prefix of a sentence.
- Stolcke (1995) describes how the standard Earley parser can be augmented to compute these prefix probabilities.
- Jurafsky et al. (1995) describe an application of a version of this algorithm as the language model for a speech recognizer.
Consistency:
- A PCFG is said to be consistent if the sum of the probabilities of all sentences in the language equals 1.
- Certain kinds of recursive rules cause a grammar to be inconsistent by causing infinitely looping derivations for some sentences.
- Booth and Thompson (1973) give more details on consistent and inconsistent grammars.
Probabilistic CYK Parsing of PCFGs
The parsing problem for a PCFG can be stated as: how do we compute the most likely parse for a given sentence?
Algorithms for computing the most likely parse:
- Augmented Earley algorithm (Stolcke, 1995): the probabilistic Earley algorithm is somewhat complex to present.
- Probabilistic CYK (Cocke-Younger-Kasami) algorithm: the CYK algorithm is worth understanding.
Probabilistic CYK Parsing of PCFGs
The CYK algorithm is essentially a bottom-up parser using a dynamic programming table. Bottom-up parsing makes it more efficient when processing lexicalized grammars. Probabilistic CYK parsing was first described by Ney (1991); the version of the CYK parsing algorithm presented here follows Collins (1999) and Aho and Ullman (1972).
Probabilistic CYK Parsing of PCFGs
Input, output, and data structures of probabilistic CYK (figure not reproduced).
Probabilistic CYK Parsing of PCFGs
Pseudocode for the probabilistic CYK algorithm (figure not reproduced).
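Since the pseudocode itself did not survive extraction, here is a sketch of the algorithm in Python (the rule format, grammar, and example sentence are my own; the grammar is assumed to be in Chomsky normal form):

```python
from collections import defaultdict

def prob_cyk(words, lexical, binary, start="S"):
    """Probabilistic CYK for a PCFG in Chomsky normal form.

    lexical: dict mapping word -> list of (non-terminal, prob)
    binary:  list of (parent, left_child, right_child, prob) rules
    Returns (probability, tree) of the most likely parse, or (0.0, None).
    """
    n = len(words)
    # table[i][j] maps non-terminal -> (best prob, backpointer) for words[i:j].
    table = [[defaultdict(lambda: (0.0, None)) for _ in range(n + 1)]
             for _ in range(n + 1)]

    # Base case: lexical rules fill the length-1 spans.
    for i, w in enumerate(words):
        for nt, p in lexical.get(w, []):
            table[i][i + 1][nt] = (p, w)

    # Combine adjacent spans bottom-up, keeping the max-probability split.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):
                for parent, left, right, p in binary:
                    pl, _ = table[i][k][left]
                    pr, _ = table[k][j][right]
                    prob = p * pl * pr
                    if prob > table[i][j][parent][0]:
                        table[i][j][parent] = (prob, (k, left, right))

    def build(i, j, nt):
        """Follow backpointers to reconstruct the best tree."""
        _, back = table[i][j][nt]
        if isinstance(back, str):          # lexical entry
            return (nt, back)
        k, left, right = back
        return (nt, build(i, k, left), build(k, j, right))

    p, back = table[0][n][start]
    return (p, build(0, n, start)) if back is not None else (0.0, None)

# Illustrative toy grammar and sentence.
lexical = {"she": [("NP", 0.4)], "eats": [("V", 1.0)], "fish": [("NP", 0.6)]}
binary = [("S", "NP", "VP", 1.0), ("VP", "V", "NP", 1.0)]

p, tree = prob_cyk(["she", "eats", "fish"], lexical, binary)
print(p)     # ≈ 0.24
print(tree)  # ('S', ('NP', 'she'), ('VP', ('V', 'eats'), ('NP', 'fish')))
```

The table indexed by span endpoints and non-terminal, with backpointers for tree reconstruction, mirrors the data structures described on the previous slide.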
Learning PCFG Probabilities
Where do PCFG probabilities come from?
Obtaining PCFG probabilities from a treebank:
- A treebank is a corpus of already-parsed sentences, e.g. the Penn Treebank (Marcus et al., 1993), which includes the Brown Corpus, the Wall Street Journal, and parts of the Switchboard corpus.
- The probability of each expansion of a non-terminal is estimated by counting the number of times that expansion occurs and then normalizing:
P(A → β | A) = Count(A → β) / Σ_γ Count(A → γ) = Count(A → β) / Count(A)
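The counting-and-normalizing step can be sketched as follows (the tiny two-tree treebank and the nested-tuple tree format are invented for illustration):

```python
from collections import Counter

def pcfg_from_treebank(trees):
    """Estimate PCFG rule probabilities from parsed trees.

    Each tree is a nested tuple (label, child, ...), where a child is
    either another tuple or a terminal string.
    Returns {(lhs, rhs): P(lhs -> rhs | lhs)} via count-and-normalize.
    """
    rule_counts = Counter()
    lhs_counts = Counter()

    def collect(node):
        label, children = node[0], node[1:]
        rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
        rule_counts[(label, rhs)] += 1   # Count(A -> beta)
        lhs_counts[label] += 1           # Count(A)
        for c in children:
            if not isinstance(c, str):
                collect(c)

    for t in trees:
        collect(t)
    return {rule: n / lhs_counts[rule[0]] for rule, n in rule_counts.items()}

# Illustrative mini-treebank of two parsed sentences.
t1 = ("S", ("NP", "she"), ("VP", ("V", "eats")))
t2 = ("S", ("NP", "fish"), ("VP", ("V", "swim")))
probs = pcfg_from_treebank([t1, t2])

print(probs[("S", ("NP", "VP"))])  # 1.0  (both trees use this rule)
print(probs[("NP", ("she",))])     # 0.5  (1 of 2 NP expansions)
```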
Learning PCFG Probabilities
Learning PCFG probabilities by parsing a raw corpus:
- Unambiguous sentences: parse the corpus, increment a counter for every rule used in the parse, then normalize to get probabilities.
- Ambiguous sentences: we need to keep a separate count for each parse of a sentence and weight each partial count by the probability of the parse it appears in. The standard algorithm for computing this is the Inside-Outside algorithm, proposed by Baker (1979); see Manning and Schütze (1999) for a complete description.
12.2 Problems with PCFGs