1
CS626-449: Speech, NLP and the Web / Topics in AI
Pushpak Bhattacharyya, CSE Dept., IIT Bombay
Lecture 15: Probabilistic parsing; PCFG (contd.)
2
Example of Sentence labeling: Parsing
[S1 [S [S [VP [VB Come] [NP [NNP July]]]] [, ,] [CC and] [S [NP [DT the] [JJ UJF] [NN campus]] [VP [AUX is] [ADJP [JJ abuzz] [PP [IN with] [NP [ADJP [JJ new] [CC and] [VBG returning]] [NNS students]]]]]] [. .]]
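To make the bracketed labeling concrete, here is a minimal sketch (not from the lecture; all names are illustrative) of a recursive-descent reader that turns such a bracketing into a nested (label, children) structure:

```python
# Minimal reader for "[ LABEL child1 child2 ... ]" bracketings (a sketch).

def parse_bracketed(tokens, pos=0):
    """Parse the subtree starting at tokens[pos].
    Returns (node, next_pos); a node is (label, [children]) or a bare word."""
    assert tokens[pos] == '['
    label = tokens[pos + 1]
    pos += 2
    children = []
    while tokens[pos] != ']':
        if tokens[pos] == '[':
            child, pos = parse_bracketed(tokens, pos)
        else:
            child, pos = tokens[pos], pos + 1
        children.append(child)
    return (label, children), pos + 1  # skip the closing ']'

tree_str = "[S [NP [DT the] [NN campus]] [VP [AUX is] [ADJP [JJ abuzz]]]]"
tokens = tree_str.replace('[', ' [ ').replace(']', ' ] ').split()
tree, _ = parse_bracketed(tokens)
print(tree)
# ('S', [('NP', [('DT', ['the']), ('NN', ['campus'])]), ('VP', ...)])
```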
3
Corpus
- A collection of text, called a corpus, is used for collecting various language data.
- With annotation: more information, but manual-labor intensive.
- Practice: label automatically; correct manually.
- The famous Brown Corpus contains 1 million tagged words.
- Switchboard: a very famous corpus; 2,400 conversations, 543 speakers, many US dialects, annotated with orthography and phonetics.
4
Discriminative vs. Generative Model
W* = argmax_W P(W | SS)
Discriminative model: compute P(W | SS) directly.
Generative model: compute it from P(W) · P(SS | W).
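The link between the two computations is Bayes' rule. A short worked derivation (assuming, as in the speech-recognition setting, that W is a word sequence and SS the speech signal, so P(SS) is constant over candidate W):

```latex
W^* = \arg\max_W P(W \mid SS)
    = \arg\max_W \frac{P(W)\, P(SS \mid W)}{P(SS)}  % Bayes' rule
    = \arg\max_W P(W)\, P(SS \mid W)                % P(SS) does not depend on W
```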
5
Notion of Language Models
6
Language Models
- N-grams: sequences of n consecutive words/characters.
- Probabilistic / Stochastic Context-Free Grammars: simple probabilistic models capable of handling recursion; a CFG with probabilities attached to rules.
- Rule probabilities: how likely is it that a particular rewrite rule is used?
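As a concrete illustration of the n-gram idea (a sketch, not lecture code; the toy corpus is made up), a bigram model estimated by maximum likelihood from counts:

```python
# Bigram language model sketch: MLE estimates from raw counts.
from collections import Counter

corpus = ["come july and the campus is abuzz".split()]  # toy training data

unigrams, bigrams = Counter(), Counter()
for sentence in corpus:
    tokens = ["<s>"] + sentence + ["</s>"]  # sentence-boundary markers
    unigrams.update(tokens)
    bigrams.update(zip(tokens, tokens[1:]))

def p_bigram(w_prev, w):
    """P(w | w_prev) = count(w_prev, w) / count(w_prev)."""
    return bigrams[(w_prev, w)] / unigrams[w_prev]

print(p_bigram("the", "campus"))  # 1.0 in this one-sentence corpus
```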
7
PCFGs
Why PCFGs?
- Intuitive probabilistic models for tree-structured languages.
- Algorithms are extensions of HMM algorithms.
- Better than the n-gram model for language modeling.
8
Formal Definition of PCFG
A PCFG consists of:
- A set of terminals {w^k}, k = 1, …, V, e.g., {w^k} = {child, teddy, bear, played, …}
- A set of non-terminals {N^i}, i = 1, …, n, e.g., {N^i} = {NP, VP, DT, …}
- A designated start symbol N^1
- A set of rules {N^i → ζ^j}, where ζ^j is a sequence of terminals and non-terminals, e.g., NP → DT NN
- A corresponding set of rule probabilities P(N^i → ζ^j)
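A minimal sketch of this definition in Python (a hypothetical encoding, using the example terminals and rules above):

```python
# A PCFG as plain data: start symbol plus probabilistic rewrite rules.
pcfg = {
    "start": "S",
    "rules": {
        # LHS non-terminal -> list of (RHS sequence, probability)
        "S":  [(("NP", "VP"), 1.0)],
        "NP": [(("DT", "NN"), 0.5), (("NN",), 0.5)],
        "VP": [(("VB", "NP"), 1.0)],
        "DT": [(("the",), 1.0)],
        "NN": [(("child",), 0.5), (("teddy",), 0.5)],
        "VB": [(("played",), 1.0)],
    },
}

# Sanity check: the probabilities of all rules with the same LHS sum to 1.
for lhs, expansions in pcfg["rules"].items():
    total = sum(p for _, p in expansions)
    assert abs(total - 1.0) < 1e-9, f"{lhs} probabilities sum to {total}"
```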
9
Rule Probabilities
Rule probabilities are such that, for each non-terminal, the probabilities of all its rewrite rules sum to 1: Σ_j P(N^i → ζ^j) = 1.
E.g.,
P(NP → DT NN) = 0.2
P(NP → NN) = 0.5
P(NP → NP PP) = 0.3
P(NP → DT NN) = 0.2 means that 20% of the rewrites of NP in the training-data parses use the rule NP → DT NN.
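A sketch of where such numbers come from in practice (assumed here: maximum-likelihood counts over treebank parses, with trees in the (label, children) form of the earlier sketch):

```python
# MLE rule probabilities from a toy treebank:
# P(N -> zeta) = count(N -> zeta) / count(N).
from collections import Counter

def count_rules(tree, rule_counts, lhs_counts):
    label, children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    rule_counts[(label, rhs)] += 1
    lhs_counts[label] += 1
    for child in children:
        if not isinstance(child, str):
            count_rules(child, rule_counts, lhs_counts)

rule_counts, lhs_counts = Counter(), Counter()
treebank = [("NP", [("DT", ["the"]), ("NN", ["teddy"])]),
            ("NP", [("NN", ["bear"])])]
for t in treebank:
    count_rules(t, rule_counts, lhs_counts)

for (lhs, rhs), c in rule_counts.items():
    print(f"P({lhs} -> {' '.join(rhs)}) = {c / lhs_counts[lhs]:.2f}")
# P(NP -> DT NN) = 0.50, P(NP -> NN) = 0.50, P(DT -> the) = 1.00, ...
```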
10
Probability of a sentence
Notation:
- w_ab: the subsequence w_a … w_b
- N^j dominates w_a … w_b, or yield(N^j) = w_a … w_b
(Figure: a node N^j spanning the words w_a … w_b; e.g., an NP dominating "the sweet teddy bear".)
Probability of a sentence: P(w_1m) = Σ_t P(w_1m, t) = Σ_t P(t) · P(w_1m | t), where t ranges over parse trees of the sentence. If t is a parse tree for the sentence w_1m, then P(w_1m | t) = 1, so P(w_1m) = Σ_t P(t).
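A sketch tying this together (illustrative; tree format (label, [children]) as in the earlier sketches): P(t) is the product of the probabilities of the rules used in t, and the sentence probability sums P(t) over its parses:

```python
# P(t) = product of rule probabilities in t; P(w_1m) = sum over parses t.
rules = {  # P(lhs -> rhs), toy values
    ("NP", ("DT", "NN")): 0.5,
    ("DT", ("the",)): 1.0,
    ("NN", ("teddy",)): 0.5,
}

def tree_prob(tree):
    label, children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = rules.get((label, rhs), 0.0)
    for child in children:
        if not isinstance(child, str):
            p *= tree_prob(child)
    return p

parses = [("NP", [("DT", ["the"]), ("NN", ["teddy"])])]  # all parses of "the teddy"
print(sum(tree_prob(t) for t in parses))  # 0.5 * 1.0 * 0.5 = 0.25
```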