Published by Paul Hodde. Modified over 9 years ago.
1
Incrementally Learning the Parameters of Stochastic CFGs Using Summary Statistics
Written by: Brent Heeringa, Tim Oates
2
Goal: to learn the syntax of utterances
Approach: SCFG (Stochastic Context-Free Grammar), M = <V, Σ, R, S>
- V: finite set of non-terminals
- Σ: finite set of terminals
- R: finite set of rules; each rule r has a probability p(r), and the p(r) of all rules with the same left-hand side sum to 1
- S: start symbol
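The normalization constraint on R can be checked mechanically. The following is a minimal sketch, assuming rules are stored as a dictionary from (left-hand side, right-hand side) to probability; the toy grammar and the name `is_normalized` are illustrative and not from the slides.

```python
from collections import defaultdict
import math

# Hypothetical toy grammar: (lhs, rhs) -> p(r). Not taken from the slides.
RULES = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("dogs",)): 0.6,
    ("NP", ("cats",)): 0.4,
    ("VP", ("bark",)): 0.7,
    ("VP", ("sleep",)): 0.3,
}

def is_normalized(rules, tol=1e-9):
    """Return True if p(r) sums to 1 for every left-hand side."""
    totals = defaultdict(float)
    for (lhs, _rhs), p in rules.items():
        totals[lhs] += p
    return all(math.isclose(t, 1.0, abs_tol=tol) for t in totals.values())

print(is_normalized(RULES))
```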
3
Problems with most SCFG learning algorithms
1) Expensive storage: a corpus of complete sentences must be stored
2) Time-consuming: the algorithms need to make repeated passes over all the data
4
Learning SCFG
- Inducing the context-free structure from a corpus of sentences
- Learning the production (rule) probabilities
5
Learning SCFG (cont.)
General method: Inside/Outside algorithm
- Expectation-Maximization (EM)
  - Find the expectation of the rules
  - Maximize the likelihood given both the expectation and the corpus
Disadvantages of the Inside/Outside algorithm
- The entire sentence corpus must be stored using some representation (e.g. a chart parse)
- Expensive storage (unrealistic for a human agent!)
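To make the chart-based computation concrete, here is a rough sketch of the inside pass only (not the full EM loop), assuming a grammar in Chomsky normal form. The function name, the `binary`/`lexical` dictionaries, and the CNF restriction are assumptions made for illustration.

```python
from collections import defaultdict

def inside_probability(words, binary, lexical, start="S"):
    """Inside (sentence) probability for a grammar in Chomsky normal form.

    binary:  {(A, B, C): p} for rules A -> B C
    lexical: {(A, w): p}    for rules A -> w
    """
    n = len(words)
    # beta[(i, j)][A] = probability that A derives words[i:j]
    beta = defaultdict(lambda: defaultdict(float))
    for i, w in enumerate(words):
        for (a, word), p in lexical.items():
            if word == w:
                beta[(i, i + 1)][a] += p
    # Fill wider spans from narrower ones: three nested loops over
    # positions (span, start, split point), hence cubic in n.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (a, b, c), p in binary.items():
                    beta[(i, j)][a] += p * beta[(i, k)][b] * beta[(k, j)][c]
    return beta[(0, n)][start]
```

The chart `beta` is rebuilt for each sentence on each EM iteration, which is why a batch learner of this kind has to keep the whole corpus available.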
6
Proposed Algorithm
Use Unique Normal Form (UNF)
- Replace each terminal rule A -> z with two new rules:
    A -> D, with p[A -> D] = p[A -> z]
    D -> z, with p[D -> z] = 1
- No two productions have the same right-hand side
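The terminal-replacement step can be sketched as follows, assuming rules are stored as a dictionary from (lhs, rhs tuple) to probability. The function name and the `D_z` naming scheme for fresh non-terminals are hypothetical. This covers only the terminal rewriting; it does not by itself guarantee that the remaining right-hand sides are unique.

```python
def to_unf_terminals(rules, terminals):
    """Replace each A -> z (z a terminal) with A -> D and D -> z.

    rules:     {(lhs, rhs_tuple): probability}
    terminals: set of terminal symbols
    """
    new_rules = {}
    preterminal = {}  # terminal z -> fresh non-terminal D (hypothetical naming)
    for (lhs, rhs), p in rules.items():
        if len(rhs) == 1 and rhs[0] in terminals:
            z = rhs[0]
            # Reuse one D per terminal so the rule D -> z appears only once.
            d = preterminal.setdefault(z, f"D_{z}")
            new_rules[(lhs, (d,))] = p    # p[A -> D] = p[A -> z]
            new_rules[(d, (z,))] = 1.0    # p[D -> z] = 1
        else:
            new_rules[(lhs, rhs)] = p     # non-terminal rules are unchanged
    return new_rules
```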
7
Learning SCFG: Proposed Algorithm (cont.)
Use histograms
- Each rule r has 2 histograms ($H^O_r$, $H^L_r$)
8
Proposed Algorithm (cont.)
- $H^O_r$: constructed when parsing the sentences in O
- $H^L_r$: continues to be updated throughout the learning process
- $H^L_r$ is rescaled to a fixed size h
  - Why? So that recently used rules have more impact on the histogram
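One way to picture the rescaling is the sketch below. It assumes, purely for illustration, that a rule's histogram maps "number of times the rule was used in a parse" to an accumulated weight; the slides do not define the bins, and `update_histogram` is a hypothetical name.

```python
def update_histogram(hist, uses, h):
    """Add one parsed sentence to a rule's histogram, then rescale to mass h.

    hist: {k: weight}, weight of sentences whose parse used the rule k times
    uses: number of times the rule was used in the newest parse
    h:    fixed total mass the histogram is capped at
    """
    hist = dict(hist)  # work on a copy; the caller's histogram is untouched
    hist[uses] = hist.get(uses, 0.0) + 1.0
    total = sum(hist.values())
    if total > h:
        # Shrink every bin by the same factor so the total mass is h again.
        scale = h / total
        hist = {k: v * scale for k, v in hist.items()}
    return hist
```

Under this assumption each new sentence enters with weight 1 while all earlier mass is shrunk on every later update, so older parses fade and recent ones dominate, and the storage per rule stays bounded.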
9
Comparing $H^L_r$ and $H^O_r$
Relative entropy T
- T decreases: increase the probability of the rules used
  - (if s is large, increase the probability of the rules used when parsing the last sentence)
- T increases: decrease the probability of the rules used (e.g. $p_{t+1}(r) = 0.01 \cdot p_t(r)$)
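The relative entropy between two histograms can be computed roughly as below. This is a sketch under assumptions: histograms are dictionaries from bin to weight, the divergence is taken as D(observed || learner), and `eps` is a hypothetical smoothing constant; the slides specify none of these details.

```python
import math

def relative_entropy(h_obs, h_learn, eps=1e-6):
    """KL divergence D(P || Q) between two histograms after normalizing.

    eps is an assumed smoothing constant so that empty bins do not
    produce log(0) or a division by zero.
    """
    bins = set(h_obs) | set(h_learn)
    p = {k: h_obs.get(k, 0.0) + eps for k in bins}
    q = {k: h_learn.get(k, 0.0) + eps for k in bins}
    zp, zq = sum(p.values()), sum(q.values())
    return sum(
        (p[k] / zp) * math.log((p[k] / zp) / (q[k] / zq))
        for k in bins
    )
```

A learner could evaluate this before and after a parameter change and compare the two values to decide whether T went up or down.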
10
Comparing the Inside/Outside algorithm with the proposed algorithm
Inside/Outside
- $O(n^3)$
- Good: 3-5 iterations
- Bad: needs to store the complete sentence corpus
Proposed algorithm
- $O(n^3)$
- Bad: 500-1000 iterations
- Good: memory requirement is constant!