Parsing with Compositional Vector Grammars Socher, Bauer, Manning, NG 2013
Problem How can we parse a sentence and create a dense representation of it? – N-grams have obvious problems, most important is sparsity Can we resolve syntactic ambiguity with context? “They ate udon with forks” vs “They ate udon with chicken”
Standard Recursive Neural Net I like green eggs [ Vector(I)] [ Vector(like)] W Main [ Vector(I-like)] Score [ Vector(green)] [ Vector(eggs)] Classifier? W Main [ Vector((I-like)green)]
Standard Recursive Neural Net
Syntactically Untied RNN I like green eggs [ Vector(I)] [ Vector(like)] W N,V [ Vector(I-like)] Score [ Vector(green)] [ Vector(eggs)] Classifier W adj,N [ Vector(green-eggs)] First, parse lower level with PCFG N VAdj N
Syntactically Untied RNN
Examples: Composition Matrixes Notice that he initializes them with two identity matrixes (in the absence of other information we should average
Learning the Weights (for logistic) input
Tricks
Learning the Tree
Finding the Best Tree (inference) Want to find the parse tree with the max score (which is the sum all the scores of all sub trees) Too expensive to try every combination Trick: use non-RNN method to select best 200 trees (CKY algorithm). Then, beam search these trees with RNN.
Model Comparisons (WSJ Dataset) (Socher’s Model) F1 for parse labels
Analysis of Errors
Conclusions:
The model in this paper has (probably) been eclipsed by the Recursive Neural Tensor Network. Subsequent work showed this model performed better (in different situations) than the SU-RNN