Presentation transcript: Quasi-Synchronous Grammars

1 Quasi-Synchronous Grammars
- Based on key observations in MT: translated sentences often have some isomorphic syntactic structure, but usually not in its entirety; the strictness of the isomorphism may vary across words or syntactic rules.
- Key idea: unlike stricter, more rigid synchronous grammars (e.g. SCFG), a QG defines a monolingual grammar for the target tree, "inspired" by the source tree.

2 Quasi-Synchronous Grammars
- In other words, we model the generation of the target tree as influenced by the source tree (and their alignment).
- QA can be thought of as extremely free monolingual translation.
- The linkage between question and answer trees in QA is looser than in MT, which gives QG a bigger edge.

3 Model
- Works on labeled dependency parse trees
- Learns the hidden structure (the alignment between the Q and A trees) by summing out ALL possible alignments
- One particular alignment tells us both the syntactic configurations and the word-to-word semantic correspondences
- An example follows: a question parse tree, an answer parse tree, and an alignment
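To make the summing-out step concrete, here is a worked sketch in notation of our own choosing (the slides give no formulas): let T_1 be the source tree, T_2 the target tree, and a an alignment mapping each target word to a source word (or to null). The model marginalizes over every alignment:

```latex
p(T_2 \mid T_1)
  = \sum_{a} p(T_2, a \mid T_1)
  = \sum_{a} \prod_{w \in T_2}
      p\bigl(w,\, a(w) \,\big|\, \mathrm{pa}(w),\, a(\mathrm{pa}(w)),\, T_1\bigr)
```

where pa(w) is w's parent in the target tree; the factored form anticipates the local Markov assumption stated on slide 6, which is what makes the sum tractable.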

4 [Figure: dependency parse trees. Q: "Who is the leader of France?" (who/WP/qword, is/VB, the/DT, leader/NN, France/NNP/location). A: "Bush met French president Jacques Chirac" (Bush/NNP/person, met/VBD, French/JJ/location, president/NN, Jacques Chirac/NNP/person). Both trees hang from a virtual root "$", with labeled dependency edges (root, subj, obj, det, of, with, nmod)]

5 [Figure: the same Q and A trees from slide 4, before any alignment links are drawn]

6 [Figure: generation of the question tree begins; its root "is" is generated, aligned to "met" in the answer tree] Our model makes local Markov assumptions to allow efficient computation via dynamic programming (details in the paper): given its parent, a word is independent of all other words (including its siblings).
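The Markov assumption is exactly what turns the exponential sum over alignments into a tractable recursion. Below is a minimal dynamic-programming sketch of our own (helper names such as p_local are illustrative, not the authors' code): each target word depends only on its parent and the parent's alignment, so an inside-style recursion over the target tree computes the marginal in O(|T| · |S|²) time.

```python
# Minimal sketch (ours) of the dynamic program licensed by the Markov
# assumption: each target-tree word depends only on its parent and the
# parent's alignment, so the sum over ALL alignments factors into an
# inside-style recursion over the target dependency tree.
from functools import lru_cache

def alignment_marginal(t_root, t_children, source_words, p_local):
    """Sum over all alignments of the product of local probabilities.

    t_root:       root word of the target tree
    t_children:   dict mapping each target word to a list of its children
    source_words: words of the source tree a target word may align to
    p_local:      p_local(child, src_of_child, parent, src_of_parent) -> float
                  (a hypothetical local model; the signature is illustrative)
    """
    options = tuple(source_words) + (None,)  # None = leave the word unaligned

    @lru_cache(maxsize=None)
    def inside(node, src):
        # Total weight of the subtree under `node`, given node's own alignment.
        total = 1.0
        for child in t_children.get(node, ()):
            total *= sum(
                p_local(child, s, node, src) * inside(child, s)
                for s in options
            )
        return total

    # Treat the virtual roots "$" of the two trees as aligned to each other.
    return inside(t_root, "$")

# Toy usage with a dummy local model:
if __name__ == "__main__":
    kids = {"is": ["who", "leader"], "leader": ["the", "France"]}
    p = lambda c, s, par, sp: 0.5 if s is not None else 0.1
    print(alignment_marginal("is", kids, ["Bush", "met", "president"], p))
```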

7 [Figure: the question tree grows; "who" is generated next and linked into the alignment]

8 [Figure: the question tree grows; "leader" is generated next]

9 [Figure: the question tree grows; "the" is generated next]

10 [Figure: "France" is generated, completing the question tree and the full alignment with the answer tree]

11 6 types of syntactic configurations
- Parent-child

12 [Figure: the aligned Q/A trees, highlighting a parent-child configuration]

13 Parent-child configuration

14 6 types of syntactic configurations
- Parent-child
- Same-word

15 [Figure: the aligned Q/A trees, highlighting a same-word configuration]

16 Same-word configuration

17 6 types of syntactic configurations
- Parent-child
- Same-word
- Grandparent-child

18 [Figure: the aligned Q/A trees, highlighting a grandparent-child configuration]

19 Grandparent-child configuration

20 6 types of syntactic configurations
- Parent-child
- Same-word
- Grandparent-child
- Child-parent
- Siblings
- C-command (same as [D. Smith & Eisner '06]; a classification sketch follows below)
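As an illustration of how these six cases can be told apart, here is a small sketch of our own (not the authors' code); it assumes the configuration of an aligned parent-child pair is read off the source-tree positions of the two words they align to:

```python
# Illustrative sketch (ours): classify an aligned parent-child pair into one
# of the six configurations, given the source-tree words each is aligned to.
# src_parent_of maps every source word to its parent (None for the root).

def configuration(src_of_child, src_of_parent, src_parent_of):
    if src_of_child == src_of_parent:
        return "same-word"
    if src_parent_of.get(src_of_child) == src_of_parent:
        return "parent-child"
    if src_parent_of.get(src_of_parent) == src_of_child:
        return "child-parent"
    if src_parent_of.get(src_parent_of.get(src_of_child)) == src_of_parent:
        return "grandparent-child"
    if src_parent_of.get(src_of_child) == src_parent_of.get(src_of_parent):
        return "siblings"
    return "c-command"  # catch-all bucket, as in [D. Smith & Eisner '06]

# e.g. for Q = "Who is the leader of France?":
q_parents = {"is": None, "who": "is", "leader": "is",
             "the": "leader", "France": "leader"}
assert configuration("the", "leader", q_parents) == "parent-child"
assert configuration("who", "leader", q_parents) == "siblings"
```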

21 [Figure-only slide]

22 Modeling alignment
- Base model

23 [Figure: the aligned Q/A trees, illustrating the base alignment model]

24 [Figure: the aligned Q/A trees, illustrating the base alignment model (cont.)]

25 Modeling alignment cont.
- Base model
- Log-linear model: lexical-semantic features from WordNet (identity, hypernym, synonym, entailment, etc.)
- Mixture model (a sketch of the combination follows below)
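One plausible way the pieces fit together, in notation of our own (the slides give no formulas): the log-linear component scores a target word w against the source word v it aligns to through a feature vector f(w, v) of WordNet relations, and the mixture interpolates it with the base model:

```latex
p_{\text{log-lin}}(w \mid v)
  = \frac{\exp\!\bigl(\theta^{\top} f(w, v)\bigr)}
         {\sum_{w'} \exp\!\bigl(\theta^{\top} f(w', v)\bigr)},
\qquad
p_{\text{mix}} = \lambda\, p_{\text{base}} + (1 - \lambda)\, p_{\text{log-lin}}
```

where θ holds the feature weights and λ is the mixture coefficient, both learned during training (next slide).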

26 Parameter estimation
- Things to be learned:
  - multinomial distributions in the base model
  - log-linear model feature weights
  - mixture coefficient
- Training involves summing out hidden structures, and is thus non-convex.
- Solved using conditional Expectation-Maximization
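The slide compresses a full training procedure; the skeleton below is our own schematic of a conditional-EM loop (the helpers e_step and m_step are placeholders, not the authors' API): each iteration computes expected alignment counts under the current parameters, then re-fits the three parameter groups.

```python
# Schematic sketch (ours) of conditional EM with alignments as the hidden
# structure. `e_step` and `m_step` are hypothetical helpers standing in for
# the real computations described on the slide.

def conditional_em(train_pairs, params, e_step, m_step, max_iters=50, tol=1e-4):
    """train_pairs: (Q-tree, A-tree) pairs; params: current model parameters."""
    prev_ll = float("-inf")
    for _ in range(max_iters):
        # E-step: expected sufficient statistics, summing over all alignments
        # with the dynamic program (tractable thanks to the Markov assumption);
        # also returns the conditional log-likelihood of the data.
        stats, log_lik = e_step(train_pairs, params)
        # M-step: re-fit the multinomials (closed form), the log-linear
        # feature weights (gradient-based), and the mixture coefficient.
        params = m_step(stats, params)
        if log_lik - prev_ll < tol:  # converged -- to a local optimum only,
            break                    # since the objective is non-convex
        prev_ll = log_lik
    return params
```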

27 Experiments
- TREC 8-12 data set for training
- TREC 13 questions for development and testing

28 Candidate answer generation
- For each question, we take all documents from the TREC document pool and extract the sentences that contain at least one non-stopword keyword from the question.
- For computational reasons (parsing speed, etc.), we only keep answer sentences of at most 40 words. (A filtering sketch follows below.)
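A minimal sketch of this heuristic (ours; the STOPWORDS set is an illustrative stand-in for whatever stoplist was actually used):

```python
# Minimal sketch (ours) of the candidate-extraction heuristic: keep a sentence
# iff it shares at least one non-stopword keyword with the question and is at
# most 40 tokens long.
import re

STOPWORDS = {"the", "a", "an", "of", "is", "was", "who", "what", "in", "to"}

def _tokens(text):
    return [t.lower() for t in re.findall(r"\w+", text)]

def candidate_sentences(question, sentences, max_len=40):
    keywords = set(_tokens(question)) - STOPWORDS
    return [
        s for s in sentences
        if len(_tokens(s)) <= max_len and keywords & set(_tokens(s))
    ]

# Toy usage: keeps only the sentence sharing the keywords "leader"/"france".
print(candidate_sentences(
    "Who is the leader of France?",
    ["The leader of France, Jacques Chirac, met Bush.",
     "The weather was mild."],
))
```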

29 Dataset statistics
- Manually labeled 100 questions for training: 348 positive Q/A pairs in total
- 84 questions for development: 1,415 Q/A pairs in total (on average 3.1 positive and 17.1 negative per question)
- 100 questions for testing: 1,703 Q/A pairs in total (on average 3.6 positive and 20.0 negative per question)
- Automatically labeled another 2,193 questions to create a noisy training set, for evaluating model robustness

30 Experiments cont.
- Each question and answer sentence is tokenized, POS-tagged (MXPOST), parsed (MSTParser), and labeled with named-entity tags (IdentiFinder)

31 Baseline systems (replications)
- [Cui et al. SIGIR '05]: the algorithm behind one of the best-performing systems in the TREC evaluations. It uses a mutual-information-inspired score computed over dependency trees and a single fixed alignment between them.
- [Punyakanok et al. NLE '04]: measures the similarity between Q and A by computing tree edit distance.
- Both baselines are high-performing, syntax-based, and the most straightforward to replicate.
- We further enhanced both algorithms by augmenting them with WordNet.
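For concreteness, here is a memoized sketch of ordered tree edit distance, the measure behind the second baseline (our own minimal version with unit costs, not the replicated system): it is the classic forest recursion on rightmost roots, with trees encoded as (label, children) tuples.

```python
# Memoized sketch (ours) of ordered tree edit distance with unit
# insert/delete costs and 0/1 relabel cost. A tree is a (label, children)
# tuple; a forest is a tuple of trees.
from functools import lru_cache

@lru_cache(maxsize=None)
def forest_dist(F, G):
    if not F and not G:
        return 0
    if not G:                       # delete the rightmost root of F;
        _, kids = F[-1]             # its children take its place
        return forest_dist(F[:-1] + kids, G) + 1
    if not F:                       # symmetric: insert the rightmost root of G
        _, kids = G[-1]
        return forest_dist(F, G[:-1] + kids) + 1
    (lv, kv), (lw, kw) = F[-1], G[-1]
    return min(
        forest_dist(F[:-1] + kv, G) + 1,        # delete v
        forest_dist(F, G[:-1] + kw) + 1,        # insert w
        forest_dist(kv, kw)                     # match v with w: their child
        + forest_dist(F[:-1], G[:-1])           # forests, the remaining
        + (0 if lv == lw else 1),               # forests, and a relabel cost
    )

def tree_edit_distance(t1, t2):
    return forest_dist((t1,), (t2,))

# Toy usage on two tiny dependency trees:
q = ("is", (("who", ()), ("leader", (("the", ()), ("France", ())))))
a = ("met", (("Bush", ()), ("president", (("French", ()),))))
print(tree_edit_distance(q, a))
```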

32 Results
[Table: Mean Average Precision and Mean Reciprocal Rank of Top 1 for each system. Our model is statistically significantly better than the 2nd-best score in each column; the transcript also lists relative figures of 28.2%, 23.9%, 41.2%, and 30.3%]

33 Summing vs. Max

34 Switching back
- Tree-edit CRFs

