Learning to Paraphrase: An Unsupervised Approach Using Multiple-Sequence Alignment. Regina Barzilay and Lillian Lee, Cornell University. HLT-NAACL 2003.


1 Learning to Paraphrase: An Unsupervised Approach Using Multiple-Sequence Alignment. Regina Barzilay and Lillian Lee, Cornell University. HLT-NAACL 2003 (22%, wow!)

2 Objective: Generate paraphrases automatically by learning from comparable corpora. Domain-dependent paraphrasing (news-oriented). Example: "The plane bombed the town." → "The town was bombed by the plane."

3 Three Stages: 1. Pattern Selection (Within Corpus): find recurring patterns such as “X kicked Y”. 2. Paraphrase Acquisition (Across Corpora): pair patterns such as “X kicked Y” and “Y was kicked by X”. 3. Generation: convert “Alice kicked Bob” to “Bob was kicked by Alice”.

4 A diagram, thrown in for free

5 1. Pattern Selection (Within Corpus) 2. Paraphrase Acquisition (Across Corpora) 3. Generation. One step at a time.

6 Pattern Selection: Sentence Clustering. Use complete-link clustering to cluster similar sentences within a corpus. Use n-gram overlap as the similarity measure. (Folks who have not yet taken the professor's IR course, were not paying attention in class, or have already forgotten should refer to the professor's slides.) Replace dates, numbers and proper names in sentences with generic tokens to account for argument variability.
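The clustering step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the similarity is a Jaccard overlap of unigrams and bigrams, the threshold of 0.3 is arbitrary, and only dates and numbers are replaced (proper-name replacement would need a named-entity tagger, which is left out). The function names `normalize`, `similarity` and `complete_link` are hypothetical and not taken from the paper.

```python
import re
from itertools import combinations

def normalize(sentence):
    """Lowercase, replace dates and numbers with generic tokens, and tokenize."""
    s = sentence.lower()
    s = re.sub(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b", "DATE", s)  # e.g. 12/03/2002
    s = re.sub(r"\b\d+(\.\d+)?\b", "NUM", s)               # remaining numbers
    return re.findall(r"[A-Za-z]+", s)

def ngrams(tokens, n_max=2):
    """Set of all word n-grams with 1 <= n <= n_max."""
    return {tuple(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)}

def similarity(a, b):
    """Assumed measure: Jaccard overlap of the two n-gram sets."""
    ga, gb = ngrams(a), ngrams(b)
    union = ga | gb
    return len(ga & gb) / len(union) if union else 0.0

def complete_link(sentences, threshold=0.3):
    """Agglomerative clustering; a merge requires ALL cross pairs to be similar."""
    toks = [normalize(s) for s in sentences]
    clusters = [[i] for i in range(len(toks))]

    def link(c1, c2):
        # Complete link: the similarity of the least similar pair.
        return min(similarity(toks[i], toks[j]) for i in c1 for j in c2)

    while len(clusters) > 1:
        (a, b), best = max(
            (((x, y), link(clusters[x], clusters[y]))
             for x, y in combinations(range(len(clusters)), 2)),
            key=lambda pair: pair[1])
        if best < threshold:
            break
        clusters[a] += clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]
```

Because dates and numbers collapse to `DATE` and `NUM`, two reports of the same kind of event with different counts end up with identical token sequences and fall into one cluster.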

7 Sample Cluster

8 Pattern Selection: Inducing Patterns. Use multiple-sequence alignment (MSA), an extension of pairwise sequence alignment. Pairwise sequence alignment is similar to edit distance: aligning two identical words scores 1; inserting a word scores -0.01; aligning two different words scores -0.5. The goal is to find the alignment with the highest score. MSA's scoring function is the sum of all the pairwise alignment scores. Man proposes, heaven disposes: MSA is NP-complete! But there is an approximation algorithm.
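The pairwise scoring on this slide can be illustrated with a standard global-alignment dynamic program. The sketch below uses the three scores from the slide; the function name `align_score` and the choice of a Needleman-Wunsch style recurrence are assumptions for illustration, not necessarily the authors' implementation, and it computes only the score, not the alignment itself.

```python
def align_score(a, b, match=1.0, mismatch=-0.5, gap=-0.01):
    """Best global alignment score of two token lists under the slide's scores."""
    m, n = len(a), len(b)
    # dp[i][j] = best score for aligning a[:i] with b[:j]
    dp = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dp[i][0] = i * gap          # a[:i] aligned against nothing
    for j in range(1, n + 1):
        dp[0][j] = j * gap          # b[:j] aligned against nothing
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            dp[i][j] = max(dp[i - 1][j - 1] + sub,  # align the two words
                           dp[i - 1][j] + gap,      # word of a against a gap
                           dp[i][j - 1] + gap)      # word of b against a gap
    return dp[m][n]
```

With these particular weights, two insertions (-0.02 in total) always cost less than one mismatch (-0.5), so differing words end up on parallel paths rather than being forced into the same position.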

9 Lattice Example

10 Identify the Variables. We want to identify the variable parts (e.g. event, person name, location, ...). The non-variable part (a backbone node) is defined as a node shared by more than 50% of the cluster's sentences. We still have to tell apart argument variability (bad) and synonym variability (good).
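A minimal sketch of the backbone test, assuming the lattice has already been flattened into aligned columns where each column maps a word to the set of sentence ids passing through that node. This data layout and the name `backbone_nodes` are hypothetical simplifications; only the "more than 50%" rule comes from the slide.

```python
def backbone_nodes(columns, n_sentences):
    """Return (column index, word) for nodes shared by more than 50% of sentences."""
    return [(i, word)
            for i, column in enumerate(columns)
            for word, sentence_ids in column.items()
            if len(sentence_ids) / n_sentences > 0.5]

# Example: a cluster of 4 aligned sentences.
columns = [
    {"a": {0, 1, 2, 3}},
    {"bomber": {0, 1}, "gunman": {2}, "driver": {3}},
    {"killed": {0, 1, 2}, "injured": {3}},
]
print(backbone_nodes(columns, 4))
```

Note that "bomber" appears in exactly half of the sentences and is therefore not a backbone node, since the rule asks for strictly more than 50%.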

11 Argument Variability vs. Synonym Variability. Idea: arguments vary more (e.g. different location names) than synonyms do. Define a synonymy threshold of 30%. If none of the parallel nodes has at least 30% of all edges pointing to it, the parallel nodes are arguments rather than synonyms. Replace parallel argument nodes with a slot.
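The threshold rule can be written as a one-line check. In this sketch, `edge_counts` is a hypothetical mapping from each parallel node to the number of edges pointing to it; the 30% default is the value from the slide.

```python
def is_argument_region(edge_counts, threshold=0.30):
    """True if no parallel node receives at least `threshold` of all edges."""
    total = sum(edge_counts.values())
    return total > 0 and all(count / total < threshold
                             for count in edge_counts.values())

# Five location names, each with 20% of the edges: treated as an argument slot.
print(is_argument_region({"jerusalem": 2, "gaza": 2, "hebron": 2, "jenin": 2, "nablus": 2}))
# One node receives 70% of the edges: treated as synonyms, so no slot.
print(is_argument_region({"killed": 7, "slew": 3}))
```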

12 Lattice with Slots

13 1. Pattern Selection (Within Corpus) 2. Paraphrase Acquisition (Across Corpora) 3. Generation. One step at a time.

14 Paraphrase Acquisition: A Successful Match. We want to pair up lattices from two different corpora (e.g. “X kicked Y” in Corpus A and “Y was kicked by X” in Corpus B). Idea: paraphrases have the same slot values. “Take a pair of lattices from different corpora, look back at the sentence clusters from which the two lattices were derived, and compare the slot values of those cross-corpus sentence pairs that appear in articles written on the same day” (© Barzilay 2003). For example, we can have “the plane bombed the town” in Corpus A and “the town was bombed by the plane” in Corpus B, both written on the same day. More overlapping slot values → better.
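One way to picture the matching criterion: each cluster is a list of (date, slot values) records, and two lattices are compared by the overlap of slot values over same-day sentence pairs. The slide does not give the exact overlap measure, so the average Jaccard overlap used here, the record layout and the name `slot_overlap` are all assumptions.

```python
def slot_overlap(cluster_a, cluster_b):
    """Average Jaccard overlap of slot values over same-day cross-corpus pairs."""
    scores = [len(values_a & values_b) / len(values_a | values_b)
              for date_a, values_a in cluster_a
              for date_b, values_b in cluster_b
              if date_a == date_b and (values_a | values_b)]
    return sum(scores) / len(scores) if scores else 0.0

corpus_a = [("2002-03-12", {"plane", "town"}), ("2002-03-13", {"army", "camp"})]
corpus_b = [("2002-03-12", {"town", "plane"}), ("2002-03-14", {"army", "camp"})]
print(slot_overlap(corpus_a, corpus_b))
```

Only the pair written on 2002-03-12 is compared here; the "army, camp" sentences share slot values but were written on different days, so they do not count.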

15 1. Pattern Selection (Within Corpus) 2. Paraphrase Acquisition (Across Corpora) 3. Generation. One step at a time.

16 Generation: Mission Accomplished. Given an input sentence, use MSA to find the most similar training sentence. Use the training sentence's lattice to generate paraphrases.
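As a toy illustration of generation, the sketch below replaces the lattice lookup with exact token matching against a flat pattern. This is a deliberate simplification and not the MSA-based matching described on the slide. Slots are written with a leading `$`, and `match_pattern` and `paraphrase` are hypothetical names.

```python
def match_pattern(pattern, tokens):
    """Bind slots (tokens starting with '$') to words; None if the pattern does not fit."""
    if len(pattern) != len(tokens):
        return None
    bindings = {}
    for p, t in zip(pattern, tokens):
        if p.startswith("$"):
            bindings[p] = t      # slot: remember its filler
        elif p != t:
            return None          # fixed word differs: no match
    return bindings

def paraphrase(sentence, source_pattern, target_pattern):
    """Rewrite `sentence` by moving its slot fillers into the paired pattern."""
    bindings = match_pattern(source_pattern.split(), sentence.split())
    if bindings is None:
        return None
    return " ".join(bindings.get(tok, tok) for tok in target_pattern.split())

print(paraphrase("Alice kicked Bob", "$X kicked $Y", "$Y was kicked by $X"))
```

Here each slot matches exactly one word; a real lattice would also allow multi-word arguments and alternative wordings along parallel paths.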

17 Evaluation Corpora. Corpus 1: Agence France-Presse (AFP). Corpus 2: Reuters. Articles written between September 2000 and August 2002. Focus on violence in Israel and army raids on Palestinian territories. 9 MB of articles in total.

18 Experiment 1/2: Templates. Evaluate the quality of the templates generated. Example: is “X kicked Y” equivalent to “Y was kicked by X”? Baseline: DIRT, another paraphrase system (which focuses on shorter phrases). 4 human judges. Randomly extract 250 pairs of paraphrases per system. 100 pairs (50 per system) are evaluated by all 4 judges. Each judge evaluates a different 100 of the remaining 400 pairs.

19 Result 1/2: Lines Open for Call-ins

20 Result 1/2: Lines Open for Call-ins. MSA outperforms DIRT by about 38% in all cases. But DIRT focuses only on short phrases, so the comparison is unfair. Then again, no one had done sentence-level paraphrasing before.

21 Experiment 2/2: Paraphrases. Evaluate the actual paraphrases generated. For testing, choose 20 AFP articles about violence in the Middle East that are not in the training corpus. Try to paraphrase every sentence in the 20 articles. Baseline: randomly substitute words in the sentence with WordNet synonyms. 2 judges (human labor is too expensive, and we have no "five years, fifty billion" grant).

22 Result 2/2: Lines Open for Call-ins. 59 out of 484 sentences have paraphrases (12.2%).

Judge   MSA     WordNet synonym
1       81.4%   69.5%
2       78.0%   66.1%

23 Finally. Generating sentence-level paraphrases had not been addressed previously. The approach uses comparable corpora instead of parallel corpora. The lab has already presented 5 of Barzilay's papers (not counting this one).

