1
Normalized alignment of dependency trees for detecting textual entailment
Erwin Marsi & Emiel Krahmer, Tilburg University
Wauter Bosma & Mariët Theune, University of Twente
2
April 10 2006, RTE2 Workshop
Basic idea
A true hypothesis is included in the text, allowing omission and rephrasing
Text: The Rolling Stones kicked off their latest tour on Sunday with a concert at Boston's Fenway Park.
Hypothesis: The Rolling Stones have begun their latest tour with a concert in Boston.
Entailment: True
Omissions:
– on Sunday
– Fenway Park
Paraphrases:
– kicked off → begun
– Boston's Fenway Park → Boston
3
Matching surface words alone is not sufficient...
Variation in surface realization
Perfect word match is no guarantee for entailment
Using syntactic analysis
– for syntactic normalization
– to match on hierarchical relations among constituents
Example:
Text: “He became a boxing referee in 1964, and became well-known […]”
Hypothesis: “He became well-known in 1964”
4
Preprocessing
Input: T-H pairs in XML
Processing pipeline:
1. Sentence splitting, MXTERMINATOR (Reynar & Ratnaparkhi, 1997)
2. Tokenization, Penn Treebank SED script
3. POS tagging with PTB POS tags using Mbt (van den Bosch et al.)
4. Lemmatizing using memory-based learning (van den Bosch et al.)
5. Dependency parsing using MaltParser trained on PTB (Nivre & Scholz, 2004)
6. Syntactic normalization
Output: T-H dependency tree(s) pairs in XML
5
Syntactic Normalization
Three types of syntactic normalization:
– Auxiliary reduction
– Passive to active form
– Copula reduction
6
Auxiliary Reduction
Auxiliaries of progressive and perfective tense are removed
Their children are attached to the remaining content verb
The same goes for modal verbs, and for "do" in its do-support function
Example: “demand for ivory has dropped” → “demand for ivory dropped”
Example: “legalization does not solve any social problems” → “legalization not solves any social problems”
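The auxiliary reduction described above can be sketched on a toy dependency tree. This is a minimal illustration, not the authors' implementation: the `Node` class, the relation labels `sbj` and `vc`, and the POS-based test in `is_reducible` are all assumptions made for the example. It also leaves word forms untouched, so it does not move the inflection onto the content verb as in the "solves" example.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)
class Node:
    word: str
    lemma: str
    pos: str                      # Penn Treebank POS tag
    rel: str = "root"             # relation to the parent (label names are assumed)
    children: list = field(default_factory=list)

MODALS = {"can", "could", "may", "might", "must", "shall", "should", "will", "would"}

def is_reducible(aux, verb):
    """True if aux is a perfect, progressive, do-support or modal auxiliary of verb."""
    return (
        (aux.lemma == "have" and verb.pos == "VBN")      # perfect
        or (aux.lemma == "be" and verb.pos == "VBG")     # progressive
        or (aux.lemma == "do" and verb.pos == "VB")      # do-support
        or (aux.lemma in MODALS and verb.pos == "VB")    # modal
    )

def reduce_auxiliaries(node):
    """Remove reducible auxiliaries top-down and return the new subtree root."""
    verb = next((c for c in node.children if c.rel == "vc"), None)
    while verb is not None and is_reducible(node, verb):
        # Re-attach the auxiliary's other children to the verb below it.
        verb.children = [c for c in node.children if c is not verb] + verb.children
        verb.rel = node.rel
        node = verb
        verb = next((c for c in node.children if c.rel == "vc"), None)
    node.children = [reduce_auxiliaries(c) for c in node.children]
    return node

# "demand for ivory has dropped" (the "for ivory" modifier is omitted for brevity)
demand = Node("demand", "demand", "NN", "sbj")
dropped = Node("dropped", "drop", "VBN", "vc")
has = Node("has", "have", "VBZ", "root", [demand, dropped])
root = reduce_auxiliaries(has)
```

After the call, `root` is the node for "dropped" and the subject "demand" hangs directly under it. A passive auxiliary ("be" followed by a past participle) is deliberately left alone here, since the slides treat it as a separate normalization.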
7
Passive to Active Form
The passive form auxiliary is removed
The original subject becomes object
Where possible, a by-phrase becomes the subject
Example: “Ahmedinejad was attacked by the US” → “the US attacked Ahmedinejad”
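A rough sketch of the passive-to-active step, under the same kind of assumptions: the `Node` class and the relation labels (`sbj`, `obj`, `vc`, `pmod`) are invented for illustration and are not taken from the authors' system. It removes the passive auxiliary, relabels the original subject as object, and promotes the noun inside a by-phrase to subject when one is present.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)
class Node:
    word: str
    lemma: str
    pos: str                      # Penn Treebank POS tag
    rel: str = "root"             # relation to the parent (label names are assumed)
    children: list = field(default_factory=list)

def passive_to_active(node):
    """Rewrite 'be + past participle' subtrees into active form, bottom-up."""
    node.children = [passive_to_active(c) for c in node.children]
    verb = next((c for c in node.children
                 if c.rel == "vc" and c.pos == "VBN"), None)
    if node.lemma != "be" or verb is None:
        return node
    # The auxiliary's other children move to the verb; the subject becomes object.
    for child in node.children:
        if child is verb:
            continue
        if child.rel == "sbj":
            child.rel = "obj"
        verb.children.append(child)
    # Where possible, the noun inside the by-phrase becomes the new subject.
    by = next((c for c in verb.children if c.lemma == "by" and c.children), None)
    if by is not None:
        agent = by.children[0]
        agent.rel = "sbj"
        verb.children = [agent] + [c for c in verb.children if c is not by]
    verb.rel = node.rel
    return verb

# "Ahmedinejad was attacked by the US" (determiner omitted for brevity)
us = Node("US", "US", "NNP", "pmod")
by_phrase = Node("by", "by", "IN", "adv", [us])
attacked = Node("attacked", "attack", "VBN", "vc", [by_phrase])
subject = Node("Ahmedinejad", "Ahmedinejad", "NNP", "sbj")
was = Node("was", "be", "VBD", "root", [subject, attacked])
root = passive_to_active(was)
```

The result is rooted at "attacked", with "US" as subject and "Ahmedinejad" as object. When no by-phrase exists the verb is simply left without a subject, which matches the "where possible" wording on the slide.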
8
Copula Reduction
Copular verbs are removed by attaching the predicate as a daughter to the subject
Example: “Microsoft Corp. is a partner of Intel Corp.” → “Microsoft Corp., a partner of Intel Corp.”
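Copula reduction can be sketched in the same style. The `Node` class and the labels `sbj` and `prd` are assumptions for illustration. The slide does not say what happens to other dependents of the copula; this sketch moves them under the subject as well, which is a guess.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)
class Node:
    word: str
    lemma: str
    rel: str = "root"             # relation to the parent (label names are assumed)
    children: list = field(default_factory=list)

def reduce_copula(node):
    """Remove copular 'be' by attaching its predicate as a daughter of its subject."""
    node.children = [reduce_copula(c) for c in node.children]
    if node.lemma != "be":
        return node
    subj = next((c for c in node.children if c.rel == "sbj"), None)
    pred = next((c for c in node.children if c.rel == "prd"), None)
    if subj is None or pred is None:
        return node               # not a copular construction we can reduce
    # Assumption: any remaining dependents of the copula also move to the subject.
    rest = [c for c in node.children if c is not subj and c is not pred]
    subj.children.extend([pred] + rest)
    subj.rel = node.rel
    return subj

# "Microsoft Corp. is a partner of Intel Corp." (multi-word names kept as single nodes)
intel = Node("Intel Corp.", "Intel Corp.", "pmod")
of = Node("of", "of", "nmod", [intel])
partner = Node("partner", "partner", "prd", [of])
microsoft = Node("Microsoft Corp.", "Microsoft Corp.", "sbj")
is_node = Node("is", "be", "root", [microsoft, partner])
root = reduce_copula(is_node)
```

After reduction the tree is rooted at "Microsoft Corp." with "partner" as its daughter, mirroring the apposition-like form in the slide's example.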
9
Alignment of Dependency Trees
Tree alignment algorithm based on (Meyers, Yangarber and Grishman, 1996)
Searches for an optimal alignment of the nodes of the text tree to the nodes of the hypothesis tree
Tree alignment is a function of:
1. how well the words of the two nodes match
2. recursively, the weighted alignment score for each of the aligned daughter nodes
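The two ingredients listed above (word match at the node, plus recursive scores for the daughters) can be illustrated with a deliberately simplified scorer. This is not the Meyers et al. algorithm: it matches each hypothesis daughter greedily to its best text daughter or descendant, does not enforce a one-to-one alignment, and normalizes by simple averaging. The `Node` class, the `skip_penalty` default, and the toy `toy_match` function are all assumptions.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)
class Node:
    word: str
    children: list = field(default_factory=list)

def best_match(t, h, word_match, skip_penalty):
    """Best score for aligning h with t or a descendant of t.
    Each skipped level of the text tree costs skip_penalty."""
    best = align(t, h, word_match, skip_penalty)
    for child in t.children:
        best = max(best, best_match(child, h, word_match, skip_penalty) - skip_penalty)
    return max(best, 0.0)

def align(t, h, word_match, skip_penalty=0.1):
    """Normalized score in [0, 1] for aligning text node t with hypothesis node h."""
    own = word_match(t.word, h.word)
    if not h.children:
        return own
    daughters = 0.0
    for hc in h.children:
        daughters += max(
            (best_match(tc, hc, word_match, skip_penalty) for tc in t.children),
            default=0.0,
        )
    # Average over the hypothesis node and its daughters.
    return (own + daughters) / (1 + len(h.children))

def toy_match(wt, wh):
    """Toy word matcher: exact match, or the one paraphrase pair from the slides."""
    return 1.0 if wt == wh or (wt, wh) == ("kicked off", "begun") else 0.0

text = Node("kicked off", [Node("Stones"), Node("tour", [Node("latest")]), Node("concert")])
hyp_good = Node("begun", [Node("Stones"), Node("tour")])
hyp_bad = Node("begun", [Node("Stones"), Node("album")])
score_good = align(text, hyp_good, toy_match)
score_bad = align(text, hyp_bad, toy_match)
```

Here every node of `hyp_good` finds a counterpart in the text tree, while "album" in `hyp_bad` does not, so `score_bad` is lower than `score_good`. Omitted material in the text (such as "concert") costs nothing, in line with the idea that the hypothesis only needs to be included in the text.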
10
Word Matching
function WordMatch(w_t, w_h) -> [0,1] maps text-hypothesis word pairs to a similarity score
returns 1 if
– w_t is identical to w_h
– the lemma of w_t is identical to the lemma of w_h
– w_t is a synonym of w_h (lookup in EuroWordNet with lemma & POS)
– w_h is a hypernym of w_t (idem)
returns similarity from automatically derived thesaurus if > 0.1 (Lin’s dependency-based thesaurus)
otherwise returns 0
also match on phrasal verbs
– e.g. “kick off” is a synonym of “begin”
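The cascade in WordMatch can be written out as a short function. The sketch below replaces EuroWordNet and Lin's thesaurus with tiny hand-made dictionaries (`LEMMAS`, `SYNONYMS`, `HYPERNYMS`, `THESAURUS`), all of which are made up for the example, and it ignores POS in the lookups. Only the order of the checks and the 0.1 cut-off come from the slide.

```python
# Toy lexical resources; every entry here is invented for illustration.
LEMMAS = {"kicked off": "kick off", "begun": "begin", "dogs": "dog"}
SYNONYMS = {frozenset(("kick off", "begin"))}
HYPERNYMS = {"dog": {"animal"}}                   # lemma -> its hypernyms
THESAURUS = {
    frozenset(("tour", "trip")): 0.25,
    frozenset(("tour", "concert")): 0.05,
}

def lemma(word):
    """Look up the lemma, falling back to the lowercased word."""
    return LEMMAS.get(word.lower(), word.lower())

def word_match(wt, wh, min_sim=0.1):
    """Similarity in [0, 1] between a text word wt and a hypothesis word wh."""
    if wt == wh:
        return 1.0
    lt, lh = lemma(wt), lemma(wh)
    if lt == lh:
        return 1.0
    if frozenset((lt, lh)) in SYNONYMS:           # synonym (symmetric)
        return 1.0
    if lh in HYPERNYMS.get(lt, ()):               # w_h is a hypernym of w_t
        return 1.0
    sim = THESAURUS.get(frozenset((lt, lh)), 0.0)
    return sim if sim > min_sim else 0.0
```

Note the asymmetry of the hypernym test: `word_match("dogs", "animal")` succeeds because the hypothesis word is more general than the text word, while `word_match("animal", "dog")` does not. Treating a phrasal verb such as "kicked off" as a single token is what lets it match "begun" through the synonym lookup.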
11
Alignment example
Text: The development of agriculture by early humans, roughly 10,000 years ago, was also harmful to many natural ecosystems as they were systematically destroyed and replaced with artificial versions.
Hypothesis: Humans existed 10,000 years ago.
Entailment: True
12
Alignment example (cont’d)
13
Entailment prediction
Prediction rule:
IF top node of the hypothesis is aligned AND score > threshold
THEN entailment = true
ELSE entailment = false
Threshold and parameters of tree alignment algorithm (skip penalty) optimized per task
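The prediction rule is a one-line test, and the per-task threshold optimization can be sketched as a simple search over candidate values. The slides do not say how the optimization was done; the grid search, the function names, and the development examples below are all assumptions, and the scores are made-up numbers rather than results.

```python
def predict_entailment(top_node_aligned, score, threshold):
    """Prediction rule: the hypothesis top node is aligned AND score > threshold."""
    return top_node_aligned and score > threshold

def tune_threshold(examples, candidates):
    """Pick the candidate threshold with the highest accuracy.
    examples: list of (top_node_aligned, score, gold_label) tuples."""
    def accuracy(threshold):
        correct = sum(
            predict_entailment(aligned, score, threshold) == gold
            for aligned, score, gold in examples
        )
        return correct / len(examples)
    return max(candidates, key=accuracy)

# Hypothetical development examples for one task.
dev = [
    (True, 0.9, True),
    (True, 0.4, False),
    (False, 0.8, False),
    (True, 0.7, True),
]
best = tune_threshold(dev, [0.3, 0.5, 0.8])
```

On this toy data the middle candidate separates the true and false pairs, so it is the one selected. The third example shows why the top-node condition matters: a high score alone does not yield a positive prediction when the root of the hypothesis has no counterpart in the text.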
14
Results
Percentage entailment accuracy (n=800)
15
Problems
Many parses contain errors due to syntactic ambiguity and propagation of
– Spelling errors
– Tokenization errors
– POS errors
– broken dependency trees
Consequently, syntactic normalization & alignment failed
Dependency relations did not help
16
Discussion & Conclusion
There are many forms of textual entailment that we cannot recognize automatically...
– Paraphrasing
– Co-reference resolution
– Ellipsis
– Condition/modality
– Inference
– Common sense / world knowledge
RTE requires a combination of deep NLP, common sense knowledge and reasoning