Textual Entailment as a Framework for Applied Semantics


1 Textual Entailment as a Framework for Applied Semantics
Ido Dagan, Bar-Ilan University, Israel
Joint work with:
- Oren Glickman, Idan Szpektor, Roy Bar Haim, Maayan Geffet, Moshe Koppel, Efrat Marmorshtein – Bar-Ilan University, Israel
- Shachar Mirkin – Hebrew University, Israel
- Hristo Tanev, Bernardo Magnini, Alberto Lavelli, Lorenza Romano – ITC-irst, Italy
- Bonaventura Coppola, Milen Kouylekov – University of Trento and ITC-irst, Italy
- Danilo Giampiccolo – CELCT, Italy
- Dan Roth – UIUC

2 Applied Semantics for Text Understanding/Reading
Understanding text meaning refers to the semantic level of language. An applied computational framework for semantics is needed, yet such a common framework is still missing.

3 Desiderata for Modeling Framework
A framework for a target level of language processing should provide:
- A generic module for applications
- A unified paradigm for investigating language phenomena
- A unified knowledge representation
Most semantics research is scattered across WSD, NER, SRL, lexical semantic relations, ... (e.g., in contrast with syntax), with no systematic coverage of the space – SemEval alone has 10 different tasks (19 overall). The dominating approach is interpretation.

4 Outline
- The textual entailment task – what and why?
- Evaluation – the PASCAL RTE Challenges
- Modeling approach: knowledge acquisition; inference (briefly)
- Application example
- An alternative framework for investigating semantics
(Speaker note: there are clearer frameworks in applied morphology and syntax, but not yet in semantics.)

5 Natural Language and Meaning
(Diagram: language, by nature, exhibits variability and ambiguity – text understanding in a nutshell; levels of representation.)

6 Variability of Semantic Expression
The same fact expressed in many ways:
- The Dow Jones Industrial Average closed up 255
- Dow ends up
- Dow gains 255 points
- Stock market hits a record high
- Dow climbs 255
Model variability as relations between text expressions:
- Equivalence: expr1 ⇔ expr2 (paraphrasing)
- Entailment: expr1 ⇒ expr2 – the general case, which incorporates inference as well
A first step towards a broad semantic model of language variation.

7 Typical Application Inference
Question: Who bought Overture? >> Expected answer form: X bought Overture
Text: Overture’s acquisition by Yahoo. Hypothesized answer: Yahoo bought Overture – the text entails the hypothesized answer.
Similar for IE: X buy Y
Similar for “semantic” IR: t: Overture was bought ...
Summarization (multi-document) – identify redundant information
MT evaluation (and recent ideas for MT)
Educational applications

8 KRAQ'05 Workshop - KNOWLEDGE and REASONING for ANSWERING QUESTIONS (IJCAI-05)
CFP – Reasoning aspects:
- information fusion
- search criteria expansion models
- summarization and intensional answers
- reasoning under uncertainty or with incomplete knowledge
Knowledge representation and integration:
- levels of knowledge involved (e.g., ontologies, domain knowledge)
- knowledge extraction models and techniques to optimize response accuracy
... but there are similar needs in other applications – can entailment provide a common empirical task?

9 Classical Entailment Definition
Chierchia & McConnell-Ginet (2001): a text t entails a hypothesis h if h is true in every circumstance (possible world) in which t is true (formalized below).
Strict entailment does not account for some uncertainty allowed in applications; related notions (implicature, presupposition) encompass likely inference.
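A standard formalization of this definition, in my notation rather than the slide's:

\[ t \models h \quad\iff\quad \forall w \in W:\ (t \text{ is true in } w) \Rightarrow (h \text{ is true in } w) \]

where W is the set of all possible worlds (circumstances).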

10 “Almost certain” Entailments
t: The technological triumph known as GPS ... was incubated in the mind of Ivan Getting.
h: Ivan Getting invented the GPS.

11 Applied Textual Entailment
Directional relation between two text fragments, Text (t) and Hypothesis (h): t entails h (t ⇒ h) if, typically, a human reading t would infer that h is most likely true.
An operational (applied) definition:
- Human gold standard – as in NLP applications
- Assumes common background knowledge – which is indeed expected from applications!

12 Probabilistic Interpretation
Definition: t probabilistically entails h if P(h is true | t) > P(h is true), assuming a prior P(h is true) < 1.
- t increases the likelihood of h being true ≡ positive PMI – t provides information on h’s truth.
- P(h is true | t) is the entailment confidence – the relevant entailment score for applications.
- In practice, “most likely” entailment is expected: the conditional probability should also be high in absolute terms, serving as a confidence value. (A small numeric sketch follows.)
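A minimal numeric sketch of this criterion in Python; the function names, probabilities, and the 0.9 confidence threshold are illustrative assumptions, not values from the talk:

```python
def probabilistically_entails(p_h_given_t: float, p_h_prior: float) -> bool:
    """t probabilistically entails h if observing t raises h's probability."""
    assert p_h_prior < 1.0, "assumed prior < 1, else h is trivially true"
    return p_h_given_t > p_h_prior

def confident_entailment(p_h_given_t: float, threshold: float = 0.9) -> bool:
    """Applications also want an absolutely high conditional probability,
    which doubles as the entailment confidence score."""
    return p_h_given_t >= threshold

# Toy numbers: a model judges P(h|t) = 0.85 against a prior P(h) = 0.3.
print(probabilistically_entails(0.85, 0.3))  # True: t raises h's likelihood
print(confident_entailment(0.85))            # False under a strict 0.9 cutoff
```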

13 The Role of Knowledge For textual entailment to hold we require:
text AND knowledge ⇒ h, but the knowledge should not entail h alone. Systems are not supposed to validate h’s truth without utilizing t.

14 PASCAL Recognizing Textual Entailment (RTE) Challenges EU FP-6 Funded PASCAL NOE 2004-7
Organizing groups: Bar-Ilan University; ITC-irst and CELCT, Trento; MITRE; Microsoft Research.

15 Generic Dataset by Application Use
7 application settings in RTE-1, 4 in RTE-2/3:
- QA
- IE
- “Semantic” IR
- Comparable documents / multi-document summarization
- MT evaluation
- Reading comprehension
- Paraphrase acquisition
Most data was created from actual applications’ output. RTE-2: 800 examples in each of the development and test sets, with a 50–50% YES/NO split.

16 Some Examples

# | Text | Hypothesis | Task | Entailment
1 | Reagan attended a ceremony in Washington to commemorate the landings in Normandy. | Washington is located in Normandy. | IE | False
2 | Google files for its long awaited IPO. | Google goes public. | IR | True
3 | ...a shootout at the Guadalajara airport in May, 1993, that killed Cardinal Juan Jesus Posadas Ocampo and six others. | Cardinal Juan Jesus Posadas Ocampo died in 1993. | QA | True
4 | The SPD got just 21.5% of the vote in the European Parliament elections, while the conservative opposition parties polled 44.5%. | The SPD is defeated by the opposition parties. | - | True

17 Participation and Impact
Very successful challenges, worldwide:
- RTE-1 – 17 groups; RTE-2 – 23 groups (30 groups in total; ~150 downloads!)
- RTE-3 underway – 25 groups; joint workshop at ACL-07
High interest in the research community: papers, conference sessions and areas, PhDs, influence on funded projects, a Textual Entailment special issue at JNLE, and an ACL-07 tutorial.

18 Methods and Approaches (RTE-2)
Measure similarity match between t and h (coverage of h by t):
- Lexical overlap (unigram, N-gram, subsequence)
- Lexical substitution (WordNet, statistical)
- Syntactic matching/transformations
- Lexical-syntactic variations (“paraphrases”)
- Semantic role labeling and matching
- Global similarity parameters (e.g., negation, modality)
Also: cross-pair similarity; detecting mismatch (for non-entailment); logical interpretation and inference (vs. matching). (A sketch of the simplest variant follows.)
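As an illustration of the simplest method above, a minimal lexical-overlap scorer (unigram coverage of h by t); the whitespace tokenization and the usage below are my simplifications:

```python
def lexical_overlap(text: str, hypothesis: str) -> float:
    """Fraction of hypothesis unigrams covered by the text."""
    t_tokens = set(text.lower().split())
    h_tokens = hypothesis.lower().split()
    if not h_tokens:
        return 0.0
    return sum(tok in t_tokens for tok in h_tokens) / len(h_tokens)

# Example pair from slide 16: a true entailment, yet low lexical overlap --
# one reason shallow matching alone is weak.
print(lexical_overlap("Google files for its long awaited IPO.",
                      "Google goes public."))  # ~0.33
```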

19 Dominant approach: Supervised Learning
Similarity features (lexical, n-gram, syntactic, semantic, global) are extracted from each (t, h) pair into a feature vector, and a classifier outputs YES/NO.
- Features model similarity and mismatch
- The classifier determines the relative weights of the information sources
- Train on the development set and auxiliary t–h corpora
(A minimal sketch of this architecture follows.)
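A minimal sketch of this supervised architecture, assuming scikit-learn and two stand-in feature functions (the real systems used far richer feature sets and larger training data):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def features(t: str, h: str) -> list:
    """Stand-ins for the lexical/syntactic/semantic/global feature sources."""
    t_set, h_set = set(t.lower().split()), set(h.lower().split())
    overlap = len(t_set & h_set) / max(len(h_set), 1)                 # similarity
    negation_mismatch = float(("not" in t_set) != ("not" in h_set))   # mismatch
    return [overlap, negation_mismatch]

# Tiny stand-in development set of (t, h, label) pairs.
dev = [
    ("Dow gains 255 points", "Dow ends up", 1),
    ("Dow gains 255 points", "Dow does not end up", 0),
    ("Google files for its IPO", "Google goes public", 1),
    ("Google files for its IPO", "Google does not go public", 0),
]
X = np.array([features(t, h) for t, h, _ in dev])
y = np.array([label for _, _, label in dev])
clf = LogisticRegression().fit(X, y)   # learns relative weights of the sources
print(clf.predict([features("Dow climbs 255", "Dow ends up")]))  # likely [1]
```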

20 Results

Average Precision | Accuracy | First Author (Group)
80.8% | 75.4% | Hickl (LCC)
71.3% | 73.8% | Tatu (LCC)
64.4% | 63.9% | Zanzotto (Milan & Rome)
62.8% | 62.6% | Adams (Dallas)
66.9% | 61.6% | Bos (Rome & Leeds)
- | 58.1%–60.5% | 11 groups
- | 52.9%–55.6% | 7 groups

Average accuracy: 60%; median: 59%.

21 Analysis
For the first time, deeper methods (semantic/syntactic/logical) clearly outperform shallow methods (lexical/n-gram). Cf. Kevin Knight’s invited talk at EACL-06, titled “Isn’t Linguistic Structure Important, Asked the Engineer”. Still, most systems based on deep analysis did not score significantly better than the lexical baseline.

22 Why?
System reports point at:
- Lack of knowledge (syntactic transformation rules, paraphrases, lexical relations, etc.)
- Lack of training data
It seems that the systems that coped better with these issues performed best:
- Hickl et al. – acquisition of large entailment corpora for training
- Tatu et al. – large knowledge bases (linguistic and world knowledge)

23 Some suggested research directions
Knowledge acquisition:
- Unsupervised acquisition of linguistic and world knowledge from general corpora and the Web
- Acquiring larger entailment corpora
- Manual resources and knowledge engineering
Inference:
- A principled framework for inference and for fusing information levels
- Are we happy with bags of features?

24 Complementary Evaluation Modes
Entailment subtask evaluations: lexical, lexical-syntactic, logical, alignment, ...
“Seek” mode: input is h and a corpus; output is all entailing t’s in the corpus. Captures information-seeking needs, but requires post-run annotation (TREC style).
Contribution to specific applications: QA – Harabagiu & Hickl, ACL-06; RE – Romano et al., EACL-06.

25 Our Own Research Directions: Acquisition, Inference, Applications
In which framework should we work to model semantic phenomena?

26 Learning Entailment Rules
Q: What reduces the risk of heart attacks?
Hypothesis: Aspirin reduces the risk of heart attacks.
Text: Aspirin prevents heart attacks.
Entailment rule: X prevent Y ⇨ X reduce risk of Y – each side is a template, a parsed text fragment with variables, and the rule has a direction.
A large collection of entailment rules, on the order of 100,000, is needed to provide good coverage of language variability.
⇒ We need a large knowledge base of entailment rules. (A toy illustration of rule application follows.)
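A toy illustration of applying such a rule, treating templates as surface patterns rather than parsed fragments (the real templates are over parse trees; the regex and names here are my simplifications):

```python
import re

# Rule: "X prevent Y" entails "X reduce risk of Y" (left to right).
LHS = re.compile(r"(?P<X>\w+) prevents? (?P<Y>[\w ]+)")
RHS = "{X} reduces the risk of {Y}"

def apply_rule(text: str):
    """If the text matches the rule's left-hand template, emit the entailed statement."""
    m = LHS.search(text)
    return RHS.format(X=m.group("X"), Y=m.group("Y")) if m else None

print(apply_rule("Aspirin prevents heart attacks"))
# -> "Aspirin reduces the risk of heart attacks" (matches the hypothesis above)
```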

27 TEASE – Algorithm Flow
Input template: X (subj) accuse (obj) Y
1. Sample a corpus for the input template from the Web (guided by a lexicon): “Paula Jones accused Clinton...”, “Sanhedrin accused St. Paul...”
2. Anchor Set Extraction (ASE) extracts anchor sets: {Paula Jones (subj); Clinton (obj)}, {Sanhedrin (subj); St. Paul (obj)}
3. Sample a corpus for the anchor sets: “Paula Jones called Clinton indictable...”, “St. Paul defended before the Sanhedrin”
4. Template Extraction (TE) extracts templates: X call Y indictable; Y defend before X; ...
5. Iterate. (A toy sketch of this loop follows.)
(Speaker notes: state the method’s name, TEASE, and the meaning of its acronym; emphasize that the ASE step solves the supervision problem of previous Web-based methods; finish by restating the two parts of TEASE.)
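A runnable toy sketch of the ASE/TE loop; `web_search`, `extract_anchor_sets`, and `extract_templates` are canned stand-ins for the real Web sampling and parse-tree extraction components:

```python
def web_search(query):
    """Stand-in for Web corpus sampling: returns canned example sentences."""
    canned = {
        "X accuse Y": ["Paula Jones accused Clinton"],
        ("Paula Jones", "Clinton"): ["Paula Jones called Clinton indictable"],
    }
    return canned.get(query, [])

def extract_anchor_sets(sentences):
    """Stand-in ASE: pull argument anchors from the matched sentences."""
    return {("Paula Jones", "Clinton")} if sentences else set()

def extract_templates(sentences, anchors):
    """Stand-in TE: generalize the anchors back into variable templates."""
    return {"X call Y indictable"} if sentences else set()

def tease(input_template: str, iterations: int = 2) -> set:
    """Alternate Anchor Set Extraction (ASE) and Template Extraction (TE)."""
    templates, frontier = {input_template}, {input_template}
    for _ in range(iterations):
        anchor_sets = set()
        for template in frontier:
            anchor_sets |= extract_anchor_sets(web_search(template))
        new_templates = set()
        for anchors in anchor_sets:
            new_templates |= extract_templates(web_search(anchors), anchors)
        frontier = new_templates - templates   # iterate on newly learned templates
        templates |= new_templates
    return templates

print(tease("X accuse Y"))  # {'X accuse Y', 'X call Y indictable'}
```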

28 Sample of Extracted Anchor-Sets for X prevent Y
X=‘sunscreens’, Y=‘sunburn’; X=‘sunscreens’, Y=‘skin cancer’; X=‘vitamin e’, Y=‘heart disease’; X=‘aspirin’, Y=‘heart attack’; X=‘vaccine candidate’, Y=‘infection’; X=‘universal precautions’, Y=‘HIV’; X=‘safety device’, Y=‘fatal injuries’; X=‘hepa filtration’, Y=‘contaminants’; X=‘low cloud cover’, Y=‘measurements’; X=‘gene therapy’, Y=‘blindness’; X=‘cooperation’, Y=‘terrorism’; X=‘safety valve’, Y=‘leakage’; X=‘safe sex’, Y=‘cervical cancer’; X=‘safety belts’, Y=‘fatalities’; X=‘security fencing’, Y=‘intruders’; X=‘soy protein’, Y=‘bone loss’; X=‘MWI’, Y=‘pollution’; X=‘vitamin C’, Y=‘colds’
(Speaker note: the large number of good anchor sets enables learning many different entailment rules.)

29 Sample of Extracted Templates for X prevent Y
X reduce Y; X protect against Y; X eliminate Y; X stop Y; X avoid Y; X for prevention of Y; X provide protection against Y; X combat Y; X ward Y; X lower risk of Y; X be barrier against Y; X fight Y; X reduce Y risk; X decrease the risk of Y; relationship between X and Y; X guard against Y; X be cure for Y; X treat Y; X in war on Y; X in the struggle against Y; X a day keeps Y away; X eliminate the possibility of Y; X cut risk Y; X inhibit Y
(Speaker notes: state the richness of the output; emphasize the potential impact on systems such as question answering.)

30 Experiment and Evaluation
48 randomly chosen input verbs; 1392 templates extracted; human judgments.
Encouraging results:
- Average yield per verb: 29 correct templates
- Average precision per verb: 45.3%
Future work: improve precision; estimate probabilities.
(Speaker notes: explain why yield is reported rather than recall; the precision is comparable to other unsupervised systems.)

31 Acquiring Lexical Entailment Relations
COLING-04, ACL-05: lexical entailment via distributional similarity –
- Individual features characterize semantic properties
- Obtain characteristic features via bootstrapping
- Test characteristic feature inclusion rather than overlap (see the toy sketch below)
COLING-ACL-06: integrate pattern-based extraction (“NP such as NP1, NP2, ...”) –
- Complementary information to the distributional evidence
- Integration using ML with minimal supervision (10 words)
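A toy sketch of the feature-inclusion test (directional: v entails w if v's characteristic features are included in w's, rather than merely overlapping); the feature sets below are invented for illustration:

```python
def inclusion_score(narrow: set, broad: set) -> float:
    """Share of the narrower word's characteristic features covered by the broader word."""
    return len(narrow & broad) / len(narrow) if narrow else 0.0

# Invented characteristic context features; real ones come from corpus statistics.
features = {
    "airline": {"fly", "passenger", "fleet", "hub"},
    "company": {"fly", "passenger", "fleet", "hub", "profit", "merge", "ceo"},
}
print(inclusion_score(features["airline"], features["company"]))  # 1.0: airline => company
print(inclusion_score(features["company"], features["airline"]))  # ~0.57: not entailed
```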

32 Acquisition Example
Top-ranked entailments for “company”: firm, bank, group, subsidiary, unit, business, supplier, carrier, agency, airline, division, giant, entity, financial institution, manufacturer, corporation, commercial bank, joint venture, maker, producer, factory ...
These do not overlap the traditional ontological relations.

33 Initial Probabilistic Lexical Co-occurrence Models
Alignment-based (RTE-1 & ACL-05 Workshop): the probability that a term in h is entailed by a particular term in t. (A toy sketch follows.)
Bayesian classification (AAAI-05): the probability that a term in h is entailed by (fits in) the entire text of t – an unsupervised text categorization setting in which each term is a category.
These demonstrate directions for probabilistic modeling and unsupervised estimation.
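A toy sketch of the alignment-based idea: each hypothesis term is scored by its best-supporting text term, and the per-term scores are multiplied. The probability table and smoothing value are invented stand-ins (the published model estimates lexical probabilities from co-occurrence statistics):

```python
from math import prod

# Invented P(u entailed | v) estimates for a few word pairs.
LEXICAL_P = {("public", "ipo"): 0.8, ("goes", "files"): 0.3}

def term_prob(u: str, t_terms: list) -> float:
    """Best supporting text term for hypothesis term u (identity counts fully)."""
    return max(1.0 if u == v else LEXICAL_P.get((u, v), 0.05) for v in t_terms)

def entailment_prob(t: str, h: str) -> float:
    t_terms = t.lower().split()
    return prod(term_prob(u, t_terms) for u in h.lower().split())

print(entailment_prob("google files for its ipo", "google goes public"))  # 0.24
```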

34 Manual Syntactic Transformations – Example: ‘X prevent Y’
Sentence: “Sunscreen, which prevents moles and sunburns, ...”
(Diagram: dependency trees showing how the relative-clause and conjunction constructions – “N1, which prevents N2 and N3” – are transformed to match the template X prevent Y, via subj/obj, rel, mod, and conj edges.)

35 Syntactic Variability Phenomena
Template: X activate Y

Example | Phenomenon
Y is activated by X | Passive form
X activates its companion, Y | Apposition
X activates Z and Y | Conjunction
X activates two proteins: Y and Z | Set
X, which activates Y | Relative clause
X binds and activates Y | Coordination
X activates a fragment of Y | Transparent head
X is a kinase, though it activates Y | Co-reference

36 Takeout
Promising potential for creating huge entailment knowledge bases:
- Mostly by unsupervised approaches
- Manually encoded
- Derived from lexical resources
Potential for uniform representations, such as entailment rules, for different types of semantic and world knowledge.

37 Inference Goal: infer hypothesis from text
- Match and apply available entailment knowledge
- Heuristically bridge inference gaps
Our approach: mapping language constructs (vs. semantic interpretation):
- Lexical-syntactic structures as the meaning representation – amenable to unsupervised learning
- Entailment rule transformations over syntactic trees (a toy sketch follows)
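A toy sketch of one such transformation over a simplified dependency-like structure; the tuple encoding, the rule, and the crude de-inflection are all illustrative:

```python
# A "tree" is (head_word, {relation: subtree}), greatly simplified.
def passive_to_active(tree):
    """Transformation rule: 'Y is VERB-ed by X' => 'X VERB Y'."""
    head, deps = tree
    if head.endswith("ed") and "by" in deps and "subj" in deps:
        verb = head[:-1]  # crude de-inflection, good enough for this toy
        return (verb, {"subj": deps["by"], "obj": deps["subj"]})
    return tree

passive = ("activated", {"subj": ("Y", {}), "by": ("X", {})})
print(passive_to_active(passive))
# ('activate', {'subj': ('X', {}), 'obj': ('Y', {})})
```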

38 Application: Unsupervised Relation Extraction EACL 2006

39 Relation Extraction Subfield of Information Extraction
Identify different ways of expressing a target relation. Examples: Management Succession, Birth–Death, Mergers and Acquisitions, Protein Interaction.
Traditionally performed in a supervised manner:
- Requires dozens to hundreds of examples per relation
- Examples should cover broad semantic variability – costly; feasible???
Little work on unsupervised approaches.

40 Our Goals
- An unsupervised entailment approach for relation extraction
- System evaluation
- A framework for entailment rule acquisition and matching

41 Proposed Approach
Input template (X prevent Y) → Entailment Rule Acquisition (TEASE) → templates (X prevention for Y; X treat Y; X reduce Y) → transformation rules → Syntactic Matcher → relation instances (<sunscreen, sunburns>). (A toy end-to-end sketch follows.)
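A toy end-to-end sketch of the pipeline; the template list, the regex matcher, and the corpus are stand-ins (the real system acquires templates with TEASE and matches dependency trees with transformation rules):

```python
import re

def acquire_templates(input_template: str) -> list:
    """Stand-in for TEASE output for 'X prevent Y' (cf. slide 29)."""
    return ["X prevent Y", "X reduce Y", "X treat Y"]

def template_to_regex(template: str):
    """Toy surface matcher; the real matcher works on parse trees."""
    verb = template.split()[1]
    return re.compile(rf"(?P<X>\w+) {verb}s? (?P<Y>[\w ]+)")

def extract_instances(input_template: str, corpus: list) -> set:
    instances = set()
    for pattern in map(template_to_regex, acquire_templates(input_template)):
        for sentence in corpus:
            if m := pattern.search(sentence):
                instances.add((m.group("X"), m.group("Y")))
    return instances

corpus = ["sunscreen prevents sunburns", "aspirin reduces heart attacks"]
print(extract_instances("X prevent Y", corpus))
# {('sunscreen', 'sunburns'), ('aspirin', 'heart attacks')}
```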

42 Dataset
Bunescu 2005: recognizing interactions between annotated protein pairs; 200 Medline abstracts; a gold-standard dataset of protein pairs.
Input template: X interact with Y

43 Manual Analysis - Results
93% of interacting protein pairs can be identified with lexical-syntactic templates.
Number of templates vs. recall (within that 93%):

# templates | R (%)
2 | 10
4 | 20
6 | 30
11 | 40
21 | 50
39 | 60
73 | 70
107 | 80
141 | 90
175 | 100

Frequency of syntactic phenomena (%):

Phenomenon | %
transparent head | 34
apposition | 24
set | 13
relative clause | 8
co-reference | 7
passive form | 2
coordination | -
conjunction | -

44 TEASE Output for X interact with Y
A sample of correct templates learned:
X binding to Y; X bind to Y; X Y interaction; X activate Y; X attach to Y; X stimulate Y; X interaction with Y; X couple to Y; X trap Y; interaction between X and Y; X recruit Y; X become trapped in Y; X associate with Y; X Y complex; X be linked to Y; X recognize Y; X target Y; X block Y

45 TEASE algorithm - Potential Recall on Training Set
Experiment | Recall
input | 39%
input + iterative | 49%
input + iterative + morph | 63%

Iterative – taking the top 5 ranked templates as input. Morph – recognizing morphological derivations (cf. semantic role labeling vs. matching).

46 Results for Full System
Experiment | Precision | Recall | F1
input | 62% | 18% | 0.28
input + iterative | 42% | 29% | 0.34

Error sources:
- Dependency parser and syntactic matching errors
- No recognition of morphological derivations
- TEASE’s limited precision (incorrect templates)

47 Vs. Supervised Approaches
(Chart: comparison against supervised approaches using 180 training abstracts, evaluated on mentions; cf. the Hickl & Harabagiu ACL-2006 result for QA.)

48 Query expansion based on entailment rules – may be more precise than earlier attempts for (somewhat noisy) QE


50 (Speaker note: emphasize the importance for small collections.)

51 Textual Entailment as a Framework for Investigating Semantics
In which framework should we work to model semantic phenomena?

52 Classical Approach = Interpretation
(Diagram: interpretation maps language – with its natural variability – into a meaning representation stipulated by the scholar.)
Supposedly we map language into meaning, but it is not really meaning: it is a stipulated meaning representation, and we know this has been a dangerous business. Maybe it is not the most suitable framework – maybe we would like to stay away from the red area of the diagram.
Logical forms, word senses, semantic roles, named entity types, ... – scattered tasks. Is this a feasible/suitable framework for applied semantics?

53 Textual Entailment = Text Mapping
(Diagram: textual entailment maps directly between texts; meaning is assumed by humans, over language’s natural variability.)
This is the only way to test understanding, since we do not have access to actual meaning: how do I know that you understand me over the phone? A version of the Turing test. (Entailment direction is ignored here.)

54 General Case – Inference
(Diagram: two routes from language to entailment – interpretation into a meaning representation followed by inference, vs. textual entailment at the language level.)
The interpretation approach: supposedly, it is easier to infer entailment at the meaning representation level.
Entailment mapping is the actual applied goal, but also a touchstone for understanding! Interpretation becomes a possible means.

55 Some perspectives
Issues with the interpretation approach:
- Hard to agree on a representation language
- Costly to annotate semantic representations for training
Textual entailment refers to texts:
- Texts are theory neutral
- Amenable to unsupervised learning
- A “proof is in the pudding” test

56 Opens up a framework for investigating semantic issues
Classical problems can be cast in it (linguistics): All boys are nice ⇒ All tall boys are nice.
But also: a new slant on old problems, and exposing many new ones.

57 Making sense of (implicit) senses
What is the RIGHT set of senses? Any concrete set is problematic/subjective, but WSD forces you to choose one.
A lexical entailment perspective: instead of identifying an explicitly stipulated sense of a word occurrence, identify whether a word occurrence (i.e., its implicit sense) entails another word occurrence, in context. (ACL-2006)

58 Lexical Matching for Applications
Sense equivalence:
Q: announcement of new models of chairs
T1: IKEA announced a new comfort chair
T2: MIT announced a new CS chair position
Sense entailment in substitution (note the similarity to MT):
Q: announcement of new models of furniture
T1: IKEA announced a new comfort chair
T2: MIT announced a new CS chair position

59 Synonym Substitution: Source = record, Target = disc
- “This is anyway a stunning disc, thanks to the playing of the Moscow Virtuosi with Spivakov.” – positive
- “He said computer networks would not be affected and copies of information should be made on floppy discs.” – negative
- “Before the dead soldier was placed in the ditch his personal possessions were removed, leaving one disc on the body for identification purposes.” – negative

60 Investigated Methods
- Matching: indirect vs. direct
- Learning: supervised vs. unsupervised
- Task: classification vs. ranking

61 Unsupervised Direct: kNN-ranking
Test example score: the average cosine similarity of the target example to the k most similar instances of the source word.
Rationale: positive examples of the target will be similar to some source occurrences (of the corresponding sense); negative examples won’t be similar to the source.
Rank test examples by score – a classification slant on language modeling. (A minimal sketch follows.)
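A minimal sketch of the kNN score, assuming word occurrences are already encoded as context vectors (the vectorization itself is outside this sketch):

```python
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def knn_score(target_vec, source_vecs, k=5):
    """Average cosine similarity to the k most similar source-word occurrences."""
    sims = sorted((cosine(target_vec, s) for s in source_vecs), reverse=True)
    return sum(sims[:k]) / min(k, len(sims))

# Rank target-word occurrences: high score = likely shares the source's sense.
rng = np.random.default_rng(0)
source_occurrences = [rng.random(10) for _ in range(20)]   # e.g. "record"
target_occurrences = [rng.random(10) for _ in range(5)]    # e.g. "disc"
ranked = sorted(target_occurrences,
                key=lambda t: knn_score(t, source_occurrences), reverse=True)
```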

62 Results (for synonyms): Ranking
kNN improves precision by 8–18% at up to 25% recall.
(Speaker notes: emphasize that no external information is used – no sense repository, definitions, or hierarchy; there is potential for improvement in the similarity assessment.)

63 Other Projected and New Problems
- Named entity classification – by any textual type: Which pickup trucks are produced by Mitsubishi? Magnum ⇒ pickup truck
- Lexical semantic relationships (e.g., WordNet): which relations contribute to entailment inference, and how?
- Semantic role mapping (vs. labeling); recognizing transparent heads
- Topical entailment – entailing textually defined topics

64 Textual Entailment as Goal
The essence of our proposal:
- Formulate various semantic problems as entailment tasks
- Base applied inference on entailment “engines” and knowledge bases
- Interpretation and mapping methods may compete
Open question: which inferences can be represented at the language level, and which require logical or specialized representation and inference (temporal, spatial, mathematical, ...)?

65 Meeting the knowledge challenge – by a coordinated effort?
A vast amount of “entailment rules” is needed.
Speculation: is it possible to mount a public effort for knowledge acquisition?
- Simple, uniform representations
- Assuming mostly automatic acquisition (millions of rules?)
- Human Genome Project analogy
Preliminary: the RTE-3 Resources Pool at ACLWiki.

66 Textual Entailment ≈ Human Reading Comprehension
From a children’s English learning book (Sela and Greenberg):
Reference text: “...The Bermuda Triangle lies in the Atlantic Ocean, off the coast of Florida. ...”
Hypothesis (True/False?): The Bermuda Triangle is near the United States.
The common method for testing human reading comprehension is to test entailment capability – either directly or via QA. The difficulty is the variability between question and text; the knowledge needed is that Florida is in the US.

67 Optimistic Conclusions: Textual Entailment…
...is a promising framework for applied semantics:
- Defines new semantic problems to work on
- May be modeled probabilistically
- Has appealing potential for knowledge acquisition
Thank you!

