FATE: a FrameNet Annotated corpus for Textual Entailment Marco Pennacchiotti, Aljoscha Burchardt Computerlinguistik Saarland University, Germany LREC 2008,

Presentation transcript:

FATE: a FrameNet Annotated corpus for Textual Entailment
Marco Pennacchiotti, Aljoscha Burchardt
Computerlinguistik, Saarland University, Germany
LREC 2008, Marrakech, 28 May 2008
SALSA II - The Saarbrücken Lexical Semantics Acquisition Project

Summary
– FrameNet and Textual Entailment
– FATE annotation schema
– Annotation examples and statistics
– Conclusions
28/05/2008 | FATE - Marco Pennacchiotti | 2 / 17

Frame Semantics
Frame: conceptual structure modeling a prototypical situation
Frame Elements (FE): participants of the situation
Frame Evoking Elements (FEE): predicates evoking the situation
[Fillmore 1976, 2003]
FrameNet (Berkeley project)
– Database of frames for the core lexicon of English
– 800 frames, lemmas, annotated sentences
Predicate-argument level normalizations
– “Evelyn spoke about her past”
– “Evelyn’s statement about her past”
– Both: STATEMENT(SPEAKER: Evelyn; TOPIC: her past)
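
As an illustration of the normalization above, here is a minimal sketch of how a frame instance could be represented in code. The class `FrameAnnotation` and the helper `annotate` are hypothetical names introduced only for this example; they are not part of FrameNet or of the FATE release.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrameAnnotation:
    """One frame instance: frame name, evoking word (FEE), and role fillers (FEs)."""
    frame: str
    fee: str
    roles: tuple  # sorted tuple of (role_name, filler) pairs

    def normalized(self):
        """Predicate-argument view that ignores which word evoked the frame."""
        return (self.frame, self.roles)


def annotate(frame, fee, **roles):
    # Hypothetical helper: build an annotation from keyword role fillers.
    return FrameAnnotation(frame, fee, tuple(sorted(roles.items())))


# The verbal and the nominal variant from the slide.
verbal = annotate("STATEMENT", "spoke", SPEAKER="Evelyn", TOPIC="her past")
nominal = annotate("STATEMENT", "statement", SPEAKER="Evelyn", TOPIC="her past")

same_structure = verbal.normalized() == nominal.normalized()
print(same_structure)
```

The two annotations differ only in their FEE ("spoke" vs. "statement"), so they compare equal once the evoking word is set aside.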

Textual Entailment (TE)
Given two text fragments, the Text T and the Hypothesis H, T entails H if the meaning of H can be inferred from the meaning of T, as would typically be interpreted by people [Dagan 2005]
T: “Yahoo has recently acquired Overture”
H: “Yahoo owns Overture”
T → H
Recognizing Textual Entailment (RTE)
– Recognize whether entailment holds for a given (T,H) pair
– Models core inferences of many NLP applications (QA, IE, MT,…)
RTE Challenges [Dagan et al., 2005; Giampiccolo et al., 2007]
– Compare systems for RTE
– Corpus: 800 training pairs, 800 test pairs, evenly split into + and - pairs

Predicate-argument and RTE
Predicate-level inference plays a relevant role in TE (20% of positive examples in RTE-2 [Garoufi, 2007])
T: “An avalanche has struck a popular skiing resort in Austria, killing at least 11 people.”
H: “Humans died in an avalanche.”
DEATH(PROTAGONIST: 11 people / humans; CAUSE: avalanche / avalanche)
Implementation gap:
– [Burchardt et al., 2007]: FrameNet system comparable to lexical overlap
– [Hickl et al., 2006]: PropBank-based features are not effective
– [Rana et al., 2005]: DIRT paraphrase repository does not help
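
To make the "lexical overlap" point of comparison concrete, here is a minimal sketch of a generic word-overlap score. It is an assumption about what such a baseline could look like, not the baseline used in the cited work; the function names `tokens` and `lexical_overlap` are introduced here for illustration.

```python
import re


def tokens(text):
    """Lowercased word types; no stemming or stop-word removal."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def lexical_overlap(text, hypothesis):
    """Fraction of hypothesis word types that also occur in the text."""
    h = tokens(hypothesis)
    if not h:
        return 0.0
    return len(h & tokens(text)) / len(h)


T = "An avalanche has struck a popular skiing resort in Austria, killing at least 11 people."
H = "Humans died in an avalanche."

score = lexical_overlap(T, H)
print(score)
```

On the avalanche pair, "humans" and "died" find no surface match in T, which is exactly the part of the inference that the DEATH frame is meant to capture.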

FATE corpus
FATE: a manually frame-annotated Textual Entailment corpus, to study the role of frame semantics in RTE
Reference corpus: RTE-2 test set, 800 pairs, 29,000 tokens
Frame resource: FrameNet version 1.3
Corpus format: SALSA/TIGER XML [Burchardt et al., 2006]
Pre-processing:
– Annotation on top of Collins parser syntactic analysis
– T and H are randomly reordered to avoid biases
Annotation:
– Performed by one highly experienced annotator
– Inter-annotator agreement over 5% of the corpus: FEE-agreement 82%, Frame-agreement 88%, Role-agreement 91%
– Annotation carried out using the SALTO tool
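
The following sketch shows how frame annotations stored in an XML layout of this kind might be read with the Python standard library. The embedded fragment is hypothetical and heavily simplified: the element and attribute names (`t`, `frame`, `target`, `fe`, `fenode`, `idref`) are assumptions loosely modeled on SALSA/TIGER XML and should be checked against the actual corpus files before use.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified fragment; real corpus files carry full syntactic
# graphs and more attributes than shown here.
SAMPLE = """
<s id="s1">
  <graph>
    <terminals>
      <t id="s1_1" word="Evelyn"/>
      <t id="s1_2" word="spoke"/>
      <t id="s1_3" word="about"/>
      <t id="s1_4" word="her"/>
      <t id="s1_5" word="past"/>
    </terminals>
  </graph>
  <sem>
    <frames>
      <frame id="s1_f1" name="Statement">
        <target><fenode idref="s1_2"/></target>
        <fe id="s1_f1_e1" name="Speaker"><fenode idref="s1_1"/></fe>
        <fe id="s1_f1_e2" name="Topic"><fenode idref="s1_4"/><fenode idref="s1_5"/></fe>
      </frame>
    </frames>
  </sem>
</s>
"""


def read_frames(xml_text):
    """Return a list of (frame name, FEE text, {role: filler text}) tuples."""
    sentence = ET.fromstring(xml_text.strip())
    words = {t.get("id"): t.get("word") for t in sentence.iter("t")}

    def text_of(element):
        # Join the words referenced by the element's direct <fenode> children.
        return " ".join(words[n.get("idref")] for n in element.findall("fenode"))

    result = []
    for frame in sentence.iter("frame"):
        fee = text_of(frame.find("target"))
        roles = {fe.get("name"): text_of(fe) for fe in frame.findall("fe")}
        result.append((frame.get("name"), fee, roles))
    return result


frames = read_frames(SAMPLE)
print(frames)
```

In this sketch fillers point directly at words; in a corpus annotated on top of a parser, fillers would more plausibly point at syntactic constituents, which the reader function would then have to resolve.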

FATE annotation process: an example
[Figure: Collins syntactic analysis of the example sentence]
Full-text annotation (all words considered) [Ruppenhofer, 2007]

FATE annotation process: an example
[Figure: frame and FEE annotated on top of the Collins syntactic analysis]

FATE annotation process: an example
[Figure: frame, FEE, FE and FE filler annotated on top of the Collins syntactic analysis]
Maximization principle: choose the largest constituent possible when annotating

Annotation Schema: Relevance Principle
Intuition: annotate as FEE only those words evoking a relevant situation (frame) in the sentence at hand
– Very intuitive flavor, but high agreement: 83% on a pilot set of 15 sentences
“Authorities in Brazil hold 200 people as hostage”
[Figure: frame labels LEADERSHIP, DETAIN, PEOPLE, KIDNAPPING; role labels VICTIM, PLACE, PERPETRATOR]

Annotation Schema: Span Annotation
On T of positive pairs, annotate only the fragments (spans) contributing to the inferential process
– Spans are obtained from the ARTE annotation [Garoufi, 2007]
– For negative pairs it is not straightforward to derive spans, hence we do full annotation
T: “Soon after the EZLN had returned to Chiapas, Congress approved a different version of the COCOPA Law, which did not include the autonomy clauses, claiming they were in contradiction with some constitutional rights (private property and secret voting); this was seen as a betrayal by the EZLN and other political groups.”
H: “EZLN is a political group.”

Annotation Schema: Other guidelines
Unknown frames: use an UNKNOWN frame for words evoking situations not present in the FrameNet database
Anaphora
Copula and support verbs
Modal expressions
Metaphors
Existential constructions
…

Corpus statistics
Annotated pairs: 800 (400 positive, 400 negative)
Annotated frames: 4,500
– avg. 5.6 frames per pair
– 1,600 frames in positive pairs
– 2,800 frames in negative pairs
Annotated roles: 9,500
– avg. 2.1 roles per frame
Annotation time: 230 hours
– 90 h for positive pairs (13 min/pair)
– 140 h for negative pairs (21 min/pair)
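
The averages on this slide follow from the totals; the short check below recomputes them. It uses only the figures reported above, which appear to be rounded, so small mismatches are expected.

```python
pairs = 800
positive_pairs = negative_pairs = 400
frames = 4500
roles = 9500

frames_per_pair = frames / pairs   # 5.625, reported as 5.6
roles_per_frame = roles / frames   # about 2.11, reported as 2.1

# Frames per pair by polarity: span annotation keeps positive pairs lighter.
frames_per_positive = 1600 / positive_pairs   # 4.0
frames_per_negative = 2800 / negative_pairs   # 7.0

minutes_per_positive = 90 * 60 / positive_pairs    # 13.5, reported as 13
minutes_per_negative = 140 * 60 / negative_pairs   # 21.0
total_hours = 90 + 140                             # 230

# The per-polarity frame counts are rounded: 1,600 + 2,800 = 4,400, not 4,500.
rounding_gap = frames - (1600 + 2800)

print(frames_per_pair, round(roles_per_frame, 2), minutes_per_positive, minutes_per_negative)
```

The gap between 4 frames per positive pair and 7 per negative pair is consistent with annotating only inference-relevant spans in positive pairs and the full text in negative pairs.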

FrameNet and RTE (simple case)
Syntactic normalization
– Active / Passive
EDUCATIONAL_TEACHING(STUDENT: ground soldiers / soldiers; MATERIAL: virtual reality / virtual reality)
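
A minimal sketch of how the simple case could be checked once both sides are frame-annotated: H is considered covered if T contains the same frame and each H role filler is matched by the T filler of the same role. The matching rule (word subset) and the names `filler_match` and `frame_covers` are assumptions made for this illustration, not a method described in the slides; the fillers are read as "T filler / H filler".

```python
def filler_match(t_filler, h_filler):
    """Crude filler comparison: every word of the H filler occurs in the T filler."""
    return set(h_filler.lower().split()) <= set(t_filler.lower().split())


def frame_covers(t_frame, h_frame):
    """True if T's frame has the same name and matching fillers for all H roles."""
    t_name, t_roles = t_frame
    h_name, h_roles = h_frame
    if t_name != h_name:
        return False
    return all(
        role in t_roles and filler_match(t_roles[role], filler)
        for role, filler in h_roles.items()
    )


t_frame = ("EDUCATIONAL_TEACHING",
           {"STUDENT": "ground soldiers", "MATERIAL": "virtual reality"})
h_frame = ("EDUCATIONAL_TEACHING",
           {"STUDENT": "soldiers", "MATERIAL": "virtual reality"})

covered = frame_covers(t_frame, h_frame)
reverse = frame_covers(h_frame, t_frame)
print(covered, reverse)
```

Because the comparison runs on frames and roles, it is indifferent to whether T or H was phrased in the active or the passive voice. The check is directional: "soldiers" is covered by "ground soldiers", but not the other way round.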

Implementation gap insights
(1) Resource coverage is too low
(2) Models for predicate-argument inference are weak
(3) Automatic annotation models (SRL) are not good enough to be safely used in RTE
FrameNet coverage is good:
– 373 Unknown frames (8% of total frames)
– Unknown roles: 1% of total roles
Coverage is unlikely to be a limiting factor for using FrameNet in applications

Why should you use FATE?
(1) Resource coverage is too low
(2) Models for predicate-argument inference are weak
(3) Automatic annotation models (SRL) are not good enough to be safely used in RTE
To better study predicate-argument inference in RTE
To experiment with frame-RTE models on a gold-standard corpus
To learn better SRL models, by training on FATE
Corpus is freely available on-line

Thank you! Questions?
FATE download: