TimeBank Status
Status of TimeML annotation for the ULA project
James Pustejovsky and Marc Verhagen
Brandeis University

TimeML, TimeBank and TempEval

TimeML
Annotation language for times, events and the links between them
A TimeML annotation is a graph with events and times as the nodes and temporal links as the edges
  –Any eventuality in a document
  –Some states are annotated, some aren’t
  –Time expressions

TimeML Links
  –Subordinating relations
  –Perception, intensional states and actions, reporting, modal contexts
      He saw an explosion
      The meeting was canceled
      I expect no improvements
  –Temporal relations, using a set of relation types based on the interval algebra of James Allen
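The interval algebra mentioned above distinguishes thirteen basic relations between two intervals. The sketch below classifies a pair of intervals using the standard algebra names; it assumes intervals are given as numeric (start, end) pairs, and the names are not TimeML's own relation vocabulary.

```python
def allen_relation(a, b):
    """Return the Allen relation holding from interval a to interval b.

    Assumption: each interval is a (start, end) pair with start < end.
    """
    (a1, a2), (b1, b2) = a, b
    if a2 < b1:
        return "before"
    if b2 < a1:
        return "after"
    if a2 == b1:
        return "meets"
    if b2 == a1:
        return "met-by"
    if a1 == b1 and a2 == b2:
        return "equal"
    if a1 == b1:
        return "starts" if a2 < b2 else "started-by"
    if a2 == b2:
        return "finishes" if a1 > b1 else "finished-by"
    if b1 < a1 and a2 < b2:
        return "during"
    if a1 < b1 and b2 < a2:
        return "contains"
    # Only the two partial-overlap cases remain at this point.
    return "overlaps" if a1 < b1 else "overlapped-by"
```

For example, `allen_relation((1, 3), (2, 4))` gives `"overlaps"`.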

TimeML Example
The Soviet Union said today it had sent an envoy to the Middle East.
<MAKEINSTANCE eventID="e12" eiid="ei12" tense="PAST" aspect="NONE" pos="VERB"/>
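As a rough illustration of how such a tag can be read programmatically, the sketch below parses the MAKEINSTANCE element from the example with Python's standard XML library. It handles only this single tag; the rest of the annotated sentence is not shown in the slide and is not reconstructed here.

```python
import xml.etree.ElementTree as ET

# The tag exactly as it appears in the example slide.
tag = '<MAKEINSTANCE eventID="e12" eiid="ei12" tense="PAST" aspect="NONE" pos="VERB"/>'

instance = ET.fromstring(tag)
attrs = dict(instance.attrib)

# The event instance points back to its event through eventID.
print(attrs["eventID"], attrs["tense"])  # prints: e12 PAST
```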

TimeBank
Annotation in conjunction with emerging specifications, guidelines and annotation tools
  –Annotated corpus as proof of concept for the TimeML language
  –Dynamic specifications
  –Experimental tools (Tango, TimeBank browser)
Inline XML
Available for free through the LDC

TimeBank Issues (1)
Small corpus (60K tokens)
  –Too small to be useful for machine learning
  –Small overlap with PropBank and NomBank
Slow annotation process
  –No automatic pre-processing
  –No use of previous structure
Informal quality control
  –No dual annotation, no 90% rule
  –Guidelines incomplete and not enforced

TimeBank Issues (2)
Lowish inter-annotator agreement
  –Annotators do not create the same TLINKs
  –If they do, they only agree 77% of the time
Anecdotal evidence of inconsistencies in annotation
  –32 documents of TimeBank version 1.1 were inconsistent
Annotators do not create the same TLINKs (regardless of relation type)
Inline XML inhibits interoperability

TempEval
SemEval 2007 workshop
Three subtasks
  –Task A: Event-Time in same sentence
  –Task B: Event-DCT (document creation time)
  –Task C: Main events in consecutive sentences
TimeML Light for TempEval corpus
  –Limited set of relations, defined as disjunctions over TimeML relations
  –before, after, overlap, before-or-overlap, overlap-or-after, vague

Annotator GUI - Task C

Judge GUI - Tasks A and B

TempEval Advantages
Consistency in annotation
  –Data for each task prepared automatically
  –All annotators add the same TLINKs
Discrete tasks are simple (in some sense)
Easy pair-wise evaluation
Much faster annotation
  –About 10 times faster than for TimeBank

TempEval Issues (1)
Still low inter-annotator agreement
  –Task A: 69%
  –Task B: 74%
  –Task C: 65%
Choice of relations
Need more than three tasks
Ranking of tasks would be useful

TempEval Issues (2)
Inconsistencies still possible
  –Task B
      walk e7 BEFORE DCT
      talk e8 AFTER DCT
  –Task C
      walk e7 SIMULTANEOUS talk e8
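A clash like the one between the Task B and Task C judgments above can be detected mechanically. The following is a minimal sketch, assuming events are only ever BEFORE or AFTER the DCT and that event pairs are labeled BEFORE, AFTER or SIMULTANEOUS; the function name and data layout are hypothetical, and other relations (such as overlap) are simply ignored.

```python
def find_conflicts(dct_links, event_links):
    """Return event-event links that contradict the event-DCT links.

    dct_links:   {event_id: "BEFORE" | "AFTER"}, relation of the event to DCT
    event_links: list of (event1, relation, event2) triples
    """
    conflicts = []
    for e1, rel, e2 in event_links:
        r1, r2 = dct_links.get(e1), dct_links.get(e2)
        if r1 is None or r2 is None:
            continue  # nothing to compare against
        if rel == "SIMULTANEOUS" and r1 != r2:
            conflicts.append((e1, rel, e2))
        elif rel == "BEFORE" and r1 == "AFTER" and r2 == "BEFORE":
            conflicts.append((e1, rel, e2))
        elif rel == "AFTER" and r1 == "BEFORE" and r2 == "AFTER":
            conflicts.append((e1, rel, e2))
    return conflicts


# The example from the slide: e7 (walk) and e8 (talk).
task_b = {"e7": "BEFORE", "e8": "AFTER"}
task_c = [("e7", "SIMULTANEOUS", "e8")]
print(find_conflicts(task_b, task_c))
```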

Task Decomposition

Decomposition
Annotation as unstructured task is complex
  –Leaves a lot of freedom to annotators
  –Creation of guidelines is hard
Split into subtasks
  –Annotation is faster on subtasks
  –Tasks can be evaluated separately which has advantages for automatic tagging
  –Guidelines for each task
  –Structures workflow

Annotation Tasks
1. anchoring a nominal event to a time expression in its immediate context
     the April blizzard
2. anchoring a verbal event to a time expression that is governed by the event (a temporal adjunct)
     we had lift-off at 8pm
3. ordering consecutive events in a sentence
     he walked over thinking about the consequences
4. determining the temporal relation between two dates

Annotation Tasks
5. ordering events that occur in syntactic subordination relations
  –event subject with governing verb event
     the massive explosion shook the building
  –verbal event with object event
     they observed the election
  –reporting event with subordinated event
     the witness said it happened too fast
  –perception event with subordinated event
     she heard an explosion
  –an intensional process or state with subordinated event
     I want to sleep for a week

Annotation Tasks
6. ordering events in coordinations
     walking and talking
7. anchoring an event to the document creation time (can be split up according to the event's class)
8. ordering two main non-reporting events in consecutive sentences
     John fell after the marathon. He got hurt.
9. ordering two arguments in a discourse relation
     I am resting because I just lifted a barrel of rum.

Counts
Task  Description                                  Count
1     Nominal event to time expression                 1
2     Verbal event to time expression                 13
3     Consecutive events in sentence                  61
4     Temporal relation between two dates              6
5     Event subject with governing verbal event        6
      Verbal event with event object                  12
      Reporting event with subordinated event         14
      Perception event with subordinated event         0
      Intensional event with subordinated event       18
6     Events in coordinations                          2
7     Event with document creation time              104
8     Two main non-reporting events                   35
9     Two arguments in discourse relation              9
(Measured over two TimeBank documents, ABC and ABC , with 104 events and 13 time expressions)

The 90% Rule
Used in OntoNotes and PropBank
Reshuffle senses for a word if inter-annotator agreement (IAA) < 90%, mark word if IAA remains low
Not possible for us since we cannot discard a relation if IAA is too low
But this can be done on a task-by-task basis
Try to pick relation sets for each task with high IAA in mind
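Applying the rule task by task amounts to computing agreement per task and flagging the tasks that fall short. The sketch below assumes IAA is measured as simple observed agreement between two annotators over the same items; the slides do not say which agreement measure is used, and the function names and data layout are hypothetical.

```python
def agreement(labels_a, labels_b):
    """Fraction of items on which two annotators chose the same label."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    same = sum(x == y for x, y in zip(labels_a, labels_b))
    return same / len(labels_a)


def tasks_below_threshold(annotations, threshold=0.9):
    """Return the names of tasks whose agreement is under the threshold.

    annotations: {task_name: (labels_from_annotator_a, labels_from_annotator_b)}
    """
    return sorted(
        task
        for task, (a, b) in annotations.items()
        if agreement(a, b) < threshold
    )
```

A task flagged this way is a candidate for a coarser relation set rather than for being dropped.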

Relation Sets
Allow different relation sets for sub tasks
  –Time-Event in noun phrase could use specific relations
  –Event-Event in conjunctions uses more vague TempEval-like relations
Restriction: each relation in a relation set can be mapped to a disjunction of TimeML relations
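The restriction can be pictured as a table from each task-level relation to a set of TimeML relations. In the sketch below the TimeML TLINK relation names are the standard ones, but the particular grouping into before, after and overlap is an illustrative assumption, not the mapping used by the authors.

```python
TIMEML = frozenset({
    "BEFORE", "AFTER", "INCLUDES", "IS_INCLUDED", "DURING", "DURING_INV",
    "SIMULTANEOUS", "IAFTER", "IBEFORE", "IDENTITY",
    "BEGINS", "ENDS", "BEGUN_BY", "ENDED_BY",
})

# Assumed grouping, for illustration only.
BEFORE = frozenset({"BEFORE", "IBEFORE"})
AFTER = frozenset({"AFTER", "IAFTER"})
OVERLAP = TIMEML - BEFORE - AFTER

TEMPEVAL_LIKE = {
    "before": BEFORE,
    "after": AFTER,
    "overlap": OVERLAP,
    "before-or-overlap": BEFORE | OVERLAP,
    "overlap-or-after": OVERLAP | AFTER,
    "vague": TIMEML,
}


def is_valid_relation_set(relation_set):
    """Every relation must map to a non-empty disjunction of TimeML relations."""
    return all(d and d <= TIMEML for d in relation_set.values())


def compatible(coarse, fine, relation_set=TEMPEVAL_LIKE):
    """True if the TimeML relation `fine` is one disjunct of `coarse`."""
    return fine in relation_set[coarse]
```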

Composition
Collect all TLINKs and check for inconsistencies
  –We know there are no task internal inconsistencies
  –Semi-automatically resolve conflicts
Some tasks have higher IAA and precision
  –Constraint propagation (aka temporal closure)
Global annotation with a graphical tool
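Constraint propagation can be illustrated in a much reduced form. The sketch below handles only a strict BEFORE relation between events: it derives every ordering that follows by transitivity and reports an inconsistency when some event ends up before itself. Full temporal closure works over the whole relation set with a composition table, which is not attempted here.

```python
def close_before(pairs):
    """Transitive closure of a set of (earlier, later) event pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                # a BEFORE b and b BEFORE d implies a BEFORE d.
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def is_consistent(pairs):
    """A set of BEFORE links is inconsistent if it implies x BEFORE x."""
    return all(a != b for a, b in close_before(pairs))


links = {("e1", "e2"), ("e2", "e3")}
print(("e1", "e3") in close_before(links))  # prints: True
```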

TBox

Connecting Subgraphs

Layered Annotation

Using Syntax
Syntactic definition of most tasks
Use TreeBank annotation
Allows automatic creation of tasks using scripts that traverse the tree
PP inside VP with event verb
  –(wsj_0032 and wsj_0135)
  –is scheduled VG [to expire] PP [at the end of November]
  –also said it VG [expects to post] NP [sales] PP [in the current fiscal year]
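A tree-traversing script of the kind described might look like the sketch below. It assumes parse trees are represented as nested tuples of the form (label, child, ...), with plain strings as words; the bracketing of the "is scheduled to expire" example is hand-built for illustration and is not the actual Treebank parse of wsj_0032 or wsj_0135.

```python
def leaves(tree):
    """Return the words under a node, left to right."""
    if isinstance(tree, str):
        return [tree]
    return [word for child in tree[1:] for word in leaves(child)]


def pp_in_vp(tree, inside_vp=False):
    """Yield the text of each outermost PP that is dominated by a VP."""
    if isinstance(tree, str):
        return
    label, *children = tree
    if label == "PP" and inside_vp:
        yield " ".join(leaves(tree))
        return  # do not report PPs nested inside this one
    for child in children:
        yield from pp_in_vp(child, inside_vp or label == "VP")


# Hand-built tree for: is scheduled to expire at the end of November
pp = ("PP", ("IN", "at"),
      ("NP", ("NP", ("DT", "the"), ("NN", "end")),
             ("PP", ("IN", "of"), ("NP", ("NNP", "November")))))
tree = ("VP", ("VBZ", "is"),
        ("VP", ("VBN", "scheduled"),
         ("S", ("VP", ("TO", "to"),
                ("VP", ("VB", "expire"), pp)))))

print(list(pp_in_vp(tree)))
```

Each PP found this way, paired with the event verb that governs it, would become one candidate item for the corresponding annotation task.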

Argument Relations
NomBank support verbs
  –Give a demonstration
Argument relation between two events
  –Sometimes indicates that there is an SLINK
PropBank modifier ARGM-TMP
  –ARGM-TMP usually is a TIMEX3
  –TLINK with the head of the ARGM-TMP
Discourse Treebank args
  –The guest ran away because dinner was served late

Last Words