TimeBank Status: Status of TimeML annotation for the ULA project. James Pustejovsky and Marc Verhagen, Brandeis University.


1 TimeBank Status
Status of TimeML annotation for the ULA project
James Pustejovsky and Marc Verhagen
Brandeis University

2 TimeML, TimeBank and TempEval

3 TimeML
Annotation language for times, events and the links between them
A TimeML annotation is a graph with events and times as the nodes and temporal links as the edges
–Any eventuality in a document
–Some states are annotated, some aren’t
–Time expressions
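
The graph view described on this slide can be sketched as a small data structure. This is a minimal illustration, not the TimeML tooling itself: the class name `TemporalGraph`, the node kinds, and the identifiers `e1` and `t1` are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class TemporalGraph:
    """Events and times are nodes; temporal links are labelled edges."""
    nodes: dict = field(default_factory=dict)  # node id -> "EVENT" or "TIME"
    links: list = field(default_factory=list)  # (source id, relation, target id)

    def add_node(self, node_id, kind):
        if kind not in ("EVENT", "TIME"):
            raise ValueError(f"unknown node kind: {kind}")
        self.nodes[node_id] = kind

    def add_link(self, source, relation, target):
        # A link may only connect nodes that are already in the graph.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both ends of a link must be existing nodes")
        self.links.append((source, relation, target))


graph = TemporalGraph()
graph.add_node("e1", "EVENT")  # hypothetical event id
graph.add_node("t1", "TIME")   # hypothetical time expression id
graph.add_link("e1", "BEFORE", "t1")
```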

4 TimeML
Links
–Subordinating relations
–Perception, intentional states and actions, reporting, modal contexts
   He saw an explosion
   The meeting was canceled
   I expect no improvements
–Temporal relations, using a set of relation types based on the interval algebra of James Allen
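
To make the reference to Allen's interval algebra concrete, here is a small sketch that classifies two intervals into one of the thirteen Allen relations. It uses Allen's own relation names rather than the TimeML relation type names, and it assumes intervals are `(start, end)` pairs of numbers with `start < end`.

```python
def allen_relation(a, b):
    """Return the Allen relation that holds from interval a to interval b."""
    (a1, a2), (b1, b2) = a, b
    if a2 < b1:
        return "before"
    if b2 < a1:
        return "after"
    if a2 == b1:
        return "meets"
    if b2 == a1:
        return "met-by"
    if a1 == b1 and a2 == b2:
        return "equals"
    if a1 == b1:
        return "starts" if a2 < b2 else "started-by"
    if a2 == b2:
        return "finishes" if a1 > b1 else "finished-by"
    if b1 < a1 and a2 < b2:
        return "during"
    if a1 < b1 and b2 < a2:
        return "contains"
    # Only proper partial overlap is left at this point.
    return "overlaps" if a1 < b1 else "overlapped-by"
```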

5 TimeML Example
The Soviet Union said today it had sent an envoy to the Middle East.
<MAKEINSTANCE eventID="e12" eiid="ei12" tense="PAST" aspect="NONE" pos="VERB"/>
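
Because the tag is plain XML, its attributes can be read with a standard XML parser. The sketch below uses Python's `xml.etree.ElementTree` on the single `MAKEINSTANCE` tag from this slide; the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# The tag exactly as it appears on the slide.
tag = '<MAKEINSTANCE eventID="e12" eiid="ei12" tense="PAST" aspect="NONE" pos="VERB"/>'

instance = ET.fromstring(tag)
attributes = dict(instance.attrib)

# eventID points to the event; eiid names this particular instance of it.
print(instance.tag, attributes["eventID"], attributes["eiid"], attributes["tense"])
```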

6 TimeBank
Annotation in conjunction with emerging specifications, guidelines and annotation tools
–Annotated corpus as proof of concept for the TimeML language
–Dynamic specifications
–Experimental tools (tango, timebank browser)
Inline XML
Available for free through the LDC
http://timeml.org

7 TimeBank Issues (1)
Small corpus (60K tokens)
–Too small to be useful for machine learning
–Small overlap with PropBank and NomBank
Slow annotation process
–No automatic pre-processing
–No use of previous structure
Informal quality control
–No dual annotation, no 90% rule
–Guidelines incomplete and not enforced

8 TimeBank Issues (2)
Lowish inter-annotator agreement
–Annotators do not create the same TLINKs
–If they do, they only agree 77% of the time
Anecdotal evidence of inconsistencies in annotation
–32 documents of TimeBank version 1.1 were inconsistent
Annotators do not create the same TLINKs (regardless of relation type)
Inline XML inhibits interoperability

9 TempEval
SemEval 2007 workshop
Three subtasks
–Task A: Event-Time in same sentence
–Task B: Event-DCT (document creation time)
–Task C: Main events in consecutive sentences
TimeML Light for TempEval corpus
–Limited set of relations, defined as disjunctions over TimeML relations
–before, after, overlap, before-or-overlap, overlap-or-after, vague
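
One way to picture "relations defined as disjunctions over TimeML relations" is to treat each TempEval label as a set of fine-grained relations. The sketch below is an assumption-laden illustration: the list of TimeML relation types is taken from the TimeML specification as best recalled, and the grouping into before-like, after-like and overlap-like relations is a guess at the mapping, not the official TempEval definition.

```python
# Assumed inventory of TimeML TLINK relation types.
TIMEML_RELATIONS = frozenset({
    "BEFORE", "AFTER", "INCLUDES", "IS_INCLUDED", "DURING", "DURING_INV",
    "SIMULTANEOUS", "IAFTER", "IBEFORE", "IDENTITY",
    "BEGINS", "ENDS", "BEGUN_BY", "ENDED_BY",
})

# Hypothetical grouping; the real TempEval mapping may differ.
BEFORE_LIKE = frozenset({"BEFORE", "IBEFORE"})
AFTER_LIKE = frozenset({"AFTER", "IAFTER"})
OVERLAP_LIKE = TIMEML_RELATIONS - BEFORE_LIKE - AFTER_LIKE

TEMPEVAL_RELATIONS = {
    "before": BEFORE_LIKE,
    "after": AFTER_LIKE,
    "overlap": OVERLAP_LIKE,
    "before-or-overlap": BEFORE_LIKE | OVERLAP_LIKE,
    "overlap-or-after": OVERLAP_LIKE | AFTER_LIKE,
    "vague": TIMEML_RELATIONS,
}


def is_valid_relation_set(relation_set):
    """Every label must map to a non-empty disjunction of TimeML relations."""
    return all(members and members <= TIMEML_RELATIONS
               for members in relation_set.values())


def compatible(label_a, label_b):
    """Two coarse labels are compatible if their disjunctions share a relation."""
    return bool(TEMPEVAL_RELATIONS[label_a] & TEMPEVAL_RELATIONS[label_b])
```

Under this reading, `compatible("before", "before-or-overlap")` holds while `compatible("before", "after")` does not, which is what makes pair-wise evaluation over the coarse labels straightforward.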

10 Annotator GUI - Task C

11 Judge GUI - Tasks A and B

12 TempEval Advantages
Consistency in annotation
–Data for each task prepared automatically
–All annotators add the same TLINKs
Discrete tasks are simple (in some sense)
Easy pair-wise evaluation
Much faster annotation
–About 10 times faster than for TimeBank

13 TempEval Issues (1)
Still low inter-annotator agreement
–Task A: 69%
–Task B: 74%
–Task C: 65%
Choice of relations
Need more than three tasks
Ranking of tasks would be useful
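
The slides do not say which agreement measure produced these percentages. As a minimal sketch, assuming simple observed agreement (the fraction of shared pairs on which two annotators chose the same label), the computation could look like this. The annotator data is invented for illustration.

```python
def observed_agreement(labels_a, labels_b):
    """Fraction of pairs annotated by both annotators that got the same label."""
    shared = labels_a.keys() & labels_b.keys()
    if not shared:
        raise ValueError("the annotators share no pairs")
    matches = sum(1 for pair in shared if labels_a[pair] == labels_b[pair])
    return matches / len(shared)


# Hypothetical Task B style data: event id -> relation to the DCT.
annotator_1 = {"e1": "before", "e2": "overlap", "e3": "after", "e4": "before"}
annotator_2 = {"e1": "before", "e2": "before-or-overlap", "e3": "after", "e4": "before"}

agreement = observed_agreement(annotator_1, annotator_2)  # 3 of 4 pairs match
```

Because every annotator labels the same automatically prepared pairs in TempEval, the set of shared pairs is the full task; in TimeBank-style annotation the shared set can be much smaller than either annotator's set of links.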

14 TempEval Issues (2)
Inconsistencies still possible
–Task B
   walk e7 BEFORE DCT
   talk e8 AFTER DCT
–Task C
   walk e7 SIMULTANEOUS talk e8
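
The conflict on this slide can be detected mechanically: if one event is before the DCT and another is after it, the first must precede the second, so a SIMULTANEOUS link between them is contradictory. The sketch below is a deliberately narrow check with hypothetical function names; it compares labels exactly and does not handle disjunctive labels such as before-or-overlap.

```python
def implied_order(first_to_dct, second_to_dct):
    """Derive an event-event relation from two event-DCT relations, if possible."""
    if first_to_dct == "BEFORE" and second_to_dct == "AFTER":
        return "BEFORE"
    if first_to_dct == "AFTER" and second_to_dct == "BEFORE":
        return "AFTER"
    return None  # nothing can be concluded from the DCT links alone


def find_conflicts(task_b, task_c):
    """Return Task C links that contradict what the Task B links imply."""
    conflicts = []
    for (first, second), stated in task_c.items():
        implied = implied_order(task_b.get(first), task_b.get(second))
        if implied is not None and implied != stated:
            conflicts.append((first, second, stated, implied))
    return conflicts


task_b = {"e7": "BEFORE", "e8": "AFTER"}      # walk and talk against the DCT
task_c = {("e7", "e8"): "SIMULTANEOUS"}       # walk against talk

conflicts = find_conflicts(task_b, task_c)
```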

15 Task Decomposition

16 Decomposition
Annotation as unstructured task is complex
–Leaves a lot of freedom to annotators
–Creation of guidelines is hard
Split into subtasks
–Annotation is faster on subtasks
–Tasks can be evaluated separately, which has advantages for automatic tagging
–Guidelines for each task
–Structures workflow

17 Annotation Tasks
1. anchoring a nominal event to a time expression in its immediate context
   the April blizzard
2. anchoring a verbal event to a time expression that is governed by the event (a temporal adjunct)
   we had lift-off at 8pm
3. ordering consecutive events in a sentence
   he walked over thinking about the consequences
4. determining the temporal relation between two dates

18 Annotation Tasks
5. ordering events that occur in syntactic subordination relations
–event subject with governing verb event
   the massive explosion shook the building
–verbal event with object event
   they observed the election
–reporting event with subordinated event
   the witness said it happened too fast
–perception event with subordinated event
   she heard an explosion
–an intentional process or state with subordinated event
   I want to sleep for a week

19 Annotation Tasks
6. ordering events in coordinations
   walking and talking
7. anchoring an event to the document creation time (can be split up according to the event's class)
8. ordering two main non-reporting events in consecutive sentences
   John fell after the marathon. He got hurt.
9. ordering two arguments in a discourse relation
   I am resting because I just lifted a barrel of rum.

20 Counts
Task  Description                                  Count
1     Nominal event to time expression             1
2     Verbal event to time expression              13
3     Consecutive events in sentence               61
4     Temporal relation between two dates          6
5     Event subject with governing verbal event    6
      Verbal event with event object               12
      Reporting event with subordinated event      14
      Perception event with subordinated event     0
      Intensional event with subordinated event    18
6     Events in coordinations                      2
7     Event with document creation time            104
8     Two main non-reporting events                35
9     Two arguments in discourse relation          9
(Measured over two TimeBank documents, ABC19980120.1830.0957 and ABC19980108.1830.0711, with 104 events and 13 time expressions)

21 The 90% Rule
Used in OntoNotes and PropBank
Reshuffle senses for a word if inter-annotator agreement (IAA) < 90%, mark word if IAA remains low
Not possible for us since we cannot discard a relation if IAA is too low
But this can be done on a task-by-task basis
Try to pick relation sets for each task with high IAA in mind

22 Relation Sets
Allow different relation sets for subtasks
–Time-Event in noun phrase could use specific relations
–Event-Event in conjunctions uses more vague TempEval-like relations
Restriction: each relation in a relation set can be mapped to a disjunction of TimeML relations

23 Composition
Collect all TLINKs and check for inconsistencies
–We know there are no task-internal inconsistencies
–Semi-automatically resolve conflicts
   Some tasks have higher IAA and precision
–Constraint propagation (aka temporal closure)
Global annotation with a graphical tool
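
Constraint propagation can be illustrated on the simplest case. The sketch below computes the transitive closure of BEFORE links only and reports an inconsistency when some node ends up before itself. This is a simplification chosen for the example: full temporal closure works over the whole relation set with a composition table, which is not shown here.

```python
def before_closure(before_pairs):
    """Add every (a, c) that follows from (a, b) and (b, c) until nothing changes."""
    closure = set(before_pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def is_consistent(closure):
    """A set of BEFORE links is inconsistent if any node precedes itself."""
    return all(a != b for a, b in closure)


# Links collected from different subtasks (hypothetical ids).
collected = {("e1", "e2"), ("e2", "e3")}
closed = before_closure(collected)

# Adding a link that contradicts the inferred order creates a cycle.
conflicting = before_closure(collected | {("e3", "e1")})
```

Here `closed` gains the inferred link `("e1", "e3")`, and `is_consistent(conflicting)` is false, which is the kind of cross-task conflict that would then need semi-automatic resolution.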

24 TBox

25 Connecting Subgraphs

26 Layered Annotation

27 Using Syntax
Syntactic definition of most tasks
Use TreeBank annotation
Allows automatic creation of tasks using scripts that traverse the tree
PP inside VP with event verb
–(wsj_0032 and wsj_0135)
–is scheduled VG [to expire] PP [at the end of November]
–also said it VG [expects to post] NP [sales] PP [in the current fiscal year]
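
A tree-traversing script of the kind mentioned here could be sketched as follows. The tree is a hand-built, simplified bracketing of the first example, not the actual TreeBank parse of wsj_0032 or wsj_0135, and the nested-tuple representation `(label, child, ...)` is an assumption chosen to keep the example self-contained.

```python
def leaves(tree):
    """Return the words under a node, left to right."""
    if isinstance(tree, str):
        return [tree]
    words = []
    for child in tree[1:]:
        words.extend(leaves(child))
    return words


def find_pp_in_vp(tree, inside_vp=False):
    """Yield the text of each outermost PP that is dominated by a VP."""
    if isinstance(tree, str):
        return
    label, *children = tree
    if label == "PP" and inside_vp:
        yield " ".join(leaves(tree))
        return  # do not descend into PPs nested inside this one
    for child in children:
        yield from find_pp_in_vp(child, inside_vp or label == "VP")


# Hypothetical simplified parse of "is scheduled to expire at the end of November".
pp = ("PP", ("IN", "at"),
      ("NP", ("NP", ("DT", "the"), ("NN", "end")),
             ("PP", ("IN", "of"), ("NP", ("NNP", "November")))))
inner_vp = ("VP", ("VB", "expire"), pp)
tree = ("S", ("VP", ("VBZ", "is"),
              ("VP", ("VBN", "scheduled"),
               ("S", ("VP", ("TO", "to"), inner_vp)))))

candidates = list(find_pp_in_vp(tree))
```

Each candidate PP, paired with the event verb of its VP, would become one item in an event-to-time-expression annotation task.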

28 Argument Relations
NomBank support verbs
–Give a demonstration
Argument relation between two events
–Sometimes indicates that there is an SLINK
PropBank modifier ARGM-TMP
–ARGM-TMP usually is a TIMEX3
–TLINK with the head of the ARGM-TMP
Discourse Treebank args
–The guest ran away because dinner was served late

29 Last Words

