Jointly Identifying Temporal Relations with Markov Logic
Katsumasa Yoshikawa †, Sebastian Riedel ‡, Masayuki Asahara †, Yuji Matsumoto †
† Nara Institute of Science and Technology, Japan
‡ University of Massachusetts, Amherst
ACL-IJCNLP, 2-7 August 2009, Suntec, Singapore
2 Outline
Background and Motivation
Related work of temporal relation identification
Proposed global approach with Markov Logic
Experimental setup and highlighted data
Summary and future work
3 Background and Motivation
Temporal Relation Identification (temporal ordering): identifying the temporal order of events and time expressions in a document
Essential work for document understanding
Example: "With the introduction of the TimeBank corpus (Pustejovsky et al., 2003), machine learning approaches to temporal ordering became possible."
[Figure: timeline from Past through Present to Future showing the events "introduction" and "became", the time expression "2003", the Document Creation Time (August 2009), and a BEFORE relation]
5 Temporal Relation Label Sets (EVENT / TIME)
Allen's Temporal Logic [Allen 1983], 13 labels:
  >  after          mi  met-by        oi  overlapped-by
  f  finishes       d   during        si  started-by
  =  equal          s   starts        c   contains
  fi finished-by    o   overlaps      m   meets
  <  before
TimeML and TimeBank [Pustejovsky et al. 2003], 11 labels:
  AFTER, IAFTER, ENDS, DURING, BEGUN_BY, SIMULTANEOUS, BEGINS, ENDED_BY, IBEFORE, BEFORE, INCLUDES
We regard temporal ordering as a classification task
The TimeBank corpus was created with TimeML
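To make the 13 Allen labels concrete, the following is a minimal sketch that names the relation holding between two intervals. The function name `allen_relation` and the representation of an interval as a `(start, end)` pair are assumptions made for this illustration; they do not come from the talk.

```python
def allen_relation(a, b):
    """Return the name of Allen's relation that holds from interval a to interval b.

    Each interval is a (start, end) pair with start < end (an assumed encoding).
    """
    (a_start, a_end), (b_start, b_end) = a, b
    if a_start >= a_end or b_start >= b_end:
        raise ValueError("intervals must satisfy start < end")
    # Disjoint or touching intervals.
    if a_end < b_start:
        return "before"
    if a_end == b_start:
        return "meets"
    if b_end < a_start:
        return "after"
    if b_end == a_start:
        return "met-by"
    # From here on the intervals share more than a single point.
    if a_start == b_start and a_end == b_end:
        return "equal"
    if a_start == b_start:
        return "starts" if a_end < b_end else "started-by"
    if a_end == b_end:
        return "finishes" if a_start > b_start else "finished-by"
    if b_start < a_start and a_end < b_end:
        return "during"
    if a_start < b_start and b_end < a_end:
        return "contains"
    # Only the two partial-overlap cases remain.
    return "overlaps" if a_start < b_start else "overlapped-by"
```

For example, `allen_relation((1, 3), (3, 5))` yields "meets", and `allen_relation((2, 3), (1, 5))` yields "during".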
6 TempEval (SemEval 2007 Task 15)
Temporal relation identification was a shared task at SemEval 2007 (TempEval)
Six temporal relation labels:
  Main labels: BEFORE, AFTER, OVERLAP
  Sub-labels: BEFORE-OR-OVERLAP, OVERLAP-OR-AFTER, VAGUE
TempEval comprises three tasks (A, B, and C)
7 Task A of TempEval
Temporal relations between events and time expressions that occur within the same sentence
Example: "With the introduction of the TimeBank corpus (Pustejovsky et al., 2003), machine learning approaches to temporal ordering became possible."
[Figure: timeline (Past, Present, Future) with the events "introduction" and "became", the time expression "2003", and the DCT (August 2009); the highlighted relation is OVERLAP]
8 Task B of TempEval
Temporal relations between events and the Document Creation Time (DCT)
Example: "With the introduction of the TimeBank corpus (Pustejovsky et al., 2003), machine learning approaches to temporal ordering became possible."
[Figure: timeline (Past, Present, Future) with the events "introduction" and "became", the time expression "2003", and the DCT (August 2009); the highlighted relation is BEFORE]
9 Task C of TempEval
Temporal relations between the main events of adjacent sentences
Example: "The TimeBank corpus was created (Pustejovsky et al., 2003). As a result, machine learning approaches to temporal ordering became possible."
[Figure: timeline (Past, Present, Future) with the events "created" and "became", the time expression "2003", and the DCT (August 2009); the highlighted relation is BEFORE]
10 Issues of the TempEval Participants
Many TempEval participants employed local machine learning approaches
A local approach considers only a single relation at a time, so it cannot take the other relations into account
Example: if Task B gives EVENT 1 BEFORE DCT and EVENT 2 AFTER DCT, then EVENT 1 should be BEFORE EVENT 2 in Task C, yet a local classifier may still predict AFTER
A global approach can be useful in such cases
13 Overview of Our Global Approach
Global approach with Markov Logic
Jointly identify the three types of relations in TempEval
Learn one global model for the three tasks
Ensure consistency among the multiple relations with hard and soft constraints based on the transition rules
14 Markov Logic [Richardson and Domingos, 2006]
A Statistical Relational Learning framework
An expressive template language for Markov Networks
Supports not only hard but also soft constraints
A Markov Logic Network (MLN) is a set of pairs (φ, w), where
  φ is a formula in first-order logic
  w is a real-valued weight
A higher weight means a stronger constraint
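For reference, the distribution that an MLN defines over possible worlds can be written as follows. This is the standard log-linear form from the Markov Logic literature; the symbols M (the set of weighted formulae), y (a possible world), C^φ (the groundings of φ), f_c^φ (the indicator that grounding c of φ is true in y), and Z (the normalization constant) are notation introduced here and do not appear on the slide.

```latex
P(\mathbf{y}) \;=\; \frac{1}{Z}\,
\exp\!\left( \sum_{(\phi, w) \in M} w \sum_{c \in C^{\phi}} f_{c}^{\phi}(\mathbf{y}) \right)
```

Each true ground formula contributes its weight to the exponent, so a larger weight w makes worlds that violate φ less probable, which matches the reading "higher weight, stronger constraint".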
15 An Example of Markov Logic Networks
Predicates:
  hasPastTense(a): event a has past tense
  beforeDCT(a): event a happens before the DCT
  before(a, b): event a happens before another event b
Ground formulae (※ e1 and e2 are events):
  ID    Weight function   Weight value   Ground formula
  (A1)  w_a(e1)            3.1           hasPastTense(e1) ⇒ beforeDCT(e1)
  (A2)  w_a(e2)           -0.9           hasPastTense(e2) ⇒ beforeDCT(e2)
  (B1)  w_b(e1,e2)         1.7           beforeDCT(e1) ∧ ¬beforeDCT(e2) ⇒ before(e1, e2)
[Figure: ground Markov network obtained by grounding; w_a(e1) links hasPastTense(e1) and beforeDCT(e1), w_a(e2) links hasPastTense(e2) and beforeDCT(e2), and w_b(e1,e2) links beforeDCT(e1), beforeDCT(e2), and before(e1,e2)]
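The three ground formulae above can be scored by brute force over all assignments to the hidden atoms. The sketch below does this with the weights from the slide. The evidence (both events in the past tense), the names `score` and `map_world`, and the exhaustive search are assumptions for illustration only; the system in the talk uses Cutting Plane Inference instead.

```python
from itertools import product

def implies(p, q):
    """Material implication p => q."""
    return (not p) or q

# Assumed evidence for this sketch: both events are in the past tense.
EVIDENCE = {"hasPastTense(e1)": True, "hasPastTense(e2)": True}
HIDDEN = ["beforeDCT(e1)", "beforeDCT(e2)", "before(e1,e2)"]

# Ground formulae from the slide: (id, weight, truth function over a world).
GROUND_FORMULAE = [
    ("A1", 3.1, lambda w: implies(w["hasPastTense(e1)"], w["beforeDCT(e1)"])),
    ("A2", -0.9, lambda w: implies(w["hasPastTense(e2)"], w["beforeDCT(e2)"])),
    ("B1", 1.7, lambda w: implies(w["beforeDCT(e1)"] and not w["beforeDCT(e2)"],
                                  w["before(e1,e2)"])),
]

def score(world):
    """Sum the weights of all ground formulae that are true in the world."""
    return sum(weight for _, weight, holds in GROUND_FORMULAE if holds(world))

def map_world():
    """Return the highest-scoring world by enumerating all hidden assignments."""
    best = None
    for values in product([False, True], repeat=len(HIDDEN)):
        world = {**EVIDENCE, **dict(zip(HIDDEN, values))}
        if best is None or score(world) > score(best):
            best = world
    return best
```

Under the assumed evidence, the best world sets beforeDCT(e1) and before(e1,e2) to true and beforeDCT(e2) to false: the negative weight on (A2) discourages beforeDCT(e2), and (B1) then favors before(e1,e2).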
16 Global Feature Representation (Predicate Definition)
relE2T(e, t, r): the relation r between an event e and a time expression t (Task A)
relDCT(e, r): the relation r between an event e and the DCT (Task B)
relE2E(e1, e2, r): the relation r between two events e1 and e2 (Task C)
relT2T(t1, t2, r): the relation r between two time expressions t1 and t2
dctOrder(t, r): the relation r between a time expression t and the DCT
[Figure: graph over EVENT (e1), EVENT (e2), TIME (t1), TIME (t2), and the DCT, with edges labeled relE2E (C), relDCT (B), relE2T (A), relT2T, and dctOrder]
17 Global Feature Representation (Transition Rules)
We jointly solve the three tasks of TempEval
We use global features called joint formulae
A joint formula is based on a transition rule
B→C: if e1 happens before the DCT and e2 happens after the DCT, then e1 is before e2 (BEFORE & AFTER ⇒ BEFORE)
C→B: if e1 happens before the DCT and e1 happens after e2, then e2 happens before the DCT (BEFORE & AFTER ⇒ BEFORE)
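The two transition rules can be sketched as plain functions that check a set of local predictions for contradictions. The function names are hypothetical, and the mirror-image cases (with BEFORE and AFTER swapped) are added here for illustration; they are not on the slide. Note also that this sketch treats the rules as hard checks, whereas the talk encodes them as weighted formulae.

```python
def infer_event_event(rel_dct_e1, rel_dct_e2):
    """B->C: derive the e1-e2 relation from each event's relation to the DCT."""
    if rel_dct_e1 == "BEFORE" and rel_dct_e2 == "AFTER":
        return "BEFORE"  # rule from the slide
    if rel_dct_e1 == "AFTER" and rel_dct_e2 == "BEFORE":
        return "AFTER"   # mirror image, added for illustration
    return None          # these rules draw no conclusion

def infer_dct_relation(rel_dct_e1, rel_e1_e2):
    """C->B: derive e2's relation to the DCT from e1's DCT relation and the e1-e2 relation."""
    if rel_dct_e1 == "BEFORE" and rel_e1_e2 == "AFTER":
        return "BEFORE"  # rule from the slide
    if rel_dct_e1 == "AFTER" and rel_e1_e2 == "BEFORE":
        return "AFTER"   # mirror image, added for illustration
    return None

def is_consistent(rel_dct_e1, rel_dct_e2, rel_e1_e2):
    """Check a Task C prediction against what the two Task B predictions imply."""
    expected = infer_event_event(rel_dct_e1, rel_dct_e2)
    return expected is None or expected == rel_e1_e2
```

For instance, `is_consistent("BEFORE", "AFTER", "AFTER")` is false, which is the kind of contradiction a purely local classifier can produce.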
18 Global Feature Representation (Templates of All Joint Formulae)
  Tasks   Joint formula (first-order logic)
  A→B     dctOrder(t1, r) & relE2T(e1, t1, r1) ⇒ relDCT(e1, r2)
  B→A     dctOrder(t1, r) & relDCT(e1, r1) ⇒ relE2T(e1, t1, r2)
  B→C     relDCT(e1, r1) & relDCT(e2, r2) ⇒ relE2E(e1, e2, r3)
  C→B     relDCT(e2, r1) & relE2E(e1, e2, r2) ⇒ relDCT(e1, r3)
  A→C     relE2T(e1, t1, r1) & relT2T(t1, t2, r2) & relE2T(e2, t2, r3) ⇒ relE2E(e1, e2, r4)
  C→A     relE2T(e2, t2, r2) & relT2T(t1, t2, r1) & relE2E(e1, e2, r3) ⇒ relE2T(e1, t1, r4)
The templates are built from events, time expressions, and relations
19 Global Feature Representation (Example Instantiations of the Joint Formulae)
Replacing the relation variables of a template with concrete labels gives a transition rule:
  B→C     relDCT(e1, BEFORE) & relDCT(e2, AFTER) ⇒ relE2E(e1, e2, BEFORE)
  C→B     relDCT(e1, BEFORE) & relE2E(e1, e2, AFTER) ⇒ relDCT(e2, BEFORE)
The A→B, B→A, A→C, and C→A templates are instantiated in the same way
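As a rough illustration of how a template expands, the sketch below enumerates every label combination for the B→C template. Restricting to the three main labels, the name `instantiate_b_to_c`, and the string rendering are assumptions for this example; one plausible reading of the template is that each combination receives its own learned weight, so that implausible combinations end up with low or negative weights.

```python
from itertools import product

# Assumption: only the three main TempEval labels are used in this sketch.
MAIN_LABELS = ("BEFORE", "OVERLAP", "AFTER")

def instantiate_b_to_c(labels=MAIN_LABELS):
    """Yield one B->C formula string per (r1, r2, r3) label combination."""
    for r1, r2, r3 in product(labels, repeat=3):
        yield f"relDCT(e1, {r1}) & relDCT(e2, {r2}) => relE2E(e1, e2, {r3})"

formulae = list(instantiate_b_to_c())
```

With three labels this produces 27 formulae, one of which is the BEFORE & AFTER ⇒ BEFORE rule.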
21 Experimental Setup
Use an MLN engine, "Markov thebeast"
  Weight learning: MIRA
  Inference: Cutting Plane Inference (base solver: ILP) [Riedel, 2008]
Employ local features based on earlier work in TempEval [SemEval, 2007]
Select joint formulae as global features
Use the same data and evaluation schemes as TempEval
22 Comparison of Local and Global
Results with 10-fold cross validation on training data (※ all scores are F1 values)
Improvement of Global over Local: Task A +0.049, Task B +0.010, Task C +0.019, All +0.022
Over all tasks, Global is better than Local
On Task A, the Global model outperformed the Local one with p < 0.01 (McNemar's test, two-tailed)
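For readers unfamiliar with McNemar's test, the sketch below computes a two-tailed p-value from the two disagreement counts between a pair of systems. It uses the exact binomial form; the slide does not say whether the exact or the chi-squared variant was used, so that choice, the function name `mcnemar_exact`, and any counts passed to it are assumptions for illustration.

```python
from math import comb

def mcnemar_exact(b, c):
    """Two-tailed exact McNemar p-value.

    b: items the first system labeled correctly and the second incorrectly.
    c: items the first system labeled incorrectly and the second correctly.
    Items on which both systems agree do not affect the test.
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    # Probability of at most k successes out of n under a fair coin.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

With made-up counts, `mcnemar_exact(0, 10)` falls below 0.01, while `mcnemar_exact(5, 5)` gives 1.0.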
23 Comparison to State-of-the-art
Results with the other systems on test data (※ all scores are F1 values)
Systems compared on Tasks A, B, and C: TempEval Best, TempEval Average, CU-TMP, and our Local and Global models
Our system outperformed the others on Tasks A and C
It always performed better than the best pure machine-learning based system (CU-TMP [Bethard and Martin, 2007])
25 Summary
We proposed a global framework with Markov Logic for Temporal Relation Identification
Our global model with joint formulae improved identification performance
Our approach achieved competitive results among all participants in TempEval
26 Future Work
Issues inherent to the task and the dataset:
  Low inter-annotator agreement
  Low transitive connectivity
  Small size
Semi-supervised approaches may ease some of these issues
[Table: numbers of labeled relations for Tasks A, B, and C in the TRAIN, DEV, TEST, and TOTAL sets]
28 Previous Global Framework of Temporal Relation Identification
Chambers and Jurafsky (2008) used Integer Linear Programming (ILP)
Minimizes contradictions among local classifiers' outputs by building ILP constraint problems
Targets only relations between events
Identifies only BEFORE, AFTER, and UNKNOWN
ILPs are constructed manually, which is often painful work, especially when many constraints are needed
29 Used Data (TempEval)
TimeML format (based on TimeBank): events, time expressions, temporal relations
Inter-annotator agreement scores: 72% on Tasks A and B, 68% on Task C
[Table: numbers of labeled relations for Tasks A, B, and C in the TRAIN, DEV, TEST, and TOTAL sets]
30 The Distribution of the Labels in TempEval
  Type                Task A   Task B   Task C
  BEFORE
  OVERLAP
  AFTER
  BEFORE-OR-OVERLAP
  OVERLAP-OR-AFTER    35   54
  VAGUE
31 Evaluation Schemes
Strict scoring scheme: give full credit if the relations match, and no credit otherwise
Relaxed scoring scheme: give credit based on the score table
[Table: relaxed-scoring credit for each pair of labels among Before, Overlap, After, Before-Or-Overlap (B-O), Overlap-Or-After (O-A), and Vague]
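The two schemes differ only in how much credit a predicted label earns, which the following sketch makes explicit. The function name `f1_score`, the dictionary encoding of relations, and the precision and recall definitions (credit divided by the number of predicted and gold relations respectively) are assumptions; the half credit in the usage note is a placeholder, not a value taken from the score table.

```python
def f1_score(gold, predicted, credit=None):
    """Compute F1 over labeled relations.

    gold, predicted: dicts mapping a relation id to a label.
    credit: optional function (gold_label, predicted_label) -> value in [0, 1].
            When omitted, strict scoring is used (1 for a match, 0 otherwise).
    """
    if credit is None:
        credit = lambda g, p: 1.0 if g == p else 0.0
    # Only predictions for relations present in the gold data can earn credit.
    total = sum(credit(gold[k], label) for k, label in predicted.items() if k in gold)
    precision = total / len(predicted) if predicted else 0.0
    recall = total / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

As a usage example, take gold labels `{1: "BEFORE", 2: "OVERLAP", 3: "AFTER", 4: "BEFORE"}` and predictions `{1: "BEFORE", 2: "BEFORE-OR-OVERLAP", 3: "AFTER", 4: "AFTER"}`. Strict scoring gives 0.5. A relaxed credit function that awards a placeholder 0.5 for predicting BEFORE-OR-OVERLAP against OVERLAP raises this to 0.625.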
32 Comparison of Local and Global
Results with 10-fold cross validation on training data
Columns: Local and Global, each with strict (F1) and relaxed (F1)
Global (improvement over Local in parentheses):
  Task A: strict (+0.049), relaxed 0.691 (+0.046)
  Task B: strict (+0.010), relaxed 0.819 (+0.009)
  Task C: strict (+0.019), relaxed 0.623 (+0.015)
  All:    strict (+0.022), relaxed 0.727 (+0.020)
Over all tasks, Global is better than Local
On Task A, the Global model outperformed the Local one with p < 0.01 (McNemar's test, two-tailed)
33 Comparison to State-of-the-art
Results with the other systems on test data
Columns: Tasks A, B, and C, each with strict and relaxed scores
Rows: TempEval Best, TempEval Average, CU-TMP, Local Model, Global Model
The Global Model outperformed the others, especially on Tasks A and C
Our system always performed better than the best pure machine-learning based system (CU-TMP)