Midterm Progress Report Stanley Roberts July 17, 2009
TimeML Time Tagging Attempt to identify time references in text. Interpret the identified cases. Convert information into a standard template. Extract time, date, duration. Tag using the TimeML standard.
Segmenting Sentences I will see you tomorrow morning. I will see you tomorrow morning. All information is preserved. Words are separated by spaces. Segmentation is indiscriminate and blind to special cases.
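A minimal sketch of whitespace-only segmentation, assuming the splitter does nothing beyond what the slide states. The function name `segment` is hypothetical. Note that the final period stays attached to "morning.", which is one of the special cases the splitter is blind to.

```python
def segment(sentence: str) -> list[str]:
    """Split on whitespace only; punctuation stays attached to its word."""
    return sentence.split()


tokens = segment("I will see you tomorrow morning.")

# Joining the tokens with single spaces recovers the original sentence,
# so no information is lost at this step.
rebuilt = " ".join(tokens)
```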
Chunking Compound Words The site dates to the stone age. The site dates to the stone age. The site dates to the stone age.
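One possible way to chunk compound words is to merge adjacent tokens found in a compound lexicon. The sketch below assumes a two-word lexicon lookup; the `COMPOUNDS` set and the `chunk_compounds` function are hypothetical and not taken from the project.

```python
# Hypothetical compound lexicon (an assumption for illustration).
COMPOUNDS = {("stone", "age")}


def normalize(token: str) -> str:
    """Lower-case a token and drop trailing punctuation for lookup."""
    return token.strip(".,").lower()


def chunk_compounds(tokens: list[str]) -> list[str]:
    """Merge adjacent token pairs that appear in COMPOUNDS."""
    out = []
    i = 0
    while i < len(tokens):
        if i + 1 < len(tokens):
            pair = (normalize(tokens[i]), normalize(tokens[i + 1]))
            if pair in COMPOUNDS:
                out.append(tokens[i] + " " + tokens[i + 1])
                i += 2
                continue
        out.append(tokens[i])
        i += 1
    return out


chunks = chunk_compounds("The site dates to the stone age.".split())
```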
Tagging Words IN – prepositions; MONEY; NUM – pure numbers; POINT – “this”, “that”, “next”, “last”; POSTPROP – postpositions, “ago”; QTY – quantity, “many”, “few”; RELATIVE – “later”; TIME – “year”, “month”; TIMEPROP – proper names, “Wednesday”
Tagging Words (page 2) On Saturday January 29, 1955 we went to the park. On/IN Saturday/TP January/TP 29/NUM 1955/NUM we went to/IN the park. I will be here tomorrow. I will be here tomorrow/TIME.
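The tagging step can be sketched as a dictionary lookup plus a digit check. The tag names come from the slides, but the `LEXICON` contents, the punctuation stripping, and the function names are assumptions made for illustration. Words with no entry get no tag.

```python
# Hypothetical lexicon; only the tag names come from the slides.
LEXICON = {
    "on": "IN", "to": "IN",
    "this": "POINT", "that": "POINT", "next": "POINT", "last": "POINT",
    "ago": "POSTPROP",
    "many": "QTY", "few": "QTY",
    "later": "RELATIVE",
    "year": "TIME", "month": "TIME", "tomorrow": "TIME",
    "wednesday": "TIMEPROP", "saturday": "TIMEPROP", "january": "TIMEPROP",
}


def tag(token: str):
    """Return the tag for one token, or None if it is not time related."""
    word = token.strip(".,").lower()
    if word.isdigit():
        return "NUM"  # pure numbers
    return LEXICON.get(word)


def tag_sentence(sentence: str):
    """Pair every whitespace-separated token with its tag."""
    return [(tok, tag(tok)) for tok in sentence.split()]


tagged = tag_sentence("On Saturday January 29, 1955 we went to the park.")
tags = [t for _, t in tagged]
```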
Chunking Time Related Phrases Combinations of tagged words are matched to predefined templates. The templates attempt to find relevant results and filter noise.
Chunking Time Related Phrases On Saturday January 29, 1955 we went to the park. On/IN Saturday/TP January/TP 29/NUM 1955/NUM we went to/IN the park. INTTNN – matched to template. IN – not matched, reference ignored.
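A hedged sketch of template matching: consecutive tagged tokens are grouped into runs, and a run is kept only if its tag sequence is a known template. The tuple-of-tags signature stands in for the compact "INTTNN" string on the slide, and the `TEMPLATES` set is hypothetical.

```python
# Hypothetical template set; each template is a sequence of tags.
TEMPLATES = {
    ("IN", "TIMEPROP", "TIMEPROP", "NUM", "NUM"),
    ("TIMEPROP",),
    ("TIME",),
}


def match_phrases(tagged):
    """Return the token runs whose tag sequence matches a template."""
    matches, run = [], []
    # The (None, None) sentinel flushes a run that ends the sentence.
    for token, tag in tagged + [(None, None)]:
        if tag is not None:
            run.append((token, tag))
            continue
        if run and tuple(t for _, t in run) in TEMPLATES:
            matches.append([tok for tok, _ in run])
        run = []  # unmatched runs are dropped as noise
    return matches


tagged = [
    ("On", "IN"), ("Saturday", "TIMEPROP"), ("January", "TIMEPROP"),
    ("29,", "NUM"), ("1955", "NUM"), ("we", None), ("went", None),
    ("to", "IN"), ("the", None), ("park.", None),
]
phrases = match_phrases(tagged)
```

The lone "to"/IN run has no matching template, so it is ignored, which mirrors the second case on the slide.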
Value Extraction On Saturday January 29, 1955 we went to the park. On/IN Saturday/TP January/TP 29/NUM 1955/NUM we went to/IN the park. INTTNN – matched to template. Converts extracted data to the standard format defined by the TimeML Annotation Guidelines. value=“YYYY-MM-DD” value=“ ”
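As an illustration of converting a matched phrase into the YYYY-MM-DD format, the sketch below picks out the month name, day, and year from the tokens. The rule that a number above 31 is a year, and the name `extract_value`, are assumptions; the sketch also assumes all three fields are present in the phrase.

```python
from datetime import date

MONTHS = {
    name: number
    for number, name in enumerate(
        ["january", "february", "march", "april", "may", "june", "july",
         "august", "september", "october", "november", "december"],
        start=1,
    )
}


def extract_value(tokens: list[str]) -> str:
    """Build an ISO date string from a phrase with month, day, and year."""
    month = day = year = None
    for tok in tokens:
        word = tok.strip(".,").lower()
        if word in MONTHS:
            month = MONTHS[word]
        elif word.isdigit():
            number = int(word)
            if number > 31:  # assumption: too large for a day, so a year
                year = number
            else:
                day = number
    return date(year, month, day).isoformat()


value = extract_value(["On", "Saturday", "January", "29,", "1955"])
```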
Value Extraction – Smart Tag Tomorrow we will go to the park. Tomorrow/TIME we will go to/IN the park. T – matched to template. Uses the contextual date from the document. From last slide: value=“ ” value=“ ” -> uses context. Attempts to update contextual data with the most recent information.
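A minimal sketch of resolving a relative word against a contextual date, then updating the context. The context date 1955-01-29 is an assumption carried over from the previous example sentence, and the `OFFSETS` table and `resolve_relative` name are hypothetical.

```python
from datetime import date, timedelta

# Hypothetical day offsets for relative time words.
OFFSETS = {"yesterday": -1, "today": 0, "tomorrow": 1}


def resolve_relative(word: str, context: date) -> date:
    """Shift the contextual date by the offset the word implies."""
    return context + timedelta(days=OFFSETS[word.strip(".,").lower()])


context = date(1955, 1, 29)  # assumed: date found earlier in the document
resolved = resolve_relative("Tomorrow", context)
value = resolved.isoformat()

# The most recently resolved date becomes the new context.
context = resolved
```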
Value Extraction – Smart Tag Monday we will go to the park. Monday/TIMEPROP we will go to/IN the park. TP – matched to template. Uses the contextual date from the document. From last slide: value=“ ” Saturday value=“ ” -> uses context. Attempts to update contextual data with the most recent information.
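A weekday name can be resolved by stepping forward from the contextual date. This sketch assumes the reference points to the first matching weekday on or after the context date; the slides do not say whether the project looks forward or backward, so that direction is an assumption, as is the name `resolve_weekday`.

```python
from datetime import date, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]


def resolve_weekday(name: str, context: date) -> date:
    """Return the first date on or after context that falls on the named weekday."""
    target = WEEKDAYS.index(name.strip(".,").lower())
    # date.weekday() numbers Monday as 0, matching the WEEKDAYS order.
    delta = (target - context.weekday()) % 7
    return context + timedelta(days=delta)


context = date(1955, 1, 29)  # assumed context; this date is a Saturday
monday = resolve_weekday("Monday", context)
value = monday.isoformat()
```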
Value Extraction – Smart Tag Monday we will go to the park. Monday/TIMEPROP 10:30/TIMEPROP we will go to/IN the park. TPTP – matched to template. Uses the contextual date from the document. From last slide: value=“ ” Saturday value=“ T10:30” -> uses context. Attempts to update contextual data with the most recent information.
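When a clock time accompanies the date, the value gains a "T" suffix. The sketch below assumes the date has already been resolved and that the clock token has the form HH:MM; `attach_time` is a hypothetical helper.

```python
from datetime import date, time


def attach_time(day: date, clock: str) -> str:
    """Append a clock time in HH:MM form to an ISO date, separated by 'T'."""
    hour, minute = map(int, clock.split(":"))
    return f"{day.isoformat()}T{time(hour, minute).strftime('%H:%M')}"


# Assumed input: a date resolved from the document context.
value = attach_time(date(1955, 1, 31), "10:30")
```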
Type Extraction – TimeX3 std. TimeX3 specifies that each time phrase be tagged with a type. Three types are shown (the standard also defines SET): Date – value=“ ”; Time – value=“ T24:00”; Time – value=“T24:00”; Duration – “4 months” -> value=“P4M”; Duration – “20 minutes” -> value=“PT20M”
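The two duration examples can be reproduced with a small unit table. Date units follow "P" directly, while clock units need the "PT" prefix, which is why months and minutes can both use the letter M. The unit tables and the name `duration_value` are assumptions, and the sketch handles only a "number unit" phrase.

```python
# Unit letters for ISO 8601 style durations.
DATE_UNITS = {"year": "Y", "month": "M", "week": "W", "day": "D"}
TIME_UNITS = {"hour": "H", "minute": "M", "second": "S"}


def duration_value(phrase: str) -> str:
    """Convert a phrase such as '4 months' to a duration value such as 'P4M'."""
    quantity, unit = phrase.split()
    unit = unit.lower().rstrip("s")  # crude singular form
    if unit in DATE_UNITS:
        return f"P{int(quantity)}{DATE_UNITS[unit]}"
    if unit in TIME_UNITS:
        return f"PT{int(quantity)}{TIME_UNITS[unit]}"
    raise ValueError(f"unknown duration unit: {unit}")


months = duration_value("4 months")
minutes = duration_value("20 minutes")
```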