Midterm Progress Report Stanley Roberts July 17, 2009.


1 Midterm Progress Report Stanley Roberts July 17, 2009

2 TimeML Time Tagging
• Identify time references in text
• Interpret the identified cases
• Convert the information into a standard template
• Extract time, date, and duration
• Tag using the TimeML standard

3 Segmenting Sentences
Example: I will see you tomorrow morning.
• All information is preserved
• Words are separated by spaces
• Segmentation is indiscriminate and blind to special cases
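A minimal Python sketch of the segmentation step on this slide, assuming it amounts to a plain whitespace split. The function name `segment` is hypothetical and not taken from the presentation.

```python
def segment(sentence):
    """Split on whitespace only; punctuation stays attached to its word."""
    return sentence.split()


sentence = "I will see you tomorrow morning."
tokens = segment(sentence)
print(tokens)

# Nothing is dropped, so joining the tokens restores the input.
print(" ".join(tokens) == sentence)
```

Because the split is blind to special cases, the final token is `morning.` with the full stop still attached; later steps would have to account for that.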

4 Chunking Compound Words
Example: The site dates to the stone age.
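One way compound chunking could work is a lookup of adjacent word pairs in a small lexicon. The sketch below is an assumption for illustration only: the `COMPOUNDS` set and the `chunk_compounds` function are hypothetical, and the presentation does not say how compounds are actually detected.

```python
# Hypothetical compound lexicon; only "stone age" comes from the slide.
COMPOUNDS = {("stone", "age")}


def chunk_compounds(tokens):
    """Merge adjacent tokens that form a known two-word compound."""
    out = []
    i = 0
    while i < len(tokens):
        # Compare in lower case, ignoring trailing punctuation.
        pair = tuple(t.lower().strip(".,") for t in tokens[i:i + 2])
        if len(pair) == 2 and pair in COMPOUNDS:
            out.append(tokens[i] + " " + tokens[i + 1])
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out


print(chunk_compounds("The site dates to the stone age.".split()))
```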

5 Tagging Words
• IN – prepositions
• MONEY
• NUM – pure numbers
• POINT – “this”, “that”, “next”, “last”
• POSTPROP – postpositions, e.g. “ago”
• QTY – quantity, e.g. “many”, “few”
• RELATIVE – e.g. “later”
• TIME – e.g. “year”, “month”
• TIMEPROP – proper names, e.g. “Wednesday”

6 Tagging Words (page 2)
Input: On Saturday January 29, 1955 we went to the park.
Tagged: On/IN Saturday/TP January/TP 29/NUM 1955/NUM we went to/IN the park.
Input: I will be here tomorrow.
Tagged: I will be here tomorrow/TIME.
(TP is short for TIMEPROP.)
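The tagging step can be sketched as a dictionary lookup plus a digit check. This is a simplified assumption, not the presenter's implementation: the `LEXICON` entries are limited to words that appear on the slides, and the `tag` function is a hypothetical name.

```python
# Word-to-tag lexicon built only from examples shown on the slides.
LEXICON = {
    "on": "IN", "to": "IN",
    "this": "POINT", "that": "POINT", "next": "POINT", "last": "POINT",
    "ago": "POSTPROP",
    "many": "QTY", "few": "QTY",
    "later": "RELATIVE",
    "year": "TIME", "month": "TIME", "tomorrow": "TIME",
    "saturday": "TIMEPROP", "wednesday": "TIMEPROP", "january": "TIMEPROP",
}


def tag(tokens):
    """Return (token, tag) pairs; the tag is None for untagged words."""
    tagged = []
    for tok in tokens:
        key = tok.lower().strip(".,")
        if key.isdigit():
            tagged.append((tok, "NUM"))  # pure numbers
        else:
            tagged.append((tok, LEXICON.get(key)))
    return tagged


for word, label in tag("On Saturday January 29, 1955 we went to the park.".split()):
    print(word, label)
```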

7 Chunking Time-Related Phrases
• Combinations of tagged words are matched against predefined templates
• The templates aim to capture relevant time references and filter out noise

8 Chunking Time-Related Phrases
Input: On Saturday January 29, 1955 we went to the park.
Tagged: On/IN Saturday/TP January/TP 29/NUM 1955/NUM we went to/IN the park.
• INTTNN (“On Saturday January 29, 1955”) – matched to a template
• IN (“to”) – not matched, reference ignored
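Template matching of this kind can be sketched by grouping consecutive tagged words into runs and keeping only the runs whose tag sequence is a known template. The `TEMPLATES` set and the `match_templates` function below are hypothetical; the actual template list is not given in the presentation.

```python
# Hypothetical templates, written as tuples of tags.
TEMPLATES = {
    ("IN", "TIMEPROP", "TIMEPROP", "NUM", "NUM"),
    ("TIME",),
    ("TIMEPROP",),
    ("TIMEPROP", "TIMEPROP"),
}


def match_templates(tagged):
    """Return the word runs whose tag sequence matches a template."""
    matches, run = [], []
    # A (None, None) sentinel flushes the final run.
    for word, label in tagged + [(None, None)]:
        if label is not None:
            run.append((word, label))
            continue
        if run and tuple(t for _, t in run) in TEMPLATES:
            matches.append([w for w, _ in run])
        run = []
    return matches


tagged = [
    ("On", "IN"), ("Saturday", "TIMEPROP"), ("January", "TIMEPROP"),
    ("29", "NUM"), ("1955", "NUM"), ("we", None), ("went", None),
    ("to", "IN"), ("the", None), ("park.", None),
]
print(match_templates(tagged))
```

The lone `to/IN` forms a run of its own, but `("IN",)` is not a template, so it is filtered out as noise.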

9 Value Extraction
Input: On Saturday January 29, 1955 we went to the park.
Tagged: On/IN Saturday/TP January/TP 29/NUM 1955/NUM we went to/IN the park.
• INTTNN – matched to a template
• Converts the extracted data to the standard format defined by the TimeML Annotation Guidelines
• value=“YYYY-MM-DD”, here value=“1955-01-29”
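For the matched phrase above, value extraction could look like the following sketch. It assumes the matched words arrive in the order preposition, weekday, month, day, year; the `extract_value` function is a hypothetical name, and `%B` relies on English month names (the default C locale).

```python
from datetime import datetime


def extract_value(words):
    """Convert a matched IN/TP/TP/NUM/NUM phrase to a YYYY-MM-DD value."""
    _prep, _weekday, month, day, year = words
    # Strip a trailing comma from the day, e.g. "29," -> "29".
    parsed = datetime.strptime(f"{month} {day.strip(',')} {year}", "%B %d %Y")
    return parsed.date().isoformat()


print(extract_value(["On", "Saturday", "January", "29,", "1955"]))  # 1955-01-29
```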

10 Value Extraction – Smart Tag
Input: Tomorrow we will go to the park.
Tagged: Tomorrow/TIME we will go to/IN the park.
• T – matched to a template
• Uses the contextual date from the document (from the last slide: value=“1955-01-29”)
• Result: value=“1955-01-30”
• Attempts to update the contextual data with the most recent information
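Resolving a relative word against the contextual date can be sketched as a day offset. Only "tomorrow" appears on the slide; the other `OFFSETS` entries and the `resolve_relative` function are illustrative assumptions.

```python
from datetime import date, timedelta

# Day offsets for relative words. Only "tomorrow" is from the slide;
# "today" and "yesterday" are added here as assumptions.
OFFSETS = {"yesterday": -1, "today": 0, "tomorrow": 1}


def resolve_relative(word, context):
    """Shift the contextual date by the offset associated with `word`."""
    return (context + timedelta(days=OFFSETS[word.lower()])).isoformat()


context = date(1955, 1, 29)  # contextual date from the previous slide
print(resolve_relative("Tomorrow", context))  # 1955-01-30
```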

11 Value Extraction – Smart Tag
Input: Monday we will go to the park.
Tagged: Monday/TIMEPROP we will go to/IN the park.
• TP – matched to a template
• Uses the contextual date from the document (from the last slide: value=“1955-01-29”, a Saturday)
• Result: value=“1955-01-31”
• Attempts to update the contextual data with the most recent information
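A bare weekday name can be resolved by stepping forward from the contextual date. The sketch below assumes the reference always points to the next occurrence strictly after the context date; the presentation does not state how past references are handled, and `resolve_weekday` is a hypothetical name.

```python
from datetime import date, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]


def resolve_weekday(name, context):
    """Return the first date strictly after `context` that falls on `name`."""
    target = WEEKDAYS.index(name.lower())  # Monday is 0, as in date.weekday()
    days_ahead = (target - context.weekday() - 1) % 7 + 1
    return (context + timedelta(days=days_ahead)).isoformat()


context = date(1955, 1, 29)  # a Saturday
print(resolve_weekday("Monday", context))  # 1955-01-31
```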

12 Value Extraction – Smart Tag
Input: Monday 10:30 we will go to the park.
Tagged: Monday/TIMEPROP 10:30/TIMEPROP we will go to/IN the park.
• TPTP – matched to a template
• Uses the contextual date from the document (from the last slide: value=“1955-01-29”, a Saturday)
• Result: value=“1955-01-31T10:30”
• Attempts to update the contextual data with the most recent information
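Once the date part is resolved, a clock token can be appended after a `T` separator. This is a sketch under the assumption that clock times appear as `H:MM` or `HH:MM`; the `CLOCK` pattern and the `attach_time` function are hypothetical.

```python
import re

# Matches clock tokens such as "10:30" or "9:05".
CLOCK = re.compile(r"^(\d{1,2}):(\d{2})$")


def attach_time(date_value, token):
    """Append a clock token to an ISO date value, zero-padding the hour."""
    m = CLOCK.match(token)
    if m is None:
        return date_value  # not a clock time: leave the date unchanged
    hours, minutes = int(m.group(1)), int(m.group(2))
    return f"{date_value}T{hours:02d}:{minutes:02d}"


print(attach_time("1955-01-31", "10:30"))  # 1955-01-31T10:30
```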

13 Type Extraction – TIMEX3 standard
• TIMEX3 specifies that each time phrase be tagged with a type; three are covered here (the standard also defines SET)
• Date – value=“1955-01-29”
• Time – value=“1955-01-29T24:00”
• Time – value=“T24:00”
• Duration – “4 months” -> value=“P4M”
• Duration – “20 minutes” -> value=“PT20M”
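The duration values and the three types on this slide can be illustrated with a short sketch. The unit tables and the `duration_value` and `timex3_type` functions are assumptions for illustration; type is inferred here purely from the shape of the value string, which may differ from how the presented system decides it.

```python
# Unit letters as used in ISO 8601 durations. Time-of-day units follow a "T".
DATE_UNITS = {"year": "Y", "month": "M", "week": "W", "day": "D"}
TIME_UNITS = {"hour": "H", "minute": "M", "second": "S"}


def duration_value(phrase):
    """Convert a phrase such as "4 months" to a duration value."""
    count, unit = phrase.split()
    unit = unit.lower().rstrip("s")  # crude singularisation
    if unit in DATE_UNITS:
        return f"P{int(count)}{DATE_UNITS[unit]}"
    if unit in TIME_UNITS:
        return f"PT{int(count)}{TIME_UNITS[unit]}"
    raise ValueError(f"unknown unit: {unit}")


def timex3_type(value):
    """Guess DATE, TIME, or DURATION from the shape of a value string."""
    if value.startswith("P"):
        return "DURATION"
    if "T" in value:
        return "TIME"
    return "DATE"


for phrase in ("4 months", "20 minutes"):
    value = duration_value(phrase)
    print(phrase, value, timex3_type(value))
```

The `T` in `PT20M` is what distinguishes minutes from months, since both use the letter `M`.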

