Illinois CCG LoReHLT16 Situation Frame System
Yangqiu Song, Chen-Tse Tsai, Stephen Mayhew, Mark Sammons, Dan Roth Department of Computer Science University of Illinois at Urbana-Champaign
Overview

We focus on two metrics:
- SFType: evaluate (document ID, SF type)
- SFType + Place: evaluate (document ID, SF type, place mention)

Dataless classification [Chang et al., 2008; Song and Roth, 2014]:
- Build label descriptions
- Compare the similarity between the document and the labels in the Wikipedia title space
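To make the dataless setup concrete, here is a minimal Python sketch of the core idea: represent the document and each label description in a shared vector space and pick the most similar label. This uses a plain TF-IDF bag-of-words space from scikit-learn as a stand-in for the Wikipedia-title (ESA-style) space the system uses; the label description strings and the helper name classify_dataless are illustrative assumptions, not the system's actual code.

# Dataless classification sketch: score a document against textual label
# descriptions in a shared vector space and return the closest label.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Illustrative label descriptions (the real system builds these from
# Wikipedia and the annotation guideline).
LABEL_DESCRIPTIONS = {
    "Food Supply": "food hunger starvation meals rations eaten",
    "Water Supply": "water drinking wells contamination thirst",
    "Medical Assistance": "medical doctors hospital treatment injuries disease",
}

def classify_dataless(document, label_descriptions=LABEL_DESCRIPTIONS):
    labels = list(label_descriptions)
    vectorizer = TfidfVectorizer()
    # Fit on the label descriptions plus the document so that both
    # live in the same vector space.
    matrix = vectorizer.fit_transform(
        [label_descriptions[l] for l in labels] + [document])
    label_vecs, doc_vec = matrix[:-1], matrix[-1]
    scores = cosine_similarity(doc_vec, label_vecs)[0]
    return labels[scores.argmax()]

print(classify_dataless("People haven't eaten in days after the earthquake."))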
Two Dataless Classifiers
Earthquake classifier (binary):
- Decides whether the input text is earthquake related
- Collect all Wikipedia articles that contain "earthquake" in the title
- Use TF-IDF scores to select the top 1,000 words as the label description

SF type classifier (8 classes):
- Classifies the input text into the 8 SF types
- Uses the descriptions of the labels in the annotation guideline
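As a rough illustration of the label-description step (not the exact pipeline), the sketch below selects the top 1,000 words by TF-IDF from a collection of article texts. Fetching the Wikipedia articles is left out, and both the aggregation of per-article scores (summing here) and the function name build_label_description are assumptions.

from sklearn.feature_extraction.text import TfidfVectorizer
import numpy as np

def build_label_description(article_texts, top_k=1000):
    # article_texts: raw texts of Wikipedia pages whose titles contain "earthquake".
    vectorizer = TfidfVectorizer(stop_words="english")
    tfidf = vectorizer.fit_transform(article_texts)      # (n_articles, vocab_size)
    # Aggregate TF-IDF per word across articles (summing is an assumption).
    word_scores = np.asarray(tfidf.sum(axis=0)).ravel()
    vocab = np.array(vectorizer.get_feature_names_out())
    top_indices = np.argsort(word_scores)[::-1][:top_k]
    return " ".join(vocab[top_indices])                  # the label description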
Supervised Classifiers
Data collection:
- Google/Bing/Yahoo queries: "earthquake" + SF label names
- 3,588 documents; 1,168 documents after removing documents that contain keywords of more than one SF label
- Negative documents: 10,000 general documents from Google News not related to earthquakes

Features are the TF-IDF scores of words.
- Earthquake classifier (80% training, 20% testing), F1:
- SF type classifier (80% training, 20% testing), Avg F1: (dataless: )
- 1,168 documents, Avg F1:
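The slide does not name the learning algorithm, so the sketch below pairs the stated TF-IDF word features and 80%/20% split with a logistic regression classifier purely as an example; the macro-averaged F1 is likewise an assumption for what "Avg F1" denotes.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

def train_sf_type_classifier(documents, labels):
    # 80% training / 20% testing split, as on the slide.
    X_train, X_test, y_train, y_test = train_test_split(
        documents, labels, test_size=0.2, random_state=0, stratify=labels)
    # TF-IDF word features; logistic regression is an illustrative choice.
    model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
    model.fit(X_train, y_train)
    print("Avg (macro) F1:", f1_score(y_test, model.predict(X_test), average="macro"))
    return model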
Cross-Lingual Classification
Chinese (dry run):
- Use the parallel corpus to map documents directly to English

Uyghur:
- Use a Uyghur-English dictionary to map each word to English
- The same word-to-word translation used in our NER system
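A minimal sketch of the dictionary-based word-to-word translation step; the dictionary format (token mapped to a list of English translations) and the choice to keep out-of-vocabulary tokens unchanged are assumptions.

def translate_word_by_word(tokens, uyghur_to_english):
    # uyghur_to_english maps a Uyghur token to a list of English translations.
    translated = []
    for tok in tokens:
        candidates = uyghur_to_english.get(tok)
        # Take the first dictionary entry; keep the token as-is if it is OOV
        # (how ambiguity and OOV words were actually handled is not stated).
        translated.append(candidates[0] if candidates else tok)
    return translated

# Example with a toy, made-up dictionary:
# translate_word_by_word(["w1", "w2"], {"w1": ["earth"], "w2": ["quake"]})  # -> ["earth", "quake"]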
Two Mechanisms

Checkpoint 1, 2:
- Sentence classification (binary classifier); use the binary classifier to remove negative examples
- SF type classification
- Output

Checkpoint 3:
- Take GPE and LOC mentions from NER
- Is there a GPE or LOC mention in the sentence?
  - Yes: use the binary classifier to choose the best entity
  - No: use the nearest entity before the sentence
- SF type classification
- Output
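A sketch of the checkpoint-3 place-assignment logic as reconstructed above; the mention data structure is assumed, and the "choose the best entity" binary classifier is replaced by a trivial first-mention placeholder.

def assign_place(sentence_index, mentions):
    # mentions: document-ordered list of (sentence_index, text, entity_type)
    # tuples produced by NER; only GPE and LOC mentions are considered.
    gpe_loc = [m for m in mentions if m[2] in ("GPE", "LOC")]
    in_sentence = [m for m in gpe_loc if m[0] == sentence_index]
    if in_sentence:
        # The system uses a binary classifier to choose the best entity here;
        # taking the first mention is only a placeholder for that step.
        return in_sentence[0][1]
    # Otherwise, fall back to the nearest GPE/LOC mention before the sentence.
    before = [m for m in gpe_loc if m[0] < sentence_index]
    return before[-1][1] if before else None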
Results

The best score before checkpoint 3: SFType: 1.285; SFType + Place: 1.5.

Checkpoint 3 (Chinese and Uyghur). SFType+Place is scored using the NER results from CP1 (29.4 F1) and from CP3 (60.2 F1):

Model       Parameter   Prec.   Rec.   F1     SFType+Place   SFType+Place   SFType
                                              (CP1 NER)      (CP3 NER)
Supervised      -        27.1   16.5   20.5   1.280          5.050          5.299
Dataless     0.002       20.7   23.4   21.9   1.663          3.535          4.065
Dataless     0.006       20.6   14.1   16.7   1.403          1.929          2.337
Dataless     0.010       24.1   12.6     -    1.271          1.604          1.896
Dataless     0.014       24.3   10.1   14.3   1.214          1.491          1.694
Discussion: Entailment is needed

Annotators should always create a Need Frame for needs that are explicitly discussed in the document. The needs don't have to be mentioned by name to be included. For instance, if the document says that "People haven't eaten in days", then it's obvious that there is a Food Supply need even though the words "Food Supply" are not used.

Annotators should also create a Need Frame if they believe, after reading the document, that the need exists, even if the document does not directly discuss the need. For instance, the following example contains a Medical Assistance need:

"Following seasonal monsoons, more than 43,000 people were being treated for diarrhea in Bangladesh, said government health adviser Matiur Rahman."
Discussion: Background knowledge is needed

"A typhoon has demolished the city of Tacloban."

The annotator should create Shelter and Infrastructure Need Frames. However, annotators should not go too far with inferring needs that aren't directly discussed or strongly implied by the document. It's entirely plausible that the citizens fled to safety in advance of the incident and that the food and water supply chains in this region are designed to withstand typhoon damage. Therefore, no Need Frames for Food Supply, Water Supply, or Medical Assistance should be created for this example.
Discussion: Place Mention
Co-reference is needed. For instance, the document might contain the following text in the first paragraph:

"Eyewitnesses said a landslide hit the village of Guinsaugon in the south of the Philippine island of Leyte."

Several paragraphs later, the document may say:

"Governor Rosette Lerias described the village as totally flattened with virtually all housing destroyed."

Guinsaugon is the Place for the Shelter Need Frame. This example also highlights the issue of entity granularity for GPE and LOC entities: the village Guinsaugon (GPE) is located on the island of Leyte (LOC) in the country Philippines (GPE). The annotator should not create separate Shelter Need Frames for Leyte or the Philippines.
Discussion: Place Mention
Sometimes the place is not mentioned explicitly.

"Eyewitnesses said a landslide hit the village of Guinsaugon in the south of the Philippine island of Leyte. Governor Rosette Lerias described the village as totally flattened with virtually all housing destroyed both in Guinsaugon and in the surrounding region."

Besides the Shelter Need Frame with Guinsaugon as the Place, this passage also discusses housing issues in "the surrounding region". We know from reading the document that Guinsaugon is located on the island of Leyte. Therefore, we should create another Shelter Need Frame with Leyte as the Place.
Discussion: Place Mention
Sometimes the need exists in multiple places simultaneously.

"A deadly tropical cyclone Nargis, which occurred over the Bay of Bengal, hit five divisions and states -- Ayeyawaddy, Yangon, Bago, Mon and Kayin on May 2 and 3, of which Ayeyawaddy and Yangon inflicted the heaviest casualties and infrastructural damage."

Based on this passage, annotators should create separate Infrastructure Need Frames for all of these Places: Ayeyawaddy, Yangon, Bago, Mon, Kayin.
Conclusion and Discussion
Our current system is purely data-driven:
- Observe patterns in the data
- Leverage external knowledge to enrich the representation
- Leverage external data (from Google) to build classifiers

The annotation process relies on much more background knowledge:
- Textual entailment
- The hierarchy of GPE/LOC entities
- Background knowledge about the disasters
- The human inference process

Future work:
- Better IL (incident language) representation
- How to incorporate background knowledge and reasoning into models?