Summarization using Event Extraction Base System 01/12 KwangHee Park.


1 Summarization using Event Extraction Base System 01/12 KwangHee Park

2 Research goal  Summarize an article by categorizing its content by subject  Do not just extract key sentences; rearrange the sentences by the subject of each event  Make it easy to understand what happened to each subject

3 Research goal  Extract events and rearrange them by subject  The north: launched 170 artillery shells; used both direct-firing guns and howitzers …  South Korean forces: fired back only 80 shells …  South Korean marines: first evacuated to safe places …  Summarization from raw text
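The rearrangement on this slide can be sketched as a simple grouping step. This is a minimal illustration, not the system's actual categorizer: the function names `group_by_subject` and `render` are hypothetical, and the (subject, event) pairs are assumed to have been produced already by the earlier stages.

```python
def group_by_subject(pairs):
    """Group (subject, event) pairs into {subject: [events]}.

    Subjects and their events keep the order in which they first appear.
    """
    grouped = {}
    for subject, event in pairs:
        grouped.setdefault(subject, []).append(event)
    return grouped


def render(grouped):
    """Format the grouped events as an indented plain-text summary."""
    lines = []
    for subject, events in grouped.items():
        lines.append(subject)
        lines.extend("  " + event for event in events)
    return "\n".join(lines)


# Pairs taken from the example on the slide.
pairs = [
    ("The north", "launched 170 artillery shells"),
    ("South Korean forces", "fired back only 80 shells"),
    ("The north", "used both direct-firing guns and howitzers"),
    ("South Korean marines", "first evacuated to safe places"),
]

summary = group_by_subject(pairs)
print(render(summary))
```

Both events of "The north" end up under one heading even though they are not adjacent in the input, which is the point of rearranging by subject rather than extracting sentences in document order.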

4 Architecture  Raw text → Event recognizer → Subject assigner → Categorizer  Raw text: "On the other hand, it’s turning out to be another very bad financial week for Asia. The financial assistance from the World Bank and the International Monetary Fund are not helping. In the last twenty four hours, the value of the Indonesian stock market has fallen by twelve percent. The Indonesian currency has lost twenty six percent of its value. In Singapore, stocks hit a five year low. In the Philippines, a four year low. And in Hong Kong, a three percent drop. More problems in Hong Kong for a place, for an economy, that many experts thought was once invincible."

5 Architecture  Raw text → Event recognizer → Subject assigner → Categorizer  Raw text: "On the other hand, it’s turning out to be another very bad financial week for Asia. The financial assistance from the World Bank and the International Monetary Fund are not helping. In the last twenty four hours, the value of the Indonesian stock market has fallen by twelve percent. The Indonesian currency has lost twenty six percent of its value. In Singapore, stocks hit a five year low. In the Philippines, a four year low. And in Hong Kong, a three percent drop. More problems in Hong Kong for a place, for an economy, that many experts thought was once invincible."

7 Architecture  Raw text → Event recognizer → Subject assigner → Categorizer  Indonesian stock market: fallen by twelve percent  Indonesian currency: lost twenty six percent  Singapore stock: five year low  The Philippines stocks: four year low  Hong Kong stock: three percent drop

8 Event Extraction  Event  An instance of a topic, identified at document level, describing something that happens  Event extraction  Extract events together with their arguments from the text  Example:  The Nasdaq Financial Index lost about 1%, or 3.95, to 448.80.

9 Event recognizer  Recognize whether a word is used as an event or not  The Nasdaq Financial Index lost about 1%, or 3.95, to 448.80.  In this example, only the word ‘lost’ is used as an event word.

10 Event recognizer  Rule-based recognition  Training features  POS tag only: any word tagged as a verb, except forms of "be" and "have"  Word dependency with POS tag: standard Stanford dependencies, 55 binary grammatical relations  Bigram POS-tagged context
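The "POS tag only" rule can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation: it assumes Penn Treebank tags (where verb tags start with "VB"), takes already-tagged input instead of calling the Stanford POS tagger, and uses a hypothetical hand-written list of "be" and "have" forms that ignores contractions such as "'s".

```python
# Surface forms treated as "be" or "have" verbs (assumed list, no contractions).
BE_HAVE_FORMS = {
    "be", "am", "is", "are", "was", "were", "been", "being",
    "have", "has", "had", "having",
}


def recognize_events(tagged_tokens):
    """Return the words that the POS-tag rule marks as events.

    tagged_tokens is a list of (word, pos_tag) pairs. A word counts as an
    event if its tag is a verb tag and it is not a form of "be" or "have".
    """
    return [
        word
        for word, tag in tagged_tokens
        if tag.startswith("VB") and word.lower() not in BE_HAVE_FORMS
    ]


# The example sentence from the slides, with hand-assigned tags (assumption).
sentence = [
    ("The", "DT"), ("Nasdaq", "NNP"), ("Financial", "NNP"), ("Index", "NNP"),
    ("lost", "VBD"), ("about", "IN"), ("1", "CD"), ("%", "NN"), (",", ","),
    ("or", "CC"), ("3.95", "CD"), (",", ","), ("to", "TO"),
    ("448.80", "CD"), (".", "."),
]

events = recognize_events(sentence)
print(events)  # ['lost']
```

A rule this simple cannot find events expressed by nouns, since it only ever selects verbs.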

11 Experiment  Corpus  TimeBank 1.1 annotated corpus  176 documents  2,603 sentences  7,168 events  Tools used  Stanford parser  Stanford POS tagger  Evaluation by 3-fold cross-validation

12 Result
Rule              Precision  Recall  F-measure
Dependency rule   53.4       57.4    55.2
POS tag rule      71.96      45.01   54.7
Both              60.41      53.37   55.4
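For reference, the three metrics in the table are conventionally computed as below. This is a sketch on toy data, not a reproduction of the experiment: the function name `precision_recall_f` and the (sentence index, token index) encoding of events are assumptions, and the slides do not say how scores were averaged over the three folds.

```python
def precision_recall_f(predicted, gold):
    """Compute precision, recall and balanced F-measure for two sets of events."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Toy data: each event is identified by (sentence index, token index).
gold = {(0, 4), (1, 2), (1, 7)}
predicted = {(0, 4), (1, 2), (2, 3), (2, 5)}

p, r, f = precision_recall_f(predicted, gold)
print(f"precision={p:.3f} recall={r:.3f} f={f:.3f}")
```

On this toy data two of the four predictions are correct and two of the three gold events are found, so precision is 0.5 and recall is about 0.667.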

13 Subject assigner  Select the subject of a given event word or phrase  The subject is the main agent of the given event  Step 1  Build a set of candidate subjects  Step 2  Form the relevant subject-event pairs

14 Subject assigner – Baseline feature  Step 1  Take the deepest NP chunks from the parse tree  Step 2  Assign to each event word the NP chunk immediately before it  Example: "Finally today, we learned that the space agency has finally taken a giant leap forward."  Chunk sequence: NP1, Event, NP2, Event, NP3  Result: We – learned; The space agency – taken
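Step 2 of this baseline can be sketched as a single pass over the chunk sequence. This is a minimal illustration under assumptions: the input is an already-chunked sentence given as (label, text) pairs rather than a Stanford parse tree, the function name `assign_subjects` is hypothetical, and taking NP3 to be "a giant leap" is a guess, since the slide does not spell it out.

```python
def assign_subjects(chunks):
    """Pair each event with the closest NP chunk that precedes it.

    chunks is a list of (label, text) pairs in sentence order, where label
    is "NP" or "EVENT". Events with no preceding NP get None as subject.
    """
    pairs = []
    last_np = None
    for label, text in chunks:
        if label == "NP":
            last_np = text
        elif label == "EVENT":
            pairs.append((last_np, text))
    return pairs


# Chunk sequence for the example sentence: NP1, Event, NP2, Event, NP3.
chunks = [
    ("NP", "we"),
    ("EVENT", "learned"),
    ("NP", "the space agency"),
    ("EVENT", "taken"),
    ("NP", "a giant leap"),  # assumed content of NP3
]

result = assign_subjects(chunks)
print(result)  # [('we', 'learned'), ('the space agency', 'taken')]
```

The trailing NP is never used because no event follows it, which matches the two pairs listed on the slide.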

15 Experiment result  Corpus  Manually annotated corpus based on the TimeBank 1.1 corpus  100 sentences containing 158 events  Result  82 / 158 = 52% accuracy

16 Conclusion  So far, I have implemented a baseline system  The accuracy of each component needs to improve  Each component has a different problem to solve  Event recognizer, Subject assigner: need more suitable features  Categorizer: how to handle pronoun subjects

17 Thanks

