Progress Update
Lin Ziheng
6/26/2016

Outline
- Update Summarization
- Opinion Summarization
- Discourse Analysis

Update Summarization  TAC 2008 update summarization task slightly differ from the DUC 2007 update task  The documents will be from the AQUAINT-2 collection rather than the AQUAINT collection  Cluster format:  There will only be two sets per cluster (Set A and Set B)  Each document set will have exactly 10 documents  The summary for document Set A should be a regular topic-focused summary  The summary for Set B should be written under the assumption that the user has already read all the documents in Set A 6/26/2016

- Tarsqi: a tool for event/time anchoring and ordering
- Recognizes events and times
- Creates event/event, event/time, and time/time temporal links

Example sentences:
  John fell after Mary pushed him.
  They heard an explosion on Monday, but not in 2007.
  This reminded them of the 1968 war, which ravaged the countryside in 1969.
  He slept on Friday night.
  She hopes to succeed before noon.
  Gonzalez said he would resign on Tuesday.
  He thought it was a great deal.
  John leaves today.
  John does not leave today.

Graph Layering
[Figure: sentences 1 to 9 of documents D1 and D2 are passed through Tarsqi, producing a graph over nodes 1 to 9.]

D0703A-A
[Figure]

BFS
[Figure]

Topmost layering
[Figure]

Optimal layering
[Figure]
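The slides name BFS, topmost, and optimal layering but the transcript does not define them. As a rough illustration only, the sketch below shows two common ways to assign layers to a directed acyclic graph: a breadth-first layering (shortest distance from any source node) and a longest-path layering (each node sits one layer below its deepest predecessor). The function names, the edge-list input, and the toy graph are assumptions made for this example; they are not taken from the slides, and the longest-path variant may or may not correspond to the "topmost" layering shown there.

```python
from collections import deque


def _adjacency(nodes, edges):
    """Build successor lists and in-degree counts for a directed graph."""
    succ = {n: [] for n in nodes}
    indeg = {n: 0 for n in nodes}
    for u, v in edges:
        succ[u].append(v)
        indeg[v] += 1
    return succ, indeg


def bfs_layers(nodes, edges):
    """Layer = shortest distance (in edges) from any source node."""
    succ, indeg = _adjacency(nodes, edges)
    layer = {n: 0 for n in nodes if indeg[n] == 0}
    queue = deque(layer)
    while queue:
        u = queue.popleft()
        for v in succ[u]:
            if v not in layer:  # first visit gives the shortest distance
                layer[v] = layer[u] + 1
                queue.append(v)
    return layer


def longest_path_layers(nodes, edges):
    """Layer = length of the longest path from any source node.

    Processes nodes in topological order (Kahn's algorithm), so every
    edge points from a smaller layer to a strictly larger one.
    """
    succ, indeg = _adjacency(nodes, edges)
    layer = {n: 0 for n in nodes}
    queue = deque(n for n in nodes if indeg[n] == 0)
    seen = 0
    while queue:
        u = queue.popleft()
        seen += 1
        for v in succ[u]:
            layer[v] = max(layer[v], layer[u] + 1)
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    if seen != len(nodes):
        raise ValueError("graph contains a cycle; no layering exists")
    return layer


# Toy graph (hypothetical): an edge (u, v) means "u precedes v".
toy_nodes = [1, 2, 3, 4]
toy_edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
bfs_result = bfs_layers(toy_nodes, toy_edges)
longest_result = longest_path_layers(toy_nodes, toy_edges)
```

On the toy graph the two schemes disagree on node 3: the breadth-first layering puts it next to node 2 even though an edge runs from 2 to 3, while the longest-path layering keeps every edge pointing to a strictly later layer.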

Opinion Summarization  Input:  Output: a summary for each target that summarizes the answers to the questions Why did readers support Time's inclusion of Bono for Person of the Year? Why did readers not support the inclusion of Bill Gates as Person of the Year? Why did readers not support the inclusion of Melinda Gates as Person of the Year? 6/26/2016

- Existing opinion corpus: Movie Review corpus
  - Document level: 1000 +ve documents and 1000 –ve documents
    - Problem: too coarse-grained
  - Sentence level: 5331 +ve sentences and 5331 –ve sentences
    - Problem: not enough data
- We collected data from productreview.com.au and rateitall.com
  - Fine-grained:
    - Productreview.com.au: each review has pros, cons, overall, and a rating
    - Rateitall.com: each review has a rating
  - Large datasets:
    - Productreview.com.au: 2.4G
    - Rateitall.com: 2.0G
  - http://wing.comp.nus.edu.sg/~hung/productreview/
  - http://wing.comp.nus.edu.sg/~hung/rateitall/

Discourse Analysis  Penn Discourse Treebank 2.0  Based on PTB 2  18459 Explicit relations,16053 Implicit relations TEMPORAL(950::3696) Asynchronous (697::2090) precedence succession Synchronous (251::1594) CONTINGENCY (4255::3417) Cause (4172::2240) reason Result Pragmatic Cause (83::13) Justification Condition (1::1416) hypothetical general unreal present unreal past factual present factual past Pragmatic Condition (1::67) relevance implicit assertion COMPARISON (2503::5589) Contrast (2120::3928) juxtaposition opposition Pragmatic Contrast (4::32) Concession (223::1213) expectation contra-expectation Pragmatic Concession (1::15) EXPANSION (8861::6423) Conjunction (3534::5320) Instantiation (1445::302) Restatement (3206::162) specification equivalence generalization Alternative (185::351) conjunctive disjunctive chosen alternative Exception (2::14) List (400::250) 6/26/2016

- Marcu and Echihabi baseline
  - Used word pairs in a Naive Bayes model
- Wellner et al. baseline
  - Used 7 feature classes in total
  - Claimed that proximity (prox) and connective (conn) are the most useful feature classes
    - prox: 0.60
    - prox + conn: 0.7677
  - I only implemented prox and conn in the baseline system

Accuracy
  exp:      0.3466
  imp:      0.5474
  exp+imp:  0.4119

Accuracy by feature class (columns in order: prox, conn, prox+conn)
  exp:      0.3488, 0.9404, 0.9414
  imp:      0.5435
  exp+imp:  0.4373, 0.76
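The word-pair idea can be illustrated with a small sketch. This is not the baseline system from the slides, only a minimal reading of "word pairs in a Naive Bayes model": each feature is a pair (w1, w2) with w1 drawn from the first argument and w2 from the second. The class name `WordPairNB`, the whitespace tokenization, the add-one smoothing, the choice to skip unseen pairs, and the two toy training examples are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict
from itertools import product


def word_pairs(arg1, arg2):
    """All (w1, w2) pairs with w1 from arg1 and w2 from arg2."""
    return list(product(arg1.lower().split(), arg2.lower().split()))


class WordPairNB:
    """Toy multinomial Naive Bayes over word-pair features (hypothetical)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                      # additive smoothing constant
        self.pair_counts = defaultdict(Counter)  # label -> pair -> count
        self.class_counts = Counter()           # label -> number of examples
        self.vocab = set()                      # all pairs seen in training

    def fit(self, examples):
        for arg1, arg2, label in examples:
            self.class_counts[label] += 1
            for pair in word_pairs(arg1, arg2):
                self.pair_counts[label][pair] += 1
                self.vocab.add(pair)
        return self

    def predict(self, arg1, arg2):
        total = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            score = math.log(n / total)  # log prior
            denom = sum(self.pair_counts[label].values()) + self.alpha * vocab_size
            for pair in word_pairs(arg1, arg2):
                if pair not in self.vocab:
                    continue  # assumption: ignore pairs never seen in training
                score += math.log((self.pair_counts[label][pair] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data (invented for this example, not from any corpus).
train = [
    ("it rained", "the game was cancelled", "CONTINGENCY"),
    ("he is tall", "she is short", "COMPARISON"),
]
model = WordPairNB().fit(train)
pred_a = model.predict("it rained", "the match was cancelled")
pred_b = model.predict("he is tall", "she is tiny")
```

With only two training examples the predictions simply follow whichever class shares more word pairs with the input; a realistic model would need a large set of relation instances.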
