Published by Giles O’Connor. Modified over 9 years ago.
1
1 CERATOPS: Center for Extraction and Summarization of Events and Opinions in Text
Janyce Wiebe, U. Pittsburgh
Claire Cardie, Cornell U.
Ellen Riloff, U. Utah
2
2 Outline
Sample of research results since March 2008
–Coreference resolution
–Topics and discourse
  Topic identification for fine-grained opinion analysis
  Discourse-level opinion graphs for polarity classification
Resources and systems
–Upcoming releases
–Jodange system for extracting and aggregating opinions from on-line content
Interactions
Research plans for the remainder of the grant
3
3 NP Coreference Resolution Australian press has launched a bitter attack on Italy after seeing their beloved Socceroos eliminated on a controversial late penalty.
4
4 NP Coreference Resolution Australian press has launched a bitter attack on Italy after seeing their beloved Socceroos eliminated on a controversial late penalty. Italian coach Lippi has also been blasted for his comments after the game. In the opposite camp Lippi is preparing his side for the upcoming game with Ukraine. He hailed 10-man Italy's determination to beat Australia and said the penalty was rightly given.
5
5 Coreference and Opinion Holders Stoyanov & Cardie [2006] Australian press has launched a bitter attack on Italy after seeing their beloved Socceroos eliminated on a controversial late penalty. Italian coach Lippi has also been blasted for his comments after the game. In the opposite camp Lippi is preparing his side for the upcoming game with Ukraine. He hailed 10-man Italy's determination to beat Australia and said the penalty was rightly given.
6
6 Reconcile
NP coreference resolution toolbox/testbed
–Public domain
Joint effort of Cornell, LLNL, Univ. of Utah
Algorithms for
–Training NP coref classifiers
–Generating standard feature vectors
–Testing/training on standard data sets (MUC, ACE)
–Evaluating w.r.t. multiple metrics (MUC, B³, CEAF, etc.)
Trained versions of state-of-the-art systems
8
8 Topics in Fine-grained Opinion Analysis
–Opinion expression
–Polarity: positive, negative, neutral
–Strength/intensity: low .. extreme
–Source (opinion holder)
–Target (topic)
Example: “The Australian Press launched a bitter attack on Italy”
Opinion Frame
Polarity: negative
Intensity: high
Source: “The Australian Press”
Target: “Italy”
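The opinion frame on this slide can be pictured as a small record type. The following is only a sketch: the class name, the field names, and the value chosen for the opinion expression span are assumptions made for illustration, not something the slides specify.

```python
from dataclasses import dataclass

# Polarity values listed on the slide.
POLARITIES = {"positive", "negative", "neutral"}


@dataclass(frozen=True)
class OpinionFrame:
    expression: str  # opinion expression span (value below is an assumed example)
    polarity: str    # one of POLARITIES
    intensity: str   # the slide gives the range "low .. extreme"
    source: str      # opinion holder
    target: str      # topic of the opinion

    def __post_init__(self):
        # Reject polarity values that the slide does not list.
        if self.polarity not in POLARITIES:
            raise ValueError(f"unknown polarity: {self.polarity}")


frame = OpinionFrame(
    expression="launched a bitter attack",  # assumed span, not given on the slide
    polarity="negative",
    intensity="high",
    source="The Australian Press",
    target="Italy",
)
```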
9
9 Opinion Summaries
[Diagram nodes: Australian Press, Italy, Marcello Lippi, penalty, Socceroos]
10
10 Definitions
Topic: the real-world object, event or abstract entity that is the subject of the opinion as intended by the opinion holder
Topic span: the closest, minimal span of text that mentions the topic
Target span: the span of text that covers the syntactic surface form comprising the contents of the opinion
Stoyanov & Cardie [LREC 2008, Coling 2008]
11
11 Definitions
(1) [OH John] likes Marseille for its weather and cultural diversity.
(1) [OH John] likes [TARGET+TOPIC SPAN Marseille] for its weather and cultural diversity.
12
12 Definitions
(2) [OH Al] thinks that the government should tax gas more in order to curb CO2 emissions.
(2) [OH Al] thinks that [TARGET SPAN the government should tax gas more in order to curb CO2 emissions].
(2) [OH Al] thinks that [TARGET SPAN [TOPIC SPAN? the government] should [TOPIC SPAN? tax gas] more in order to [TOPIC SPAN? curb [TOPIC SPAN? CO2 emissions]]].
(3) Although he doesn’t like government-imposed taxes, he thinks that a fuel tax is the only effective solution.
13
13 Issues in Opinion Topic Identification
Multiple potential topics
Opinion topics are not always explicitly mentioned
(3) [OH John] believes the violation of Palestinian human rights is one of the main factors. Topic: ISRAELI-PALESTINIAN CONFLICT
(4) [OH I] disagree entirely!
14
14 A Coreference Approach Hypothesize that the notion of topic coreference will facilitate identification of opinion topics Easier than specifying the topic of each opinion in isolation Two opinions are topic-coreferent if they share the same opinion topic.
15
15 A Coreference Approach to Opinion Topic Identification
Find the clusters of topic-coreferent opinions
–the critical step for topic identification
Label the clusters with the name of the topic
–we currently ignore this step
–address in future work through frequency analysis of the terms in each of the clusters
16
16 The Topic Coreference Algorithm
Adapt a standard machine learning-based approach to NP coreference resolution (Soon et al., 2001; Ng & Cardie, 2002)
1. identify topic spans
2. perform pairwise classification of the associated opinions and topic spans to determine whether or not they are topic-coreferent
3. cluster the opinions according to the results of (2)
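Steps 2 and 3 above can be sketched as a pairwise decision followed by clustering. This is a minimal illustration under loud assumptions: the pairwise "classifier" is a hypothetical word-overlap stand-in rather than a trained model, and the clustering is plain single-link (transitive closure), which is a simplification and not necessarily the clustering used in the cited work.

```python
from itertools import combinations


def cluster_opinions(spans, is_coreferent):
    """Single-link clustering: merge every pair the pairwise decision links."""
    parent = list(range(len(spans)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(spans)), 2):
        if is_coreferent(spans[i], spans[j]):
            parent[find(i)] = find(j)

    clusters = {}
    for i, span in enumerate(spans):
        clusters.setdefault(find(i), []).append(span)
    return list(clusters.values())


def word_overlap(a, b):
    # Hypothetical stand-in for a trained pairwise classifier:
    # two topic spans are linked if they share a lower-cased word.
    return bool(set(a.lower().split()) & set(b.lower().split()))


topic_spans = ["the penalty", "a controversial late penalty", "Italy", "10-man Italy"]
clusters = cluster_opinions(topic_spans, word_overlap)
```

With these toy spans the two penalty mentions end up in one cluster and the two Italy mentions in another.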
17
17 Results
Annotated subset of the MPQA corpus for topic-coreferent opinions
–Acceptable inter-annotator agreement
10-fold cross-validation

Baselines               B³    (unlabeled)  CEAF
all-in-one              .37   -.10         .30
1 opinion per cluster   .29   .22          .27
same paragraph          .55   .31          .50
Choi 2000               .54   .37          .54
18
18 Results

                              B³    (unlabeled)  CEAF
sentence spans                .57   .40          .54
automatic (heuristics)        .57   .41          .54
modified manual topic spans   .64   .51          .61
manual topic spans            .71   .66          .62

Baselines
all-in-one                    .37   -.10         .30
1 opinion per cluster         .29   .22          .27
same paragraph                .55   .31          .50
Choi 2000                     .54   .37          .54
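For readers unfamiliar with the B³ metric reported in these results, here is a minimal sketch of its standard definition: per-item precision and recall are averaged over all items, then combined into an F-score. The toy clusters and opinion identifiers are invented for illustration and are unrelated to the MPQA numbers; the slides do not say which B³ statistic the table reports.

```python
def b_cubed(response, key):
    """B-cubed precision, recall and F1 for two clusterings of the same items.

    response, key: lists of sets; each list partitions the same item set.
    """
    resp_of = {x: cluster for cluster in response for x in cluster}
    key_of = {x: cluster for cluster in key for x in cluster}
    items = list(key_of)
    # Per-item precision: share of the item's response cluster that is correct.
    precision = sum(len(resp_of[x] & key_of[x]) / len(resp_of[x]) for x in items) / len(items)
    # Per-item recall: share of the item's key cluster that was recovered.
    recall = sum(len(resp_of[x] & key_of[x]) / len(key_of[x]) for x in items) / len(items)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy gold clustering and an "all-in-one" system response (hypothetical data).
gold = [{"o1", "o2", "o3"}, {"o4"}]
all_in_one = [{"o1", "o2", "o3", "o4"}]
p, r, f = b_cubed(all_in_one, gold)
```

Putting everything in one cluster gives perfect recall but lowers precision, which is why it serves as a baseline.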
21
21 Discourse level opinion graphs for polarity classification
Somasundaran et al. Discourse Level Opinion Graphs for Polarity Classification. Submitted.
Somasundaran et al. (2008). Discourse Level Opinion Interpretation. COLING 2008.
Somasundaran et al. (2008). Discourse Level Opinion Relations: An Annotation Study. SIGdial Workshop on Discourse and Dialogue 2008.
22
22 Motivation The interpretations of opinions often depend on the other opinions in the discourse
23
23 Motivation I like the LCD feature We must implement the LCD I think the LCD is hot
24
24 Motivation I like the LCD feature We must implement the LCD I think the LCD is hot Interdependent Interpretation of opinions in the discourse
25
25 Motivation Shapes should be curved, so round shapes. Nothing square-like.... So we shouldn’t have too square corners and that kind of thing.
26
26 Motivation Shapes should be curved, so round shapes. Nothing square-like.... So we shouldn’t have too square corners and that kind of thing. Arguing Negative Arguing Positive Arguing Negative Sentiment Negative
27
27 Motivation Shapes should be curved, so round shapes. Nothing square-like.... So we shouldn’t have too square corners and that kind of thing. Arguing Negative Arguing Positive Arguing Negative Sentiment Negative What were the opinions regarding the curved shape? Will the curved shape be accepted?
28
28 Motivation Shapes should be curved, so round shapes. Nothing square-like.... So we shouldn’t have too square corners and that kind of thing. Arguing Negative Arguing Positive Arguing Negative Sentiment Negative Direct opinion
29
29 Motivation Shapes should be curved, so round shapes. Nothing square-like.... So we shouldn’t have too square corners and that kind of thing. Arguing Negative Arguing Positive Arguing Negative Sentiment Negative Direct opinion Opinions towards mutually exclusive option (alternative)
30
30 Motivation Shapes should be curved, so round shapes. Nothing square-like.... So we shouldn’t have too square corners and that kind of thing. Arguing Negative Arguing Positive Arguing Negative Sentiment Negative Direct opinion Opinions towards mutually exclusive option (alternative) Opinions towards the square shapes reveal more about the speaker’s opinion regarding the curved shape
31
31 Motivation Shapes should be curved, so round shapes. Nothing square-like.... So we shouldn’t have too square corners and that kind of thing. Arguing Negative Arguing Positive Arguing Negative Sentiment Negative Direct opinion Opinions towards mutually exclusive option (alternative) More opinion information Opinions towards the square shapes reveal more about the speaker’s opinion regarding the curved shape
32
32 Opinion Frames This blue remote is cool. What’s more, the rubbery material is ergonomic. We must really go for the blue remote. I feel the red remote is a better choice. The blue remote will be too expensive. Sentiment Pos Arguing Pos Sentiment Pos Sentiment Neg
33
33 Opinion Frames This blue remote is cool. What’s more, the rubbery material is ergonomic. We must really go for the blue remote. I feel the red remote is a better choice. The blue remote will be too expensive. Sentiment Pos Arguing Pos Sentiment Pos Sentiment Neg same
35
35 Opinion Frames This blue remote is cool. What’s more, the rubbery material is ergonomic. We must really go for the blue remote. I feel the red remote is a better choice. The blue remote will be too expensive. Sentiment Pos Arguing Pos Sentiment Pos Sentiment Neg alternative same
36
36 Opinion Frames: Target Relations
Same relation:
–Targets that refer to the same entity or proposition
–Covers identity, part-whole, generalization, and other relations involved in co-reference
Alternative relation:
–Holds between targets that are related by virtue of being opposing (mutually exclusive) options in the context of the discourse
37
37 Annotation tool
38
38 Opinion Frames This blue remote is cool. What’s more, the rubbery material is ergonomic. We must really go for the blue remote. I feel the red remote is a better choice. The blue remote will be too expensive. Sentiment Pos Arguing Pos Sentiment Pos Sentiment Neg alternative same
39
39 Opinion Frames This blue remote is cool. What’s more, the rubbery material is ergonomic. We must really go for the blue remote. I feel the red remote is a better choice. The blue remote will be too expensive. Sentiment Pos Arguing Pos Sentiment Pos Sentiment Neg SPSPsame alternative same
40
40 Opinion Frames This blue remote is cool. What’s more, the rubbery material is ergonomic. We must really go for the blue remote. I feel the red remote is a better choice. The blue remote will be too expensive. Sentiment Pos Arguing Pos Sentiment Pos Sentiment Neg SPAPsame alternative same
41
41 Opinion Frames This blue remote is cool. What’s more, the rubbery material is ergonomic. We must really go for the blue remote. I feel the red remote is a better choice. The blue remote will be too expensive. Sentiment Pos Arguing Pos Sentiment Pos Sentiment Neg SPSNalt alternative same
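The frame labels shown on these slides (SPSPsame, SPAPsame, SPSNalt) appear to concatenate the two opinions' type and polarity with the target relation. The sketch below composes such labels. The function names are hypothetical, and the `is_reinforcing` rule (same polarity with same targets, opposite polarity with alternative targets) is an assumption about how reinforcing frames are defined, not something these slides state.

```python
# Abbreviations inferred from the slide labels: S/A = Sentiment/Arguing, P/N = Pos/Neg.
ABBREV = {
    ("sentiment", "positive"): "SP",
    ("sentiment", "negative"): "SN",
    ("arguing", "positive"): "AP",
    ("arguing", "negative"): "AN",
}
RELATION = {"same": "same", "alternative": "alt"}


def frame_label(op1, op2, target_relation):
    """Compose a label such as 'SPSNalt' from two (type, polarity) pairs."""
    return ABBREV[op1] + ABBREV[op2] + RELATION[target_relation]


def is_reinforcing(op1, op2, target_relation):
    # Assumed rule: equal polarities reinforce on the same target,
    # opposite polarities reinforce on alternative targets.
    same_polarity = op1[1] == op2[1]
    return same_polarity if target_relation == "same" else not same_polarity


blue_is_cool = ("sentiment", "positive")
go_for_blue = ("arguing", "positive")
blue_too_expensive = ("sentiment", "negative")

label_1 = frame_label(blue_is_cool, go_for_blue, "same")
label_2 = frame_label(blue_is_cool, blue_too_expensive, "alternative")
```

The second call only illustrates label composition for a positive/negative pair with an alternative relation; it does not claim which sentences the slide's SPSNalt arrow connects.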
42
42 Data
AMI meeting corpus (Carletta et al., 2005)
–Rich in opinions
–Rich interplay between opinion types, targets, and polarities
–Same and alternative target relations are prominent
–Targets are relatively concrete
43
43 The Project Roadmap
Accomplished
–Developed a multi-stage annotation scheme
–Tested human reliability in recognizing opinion frames
–Showed that the frame presence between two opinion-bearing sentences can be automatically detected
–Showed that interdependent interpretation improves automatic detection of opinion polarity
To Do
–Fully automate effective discourse level opinion interpretation
–Text data
  Do the opinion frames behave the same way?
  What knowledge sources can we use for our task in text?
44
44 Discourse Level Opinion Graphs (DLOGs) Pairwise relationships from opinion frames are used to build Discourse Level Opinion Graphs (DLOGs)
45
45 Legend
[Diagram legend; line styles lost: Alternative target relation; Same target relation; Reinforcing frame relation; Non-reinforcing frame relation]
47
47 Improving Polarity Classification using DLOG Computationally model discourse level relations between opinions Leverage these relations to improve polarity classification
48
48 Improving Polarity Classification using DLOG
Initial step:
–Individual classifiers for recognizing opinions, polarities, and the 4 types of links
Iterative steps:
–Relational classifiers that use information about the neighboring nodes
  The classes assigned to neighbors thus far
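The initial-then-iterative scheme can be illustrated with a toy loop. This is a heavily simplified sketch, not the relational classifiers from the slides: the local scores, the `weight` parameter, and the voting rule are all hypothetical, and every edge is assumed to be a reinforcing frame, so a "same" link favors equal polarity and an "alt" link favors opposite polarity.

```python
def iterative_classify(local_scores, edges, rounds=5, weight=0.5):
    """Relabel nodes using neighbors' current labels until nothing changes.

    local_scores: node -> score from an individual classifier (>= 0 means positive).
    edges: list of (u, v, target_relation) with target_relation in {"same", "alt"}.
    Returns node -> +1 (positive) or -1 (negative).
    """
    # Initial step: labels from the individual (local) classifier only.
    labels = {n: (1 if s >= 0 else -1) for n, s in local_scores.items()}
    for _ in range(rounds):
        new_labels = {}
        for node, score in local_scores.items():
            vote = 0
            for u, v, relation in edges:
                if node not in (u, v):
                    continue
                other = v if node == u else u
                # Use the classes assigned to neighbors thus far.
                vote += labels[other] if relation == "same" else -labels[other]
            new_labels[node] = 1 if score + weight * vote >= 0 else -1
        if new_labels == labels:
            break
        labels = new_labels
    return labels


# Hypothetical local scores for the three LCD sentences used in the illustration slides.
scores = {"like LCD": 0.8, "must implement LCD": -0.1, "LCD is hot": 0.6}
links = [
    ("like LCD", "must implement LCD", "same"),
    ("must implement LCD", "LCD is hot", "same"),
]
final = iterative_classify(scores, links)
```

Here the weakly negative middle sentence flips to positive once its two positive neighbors are taken into account.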
49
49 Illustration I like the LCD feature We must implement the LCD I think the LCD is hot
50
50 DLOG I like the LCD feature We must implement the LCD I think the LCD is hot ?
51
51 DLOG I like the LCD feature We must implement the LCD I think the LCD is hot same
52
52 DLOG I like the LCD feature We must implement the LCD I think the LCD is hot same Reinforcing
55
55 Improving Polarity Classification using DLOG: Relational Features
Number of neighbors with polarity type x linked via frame link z
Number of neighbors with polarity type x linked via target link y
Etc.
where x ∈ {non-neutral, positive, negative}, y ∈ {same, alt}, z ∈ {reinforcing, non-reinforcing}
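The relational features listed above are neighbor counts, which can be sketched as follows. The edge tuple layout, the feature-key format, and the node names are assumptions chosen for illustration.

```python
from collections import Counter


def relational_features(node, edges, polarity):
    """Count neighbors of `node` by polarity type and link type.

    edges: list of (u, v, target_link, frame_link), with target_link in
           {"same", "alt"} and frame_link in {"reinforcing", "non-reinforcing"}.
    polarity: node -> "positive", "negative" or "neutral".
    """
    counts = Counter()
    for u, v, target_link, frame_link in edges:
        if node not in (u, v):
            continue
        other = v if node == u else u
        pol = polarity[other]
        # A positive or negative neighbor also counts as "non-neutral".
        kinds = [pol] + (["non-neutral"] if pol != "neutral" else [])
        for x in kinds:
            counts[(x, "target", target_link)] += 1
            counts[(x, "frame", frame_link)] += 1
    return counts


# Hypothetical three-node graph.
graph = [("a", "b", "same", "reinforcing"), ("b", "c", "alt", "reinforcing")]
known = {"a": "positive", "c": "negative"}
feats = relational_features("b", graph, known)
```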
56
56 Experiments
Is DLOG information useful and non-redundant with lexicon and dialog-act information for polarity classification?
Test how the DLOG performs for varying amounts of available annotations: from full neighborhood information to absolutely no neighborhood information
57
57 Findings
Link prediction is hard due to data skewness
The fully automatic system does not significantly improve polarity classification
58
58 Findings
Can discourse-level opinion graphs be exploited to improve the performance of opinion polarity classification? Yes!
8-16 percentage point improvement in F-measure when both the graph structure and the neighbors’ polarities are known
3-5 percentage point improvement in F-measure when only the graph structure is known
59
59 Upcoming releases
Reconcile NP coreference system
OpinionFinder version 2
–Improved performance on opinion and polarity recognition
MPQA opinion annotated corpus version 3
–Attitude and target span annotations added
–Opinion topic coreference annotations added
New lexicon representation and toolkit
–Uniform representation for different types of subjectivity clues
–Attributes, such as polarity, intensity
–Toolkit for adding entries to the dictionary and finding instances of them in a corpus
60
60 Jodange Demo
www.jodange.com
Commercialization of methods to identify and aggregate opinions and their attributes
Opinion Frame
Expression: “blasted”
Polarity: negative
Source: “The Australian Press”
Topic: “Italy”
NP Coref Resolution
Source: “The Australian Press” = “their” = “the reporters”
61
61 Interactions
UACs
–Riloff and Hovy have continued their collaboration on Information Extraction
–Riloff was on sabbatical at ISI through the summer
Other
–Riloff and Cardie are engaged in coreference resolution work with researchers at LLNL
–Wiebe is in the planning stages of work with researchers at LLNL on a news search project
–Riloff is collaborating with the Veterinary Information Network on research in pet health surveillance
–Wiebe is collaborating with researchers in the U. Pittsburgh Medical School on an NIH syndromic surveillance project
–Cardie is collaborating with researchers in the Cornell U. Information Science program on a project on the detection of deception in on-line communication
62
62 Research Plans
Information extraction for pet health surveillance
–collaborating with the Veterinary Information Network to study message board discussions between veterinary professionals
–goal is information extraction of entities and relations associated with pet health events, to discover:
  associations between symptoms and diseases
  spikes or trends in diseases (e.g., kidney failure cases during pet food recall)
  unusual or unexplained clusters of symptoms or diseases
63
63 Research Plans
Coreference resolution
–Developing methods to incorporate knowledge from the web as corroborating evidence for coreference resolution
  Ex: Orrin Hatch and Ralph Becker are possible antecedents for “the mayor” … issue web queries to decide!
–Experimenting with methods to automatically generate training data for domain-specific coreference resolution
–Developing a state-of-the-art coreference resolver called Reconcile that will be made publicly available
  testbed for research and analysis
  easily configurable to use different subcomponents
64
64 Research Plans
Selective information extraction with relevant regions
–developing weakly supervised learners to create relevant region classifiers
–learning several different types of clues with varying levels of reliability
  primary indicators are reliable enough to apply everywhere
  secondary indicators are only applied within relevant regions of text
Domains: terrorist events & disease outbreak reports
65
65 Architecture for Selective IE
[Diagram components: Unstructured Text; Relevant Sentence Classifier; Relevant Sentences; IE System; IE Patterns and Clues; Extractions; Pattern & Clue Learner; Relevant Region Training; Seed IE Patterns; Relevant and Irrelevant Documents]
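The selective IE idea (primary indicators applied everywhere, secondary indicators applied only inside relevant regions) can be sketched as below. The regular expressions, the keyword-based relevance test, and the example sentences are all hypothetical stand-ins; the actual patterns and classifiers are learned and are not shown in the slides.

```python
import re

# Hypothetical indicators for a terrorism-like domain.
PRIMARY = [re.compile(r"\bwas assassinated\b")]   # assumed reliable everywhere
SECONDARY = [re.compile(r"\battack on \w+")]      # assumed reliable only in relevant text


def is_relevant(sentence):
    # Stand-in for a learned relevant sentence classifier.
    return "bomb" in sentence.lower()


def extract(sentences):
    """Return (pattern, sentence) pairs for every indicator that fires."""
    found = []
    for sentence in sentences:
        patterns = PRIMARY + (SECONDARY if is_relevant(sentence) else [])
        for pattern in patterns:
            if pattern.search(sentence):
                found.append((pattern.pattern, sentence))
    return found


docs = [
    "The mayor was assassinated.",
    "The bomb attack on the embassy injured five.",
    "The press launched an attack on Italy.",
]
extractions = extract(docs)
```

The third sentence contains the secondary pattern but is skipped because it falls outside a relevant region, which is the behavior the slide describes.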
66
66 Research Plans Complete the releases of the MPQA corpus, OpinionFinder, & lexicon and toolkit
67
67 Discourse level opinion interpretation
Move from task-oriented meetings to blogs and other texts
Challenges
–Other types of relations between targets are more dominant than in task-oriented meetings
–Targets are often more abstract
Opportunities
–Richer lexically
–Can exploit parse trees
–Explicit discourse cues are more common
–Sentences are longer and more self-contained
68
68 Lexicon Explore different uses of words, to zero in on the subjective ones Example: benefit
69
69 Lexicon Example: benefit Very often objective, as a Verb: Children with ADHD benefited from a 15-course of fish oil
70
70 Lexicon Noun uses look more promising: The innovative economic program has shown benefits to humanity
71
71 Lexicon However, there are objective noun uses too: …tax benefits. …employee benefits. …tax benefits to provide a stable economy. …health benefits to cut costs.
72
72 Lexicon
Pattern: benefits as the head of a noun phrase containing a prepositional phrase
Matches this:
The innovative economic program has shown proven benefits to humanity
But none of these:
…tax benefits.
…employee benefits.
…tax benefits to provide a stable economy.
…health benefits to cut costs.
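A rough approximation of this pattern over part-of-speech-tagged tokens might look like the sketch below. It assumes hand-assigned Penn Treebank style tags and approximates "noun phrase containing a prepositional phrase" by checking that "benefits" is followed by a preposition, or by "to" plus a noun rather than a verb; a real implementation would more likely use parse trees.

```python
def matches_benefits_pp(tagged):
    """True if noun 'benefits' is directly followed by a prepositional phrase.

    tagged: list of (word, tag) pairs with Penn Treebank style tags (assumed input).
    """
    for i, (word, tag) in enumerate(tagged):
        if word.lower() != "benefits" or not tag.startswith("NN"):
            continue
        if i + 1 >= len(tagged):
            continue
        next_tag = tagged[i + 1][1]
        following_tag = tagged[i + 2][1] if i + 2 < len(tagged) else ""
        if next_tag == "IN":
            return True
        # "to" + noun is a preposition; "to" + verb is an infinitive, not a PP.
        if next_tag == "TO" and following_tag.startswith("NN"):
            return True
    return False


subjective = [("proven", "JJ"), ("benefits", "NNS"), ("to", "TO"), ("humanity", "NN")]
infinitive = [("tax", "NN"), ("benefits", "NNS"), ("to", "TO"), ("provide", "VB"),
              ("a", "DT"), ("stable", "JJ"), ("economy", "NN")]
bare = [("employee", "NN"), ("benefits", "NNS"), (".", ".")]
```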
73
73 Entry representation
be soft on crime
be 2
soft J 1 3
on 1 4
crime N
74
74 Attributive information
Entry: be soft on crime
Attribute values: true h sen; vp s n m h
Example: The Obama campaign rejected the notion that the senator might be vulnerable to accusations that he is soft on crime.
ngramPattern:
1:[morph:[lemma="be"] order:[distance="2" landmark="2"]]
2:[morph:[word="soft" majorClass="J"] order:[distance="1" landmark="3"]]
3:[morph:[word="on"] order:[distance="1" landmark="4"]]
4:[morph:[word="crime" majorClass="N"]]
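One possible reading of the ngramPattern above is that each element must occur at most `distance` tokens before its `landmark` element. The sketch below implements that reading only; the interpretation of `distance`, the toy lemma table for "be", and the decision to ignore `majorClass` are all assumptions, and the toolkit's real matching semantics may differ.

```python
# Pattern elements transcribed from the slide; "distance" is read as the
# maximum token offset to the next element (an assumed interpretation).
PATTERN = [
    {"lemma": "be", "distance": 2},
    {"word": "soft", "distance": 1},
    {"word": "on", "distance": 1},
    {"word": "crime"},
]
# Toy lemma table standing in for a real lemmatizer.
BE_FORMS = {"be", "am", "is", "are", "was", "were", "been", "being"}


def element_matches(element, token):
    token = token.lower()
    if "lemma" in element:
        return element["lemma"] == "be" and token in BE_FORMS
    return token == element["word"]


def match_at(tokens, pos, k=0):
    """True if element k matches at tokens[pos] and the rest follow in range."""
    if pos >= len(tokens) or not element_matches(PATTERN[k], tokens[pos]):
        return False
    if k == len(PATTERN) - 1:
        return True
    max_gap = PATTERN[k]["distance"]
    return any(match_at(tokens, pos + d, k + 1) for d in range(1, max_gap + 1))


def find_entry(tokens):
    return any(match_at(tokens, i) for i in range(len(tokens)))


hit = find_entry("accusations that he is soft on crime".split())
```

Under this reading the distance of 2 on the first element lets one intervening word through, as in "is very soft on crime".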
75
75 Research Plans
Learn subjective uses from corpora (bodies of texts)
Capture longer subjective constructions
Incorporate word senses
–OntoNotes (Hovy)
Add relevant knowledge about expressions
Exploit the richer knowledge resource to improve opinion extraction