Evaluating an Opinion Annotation Scheme Using a New Multi-perspective Question and Answer Corpus (AAAI 2004 Spring) Veselin Stoyanov, Claire Cardie, Diane Litman, Janyce Wiebe.

Presentation transcript:

Evaluating an Opinion Annotation Scheme Using a New Multi-perspective Question and Answer Corpus (AAAI 2004 Spring)
Veselin Stoyanov, Claire Cardie, Diane Litman, Janyce Wiebe
Dept. of Comp. Science, Cornell University; Dept. of Comp. Science, Univ. of Pittsburgh

2 Abstract
Two tasks:
- Constructing a data collection for MPQA.
- Evaluating the hypothesis that low-level perspective information can be useful for MPQA.
Outline: low-level perspective information; corpus creation; evaluation (answer probability, answer rank).
Conclusion: low-level perspective information can be an effective predictor of whether a text segment contains an answer to a question.

3 Introduction (1/2)
Hypothesis: opinion representations will be useful for practical NLP applications like MPQA.
Multi-perspective question answering (MPQA): answering opinion-oriented questions ("What is the sentiment in the Middle East towards war on Iraq?") rather than fact-based questions ("What is the primary substance used in producing chocolate?").

4 Introduction (2/2)
Goal is two-fold:
- Present a new corpus of multi-perspective questions and answers.
- Present the results of two experiments that employ the new Q&A corpus to investigate the usefulness of the opinion annotation scheme for multi-perspective vs. fact-based question answering.

5 Low-Level Perspective Information (1/3)
Suggested by Wiebe et al. (2002). Provides a basis for annotating opinions, beliefs, emotions, sentiment, and other private states expressed in text.
Private state: a general term used to refer to mental and emotional states that cannot be directly observed or verified. A private state can be:
- Explicitly stated ("John is afraid that Sue might fall.")
- Indirectly expressed by the selection of words and the style of language that the speaker or writer uses ("It is about time that we end Saddam's oppression."); such phrases are expressive subjective elements.

6 Low-Level Perspective Information (2/3)
Source: the experiencer of the private state, i.e. the person/entity whose opinion or emotion is being conveyed in the text. The overall source is the writer.
The writer may write about the private states of other people, so a single text segment can have multiple sources, and the nesting of sources can be deep and complex:
"Mary believes that Sue is afraid of the dark."
- Sue is afraid of the dark → source: Mary
- Mary believes … → source: the writer
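As a minimal sketch (the data structure and field names are illustrative assumptions, not the corpus's actual format), nested sources can be represented as a chain leading from the writer to the innermost experiencer:

```python
# Hypothetical encoding of nested sources for the example sentence.
sentence = "Mary believes that Sue is afraid of the dark."

private_states = [
    {"phrase": "believes", "source_chain": ["writer", "Mary"]},
    {"phrase": "is afraid", "source_chain": ["writer", "Mary", "Sue"]},
]

for ps in private_states:
    # Print each private-state phrase with its nested attribution.
    print(f"{ps['phrase']!r} attributed via {' -> '.join(ps['source_chain'])}")
```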

7 Low-Level Perspective Information (3/3)
Annotations:
- On: the text span that constitutes the private state or speech event phrase itself.
- Inside: the text segment inside the scope of the private state/speech event phrase.
Example: "Tom believes that Ken is an outstanding individual."
Attributes:
- Fact annotation: onlyfactive = yes
- Opinion annotation: onlyfactive = no, or an expressive subjective element.
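A minimal sketch of what one such annotation might look like for the example above; the field names and record layout are assumptions for illustration, not the scheme's actual attribute set:

```python
text = "Tom believes that Ken is an outstanding individual."

# Hypothetical annotation record: "on" marks the private-state/speech-event
# phrase, "inside" marks its scope, and onlyfactive distinguishes fact from
# opinion annotations.
annotation = {
    "on": "believes",
    "inside": "that Ken is an outstanding individual.",
    "onlyfactive": "no",            # opinion annotation; a fact annotation would say "yes"
    "source": ["writer", "Tom"],    # nested source chain
}

print("on:     ", annotation["on"])
print("inside: ", annotation["inside"])
```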

8 The MPQA NRRC Corpus
Source: U.S. Foreign Broadcast Information Service (FBIS).
Using the perspective annotation framework, Wiebe et al. (2003) manually annotated a considerable number of documents to form the NRRC (Northeast Regional Research Center) corpus.
Interannotator agreement, using the measure agr(a||b): the proportion of a's annotations that were also found by b.
- 85% on explicit private states
- 50% on expressive subjectivity
Conclusion: the agreement results indicate that annotating opinions is a feasible task.
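The directional agreement measure can be computed roughly as below; treating "found by b" as span overlap is an assumption here, since the exact matching rule is not spelled out on the slide:

```python
def overlaps(span_a, span_b):
    """True if two (start, end) character spans overlap."""
    return span_a[0] < span_b[1] and span_b[0] < span_a[1]

def agr(a, b):
    """agr(a||b): proportion of annotator a's spans that annotator b also found."""
    if not a:
        return 0.0
    found = sum(1 for sa in a if any(overlaps(sa, sb) for sb in b))
    return found / len(a)

# Toy example: b finds 2 of a's 3 spans -> agr(a||b) = 0.67.
a_spans = [(0, 5), (10, 20), (30, 40)]
b_spans = [(2, 6), (12, 18)]
print(round(agr(a_spans, b_spans), 2))
```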

9 MPQA Corpus Creation (1/3)
The question and answer (Q&A) corpus is used to evaluate the low-level perspective annotations in the context of opinion-oriented (opinion) and fact-based (fact) question answering.
98 documents, 4 topics (kyoto, mugabe, humanrights, venezuela) → 19~33 documents for each topic.
The 98 documents were drawn from a 270,000-document collection using the SMART retrieval system.

10 MPQA Corpus Creation (2/3)
Question creation: a volunteer was given 2 documents on each topic and a set of instructions, and produced 15 opinion (o) and 15 fact (f) questions for each topic.
Difficulties: the classification associated with each question (fact/opinion) did not always seem appropriate.
Example: "Did any prominent Americans plan to visit Venezuela immediately following the 2002 coup?" Fact? Opinion?

11 MPQA Corpus Creation (3/3)
Annotating answers: answer annotations were manually added for each text segment in the Q&A corpus that constituted or contributed to an answer to any question. Attributes: topic, question number, confidence.
Difficulties:
- Opinionated documents often express answers to the questions only very indirectly; it is hard even for humans to decide what constitutes an answer to a question.
- It was hard for human annotators to judge what can be considered an expression of the opinion of collective entities, and often the conjecture required a significant amount of background information.

12 Evaluation of Perspective Annotations for MPQA (1/5)
Two different experiments evaluate the usefulness of the perspective annotations in the context of fact- and especially opinion-based QA:
- Answer probability: the number of answer segments classified as FACT and OPINION, respectively, that answer each question.
- Answer rank: determine the rank of the first retrieved sentence that correctly answers the question.

13 Evaluation of Perspective Annotations for MPQA (2/5)
Multiple criteria determine whether a text segment should be considered FACT or OPINION based on the underlying perspective annotations:
- 2 association criteria: determine which perspective annotations should be considered associated with an arbitrary text segment.
- 4 classification criteria: categorize the segment as one of FACT or OPINION.
The criteria are biased towards opinion annotations → opinion annotations are expected to be more discriminative.
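The exact criteria are listed in the tables at the end of the slides; the sketch below shows only one plausible association/classification pair (associate by span overlap, label OPINION if any associated annotation is an opinion), which is an assumption and not necessarily one of the criteria actually used:

```python
def overlaps(a, b):
    """True if two (start, end) character spans overlap."""
    return a[0] < b[1] and b[0] < a[1]

def classify_segment(segment_span, annotations):
    """Label a text segment FACT or OPINION from its associated annotations.

    Association criterion (assumed): an annotation is associated with the
    segment if their spans overlap.
    Classification criterion (assumed): OPINION if any associated annotation
    is an opinion annotation (onlyfactive == "no"), otherwise FACT.
    """
    associated = [ann for ann in annotations if overlaps(segment_span, ann["span"])]
    if any(ann["onlyfactive"] == "no" for ann in associated):
        return "OPINION"
    return "FACT"
```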

14 Evaluation of Perspective Annotations for MPQA (3/5)
Answer probability: P(FACT/OPINION answer | fact/opinion question).
Procedure: categorize each answering text segment as OPINION or FACT based on the criteria, then count how many FACT/OPINION segments answer fact/opinion questions.
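A rough sketch of that counting step, assuming the answer annotations have already been paired with their question types (the input format here is an assumption):

```python
from collections import Counter

def answer_probabilities(pairs):
    """Estimate P(segment class | question type) from (question_type, segment_class) pairs."""
    counts = Counter(pairs)
    totals = Counter(q for q, _ in pairs)
    return {(q, c): n / totals[q] for (q, c), n in counts.items()}

# Toy data, not the paper's numbers: fact questions answered mostly by FACT segments.
pairs = [("fact", "FACT")] * 8 + [("fact", "OPINION")] * 2 + \
        [("opinion", "OPINION")] * 6 + [("opinion", "FACT")] * 4
print(answer_probabilities(pairs))
```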

15 Evaluation of Perspective Annotations for MPQA (4/5)
Answer rank procedure:
- Divide the documents into a set of text segments.
- Run an IR algorithm with each question as the query to obtain a ranked list of sentences.
- Apply one of two filters to remove OPINION answers for fact questions and vice versa (opinion: overlap any; fact: cover all), producing a modified ranked list of answers.
- Evaluation: determine the rank of the first correct retrieved sentence, where correct means any part of it is annotated as an answer to the question.
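A sketch of the ranking/filtering step; the function shape, the filter semantics, and the correctness test are assumptions made for illustration, not the authors' implementation:

```python
def first_correct_rank(ranked_sentences, question_type, answer_sentence_ids,
                       segment_class, use_filter=False):
    """Return the 1-based rank of the first retrieved sentence annotated as an answer.

    ranked_sentences: sentence ids in retrieval order.
    segment_class: maps a sentence id to "FACT" or "OPINION".
    If use_filter is set, drop sentences whose class conflicts with the
    question type before computing the rank.
    """
    wanted = "OPINION" if question_type == "opinion" else "FACT"
    rank = 0
    for sent_id in ranked_sentences:
        if use_filter and segment_class(sent_id) != wanted:
            continue
        rank += 1
        if sent_id in answer_sentence_ids:
            return rank
    return None  # the filter may discard every answering segment
```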

16 Evaluation of Perspective Annotations for MPQA (5/5)
Discussion:
- Low-level perspective information can be a reliable predictor of whether a given segment of a document answers an opinion/fact question.
- Low-level perspective information may be used to re-rank potential answers, using the knowledge that the probability that a fact answer appears in an OPINION segment, and vice versa, is very low.
- Using filters can sometimes cause all answering segments for a particular question to be discarded → it is unrealistic to use the FACT/OPINION segment classification as an absolute indicator of whether the segment can answer a fact/opinion question.

17 Conclusion and Future Work
Both tasks (constructing a data collection and evaluating usefulness) provided insights into the potential difficulties of MPQA and the usefulness of the low-level perspective information.
Main problems:
- Deciding what constitutes an answer.
- The presence of indirect answers (expressive subjectivity): most answers to opinion questions have to be deduced.
Low-level perspective information can be an effective predictor of whether a text segment contains an answer to a question (given the type of the question), but should NOT be used as an absolute indicator, especially with a limited number of documents.

18 ThanQ

19 Table 1: Attributes

20 Table 2: Questions in the Q&A collection by topic

21 2 association criteria

22 4 classification criteria

23 Table 3: Answer Probability (cells show P(ANSWER class | question type); 120/415 answers annotated for fact/opinion questions)
Observations noted on the slide: P(F|f) >> P(O|f); P(O|o) > P(F|o); P(F|f) >> P(F|o); P(O|o) >> P(O|f); the maxima are P(F|f) and P(O|o).

24 Table 4: Answer Rank
Observations noted on the slide: the filter can remove all answer segments for some questions; Rank(overlap) <= Rank(unfiltered) for opinion questions; Rank(cover) <= Rank(unfiltered) for fact questions; results are mixed but at least as good as unfiltered.