REACTION Workshop 2011.01.06. Task 1 – Progress Report & Plans. Lisbon, PT and Austin, TX. Mário J. Silva, University of Lisbon, Portugal.



Grants (paid by REACTION)
- Sílvio Moreira (BI: Oct 1, 2010 – March 31, 2011)
- João Ramalho (BIC: Jan 1, 2011 – April 31, 2011)

Mining resources
Development of robust linguistic resources to process different types and genres of texts:
- knowledge resources about media personalities: recognizing and resolving references to named entities;
- sentiment lexicons and grammars: detecting the polarity of opinions about media personalities;
- annotated corpora: training different text classifiers and evaluating classification procedures.

Mining resources
- POWER – Political Ontology for Web Entity Retrieval
- SentiLex-PT01 – Sentiment Lexicon for Portuguese
- SentiCorpus-PT09 – Sentiment-annotated corpus of user comments on political debates

POWER
POWER is an ontology that formalizes the domain knowledge defining a political landscape, i.e., the political actors, their roles in the political scene, and their relationships and interactions. The ontology focuses on describing:
- Politicians
- Political Institutions with different levels of authority (International, National, Regional, ...)
- Political Associations
- Political Affiliations and Endorsements
- Elections
- Mandates

POWER
Currently, the ontology describes the following from the Portuguese political scene:
- 587 Political actors
- 17 editions of Political Institutions
- 16 Political Associations
- 900 Mandates
- 1 Election
- 6 Candidate Lists
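To make the kinds of entities listed above more concrete, here is a minimal Python sketch of how a few of them might be modeled in memory. It is only an illustration: the class names, field names, and the `active_on` helper are assumptions made for this example and are not taken from the POWER ontology itself.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Politician:
    name: str


@dataclass
class Institution:
    name: str
    level: str  # e.g. "International", "National", "Regional"


@dataclass
class Mandate:
    """Hypothetical record linking a politician to a role in an institution."""
    holder: Politician
    institution: Institution
    role: str
    start: date
    end: Optional[date] = None  # None while the mandate is still running

    def active_on(self, day: date) -> bool:
        # A mandate is active from its start date until its end date, inclusive.
        return self.start <= day and (self.end is None or day <= self.end)


# Placeholder data, not real entries from the ontology.
mandate = Mandate(
    holder=Politician("Politician A"),
    institution=Institution("Parliament", "National"),
    role="Deputy",
    start=date(2009, 10, 15),
    end=date(2011, 6, 19),
)
```

A real ontology would express these as classes and properties in an ontology language rather than as Python objects; the sketch only shows how mandates tie actors, roles, and institutions together over time.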

SentiLex-PT01
SentiLex-PT01 is a sentiment lexicon for Portuguese made up of 6,321 adjective lemmas and 25,406 inflected forms.
- The sentiment entries correspond to human predicate adjectives.
- The sentiment attributes described in SentiLex-PT01 concern the predicate polarity, the target of sentiment, and the polarity assignment (performed either manually or automatically, by JALC).

SentiLex-lem-PT01: 6,321 lemmas
    abatido.PoS=Adj;TG=HUM;POL=-1;ANOT=MAN
    abelhudo.PoS=Adj;TG=HUM;POL=-1;ANOT=MAN
    abençoado.PoS=Adj;TG=HUM;POL=1;ANOT=JALC
    atrevido.PoS=Adj;TG=HUM;POL=0;ANOT=MAN
    bem-educado.PoS=Adj;TG=HUM;POL=1;ANOT=MAN
    brega.PoS=Adj;TG=HUM;POL=-1;ANOT=JALC
    violento.PoS=Adj;TG=HUM;POL=-1;ANOT=JALC
Recently made publicly available on:
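The lemma entries shown on this slide follow a regular `lemma.ATTR=VALUE;ATTR=VALUE` pattern, so they are easy to read programmatically. The following Python sketch parses one such line; the function name `parse_lemma_entry` is an assumption for this example, and the parser relies only on the layout visible in the sample entries.

```python
def parse_lemma_entry(line: str) -> dict:
    """Parse one 'lemma.ATTR=VAL;ATTR=VAL' line into a dictionary."""
    # The lemma is everything before the first '.'; the sample lemmas contain no dots.
    lemma, _, attrs = line.strip().partition(".")
    entry = {"lemma": lemma.strip()}
    for pair in attrs.split(";"):
        key, _, value = pair.strip().partition("=")
        entry[key] = value
    # POL is the polarity: -1 (negative), 0 (neutral) or 1 (positive) in the samples.
    entry["POL"] = int(entry["POL"])
    return entry


example = parse_lemma_entry("abatido.PoS=Adj;TG=HUM;POL=-1;ANOT=MAN")
```

Here `example["POL"]` is the integer `-1` and `example["ANOT"]` is `"MAN"`, i.e. the polarity was assigned manually.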

SentiLex-flex-PT01: 25,406 inflected forms
    abatida,abatido.PoS=Adj;GN=fs;TG=HUM;POL=-1;ANOT=MAN
    abatidas,abatido.PoS=Adj;GN=fp;TG=HUM;POL=-1;ANOT=MAN
    abatido,abatido.PoS=Adj;GN=ms;TG=HUM;POL=-1;ANOT=MAN
    abatidos,abatido.PoS=Adj;GN=mp;TG=HUM;POL=-1;ANOT=MAN
    bem-educada,bem-educado.PoS=Adj;GN=fs;TG=HUM;POL=1;ANOT=MAN
    bem-educadas,bem-educado.PoS=Adj;GN=fp;TG=HUM;POL=1;ANOT=MAN
    bem-educado,bem-educado.PoS=Adj;GN=ms;TG=HUM;POL=1;ANOT=MAN
    bem-educados,bem-educado.PoS=Adj;GN=mp;TG=HUM;POL=1;ANOT=MAN
    brega,brega.PoS=Adj;GN=fs;TG=HUM;POL=-1;ANOT=JALC
    brega,brega.PoS=Adj;GN=ms;TG=HUM;POL=-1;ANOT=JALC
    bregas,brega.PoS=Adj;GN=mp;TG=HUM;POL=-1;ANOT=JALC
    bregas,brega.PoS=Adj;GN=fp;TG=HUM;POL=-1;ANOT=JALC
Recently made publicly available on:
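The inflected-form entries pair each surface form with its lemma (`form,lemma.ATTR=VALUE;...`), which suggests a simple lookup table from form to lemma and polarity. The sketch below builds such a table from three of the sample lines and sums the polarities of known tokens. The helper names and the naive summing are assumptions for illustration only, not the project's classification method.

```python
from typing import Dict, Iterable, Tuple

# Three lines copied from the sample entries on the slide.
FLEX_SAMPLE = """\
abatida,abatido.PoS=Adj;GN=fs;TG=HUM;POL=-1;ANOT=MAN
bem-educados,bem-educado.PoS=Adj;GN=mp;TG=HUM;POL=1;ANOT=MAN
bregas,brega.PoS=Adj;GN=mp;TG=HUM;POL=-1;ANOT=JALC
"""


def load_flex(text: str) -> Dict[str, Tuple[str, int]]:
    """Map each inflected form to a (lemma, polarity) pair."""
    index: Dict[str, Tuple[str, int]] = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        head, _, attrs = line.partition(".")   # 'form,lemma' | 'ATTR=VAL;...'
        form, _, lemma = head.partition(",")
        fields = dict(pair.split("=", 1) for pair in attrs.split(";"))
        # Forms shared by several gender/number readings (e.g. 'brega')
        # simply overwrite each other; in the samples their polarity agrees.
        index[form] = (lemma, int(fields["POL"]))
    return index


def polarity_sum(tokens: Iterable[str], index: Dict[str, Tuple[str, int]]) -> int:
    """Naive score: add up the polarity of every token found in the lexicon."""
    return sum(index[t][1] for t in tokens if t in index)


flex_index = load_flex(FLEX_SAMPLE)
score = polarity_sum(["bem-educados", "e", "bregas"], flex_index)
```

With these three entries, `flex_index["abatida"]` is `("abatido", -1)`, and the token list above scores `0` because one positive and one negative adjective cancel out.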

SentiCorpus-PT09
SentiCorpus-PT09 is a collection of comments posted by readers of the Público newspaper on a series of 10 news articles, each covering a televised face-to-face debate between the main candidates in the 2009 parliamentary elections.
- The collection is composed of 2,795 comments (~8,000 sentences).
- 3,537 sentences, from 736 comments (27% of the corpus), were manually labeled with sentiment information.
- Sentiment annotation involves different relevant dimensions, such as polarity, opinion target, target mention and verbal irony.

SentiCorpus-PT09
- The sentence is the minimum unit of analysis, but some annotations span a whole comment;
- Each sentence may convey different opinions;
- Each opinion may have different specific targets;
- The targets, which can be omitted in the text, correspond to human entities;
- The entity mentions are classifiable into 7 syntactic-semantic categories;
- The opinionated sentences may be characterized according to their polarity and intensity (ranging from -2 to 2);
- Each opinionated sentence may have a literal or ironic interpretation.
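The annotation dimensions listed above can be pictured as a small record structure. The Python sketch below is one possible in-memory representation, assuming hypothetical class and field names; it does not reproduce the corpus file format, and the 7 mention categories are left as a free-text field because the slides do not name them.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Opinion:
    target: str                              # human entity the opinion is about
    polarity: int                            # polarity and intensity, from -2 to 2
    mention: Optional[str] = None            # text of the mention; None if the target is omitted
    mention_category: Optional[str] = None   # one of the 7 syntactic-semantic categories
    ironic: bool = False                     # True for an ironic rather than literal reading

    def __post_init__(self) -> None:
        if not -2 <= self.polarity <= 2:
            raise ValueError("polarity must be between -2 and 2")


@dataclass
class Sentence:
    text: str
    opinions: List[Opinion] = field(default_factory=list)  # a sentence may convey several opinions

    @property
    def is_opinionated(self) -> bool:
        return bool(self.opinions)


# Placeholder content, not an actual corpus sentence.
sentence = Sentence(
    text="(sentence text)",
    opinions=[Opinion(target="Candidate A", polarity=-2, mention="ele", ironic=True)],
)
```

Keeping opinions as a list per sentence mirrors the point that one sentence can carry several opinions, each with its own target.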


Main findings
The real challenge in performing opinion mining on user-generated content is correctly identifying the positive opinions:
- Positive opinions are less frequent than negative opinions (20%);
- Positive opinions are particularly exposed to verbal irony (11%).
Other opinion mining challenges are related to the entity recognition and co-reference resolution sub-tasks:
- Mentions of human targets are frequently made through pronouns, definite descriptions and nicknames;
- The most frequent type of mention is the person name, but it only covers 36% of the analyzed cases.

Next steps, April 2011:
POWER
- Populating the ontology, using text-mining approaches
- Internal release
SentiLex-PT01
- Exploring other methods and algorithms (SVM, Active Learning) for automatic polarity classification
- Enlarging the sentiment lexicon (verbs, predicate nouns, idiomatic expressions)

Next steps, August 2011:
POWER
- First release to the general public via a SPARQL endpoint and a web user interface
SentiCorpus-PT09
- Publicly available
- Analysis and (semi-automated) annotation of a collection of documents from industrial and social media, over a period of 6 months