Towards Semantic Affect Sensing in Sentences
Alexander Osherenko


Goal
  – Language-independent approach to affect sensing in textual corpora containing spontaneous emotional dialogues
Method
  – Extracting features and evaluating the resulting datasets with standard data mining approaches, with language independence in mind

Overview
Properties of classified corpora
Feature extraction
Results:
  – SAL
  – AIBO
  – SmartKom
Conclusions
Outlook

Properties of classified dialogues
Corpora may be in different languages.
There are no obvious signs of emotional meaning.
Utterances are short.
Utterances can be grammatically incorrect and can contain repairs, repetitions, and inexact wordings.
Utterances can convey contradictory emotional meaning.
Utterances are interdependent (they can be seen as a continuous stream of information).

Feature extraction
Most frequent (stemmed) utterance words in the current corpus (in most cases only a seventh or an eighth of the whole list)
History as the most frequent (stemmed) words in the current and the n previous utterances (again only a seventh or an eighth of the list)
No dependence on an affect word list, e.g. Whissell's Dictionary of Affect
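The two feature types above can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: the toy `stem` function stands in for a real stemmer, the names `build_vocabulary` and `extract_features` are hypothetical, and the use of word counts (rather than, say, binary presence) is an assumption the slides do not settle.

```python
from collections import Counter

def stem(word):
    """Toy suffix stripper standing in for a real stemmer (assumption)."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def tokenize(utterance):
    """Lowercase, split on whitespace, and stem every token."""
    return [stem(w) for w in utterance.lower().split()]

def build_vocabulary(utterances, fraction=8):
    """Keep the most frequent 1/fraction of all stemmed words in the corpus."""
    counts = Counter(w for u in utterances for w in tokenize(u))
    keep = max(1, len(counts) // fraction)
    return [w for w, _ in counts.most_common(keep)]

def extract_features(utterances, vocabulary, history=0):
    """One row per utterance: counts of each vocabulary word over the
    current utterance and the `history` utterances preceding it."""
    rows = []
    for i in range(len(utterances)):
        window = utterances[max(0, i - history): i + 1]
        counts = Counter(w for u in window for w in tokenize(u))
        rows.append([counts[w] for w in vocabulary])
    return rows
```

With `history=0` each row describes only the current utterance; larger values fold the preceding utterances into the same row, which is one way to model the "continuous stream" property. No affect dictionary is consulted at any point.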

Dialogue corpora
SAL
  – QUB (Cowie 2006)
AIBO
  – Univ. of Erlangen (Batliner et al. 2004)
SmartKom
  – Univ. of Munich (Steininger et al. 2002)

SAL
Instance – utterance in transliteration
670 utterances annotated in FEELTRACE
Agreement – %
3 affect states
English corpus
FEELTRACE scores (mapped onto the classes pos./neutral/neg.)
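Mapping a continuous score onto the three classes could look like the sketch below. The slides do not give the actual mapping, so the score range of -1 to 1, the symmetric `threshold` of 0.2, and the function name `score_to_class` are all assumptions made for illustration.

```python
def score_to_class(score, threshold=0.2):
    """Map a continuous score to pos./neutral/neg.

    The threshold is a hypothetical value, not the one used for SAL.
    """
    if score > threshold:
        return "pos"
    if score < -threshold:
        return "neg"
    return "neutral"
```

A wider neutral band (larger `threshold`) shifts borderline utterances into the neutral class, which changes the class balance the classifier sees.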

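Evaluation for a three-class problem in SAL
SMO in WEKA
Cross-validation (10-fold)
Overall number of words (value not preserved in the transcript)
Table columns: rev., precision, recall, fMeasure, #words, history
Table rows: maj, cc, dr, em, jd (values not preserved in the transcript)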
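For readers unfamiliar with the reported measures, the sketch below shows how 10-fold splits and per-class precision, recall, and F-measure can be computed. It is plain Python rather than WEKA, the round-robin fold assignment is a simplification (WEKA's own fold assignment may differ), and the function names are hypothetical.

```python
def k_fold_indices(n_instances, k=10):
    """Yield (train, test) index lists; instance i is tested in fold i % k."""
    for fold in range(k):
        test = [i for i in range(n_instances) if i % k == fold]
        train = [i for i in range(n_instances) if i % k != fold]
        yield train, test

def per_class_scores(gold, predicted):
    """Return {label: (precision, recall, f_measure)} for every label seen."""
    scores = {}
    pairs = list(zip(gold, predicted))
    for label in sorted(set(gold) | set(predicted)):
        tp = sum(g == label and p == label for g, p in pairs)
        fp = sum(g != label and p == label for g, p in pairs)
        fn = sum(g == label and p != label for g, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        total = precision + recall
        f_measure = 2 * precision * recall / total if total else 0.0
        scores[label] = (precision, recall, f_measure)
    return scores
```

In a cross-validation run, a classifier would be trained on each `train` list and its predictions on the matching `test` list collected before calling `per_class_scores` once over all instances.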

AIBO
Instance – paragraph in transliteration
3990 instances
Sparse transliteration texts (commands to AIBO)
4 affect states
German corpus

Evaluation for a four-class problem in AIBO
SMO in WEKA
Learning/testing sets (1738/2252 and, swapped, 2252/1738)
Only word features, no history features
Overall number of words – 488 (!)
Table columns: precision, recall, fMeasure, #words, history (values not preserved in the transcript)
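The swapped learning/testing protocol can be illustrated as follows. This is only a sketch of the protocol: the majority-class baseline is a stand-in chosen to keep the example self-contained (the slides use SMO in WEKA), accuracy is used instead of the slide's measures for brevity, and the names `swap_evaluate` and `train_majority` are hypothetical.

```python
from collections import Counter

def train_majority(part):
    """Stand-in 'classifier': always predicts the most frequent training label."""
    majority = Counter(label for _, label in part).most_common(1)[0][0]
    return lambda instance: majority

def swap_evaluate(part_a, part_b, train):
    """Train on A and test on B, then swap the roles.

    Each part is a list of (instance, label) pairs.
    Returns the two accuracies as (a_to_b, b_to_a).
    """
    results = []
    for train_part, test_part in ((part_a, part_b), (part_b, part_a)):
        model = train(train_part)
        correct = sum(model(x) == y for x, y in test_part)
        results.append(correct / len(test_part))
    return tuple(results)
```

Running both directions shows whether a result depends on which part happened to be used for learning.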

SmartKom
Wizard of Oz scenario
Instance – turn
817 annotated instances
11 user states
German corpus

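Evaluation for an n-class problem in SmartKom
SMO in WEKA
Cross-validation (10-fold)
Overall number of words (value not preserved in the transcript)
Table columns: different affect states, #w, history, P, R, fM (values not preserved in the transcript)
Affect state sets evaluated:
  – joyful-strong, joyful-weak, surprised, neutral, helpless, angry-weak, angry-strong
  – joyful, surprised, neutral, helpless, angry
  – joyful, neutral, helpless, angry
  – joyful, neutral, problem
  – no problem, helpless, angry
  – no problem, problem
  – not angry, angry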
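The coarser class sets on this slide appear to be derived by merging finer user states. The sketch below covers only the two merges that the label names themselves make clear: dropping the strong/weak intensity and the binary angry/not angry split. How the remaining sets were formed is not stated, so nothing else is modeled; all function names are hypothetical.

```python
SEVEN_STATES = [
    "joyful-strong", "joyful-weak", "surprised", "neutral",
    "helpless", "angry-weak", "angry-strong",
]

def drop_intensity(label):
    """Merge '-strong' and '-weak' variants into one class."""
    for suffix in ("-strong", "-weak"):
        if label.endswith(suffix):
            return label[: -len(suffix)]
    return label

def angry_vs_rest(label):
    """Binary split: any angry variant versus everything else."""
    return "angry" if drop_intensity(label) == "angry" else "not angry"
```

Relabeling the same instances this way and re-running the evaluation shows how classification quality changes with the granularity of the affect states.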

Conclusions
Neither a higher number of words nor a longer history alone yields better classification; their combination does.
The extracted features can serve as a basis (AIBO results: sparse data, repetitious content).
Classification errors may have been caused by a discrepancy between the rating and the corresponding text.
The features are language-independent.

Outlook
Further feature extraction (combination, history of POS groups?)
Studying erroneous instances (esp. in SmartKom)
Multimodality (prosodic/lexical)
Application to journalistic articles, e.g. movie reviews
Is 100% precision the goal?