Mark Cieliebak Jan Deriu Dominik Egger Fatih Uzdilli

Presentation transcript:

A Twitter Corpus and Benchmark Resources for German Sentiment Analysis

Mark Cieliebak, Jan Deriu, Dominik Egger, Fatih Uzdilli
Zurich University of Applied Sciences (ZHAW), Winterthur, Switzerland
SpinningBytes AG, Küsnacht, Switzerland

New Corpus: SB-10k
- 9738 German tweets
- Labels: "positive", "neutral", "negative" and "mixed"
- Each tweet annotated by 3 annotators
- Designed to cover a wide variety of unigrams and topics

Available Corpora
Previously existing corpora in German:
- DAI Tweets (1800 samples): too small for training complex models
- MGS Corpus (109’130 samples): low quality of annotations
- PotTS Corpus (7992 samples): annotations only on phrase level

Benchmark for Sentiment Analysis in German
Two systems: an SVM system and a CNN system.

Features:
- n-grams, n = 1...4
- POS n-grams, n = 3...5
- Non-contiguous n-grams, n = 3...5
- Character n-grams, n = 3...6
- Number of upper-cased tokens, number of hashtags, number of POS tags
- Maximum number of continuous punctuation marks; whether the last token is punctuation (?, !)
- Number of elongated words, number of negated tokens
- Lexicons: NRC-emotion, BingLiu, MPQA, NRC-HashtagSentiment, Sentiment140, Sentiment140-3-class, RottenTomatoes-3-class

Importance of high-quality word embeddings

Results
- CNN outperforms SVM in all but one case (highlighted in red)
- SB-10k generalizes better than MGS to unseen data
- Resulting F1-scores match the state of the art

Corpus and source code are publicly available at www.spinningbytes.com/resources
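To make the surface features in the feature list more concrete, here is a minimal Python sketch of how a few of them could be computed for a single tweet. It is not the authors' implementation (that is available at the URL above). The function names `char_ngrams` and `surface_features`, the whitespace tokenization, the punctuation set, and the definition of an elongated word as a letter repeated three or more times are all assumptions made for illustration.

```python
import re
from collections import Counter


def char_ngrams(text, n_min=3, n_max=6):
    """Count character n-grams for n = n_min..n_max (3..6 as on the slide)."""
    counts = Counter()
    for n in range(n_min, n_max + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


def surface_features(tweet):
    """Compute a few simple count features (hypothetical definitions)."""
    # Assumption: plain whitespace tokenization, not a tweet-aware tokenizer.
    tokens = tweet.split()
    # Assumption: a "continuous punctuation" run is two or more of these marks.
    punct_runs = re.findall(r"[!?.,;:]{2,}", tweet)
    last = tokens[-1] if tokens else ""
    # Assumption: an elongated word repeats the same letter 3+ times in a row.
    elongated = re.compile(r"([^\W\d_])\1{2,}")
    return {
        "n_upper_tokens": sum(1 for t in tokens if len(t) > 1 and t.isupper()),
        "n_hashtags": sum(1 for t in tokens if len(t) > 1 and t.startswith("#")),
        "max_punct_run": max((len(r) for r in punct_runs), default=0),
        "last_token_punct": last.endswith(("?", "!")),
        "n_elongated": sum(1 for t in tokens if elongated.search(t)),
    }


if __name__ == "__main__":
    example = "SUPER Film!!! Sooo gut #kino #Film ?"
    print(surface_features(example))
    print(char_ngrams("gut").most_common(3))
```

In a full pipeline, counts like these would typically be concatenated with the word, POS, and lexicon features into one vector per tweet before training the classifier.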