Homework 3 Progress Presentation - Meet Shah
Goal: Identify whether a tweet is sarcastic or not.
Data Set and Ground Truth: Three datasets are used. The first contains only sarcastic tweets, the second contains tweets with which I can differentiate sarcastic from non-sarcastic tweets, and the third contains regular tweets.
Features Chosen:
Punctuation-based features.
Replace usernames and hashtags with [USER] and [HASHTAG], respectively.
Replace locations and links with [LOCATION] and [LINK], respectively.
Find tweets with words in capital letters.
Use emnlp_dict.txt to replace every occurrence of the listed words with their proper spelling.
Find emoticons (regex or the emoji4j library).
If a tweet contains two or more sentences, check the polarity of each sentence.
#sarcasm #sarcastic
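The replacement steps on this slide could be sketched roughly as follows in Python. This is only an illustration under assumptions: the regular expressions, the function names normalize_tweet and find_emoticons, and the in-memory spelling dictionary are hypothetical (the actual layout of emnlp_dict.txt is not assumed here), and location replacement is left out because it would need a place-name list that the slides do not describe.

```python
import re

# Hypothetical patterns; real tweets may need broader rules.
LINK_RE = re.compile(r"https?://\S+|www\.\S+")
USER_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#\w+")
# Western-style emoticons only, e.g. :) ;-P =D
EMOTICON_RE = re.compile(r"[:;=][\-o*']?[)\](\[dDpP/\\|]")


def normalize_tweet(text, spelling=None):
    """Replace links, usernames and hashtags with placeholder tokens,
    then map noisy words to their proper spelling."""
    spelling = spelling or {}
    # Links go first, because a URL can itself contain '#' or '@'.
    text = LINK_RE.sub("[LINK]", text)
    text = USER_RE.sub("[USER]", text)
    text = HASHTAG_RE.sub("[HASHTAG]", text)
    # Whitespace tokenization is an assumption; punctuation attached
    # to a word will prevent a dictionary match.
    tokens = [spelling.get(tok.lower(), tok) for tok in text.split()]
    return " ".join(tokens)


def find_emoticons(text):
    """Return the emoticons found in the text. Run this after links are
    replaced, otherwise the ':/' in 'http://' is reported as an emoticon."""
    return EMOTICON_RE.findall(text)


# Toy stand-in for the spelling dictionary.
toy_spelling = {"u": "you", "r": "are"}
cleaned = normalize_tweet("@bob u r so smart #sarcasm http://t.co/x", toy_spelling)
emoticons = find_emoticons("great :) thanks ;-P")
```

Replacing links before usernames and hashtags is a design choice of this sketch, not something the slides specify.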
Feature Extraction:
Compute all the punctuation-based features.
Extract tweets that contain words in capital letters.
Remove tweets that contain a link.
Replace usernames with [USER] and hashtags with [HASHTAG].
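As a rough sketch of the extraction step, the Python below counts a few punctuation marks and all-caps words per tweet and skips tweets that contain a link. The particular punctuation marks, the feature names, and the functions punctuation_features and extract_features are assumptions made for illustration; the slides do not list which punctuation features are used.

```python
import re

LINK_RE = re.compile(r"https?://\S+|www\.\S+")
# A "capital word" is taken here to mean two or more consecutive
# uppercase letters, which is an assumption.
CAPS_WORD_RE = re.compile(r"\b[A-Z]{2,}\b")
ELLIPSIS_RE = re.compile(r"\.{2,}")


def punctuation_features(text):
    """Count simple punctuation and capitalization cues in one raw tweet.
    Apply this before placeholder replacement, since tokens such as
    [USER] would otherwise be counted as capital words."""
    return {
        "exclamations": text.count("!"),
        "questions": text.count("?"),
        "ellipses": len(ELLIPSIS_RE.findall(text)),
        "quotes": text.count('"'),
        "capital_words": len(CAPS_WORD_RE.findall(text)),
    }


def extract_features(tweets):
    """Drop tweets that contain a link and build one feature row per
    remaining tweet."""
    rows = []
    for tweet in tweets:
        if LINK_RE.search(tweet):
            continue  # tweets with links are removed
        rows.append(punctuation_features(tweet))
    return rows


sample = [
    "Oh GREAT, another Monday!!! So fun...",
    "read this http://t.co/abc",
    "nice",
]
rows = extract_features(sample)
```

Each returned row is a plain dictionary, so it can be passed to whatever classifier or vectorizer is chosen later.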
Thank You