Analyzing #POTUS Sentiment on Twitter to Predict Public Opinion on Presidential Issues
Austin Karingada, Jacob Handy, Adviser: Dr. Dongchul Kim
Department of Computer Science, College of Engineering and Computer Science

GOALS
Some of the goals of this project are:
- Predict public opinion on a presidential policy by searching for sentiment patterns in past tweets using #POTUS.
- Implement an SVM and a Naïve Bayes algorithm and compare the accuracy of their predictions.

INTRODUCTION
Abstract: For this project we are using Twitter data to predict public opinion on a presidential policy by searching for sentiment patterns in past tweets using #POTUS. We used two machine learning methods, the Naïve Bayes algorithm and Support Vector Machines (SVM), to perform sentiment analysis on the tweets, and then compared the results to see which method yielded the higher accuracy rate.

Introduction: #POTUS has been a very trendy hashtag ever since Donald Trump, who is very active on Twitter, took office. From the interactions of other Twitter users, we can classify tweets as negative, positive, or neutral. We used a Twitter API to collect tweets that used #POTUS and distinguished the tweets based on the words they used, in order to feed them into the machine learning algorithms.

METHOD
For collecting the data, we used a Twython query with these parameters (see the first sketch after the DISCUSSION section):
- Searching for #POTUS
- Switching between mixed and recent results
- 100 tweets at a time
- Tweets in English
We one-hot encoded the data into a dictionary of id, classes, and sentiment, and wrote it to a CSV file without the label and id columns for calculating (see the second sketch after the DISCUSSION section).

Naïve Bayes algorithm:
- Simple and effective classification algorithm
- Supervised learning
- Popular uses include spam filters, text analysis, and medical diagnosis
- Assumes that the probability of each attribute belonging to a given class value is independent of all other attributes
- Calculates the probability of each instance for each class and selects the highest probability

Process of Naïve Bayes (a from-scratch sketch follows the DISCUSSION section):
Before the prediction:
- Preprocess the data into the table format from earlier
- Split the data set, with 67% for the training set and 33% for the test set
- Separate the data by class to calculate the statistics for each class: calculate the mean, calculate the standard deviation, and collect the values
After the prediction:
- Calculate probabilities using the equation on the last slide (reconstructed after the CONCLUSIONS section)
- Summarize all the probabilities for each class
- Make a prediction based on the best probability
- Test the predictions against the actual values
- Report the accuracy as a percentage

RESULTS
We achieved an accuracy rate of 67.55% with Naïve Bayes. The SVM, however, had not been implemented properly by the time this poster was made, so it could not produce a proper accuracy rate to compare against.

DISCUSSION
Limitations:
- One of the biggest limitations was sarcasm: there was no good method in any library, and no algorithm we found, that could detect it. We did try to identify both negative and positive words within a tweet, but that does not really solve the problem of detecting sarcasm.
- Another limitation was the amount of data we could get: the API we used only gave us 100 tweets at a time, and getting a year's worth of tweets requires a paid plan that is very expensive.
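The collection step described under METHOD maps closely onto Twython's search wrapper. A minimal sketch, assuming placeholder credentials (APP_KEY, APP_SECRET, OAUTH_TOKEN, OAUTH_TOKEN_SECRET are hypothetical names, not the project's):

```python
from twython import Twython

# Hypothetical placeholders; real values come from a Twitter developer account.
APP_KEY = "..."
APP_SECRET = "..."
OAUTH_TOKEN = "..."
OAUTH_TOKEN_SECRET = "..."

twitter = Twython(APP_KEY, APP_SECRET, OAUTH_TOKEN, OAUTH_TOKEN_SECRET)

# 100 English tweets containing #POTUS; result_type switches
# between "mixed" and "recent" as described in METHOD.
results = twitter.search(q="#POTUS", result_type="mixed",
                         count=100, lang="en")

for tweet in results["statuses"]:
    print(tweet["id"], tweet["text"])
```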
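The one-hot encoding step could look like the following sketch; the keyword vocabulary and the helper names (one_hot, write_csv) are our illustration, not the project's code:

```python
import csv

# Hypothetical keyword vocabulary; the real list came from the collected tweets.
KEYWORDS = ["economy", "healthcare", "immigration", "tax"]

def one_hot(tweet_text):
    """1 if the keyword appears in the tweet, else 0."""
    text = tweet_text.lower()
    return [1 if kw in text else 0 for kw in KEYWORDS]

def write_csv(rows, path="potus_onehot.csv"):
    # rows: (tweet_id, text, sentiment) tuples, sentiment 1 = positive, 0 = negative.
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        for tweet_id, text, sentiment in rows:
            # No header row and no id column: only the feature columns plus the
            # sentiment, matching the "without label and id" format for calculating.
            writer.writerow(one_hot(text) + [sentiment])
```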
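The before/after steps listed under METHOD follow the classic from-scratch Gaussian Naïve Bayes recipe. A minimal sketch under that assumption (all function names are ours), with the sentiment in the last CSV column:

```python
import csv
import math
import random

def load_csv(path):
    # Each row: one-hot keyword features followed by the sentiment (0 or 1).
    with open(path) as f:
        return [[float(x) for x in row] for row in csv.reader(f) if row]

def mean(xs):
    return sum(xs) / len(xs)

def stdev(xs):
    m = mean(xs)
    return math.sqrt(sum((x - m) ** 2 for x in xs) / max(len(xs) - 1, 1))

def summarize_by_class(train):
    # Separate rows by class, then collect (mean, stdev) per attribute.
    separated = {}
    for row in train:
        separated.setdefault(row[-1], []).append(row[:-1])
    return {cls: [(mean(col), stdev(col)) for col in map(list, zip(*rows))]
            for cls, rows in separated.items()}

def gaussian(x, m, s):
    # Gaussian probability density; guard s == 0 for constant binary columns.
    if s == 0:
        return 1.0 if x == m else 1e-9
    return math.exp(-((x - m) ** 2) / (2 * s ** 2)) / (math.sqrt(2 * math.pi) * s)

def predict(summaries, features):
    # Multiply per-attribute likelihoods for each class (equal priors assumed)
    # and pick the class with the highest probability.
    best_cls, best_p = None, -1.0
    for cls, stats in summaries.items():
        p = 1.0
        for x, (m, s) in zip(features, stats):
            p *= gaussian(x, m, s)
        if p > best_p:
            best_cls, best_p = cls, p
    return best_cls

dataset = load_csv("potus_onehot.csv")  # hypothetical filename
random.shuffle(dataset)
cut = int(len(dataset) * 0.67)          # 67% train / 33% test split
train, test = dataset[:cut], dataset[cut:]
summaries = summarize_by_class(train)
correct = sum(predict(summaries, row[:-1]) == row[-1] for row in test)
print("Accuracy: %.2f%%" % (100.0 * correct / len(test)))
```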
CONCLUSIONS
In conclusion, we were able to predict sentiment at a rate of 67.55% with Naïve Bayes but not with SVM; in the future we hope to fix the SVM and compare it to Naïve Bayes to see which model predicts sentiment better. Another piece of future work is to implement bigrams: a bigram is a pair of consecutive written units such as letters, syllables, or words. Bigrams would help pick up sarcasm and improve the accuracy rate of the program. We could also implement Bernoulli Naïve Bayes, which may work better since we dropped the neutral sentiment class (a sketch of both ideas closes this transcript).

[Figure: "The math behind Naïve Bayes" equation slide; a reconstruction follows below.]
[Table: example table showing which keywords appeared in #POTUS tweets and the overall sentiment from other Twitter users; 1 = positive, 0 = negative.]
[Figure: the one-hot encoded file; 0 = negative sentiment, 1 = positive sentiment.]
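The equation slide itself did not survive the transcript. Given the per-class mean and standard deviation computed in METHOD, the equation was presumably the Gaussian likelihood, reconstructed here as an assumption:

```latex
% Reconstruction (our assumption): Gaussian likelihood per attribute, using
% the per-class mean \mu_c and standard deviation \sigma_c, with the
% prediction taking the class of highest overall probability.
P(x \mid c) = \frac{1}{\sqrt{2\pi}\,\sigma_c}
              \exp\!\left(-\frac{(x-\mu_c)^2}{2\sigma_c^2}\right),
\qquad
\hat{c} = \operatorname*{arg\,max}_{c}\; \prod_{i} P(x_i \mid c)
```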
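Both future-work ideas can be sketched together: scikit-learn's CountVectorizer can produce binary word-bigram features that feed directly into BernoulliNB. This is our illustration of the proposal, not code from the project:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import BernoulliNB

# Hypothetical toy tweets; the real corpus is the collected #POTUS data.
train_texts = ["great job on the economy", "terrible decision on healthcare",
               "love this new policy", "worst policy decision ever"]
train_labels = [1, 0, 1, 0]  # 1 = positive, 0 = negative
test_texts = ["great job on healthcare"]

# Binary word-bigram features: each feature is a pair of consecutive words,
# which preserves some word-order context that unigrams miss.
vectorizer = CountVectorizer(ngram_range=(2, 2), binary=True)
X_train = vectorizer.fit_transform(train_texts)
X_test = vectorizer.transform(test_texts)

# Bernoulli Naive Bayes models each feature as present/absent, which fits
# the binary encoding used after dropping the neutral class.
model = BernoulliNB().fit(X_train, train_labels)
print(model.predict(X_test))
```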

