Twitter Mood Predicts the Stock Market
Authors: Johan Bollen, Huina Mao, Xiao-Jun Zeng
Presented by: Krishna Aswani
Computing ID: ka5am
Is it possible to predict stock markets?
Early research: Under the Efficient Market Hypothesis and random walk theory, stock prices are driven by new information (i.e. news) rather than by present and past prices. Because news is unpredictable, prices should follow a random walk.
Recent research: News may be unpredictable, but early indicators can be extracted from online social media (blogs, Twitter feeds, etc.) to predict changes in various economic and commercial indicators.
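Method (flow diagram): Twitter feed -> text analysis -> normalization -> mood indicators (daily). DJIA -> stock market values (daily). Both daily series feed Granger causality analysis (F-statistics, p-values) and the SOFNN, which uses the values at t-1, t-2 and t-3 to predict the value at t=0 and is scored by MAPE and direction %. Highlighted: Phase 1.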
Phase 1: Creating sentiment time series
Step 1 – Collect public tweets (February 28 to December 19th, 2008; 9,853,498 tweets posted by approximately 2.7M users) and preprocess them: remove stopwords, normalize the text, etc.
Step 2 – Pass the tweets through OpinionFinder and Google Profile of Mood States (GPOMS) to create daily mood time series.
Step 3 – Normalize each series to z-scores so that the OpinionFinder and GPOMS time series can be compared on a common scale.
Step 4 – Cross-validate the mood time series against large socio-cultural events.
GPOMS measures mood in 6 dimensions: Calm, Alert, Sure, Vital, Kind and Happy.
OpinionFinder is a software package that classifies tweets as positive or negative. For each day, the ratio of the number of positive tweets to the number of negative tweets is calculated.
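The daily ratio and the z-score normalization in Phase 1 can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and normalizing against a sliding window of k days on either side of each day (rather than against the whole series) is an assumption made here.

```python
from statistics import mean, pstdev

def daily_pos_neg_ratio(pos_counts, neg_counts):
    """Ratio of positive to negative tweet counts for each day.
    Assumes every day has at least one negative tweet."""
    return [p / n for p, n in zip(pos_counts, neg_counts)]

def local_zscore(series, k):
    """Z-score of each value against the mean and standard deviation of a
    window spanning k days before and after it (truncated at the ends).
    The window size k is an assumption of this sketch."""
    out = []
    for t, x in enumerate(series):
        window = series[max(0, t - k): t + k + 1]
        mu = mean(window)
        sigma = pstdev(window)
        # A flat window has zero spread; return 0.0 instead of dividing by zero.
        out.append((x - mu) / sigma if sigma > 0 else 0.0)
    return out
```

After this step every mood series is expressed in units of standard deviations, so series produced by different tools can be plotted and compared on the same axis.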
Phase 2 – Correlation between mood time series and DJIA
Step 1 – Collect DJIA data for the same period, normalize it and plot it as a time series.
Step 2 – Apply Granger causality analysis to two models: model 1 predicts the DJIA from its own past values only, and model 2 adds the past values of a mood time series.
Granger causality analysis rests on the assumption that if a variable X causes Y, then changes in X will systematically occur before changes in Y.
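A rough sketch of the Granger comparison: fit a restricted regression on lagged DJIA values only, fit an unrestricted regression that also includes lagged mood values, and compare their residual sums of squares with an F-statistic. The helper names and the synthetic demo data are hypothetical, and the p-value lookup against the F distribution is left out to keep the sketch within the standard library.

```python
import random

def ols_rss(rows, targets):
    """Residual sum of squares of a least-squares fit with an intercept."""
    X = [[1.0] + list(r) for r in rows]
    p = len(X[0])
    # Normal equations (X'X) b = X'y as an augmented matrix.
    A = [[sum(x[i] * x[j] for x in X) for j in range(p)]
         + [sum(x[i] * y for x, y in zip(X, targets))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[i][p] / A[i][i] for i in range(p)]
    return sum((y - sum(b * v for b, v in zip(beta, x))) ** 2
               for x, y in zip(X, targets))

def granger_f(d, x, lags):
    """F-statistic for whether lagged x improves a model of d that
    already uses lagged d."""
    ts = range(lags, len(d))
    y = [d[t] for t in ts]
    own = [[d[t - i] for i in range(1, lags + 1)] for t in ts]
    both = [o + [x[t - i] for i in range(1, lags + 1)] for o, t in zip(own, ts)]
    rss1 = ols_rss(own, y)   # restricted: past d only
    rss2 = ols_rss(both, y)  # unrestricted: past d and past x
    df2 = len(y) - (2 * lags + 1)
    return ((rss1 - rss2) / lags) / (rss2 / df2)

# Synthetic demo (not the paper's data): d follows x with a one-day delay.
random.seed(0)
x_demo = [random.gauss(0, 1) for _ in range(200)]
d_demo = [0.0] + [0.8 * x_demo[t - 1] + 0.1 * random.gauss(0, 1)
                  for t in range(1, 200)]
f_mood_to_djia = granger_f(d_demo, x_demo, lags=1)
f_djia_to_mood = granger_f(x_demo, d_demo, lags=1)
```

A large F-statistic in one direction and a small one in the other only says that one series helps forecast the other; it is a statement about predictive ordering, not proof of a causal mechanism.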
Correlation does not mean causation
Phase 3 – Non-linear models for accurate stock prediction
Because the relationship between the DJIA and the mood time series does not appear to be linear, a Self-Organizing Fuzzy Neural Network (SOFNN) is used to predict with better accuracy. Different permutations of the input variables (mood time series) are tested.
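The SOFNN itself is beyond a short example, but the data preparation and scoring around it can be sketched. The snippet below assumes the inputs are the previous three days of DJIA and mood values and that predictions are scored by MAPE and direction accuracy; the function names are hypothetical and any regression model could be slotted in between.

```python
def lagged_inputs(djia, mood, lags=3):
    """Pair each day's DJIA value with the previous `lags` DJIA and mood values."""
    rows, targets = [], []
    for t in range(lags, len(djia)):
        past_djia = [djia[t - i] for i in range(1, lags + 1)]
        past_mood = [mood[t - i] for i in range(1, lags + 1)]
        rows.append(past_djia + past_mood)
        targets.append(djia[t])
    return rows, targets

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - p) / a)
                       for a, p in zip(actual, predicted)) / len(actual)

def direction_accuracy(previous, actual, predicted):
    """Percentage of days on which the predicted move from the previous
    day's value has the same sign as the actual move."""
    hits = sum((a - prev) * (p - prev) > 0
               for prev, a, p in zip(previous, actual, predicted))
    return 100.0 * hits / len(actual)
```

To compare different mood inputs, the same model would be trained once per input set (for example Calm alone, or Calm and Happy) and the two scores compared across runs.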
Results (figures): Calm; Calm and Happy
Factors not considered
Geographic location of tweets: the approach worked because Twitter's user base is predominantly located in the US.
Causation: the results are strongly indicative of a predictive correlation between measurements of public mood states from Twitter feeds and DJIA values, but offer no information on the causative mechanisms that may connect online public mood states with DJIA values.
Manipulation: the approach is highly vulnerable to Twitter bombing campaigns, which can very easily go viral.
Applications: Companies such as Tower Research Capital (computational investment trading) and Dataminr (social analytics).