Published by Maud Newton. Modified over 9 years ago.
1
Forecasting with Twitter data
Presented by: Thusitha Chandrapala, 20064923
Paper by: Marta Arias, Argimiro Arratia, and Ramon Xuriguera
2
What information do Twitter messages contain?
Twitter information
▫ Sentiment analysis: Are people happy or unhappy about a certain topic?
▫ Volume: Number of tweets about a given topic
Does Twitter really help in predicting time series data?
▫ Moving stream of information
3
Motivation of the paper
Use three different forecasting model families, vary the parameters systematically, and analyze under which conditions Twitter information is actually useful
Test for non-linearity and causality between the Twitter data and the target
Introduce the summary tree
4
Related work
Stock market prediction
▫ Bollen et al.: Twitter -> sentiment -> predict the Dow Jones Industrial Average
▫ Wolfram et al.: Twitter as an additional source of features, no sentiment analysis
Movie box office income
▫ Mishne et al.: correlation, blog posts
▫ Asur et al.: predict sales
5
Workflow
1) Collecting data
2) Cleaning and preprocessing
3) Sentiment analysis
4) Prediction model
6
Preprocessing:
Language detection
Negation handling: treating “I like this…” and “I don’t like this…” as two different features
Relevance filtering and topic classification: using LDA
▫ Latent Dirichlet Allocation
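One common way to make “I like this” and “I don’t like this” yield different features is to mark the tokens that follow a negation word. The sketch below assumes that scheme; the `NOT_` prefix, the negation word list, and the function name `mark_negation` are illustrative choices, not details taken from the paper.

```python
import re

# Hypothetical negation word list; the list used in the paper is not given in the slides.
NEGATIONS = {"not", "no", "never", "don't", "doesn't", "didn't", "can't", "won't", "isn't"}
PUNCTUATION = {".", ",", "!", "?", ";"}

def mark_negation(text):
    """Prefix tokens after a negation word with NOT_ until the next punctuation mark."""
    # Normalize curly apostrophes so "don’t" and "don't" are treated alike.
    text = text.lower().replace("’", "'")
    tokens = re.findall(r"[a-z']+|[.,!?;]", text)
    out, negated = [], False
    for tok in tokens:
        if tok in PUNCTUATION:
            negated = False          # punctuation ends the negation scope
            out.append(tok)
        elif tok in NEGATIONS:
            negated = True           # following words are marked as negated
            out.append(tok)
        else:
            out.append("NOT_" + tok if negated else tok)
    return out
```

With this marking, "like" and "NOT_like" become separate features for the classifier.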
7
Sentiment classification
Whether the text contains a negative or positive impression of a given subject
Approach 1:
▫ Automatic tagging to extract training instances
  :) :D - Happy sentiment
  :( - Unhappy sentiment
▫ Binary classification problem: use naïve Bayes to train the classifier
▫ Use different dictionaries as features
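Approach 1 can be sketched as follows: emoticons supply the training labels, and a naïve Bayes classifier is trained on the remaining words. This is a minimal illustration, assuming a multinomial model with add-one smoothing and plain whitespace tokenization; the function names are hypothetical and the paper's actual features (its dictionaries) are not reproduced here.

```python
import math
from collections import Counter

HAPPY = (":)", ":D")   # emoticons listed on the slide
UNHAPPY = (":(",)

def auto_label(tweet):
    """Return (label, tokens) using emoticons as labels, or None if unlabeled or ambiguous."""
    has_pos = any(e in tweet for e in HAPPY)
    has_neg = any(e in tweet for e in UNHAPPY)
    if has_pos == has_neg:           # no emoticon, or both kinds present
        return None
    for e in HAPPY + UNHAPPY:        # remove the emoticons so they are not used as features
        tweet = tweet.replace(e, " ")
    return ("pos" if has_pos else "neg", tweet.lower().split())

def train(tweets):
    """Count words per class over the automatically labeled tweets."""
    word_counts = {"pos": Counter(), "neg": Counter()}
    doc_counts = Counter()
    for tweet in tweets:
        labeled = auto_label(tweet)
        if labeled is None:
            continue
        label, tokens = labeled
        doc_counts[label] += 1
        word_counts[label].update(tokens)
    return word_counts, doc_counts

def classify(tokens, word_counts, doc_counts):
    """Pick the class with the higher smoothed log-probability."""
    vocab = set(word_counts["pos"]) | set(word_counts["neg"])
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in ("pos", "neg"):
        total_words = sum(word_counts[label].values())
        score = math.log((doc_counts[label] + 1) / (total_docs + 2))
        for tok in tokens:
            if tok in vocab:         # words never seen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

Tweets without an emoticon are skipped during training but can still be classified afterwards.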
9
Sentiment index
A time series of sentiment values
▫ The daily value is calculated from the daily percentage of positive/negative tweets over the total number of messages on a specific topic
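The slide does not give the exact formula, so the following is only one plausible reading: the index for a day is the percentage of that day's tweets on the topic that were classified as positive. The function name and the `(day, label)` input format are assumptions made for illustration.

```python
from collections import defaultdict

def sentiment_index(labeled_tweets):
    """Map each day to the percentage of positive tweets among that day's tweets.

    labeled_tweets: iterable of (day, label) pairs, with label 'pos' or 'neg'.
    """
    pos = defaultdict(int)
    total = defaultdict(int)
    for day, label in labeled_tweets:
        total[day] += 1
        if label == "pos":
            pos[day] += 1
    # One value per day, in chronological order for sortable day keys.
    return {day: 100.0 * pos[day] / total[day] for day in sorted(total)}
```

The total count per day, `total[day]`, is the volume series mentioned earlier in the slides.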
10
Training the model
ARMA: Auto Regressive Moving Average
▫ y[t] = a·x[t] + b·x[t-1] + … + m·y[t-1] + n·y[t-2] + …
Simplified prediction:
▫ A binary prediction of whether y[t] > y[t-1]
▫ Uses past values of the target series itself and of the Twitter time series
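The simplified binary prediction can be set up as a supervised dataset: for each time step, the features are past values of the target series y and of the Twitter series x, and the label says whether y went up. The sketch below assumes strictly lagged features (it leaves out the same-day x[t] term of the equation above) and a hypothetical `lags` parameter; any classifier from the three model families could then be trained on the rows.

```python
def make_direction_dataset(y, x, lags=2):
    """Build (features, label) rows for predicting whether y[t] > y[t-1].

    Features for time t are the previous `lags` values of y followed by the
    previous `lags` values of x, so only information available before t is used.
    """
    if len(y) != len(x):
        raise ValueError("series must have equal length")
    rows = []
    for t in range(lags, len(y)):
        y_lags = [y[t - k] for k in range(1, lags + 1)]
        x_lags = [x[t - k] for k in range(1, lags + 1)]
        label = 1 if y[t] > y[t - 1] else 0   # 1 = target went up
        rows.append((y_lags + x_lags, label))
    return rows
```

Dropping the `x_lags` part gives the baseline without Twitter data, which is what the accuracy comparison needs.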
11
Model parameters
Target time series: stock market returns; movie box office revenue
Twitter series: volume; sentiment index
Forecasting model family: linear models; support vector machines; neural networks
Result: Does including Twitter data increase classification accuracy by 5%?
12
Study details
Stock market prediction targets
▫ Companies: Apple, Google, …
▫ General market indices: S&P100, S&P500
Box office data
▫ Daily sales revenue series
13
Summary tree
Helps to identify model parameters that lead to consistently positive or negative results
Decision tree structure
▫ Nodes are different parameters
▫ Leaves: result
14
Summary tree
15
Results: Stock market data
Summary of prediction results:
▫ Generally, linear models do not provide a significant performance improvement with either Twitter volume or sentiment-based information.
▫ Non-linear models can give an improvement!
▫ Neural-network-based models gave the best performance
16
Results: Stock market data
17
Results: Movie box office
Summary:
▫ Sentiment analysis did not have a positive impact
▫ Volume information had a positive impact with linear regression and SVM
18
Conclusion
In general, Twitter information, when used with non-linear models, increases the prediction accuracy for long-term stock market predictions
Twitter volume had a linear relationship with movie sales, but sentiment analysis had none
19
Appendix
Logarithmic returns of the series
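For reference, the logarithmic return at time t is r[t] = ln(P[t] / P[t-1]), where P is the price series. A minimal sketch, with a hypothetical function name:

```python
import math

def log_returns(prices):
    """Return ln(P[t] / P[t-1]) for each consecutive pair of positive prices."""
    return [math.log(prices[t] / prices[t - 1]) for t in range(1, len(prices))]
```

Log returns are additive over time: the sum of the daily values equals the log return over the whole period.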
20
Testing model adequacy
Testing the relationship between the Twitter time series and the time series to be forecast
Neglected nonlinearity
▫ Are the two time series non-linearly related?
Granger causality
▫ X -> Y or Y -> X?
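The idea behind a Granger causality test can be sketched as a comparison of two linear regressions: one predicting y from its own past, and one that also uses the past of x. If adding x reduces the residual error enough, x is said to Granger-cause y. The code below is an illustration only, assuming a linear model and a hypothetical `lag` parameter; the exact procedure and lag choices used in the paper are not given in the slides, and in practice a statistics library would normally be used.

```python
def ols_rss(rows, targets):
    """Residual sum of squares of a least-squares fit, via the normal equations."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, targets))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda i: abs(a[i][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for i in range(k):
            if i != col:
                factor = a[i][col] / a[col][col]
                a[i] = [v - factor * p for v, p in zip(a[i], a[col])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, t in zip(rows, targets))

def granger_f(x, y, lag=1):
    """F statistic for the hypothesis that lags of x help predict y."""
    times = range(lag, len(y))
    targets = [y[t] for t in times]
    # Restricted model: intercept plus lags of y only.
    restricted = [[1.0] + [y[t - k] for k in range(1, lag + 1)] for t in times]
    # Unrestricted model: the same regressors plus lags of x.
    unrestricted = [r + [x[t - k] for k in range(1, lag + 1)]
                    for r, t in zip(restricted, times)]
    rss_r = ols_rss(restricted, targets)
    rss_u = ols_rss(unrestricted, targets)
    dof = len(targets) - (2 * lag + 1)
    return ((rss_r - rss_u) / lag) / (rss_u / dof)
```

Calling `granger_f(x, y)` tests X -> Y and `granger_f(y, x)` tests Y -> X; each statistic is compared against an F distribution with (`lag`, `dof`) degrees of freedom.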