Coletto, Lucchese, Orlando, Perego

ELECTORAL PREDICTIONS WITH TWITTER: A MACHINE-LEARNING APPROACH

M. Coletto 1,3, C. Lucchese 1, S. Orlando 2, and R. Perego 1
1 ISTI-CNR, Pisa
2 University Ca’ Foscari of Venice
3 IMT Institute for Advanced Studies, Lucca

May 2015
INTRODUCTION
In this work we study what insights Twitter can provide into the primary elections of an Italian political party.
26/05/15
AGENDA
- STATE-OF-THE-ART
- DATA
- BASELINE
- METHODS
- AGE BIAS
- CONCLUSION
STATE-OF-THE-ART
Twitter has been used for predictive tasks ranging from the stock market [1] to movie sales [2] and pandemic detection [3].
Many articles propose quantitative approaches to predict electoral results in different countries: US [4], Germany [5], the Netherlands [6], Italy [7].

[1] Bollen, J., Mao, H., Zeng, X.: Twitter mood predicts the stock market. Journal of Computational Science 2(1), 1–8 (2011)
[2] Asur, S., Huberman, B.A.: Predicting the future with social media. In: Web Intelligence and Intelligent Agent Technology (WI-IAT), 2010 IEEE/WIC/ACM International Conference on. vol. 1, pp. 492–499. IEEE (2010)
[3] Lampos, V., De Bie, T., Cristianini, N.: Flu detector - tracking epidemics on Twitter. In: Machine Learning and Knowledge Discovery in Databases, pp. 599–602. Springer (2010)
[4] O’Connor, B., Balasubramanyan, R., Routledge, B.R., Smith, N.A.: From tweets to polls: Linking text sentiment to public opinion time series. ICWSM 11, 122–129 (2010)
[5] Tumasjan, A., Sprenger, T.O., Sandner, P.G., Welpe, I.M.: Predicting elections with Twitter: What 140 characters reveal about political sentiment. ICWSM 10, 178–185 (2010)
[6] Sang, E.T.K., Bos, J.: Predicting the 2011 Dutch senate election results with Twitter. In: Proceedings of the Workshop on Semantic Analysis in Social Media. pp. 53–60. Association for Computational Linguistics, Stroudsburg, PA, USA (2012)
[7] Caldarelli, G., Chessa, A., Pammolli, F., Pompa, G., Puliga, M., Riccaboni, M., Riotta, G.: A multi-level geographical study of Italian political elections from Twitter data. PloS one 9(5), e95809 (2014)
DATA
BASELINE
- TweetCount: Tumasjan, A., Sprenger, T.O., Sandner, P.G., Welpe, I.M.: Predicting elections with Twitter: What 140 characters reveal about political sentiment. ICWSM 10, 178–185 (2010)
- UserCount: DiGrazia, J., McKelvey, K., Bollen, J., Rojas, F.: More tweets, more votes: Social media as a quantitative indicator of political behavior. PloS one 8(11), e79449 (2013)
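The slide names the two baselines without defining them. Going by the names alone, a minimal sketch might count tweets mentioning each candidate (TweetCount) and distinct users mentioning each candidate (UserCount), then normalize to shares. The substring matching, the `(user, text)` tweet format, and the candidate names `Alpha` and `Beta` are assumptions made for illustration, not the authors' pipeline.

```python
from collections import Counter, defaultdict

def tweet_count_shares(tweets, candidates):
    """TweetCount (assumed definition): share of tweets mentioning each candidate."""
    counts = Counter()
    for _user, text in tweets:
        lowered = text.lower()
        for name in candidates:
            if name.lower() in lowered:  # naive substring match (assumption)
                counts[name] += 1
    total = sum(counts.values())
    return {c: counts[c] / total if total else 0.0 for c in candidates}

def user_count_shares(tweets, candidates):
    """UserCount (assumed definition): share of distinct users mentioning each candidate."""
    users = defaultdict(set)
    for user, text in tweets:
        lowered = text.lower()
        for name in candidates:
            if name.lower() in lowered:
                users[name].add(user)
    total = sum(len(users[c]) for c in candidates)
    return {c: len(users[c]) / total if total else 0.0 for c in candidates}

# Hypothetical toy data: one very active user and one occasional user.
tweets = [
    ("u1", "Voting for Alpha"),
    ("u1", "Alpha again"),
    ("u1", "Alpha once more"),
    ("u2", "Beta is my choice"),
]
candidates = ["Alpha", "Beta"]
print(tweet_count_shares(tweets, candidates))
print(user_count_shares(tweets, candidates))
```

In this toy data the single prolific user dominates the tweet-based share, while the user-based share treats both users equally, which is the practical difference between the two baselines under these assumptions.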
EVALUATION
- MAE (mean absolute error)
- RMSE (root-mean-square error)
- MRM (mean rank match)
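The three metrics can be sketched as follows. MAE and RMSE follow their standard definitions over predicted and actual vote shares. The slide does not define MRM, so the version here (the fraction of rank positions where the predicted and actual candidate orderings agree, averaged over several elections or regions) is an assumption, as are the function names and toy numbers.

```python
import math

def mae(pred, true):
    """Mean absolute error between predicted and actual vote shares."""
    return sum(abs(pred[c] - true[c]) for c in true) / len(true)

def rmse(pred, true):
    """Root-mean-square error between predicted and actual vote shares."""
    return math.sqrt(sum((pred[c] - true[c]) ** 2 for c in true) / len(true))

def rank_match(pred, true):
    """Fraction of rank positions where predicted and actual orderings agree
    (assumed reading of 'rank match')."""
    pred_rank = sorted(true, key=lambda c: pred[c], reverse=True)
    true_rank = sorted(true, key=lambda c: true[c], reverse=True)
    return sum(p == t for p, t in zip(pred_rank, true_rank)) / len(true)

def mean_rank_match(preds, trues):
    """Average of rank_match over several elections or regions (assumption)."""
    return sum(rank_match(p, t) for p, t in zip(preds, trues)) / len(trues)

# Hypothetical vote shares for three candidates in one region.
true = {"A": 0.50, "B": 0.30, "C": 0.20}
pred = {"A": 0.45, "B": 0.20, "C": 0.35}
print(mae(pred, true), rmse(pred, true), rank_match(pred, true))
```

MAE and RMSE measure how far the predicted shares are from the real ones, while a rank-based score only asks whether the ordering of candidates is right, so a predictor can do well on one and poorly on the other.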
METHODS
Proposed classification methods:
- UserShare
- ClassTweetCount
- ClassUserCount
METHODS 2
Training correcting factors through ML
- Per candidate
- Learning weights to evaluate the Twitter user/voter ratio
- Metrics: UserShare, ClassTweetCount
Content Analysis (100 most frequent hashtags)
- 1 feature per word
- Sentiment Analysis per candidate
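One way to picture a per-candidate correcting factor is a single weight per candidate that maps a Twitter-derived share to an actual vote share, learned on past regions and then applied to a new one. The slide does not say which learner is used, so the least-squares fit through the origin below, the renormalization step, and all names and numbers are assumptions made for illustration only.

```python
def fit_weights(train_regions, candidates):
    """Fit one weight per candidate so that weight * twitter_share approximates
    vote_share (least squares through the origin; an assumed learner)."""
    weights = {}
    for c in candidates:
        num = sum(tw[c] * votes[c] for tw, votes in train_regions)
        den = sum(tw[c] ** 2 for tw, _votes in train_regions)
        weights[c] = num / den if den else 1.0  # fall back to no correction
    return weights

def corrected_shares(twitter_shares, weights):
    """Apply the per-candidate weights and renormalize so shares sum to 1."""
    raw = {c: weights[c] * s for c, s in twitter_shares.items()}
    total = sum(raw.values())
    return {c: v / total if total else 0.0 for c, v in raw.items()}

# Hypothetical training region: (Twitter-derived shares, actual vote shares).
train_regions = [
    ({"A": 0.5, "B": 0.5}, {"A": 0.25, "B": 0.75}),
]
weights = fit_weights(train_regions, ["A", "B"])
print(weights)
print(corrected_shares({"A": 0.5, "B": 0.5}, weights))
```

Here candidate A is over-represented on Twitter relative to the vote, so the learned weight is below 1, and candidate B's weight is above 1. The input shares could come from any of the predictors named on the slide, such as UserShare or ClassTweetCount.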
AGE BIAS
CONCLUSION
- New predictors
- Machine learning approach
- Age bias analysis
LIMITATIONS AND FUTURE WORK
- Twitter bias
- Single dataset (European)
- Arbitrariness (window, keywords, ...)
THANK YOU
QUESTIONS?