Sentiment analysis of news articles for financial signal prediction
Anand Atreya, Nicholas Cohen, Jinjiang James Zhai

Motivation
  Financial markets can be swayed by sentiment
    Bearish sentiment can make a down market worse and lessen the impact of positive news
    Vice versa for bullish sentiment
  Firms which take advantage of sentiment information quickly can gain an edge
  Computers analyzing sentiment can work far faster (and for less money) than human analysts
  Our hypothesis: sentiment can be discovered in news articles about finance

Methods
  Data sets:
    New York Times articles about finance (from the business section, containing the word “stock”, and with the metatag “financial desk”) from the LDC corpus
      Articles from 2006 were used
    S&P 500 data used as representative of market
  Stanford MaxEnt classifier was used

Methods (continued)
  Two approaches were tried
    Manual sentiment training: manually classified articles into positive, neutral, or negative sentiment, used these sets as training and test
    Automatic: used the market return for the day preceding the news article with thresholds for positive, neutral, negative

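The automatic approach can be sketched roughly as follows. This is a minimal illustration, not the code used in the project: the threshold values (plus or minus 0.5%), the function names, and the sample returns are all assumptions made for the example, since the slides do not state them.

```python
from bisect import bisect_left
from datetime import date

# Hypothetical thresholds; the slides do not give the values actually used.
POS_THRESHOLD = 0.005
NEG_THRESHOLD = -0.005


def preceding_return(article_date, trading_days, returns):
    """Return the market return of the last trading day strictly before
    article_date, or None if there is no earlier trading day.
    trading_days must be sorted ascending and aligned with returns."""
    i = bisect_left(trading_days, article_date)  # first trading day >= article_date
    if i == 0:
        return None
    return returns[i - 1]


def label_from_return(market_return):
    """Map a daily return to one of the three sentiment classes."""
    if market_return > POS_THRESHOLD:
        return "positive"
    if market_return < NEG_THRESHOLD:
        return "negative"
    return "neutral"


# Made-up returns for three trading days, purely for illustration.
trading_days = [date(2006, 1, 3), date(2006, 1, 4), date(2006, 1, 5)]
returns = [0.012, -0.008, 0.001]

labels = {}
for d in [date(2006, 1, 4), date(2006, 1, 5), date(2006, 1, 6)]:
    r = preceding_return(d, trading_days, returns)
    labels[d] = label_from_return(r) if r is not None else None
```

Labels produced this way can then stand in for the manual annotations when training the classifier. Looking up the previous trading day, rather than the previous calendar day, keeps articles published after weekends or holidays from being dropped.
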
Results: classification
  F1 for manual classification (positive, neutral, negative): 0.581, 0.614, (141 test cases)
  F1 results for automatic classification with and without metadata filtering:
  Decent results for manual classification; mixed results for automatic classification
  Using metadata filtering appears to help in most cases (except negative sentiment)

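For reference, per-class F1 scores like those reported above are the harmonic mean of precision and recall for each label. The sketch below shows one way to compute them from gold and predicted labels; the function name and the toy labels are assumptions for illustration and are unrelated to the project's data.

```python
def per_class_f1(gold, predicted, labels):
    """Compute F1 for each label from parallel lists of gold and predicted labels."""
    scores = {}
    for label in labels:
        pairs = list(zip(gold, predicted))
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            scores[label] = 2 * precision * recall / (precision + recall)
        else:
            scores[label] = 0.0  # label never predicted correctly
    return scores


# Toy example with made-up labels.
gold = ["positive", "positive", "neutral", "negative"]
predicted = ["positive", "neutral", "neutral", "negative"]
scores = per_class_f1(gold, predicted, ["positive", "neutral", "negative"])
```
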
Results: correlation with market
  Not clear that article sentiment is correlated with market movements

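The slides do not say how correlation was measured. One plausible way to check it, shown here only as a sketch, is to turn each day's predicted labels into a numeric score and compute the Pearson correlation with that day's market return. The scoring scheme (+1, 0, -1 averaged per day) and all function names are assumptions.

```python
from math import sqrt

# Assumed numeric encoding of the three sentiment classes.
LABEL_SCORE = {"positive": 1.0, "neutral": 0.0, "negative": -1.0}


def daily_sentiment(labels_for_day):
    """Average the numeric scores of all articles classified on one day."""
    return sum(LABEL_SCORE[label] for label in labels_for_day) / len(labels_for_day)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)
```

A coefficient near zero over the test period would be consistent with the inconclusive result reported on this slide.
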
Future work
  Classify different portions of an article
    Some articles discuss several stocks or events with different sentiment
  Select news articles only discussing companies in the S&P 500 index
  Classify articles that come in throughout the day (i.e. over a wire) and correlate with market movements intra-day
  Use a time window of more than one day for market returns: sentiment may correlate with longer term movements
    Could use a moving average for this
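
The moving-average idea in the last item could look something like the sketch below: a trailing average of daily returns over a fixed window, which would replace the single-day return when labeling or correlating. The function name and window size are assumptions; the slides do not specify either.

```python
def moving_average(values, window):
    """Trailing moving average: one output per position that has a full
    window of history, so the result has len(values) - window + 1 entries."""
    if window < 1:
        raise ValueError("window must be at least 1")
    averages = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        if i >= window - 1:
            averages.append(running / window)
    return averages
```

For example, `moving_average(daily_returns, 5)` would give a smoothed return for each trading day from the fifth one onward, where `daily_returns` is a hypothetical list of S&P 500 daily returns.
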