John A. Paulson School of Engineering and Applied Sciences
Stock Trading Using Reinforcement Learning
Jonah Varon and Anthony Soroka
CS 182: Artificial Intelligence, Fall 2016

Introduction
Abstract: We applied Q-Learning to stock trading. Specifically, we attempted to place our agent in states where it could find profitable strategies through technical analysis and mean reversion. Our results were mixed, leading us to question the applicability of reinforcement learning in the stock market, a space that differs in significant ways from those where reinforcement learning has proven suitable. However, our multi-agent technical analysis approach highlighted specific stocks for which Q-Learning proved effective over certain time periods.

Considering the interest in the topic, there is surprisingly little publicly available research on reinforcement learning applied to stock trading. Hoping either to further public research or to find a profitable strategy, we applied Q-Learning to stock trading. We tested our strategies on Quantopian, a crowd-sourced investment firm that provides market data and a Python-based development platform for algorithmic investors.

Results
[Figure 1. YELP Tech-Analysis Agent]
[Figure 2. ATML Tech-Analysis Agent]

Results summarized: Most of our strategies did not outperform reasonable benchmarks (the S&P 500 or the specific stock itself). However, using a multi-agent approach with the technical analysis agent, we were able to scan the market for specific stocks and specific periods in which the agent was profitable. Two example successful agents are shown in Figures 1 and 2.

Approach
Q-Learning is defined by the following update function:

    Q(s,a) ← Q(s,a) + α [r + γ max_a' Q(s',a') - Q(s,a)]

State Representation
The state space for the stock market is vast, so limiting it enough for the agent to learn is crucial. Accordingly, we represented states in the hope of positioning our agent to find known trading strategies:

Technical Analysis: the last 3 minutes of price and volume data, each change encoded as {-1: decreased, 0: unchanged, 1: increased}. Number of states = 3^6 = 729. Number of actions per year = 98,280 (390 trading minutes per day × 252 trading days). A sketch of this encoding appears under Code Snippets below.

Mean Reversion: the previous week's stock performance, encoded as {-1: the stock was in the bottom X percentile, 1: the stock was in the top X percentile}.

For our project, we typically used:
a = {Buy, Sell}
r = stock performance

The stock market, however, has a few interesting characteristics that slightly alter Q-Learning:
a) Future state: Because the position can be rebalanced at every step, max_a' Q(s',a') is the same regardless of Q(s,a) (presuming the investor is small enough to have minimal market impact).
b) Fully observable rewards: With actions limited to Buy and Sell, the reward for Q(s,a) is fully observable regardless of the action actually taken.

We therefore tried both approaches to updating Q(s,a): (1) updating only the action taken, and (2) updating both the taken and untaken actions (see the second sketch under Code Snippets below).

Conclusions
Our attempts to find alpha using Q-Learning illustrate how competitive and efficient markets are. They also suggest that reinforcement learning is perhaps not the best fit for stock trading, given:
- the stochasticity of the market,
- the limited amount of training data,
- that the agent can observe rewards without taking actions, and
- that the agent has little impact on its environment.

That said, the possible ways to define a state space are limitless and should still be investigated. Moreover, our multi-agent approach proved useful in highlighting specific stocks where Q-Learning might be applicable. Lastly, we believe deep learning techniques such as LSTMs might be better suited to this problem.

Data
Quantopian provides a wealth of technical, fundamental, and event information for thousands of stocks.

Code Snippets
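The technical-analysis state described above is small enough to enumerate. Below is a minimal sketch of how the encoding might look in Python; the names (direction, encode_state, ALL_STATES) are illustrative choices for this page, not the authors' original code.

# A minimal sketch (not the original project code) of the technical-analysis
# state encoding: the last 3 one-minute changes in price and volume, each
# mapped to {-1, 0, 1}, giving 3^6 = 729 states.
from itertools import product

def direction(prev, curr):
    """Map a one-minute change to -1 (decreased), 0 (unchanged), or 1 (increased)."""
    if curr > prev:
        return 1
    if curr < prev:
        return -1
    return 0

def encode_state(prices, volumes):
    """Encode the last 4 minute bars (oldest first) as a 6-tuple of directions."""
    price_dirs = tuple(direction(prices[i], prices[i + 1]) for i in range(3))
    volume_dirs = tuple(direction(volumes[i], volumes[i + 1]) for i in range(3))
    return price_dirs + volume_dirs

# The full state space can be enumerated up front, e.g. to build a Q-table:
ALL_STATES = list(product((-1, 0, 1), repeat=6))
assert len(ALL_STATES) == 729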
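The two update approaches could then be sketched as follows. The both-actions variant exploits observation (b) above: once the stock's return over a step is known, the reward of the untaken action is known too. The hyperparameter values and the convention that Sell earns the negated return of Buy (treating Sell as the opposite exposure) are assumptions for illustration, not taken from the poster.

# A minimal sketch of the tabular Q-update and the "update both actions"
# variant made possible by fully observable rewards. ALPHA, GAMMA, and the
# Sell-earns-negated-return convention are illustrative assumptions.
from collections import defaultdict

ALPHA, GAMMA = 0.1, 0.99
ACTIONS = ('BUY', 'SELL')
Q = defaultdict(float)  # Q-table keyed by (state, action)

def q_update(state, action, reward, next_state):
    """Approach 1: standard Q-learning update for the action actually taken."""
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])

def q_update_both(state, stock_return, next_state):
    """Approach 2: the reward of the untaken action is also observable,
    so update both Q-values on every step."""
    q_update(state, 'BUY', stock_return, next_state)
    q_update(state, 'SELL', -stock_return, next_state)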
Updating the Technical Analysis Agent's State
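The original snippet image is not preserved on this page, so the following is a hedged reconstruction of how the state update might look in Quantopian's minute-bar event loop, reusing encode_state, q_update_both, and the Q table from the sketches above. symbol, data.history, and order_target_percent are Quantopian API calls; everything else is illustrative.

# A hedged reconstruction of the missing snippet: refreshing the technical-
# analysis agent's state each minute and acting on the learned Q-values.
def initialize(context):
    context.stock = symbol('YELP')  # e.g. one stock flagged by the multi-agent scan
    context.prev_state = None
    context.prev_price = None

def handle_data(context, data):
    # The last 4 one-minute bars yield 3 price deltas and 3 volume deltas.
    bars = data.history(context.stock, ['price', 'volume'], 4, '1m')
    state = encode_state(list(bars['price']), list(bars['volume']))
    price = bars['price'].iloc[-1]

    if context.prev_state is not None:
        # Reward: the stock's return over the minute that just elapsed.
        stock_return = price / context.prev_price - 1
        q_update_both(context.prev_state, stock_return, state)

    # Act greedily on the learned Q-values; here Sell simply goes flat
    # (going short instead would match the negated-return reward convention).
    if Q[(state, 'BUY')] >= Q[(state, 'SELL')]:
        order_target_percent(context.stock, 1.0)
    else:
        order_target_percent(context.stock, 0.0)

    context.prev_state = state
    context.prev_price = price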