Published by Eunice Hampton. Modified over 8 years ago.
1
GP FOR ADAPTIVE MARKETS
Jake Pacheco
6/11/2010
2
The Goal
Produce a system that can create novel quantitative trading strategies for the stock market. A quantitative trading strategy is one that relies only on market data to make predictions. Genetic programming was judged to be the best option for creating strategies.
3
Genetic Programming
Genetic programming (GP) is a type of algorithm based on ideas from evolution. A “generation” of “individuals” is evaluated, and then the “fittest” are chosen to “crossover” and “mutate”, creating the next generation. This process continues until a specified number of generations is completed or until the fitness reaches a desired value.
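The generational loop described above can be sketched as follows. This is a minimal illustration, not the presentation's implementation: the `evolve` function, its parameters, truncation selection, and elitism are all assumptions, and for brevity the toy individuals are bit lists (a plain genetic algorithm) rather than GP trees. Only the loop structure and the two stopping conditions are the point.

```python
import random

def evolve(init, fitness, crossover, mutate,
           pop_size=50, max_gens=100, target=None, rng=None):
    """Run a generational loop; return (best individual, generations used)."""
    rng = rng or random.Random(0)
    population = [init(rng) for _ in range(pop_size)]
    for gen in range(max_gens):
        # Evaluate the generation and rank it, fittest first.
        scored = sorted(population, key=fitness, reverse=True)
        best = scored[0]
        # Stop early once the desired fitness is reached.
        if target is not None and fitness(best) >= target:
            return best, gen
        # Assumption: truncation selection, keeping the top half as parents.
        parents = scored[: pop_size // 2]
        # Assumption: elitism, so the best individual always survives.
        population = [best]
        while len(population) < pop_size:
            a, b = rng.sample(parents, 2)
            population.append(mutate(crossover(a, b, rng), rng))
    return max(population, key=fitness), max_gens

# Toy problem (hypothetical): maximize the number of 1 bits in a 20-bit list.
N = 20

def init(rng):
    return [rng.randint(0, 1) for _ in range(N)]

def one_point(a, b, rng):
    cut = rng.randrange(1, N)
    return a[:cut] + b[cut:]

def flip(ind, rng):
    ind = ind[:]
    if rng.random() < 0.2:
        ind[rng.randrange(N)] ^= 1
    return ind

best, gens = evolve(init, sum, one_point, flip, target=N)
```

The loop ends either when `target` is reached or after `max_gens` generations, matching the two stopping conditions on the slide.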
4
GP Fitness
Possibly the most important aspect of a GP: it determines how trees are compared. The fitness function is defined differently for each problem. Ex: In symbolic regression, the fitness can be measured by the distance between the guesses and the actual values.
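As a small illustration of the symbolic regression example, the sketch below scores a candidate function by the summed absolute distance between its guesses and the actual values, so lower is better. The function name `regression_error`, the choice of absolute distance, and the target function are assumptions made for the example.

```python
def regression_error(candidate, samples):
    """Sum of |guess - actual| over all sample points; 0 means a perfect fit."""
    return sum(abs(candidate(x) - y) for x, y in samples)

# Hypothetical target: y = x^2 + x, sampled at integer points from -5 to 5.
samples = [(x, x * x + x) for x in range(-5, 6)]

def exact(x):
    return x * x + x

def rough(x):
    return x * x  # misses the linear term

exact_error = regression_error(exact, samples)
rough_error = regression_error(rough, samples)
```

Here `exact` scores better than `rough`, so a GP run using this measure would prefer it.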
5
GP Individuals
An individual consists of one or more trees, each of which represents a function. Trees consist of function nodes and leaf nodes. Function nodes define functions such as addition or subtraction, and each takes one or more child nodes. Variable nodes are the leaves of the tree; they are given values by the program.
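One way to picture such a tree is as nested tuples, where a tuple is a function node followed by its children and a string is a variable leaf. This representation and the names `FUNCTIONS` and `evaluate` are assumptions for illustration only; ECJ uses its own node classes.

```python
import operator

# Function nodes (hypothetical set): name -> Python callable.
FUNCTIONS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}

def evaluate(node, variables):
    """Recursively evaluate a tree given values for its variable leaves."""
    if isinstance(node, str):          # leaf: look up the variable's value
        return variables[node]
    name, *children = node             # function node: evaluate children first
    return FUNCTIONS[name](*(evaluate(c, variables) for c in children))

# Represents the function (high * low) - close.
tree = ("sub", ("mul", "high", "low"), "close")
value = evaluate(tree, {"open": 2.0, "high": 5.0, "low": 1.0, "close": 3.0})  # 2.0
```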
6
GP Crossover
To perform crossover, randomly pick a node in each of the two individuals, then swap those nodes along with their subtrees.
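A minimal sketch of subtree crossover on nested-tuple trees follows. The tuple representation and the helper names (`paths`, `get`, `replace`, `size`) are assumptions for the example; nodes are picked uniformly at random, which is only one of several possible choices.

```python
import random

def paths(tree, prefix=()):
    """Yield the path (tuple of child indices) to every node in the tree."""
    yield prefix
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from paths(child, prefix + (i,))

def get(tree, path):
    """Return the subtree found by following path from the root."""
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, new):
    """Return a copy of tree with the subtree at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], new),) + tree[i + 1:]

def crossover(a, b, rng):
    """Pick one node in each parent and swap the subtrees rooted there."""
    pa = rng.choice(list(paths(a)))
    pb = rng.choice(list(paths(b)))
    return replace(a, pa, get(b, pb)), replace(b, pb, get(a, pa))

def size(tree):
    """Number of nodes in the tree."""
    return sum(1 for _ in paths(tree))

parent_a = ("add", "open", ("mul", "high", "low"))
parent_b = ("sub", "close", "low")
child_a, child_b = crossover(parent_a, parent_b, random.Random(0))
```

Because the subtrees are swapped rather than copied, the two children together always contain as many nodes as the two parents.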
7
GP Mutation
To perform mutation, pick a node in the tree at random, remove that node and its subtree, and grow a new subtree in its place.
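Subtree mutation can be sketched in the same nested-tuple style. Everything named here (`grow`, `mutate`, `is_valid`, the 30% leaf probability, the depth limit of 3, and the reduced function set) is an assumption chosen to keep the example short.

```python
import random

FUNCTIONS = {"add": 2, "sub": 2, "mul": 2}   # hypothetical set: name -> arity
VARIABLES = ["open", "high", "low", "close"]

def grow(rng, max_depth):
    """Grow a random subtree no deeper than max_depth."""
    if max_depth == 0 or rng.random() < 0.3:
        return rng.choice(VARIABLES)
    name = rng.choice(sorted(FUNCTIONS))
    return (name,) + tuple(grow(rng, max_depth - 1) for _ in range(FUNCTIONS[name]))

def paths(tree, prefix=()):
    """Yield the path (tuple of child indices) to every node in the tree."""
    yield prefix
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from paths(child, prefix + (i,))

def replace(tree, path, new):
    """Return a copy of tree with the subtree at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], new),) + tree[i + 1:]

def mutate(tree, rng, max_depth=3):
    """Replace a randomly chosen node and its subtree with a freshly grown one."""
    path = rng.choice(list(paths(tree)))
    return replace(tree, path, grow(rng, max_depth))

def is_valid(tree):
    """Check that every leaf is a known variable and every arity matches."""
    if isinstance(tree, str):
        return tree in VARIABLES
    name, *children = tree
    return FUNCTIONS.get(name) == len(children) and all(is_valid(c) for c in children)

original = ("add", "open", ("mul", "high", "low"))
mutant = mutate(original, random.Random(1))
```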
8
ECJ
There are many frameworks for performing evolutionary computation. ECJ, a framework for evolutionary computation written in Java, was chosen. ECJ has a good set of features, good documentation and, importantly, is written in Java.
9
ECJ Architecture: Evolve
To begin a GP run in ECJ, call ec.Evolve with the desired parameter file passed as a command-line argument. ECJ uses parameter files to control almost every parameterized aspect of the program; they are essentially sequences of declarations. Evolve sets up the program, initializes the population, and is responsible for conducting the run, calling evaluate on each individual, etc.
10
ECJ Architecture: GPProblem
GPProblem is the class where the actual problem is defined. In it, the fitness function and the evaluate method are specified. The GPProblem evaluate method determines how the output of the function tree is evaluated, and is also responsible for setting an individual’s fitness. GPProblem is subclassed to specify application-specific evaluation.
11
ECJ Architecture: GPIndividual
12
ECJ Architecture: Crossover and Mutation
Crossover and mutation are performed by the BreedingPipeline class, which is responsible for generating individuals for the new generation. ECJ has built-in subclasses that provide crossover and mutation pipelines.
13
Function Nodes
There were a total of 30 nodes defined, of which 17 were variable nodes and 13 were function nodes.
Variable nodes: Open, High, Low, Close
Function nodes: Add, Subtract, Multiply, Divide, Zero-Lag Exponential Moving Average, Log, Sine, Cosine, Standard Deviation, Less Than, Greater Than, Equal To, If.
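Of the function nodes listed, the zero-lag exponential moving average is the least self-explanatory. The sketch below follows the commonly cited definition: an ordinary EMA applied to the price plus its change over the last (period - 1) / 2 days. The slides do not say which variant or period was used, so the function name `zlema`, the smoothing factor 2 / (period + 1), and the handling of the first few days are assumptions.

```python
def zlema(prices, period):
    """Zero-lag EMA; entries without enough history are None."""
    lag = (period - 1) // 2
    alpha = 2 / (period + 1)           # standard EMA smoothing factor
    out = []
    ema = None
    for t, price in enumerate(prices):
        if t < lag:
            out.append(None)           # not enough history for the lag term
            continue
        # De-lagged input: price plus its change over the last `lag` days.
        adjusted = price + (price - prices[t - lag])
        ema = adjusted if ema is None else alpha * adjusted + (1 - alpha) * ema
        out.append(ema)
    return out

flat = zlema([5.0] * 10, period=5)
```

On a constant price series the adjustment term is zero, so the result is simply the constant.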
14
Fitness Function and Evaluation
The fitness function in this case is immediately apparent: the more money an individual makes, the better its fitness should be. Less apparent is how to convert the output of an individual’s function to a buy/sell decision during evaluation. The approach chosen was to normalize the results over the past 60 days. Positions are opened and closed at the end of each day.
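The slides do not spell out the normalization, so the following is only one plausible reading: compute a z-score of the day's tree output against the trailing 60-day window, go long when it is positive and short when it is negative. The names `signals` and `total_profit`, the z-score rule, the one-unit position size, and staying flat until a full window exists are all assumptions.

```python
from statistics import mean, pstdev

def signals(outputs, window=60):
    """Map raw tree outputs to +1 (long), -1 (short), or 0 (flat)."""
    result = []
    for t, value in enumerate(outputs):
        history = outputs[max(0, t - window + 1): t + 1]
        spread = pstdev(history) if len(history) > 1 else 0.0
        # Stay flat without a full window or when the outputs do not vary.
        if len(history) < window or spread == 0:
            result.append(0)
            continue
        z = (value - mean(history)) / spread
        result.append(1 if z > 0 else -1 if z < 0 else 0)
    return result

def total_profit(positions, closes):
    """Profit of holding positions[t] from the close of day t to the next close."""
    return sum(p * (closes[t + 1] - closes[t]) for t, p in enumerate(positions[:-1]))

# Tiny illustration with a 3-day window instead of 60.
demo_signals = signals([1, 2, 3, 2, 1, 1, 1, 1], window=3)
demo_profit = total_profit([0, 0, 1, -1], [10, 11, 12, 14])
```

Under this reading, `total_profit` would serve as the raw fitness: the more money the positions make, the fitter the individual.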
15
Training Dataset
Data is from Yahoo Finance: S&P 500 and IBM from 1971 to 2010. The data is divided into a training dataset as follows: each generation, a random 2/3 of the data is selected to be the training portion. During evaluation, the system stays out of the market on days that are not in the training dataset. The goal of using random training sets is to encourage more general strategies.
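The per-generation sampling can be sketched as below. The helper names `training_days` and `mask_positions`, sampling individual days rather than contiguous blocks, and the use of 0 for "out of the market" are assumptions; the slides only state that a random 2/3 of the data is chosen each generation.

```python
import random

def training_days(num_days, rng, fraction=2 / 3):
    """Pick a random subset of day indices to serve as this generation's training set."""
    return set(rng.sample(range(num_days), round(num_days * fraction)))

def mask_positions(positions, train):
    """Force a flat (0) position on every day outside the training set."""
    return [p if t in train else 0 for t, p in enumerate(positions)]

rng = random.Random(1)
train = training_days(9, rng)              # redrawn at the start of each generation
masked = mask_positions([1] * 9, train)
```

Because the subset changes every generation, a strategy that only profits on a handful of specific days is unlikely to keep a high fitness for long.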
16
Results
On average, the generated strategies have an 8-11% annualized rate of return trading on the S&P 500, while buy and hold has an annualized rate of return of 6.6%.
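For reference, an annualized rate of return is usually computed as the compound (geometric) growth rate. The slides do not say how their figures were derived, so the function below is the standard definition rather than the author's method, and the name `annualized_return` is an assumption.

```python
def annualized_return(initial_value, final_value, years):
    """Compound annual growth rate: (final / initial) ** (1 / years) - 1."""
    return (final_value / initial_value) ** (1 / years) - 1

# A portfolio growing from 100 to 121 over 2 years compounds at about 10% per year.
rate = annualized_return(100.0, 121.0, 2)
```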
17
Results
18
Future Work
Add more functions
Multiple trees for each individual
Adaptive selection
Change evaluation method
Interface directly with Yahoo API