Forecasting Financial Time Series using Neural Networks, Genetic Programming and AutoRegressive Models
Mohammad Ali Farid Faiza Ather M. Omer Sheikh Umair H. Siddiqui
Financial time series data: non-linear, non-trivial and stochastic, which makes prediction difficult (cf. the Efficient Market Hypothesis).
AIM of the project: to test the predictability of financial time series data using both parametric and non-parametric models.
EFFICIENT MARKET HYPOTHESIS
The possibility of arbitrage makes it impossible to predict future values: any prediction shared by everyone in the market will not come true.
Assumes that every investor has perfect and equal information.
If the EMH is true, then speculative trading is no better than gambling.
If the EMH is NOT true, then the market can be predicted.
AGENDA
Understanding the data: data analysis
Modeling and prediction using: neural networks, econometric methods, genetic programming
Conclusion
DATA SET
FINANCIAL TIME SERIES: daily exchange rate
DATA: exchange rate between the US Dollar and the Pakistani Rupee
SERIES: 371 points, from 31 Jan 2002 to 4 Feb 2003 (period selected for its stability and lack of external shocks)
DATA PROVIDER: OANDA Currency Exchange
DATA ANALYSIS
PROBLEM: non-stationarity; a 2% difference in mean between the first half and the second half of the series
SOLUTION: preprocessing
[Summary statistics table (Mean, Min, Max, Range); values lost in extraction]
DATA PREPROCESSING
The unpreprocessed data exhibits non-stationarity: the trend does not allow modeling.
First-order differencing is required.
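A minimal sketch of the differencing step in Python (the sample values are illustrative, not the actual OANDA series):

```python
import numpy as np

# Hypothetical daily USD/PKR quotes standing in for the real series.
rates = np.array([59.72, 59.74, 59.71, 59.80, 59.78, 59.85])

# First-order differencing: delta_t = x_t - x_(t-1).
# This removes the trend so the series becomes (approximately) stationary.
delta = np.diff(rates)
print(delta)  # e.g. [ 0.02 -0.03  0.09 -0.02  0.07]
```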
STATISTICAL ANALYSIS
Mean: on the order of 10^-6 [mantissa lost]
Standard deviation: [value lost]
Skewness: [value lost] (negatively skewed)
Kurtosis: [value lost] (leptokurtic)
Min / Max / Range: [values lost]
MODELING THE EXCHANGE RATE
Numerous factors affect exchange rates; it is impossible to factor in all the variables.
Solution: use correlation, i.e. use past values to predict the future.
DATA SETS
Windowing: selecting the input and output sets.
The project used five different types of data sets for training. The variations are based on different window sizes and different levels of data processing.
DATA SETS
Data sets used for modeling (a windowing sketch follows this list):
Data Set A: primary series (daily changes in the exchange rate) with a 7-1 window
Data Set B: primary series with a 14-1 window
Data Set C: three-day moving averages with a 7-1 window
Data Set D: primary series with a 7-3 window
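A sketch of how the windowed data sets could be built, assuming the series is held in a NumPy array (the synthetic `rates` stands in for the actual 371-point series):

```python
import numpy as np

def make_windows(series, n_in, n_out):
    """Slide an n_in -> n_out window over the series: each sample
    uses n_in past values as inputs and the next n_out as targets."""
    X, y = [], []
    for i in range(len(series) - n_in - n_out + 1):
        X.append(series[i : i + n_in])
        y.append(series[i + n_in : i + n_in + n_out])
    return np.array(X), np.array(y)

rng = np.random.default_rng(0)
rates = 60 + np.cumsum(rng.normal(0, 0.05, 371))  # stand-in for the real series
delta = np.diff(rates)                            # daily changes

X_a, y_a = make_windows(delta, 7, 1)    # Data Set A: 7-1 window
X_b, y_b = make_windows(delta, 14, 1)   # Data Set B: 14-1 window
X_d, y_d = make_windows(delta, 7, 3)    # Data Set D: 7-3 window

ma3 = np.convolve(delta, np.ones(3) / 3, mode="valid")  # 3-day moving average
X_c, y_c = make_windows(ma3, 7, 1)      # Data Set C: 7-1 window on the averages
```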
MODELING AND PREDICTION USING NEURAL NETWORKS
Feed Forward Networks
Radial Basis Networks
Recurrent Elman Networks
COMPARISON OF DATA SET A AND DATA SET B
FEED FORWARD NETWORKS
Universal approximators: capable of representing non-linear functional mappings between inputs and outputs.
Can be trained with a powerful and computationally efficient algorithm called error back-propagation.
[Architecture diagram]
COMPARISON OF ALL THE FFNs
FFNs with varied activation functions, training algorithms and numbers of hidden layers.
FEED FORWARD NETWORKS
BEST: single hidden layer
Activation functions: logsig (hidden) and linear (output)
Training algorithm: gradient descent with momentum
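A minimal NumPy sketch of that winning configuration (logsig hidden layer, linear output, batch gradient descent with momentum); the layer size, learning rate and epoch count are assumptions, not values from the study:

```python
import numpy as np

def logsig(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_ffn(X, y, hidden=10, lr=0.1, momentum=0.9, epochs=500, seed=0):
    """One hidden layer (logsig), linear output, batch gradient descent
    with momentum on the mean squared error."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0, 0.5, (X.shape[1], hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0, 0.5, (hidden, y.shape[1])); b2 = np.zeros(y.shape[1])
    vW1, vb1 = np.zeros_like(W1), np.zeros_like(b1)
    vW2, vb2 = np.zeros_like(W2), np.zeros_like(b2)
    for _ in range(epochs):
        h = logsig(X @ W1 + b1)            # hidden activations
        err = (h @ W2 + b2) - y            # output error (linear output layer)
        gW2 = h.T @ err / len(X); gb2 = err.mean(axis=0)
        dh = (err @ W2.T) * h * (1 - h)    # backprop through logsig
        gW1 = X.T @ dh / len(X); gb1 = dh.mean(axis=0)
        # momentum update: v <- m*v - lr*grad, then w <- w + v
        vW2 = momentum * vW2 - lr * gW2; W2 += vW2
        vb2 = momentum * vb2 - lr * gb2; b2 += vb2
        vW1 = momentum * vW1 - lr * gW1; W1 += vW1
        vb1 = momentum * vb1 - lr * gb1; b1 += vb1
    return lambda Xn: logsig(Xn @ W1 + b1) @ W2 + b2

# e.g. net = train_ffn(X_a, y_a); predictions = net(X_a)
```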
RADIAL BASIS NETWORKS
Based on the viewpoint that learning is similar to finding a surface in a multi-dimensional space that provides the best fit to the training data.
Hidden units provide a set of “functions” that constitute an arbitrary “basis” for the input patterns when they are expanded into the hidden-unit space; these functions are called radial basis functions.
Architecture:
The input layer: source nodes.
The hidden layer: of sufficiently high dimension.
The output layer: supplies the network's response to the activation patterns applied to the input layer.
RADIAL BASIS NETWORK
The Gaussian function was used as the basis function in the hidden layer:
Φ(r) = exp(−r² / (2σ²)),  for σ > 0 and r ≥ 0
The spread (σ) of the radial basis is a significant factor in the design of the network.
COMPARISON OF RADIAL BASIS NETWORKS
RB4: spread = 0.3
RB5: spread = 1.5
RADIAL BASIS NETWORK
BEST: spread σ = 0.3, i.e. roughly 1/3 of the data range
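A sketch of a Gaussian radial basis network under the same assumptions; taking the centres as a subset of the training inputs and solving the output weights by least squares are our simplifications, not details from the study:

```python
import numpy as np

def train_rbf(X, y, centers, spread):
    """Gaussian RBF network: hidden units compute
    phi(r) = exp(-r^2 / (2 * spread^2)) of the distance to each centre;
    linear output weights are then solved by least squares."""
    def design(Xn):
        # pairwise distances between inputs and centres
        r = np.linalg.norm(Xn[:, None, :] - centers[None, :, :], axis=2)
        Phi = np.exp(-(r ** 2) / (2 * spread ** 2))
        return np.hstack([Phi, np.ones((len(Xn), 1))])  # add a bias column
    W, *_ = np.linalg.lstsq(design(X), y, rcond=None)
    return lambda Xn: design(Xn) @ W

# Spread as in the best network: 0.3, roughly one third of the data range.
# e.g. rbf = train_rbf(X_a, y_a, centers=X_a[::5], spread=0.3)
```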
RECURRENT ELMAN NETWORKS
A modification of the feed-forward architecture: a “context” layer is added, which retains information between observations.
As new inputs are fed into the network, the previous contents of the hidden layer are copied into the context layer; these are then fed back into the hidden layer at the next time step.
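A forward-pass sketch of the context-layer mechanics (training, which the study did with gradient descent with momentum, is omitted; the weight names and shapes are hypothetical):

```python
import numpy as np

def logsig(x):
    return 1.0 / (1.0 + np.exp(-x))

def elman_forward(seq, Wx, Wc, b, Wo, bo):
    """One pass over a sequence: the context layer holds a copy of the
    previous hidden state and is fed back into the hidden layer."""
    context = np.zeros(Wc.shape[0])
    outputs = []
    for x in seq:                    # one observation per time step
        h = logsig(Wx @ x + Wc @ context + b)
        outputs.append(Wo @ h + bo)  # linear output layer
        context = h.copy()           # context <- current hidden state
    return np.array(outputs)
```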
COMPARISON OF ELMAN NETWORKS
RESULTS WITH ELMAN NETWORKS
BEST: Elman 7
Hidden layers: 2 (10 neurons in the first, 5 in the second)
Activation function for both hidden layers: log-sigmoid
Training algorithm: gradient descent with momentum
A COMPARISON OF ALL THE NEURAL NETS
AUTO REGRESSIVE MODEL WITH 16 LAGS
[Stepwise AR regression output: the model was successively reduced from 16 lags down to 9, with lag terms removed one stage at a time. Each stage reported Coefficients, Standard Errors, t-Stats and P-values for the intercept and the deltaE lag terms; the numeric values were lost in extraction.]
AUTO REGRESSIVE MODEL WITH 9 LAGS
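A sketch of fitting the final 9-lag autoregression with statsmodels (the synthetic series again stands in for the real data):

```python
import numpy as np
from statsmodels.tsa.ar_model import AutoReg

rng = np.random.default_rng(0)
rates = 60 + np.cumsum(rng.normal(0, 0.05, 371))  # stand-in for the real series
delta = np.diff(rates)                            # first differences

result = AutoReg(delta, lags=9).fit()             # AR(9) on the daily changes
print(result.params)                              # intercept + 9 lag coefficients
forecast = result.predict(start=len(delta), end=len(delta) + 14)  # 15 steps ahead
```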
SIMULATION TESTS FOR NORMALITY: Skewness, Kurtosis, K-S and L1 statistics for the original series against simulated upper and lower bounds [table values lost in extraction]
SIMULATION TESTS FOR AUTOCORRELATION: autocorrelations at lags 1-4 for the original series against simulated upper and lower bounds [table values lost in extraction]
SIMULATION TEST FOR HETEROSKEDASTICITY: Goldfeld-Quandt statistic for the original series against simulated upper and lower bounds [table values lost in extraction]
SIMULATION TEST FOR STRUCTURAL STABILITY: Chow statistic for the original series against simulated upper and lower bounds [table values lost in extraction]
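The structural-stability check can be reproduced with a hand-rolled Chow test; a sketch assuming the regressor matrix X already includes an intercept column:

```python
import numpy as np

def sse(X, y):
    """Residual sum of squares from an OLS fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return resid @ resid

def chow_stat(X, y, split):
    """Chow test for a structural break at index `split`:
    F = ((SSE_pooled - (SSE_1 + SSE_2)) / k) / ((SSE_1 + SSE_2) / (n - 2k))."""
    k, n = X.shape[1], len(y)
    s_pool = sse(X, y)
    s_split = sse(X[:split], y[:split]) + sse(X[split:], y[split:])
    return ((s_pool - s_split) / k) / (s_split / (n - 2 * k))
```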
GENETIC ALGORITHMS
Inspired by Darwin's theory of evolution.
Invented by John Holland: 'Adaptation in Natural and Artificial Systems' (1975).
The solution is an evolved solution.
Applications: music, military, optimization techniques, etc.
GENETIC PROGRAMMING (GP)
GP: a branch of GAs.
John Koza (1992): used GAs to evolve programs that perform certain tasks.
LISP programs, written in prefix notation, were used.
STEPS IN GP
1. Generate an initial population of random functions (programs).
2. Execute each program and assign it a fitness value.
3. Create a new population of programs by crossover (sexual reproduction) and mutation.
The solution is the best program found in any generation. A skeleton of this loop is sketched below.
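In the skeleton, `fitness`, `random_program`, `crossover` and `mutate` are placeholders for the GP system's internals, and the truncation selection is our simplification:

```python
import random

def gp_search(fitness, random_program, crossover, mutate,
              pop_size=1000, generations=150, p_mutate=0.1):
    """Skeleton of the GP loop: evaluate every program, keep the best seen,
    and breed the next population by crossover and mutation."""
    population = [random_program() for _ in range(pop_size)]
    best = min(population, key=fitness)
    for _ in range(generations):
        ranked = sorted(population, key=fitness)       # lower SSE = fitter
        best = min(best, ranked[0], key=fitness)
        parents = ranked[: pop_size // 2]              # truncation selection
        children = []
        while len(children) < pop_size:
            a, b = random.sample(parents, 2)
            child = crossover(a, b)                    # sexual reproduction
            if random.random() < p_mutate:
                child = mutate(child)
            children.append(child)
        population = children
    return best  # best program found in any generation
```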
GP & FINANCIAL TIME SERIES FORECASTING
TSGP by Mahmoud Kaboudan (School of Business, University of Redlands).
Fitness criterion: minimize the sum of squared errors (SSE).
Parameters of the program:
T: data points in the historical (training) set
k: number of data points to forecast
f: data points for the ex post forecast
p: population size
g: number of generations
n: number of explanatory variables
s: number of searches desired (to escape local minima)
RESULTS & OBSERVATIONS
Variables we manipulated:
Data in the historical set (T): increasing T increases search time exponentially
Total points to forecast (k)
Population size (p): yielded interesting observations
Number of generations (g)
Upon completion, files are generated with the results of each search, including a results file with forecasts based on the best evolved model found.
RESULTS FROM A SAMPLE RUN
T = 80, k = 15, p = 1000, g = 150, n = 7, s = 50
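The sample run's settings, written as a hypothetical configuration (`f` was not reported for this run, so it is omitted):

```python
# Parameter names mirror TSGP's terms from the earlier slide.
tsgp_params = {
    "T": 80,    # points in the historical (training) set
    "k": 15,    # points to forecast
    "p": 1000,  # population size
    "g": 150,   # generations
    "n": 7,     # explanatory variables (lags)
    "s": 50,    # independent searches, to avoid local minima
}
```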
AN IMPORTANT OBSERVATION
T = 80, k = 15, p = 2000, g = 200, n = 14, s = 100
Reason for the anomaly: research shows that beyond some limit, very large populations stop being useful. This is exactly what we observed in our results.
CONCLUSION
Overview of the project: neural networks, autoregressive models, genetic programming
Results: prediction with radial basis and feed-forward networks can give profitable returns
Milestones, significance, further research
COMPARISON OF ALL MODELS
MILESTONES
More than 100 neural networks trained and tested.
First documented study of the Pakistani currency market.
One of the broadest studies on the subject.
Results can be used for policy making, investment decisions and speculative financial trading.
Developed a system that can make PROFITS.
FURTHER RESEARCH
Developing hybrid models to improve the predictability of the system.
Developing trading rules for investing in the currency market.
Making the system resilient to external, non-market shocks.
Q & A