

1 Entropy Rate and Profitability of Technical Analysis: Experiments on the NYSE US 100 Stocks
Nicolas NAVET – INRIA, France
Shu-Heng CHEN – AIECON/NCCU, Taiwan
CIEF /22/2007

2 Outline
Entropy rate: the uncertainty remaining in the next symbol produced given knowledge of the past – a measure of predictability.
Questions:
- Do stocks exhibit differing entropy rates?
- Does low entropy imply profitability of TA?
Methodology:
- NYSE US 100 stocks – daily data
- TA rules induced using Genetic Programming

3 Estimating entropy
An active field of research, notably in neuroscience.
Maximum-likelihood (plug-in):
- constructs an n-th order Markov chain
- not suited to capturing long/medium-term dependencies
Compression-based techniques:
- Lempel-Ziv algorithm, Context-Tree Weighting
- fast convergence rate – suited to long/medium-term dependencies

4 Selected estimator
Kontoyiannis et al. 1998:

\hat{h}_{SM} = \left( \frac{1}{n} \sum_{i=1}^{n} \frac{\Lambda_i}{\log_2 n} \right)^{-1}

where Λ_i is the length of the shortest string starting at position i that does not appear in the i previous symbols.
Example: Λ_6 = 3
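The match-length estimator on this slide can be sketched in Python. This is a minimal illustration, not the authors' code: the helper names (`shortest_unseen`, `entropy_rate_sm`) are hypothetical, symbols are assumed to be single digits so the sequence can be encoded as a string, and the normalization follows the formula above.

```python
import math
import random

def shortest_unseen(s, i):
    # Lambda_i: length of the shortest substring starting at position i
    # that does not appear in the i previous symbols s[:i]
    l = 1
    while i + l <= len(s) and s[i:i + l] in s[:i]:
        l += 1
    return l

def entropy_rate_sm(symbols):
    # hat(h)_SM = ( (1/n) * sum_i Lambda_i / log2(n) )^{-1}
    s = ''.join(map(str, symbols))   # assumes single-character symbols
    n = len(s)
    lam = [shortest_unseen(s, i) for i in range(1, n)]
    return len(lam) / sum(l / math.log2(n) for l in lam)

# sanity check on a uniform 8-symbol source (theoretical entropy: 3 b.p.c.)
random.seed(42)
seq = [random.randint(0, 7) for _ in range(2000)]
h = entropy_rate_sm(seq)
print(round(h, 2))
```

On short samples the estimate is biased below the true 3 b.p.c., which is consistent with the finite-sample behavior reported on the next slide.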

5 Performance of the estimator
Experiments:
- Uniformly distributed r.v. in {1,2,...,8} – theoretical entropy h = -\sum_{i=1}^{8} \frac{1}{8}\log_2\frac{1}{8} = 3 b.p.c.
- Boost C++ random generator
- Sample of size : ĥ_SM = 2.96
Note 1: with a larger sample of size , ĥ_SM ≥ 2.99
Note 2: with the standard C rand() function and a sample size of 10000, ĥ_SM = 2.77

6 Preprocessing the data (1/2)
Log ratio between closing prices: r_t = ln(p_t / p_{t-1})
Discretization: {r_t} ∈ ℝ → {A_t} ∈ ℕ, e.g. 3,4,1,0,2,6,2,…

7 Preprocessing the data (2/2)
Discretization is tricky – 2 problems:
- How many bins? (size of the alphabet)
- How many values in each bin?
Guidelines: maximize entropy with a number of bins in line with the sample size.
Here: alphabet of size 8, same number of values in each bin (homogeneous partitioning).
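The two preprocessing steps can be sketched as follows. The helper names are my own; the binning is equal-frequency ("homogeneous partitioning"), which makes each symbol roughly equally likely and thus maximizes the entropy of the alphabet, as the guideline above suggests.

```python
import math
from collections import Counter

def log_returns(prices):
    # r_t = ln(p_t / p_{t-1})
    return [math.log(p / q) for q, p in zip(prices, prices[1:])]

def discretize(values, n_bins=8):
    # equal-frequency bins: rank the values, then map each rank to a bin,
    # so every bin receives (roughly) the same number of values
    order = sorted(range(len(values)), key=lambda i: values[i])
    symbols = [0] * len(values)
    for rank, idx in enumerate(order):
        symbols[idx] = rank * n_bins // len(values)
    return symbols

# illustrative prices only; a tiny sample, hence a 4-symbol alphabet here
prices = [100.0, 101.5, 99.8, 102.3, 103.1, 101.0, 104.2, 102.8, 105.0]
symbols = discretize(log_returns(prices), n_bins=4)
print(symbols)
```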

8 Entropy of NYSE US 100 stocks – period
(figure: histogram of entropy estimates; a normal distribution with the same mean and standard deviation is plotted for comparison)
Mean = Median = 2.75, Max = 2.79, Min = 2.68 – compare the C library rand(): 2.77!

9 Entropy is high, but price time series are not random!
(figures: original time series vs. randomly shuffled time series)

10 Stocks under study

Highest entropy time series    Lowest entropy time series
Symbol  Entropy                Symbol  Entropy
OXY     2.789                  TWX     2.677
VLO     2.787                  EMC     2.694
MRO     2.785                  C       2.712
BAX     2.780                  JPM     2.716
WAG     2.776                  GE      2.723

11 Autocorrelation analysis
(figures: typical high-complexity stock (OXY); typical low-complexity stock (EMC))
Up to lag 100, there are on average 6 autocorrelations outside the 99% confidence bands for the lowest-entropy stocks, versus 2 for the highest-entropy stocks.
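The kind of count reported on this slide can be reproduced with a short script. This is an illustrative sketch, not the authors' analysis: it uses the standard white-noise approximation for the 99% confidence band, ±2.576/√n, and checks it on an i.i.d. Gaussian series, where roughly 1 lag in 100 should fall outside the band by chance.

```python
import math
import random

def autocorr(x, lag):
    # sample autocorrelation at the given lag
    n = len(x)
    mu = sum(x) / n
    var = sum((v - mu) ** 2 for v in x)
    cov = sum((x[i] - mu) * (x[i + lag] - mu) for i in range(n - lag))
    return cov / var

def count_significant(x, max_lag=100, z=2.576):
    # number of lags whose autocorrelation falls outside the 99%
    # white-noise confidence band +/- z / sqrt(n)
    band = z / math.sqrt(len(x))
    return sum(1 for k in range(1, max_lag + 1) if abs(autocorr(x, k)) > band)

random.seed(1)
iid = [random.gauss(0, 1) for _ in range(1500)]
c = count_significant(iid)
print(c)  # white noise: only a handful of spurious exceedances expected
```

Running the same count on the discretized return series of a low-entropy versus a high-entropy stock is what yields the 6-versus-2 comparison above.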

12 Part 2: does low entropy imply better profitability of TA?
Addressed here: are GP-induced rules more efficient on low-entropy stocks?

13 GP: the big picture
1) Creation of the trading rules using GP (training interval)
2) Selection of the best resulting strategies (validation interval)
3) Further selection on unseen data – one strategy is chosen for out-of-sample performance evaluation (out-of-sample interval)

14 GP performance assessment
Buy-and-hold is not a good benchmark.
GP is compared with lottery trading (LT) of:
- same frequency: average number of transactions
- same intensity: time during which a position is held
Implementation of LT: random sequences with the right characteristics, e.g. 0,0,1,1,1,0,0,0,0,0,1,1,1,1,1,1,0,0,1,1,0,1,0,0,0,0,0,0,1,1,1,1,1,1,…
GP > LT? LT > GP? Student's t-test at the 95% confidence level – 20 GP runs / 1000 LT runs.
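One way to build such lottery-trading sequences is to draw the holding-period lengths and their placements at random while fixing the number of trades and the total days in the market. The construction below is my own sketch of this idea (the function name and parameters are hypothetical), not the implementation used in the study.

```python
import random

def lottery_positions(horizon, n_trades, days_held):
    # random 0/1 position sequence of length `horizon` with exactly
    # `n_trades` holding periods whose lengths sum to `days_held`
    # (same trading frequency and intensity as the benchmarked strategy)
    cuts = sorted(random.sample(range(1, days_held), n_trades - 1))
    lengths = [b - a for a, b in zip([0] + cuts, cuts + [days_held])]
    gaps = horizon - days_held
    # distinct, sorted offsets keep at least one flat day between trades
    starts = sorted(random.sample(range(gaps + 1), n_trades))
    seq, prev = [], 0
    for g, l in zip(starts, lengths):
        seq += [0] * (g - prev) + [1] * l
        prev = g
    return seq + [0] * (gaps - prev)

random.seed(7)
seq = lottery_positions(horizon=252, n_trades=5, days_held=60)
print(sum(seq))  # days in the market
```

Drawing 1000 such sequences and comparing their net returns to the 20 GP runs with a t-test gives the significance results reported below.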

15 Experimental setup
Data preprocessed with a 100-day MA.
Trading systems:
- Entry (long): GP-induced rule with a classical set of functions/terminals
- Exit:
  - stop loss: 5%
  - profit target: 10%
  - 90-day stop
Fitness: net return – initial equity: 100K$ – position sizing: 100%
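The three exit rules above can be sketched as a small function (hypothetical naming, long-only as in the setup; the entry rule itself is GP-induced and not shown):

```python
def exit_day(prices, entry, stop_loss=0.05, profit_target=0.10, max_hold=90):
    # first day after `entry` on which one of the exit rules fires:
    # 5% stop loss, 10% profit target, or the 90-day time stop
    p0 = prices[entry]
    last = min(entry + max_hold, len(prices) - 1)
    for t in range(entry + 1, last + 1):
        if prices[t] <= p0 * (1 - stop_loss):      # stop loss
            return t
        if prices[t] >= p0 * (1 + profit_target):  # profit target
            return t
    return last                                    # time stop

print(exit_day([100, 102, 94], 0))  # stop loss fires on day 2
```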

16 Results: high entropy stocks
- GP is always profitable.
- LT is never better than GP (at the 95% confidence level).
- GP outperforms LT 2 times out of 5 (at the 95% confidence level).

17 Results: low entropy stocks
- GP is never better than LT (at the 95% confidence level).
- LT outperforms GP 2 times out of 5 (at the 95% confidence level).

18 Explanations (1/2)
GP does not perform well when the training period is very different from the out-of-sample period, e.g.:
(figures: typical low-complexity stock (EMC); typical high-complexity stock (MRO))

19 Explanations (2/2)
In the 2 cases where GP outperforms LT (BAX, WAG), the training period is quite similar to the out-of-sample period.

20 Conclusions
- EOD NYSE time series have high but differing entropies.
- There are (weak) temporal dependencies.
- Here, the more predictable stocks tend to be the less risky ones.
- GP works well if the training period is similar to the out-of-sample period.

21 Perspectives
- Higher predictability levels can be observed at the intraday timeframe (what about higher timeframes?).
- Experiments are needed with stocks less similar than those of the NYSE US 100.
- Predictability tells us about the existence of temporal patterns – but how easy or difficult is it to discover them?

22 ?