Backtesting strategies based on multiple signals Robert Novy-Marx University of Rochester and NBER 1.


1 Backtesting strategies based on multiple signals Robert Novy-Marx University of Rochester and NBER 1

2 Multi-signal Strategies Proliferation in industry  E.g., MSCI Quality Index: high ROE, low ROE vol., low leverage  “Smart beta” products RAFI: weight on sales, CF, BE, and dividends. Increasingly common in academia  Piotroski’s F-score (9 signals)  Asness et al. Quality Score (21 signals) 2

3 Why the increased interest? Because finding “alpha” is hard  And they work great! Impressive backtest performance Too good?  Alpha should be hard to find Lots of smart people looking  Huge incentives to try And even to believe! 3

4 Issues Every choice has potential to bias results  Much bigger problem with multiple signals Not just which signals are used… But how they are used! Basic issue  Each signal is used so that it individually predicts positive in-sample returns Seems like a small thing, but it’s not! 4

5 Types of biases Snooping: in-sample aspect of data  guides strategy formation Two types to worry about:  Multiple testing bias Consider multiple strategies, show only best one  Overfitting E.g., Ex post MVE SRs always high  MVE strat buys “winners” and sells “losers” 5

6 Examples Bet on a series of fair coin flips What if you knew that there were: 1.More heads in the first (or second) half And could bet on just the early (or late) flips? 2.More heads than tails? What sorts of biases?  Do we account for these in finance? 6

7 First type: multiple testing (or selection)  Don’t really account for it, formally Do suspect (know) people look at more things Second type: overfitting  Bet heads, not tails!  Account for it? One signal: Absolutely!  5% critical t = 1.96 (not 1.65) Multiple signals: No! 7
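The coin-flip example can be made concrete with a small simulation. The sketch below is illustrative only: the function name `backtest_coin_bets`, the 100-flip sample, and the payoff convention (+1 for a correct call, -1 otherwise) are assumptions, not part of the slides. It compares always betting heads with betting on whichever side turned out to be more common in the sample.

```python
import random
import statistics

def backtest_coin_bets(n_flips=100, n_trials=2000, seed=0):
    """Average per-flip payoff of two betting rules on fair coin flips.

    Hypothetical illustration: +1 per correct call, -1 per wrong call.
    """
    rng = random.Random(seed)
    always_heads, snooped = [], []
    for _ in range(n_trials):
        flips = [rng.choice((1, -1)) for _ in range(n_flips)]  # +1 heads, -1 tails
        total = sum(flips)
        always_heads.append(total / n_flips)   # side fixed before seeing the data
        snooped.append(abs(total) / n_flips)   # side chosen after seeing the data
    return statistics.mean(always_heads), statistics.mean(snooped)

if __name__ == "__main__":
    fixed, picked = backtest_coin_bets()
    print(f"always heads: {fixed:+.4f}   bet the in-sample winner: {picked:+.4f}")
```

The fixed rule averages out to roughly zero, while the rule that picks its side after the fact shows a positive "backtested" payoff on a coin that is fair by construction. This is the one-signal version of the overfitting that a two-sided test (1.96 rather than 1.65) already corrects for.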

8 Thought experiment? 8

9 Null hypothesis  “Signals” don’t predict differences in average returns E.g., monkeys selecting stocks by throwing darts at the WSJ Performance distribution  t-statistics ~ N(0,1) More or less  Excess kurtosis and heteroscedasticity 9

10 What if you diversify across the lucky monkeys?  Those with positive alpha Clearly “snooping”  Using in-sample aspect of data to form the strategy How does this bias the results?  Expected t-stat? 10

11 Get the average return Diversify across their risks Yields a high t-statistic:  Can also frame this in SRs 11
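A rough simulation of the lucky-monkeys thought experiment follows. It assumes (these are simplifications for illustration, not the slides' bootstrap) that each monkey's t-statistic is an independent N(0,1) draw, that the strategies have equal volatility and are uncorrelated, and that the lucky ones are combined with equal weights. Under those assumptions the t-statistic of the combined portfolio of m strategies is the sum of their t-statistics divided by sqrt(m). The function names are hypothetical.

```python
import math
import random
import statistics

def lucky_monkey_tstat(n_monkeys, rng):
    """t-stat of an equal-weighted portfolio of the monkeys with positive in-sample t."""
    t_stats = [rng.gauss(0.0, 1.0) for _ in range(n_monkeys)]
    lucky = [t for t in t_stats if t > 0]
    if not lucky:
        return 0.0
    # Averaging m uncorrelated, equal-vol strategies scales volatility by 1/sqrt(m)
    return sum(lucky) / math.sqrt(len(lucky))

def mean_lucky_tstat(n_monkeys=20, n_trials=5000, seed=1):
    rng = random.Random(seed)
    return statistics.mean(lucky_monkey_tstat(n_monkeys, rng) for _ in range(n_trials))

if __name__ == "__main__":
    n = 20
    print(f"simulated mean t-stat: {mean_lucky_tstat(n):.2f}")
    print(f"sqrt(n / pi):          {math.sqrt(n / math.pi):.2f}")
```

With about half the monkeys lucky and an average lucky t-statistic of sqrt(2/π), the expected combined t-statistic is close to sqrt(n/π), which grows without bound in the number of monkeys even though no monkey has any skill.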

12 Same thing (essentially) happens if you use all the signals  But sign them so that they “predict” positive in-sample returns Standard statistics account for this…  If and only if N = 1! Again, strategy has high backtested SR  Question: expect high SR going forward? 12

13 Issues Combine things that backtest well  Get even better backtests Not surprising!  But what do the backtests mean? Biased?  Why? What biases?  If so, by how much? (Quantify!) Other intuitions? 13

14 Can address these Calculate empirical distributions  When signals are not informative  But multiple signals are used to select stocks Big boot-strapping exercise Derive theoretical distributions  In a simplified model Normal, homoscedastic returns  Use these to develop intuition 14

15 Strategy Construction 15

16 16 “Smart beta” Market

17 Signals Generate individually as pure noise!  Random normal variables Composite signals sum individual signals  Technical reason—mapping to theory Not important for the empirical work Cap multiplier is market equity  Essentially value-weighted strategies Again, not important 17

18 Best k-of-n strategies “Natural” construction  Investigate n signals  Pick the k “strongest” I.e., with most significant in-sample performance  Combine them how? Bootstrap for k ≤ n ≤ 100  Again, do it 10,000 times  Collect strategy t-statistics 18
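The best k-of-n exercise can be sketched in a few lines under a simplified null. Note the assumptions: instead of bootstrapping stock returns with noise signals as the slides describe, this draws each signal's in-sample t-statistic directly from N(0,1), signs each signal so it "predicts" positive returns, keeps the k with the largest absolute t-statistics, and equal-weights them (assuming uncorrelated, equal-volatility strategies). The function names and defaults are hypothetical.

```python
import math
import random

def best_k_of_n_tstat(n, k, rng):
    """In-sample t-stat of an equal-weighted combo of the k strongest of n noise signals."""
    # Signing each signal ex post turns its t-stat into |t|
    abs_t = sorted((abs(rng.gauss(0.0, 1.0)) for _ in range(n)), reverse=True)
    return sum(abs_t[:k]) / math.sqrt(k)

def critical_value(n, k, p=0.05, n_trials=10000, seed=2):
    """Empirical (1 - p) quantile of the best k-of-n t-stat under the null."""
    rng = random.Random(seed)
    draws = sorted(best_k_of_n_tstat(n, k, rng) for _ in range(n_trials))
    index = min(int((1 - p) * n_trials), n_trials - 1)
    return draws[index]

if __name__ == "__main__":
    for n, k in [(1, 1), (20, 1), (20, 3), (20, 20)]:
        print(f"best {k}-of-{n}: 5% critical value ~ {critical_value(n, k):.2f}")
```

The 1-of-1 case should land near the familiar 1.96, which gives a quick sanity check before looking at larger n and k.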

19 Two Issues When k < n, selection bias  When k = 1 < n, multiple testing bias Well understood When k > 1, overfitting  Data snooping In-sample aspect of data used to form strategy  Pure overfitting only if k = n Interaction! 19

20 Special Cases 20 Overfitting only Multiple-testing only

21 Pure Selection 21

22 Pure Overfitting 22

23 Both Biases 23

24 General Case What sort of strategies should we worry about?  How do we think researchers design strategies in practice? 3-of-20?  How many signals did MSCI consider for its quality index? 5-of-100? 24

25 General Case 25

26 Model (theory) Strategies signal-weight stocks Returns normally dist. (assumption)  Equal volatilities  Uncorrelated Combine signals by averaging  Or weighted averaging  combined strat = portfolio of pure strats So can apply facts from portfolio theory 26

27 27

28 Best k-of-n strategies Yields t-statistic distributions: 28

29 Critical values Analytic for special cases:  k = 1  k = n, with signal-weighting Generally by numeric integration  Simple computationally But don’t provide much intuition  Also derive good analytic approximations Useful for comparative statics 29
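For the k = 1 special case the critical value has a simple closed form if the n candidate t-statistics are treated as independent N(0,1) draws (an assumption of this sketch): the best signal passes a two-sided test at level p when the largest absolute t-statistic exceeds τ, and P(max |t| ≤ τ) = (2Φ(τ) − 1)^n. The helper name below is hypothetical.

```python
from statistics import NormalDist

def best_1_of_n_critical(n, p=0.05):
    """Critical value tau solving (2 * Phi(tau) - 1) ** n == 1 - p."""
    per_signal_coverage = (1 - p) ** (1 / n)
    return NormalDist().inv_cdf((1 + per_signal_coverage) / 2)

if __name__ == "__main__":
    for n in (1, 10, 20, 100):
        print(f"n = {n:3d}: 5% critical value = {best_1_of_n_critical(n):.3f}")
```

With n = 1 this reduces to the usual two-sided 5% value of about 1.96; the k = n case with signal-weighting is not covered by this sketch.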

30 Special Cases 30

31 Special Cases 31

32 General Cases 32

33 General Cases (Empirical) 33

34 General case, when k ~ n n = 100 34

35 General case, when k ~ n n = 40 35

36 General case, when k ~ n n = 20 36

37 Tension when increasing k Decreases vol., which improves performance Decreases average signal quality, which lowers returns and impairs performance  Initially first effect dominates (esp. w/ large n) “Optimal” use of worst ~1/2 of signals:  Throw them away! Mean k/2-of-k t-stats. ~13% higher than k-of-k Mean k-of-2k t-stats. ~59% higher than k-of-k 37
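The 13% and 59% figures can be checked with a back-of-the-envelope calculation. The derivation below is a sketch under assumptions that go beyond what the slide states: null t-statistics z ~ N(0,1), equal-weighted combination of uncorrelated equal-volatility strategies, and k large enough that the retained half is approximately the set with |z| above the median c = Φ⁻¹(0.75).

```latex
\begin{align*}
E\left[t_{k\text{-of-}k}\right]
  &\approx \sqrt{k}\,E|z| = \sqrt{\tfrac{2k}{\pi}} \approx 0.798\sqrt{k}, \\
E\left[\,|z| \;\middle|\; |z| > c\,\right]
  &= \frac{2\varphi(c)}{1/2} = 4\varphi(c) \approx 1.271,
  \qquad c = \Phi^{-1}(0.75) \approx 0.674, \\
E\left[t_{k/2\text{-of-}k}\right]
  &\approx \sqrt{\tfrac{k}{2}} \times 1.271 \approx 0.899\sqrt{k}
  \quad\Rightarrow\quad \frac{0.899}{0.798} \approx 1.13, \\
E\left[t_{k\text{-of-}2k}\right]
  &\approx \sqrt{k} \times 1.271
  \quad\Rightarrow\quad \frac{1.271}{0.798} \approx 1.59.
\end{align*}
```

Dropping the weaker half of the signals costs a factor of sqrt(2) in diversification but raises the average retained t-statistic from about 0.80 to about 1.27, so the net effect is positive.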

38 Alternative Quantification Pure multiple-testing bias equivalence  How many single signals would you have to look at to get the same bias? That is, given any critical value τ (i.e., for some best k-of-n strategy), find n* s.t. the best 1-of-n* strategy has the same critical value τ 38
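One way to compute this equivalence, assuming independent N(0,1) t-statistics under the null and a two-sided test at level p, is to invert the best 1-of-n relation (2Φ(τ) − 1)^n = 1 − p for n. The function name is hypothetical and the result is generally not an integer.

```python
import math
from statistics import NormalDist

def equivalent_single_signals(tau, p=0.05):
    """Number n* of single signals whose best 1-of-n* critical value equals tau."""
    coverage_one_signal = 2 * NormalDist().cdf(tau) - 1  # P(|t| <= tau) for one signal
    return math.log(1 - p) / math.log(coverage_one_signal)

if __name__ == "__main__":
    for tau in (1.96, 3.0, 4.0, 5.0):
        print(f"tau = {tau:.2f}: n* ~ {equivalent_single_signals(tau):,.0f}")
```

A critical value of 1.96 maps back to roughly one signal, while n* climbs very quickly as τ rises, which is why modest-looking multi-signal strategies can carry the bias of a very large single-signal search.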

39 39

40 Approximate Power Law Best k-of-n strategy bias: Similar to those from a best 1-of-n^k strategy!  Using analytic approximation, can show that log n* is roughly affine in log n, with slope ≈ k Can see this graphically 40

41 41

42 Conclusion View multi-signal claims skeptically  Multiple good signals imply better performance when combined  Good backtested performance does NOT imply any good signals “High tech” solution: use different tests “Low tech”: evaluate signals individually  Marginal power of each variable 42

43 General Approximation 43

44 How They Work Specify mean, S.D. of approx. normal  Combine with p-value to get how far out in tail E.g., 5% crit. ≈ mean + two standard deviations 44

45 General Approximation Where 45

