Published by Shanon Miller. Modified over 9 years ago.
1 Machine Learning for Stock Selection
Robert J. Yan, Charles X. Ling
University of Western Ontario, Canada
{jyan, cling}@csd.uwo.ca
2 Outline
– Introduction
– The stock selection task
– The Prototype Ranking method
– Experimental results
– Conclusions
3 Introduction
Objective:
– Use machine learning to select a small number of “good” stocks to form a portfolio
Research questions:
– Learning from a noisy dataset
– Learning from an imbalanced dataset
Our solution: Prototype Ranking
– A machine learning method designed specifically for this task
5 Stock Selection Task
Given information prior to week t, predict the performance of stocks in week t
– Training set:
             Predictor 1          Predictor 2          Predictor 3               Goal
  Stock ID   Return of week t-1   Return of week t-2   Volume ratio of t-2/t-1   Return of week t
Learn a ranking function to rank the test data
– Select the n highest to buy and the n lowest to short-sell
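The task above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `make_predictors` and `select_long_short` are hypothetical, and "volume ratio of t-2/t-1" is read here as the volume of week t-2 divided by the volume of week t-1. The scores stand in for whatever the learned ranking function outputs.

```python
from typing import Dict, List, Sequence, Tuple


def make_predictors(returns: Sequence[float],
                    volumes: Sequence[float]) -> Tuple[float, float, float]:
    """Build the three predictors from weekly history ending at week t-1.

    Assumption: the last element of each sequence is week t-1 and the
    one before it is week t-2.
    """
    if len(returns) < 2 or len(volumes) < 2:
        raise ValueError("need at least two weeks of history")
    # (return of week t-1, return of week t-2, volume(t-2) / volume(t-1))
    return returns[-1], returns[-2], volumes[-2] / volumes[-1]


def select_long_short(scores: Dict[str, float],
                      n: int) -> Tuple[List[str], List[str]]:
    """Rank stocks by predicted week-t score; buy the top n, short the bottom n."""
    if n < 1 or 2 * n > len(scores):
        raise ValueError("n must satisfy 1 <= n <= len(scores) / 2")
    ranked = sorted(scores, key=lambda stock: scores[stock], reverse=True)
    return ranked[:n], ranked[-n:]
```

For example, `select_long_short({"A": 0.03, "B": -0.02, "C": 0.01, "D": -0.05}, 1)` puts stock A in the buy list and stock D in the short-sell list.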
7 Prototype Ranking
Prototype Ranking (PR): a machine learning method designed for noisy and imbalanced stock data
The PR system:
– Step 1. Find good “prototypes” in the training data
– Step 2. Use k-NN on the prototypes to rank the test data
8 Step 1: Finding Prototypes
Prototypes: representative points
– Goal: discover the underlying density/clusters of the training samples by distributing prototypes in the sample space
– Reduce the data size
[Figure: prototypes, prototype neighborhoods, and samples]
9 Analysis
Competitive learning for the stock selection task
– Pros:
  Noise-tolerant
  On-line update: practical for huge datasets
  Smoothly simulates the training samples
– Cons:
  Searching for the nearest prototype is slow
  Poor performance on the prediction task
    – Designed for tasks such as clustering, feature mapping…
    – Stock selection is a prediction task
  Poor performance when modeling imbalanced datasets
10 Finding Prototypes Using Competitive Learning
General competitive learning:
– Step 1: Randomly initialize a set of prototypes
– Step 2: Search for the nearest prototypes
– Step 3: Adjust the prototypes
– Step 4: Output the prototypes
The hidden density of the training data is reflected in the prototypes
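The four steps of general competitive learning can be sketched as a plain winner-take-all loop. This is a minimal illustration, not the authors' implementation: the learning rate `lr`, the number of `epochs`, and the choice to initialize prototypes as copies of randomly chosen samples are all assumptions made for the example.

```python
import random
from typing import List, Sequence


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def competitive_learning(samples: Sequence[Sequence[float]],
                         n_prototypes: int,
                         lr: float = 0.1,
                         epochs: int = 10,
                         seed: int = 0) -> List[List[float]]:
    rng = random.Random(seed)
    # Step 1: random initialization (assumption: copies of random samples).
    prototypes = [list(s) for s in rng.sample(list(samples), n_prototypes)]
    for _ in range(epochs):
        for s in samples:
            # Step 2: find the prototype nearest to this sample (the winner).
            winner = min(range(n_prototypes),
                         key=lambda i: sq_dist(prototypes[i], s))
            # Step 3: move the winner a fraction lr of the way toward the sample.
            prototypes[winner] = [p + lr * (x - p)
                                  for p, x in zip(prototypes[winner], s)]
    # Step 4: output the prototypes.
    return prototypes
```

Because each update is a convex combination of a prototype and a sample (for `lr` between 0 and 1), the prototypes stay inside the region covered by the data and tend to settle where samples are dense.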
11 Modifications for Stock Data
– In step 1: organize the initial prototypes in a tree structure
  – Fast nearest-prototype search
– In step 2: search for prototypes in the predictor space
  – Better learning effect for prediction tasks
– In step 3: adjust prototypes in the goal attribute space
  – Better learning effect on the imbalanced stock data
– In step 4: prune the prototype tree
  – Prune child prototypes if they are similar to the parent
  – Combine leaf prototypes to form the final prototypes
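One possible reading of the step 2 and step 3 modifications is sketched below: the winning prototype is chosen by distance in the predictor space only, and the update then moves its goal value toward the sample's return. The slides do not give the update rule, so the `Prototype` class, the learning rate `lr`, and the exact update are assumptions. The tree structure and pruning from steps 1 and 4 are left out for brevity.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Prototype:
    predictors: List[float]  # position in predictor space
    goal: float              # estimate of the goal attribute (return of week t)


def nearest_by_predictors(prototypes: Sequence[Prototype],
                          x: Sequence[float]) -> int:
    """Step 2 (modified): search using the predictors only, ignoring the goal."""
    return min(
        range(len(prototypes)),
        key=lambda i: sum((p - v) ** 2
                          for p, v in zip(prototypes[i].predictors, x)),
    )


def update(prototypes: List[Prototype], x: Sequence[float], y: float,
           lr: float = 0.1) -> int:
    """Step 3 (modified, assumed form): adjust the winner's goal toward y."""
    i = nearest_by_predictors(prototypes, x)
    prototypes[i].goal += lr * (y - prototypes[i].goal)
    return i
```

Searching on predictors alone mirrors test time, when the goal value is unknown, which is presumably why it suits a prediction task better than searching over all attributes.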
12 Step 2: Predicting Test Data
– Predict with the weighted average of the k nearest prototypes
– Update the model online with new data
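The prediction step can be sketched as follows. The slides do not say how the weights are chosen, so inverse-distance weighting is an assumption here, as are the function name `predict_goal` and the small constant `eps` that avoids division by zero. Each prototype is represented as a `(predictors, goal)` pair.

```python
import math
from typing import List, Sequence, Tuple

PrototypePair = Tuple[Sequence[float], float]  # (predictors, goal value)


def predict_goal(prototypes: List[PrototypePair],
                 x: Sequence[float],
                 k: int = 3,
                 eps: float = 1e-9) -> float:
    """Weighted average of the goal values of the k prototypes nearest to x."""
    if k < 1 or not prototypes:
        raise ValueError("need k >= 1 and at least one prototype")
    # Sort prototypes by Euclidean distance in predictor space, keep k nearest.
    nearest = sorted(prototypes, key=lambda p: math.dist(p[0], x))[:k]
    # Assumption: closer prototypes get larger (inverse-distance) weights.
    weights = [1.0 / (math.dist(p[0], x) + eps) for p in nearest]
    return sum(w * p[1] for w, p in zip(weights, nearest)) / sum(weights)
```

The predicted values for all test stocks can then be sorted to produce the ranking used for portfolio selection.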
14 Data
CRSP daily stock database
– 300 NYSE and AMEX stocks with the largest market cap
– From 1962 to 2004
15 Testing PR
– Experiment 1: larger portfolio, lower average return, lower risk (diversification)
– Experiment 2: is PR better than Cooper’s method?
16 Results of Experiment 1
[Figures: Average Return (1978-2004); Risk (std) (1978-2004)]
17 Experiment 2: Comparison to Cooper’s Method
– Cooper’s method (CP): a traditional non-ML method for stock selection…
– Compare PR and CP in 10-stock portfolios
18 Results of Experiment 2
Measures:
– Average Return (Ret.)
– Sharpe Ratio (SR), a risk-adjusted return: SR = Ret. / Std.
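The Sharpe Ratio as defined on this slide (average return divided by standard deviation, with no risk-free rate subtracted) can be computed as in the sketch below. The use of the sample standard deviation is an assumption; the slides do not say which estimator was used.

```python
import statistics
from typing import Sequence


def sharpe_from_summary(avg_return: float, std: float) -> float:
    """SR = Ret. / Std., following the slide's definition."""
    if std <= 0:
        raise ValueError("standard deviation must be positive")
    return avg_return / std


def sharpe_ratio(returns: Sequence[float]) -> float:
    """Sharpe Ratio from a series of periodic returns.

    Assumption: sample standard deviation (n - 1 in the denominator).
    """
    return sharpe_from_summary(statistics.fmean(returns),
                               statistics.stdev(returns))
```

As a check against the reported results, an average return of 1.69% with a standard deviation of 3.30% gives `sharpe_from_summary(1.69, 3.30)`, which is about 0.51.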
19 Results
Portfolio   Performance        1978-1993         1994-2004
                               PR      CP        PR      CP
10-stock    Ave. Return (%)    1.69    0.89      1.37    0.81
            STD (%)            3.30    2.80      6.20    5.10
            Sharpe Ratio       0.51    0.32      0.22    0.16
20-stock    Ave. Return (%)    1.35    0.80      1.32    0.81
            STD (%)            2.60    2.10      5.10    4.30
            Sharpe Ratio       0.52    0.38      0.26    0.19
30-stock    Ave. Return (%)    1.14    0.67      1.16    0.77
            STD (%)            2.20    1.80      4.60    3.50
            Sharpe Ratio       0.52    0.37      0.27    0.22
21 Conclusions
– PR: modified competitive learning and k-NN for noisy and imbalanced stock data
– PR does well in stock selection
  – Larger portfolio, lower return, lower risk
  – PR outperforms the non-ML method CP
– Future work: use it to invest and make money!