A hybrid SOFM-SVR with a filter-based feature selection for stock market forecasting. Huang, C. L. & Tsai, C. Y., Expert Systems with Applications, 2008.

Presentation transcript:

A hybrid SOFM-SVR with a filter-based feature selection for stock market forecasting. Huang, C. L. & Tsai, C. Y., Expert Systems with Applications, 2008.

Introduction
- Stock market price index prediction is regarded as a challenging task in finance.
- Support vector regression (SVR) has successfully solved prediction problems in many domains, including the stock market.

Introduction
- Filter-based feature selection to choose important input attributes
- SOFM algorithm to cluster the training samples
- SVR to predict the stock market price index
- A real futures dataset, the Taiwan index futures (FITX), is used to predict the next day's price index

Introduction
- SOFM + SVR: improves the prediction accuracy of the traditional SVR method and reduces its long training time
- SOFM + SVR + filter-based feature selection: further improves training time and prediction accuracy, and selects a better feature subset

SVR
- Unlike pattern recognition problems, where the desired outputs are discrete values (e.g., Boolean), support vector regression (SVR) deals with real-valued functions.
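As a minimal illustration (not the paper's code), an ε-insensitive SVR with an RBF kernel can be fit with scikit-learn; the toy data and the C, gamma, and epsilon values below are placeholders, not the paper's settings.

```python
import numpy as np
from sklearn.svm import SVR

# Toy real-valued target: y = sin(x) + noise (illustrative only).
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 6, size=(100, 1)), axis=0)
y = np.sin(X).ravel() + 0.1 * rng.standard_normal(100)

# RBF kernel with epsilon-insensitive loss: residuals smaller than epsilon
# are ignored, which yields a sparse set of support vectors.
svr = SVR(kernel="rbf", C=10.0, gamma=0.5, epsilon=0.05)
svr.fit(X, y)
y_pred = svr.predict(X)
```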

Self-Organizing Feature Maps (SOFM)

SOFM [figure: example SOFM grid with numbered nodes]
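A compact NumPy sketch of the SOFM training loop (find the best-matching unit, then pull it and its grid neighbors toward the sample). The grid size, learning-rate schedule, and neighborhood radius are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def train_sofm(data, grid_w=2, grid_h=2, epochs=100, lr0=0.5, radius0=1.0, seed=0):
    """Minimal self-organizing feature map; returns node weights of shape (grid_h*grid_w, n_features)."""
    rng = np.random.default_rng(seed)
    n_nodes = grid_w * grid_h
    weights = rng.uniform(size=(n_nodes, data.shape[1]))
    # Grid coordinates of each node, used by the neighborhood function.
    coords = np.array([(i // grid_w, i % grid_w) for i in range(n_nodes)], dtype=float)
    for t in range(epochs):
        lr = lr0 * np.exp(-t / epochs)          # decaying learning rate
        radius = radius0 * np.exp(-t / epochs)  # shrinking neighborhood
        for x in rng.permutation(data):
            bmu = np.argmin(np.linalg.norm(weights - x, axis=1))   # best-matching unit
            grid_dist = np.linalg.norm(coords - coords[bmu], axis=1)
            h = np.exp(-(grid_dist ** 2) / (2 * radius ** 2))      # Gaussian neighborhood
            weights += lr * h[:, None] * (x - weights)
    return weights

def assign_cluster(data, weights):
    """Cluster index = index of the nearest SOFM node for each sample."""
    return np.argmin(np.linalg.norm(data[:, None, :] - weights[None, :, :], axis=2), axis=1)
```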

Training the SOFM-SVR model
1. Scale the training set
2. Cluster the training dataset
3. Train an individual SVR model for each cluster
(a sketch wiring these steps together follows below)
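A sketch of how the three training steps could be wired together, reusing train_sofm and assign_cluster from the SOFM sketch above. The toy data, the scaling range, the number of clusters, and the SVR parameters are assumptions, not the paper's choices.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVR

# Toy stand-ins for the technical-indicator features and the next-day index target.
rng = np.random.default_rng(1)
X_train = rng.standard_normal((300, 5))
y_train = X_train @ rng.standard_normal(5) + 0.1 * rng.standard_normal(300)

# 1. Scale the training set (the [0, 1] range is an assumption).
scaler = MinMaxScaler(feature_range=(0, 1))
X_scaled = scaler.fit_transform(X_train)

# 2. Cluster the scaled samples with the SOFM (three nodes -> three clusters here).
weights = train_sofm(X_scaled, grid_w=3, grid_h=1)
clusters = assign_cluster(X_scaled, weights)

# 3. Train an individual SVR on each cluster's samples only.
svr_models = {}
for c in np.unique(clusters):
    mask = clusters == c
    model = SVR(kernel="rbf", C=10.0, gamma=0.5, epsilon=0.05)  # placeholder parameters
    model.fit(X_scaled[mask], y_train[mask])
    svr_models[c] = model
```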

Training the SOFM-SVR model

Parameter Optimization
- Proper setting of the SVR parameters can improve the SVR prediction accuracy
- With the RBF kernel and the ε-insensitive loss function, three parameters, C, γ, and ε, must be determined in the SVR model
- The grid search approach is a common method to search for the C, γ, and ε values

Grid Search Approach
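A hedged sketch of a grid search over C, γ, and ε with cross-validation; the exponential grids and the scoring metric below are illustrative assumptions, not the paper's search ranges.

```python
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR

# Illustrative exponentially spaced grids; the paper's actual ranges may differ.
param_grid = {
    "C": [2 ** k for k in range(-2, 9, 2)],
    "gamma": [2 ** k for k in range(-8, 1, 2)],
    "epsilon": [0.01, 0.05, 0.1, 0.5],
}

search = GridSearchCV(
    SVR(kernel="rbf"),
    param_grid,
    scoring="neg_mean_squared_error",  # the paper reports MAPE; MSE used here for simplicity
    cv=5,
)
# search.fit(X_scaled[mask], y_train[mask])  # run once per cluster
# best_params = search.best_params_
```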

Evaluating the SOFM-SVR model with the test set
- Scale the test set using the scaling equation and the attribute ranges of the training set
- Find the cluster to which each sample in the test set belongs
- Calculate the predicted value for each sample in the test set
- Calculate the prediction accuracy for the test set
(see the sketch below)
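A sketch of the test-time flow under the same assumptions; the scaler, SOFM weights, assign_cluster, and svr_models objects come from the training sketches above.

```python
import numpy as np

def evaluate(X_test, y_test, scaler, weights, svr_models):
    # Scale the test set with the scaler fitted on the training set
    # (i.e., using the training attribute ranges).
    X_scaled = scaler.transform(X_test)
    # Assign each test sample to the cluster of its nearest SOFM node.
    clusters = assign_cluster(X_scaled, weights)
    # Predict each sample with the SVR trained on its cluster.
    preds = np.empty(len(X_scaled))
    for c, model in svr_models.items():
        mask = clusters == c
        if mask.any():
            preds[mask] = model.predict(X_scaled[mask])
    # Prediction accuracy as MAPE (see the performance-measures slide).
    mape = np.mean(np.abs((y_test - preds) / y_test)) * 100
    return preds, mape
```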

SOFM-SVR model

SOFM-SVR combined with filter-based feature selection
- X is a certain input variable (i.e., a feature)
- Y is the response variable (i.e., the label)
- n is the number of training samples
(one possible filter criterion is sketched below)
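The slide only defines the symbols; the scoring formula itself appears in the original figure. As one common filter criterion, features can be ranked by the absolute Pearson correlation between each input variable and the response; this is an assumption for illustration, not necessarily the paper's exact score.

```python
import numpy as np

def filter_rank_features(X, y, top_k=7):
    """Rank features by a simple filter score: the absolute Pearson correlation
    between each input variable X_j and the response y over the n training samples.
    This criterion is an assumption; the paper defines its own filter score."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    corr = (Xc * yc[:, None]).sum(axis=0) / (
        np.sqrt((Xc ** 2).sum(axis=0)) * np.sqrt((yc ** 2).sum()) + 1e-12
    )
    order = np.argsort(-np.abs(corr))
    return order[:top_k], np.abs(corr)
```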

SOFM-SVR filter-based feature selection

Performance measures
- A_i is the actual value of sample i
- F_i is the predicted value of sample i
- n is the number of samples
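The slide lists only the symbols; the later comparison slides report MAPE, whose conventional definition in these terms is:

```latex
\mathrm{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{A_i - F_i}{A_i} \right|
```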

Experimental data set

SOFM-SVR with various numbers of clusters in dataset #1

Accuracy measures with various numbers of clusters

Wilcoxon signed-rank test on the prediction errors for the SOFM-SVR with various numbers of clusters
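As a sketch of how such a paired test can be run on two models' absolute prediction errors using SciPy; the arrays below are toy numbers, not the paper's data.

```python
import numpy as np
from scipy.stats import wilcoxon

# Absolute prediction errors of two competing models on the same test samples
# (toy values; the paper compares SOFM-SVR configurations against each other).
errors_model_a = np.array([1.2, 0.8, 1.5, 0.9, 1.1, 1.4, 0.7, 1.0])
errors_model_b = np.array([1.5, 1.1, 1.6, 1.2, 1.0, 1.7, 0.9, 1.3])

stat, p_value = wilcoxon(errors_model_a, errors_model_b)
print(f"Wilcoxon statistic = {stat:.3f}, p-value = {p_value:.4f}")
```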

Results of SOFM-SVR using three clusters

Results of SOFM-SVR with selected features

Original features vs. selected features: Wilcoxon signed-rank test

Important Features
- MA10: 10-day moving average
- MACD9: 9-day moving average convergence/divergence
- +DI10: 10-day directional indicator (up)
- -DI10: 10-day directional indicator (down)
- K10: 10-day stochastic index K
- PSY10: 10-day psychological line
- D9: 9-day stochastic index D
(conventional definitions of a few of these are sketched below)
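For reference, conventional pandas definitions of three of these indicators; these are standard textbook formulas and may differ in detail from the windows and smoothing the paper actually uses.

```python
import pandas as pd

def technical_indicators(close: pd.Series) -> pd.DataFrame:
    """Conventional definitions of a few indicators from the slide;
    the paper's exact formulas and windows may differ."""
    df = pd.DataFrame({"close": close})
    # MA10: 10-day simple moving average of the closing price.
    df["MA10"] = close.rolling(10).mean()
    # PSY10: percentage of up days within the last 10 days.
    up = (close.diff() > 0).astype(float)
    df["PSY10"] = up.rolling(10).mean() * 100
    # MACD with a 9-day signal line: EMA(12) - EMA(26), smoothed over 9 days.
    macd = close.ewm(span=12, adjust=False).mean() - close.ewm(span=26, adjust=False).mean()
    df["MACD9"] = macd.ewm(span=9, adjust=False).mean()
    return df
```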

Relative importance of the selected features

Wilcoxon signed-rank test: SOFM-SVR vs. single SVR

MAPE comparison: SOFM-SVR vs. single SVRs.

Training time comparison: SOFM-SVR vs. single SVRs.

Conclusion
- A hybrid SOFM-SVR with filter-based feature selection improves the prediction accuracy and reduces the training time for daily financial stock index prediction
- Further research directions: use optimization algorithms (e.g., genetic algorithms) to optimize the SVR parameters, and perform feature selection with a wrapper-based approach that combines SVR with other optimization tools

Thank You