Qi Yu, Antti Sorjamaa, Yoan Miche, and Eric Severin (Jean-Paul Murara) Wednesday 26-11-2008.

Outline:
- Introduction
- OP-KNN
- Experiments
- Conclusions

1. Introduction Return on assets (ROA) is an important indicator of corporate performance: it shows how profitable a company is before leverage and is frequently compared across companies in the same industry. However, it is not easy to analyse which characteristics of a company mainly affect the ROA value, and the problem becomes even harder when trying to predict it. Guang-Bin Huang proposed the Extreme Learning Machine (ELM), later extended to OP-ELM. OP-KNN, which uses KNN as the kernel and addresses these problems properly, is presented here.

This method has several notable advantages: it keeps good performance while being simpler than most learning algorithms for feedforward neural networks; it uses KNN as a deterministic initialization; the computational time of OP-KNN is extremely low (lower than OP-ELM or the other algorithms considered); and, for our application, the Leave-One-Out (LOO) error is used both for variable selection and for OP-KNN complexity selection.

2. Optimal Pruned k-Nearest Neighbors (OP-KNN) OP-KNN is similar to OP-ELM, which is an original and efficient way of training a Multilayer Perceptron (MLP) network. The main steps of OP-KNN are:
- Input selection
- Single-hidden Layer Feedforward Neural Network (SLFN) construction using k-Nearest Neighbors
- Multiresponse Sparse Regression (MRSR)
- Leave-One-Out (LOO) selection

Figure 1: The three steps of the OP-KNN algorithm, preceded by input selection: SLFN construction using KNN, ranking of all the neurons by MRSR, and selection of the appropriate number of neurons by LOO.

2.1 Input selection Obviously, the input variables can have different importance with respect to the output. The input data are therefore preprocessed and scaled before building the model, following a methodology that optimizes the nonparametric noise estimate (NNE) provided by the Delta Test (DT) for input selection.
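As a minimal illustration (assuming the DT above refers to the Delta Test noise estimator, and with hypothetical variable names), a candidate subset of inputs can be scored by the Delta Test: the subset giving the lowest noise estimate is preferred.

```python
import numpy as np
from scipy.spatial import cKDTree

def delta_test(X, y):
    """Delta Test: estimate the output noise variance as
    0.5 * mean((y[NN(i)] - y[i])^2), with NN(i) the nearest neighbour of x_i."""
    tree = cKDTree(X)
    _, idx = tree.query(X, k=2)   # the closest point is the sample itself
    nearest = idx[:, 1]
    return 0.5 * np.mean((y[nearest] - y) ** 2)

# Score a hypothetical subset of input variables; the lowest score wins.
# score = delta_test(X[:, [0, 3, 7]], y)
```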

2.2 Single-hidden Layer Feedforward Neural Network (SLFN) The first step of the OP-KNN algorithm is the core of the original ELM: the building of a single-hidden-layer feedforward neural network. In the context of a single-hidden-layer perceptron network, let us denote the weights between the hidden layer and the output by b. The activation functions used with OP-KNN differ from the original SLFN choice: the sigmoid activation functions of the neurons are replaced by the k-Nearest Neighbors, so the output layer is linear. THEOREM: given the activation functions, the output weights b can be computed from the hidden-layer output matrix H (via its Moore-Penrose pseudo-inverse). The only remaining parameter is the initial number of neurons N in the hidden layer.
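A minimal sketch of what the theorem means in practice: once the hidden-layer output matrix H is available, the output weights b come from a single pseudo-inverse, with no iterative training (H and y below are random stand-ins).

```python
import numpy as np

rng = np.random.default_rng(0)
H = rng.normal(size=(100, 10))   # hidden-layer output matrix (n_samples x N neurons), stand-in data
y = rng.normal(size=100)         # training targets, stand-in data

b = np.linalg.pinv(H) @ y        # output weights via the Moore-Penrose pseudo-inverse
y_hat = H @ b                    # model output on the training set
```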

2.3 k-Nearest Neighbors The k-Nearest Neighbors (KNN) model is a very simple but powerful tool. The key idea behind KNN is that similar training samples have similar output values. The model introduced in the previous section becomes ŷ_i = Σ_{j=1}^{k} b_j y_{P(i,j)}, where ŷ_i represents the output estimate, P(i, j) is the index of the jth nearest neighbor of sample x_i, and b is the result of the Moore-Penrose inverse.
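A hedged sketch of how the KNN "neurons" can form H in this setting: entry H[i, j] holds y_{P(i,j)}, the training output of the jth nearest neighbour of x_i (the sample itself excluded), after which b follows exactly as in the previous sketch.

```python
import numpy as np
from scipy.spatial import cKDTree

def knn_hidden_layer(X, y, k):
    """Build H with H[i, j] = y[P(i, j)], the output of the j-th
    nearest neighbour of sample x_i, excluding the sample itself."""
    tree = cKDTree(X)
    _, idx = tree.query(X, k=k + 1)   # the first neighbour returned is the point itself
    P = idx[:, 1:]                    # neighbour index matrix P(i, j)
    return y[P]                       # shape (n_samples, k)

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))         # stand-in inputs
y = rng.normal(size=200)              # stand-in outputs

H = knn_hidden_layer(X, y, k=10)
b = np.linalg.pinv(H) @ y             # output weights, as in Section 2.2
```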

2.4 Multiresponse Sparse Regression (MRSR) For the removal of the useless neurons of the hidden layer, MRSR (Timo Similä and Jarkko Tikka) is used. It is an extension of the Least Angle Regression (LARS) algorithm. The main idea of the algorithm is the following: let T = [t_1 ... t_p] be the n × p matrix of targets and X = [x_1 ... x_m] the n × m matrix of regressors. MRSR adds the regressors one by one to the model Y^k = X W^k, where Y^k = [y^k_1 ... y^k_p] is the target approximation and W^k is the weight matrix, with k nonzero rows at the kth step of MRSR. With each new step a new nonzero row, and thus a new regressor, is added to the model. MRSR is hence used to rank the kernels of the model; the target is the actual output y_i.
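As an illustrative stand-in for MRSR (not the authors' implementation): with a single output, MRSR reduces to LARS, so scikit-learn's Lars can show the idea of ranking the columns of H (the KNN neurons) by the order in which they enter the model.

```python
import numpy as np
from sklearn.linear_model import Lars

rng = np.random.default_rng(0)
H = rng.normal(size=(200, 15))                    # one column per KNN neuron (stand-in data)
y = H[:, :5] @ rng.normal(size=5) + 0.1 * rng.normal(size=200)

# For a single output MRSR coincides with LARS; active_ lists the columns
# in the order they were added to the model, i.e. a ranking of the neurons.
lars = Lars(n_nonzero_coefs=H.shape[1]).fit(H, y)
ranking = list(lars.active_)
print(ranking)                                    # most useful neurons first
```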

2.5 Leave-One-Out (LOO) Since MRSR only provides a ranking of the kernels, the actual best number of neurons for the model is selected with a Leave-One-Out method: the LOO error is evaluated as a function of the number of neurons used (already properly ranked by MRSR), and the number that minimizes it is kept.
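Because the output layer is linear, this LOO error does not require refitting the model n times: the PRESS statistic gives it in closed form from the hat-matrix diagonal. A minimal sketch, evaluated for each candidate number of MRSR-ranked neurons (reusing the stand-in H, y and ranking from the sketches above):

```python
import numpy as np

def press_loo_mse(H, y):
    """Closed-form leave-one-out MSE (PRESS) for the linear model y ~ H b."""
    H_pinv = np.linalg.pinv(H)
    residuals = y - H @ (H_pinv @ y)
    hat_diag = np.einsum('ij,ji->i', H, H_pinv)   # diagonal of H (H^T H)^{-1} H^T
    return np.mean((residuals / (1.0 - hat_diag)) ** 2)

# loo = [press_loo_mse(H[:, ranking[:m]], y) for m in range(1, len(ranking) + 1)]
# best_number_of_neurons = int(np.argmin(loo)) + 1
```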

2.6 Advantages of the OP-KNN The OP-KNN methodology leads to a fast and accurate algorithm. Input selection reduces the dimensionality of the input variables, and hence the modeling complexity, at the very beginning. Because the model is linear in the output layer when the neural network is created using KNN, the ranking of the nearest-neighbor neurons by MRSR is one of the fastest ranking methods providing the exact best ranking. Without MRSR, the number of nearest neighbors that minimizes the LOO error is not optimal, and the LOO error curve has several local minima instead of a single global minimum. The linearity also makes the model structure selection step using LOO, which is usually very time-consuming, practical.

3 Experiments 3.1 Sine in one dimension In this experiment, a set of 1000 training points is generated, where the output is a sum of two sines. This one-dimensional example is used to test the method without the need for variable selection beforehand. The model approximates the dataset accurately, using 18 nearest neighbors, and it reaches a LOO error close to the noise introduced in the dataset.
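An end-to-end sketch of this kind of toy experiment, combining the KNN hidden layer with the closed-form LOO error to pick the number of neighbours; the exact sine mixture and noise level of the slide are not given, so the values below are purely illustrative.

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
n = 1000
x = rng.uniform(0, 10, size=(n, 1))
y = np.sin(x[:, 0]) + np.sin(3 * x[:, 0]) + 0.1 * rng.normal(size=n)  # illustrative sum of two sines plus noise

def loo_mse_for_k(x, y, k):
    """LOO mean squared error of the KNN-based SLFN with k neighbours."""
    tree = cKDTree(x)
    _, idx = tree.query(x, k=k + 1)
    H = y[idx[:, 1:]]                       # neighbour outputs, the sample itself excluded
    H_pinv = np.linalg.pinv(H)
    residuals = y - H @ (H_pinv @ y)
    hat_diag = np.einsum('ij,ji->i', H, H_pinv)
    return np.mean((residuals / (1.0 - hat_diag)) ** 2)

loo = {k: loo_mse_for_k(x, y, k) for k in range(1, 30)}
best_k = min(loo, key=loo.get)
print(best_k, loo[best_k])                  # the LOO error should approach the injected noise variance
```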

3.2 Financial Modeling In this experiment, we use data related to 200 French companies over a period of 5 years. 36 input variables on 535 samples are used, without any missing values; the input variables are financial indicators measured every year, the last of which is the ROA value of the same year. The target variable is the ROA of the next year for each sample. The results are listed in the tables, where we can see that the LOO error is decreased by almost half with the variable selection step in each case. The minimum LOO error appears when using OP-KNN on the scaled, selected input variables, as expected. Moreover, the final LOO error reaches roughly the same level as the value estimated during variable selection. Thus, for this financial dataset, the methodology not only builds the model in a simple and fast way, but also confirms the accuracy of our previous selection algorithm; most importantly, the method successfully predicts the ROA value for the next year. It should also be noted that, on this financial data, OP-KNN with variable scaling on the selected variables shows the best efficiency and accuracy, while it also selects the most important variables, ranks them, and builds the model for prediction.
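A purely illustrative sketch (hypothetical column names; the actual dataset is not available here) of how such regression pairs can be assembled: the indicators of year t, including the current ROA, form the inputs, and the ROA of year t+1 is the target for each firm.

```python
import pandas as pd

def make_prediction_pairs(df):
    """df: one row per (firm, year) with the indicator columns plus 'ROA' (hypothetical layout)."""
    df = df.sort_values(['firm', 'year'])
    df['ROA_next'] = df.groupby('firm')['ROA'].shift(-1)   # target = next year's ROA of the same firm
    df = df.dropna(subset=['ROA_next'])                    # the last observed year has no target
    X = df.drop(columns=['firm', 'year', 'ROA_next'])
    y = df['ROA_next']
    return X, y
```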

4 Conclusions A methodology, OP-KNN, based on an SLFN, which gives better performance than the existing OP-ELM and the other algorithms tested for this financial modeling task, is proposed. Using KNN as a kernel, the MRSR algorithm, and the PRESS statistic are all required for this method to build an accurate model. We test our methodology on 200 French industrial firms listed on the Paris Bourse (nowadays Euronext) over a period of 5 years. Our results highlight that the first 12 variables are the best combination to explain corporate performance measured by the ROA of the next year; beyond them, additional variables do not improve the explanation of corporate performance. Moreover, we use these selected variables to build a model for prediction. Furthermore, it is interesting to notice that market discipline puts pressure on firms to improve corporate performance.

Thank you for listening!