1/15 Strengthening I-ReGEC classifier
G. Attratto, D. Feminiano, and M.R. Guarracino
High Performance Computing and Networking Institute, Italian National Research Council

2/15 Supervised learning
Supervised learning is the capability of a system to learn from a set of input/output pairs: the training set.

3/15 Classification
Classification consists of determining a model that groups elements according to given features; the groups are the classes.

4/15 Evaluation of classification methods
- Accuracy: an indicator of the predictive ability of the model
- Speed: some methods require less time than others
- Robustness: the learned rules and the accuracy do not change considerably across different data sets
- Scalability: the ability to classify datasets of very large dimensions
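To make the first two criteria concrete, here is a minimal sketch (our addition, not from the slides) that estimates accuracy by cross-validation and speed with a timer; the dataset and classifier choices are arbitrary.

```python
# Hypothetical illustration: measuring accuracy and speed of a classifier.
import time
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
clf = SVC(kernel="linear")

start = time.perf_counter()
scores = cross_val_score(clf, X, y, cv=5)  # accuracy on 5 held-out folds
elapsed = time.perf_counter() - start

print(f"accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
print(f"time for 5-fold cross-validation: {elapsed:.2f}s")
```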

5/15 Goals
- Make the choice of examples during training more efficient
- Delete examples that are redundant or give an insufficient informative contribution
- Strengthen the training set by deleting obsolete knowledge
- Build an efficient, scalable, and generalizable model

6/15 Classification techniques
- Decision trees: based on a tree structure (optimal tree)
- Bayesian networks: compute posterior probabilities with Bayes' theorem
- Neural networks: simulate the behavior of biological systems (slow in training)
- Support Vector Machines (SVM): compute separating hyperplanes
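For reference, all four families above have off-the-shelf implementations; a minimal scikit-learn sketch follows (our addition; note that scikit-learn's GaussianNB is naive Bayes, a special case of a Bayesian network).

```python
# Hypothetical sketch: one off-the-shelf estimator per family above.
from sklearn.tree import DecisionTreeClassifier    # decision trees
from sklearn.naive_bayes import GaussianNB         # (naive) Bayesian model
from sklearn.neural_network import MLPClassifier   # neural networks
from sklearn.svm import SVC                        # support vector machines

classifiers = {
    "decision tree": DecisionTreeClassifier(),
    "naive Bayes": GaussianNB(),
    "neural network": MLPClassifier(max_iter=1000),
    "SVM": SVC(),
}

# All share the same interface:
# clf.fit(X_train, y_train); y_pred = clf.predict(X_test)
```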

7/15 SVM: the state of the art
Find a set of examples (support vectors) representative of the classes.
[Figure: optimal separating hyperplane with separation margin and support vectors, in the linear and nonlinear cases]
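For completeness (our addition): in the linear case, the optimal hyperplane $w^\top x + b = 0$ is the one that maximizes the separation margin $2/\lVert w \rVert$, which for separable data is the standard hard-margin problem:

```latex
\min_{w,\,b} \ \tfrac{1}{2}\lVert w \rVert^{2}
\quad \text{subject to} \quad
y_i \left( w^\top x_i + b \right) \ge 1, \qquad i = 1, \dots, n .
```

The training points for which the constraint is active are the support vectors; the nonlinear case replaces $x_i$ with a kernel-induced feature map.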

8/15 ReGEC
- Two hyperplanes, one representative of each class (GEPSVM family)
- Based on the generalized eigenvalue problem
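As a sketch of what "based on the generalized eigenvalue problem" means (our addition, following the GEPSVM-style formulation): with the points of the two classes stacked as rows of matrices $A$ and $B$, the hyperplane $x^\top w - \gamma = 0$ closest to class $A$ and farthest from class $B$ solves

```latex
\min_{(w,\gamma)\neq 0} \ \frac{\lVert A w - e\gamma \rVert^{2}}{\lVert B w - e\gamma \rVert^{2}},
```

which, writing $z = [w;\,\gamma]$, $G = [A\ \ {-e}]^\top [A\ \ {-e}]$ and $H = [B\ \ {-e}]^\top [B\ \ {-e}]$, is the generalized eigenvalue problem $Gz = \lambda Hz$. ReGEC adds a regularization term to make the problem well posed; the second hyperplane swaps the roles of $A$ and $B$.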

9/15 I-ReGEC
- Select k points for each class with a clustering technique (k-means): |S| = 2k
- Classify the remaining points using the set S
- Incrementally add misclassified points to S
- Proceed until no misclassified points remain
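A minimal sketch of this loop follows (our addition). ReGEC itself has no scikit-learn implementation, so `make_classifier` is a hypothetical stand-in: any estimator with fit/predict illustrates the control flow.

```python
# Sketch of the I-ReGEC incremental selection loop (stand-in classifier).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC  # used only as the stand-in classifier

def incremental_selection(X, y, k, make_classifier):
    # Seed S with the training point closest to each of the k k-means
    # centroids of every class, so |S| = 2k for a two-class problem.
    S = []
    for label in np.unique(y):
        idx = np.where(y == label)[0]
        km = KMeans(n_clusters=k, n_init=10).fit(X[idx])
        for c in km.cluster_centers_:
            S.append(idx[np.argmin(np.linalg.norm(X[idx] - c, axis=1))])

    while True:
        clf = make_classifier().fit(X[S], y[S])
        rest = np.setdiff1d(np.arange(len(y)), S)
        if len(rest) == 0:
            return S, clf
        wrong = rest[clf.predict(X[rest]) != y[rest]]
        if len(wrong) == 0:     # stop: no misclassified points remain
            return S, clf
        S.append(wrong[0])      # add one misclassified point and retrain

# Usage: S, clf = incremental_selection(X, y, k=5, make_classifier=lambda: SVC())
```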

10/15 Strengthening
- Apply I-ReGEC to obtain the training set S
- At each iteration, delete one point from the training set
- Apply I-ReGEC in each iteration with the new input set S
- Keep ("strengthen") the new S if accuracy improves
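A minimal sketch of this pruning loop (our addition), reusing the hypothetical `incremental_selection` from the previous slide. For brevity it retrains the stand-in classifier on the reduced set, whereas the slides re-apply I-ReGEC at each iteration; accuracy is checked on a held-out set.

```python
# Sketch of the Strengthening loop: drop a point, keep the deletion
# only if held-out accuracy improves.
import numpy as np

def strengthen(X, y, X_test, y_test, k, make_classifier):
    # Start from the training set produced by I-ReGEC.
    S, clf = incremental_selection(X, y, k, make_classifier)
    best_acc = np.mean(clf.predict(X_test) == y_test)

    improved = True
    while improved:
        improved = False
        for i in list(S):
            trial = [j for j in S if j != i]       # tentatively delete one point
            trial_clf = make_classifier().fit(X[trial], y[trial])
            acc = np.mean(trial_clf.predict(X_test) == y_test)
            if acc > best_acc:                     # keep the smaller set only
                S, best_acc = trial, acc           # if accuracy improves
                improved = True
                break
    return S, best_acc
```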

11/15 Microarray and matrix
[Figure: gene expression data arranged as a matrix, with examples as rows, features (gene expression levels) as columns, and a class label per example]
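To make the layout concrete (our addition; the dimensions match the datasets on the next slide, e.g. Golub is 72 examples by 7129 genes):

```latex
X = \begin{pmatrix}
x_{11} & \cdots & x_{1m} \\
\vdots & \ddots & \vdots \\
x_{n1} & \cdots & x_{nm}
\end{pmatrix} \in \mathbb{R}^{n \times m},
\qquad y \in \{-1, +1\}^{n},
```

where row $i$ is the expression profile of example $i$ over $m$ genes and $y_i$ is its class.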

12/15 Results

Dataset                          Acc. I-ReGEC   N. of points   Acc. Strengthening   N. of points
Alon (62x2000), colon cancer     73.00%         7.78           74.60%               7.78
Golub (72x7129), leukaemia       87.12%         9.44           89.88%               9.44
Nutt (50x12625), glioma          65.20%         7.47           65.20%               7.47
BRCA1 (22x3226), breast cancer   67.50%         4.24           67.50%               4.24
BRCA2 (22x3226), breast cancer   78.50%         5.53           79.50%               5.96

13/15 Results and diagrams
[Figures: 2D and 3D views of the Golub dataset, comparing I-ReGEC and Strengthening]

14/15 Conclusions
- The choice of examples has become more efficient
- Redundant or obsolete examples have been deleted
- The training sets are "strengthened"

15/15 Future work
To optimize execution time, the Strengthening technique should be integrated into I-ReGEC.