Presented by: Isabelle Guyon Machine Learning Research.

Presentation transcript:

Presented by: Isabelle Guyon Machine Learning Research

BIOwulf
1 - People
2 - Technology
3 - Results

BIOwulf Technologies 1 - People

Research people
+ Isabelle Guyon
+ Vladimir Vapnik (c)
+ Peter Bartlett
+ Bernhard Schölkopf
+ Asa Ben Hur
+ André Elisseeff
+ Nello Cristianini
+ Olivier Chapelle
+ René Doursat
- Olivier Bousquet
+ David Lewis (c)
+ Jason Weston
+ Ed Reiss
- Alex Smola (c)
+ Shelia Guberman
+ Hong Zhang

BIOwulf Technologies 2 - Technology

Technology: SVM
Kernel Machines: $F(x) = \sum_k \alpha_k K(x_k, x)$
Sparsity: the sum runs only over support vectors
Boser-Guyon-Vapnik (1992)
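The kernel machine on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the BIOwulf implementation: the Gaussian kernel, the `gamma` value, the toy support vectors, and the coefficients are all assumptions chosen for the example, and the function names are hypothetical.

```python
import numpy as np

def rbf_kernel(u, v, gamma=1.0):
    """Gaussian (RBF) kernel; the kernel choice and gamma are assumptions."""
    diff = np.asarray(u, dtype=float) - np.asarray(v, dtype=float)
    return float(np.exp(-gamma * np.dot(diff, diff)))

def decision_function(x, support_vectors, alphas, kernel=rbf_kernel):
    """Evaluate F(x) = sum_k alpha_k K(x_k, x), summing over support vectors only."""
    return sum(a * kernel(x_k, x) for a, x_k in zip(alphas, support_vectors))

# Hypothetical toy model: two support vectors with signed coefficients.
support_vectors = np.array([[0.0, 0.0], [2.0, 2.0]])
alphas = np.array([1.0, -1.0])

# The sign of F(x) gives the predicted class.
print(decision_function([0.0, 0.0], support_vectors, alphas) > 0)
print(decision_function([2.0, 2.0], support_vectors, alphas) < 0)
```

Because only the support vectors enter the sum, the cost of evaluating F(x) grows with the number of support vectors rather than with the size of the training set.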

SVM: Universality & Generalization
[Figure: input space x = (x_1, x_2) with the decision boundary F(x) = 0 separating the regions F(x) > 0 and F(x) < 0]

Neural Networks: Local Optima

SVM key properties

Core problems (SVMs, Kernel Methods, Statistical Learning Theory): Classification, Clustering, Regression, Feature/Pattern Selection, Causality Inference, Control Problems, Model selection, Novelty Detection

BIOwulf Technologies 3 - Results

Scope (BIOwulf Technologies): Life Sciences, Imaging & Signal Processing, Financial, Seismic, Geological, Telecom, Internet, Security, Fraud & Abuse, Military

Strategy: Data Analysis, Result validation, Data Collection

Medical Images, Medical & Biology Literature, Medical & Demographic Records, Genomic Sequences, Microarray data, Spectra Data

Internet Discovery Platform
[Diagram labels: Information Center, IR, Numerical Lab, DA, numerical results, raw data, structured info, researcher, prospects, demo, scientists, tool, data analyst, customers, service]

Microarray Data
Prostate cancer, Stamey-Guyon, Dec
Preprocessing
- Gene selection
- Data cleaning
[Figure labels: BPH, G4, Outlier]

Two best genes
Prostate cancer, Stamey-Guyon, Dec
[Figure panels: Golub, SVM]

Tree Explorer
Guyon-Doursat-Reiss, 2000
[Figure: tree with nodes labeled by gene accession numbers: H64807 R55310 T62947 H08393 T62947 U09564 R88740 M59040 R88740 T94579 H81558 T64012 T86444 H06524 H81558 H06524 U19969 H06524 T94579 T58861 M59040 L08069 H08393 M82919 L03840 U19969 D14812 M82919 L]

Spectroscopy
[Figure: example spectra f(t) and g(t) plotted against t, for Class 1 and Class 2]
Alignment kernel: $K(f,g) = \iint f(t)\, g(t-x)\, \exp(-\gamma x^2)\, dt\, dx$
Simple kernel: $K(f,g) = \int f(t)\, g(t)\, dt$
Infrared spectra, Elisseeff-Bartlett, Feb. 2001
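The two spectral kernels can be approximated on sampled spectra with Riemann sums. The sketch below is an illustration under stated assumptions, not the authors' code: the width parameter (called `gamma` here), the shift range `max_shift`, the zero padding at the edges, the sampling step `dt`, and the toy single-peak spectra are all hypothetical.

```python
import numpy as np

def simple_kernel(f, g, dt=1.0):
    """Riemann-sum approximation of K(f, g) = integral of f(t) g(t) dt."""
    f = np.asarray(f, dtype=float)
    g = np.asarray(g, dtype=float)
    return float(np.sum(f * g) * dt)

def alignment_kernel(f, g, gamma=0.5, max_shift=5, dt=1.0):
    """Approximate the double integral of f(t) g(t - x) exp(-gamma x^2) dt dx.

    Shifts x are restricted to multiples of dt within +/- max_shift samples
    (an assumption); samples shifted past the edges are treated as zero.
    """
    f = np.asarray(f, dtype=float)
    g = np.asarray(g, dtype=float)
    n = len(f)
    max_shift = min(max_shift, n - 1)
    total = 0.0
    for s in range(-max_shift, max_shift + 1):
        shifted = np.zeros(n)  # holds g(t - x) sampled on the grid of f
        if s >= 0:
            shifted[s:] = g[:n - s]
        else:
            shifted[:n + s] = g[-s:]
        x = s * dt
        total += np.exp(-gamma * x ** 2) * np.sum(f * shifted) * dt * dt
    return float(total)

# Toy spectra: the same single peak, offset by two samples.
f = np.zeros(30)
g = np.zeros(30)
f[10] = 1.0
g[12] = 1.0

print(simple_kernel(f, g))     # the peaks do not overlap
print(alignment_kernel(f, g))  # small shifts let the peaks match
```

In this toy case the simple kernel sees no similarity between the two spectra, while the alignment kernel gives a positive value because it tolerates small shifts along t, with larger shifts penalized by the Gaussian weight.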

Ciphergen Spectra
Prostate cancer, Elisseeff-Guyon-Weston, May
299 features (peak values)
385 examples (325 training, 60 test)
4 classes (15 test examples/class)
A = BPH, B and C cancer (B < C), D = ref.
D < A < B < C
SVM multi-class error rate: 15% (9/60)
59 peaks separate the training set perfectly
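The counts on this slide can be cross-checked with a few lines of Python. The variable names are hypothetical, and the standard error at the end is not from the slides: it assumes independent test errors (a binomial model) and is only a rough indication of how much a 60-example test set can resolve.

```python
import math

# Counts as stated on the slide.
n_train, n_test = 325, 60
n_classes, test_per_class = 4, 15
test_errors = 9

# The split and the per-class test counts are consistent with the totals.
assert n_train + n_test == 385
assert n_classes * test_per_class == n_test

error_rate = test_errors / n_test
print(f"multi-class test error rate: {error_rate:.0%}")

# Assumption (not from the slides): binomial standard error of the estimate.
std_err = math.sqrt(error_rate * (1 - error_rate) / n_test)
print(f"approximate standard error: {std_err:.3f}")
```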

Conclusions
SVM advantages in pattern recognition:
- Superior prediction performance on test data.
- Unique, easy to interpret solution.
- Better feature selection (only 2-7 genes in microarray experiments).
- Use all the data, automatic data cleaning.
- Incorporate knowledge about the task in the kernel.
- Can be combined with other methods.