Published by Hillary Martina Moody. Modified over 8 years ago.
1
Presented by: Isabelle Guyon Machine Learning Research
2
BIOwulf: 1 - People, 2 - Technology, 3 - Results
3
BIOwulf Technologies 1- People
4
Research people
+ Isabelle Guyon
+ Vladimir Vapnik (c)
+ Peter Bartlett
+ Bernhard Schölkopf
+ Asa Ben Hur
+ André Elisseeff
+ Nello Cristianini
+ Olivier Chapelle
+ René Doursat
- Olivier Bousquet
+ David Lewis (c)
+ Jason Weston
+ Ed Reiss
- Alex Smola (c)
+ Shelia Guberman
+ Hong Zhang
5
BIOwulf Technologies 2 - Technology
6
Technology: SVM
Kernel Machines: $F(x) = \sum_k \alpha_k K(x_k, x)$
Sparsity: the sum runs only over support vectors
Boser-Guyon-Vapnik (1992) http://www.clopinet.com/isabelle/Papers/colt92.ps.Z
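The kernel expansion on this slide can be illustrated with a minimal sketch. The Gaussian kernel, the parameter `gamma`, the two support vectors, and the weights `alphas` are all assumptions made for illustration; the slide does not specify a kernel or any trained values.

```python
import numpy as np

def gaussian_kernel(u, v, gamma=1.0):
    """K(u, v) = exp(-gamma * ||u - v||^2). The kernel choice is an assumption."""
    diff = np.asarray(u, dtype=float) - np.asarray(v, dtype=float)
    return float(np.exp(-gamma * np.dot(diff, diff)))

def decision_function(x, support_vectors, alphas, kernel=gaussian_kernel):
    """F(x) = sum_k alpha_k * K(x_k, x), summed only over the support vectors."""
    return sum(a * kernel(xk, x) for a, xk in zip(alphas, support_vectors))

# Hypothetical toy model: two support vectors with signed weights.
support_vectors = np.array([[0.0, 0.0], [2.0, 0.0]])
alphas = np.array([1.0, -1.0])

f_left = decision_function([0.0, 0.0], support_vectors, alphas)   # near the positive support vector
f_mid = decision_function([1.0, 0.0], support_vectors, alphas)    # equidistant from both
f_right = decision_function([2.0, 0.0], support_vectors, alphas)  # near the negative support vector
```

In this toy setup the sign of F(x) gives the predicted class, and the point equidistant from both support vectors lies on the boundary F(x) = 0. The cost of evaluating F grows with the number of support vectors, not with the size of the training set.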
7
SVM: Universality & Generalization
[Figure: input space x = (x_1, x_2), with the decision boundary F(x) = 0 separating the regions F(x) > 0 and F(x) < 0]
8
Neural Networks: Local Optima
9
SVM key properties
10
Core problems
[Diagram: SVMs, Kernel Methods, Statistical Learning Theory; surrounding problems: Classification, Clustering, Regression, Feature/Pattern Selection, Causality Inference, Control Problems, Model selection, Novelty Detection]
11
BIOwulf Technologies 3 - Results
12
Scope (BIOwulf Technologies): Life Sciences, Imaging & Signal Processing, Financial, Seismic, Geological, Telecom, Internet, Security, Fraud & Abuse, Military
13
Strategy: Data Analysis, Result validation, Data Collection
14
Medical Images; Medical & Biology Literature; Medical & Demographic Records; Genomic Sequences; Microarray data; Spectra Data
15
Internet Discovery Platform
[Diagram labels: Information Center, IR, Numerical Lab, DA, numerical results, raw data, structured info, researcher, prospects, demo, scientists, tool, data analyst, customers, service]
16
Microarray Data
Prostate cancer, Stamey-Guyon, Dec. 2000
- Preprocessing
- Gene selection
- Data cleaning
[Figure: microarray data matrices shown at successive steps; the final panel marks the classes BPH and G4 and an outlier]
17
Two best genes
Prostate cancer, Stamey-Guyon, Dec. 2000
[Figure panels: Golub, SVM]
18
Tree Explorer
Guyon-Doursat-Reiss, 2000
[Figure: tree of gene accession numbers: H64807, R55310, T62947, H08393, T62947, U09564, R88740, M59040, R88740, T94579, H81558, T64012, T86444, H06524, H81558, H06524, U19969, H06524, T94579, T58861, M59040, L08069, H08393, M82919, L03840, U19969, D14812, M82919, L06895]
19
Spectroscopy
Infrared spectra, Elisseeff-Bartlett, Feb. 2001
[Figure: example spectra f(t) and g(t) plotted against t for Class 1 and Class 2]
Simple kernel: $K(f,g) = \int f(t)\, g(t)\, dt$
Alignment kernel: $K(f,g) = \iint f(t)\, g(t-x)\, \exp(-x^2)\, dt\, dx$
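The difference between the two kernels on this slide can be sketched numerically. The example below approximates both integrals by Riemann sums on a uniform grid. The grid spacing, the shift range `max_shift`, and the two Gaussian-shaped test spectra are assumptions for illustration, and the weight exp(-x^2) is taken as it reads on the slide.

```python
import numpy as np

def simple_kernel(f, g, dt):
    """Riemann-sum approximation of K(f, g) = integral of f(t) g(t) dt."""
    return float(np.sum(f * g) * dt)

def alignment_kernel(f, g, dt, max_shift):
    """Riemann-sum approximation of the double integral of
    f(t) g(t - x) exp(-x^2) dt dx, with x ranging over +/- max_shift grid steps."""
    total = 0.0
    for s in range(-max_shift, max_shift + 1):
        x = s * dt
        shifted = np.zeros_like(g)  # shifted[i] = g[i - s], zero outside the grid
        if s > 0:
            shifted[s:] = g[:-s]
        elif s < 0:
            shifted[:s] = g[-s:]
        else:
            shifted[:] = g
        total += np.sum(f * shifted) * dt * np.exp(-x ** 2) * dt
    return float(total)

# Hypothetical spectra: the same narrow peak, displaced by one unit of t.
dt = 0.05
t = np.arange(200) * dt
f = np.exp(-((t - 4.0) ** 2) / (2 * 0.1 ** 2))
g = np.exp(-((t - 5.0) ** 2) / (2 * 0.1 ** 2))

k_simple = simple_kernel(f, g, dt)
k_align = alignment_kernel(f, g, dt, max_shift=40)
```

With these test spectra the simple kernel is close to zero because the peaks do not overlap, while the alignment kernel still registers the similarity because it integrates over shifts that bring the peaks together.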
20
Ciphergen Spectra
Prostate cancer, Elisseeff-Guyon-Weston, May 2001
299 features (peak values)
385 examples (325 training, 60 test)
4 classes (15 test examples/class): A = BPH, B and C cancer (B < C), D = ref.
D < A < B < C
[Figure labels: 121, 23]
SVM multi-class error rate: 15% (9/60)
59 peaks separate training set perfectly
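The counts on this slide are mutually consistent, which a short check makes explicit. The variable names are illustrative only; all numbers are taken from the slide.

```python
# Counts as reported on the slide.
n_train, n_test = 325, 60
n_classes, test_per_class = 4, 15
n_errors = 9

total_examples = n_train + n_test           # should match the 385 examples reported
test_from_classes = n_classes * test_per_class  # should match the 60 test examples
error_rate = n_errors / n_test              # fraction of misclassified test examples

print(f"{total_examples} examples, {test_from_classes} test, error rate {error_rate:.0%}")
```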
21
Conclusions
SVM advantages in pattern recognition:
Superior prediction performance on test data.
Unique, easy to interpret solution.
Better feature selection (only 2-7 genes in array exp.).
Use all the data, automatic data cleaning.
Incorporate knowledge about the task in Kernel.
Can be combined with other methods.