A presentation on the topic For CIS 595 Bioinformatics course


1 Classification of microarray gene expression data using support vector machines (SVM)
A presentation for the CIS 595 Bioinformatics course by Despina Kontos, Spring 2003, Temple University

2 Overview…
What are microarray gene expression data?
What are Support Vector Machines?
How can we use them to analyze gene expression data?
Classification experiments!

3 Microarrays… What are they anyway? They measure gene expression levels in tissue or cells under varying environmental conditions.

4 Microarrays… From a machine learning point of view…
[Figure: expression matrix indexed by genes (g-1, g-2, …, g-n) and experiments (ex-1, ex-2, …, ex-m). Two tasks are marked on it: tissue classification and function classification.]
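The two tasks on this slide can be read as two views of the same matrix. The sketch below is only an illustration: the numbers, the labels, and the choice of experiments as rows are assumptions, not taken from the slides.

```python
# Toy expression matrix. Assumption: rows are experiments, columns are genes.
# All values and labels are made up for illustration.
expression = [
    [0.2, 1.5, -0.3],   # ex-1
    [0.1, 1.7, -0.4],   # ex-2
    [-1.2, 0.3, 0.9],   # ex-3
    [-1.0, 0.2, 1.1],   # ex-4
]

# Tissue classification: each experiment (row) is one example,
# described by its expression level for every gene.
tissue_examples = expression
tissue_labels = ["cancer", "cancer", "normal", "normal"]  # hypothetical

# Function classification: each gene (column) is one example,
# described by its expression level in every experiment.
gene_examples = [list(column) for column in zip(*expression)]
function_labels = ["F1", "F2", "F1"]  # hypothetical

print(len(tissue_examples), "tissue examples,", len(tissue_examples[0]), "features each")
print(len(gene_examples), "gene examples,", len(gene_examples[0]), "features each")
```

Transposing the matrix is the only difference between the two setups; the classifier itself does not need to change.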

5 Support Vector Machines (SVM)
Linear classifiers
Attempt to avoid overfitting by finding the optimal hyperplane that separates the data
How? By maximizing the margin. The training points closest to the hyperplane are the support vectors
Introduced by V. Vapnik and co-workers in 1995
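To make "maximizing the margin" concrete, here is a minimal sketch that compares a few hand-picked separating hyperplanes on a toy 2-D dataset and keeps the one with the largest margin. The data and the candidate hyperplanes are assumptions for illustration; a real SVM solves an optimization problem rather than trying candidates.

```python
import math

def geometric_margin(w, b, points, labels):
    """Smallest signed distance y * (w.x + b) / ||w|| over all points."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(
        y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm
        for x, y in zip(points, labels)
    )

# Toy, linearly separable data (made up).
points = [(1.0, 1.0), (2.0, 2.0), (4.0, 4.0), (5.0, 5.0)]
labels = [-1, -1, +1, +1]

# Hypothetical candidate hyperplanes w.x + b = 0; all three separate the data.
candidates = {
    "A": ((1.0, 1.0), -6.0),  # x1 + x2 = 6, midway between the classes
    "B": ((1.0, 1.0), -5.0),  # x1 + x2 = 5, closer to the negative class
    "C": ((1.0, 0.0), -3.0),  # x1 = 3, a vertical line
}

margins = {
    name: geometric_margin(w, b, points, labels)
    for name, (w, b) in candidates.items()
}
best = max(margins, key=margins.get)

# Support vectors of the best candidate: points lying exactly on the margin.
w, b = candidates[best]
norm = math.sqrt(sum(wi * wi for wi in w))
support_vectors = [
    x for x, y in zip(points, labels)
    if abs(y * (sum(wi * xi for wi, xi in zip(w, x)) + b) / norm - margins[best]) < 1e-9
]

print(best, margins[best], support_vectors)
```

Only the two points nearest the boundary determine the winning hyperplane; moving the outer points further away would not change it.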

6 Support Vector Machines (SVM)
And what about datasets that are not linearly separable? Map the data into a higher-dimensional space and perform linear classification there (theorem!)
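A small sketch of the idea, using made-up 1-D data: no single threshold separates the classes on the line, but after the assumed mapping x -> (x, x^2) a linear rule does. The mapping and the hyperplane are chosen by hand for illustration.

```python
# Toy 1-D data (made up): the positive class sits on both sides of the negative class.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [+1, -1, -1, +1]

def linearly_separable_1d(xs, ys):
    """In 1-D, a threshold separates the classes only if the labels,
    read in order of x, change at most once."""
    ordered = [y for _, y in sorted(zip(xs, ys))]
    changes = sum(1 for a, b in zip(ordered, ordered[1:]) if a != b)
    return changes <= 1

def phi(x):
    """Hand-picked map into 2-D (an assumption for this example)."""
    return (x, x * x)

def predict(z, w=(0.0, 1.0), b=-2.5):
    """Linear classifier in the mapped space: sign of w.z + b."""
    score = w[0] * z[0] + w[1] * z[1] + b
    return +1 if score > 0 else -1

print("separable in 1-D:", linearly_separable_1d(xs, ys))
print("predictions after mapping:", [predict(phi(x)) for x in xs])
```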

7 Support Vector Machines (SVM)
Some mathematical formulations… We need ONLY the support vectors for computations! We can use KERNEL functions to avoid computing explicitly in the higher-dimensional space
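The two points on this slide can be sketched as follows. The first part checks that a degree-2 polynomial kernel gives the same value as a dot product in an explicit 3-D feature space. The second part evaluates the standard kernel decision function, f(x) = sum_i alpha_i y_i K(x_i, x) + b, over support vectors only. The kernel choice, the support vectors, and the coefficients are assumptions for illustration, not values from a trained model.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def phi(x):
    """Explicit feature map matching the kernel (x.z)^2 for 2-D inputs."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def poly_kernel(x, z):
    """Degree-2 polynomial kernel, computed in the original 2-D space."""
    return dot(x, z) ** 2

x, z = (1.0, 2.0), (3.0, -1.0)
explicit = dot(phi(x), phi(z))   # work done in the 3-D space
implicit = poly_kernel(x, z)     # same value, no mapping needed

def decision(x, support_vectors, alphas, labels, b, kernel):
    """f(x) = sum_i alpha_i * y_i * K(x_i, x) + b, summed over support vectors only."""
    return sum(
        a * y * kernel(sv, x)
        for sv, a, y in zip(support_vectors, alphas, labels)
    ) + b

# Hypothetical support vectors and coefficients (not the result of training).
svs = [(1.0, 0.0), (2.0, 0.0)]
alphas = [0.5, 0.5]
sv_labels = [-1, +1]
bias = -1.0

score = decision((2.0, 0.0), svs, alphas, sv_labels, bias, poly_kernel)
print(explicit, implicit, score)
```

Training points that are not support vectors have a coefficient of zero, so they drop out of the sum entirely.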

8 Some experiments…
M.P.S. Brown, W.N. Grundy, D. Lin, N. Cristianini, C.W. Sugnet, T.S. Furey, M. Ares Jr. and D. Haussler, "Knowledge-based analysis of microarray gene expression data by using support vector machines", Proc. Natl. Acad. Sci. USA, 97, 1, pp , 2000.
Classification of gene function from microarray data using SVM
2,467 genes
79 DNA hybridization experiments
6 gene function families
[Figure: genes-by-experiments expression matrix (g-1, g-2, …, g-n; ex-1, ex-2, …, ex-m) with the genes assigned to function families F1, F2, F3, ... (Function Classification)]
SVM gave the best classification among the methods compared!

9 Gene expression data on tissue
More experiments…
T. Furey, N. Cristianini, N. Duffy, D. Bednarski, M. Schummer and D. Haussler, "Support Vector Machine Classification and Validation of Cancer Tissue Samples Using Microarray Expression Data", Bioinformatics, 2000.
97,802 DNA clones
31 tissue samples: cancer ovarian, normal ovarian, normal non-ovarian
[Figure: genes-by-experiments expression matrix (g-1, g-2, …, g-n; ex-1, ex-2, …, ex-m) with the samples labeled Cancer / Not Cancer (Tissue Classification)]

10 Conclusions
Microarray gene expression data are a very useful form of biological information (but expensive to obtain!)
SVMs are a new and very promising classification approach
A lot of research remains to be done on biological information processing using techniques developed in fields such as Machine Learning, Data Mining, etc.

11 Additional resources
N. Cristianini. ICML'01 tutorial, 2001
E. Osuna, R. Freund, and F. Girosi. Support vector machines: Training and applications. In A.I. Memo. MIT A.I. Lab, 1996

12 THANK YOU!!!!!

