CZ5225: Modeling and Simulation in Biology. Lecture 7: Microarray Class Classification by Machine Learning Methods. Prof. Chen Yu Zong, Tel: 6874-6877, Room 07-24, Level 8, S16, National University of Singapore

2 Machine Learning Method: Inductive learning is example-based learning. [Figure: positive and negative examples plotted in a descriptor space]

3 Machine Learning Method: Each example is encoded as a feature vector of descriptor values. Feature vectors: A=(1, 1, 1), B=(0, 1, 1), C=(1, 1, 1), D=(0, 1, 1), E=(0, 0, 0), F=(1, 0, 1). [Figure: descriptors mapped to feature vectors for positive and negative examples]

4 Machine Learning Method: Feature vectors in input space: A=(1, 1, 1), B=(0, 1, 1), C=(1, 1, 1), D=(0, 1, 1), E=(0, 0, 0), F=(1, 0, 1). [Figure: the feature vectors plotted as points in a 3-D input space with axes X, Y, Z]

5 Machine Learning Method: A sample is represented as a vector A = (a1, a2, a3, …, aN). The task of machine learning is thereby transformed into finding a border line that optimally separates the known positive and negative samples in a training set.

6 Classifying Cancer Patients vs. Healthy Patients from Microarray: Each patient is represented as Patient_X = (gene_1, gene_2, gene_3, …, gene_N). N (the number of dimensions) is normally larger than 2, so we cannot visualize the data. [Figure: cancerous and healthy samples in gene-expression space]

7 Classifying Cancer Patients vs. Healthy Patients from Microarray: For simplicity, pretend that we are only looking at the expression levels of 2 genes. [Figure: cancerous and healthy samples plotted by Gene_1 and Gene_2 expression level, ranging from down-regulated to up-regulated]

8 Classifying Cancer Patients vs. Healthy Patients from Microarray: Question: how can we build a classifier for this data? [Figure: cancerous and healthy samples in the Gene_1/Gene_2 expression plane]

9 Classifying Cancer Patients vs. Healthy Patients from Microarray: Simple classification rule: IF gene_1 < 0 AND gene_2 < 0 THEN person = healthy; IF gene_1 > 0 AND gene_2 > 0 THEN person = cancerous.
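A minimal Python sketch of this rule (the zero thresholds are the toy values from the slide, not fitted from data):

```python
def classify(gene_1: float, gene_2: float) -> str:
    """Toy threshold rule: expression levels are assumed centered,
    so 0 separates down-regulated from up-regulated."""
    if gene_1 < 0 and gene_2 < 0:
        return "healthy"
    if gene_1 > 0 and gene_2 > 0:
        return "cancerous"
    return "undecided"  # the slide's rule says nothing about mixed signs

print(classify(-1.2, -0.5))  # -> healthy
print(classify(0.8, 1.1))    # -> cancerous
```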

10 Classifying Cancer Patients vs. Healthy Patients from Microarray: Simple classification rule: IF gene_1 < 0 AND gene_2 < 0 AND … gene_5000 < Y THEN person = healthy; IF gene_1 > 0 AND gene_2 > 0 AND … gene_5000 > W THEN person = cancerous. If we move away from our simple example with 2 genes to a realistic case with, say, 5000 genes, then: 1. What will these rules look like? 2. How will we find them? It gets a little complicated and unwieldy…

11 Classifying Cancer Patients vs. Healthy Patients from Microarray: Reformulate the previous rule. SIMPLE RULE: if a data point lies to the 'left' of the line, then 'healthy'; if it lies to the 'right' of the line, then 'cancerous'. It is easier to generalize this line to 5000 genes than a list of rules, and it is also easier to solve mathematically.

12 Extension to More Than 2 Genes (Dimensions): Line in 2D: x1·C1 + x2·C2 = T. If we had 3 genes and needed to build a 'line' in 3-dimensional space, we would be seeking a plane. Plane in 3D: x1·C1 + x2·C2 + x3·C3 = T. In more than 3 dimensions, the 'plane' is called a hyperplane; a hyperplane is simply the generalization of a plane to dimensions higher than 3. Hyperplane in N dimensions: x1·C1 + x2·C2 + x3·C3 + … + xN·CN = T.
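As a small numpy sketch (the coefficient vector C and threshold T below are invented illustrative values, not fitted ones), the hyperplane test is a single dot product, whatever the dimension:

```python
import numpy as np

# Hyperplane: x1*C1 + x2*C2 + x3*C3 = T, i.e. C . x = T.
C = np.array([0.7, -1.3, 0.2])   # hypothetical coefficients, one per gene
T = 0.5                          # hypothetical threshold

def side_of_hyperplane(x: np.ndarray) -> str:
    score = C @ x                # generalizes to any number of genes
    return "cancerous" if score > T else "healthy"

print(side_of_hyperplane(np.array([2.0, 0.1, 0.3])))   # one side
print(side_of_hyperplane(np.array([-1.0, 1.0, 0.0])))  # the other side
```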

13 Classification Methods (1)

14 Classification Methods (1)

15 Classification Methods (2)

16 Classification Methods (2)

17 Classification Methods (2)

18 Classification Methods (2)

19 Classification Methods (3)

20 Classification Methods (3): K Nearest Neighbor Method
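The kNN slides' figures are not preserved in this transcript; as a hedged sketch, scikit-learn's KNeighborsClassifier illustrates the method on invented 2-gene data:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Invented 2-gene expression profiles (rows) with class labels.
X = np.array([[-1.0, -0.8], [-0.6, -1.2], [-0.9, -0.4],   # healthy
              [ 0.9,  1.1], [ 1.2,  0.7], [ 0.5,  0.9]])  # cancerous
y = np.array(["healthy"] * 3 + ["cancerous"] * 3)

# A new sample takes the majority label of its k nearest neighbors.
knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)
print(knn.predict([[0.8, 0.6]]))    # -> ['cancerous']
print(knn.predict([[-0.7, -0.9]]))  # -> ['healthy']
```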

21 Classification Methods (4)

22 Classification Methods (4)

23 Classification Methods (4)

24 Classification Methods (5): SVM. What is SVM? Support vector machines are a machine-learning method that learns from examples (statistical learning) to classify objects into one of two classes. Advantages of SVM: tolerates diversity of class members (no 'racial discrimination'); low over-fitting risk; easier to find 'optimal' parameters for better class-differentiation performance.
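A minimal scikit-learn sketch of SVM classification (same invented toy data as in the kNN example; illustrative only, not the lecture's own experiment):

```python
import numpy as np
from sklearn.svm import SVC

X = np.array([[-1.0, -0.8], [-0.6, -1.2], [-0.9, -0.4],
              [ 0.9,  1.1], [ 1.2,  0.7], [ 0.5,  0.9]])
y = np.array([-1, -1, -1, 1, 1, 1])  # -1 = healthy, +1 = cancerous

# A linear SVM finds the maximum-margin separating hyperplane.
clf = SVC(kernel="linear", C=1.0).fit(X, y)
print(clf.support_vectors_)       # the examples that define the border
print(clf.predict([[0.7, 0.8]]))  # -> [1]
```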

25 Classification Methods (5): SVM method. Project the data to a higher-dimensional space, where the old border is replaced by a new one. [Figure: protein family members and nonmembers, with the border before and after projection to a higher-dimensional space]

26 Classification Methods (5): SVM method. [Figure: the new border between protein family members and nonmembers, with the support vectors marked]

27 What is a Good Decision Boundary? Consider a two-class, linearly separable classification problem. There are many decision boundaries! The Perceptron algorithm can be used to find such a boundary, and different algorithms have been proposed. Are all decision boundaries equally good? [Figure: Class 1 and Class 2 points with several candidate boundaries]

28 Examples of Bad Decision Boundaries. [Figure: two panels of boundaries that separate Class 1 from Class 2 but lie too close to the training points]

29 Large-Margin Decision Boundary: The decision boundary should be as far away from the data of both classes as possible; that is, we should maximize the margin m. (The distance between the origin and the line w^T x = k is k/||w||.) [Figure: Class 1 and Class 2 separated by a boundary with margin m]
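Spelling out the margin calculation the slide alludes to (the standard derivation, since the slide's own figure and algebra are lost):

```latex
% The two margin hyperplanes w^T x = b + 1 and w^T x = b - 1 lie at
% distances (b+1)/||w|| and (b-1)/||w|| from the origin, so
\[
  m = \frac{b+1}{\lVert w \rVert} - \frac{b-1}{\lVert w \rVert}
    = \frac{2}{\lVert w \rVert},
\]
% hence maximizing the margin m is the same as minimizing ||w||^2.
```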

30 SVM Method. [Figure: protein family members and nonmembers with the new border and the support vectors highlighted]

31 SVM Method: The border line is nonlinear.

32 SVM Method: Nonlinear transformation, via use of a kernel function.
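As an illustration of the kernel idea (gamma and the test points are arbitrary), the RBF kernel returns the inner product of two feature-space images without ever constructing that space:

```python
import numpy as np

def rbf_kernel(x: np.ndarray, z: np.ndarray, gamma: float = 1.0) -> float:
    """K(x, z) = exp(-gamma * ||x - z||^2), the inner product
    <phi(x), phi(z)> in an implicit high-dimensional space."""
    return float(np.exp(-gamma * np.sum((x - z) ** 2)))

x = np.array([1.0, 0.5])
z = np.array([0.2, 1.5])
print(rbf_kernel(x, z))  # similarity of x and z in feature space
print(rbf_kernel(x, x))  # 1.0: a point is maximally similar to itself
```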

33 SVM Method: Nonlinear transformation.

34 Mathematical Algorithm of SVM

35 Mathematical Algorithm of SVM

36 Mathematical Algorithm of SVM: Empirical error vs. complexity tradeoff.
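The equation itself did not survive the transcript; presumably it is the standard soft-margin objective, in which C sets the tradeoff between the two terms:

```latex
\[
  \min_{w,\,b,\,\xi}\;
  \underbrace{\tfrac{1}{2}\lVert w \rVert^{2}}_{\text{complexity}}
  + C \underbrace{\sum_{i=1}^{n} \xi_i}_{\text{empirical error}}
  \quad \text{s.t.} \quad
  y_i\,(w^{T} x_i + b) \ge 1 - \xi_i,\; \xi_i \ge 0 .
\]
```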

37 Mathematical Algorithm of SVM: Nonlinear decision boundaries. Map the data to a higher-dimensional feature space and construct a linear classifier in this space, which can be written as f(x) = sign(Σ_i α_i y_i K(x_i, x) + b).
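A sketch checking this expansion against scikit-learn (toy data as before; dual_coef_ holds the products α_i·y_i for the support vectors):

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics.pairwise import rbf_kernel

X = np.array([[-1.0, -0.8], [-0.6, -1.2], [-0.9, -0.4],
              [ 0.9,  1.1], [ 1.2,  0.7], [ 0.5,  0.9]])
y = np.array([-1, -1, -1, 1, 1, 1])
clf = SVC(kernel="rbf", gamma=1.0, C=1.0).fit(X, y)

# f(x) = sum_i alpha_i * y_i * K(x_i, x) + b over the support vectors.
x_new = np.array([[0.7, 0.8]])
K = rbf_kernel(clf.support_vectors_, x_new, gamma=1.0)
f_manual = (clf.dual_coef_ @ K + clf.intercept_).item()
print(f_manual, clf.decision_function(x_new))  # the two values agree
```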

38 Mathematical Algorithm of SVM

39 SVM Performance Measure

40 SVM Performance Measure

41 SVM Performance Measure

42 SVM Performance Measure: Sensitivity P+ = TP/(TP+FN), the accuracy for positive samples. Specificity P- = TN/(TN+FP), the accuracy for negative samples. Overall prediction accuracy Q = (TP+TN)/(TP+TN+FP+FN). Matthews correlation coefficient MCC = (TP·TN - FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).
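A direct implementation of these measures (the confusion counts in the example call are made up):

```python
import math

def performance(tp: int, tn: int, fp: int, fn: int) -> dict:
    sensitivity = tp / (tp + fn)                # accuracy for positives, P+
    specificity = tn / (tn + fp)                # accuracy for negatives, P-
    accuracy = (tp + tn) / (tp + tn + fp + fn)  # overall prediction accuracy
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"P+": sensitivity, "P-": specificity,
            "accuracy": accuracy, "MCC": mcc}

print(performance(tp=40, tn=45, fp=5, fn=10))
```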

43 Why SVM Works? The feature space is often very high dimensional. Why don't we suffer from the curse of dimensionality? A classifier in a high-dimensional space has many parameters and is hard to estimate. Vapnik argues that the fundamental problem is not the number of parameters to be estimated; rather, the problem is the flexibility of the classifier. Typically a classifier with many parameters is very flexible, but there are also exceptions: the one-parameter classifier f(x) = sign(sin(a·x)) can classify the points x_i = 10^(-i), i = 1, …, n, correctly for every possible combination of class labels on the x_i, so this 1-parameter classifier is very flexible.
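A sketch verifying this shattering claim numerically; the closed-form choice of the parameter a below is the standard construction for this example, not something preserved from the slide:

```python
import numpy as np
from itertools import product

n = 5
x = 10.0 ** -np.arange(1, n + 1)   # the points x_i = 10^(-i)

for labels in product([-1, 1], repeat=n):
    y = np.array(labels)
    # One parameter suffices to realize any labeling of these points.
    a = np.pi * (1 + np.sum((1 - y) / 2 * 10.0 ** np.arange(1, n + 1)))
    assert np.array_equal(np.sign(np.sin(a * x)), y)

print(f"sign(sin(a*x)) shatters all {2**n} labelings of {n} points")
```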

44 Why SVM Works? Vapnik argues that the flexibility of a classifier should be characterized not by its number of parameters but by its capacity; this is formalized by the 'VC-dimension' of a classifier. Consider a linear classifier in two-dimensional space: if we have three (non-collinear) training data points, no matter how those points are labeled, we can classify them perfectly.

45 VC-dimension: However, if we have four points, we can find a labeling such that the linear classifier fails to be perfect. We can see that 3 is the critical number: the VC-dimension of a linear classifier in a 2D space is 3, because with 3 points in the training set, perfect classification is always possible irrespective of the labeling, whereas for 4 points perfect classification can be impossible.
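A quick empirical check of both claims, using a nearly hard-margin linear SVM as the linear classifier (the point coordinates are arbitrary: three non-collinear points, then the four-point XOR layout):

```python
import numpy as np
from itertools import product
from sklearn.svm import SVC

def shatters(points: np.ndarray) -> bool:
    """True if a linear classifier fits every labeling of the points."""
    for labels in product([0, 1], repeat=len(points)):
        if len(set(labels)) < 2:   # skip trivial one-class labelings
            continue
        clf = SVC(kernel="linear", C=1e6).fit(points, list(labels))
        if clf.score(points, list(labels)) < 1.0:
            return False
    return True

three = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
four = np.array([[0, 0], [1, 1], [0, 1], [1, 0]], dtype=float)
print(shatters(three))  # True: every labeling of 3 points is separable
print(shatters(four))   # False: the XOR labeling defeats any line
```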

46 VC-dimension: The VC-dimension of the nearest-neighbor classifier is infinite, because no matter how many points you have, you get perfect classification on the training data. The higher the VC-dimension, the more flexible a classifier is. VC-dimension, however, is a theoretical concept; in practice, the VC-dimension of most classifiers is difficult to compute exactly. Qualitatively, if we think a classifier is flexible, it probably has a high VC-dimension.